# Cost Forecast Using Excel Regression Analysis

When preparing financial projections for a business plan part of the process is to be able to estimate fixed and variable costs. A cost forecast can be undertaken using various methods; however, one simple technique is to use Excel functions to perform linear (straight line) regression analysis.

Most costs can be linked to a cost driver such as headcount, revenue, labor hours, users or machine hours and have variable and fixed cost components (referred to as mixed costs). If a business has access to past cost data, either from its own management accounting information or from competitors accounts, using linear regression, it is possible to separate out the fixed and variable cost components.

## Mixed Cost Forecast

A mixed cost which has a linear relationship with its cost driver assumes that both the variable cost per unit and the fixed cost remain constant over the selected activity range.

The cost can be summarized by the following formula. In this formula activity units refers to units of the cost driver selected, e.g if the cost driver is labor hours then an activity unit is an hour of labor.

So for example, if a business uses 2,000 labor hours, has a variable cost per unit of 10.00, and a fixed cost of 15,000, the total cost can be calculated as follows.

```Cost forecast = Variable cost per unit x Activity units + Fixed cost
Cost forecast = 10.00 x 2,000 + 15,000 = 35,000
```

Because this is a linear relationship where the variable cost per unit and the fixed cost remain constant over the range of activity being considered, the same formula can be used to find the cost forecast at say 4,000 units as follows.

```Cost forecast = Variable cost per unit x Activity units + Fixed cost
Cost forecast = 10.00 x 4,000 + 15,000 = 55,000
```

## Cost Forecast Example Using Linear Regression

The purpose of the simple linear regression technique is to use a set of past data to find values for the variable cost per unit and the fixed cost in order that a cost forecast can be made based on any given number of activity units within the range being considered.

Suppose a business has costs which are dependent on the number of daily average users to its website. In this example the cost driver is the number of users.

The business has managed to collect past data for each of the last five accounting periods as follows.

Past data on number of users and costs
Users Cost
Period 1 108,000 7,600
Period 2 157,000 10,800
Period 3 171,000 10,000
Period 4 183,000 12,000
Period 5 234,000 14,300

The business now wants to see whether is can use this data to forecast future costs based on a forecast number of users.

### Step 1: Predictability of Past Data

The first step is to check the past data to see whether the cost can be predicted using the number of website users. This can be done using a statistical indicator of predictability known as the coefficient of determination for which the symbol is r2 (r squared).

In Excel this is calculated using the RSQ function as follows.

Using the past data shown above

```r2 = RSQ({7600,10800,10000,12000,14300},{108000,157000,171000,183000,234000})
r2 = 0.94
```

It should be noted that if the data is presented in Excel in two columns A (users) and B (cost) with one row for each period, then instead of entering numeric values as shown above, the Excel formula could be written as follows.

r2 = RSQ(B1:B5,A1:A5)

The value of r2 will be between 0 and 1. The higher the value the better the data fits the straight line relationship and the more predictable one set of data (cost) will be from the other (users). It should be noted that r2 does not imply cause and effect only that the two sets of data move predictably in relation to each other.

### Step 2: Variable Cost Per Unit

It is assumed in this example that the cost relationship is a straight line one (usually shown as y = ax + b).

Cost forecast (y) = Variable cost per unit (a) x Users (x) + Fixed cost (b)

The constant a, in this case the variable cost per unit, is known as the slope of the line and is calculated using the Excel SLOPE function, and the constant b, in this case the fixed cost, is known as the intercept (the cost when the activity is zero) and is calculated using the Excel INTERCEPT function.

Using the Excel SLOPE function the variable cost per unit is calculated as follows.

```Variable cost per unit = SLOPE({7600,10800,10000,12000,14300},{108000,157000,171000,183000,234000})
Variable cost per unit = 0.0528
```

It should be noted that if the data is presented in Excel in two columns A (users) and B (cost) with one row for each period then instead of entering numeric values as shown above, the Excel formula could be written as follows.

Variable cost per unit = SLOPE(B1:B5,A1:A5)

### Step 3: Fixed Cost

Using the Excel INTERCEPT function the fixed cost is now calculated as follows.

```Fixed cost = INTERCEPT({7600,10800,10000,12000,14300},{108000,157000,171000,183000,234000})
Fixed cost = 1,938
```

Again, it should be noted that if the data is presented in Excel in two columns A (users) and B (cost) with one row for each period then instead of entering numeric values as shown above, the Excel formula could be written as follows.

Fixed costs = INTERCEPT(B1:B5,A1:A5)

### The Cost Forecast Function

Using regression analysis the past data has been used to calculate values for the variable cost per unit and the fixed cost. Our cost forecast equation using these two values can be stated as follows.

```Cost forecast = Variable cost per unit x Users + Fixed cost
Cost forecast = 0.0528 x Users + 1,938```

We now have a cost forecast equation which can be used to estimate the costs for future periods based on our forecast user numbers. For example if the number of users is forecast to be 250,000 in a future period then the cost forecast is determined as follows.

```Cost forecast = 0.0528 x Users + 1,938
Cost forecast = 0.0528 x 250,000 + 1,938 = 15,138```

This process can be repeated for each accounting period of the forecast.

## Cost Forecast Example 2

Suppose a business has the following data relating to labor costs and labor hours (the cost driver).

Past data on number of labor hours and costs
Hours Cost
Period 1 10,000 105,000
Period 2 12,000 121,600
Period 3 12,500 130,375
Period 4 13,000 130,050
Period 5 13,200 131,480

The regression analysis is carried out as before using the Excel functions as follows.

```Step 1: Predictability
r2 = RSQ({105000,121600,130375,130050,131480},{10000,12000,12500,13000,13200})
r2 = 0.97
There is a high level of predictability so it is worthwhile continuing with the regression analysis.

Step 2: Variable cost per unit
Variable cost per unit = SLOPE({105000,121600,130375,130050,131480},{10000,12000,12500,13000,13200})
Variable cost per unit = 8.5593

Step 3: Fixed cost
Fixed cost = INTERCEPT({105000,121600,130375,130050,131480},{10000,12000,12500,13000,13200})
Fixed cost = 19,791
```

The cost forecast equation can now be stated as follows.

```Cost forecast = Variable cost per unit x Users + Fixed cost
Cost forecast = 8.5593 x Labor hours + 19,791
```

We now have a cost equation which can be used to forecast the cost for future periods based on estimated labor hours. For example if the number of labor hours is estimated to be 14,500 in a future period then the cost forecast is determined as follows.

```Cost forecast = 8.5593 x Labor hours + 19,791
Cost forecast = 8.5593 x 14,500 + 19,791 = 143,901```

## Summary

Linear regression analysis is a simple technique used to forecast costs for use in financial projections.

The technique should be used with the following in mind.

• The more data points available the more accurate the cost equation generated.
• Determine that there is a high level of predictability (denoted by r2) before proceeding.
• Predictability does not mean there is a cause and effect relationship between the data sets.
• The analysis assumes a linear relationship meaning that the variable cost per unit and the fixed cost must be constant over the activity range considered. Do not use past data from one range of data and then try to use the cost equation generated to forecast over a substantially different range as the linear relationship might not hold.