Network Traffic Projection Method

March 20, 2003

 

Overview

 

The network traffic projection method described below is easy to use and has proven remarkably accurate in practice.  In a nutshell, the method uses historical network traffic and headcount data to project future traffic growth.  Those two sources of data are used to derive the traffic per user.  The average growth or decline in traffic per user is then applied to future headcount projections, and the result is a reasonable estimate of future traffic levels.  The first year this method was employed at the large corporation where I work, it came within 3.8% of actual traffic levels on a network moving more than 100,000 Terabytes a year (yes, that number could also be called 100 Petabytes, but for some reason many people are uncomfortable with that term).

 

This approach altered how we think about traffic loads at my company.  Empirical evidence from the past seemed to imply that headcount was not a major factor in projecting future network loads, because use was growing exponentially regardless of hiring or firing.  This was the case where I work from the early 1990s to 2000; network traffic was always increasing regardless of employment levels.  Traffic growth started to change in 2001, which is when this method was devised.  Our traffic levels started to level off, and there were announcements of some pretty heavy layoffs coming.  At the same time my company had reached computing saturation: after years of expanding the network in every direction, everyone who needed a networked computer finally had one, along with most if not all of the software they needed.  This got me thinking that things had evolved enough that large changes in headcount might indeed affect future traffic levels.

 

Applied to data collected from 1997 through 2001, a period of substantial growth, this method came within an average of 5% of actual traffic per year over that five-year period.  It was then used to project 2002 traffic and, as noted above, it overestimated traffic by 3.8% in a year when traffic hardly grew.  Projecting future network loads from traffic per user worked in both growth and no-growth conditions.  The method presented here provides an easy way to predict future traffic loads based on data that should be readily available at any company.

 

Don’t let the formulas below scare you off; you did harder math in your high school algebra class.


The Method

 

Calculating 'Average Traffic per User', ∆y

 

Dividing past traffic (T) by headcount (HC) produces an 'average traffic per user' (ATU).  This number is calculated monthly.

 

ATU  = T / HC

 

Each month, the change from the previous month is calculated as a ‘delta of growth’ (∆) of the average user-generated traffic.

 

∆ = ( ATU_C - ATU_{C-1} ) / ATU_{C-1}

 

Where ATU_C is the current month's average traffic per user and ATU_{C-1} is the previous month's average traffic per user.  The result, or delta of growth (∆), is the percent change in the traffic generated by 'an average user' from the previous month.

 

The deltas from the previous twelve months are then averaged:

 

∆y = ( ∆_JAN + ∆_FEB + ∆_MAR + … + ∆_DEC ) / 12

 

The sum of the monthly deltas is divided by 12, yielding the average monthly growth for the year, ∆y, for an average user.  ∆y is the percentage factor that will be applied to projected headcount data to generate future traffic projections.  Of course, the same process could be done for any other period of interest, such as three or six months.
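
The steps above can be sketched in a few lines of Python.  This is a minimal illustration under stated assumptions, not the original implementation: the function names, the sample figures and the window parameter (which covers the "any other period of interest" case) are all hypothetical.

```python
def average_traffic_per_user(traffic, headcount):
    """Return ATU = T / HC for each month (inputs are parallel monthly lists)."""
    if len(traffic) != len(headcount):
        raise ValueError("traffic and headcount must cover the same months")
    return [t / hc for t, hc in zip(traffic, headcount)]


def monthly_deltas(atu):
    """Return the month-over-month delta of growth: (ATU_C - ATU_{C-1}) / ATU_{C-1}."""
    return [(cur - prev) / prev for prev, cur in zip(atu, atu[1:])]


def average_delta(deltas, window=12):
    """Average the most recent `window` deltas (12 gives the yearly figure)."""
    if window < 1 or not deltas:
        raise ValueError("need at least one delta and a window of 1 or more")
    recent = deltas[-window:]
    return sum(recent) / len(recent)


if __name__ == "__main__":
    # Hypothetical figures: monthly traffic in GB and monthly headcount.
    traffic = [190_000.0, 194_000.0, 200_000.0]
    headcount = [200_000, 200_000, 200_000]
    atu = average_traffic_per_user(traffic, headcount)
    deltas = monthly_deltas(atu)
    print(average_delta(deltas))
```

Note that thirteen months of traffic and headcount data are needed to obtain twelve monthly deltas.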

 

Headcount Projection Presumption

 

In order to project traffic based on headcount, some presumptions must be made about future employment numbers.  Typically HR departments will not provide projected employment information, but they do release periodic general announcements about future layoffs and hiring.  In most cases this will be the only information available for projecting headcount.  With nothing else to work with, just presume no change in headcount.

 

Calculating 'Traffic Projection', T_P

 

Given a projected headcount (HC_p), the current month's average traffic per user (ATU_C) and the average monthly growth rate for the year (∆y), the next month's traffic can now be calculated.  Multiplying the current month's average traffic per user by one plus the growth factor yields the projected average traffic per user (ATU_p) for the next month.

 

ATU_p = ATU_C * ( 1 + ∆y )

 

Multiplying this result by the projected number of employees (HC_p) gives the projected traffic (T_P).

 

T_P = ATU_p * HC_p

 

This process is repeated for each successive month to be projected, with each month's projected traffic per user serving as the starting point for the next.

 

T_{P+1} = ATU_{p+1} * HC_{p+1}
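
The month-by-month projection can be sketched as a simple loop.  This is an illustrative sketch only; the function name project_traffic and the sample inputs are hypothetical, and it assumes the same average monthly growth is applied to every projected month.

```python
def project_traffic(atu_current, avg_delta, projected_headcounts):
    """Project traffic for successive months.

    atu_current          -- current month's average traffic per user
    avg_delta            -- average monthly growth per user, as a fraction
    projected_headcounts -- expected headcount for each month to be projected

    Returns a list of (projected ATU, projected traffic) pairs, one per month.
    """
    projections = []
    atu = atu_current
    for headcount in projected_headcounts:
        atu = atu * (1 + avg_delta)  # each month builds on the previous projection
        projections.append((atu, atu * headcount))
    return projections


if __name__ == "__main__":
    # Hypothetical inputs: 1 GB per user, 2% monthly growth, three months of headcount.
    for atu, traffic in project_traffic(1.0, 0.02, [1000, 1010, 990]):
        print(f"ATU {atu:.4f} GB, traffic {traffic:.1f} GB")
```

With no headcount information at all, passing the current headcount for every month reproduces the "presume no change" case.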

 

That is all that is required to project future traffic, but if you want to add some additional reality factors to your calculations, use the following method.

 

Dealing with recurring Anomalies

 

The December Drop

 

A pattern has been noted at my company: every December there is a noticeable drop in the amount of traffic generated enterprise-wide, even though the data is normalized each month to account for the difference in the number of days in the month[1].  This drop has been much more pronounced in the last five years, so those are the years used to calculate future December drops in traffic levels:

 

DD = ( ∆_DEC1997 + ∆_DEC1998 + … + ∆_DEC2001 ) / 5

 

The factor (DD) is created by averaging the Monthly Percent of Change of the last five Decembers.  The DD percentage is then applied to future Decembers in any projection.

 

In our case this phenomenon always occurs in December and lasts only one month.  As you become familiar with your own data, you may notice other recurring patterns.  If you can quantify them, add them in.

 

The January Adjustment

 

As a result of the annual December Drop, the January projections cannot be based on the previous month's data as in all other cases.  The projections for each January’s growth are based on the previous November’s data rather than on the December results.
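
One way to combine the December Drop and the January Adjustment is sketched below.  The text does not spell out the exact arithmetic, so this sketch makes two assumptions: December's traffic per user is November's figure multiplied by (1 + DD), and January's is November's figure grown by one step of ∆y.  The function names and sample values are hypothetical, and the starting month is assumed not to be a December.

```python
def december_drop(december_deltas):
    """DD: average of the December month-over-month deltas from past years."""
    if not december_deltas:
        raise ValueError("need at least one December delta")
    return sum(december_deltas) / len(december_deltas)


def project_with_december(atu_current, avg_delta, dec_drop, months):
    """Project (month, ATU, traffic) for each (month_number, headcount) pair.

    atu_current must come from a non-December month (assumption of this sketch).
    """
    results = []
    baseline = atu_current  # most recent non-December traffic per user
    for month, headcount in months:
        if month == 12:
            # December: apply the averaged drop; do not carry it forward.
            atu = baseline * (1 + dec_drop)
        else:
            # Other months, including January, grow from the last
            # non-December figure (November, in January's case).
            atu = baseline * (1 + avg_delta)
            baseline = atu
        results.append((month, atu, atu * headcount))
    return results


if __name__ == "__main__":
    dd = december_drop([-0.10, -0.12, -0.08, -0.11, -0.09])  # hypothetical deltas
    for row in project_with_december(1.0, 0.02, dd, [(11, 1000), (12, 1000), (1, 1000)]):
        print(row)
```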

 


Example Calculation

 

Calculating 'Average Traffic per User', ∆y

 

Average traffic per user per month calculation, using hypothetical numbers:

T = 200 TB, total traffic for the month

HC = 200,000, total headcount for the month

ATU_C = T / HC

      = 200 TB of traffic / 200,000 people

      = 0.001 TB, or 1 GB, per user per month

The change in traffic per user is then calculated:

ATU_{C-1} = 0.97 GB, last month's traffic per user

∆ = ( ATU_C - ATU_{C-1} ) / ATU_{C-1}

∆ = ( 1 GB - 0.97 GB ) / 0.97 GB

∆ = 0.03093

Meaning there was 3.1% growth from one month to the next.

The deltas from the previous twelve months are then averaged:

∆y = ( ∆_JAN + ∆_FEB + ∆_MAR + … + ∆_DEC ) / 12

∆y = (.002 + .034 + .020 + .021 + .022 + .023 + .025 + .034 + .020 + .021 + .022 + .031) / 12

∆y = 0.275 / 12 = 0.02292

The average growth per user per month was 2.3% over this period.

Calculating 'Traffic Projection', T_P

Calculating next month's traffic per user:

ATU_C = 1 GB, current average traffic per user

ATU_p = ATU_C * ( 1 + ∆y )

ATU_p = 1 GB * ( 1 + 0.02292 )

ATU_p = 1.02292 GB

Next month's expected traffic per user is about 1.023 GB.

Calculating next month's total traffic:

HC_p = 200,020, number of people expected to be employed next month

T_P = ATU_p * HC_p

T_P = 1.02292 GB * 200,020

T_P = 204,604.46 GB

Next month's traffic is projected to be about 204.6 TB.

Continuing with the following month's calculations, building on the projection just made:

HC_{p+1} = 200,010, number of people expected to be employed the following month

ATU_{p+1} = 1.02292 GB * ( 1 + 0.02292 )

ATU_{p+1} = 1.04637 GB

T_{P+1} = ATU_{p+1} * HC_{p+1}

T_{P+1} = 1.04637 GB * 200,010

T_{P+1} = 209,284.46 GB

The following month's traffic is projected to be about 209.3 TB.

This process is repeated for each successive month to be projected.

Dealing with Recurring Anomalies

Let's say you find that every month in summer your traffic per user drops by 15% from average.  You can apply that factor to the projected traffic per user to refine your results.  Below, the monthly traffic total is recalculated with the summer drop-off factored in.

ATU_{p+1} = 1.02292 GB * ( 1 + 0.02292 - 0.15 )

ATU_{p+1} = 0.89293 GB

T_{P+1} = ATU_{p+1} * HC_{p+1}

T_{P+1} = 0.89293 GB * 200,010

T_{P+1} = 178,594.93 GB

The adjusted traffic total is now about 178.6 TB.

 

As you can see from this example, if you can spot recurring changes in traffic and can quantify them, they can have a big effect on future projections.

 

Lessons of Experience

 

We are now starting to apply this method to smaller sets of data, such as regional areas, specific sites and individual links.  Naturally, the old axiom of garbage in, garbage out still holds.  The more accurate the data you start with, the closer the projections can come to reality.

 

If you can get the number of actual users rather than total employment, it will increase your accuracy.

 

You can get some pretty interesting results with wildly fluctuating data.  What I have found is that looking for patterns in the data and adjusting the period used to determine the future growth rate can help bring things back to reality.  For instance, if you have monthly data over a year, but the first six months are very different from the last six months, you might want to use only the last five months to determine your growth rate for the next year.

 

As Winston Churchill said, “I do not believe any statistics until I have altered them myself.” Always remember to ask the question, “Does that make sense?”  If it doesn’t, re-check your calculations and adjust as required.

 

Conclusion

 

This method changed the way that traffic is projected at my company.  It also changed the way we think about where traffic comes from.  In the past only one variable was used to predict future traffic, and that variable was past traffic.  This new method adds a second variable to the projection equation: the number of users generating traffic, expressed as traffic per user.  Time has shown that this method works at my company.  It is very easy to apply if you have the data.  If you need to get a handle on what your future traffic loads might be, this is an extremely cost-efficient way to get your answers.

 

This mathematical model has the same hole in it as all other methods.  In reality there is a known third variable that encompasses all the un-quantifiable changes, spikes, additions and deletions that a network constantly goes through.  It is highly unlikely that this complex variable can ever be added to the equation, but it must always be considered when predicting future traffic.

 

I will send anyone asking for it a copy of an example Excel file via e-mail.

 

 

 

Steve McGourty

 

Bio

 

Steve has been working at Boeing for 17 years, all in networking.  His first few years were spent evangelizing the wonders and benefits of networks, the next few as lead LAN designer for an ever-expanding network in the largest building in the world and beyond.  His work for the last 10 years has been dedicated to Network Management.  He has a B.S. in Systems Science (Scientific Option) from the University of West Florida, 1985.



[1] Each month's total traffic is calculated by taking the average utilization for a single day and multiplying it by 30.42, the average number of days in a month (365 / 12).