In Part 1 of our two-part series on Capacity Planning for Cloud Platforms, we covered what capacity planning is, why it's a constantly evolving exercise, and how best to prepare and forecast for future business, platform, and customer needs. Capacity planning for a cloud-based platform requires a careful look at the business landscape to assess expected future usage levels and needs. Purchasing decisions and patterns, together with their associated cadences, heavily influence how capacity is planned and implemented.

In Part 2 of the series, we discuss capacity planning with regard to forecasting for different types of devices, customer-behavior and process-oriented build patterns, integration strategies, and handling the unexpected.

How we Forecast for Different Types of Devices

The cloud platform is an infrastructure of connected devices of different types from different manufacturers. Fortunately, devices from different manufacturers have many similarities. For capacity planning purposes, these similarities often allow us to treat devices alike as long as they are the same type of device. The main device types are network devices, storage devices, and compute devices. For both business and technical reasons, our forecasting technique differs by device type. For example, it takes more time to distribute a load across storage devices than it does to distribute it across compute devices. To handle these differences, we created an approach that allows us to use both linear and non-linear models to create forecasts.

How we Build the Model

We expect customers to continue using our platform much as they have used it in the past. However, we do not expect internal platform changes to repeat in the same way. To increase the accuracy of our forecasts, we separate the growth in device utilization into two categories: process-oriented and behavior-oriented. We handle each type of change separately.

For behavior-based capacity changes, we start by accounting for the ebbs and flows of customer utilization patterns. We remove the behavior-based variation by examining the typical maximum utilization of the device rather than every individual reading. This technique allows our forecasting method to plan for the volatility in customer behavior. Additionally, we try to predict future usage levels and hardware demand based on past rates of hardware demand.
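
One way to picture the "typical maximum" idea is to collapse raw utilization samples into one high-percentile value per time window. The sketch below is only an illustration under assumptions of our own: the function name `typical_max`, the window length, and the use of a nearest-rank percentile are hypothetical choices, not a description of the production method.

```python
import math


def typical_max(samples, window=24, pct=95.0):
    """Collapse raw utilization samples into one 'typical maximum' per window.

    samples: utilization readings in time order (for example, hourly percentages).
    window:  number of readings per window (assumed: 24 hourly readings per day).
    pct:     percentile used as the typical peak (assumed; 100 gives the true max).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")

    peaks = []
    for start in range(0, len(samples), window):
        chunk = sorted(samples[start:start + window])
        # Nearest-rank percentile: smallest value with at least pct% of the
        # chunk at or below it.
        rank = max(1, math.ceil(pct / 100.0 * len(chunk)))
        peaks.append(chunk[rank - 1])
    return peaks
```

Using a percentile below 100 keeps a single unusual spike from defining the peak for a whole window, while still tracking the high end of customer demand.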

We find the growth rate that best describes the historic rate of change for each device. Depending upon which technique is more accurate for the particular device in question, we use either a linear or a non-linear prediction model.
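
As a rough sketch of choosing between a linear and a non-linear model per device, the example below fits a straight line and an exponential curve to the same history and keeps whichever has the lower error. The exponential form, the in-sample RMSE criterion, and all function names are assumptions made for illustration; the source does not say which non-linear models or accuracy measures are used.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def choose_model(history):
    """Return (name, predict) for the better-fitting of two candidate models.

    history: strictly positive utilization values, one per period.
    predict: function mapping a period index (0 = first sample) to a value.
    """
    xs = list(range(len(history)))

    # Candidate 1: linear growth.
    slope, intercept = fit_line(xs, history)

    def linear(t):
        return intercept + slope * t

    # Candidate 2: exponential growth, fitted as a line in log space.
    rate, log_start = fit_line(xs, [math.log(y) for y in history])

    def exponential(t):
        return math.exp(log_start + rate * t)

    candidates = {"linear": linear, "exponential": exponential}
    errors = {
        name: rmse(history, [model(x) for x in xs])
        for name, model in candidates.items()
    }
    best = min(errors, key=errors.get)
    return best, candidates[best]
```

Comparing in-sample error is the simplest possible criterion. Scoring each candidate on held-out history is usually a fairer test, because a flexible model can fit the past well and still extrapolate poorly.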


Next, we need to account for the process-oriented changes to the device. Once we factor these into the model, we can estimate the expected future utilization rate of that device. Because we know the maximum capacity of each device, we also know roughly how long each device has until it is fully burdened. The next step is to share these expectations with the business and to integrate them within the larger decision-making process.
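
The arithmetic behind "how long until a device is fully burdened" can be sketched for the simplest case, linear growth. Everything here is an illustrative assumption: the function name, the idea of representing a process-oriented change as a one-time utilization shift (`process_delta`), and the units.

```python
import math


def periods_until_full(current, capacity, growth_per_period, process_delta=0.0):
    """Estimate how many whole periods remain before a device reaches capacity.

    current:           present utilization, in the same units as capacity.
    capacity:          maximum the device can carry.
    growth_per_period: behavior-driven growth per period (linear model assumed).
    process_delta:     hypothetical one-time shift from a known process change,
                       such as a planned migration onto (+) or off (-) the device.

    Returns 0 if the device is already full, None if utilization is not growing.
    """
    start = current + process_delta
    if start >= capacity:
        return 0
    if growth_per_period <= 0:
        return None  # flat or shrinking utilization never reaches capacity
    return math.ceil((capacity - start) / growth_per_period)
```

For example, a device at 60 units of a 100-unit capacity that grows by 5 units per period has 8 periods left; if a planned change adds 10 units up front, that drops to 6.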


How we Integrate Forecasts with Business Processes

As previously mentioned, we try to account for process changes in the models for each device. However, there are also business-scale process changes. The biggest sources of these changes are logistics and procurement. The time it takes to install new hardware depends upon the utilization rate of our data centers, the availability of hardware from suppliers, and the workload of our operations teams. Procurement also takes time. The finance team helps investigate the capacity recommendations further and obtains business approval for sustained increases to capacity. Procurement involves two factors: optimization and integration. Optimization concerns the accuracy and strength of the models while taking the current supply into account. Integration combines current project results and read-outs with procurement-driven business decisions going forward. Both what we purchase and how we purchase it will keep changing, because the cloud industry itself is constantly changing.


To account for these process changes, we create elasticity in our models. We do this by first examining the typical time required for each stage of the capacity planning process: planning, purchasing, ordering, building, and installation. Once we know the typical lead times required for increasing the capacity of our platform, we know how far ahead we need to forecast. Next, we create the ideal forecast for this time window.
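
To make the link between lead times and forecast horizon concrete, the sketch below adds up the typical duration of each stage. It assumes the stages run one after another with no overlap, which may not match how they run in practice, and the day counts in the usage note are made up for illustration.

```python
# Stage names follow the text; treating them as strictly sequential is an assumption.
STAGES = ("planning", "purchasing", "ordering", "building", "installation")


def forecast_horizon(lead_times_days):
    """Return the total lead time in days, which is the minimum forecast horizon.

    lead_times_days: mapping of stage name to its typical duration in days.
    """
    missing = [stage for stage in STAGES if stage not in lead_times_days]
    if missing:
        raise KeyError(f"missing lead time for: {', '.join(missing)}")
    return sum(lead_times_days[stage] for stage in STAGES)
```

With hypothetical lead times of 14, 10, 21, 7, and 5 days for the five stages, the forecast has to look at least 57 days ahead, because capacity requested today cannot be in service any sooner.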


We also examine the accuracy of our forecast technique by studying the historic data for each device and examining how well our forecast would have predicted capacity needs that we have since observed. By understanding this historic performance, we can determine the best way to configure our forecasts for this duration.
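
Checking how well a forecast "would have done" is commonly done with a rolling-origin backtest: pretend the history stops at an earlier point, forecast forward, and compare with what actually happened. The sketch below shows the idea with a deliberately simple stand-in forecaster; the function names and the forecaster itself are assumptions for illustration, not the models used on the platform.

```python
def linear_forecast(train, steps_ahead):
    """Stand-in forecaster: extend the average per-period change of the window."""
    slope = (train[-1] - train[0]) / (len(train) - 1)
    return train[-1] + slope * steps_ahead


def backtest(history, horizon, min_train=3, forecaster=linear_forecast):
    """Replay history and return forecast errors (forecast minus actual).

    history:   observed utilization, one value per period.
    horizon:   how many periods ahead each forecast looks.
    min_train: smallest training window to use (must be at least 2 here).
    """
    errors = []
    for cutoff in range(min_train, len(history) - horizon + 1):
        train = history[:cutoff]                 # data known at the cutoff
        actual = history[cutoff + horizon - 1]   # value observed horizon periods later
        errors.append(forecaster(train, horizon) - actual)
    return errors


def mean_absolute_error(errors):
    """Average size of the errors, ignoring their sign."""
    return sum(abs(e) for e in errors) / len(errors)
```

Running the same backtest with different horizons or model settings and comparing the resulting errors is one way to pick a configuration for a given lead time.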

How we Handle Uncertainty in Forecasts

Another way we account for future uncertainty is by building both a business buffer and a statistical buffer into our forecasts. The business buffer comes from our platform engineers, whose domain expertise allows them to connect capacity projections with business processes. The statistical buffer is based upon business thresholds for confidence. Stakeholders tell us how likely they want the forecast to be to cover future use cases, and we add the tolerances to our forecast models needed to meet that risk tolerance. The result is a projection of growth that accounts for consumer growth, business processes, and statistical uncertainty.
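
One simple way to turn a stakeholder confidence level into a statistical buffer is to look at how far past forecasts fell short and take the shortfall at the requested percentile. The sketch below assumes such a list of historic shortfalls (actual minus forecast) is available; the function names, the nearest-rank percentile, and the additive business buffer are all illustrative assumptions.

```python
import math


def statistical_buffer(shortfalls, coverage=0.95):
    """Return the margin that would have covered `coverage` of past shortfalls.

    shortfalls: historic values of (actual - forecast); positive means the
                forecast was too low.
    coverage:   fraction of past cases the buffered forecast should have covered.
    """
    if not shortfalls:
        raise ValueError("need at least one historic shortfall")
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in (0, 1]")
    ordered = sorted(shortfalls)
    rank = max(1, math.ceil(coverage * len(ordered)))  # nearest-rank percentile
    return max(0.0, ordered[rank - 1])                 # never shrink the forecast


def buffered_forecast(point_forecast, shortfalls, coverage=0.95, business_buffer=0.0):
    """Point forecast plus the statistical buffer plus an engineer-supplied margin."""
    return point_forecast + statistical_buffer(shortfalls, coverage) + business_buffer
```

A higher coverage target produces a larger buffer, which makes the trade-off between risk tolerance and the cost of idle hardware explicit.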

By applying the techniques outlined above, we have been able to consistently and accurately forecast the capacity needs of our platform. However, nothing is ever certain. We do not want to take the volatility of the cloud industry for granted. With proper planning we can help reduce the uncertainties associated with capacity planning. While we strive to do our best from an analytical perspective, there is no substitute for the agility of platform engineers and the operations teams that ensure the continued health and stability of the cloud platform on a daily basis.

Test Drive our Platform

Designed for your business needs today and tomorrow, the CenturyLink Public Cloud is reliable, secure, robust, and global. Take a Free Trial test drive of our Cloud today! We’re a different kind of cloud provider -- let us show you why.