What is a Demand Forecast?
Demand forecasting is the process of analyzing historical data (e.g., sales volumes, seasonality, and other correlated features) to estimate the future quantity of goods demanded.
For example, a retailer could use features such as supplier ID, average past demand, and release date to predict how many units of an item will be demanded in the next week, so that stores can stock each item accordingly.
In general, a demand forecasting regression model either predicts the quantity demanded over a fixed period (e.g., the next 7 days) or takes the number of days into the future as a feature. This differs from a time series forecast, which uses only historical data to predict future sales volumes.
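To make the fixed-period framing concrete, here is a minimal sketch of how daily sales history could be turned into regression rows with a 7-day target. The function name `build_fixed_horizon_rows`, the feature names, and the synthetic history are all assumptions made for illustration; a real pipeline would add features such as supplier ID and release date.

```python
from datetime import date, timedelta


def build_fixed_horizon_rows(daily_units, horizon_days=7, lookback_days=7):
    """Turn a chronological list of (date, units) pairs into regression rows.

    Each row pairs features known on day t with the total units demanded
    over the following `horizon_days` days (the fixed-period target).
    """
    rows = []
    # Start once a full lookback window exists; stop while a full horizon remains.
    for t in range(lookback_days - 1, len(daily_units) - horizon_days):
        past = [units for _, units in daily_units[t - lookback_days + 1 : t + 1]]
        future = [units for _, units in daily_units[t + 1 : t + 1 + horizon_days]]
        rows.append(
            {
                "as_of": daily_units[t][0],
                "avg_demand_past": sum(past) / lookback_days,  # hypothetical feature
                "day_of_week": daily_units[t][0].weekday(),    # hypothetical feature
                "target_units_next_horizon": sum(future),
            }
        )
    return rows


# Synthetic history (illustrative only): 21 days of steadily rising demand.
start = date(2024, 1, 1)
history = [(start + timedelta(days=i), 10 + i) for i in range(21)]
rows = build_fixed_horizon_rows(history)
```

Each resulting row can be fed to any regression model. Supporting a variable horizon would mean adding the number of days ahead as one more feature instead of fixing it in the target.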
Why Regression over Time Series?
The benefit of building a demand forecasting regression model for fixed dates and/or a specific store is that the model can be more complex, and can therefore generate more accurate predictions than a time series model for purposes like inventory planning.
However, more complex models also carry a higher risk of accuracy degradation due to feature drift and concept drift.
Arize can help you monitor and visualize these problems quickly as they surface in your model.
Challenges With Demand Forecasting
Demand forecasting models have a high susceptibility to drift. While greater complexity and a higher number of features can increase the accuracy of a model, the attendant increase in noise sources and concept drift (where the properties of an underlying variable change unexpectedly) can create a perfect storm for model failure. For example, a sophisticated model built to predict housing prices that leverages hundreds of features might be quickly challenged by evolving home-buying behavior or regulatory changes.
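One common way to quantify drift in a single numeric feature is the population stability index (PSI), which compares the binned distribution of a baseline sample with that of a current sample. The sketch below is a generic, stdlib-only illustration and is not Arize's API; the function name, equal-width binning, and the `eps` floor for empty bins are assumptions chosen for simplicity.

```python
import math


def population_stability_index(baseline, current, n_bins=10, eps=1e-6):
    """PSI = sum over bins of (q - p) * ln(q / p).

    p and q are the shares of baseline and current values in each bin.
    Bin edges are equal-width over the baseline range (an assumption).
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # degenerate baseline: every value lands in the first bin

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the baseline range fall into the first or last bin.
            idx = min(max(idx, 0), n_bins - 1)
            counts[idx] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    p = bin_shares(baseline)
    q = bin_shares(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Synthetic data (illustrative only): the current sample is shifted upward.
baseline = [float(v) for v in range(100)]
shifted = [v + 50.0 for v in baseline]
psi_same = population_stability_index(baseline, baseline)
psi_shifted = population_stability_index(baseline, shifted)
```

A PSI of zero means the two binned distributions match. A commonly cited rule of thumb treats values above roughly 0.25 as a significant shift, but the right threshold depends on the feature and the bin count.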
Events like COVID-19 can magnify and accelerate drift. In the immediate aftermath of COVID-19, few models likely predicted the wave of mass migration in the U.S. caused by a large portion of the country’s workforce suddenly and likely permanently working from home. During outlier events like these, the magnitude of drift’s impact on regression models can be so outsized that a simpler time series model might be worth swapping in for an interim period.
Limited feature diversity makes troubleshooting difficult. ML teams building demand forecasting models often rely on features that lack specificity, making troubleshooting more difficult. In general, features that are highly specific to the problem they are meant to solve, and often numeric (e.g., average housing price, driver delay, or rating), tend to be ideal because you can transform them, normalize them, drop entries, or cap them at a certain value. These stand in contrast to categorical features like location_id (discrete, no order) and ordinal features (discrete, but often arbitrarily ordered). In retail, for instance, an ordinal feature like a “packet_size” of “medium” could mean a medium-sized backpack or a medium youth T-shirt, which makes it less useful in calculating, say, what a sudden spike in cotton prices might do to unit costs.
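As a small illustration of why specific numeric features are easier to work with, the sketch below caps a numeric feature at a chosen value and then min-max normalizes it. The function name `cap_and_scale`, the cap of 400.0, and the sample prices are hypothetical; no comparable operation is meaningful for an arbitrary identifier like location_id.

```python
def cap_and_scale(values, cap):
    """Cap each value at `cap`, then min-max scale the result to [0, 1]."""
    capped = [min(v, cap) for v in values]
    lo, hi = min(capped), max(capped)
    span = hi - lo
    if span == 0:
        # All values are identical after capping, so there is nothing to scale.
        return [0.0 for _ in capped]
    return [(v - lo) / span for v in capped]


# Hypothetical unit prices with one extreme outlier.
prices = [200.0, 250.0, 300.0, 5000.0]
scaled = cap_and_scale(prices, cap=400.0)
# scaled == [0.0, 0.25, 0.5, 1.0]
```

Without the cap, the outlier would compress the first three values to within a few hundredths of zero, which is the kind of problem that is easy to spot and fix on a numeric feature.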
Importance of Demand Forecasting
Demand forecasting is often mission-critical to operational logistics, since forecasted demand ultimately dictates how many resources are allocated to each business unit. Here are a few areas of application:
Operations Logistics: Optimizing the resources allocated or supplied to meet forecasted demand. Examples include driver allocation, retail shelf space, transport inventory space, hardware compute power, and many more.
Pricing and Strategy: Knowing future demand helps businesses decide when to release new product lines and features, and how to price goods accordingly.
Optimizing Lead Time: Using demand forecasts to optimize lead time and reduce wait time, which increases customer satisfaction and retention.