Forecasters, should you remove your outliers? Like a Facebook relationship status, it's complicated. Let's decide whether those data rebels are keepers, or if they should be replaced, ghosted, or friend-zoned.
Outliers: Friend of Machine Learning Models
For some demand planners, machine learning (ML) models are a part of their toolkit on the regular. Outliers can be a friend with benefits to those models, and those benefits include higher accuracy and less tedious maintenance of data. Unlike traditional statistical methods, ML thrives on messy data.
Those anomalous data points often contain key information: unexpected shifts in supply, responses to external market conditions, or successful promotions. When paired with complementary data streams, machine learning models excel at using those streams to explain this variation. Outliers provide an important portion of the coveted demand signal planners are looking for.
Outliers: Enemy of Traditional Statistical Algos
However, not all forecasting methods can harness outliers. For traditional statistical algos, outliers can be an enemy. Techniques like ARIMA, Holt-Winters, and even moving averages can be led astray by outliers.
For planners using traditional methods, outlier cleansing can be a helpful (and hopefully automated!) part of the process. Tools like control charts, widely used in the Six Sigma methodology, can help identify points that fall outside acceptable limits. This approach ensures the forecast isn't thrown off by special cause variation, while still feeding the common cause variation vital to preventing overfits and other statistical hubris. A common correction is to replace the outlier with a new data point on the upper or lower threshold.
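As a minimal sketch of that correction, the snippet below computes the limits of an individuals (XmR) control chart and replaces any point outside them with the nearest limit. The choice of an XmR chart, the function names, and the sales figures are assumptions made for illustration; the article does not prescribe a specific chart type.

```python
from statistics import mean

def control_limits(history):
    """Return (lower, upper) limits of an individuals (XmR) control chart.

    The limits are the mean plus or minus 2.66 times the average moving
    range, the standard XmR constant (3 / 1.128).
    """
    center = mean(history)
    moving_ranges = [abs(b - a) for a, b in zip(history, history[1:])]
    spread = 2.66 * mean(moving_ranges)
    return center - spread, center + spread

def clip_to_limits(history):
    """Replace points outside the control limits with the nearest limit."""
    lower, upper = control_limits(history)
    return [min(max(x, lower), upper) for x in history]

# Hypothetical monthly sales with one spike in the eighth period.
sales = [100, 104, 98, 102, 101, 99, 103, 240, 100, 97, 102, 101]
cleaned = clip_to_limits(sales)
```

Only the spike is moved down to the upper limit; every other point sits inside the limits and passes through unchanged, so the common cause variation is preserved.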
Note that outliers don't have to be abnormally high sales. While I like using a statistical approach for setting the top guardrail, I recommend a different "floor", one that recasts stockouts as more regular demand volumes. Treat stockouts as outliers not because of their statistical distance from the mean or median, but because they are occurrences in the historical data stream that you don't plan to see again.
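One way to picture such a floor, as a sketch only: flag the periods where a stockout occurred and lift their sales up to the median of the unflagged periods. The stockout flags, the use of the median as the floor, and the numbers are all assumptions for illustration; the right floor will depend on your data.

```python
from statistics import median

def recast_stockouts(sales, stockout_flags):
    """Lift sales in flagged stockout periods up to a typical demand level.

    The floor here is the median of non-stockout periods (an assumption).
    Periods that are not flagged are left untouched.
    """
    normal = [s for s, flagged in zip(sales, stockout_flags) if not flagged]
    floor = median(normal)
    return [max(s, floor) if flagged else s
            for s, flagged in zip(sales, stockout_flags)]

# Hypothetical weekly sales; weeks 3 and 6 were stockouts.
weekly_sales = [50, 52, 0, 48, 51, 5, 49]
was_stockout = [False, False, True, False, False, True, False]
recast = recast_stockouts(weekly_sales, was_stockout)
```

Note that the rule is driven by the stockout flag rather than by statistical distance, which matches the reasoning above: a low week without a stockout stays as it is.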
Frenemy? Use Forecast Value Add to Decide
What if you're still unsure about how to handle outliers? Maybe they are frenemies, a portmanteau of friend + enemy. Instead of relying on gut instinct or hard-and-fast rules, demand planners should take an empirical approach. Luckily, Forecast Value Add (FVA) is just such a tool for deciding whether outlier cleansing is improving your forecast. FVA measures the impact of adjustments or models on forecast accuracy by comparing performance before and after changes are made, in this case outlier cleansing. Control, meet experiment. Experiment, control.
FVA puts the science in "data scientist" by using the scientific method.
Run your forecasts with and without outlier cleansing, either once a year or once a quarter. Compare the results. You may find that for some customers, channels, or products of your business, outliers improve your model, while for others, they undermine accuracy. FVA will tell you when it's a friend, and when it's an enemy.
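The comparison can be sketched in a few lines. This example assumes a flat moving-average forecast, MAPE as the accuracy metric, and made-up numbers; the function names are hypothetical, and in practice you would substitute your own model and metric.

```python
def moving_average_forecast(history, window=3):
    """Forecast the next period as the mean of the last `window` points."""
    return sum(history[-window:]) / window

def mape(actuals, forecasts):
    """Mean absolute percentage error; assumes no actual is zero."""
    return sum(abs(a - f) / abs(a) for a, f in zip(actuals, forecasts)) / len(actuals)

def cleansing_fva(raw_history, cleansed_history, actuals, window=3):
    """Error without cleansing minus error with cleansing.

    Positive means cleansing added value; zero or negative means it did not.
    """
    horizon = len(actuals)
    raw_forecast = [moving_average_forecast(raw_history, window)] * horizon
    cleansed_forecast = [moving_average_forecast(cleansed_history, window)] * horizon
    return mape(actuals, raw_forecast) - mape(actuals, cleansed_forecast)

# Hypothetical history with a spike, the same history after cleansing,
# and the actuals observed over the next three periods.
raw = [100, 102, 98, 101, 240, 99]
cleansed = [100, 102, 98, 101, 186, 99]
holdout_actuals = [100, 103, 97]

fva = cleansing_fva(raw, cleansed, holdout_actuals)
print(f"FVA of outlier cleansing: {fva:+.1%}")
```

Running the same calculation per customer, channel, or product gives the segment-by-segment view described above, where cleansing may help in one place and hurt in another.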
If you see minimal to no improvement, don't be afraid to stop outlier cleansing. That's what FVA is for: to tell planners what not to do. Besides, you should already be monitoring the quality of your forecast outputs, so if an errant forecast result leads you back to an inputs issue, take the appropriate action then.
Two last recommendations:
1. If you do make corrections, keep both the original data points and the corrected data points.
2. Distinguish between two kinds of outliers. Some are unusual sales that really happened. Others are data processing errors that never happened at all, for instance a misplaced decimal point that inflates sales by 10X. Data processing errors should always be corrected.
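Keeping both values can be as simple as storing the correction next to the original instead of overwriting it. The record layout below is a hypothetical sketch, not a prescribed schema; the field names and the example values are assumptions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SalesPoint:
    """One historical observation, with an optional correction kept alongside."""
    period: str
    original: float
    corrected: Optional[float] = None
    reason: Optional[str] = None

    @property
    def value_for_forecast(self) -> float:
        """Use the corrected value when one exists, else the original."""
        return self.original if self.corrected is None else self.corrected

# Hypothetical data processing error: the decimal point was shifted,
# inflating sales by 10X. The original value is still retained.
point = SalesPoint(period="2024-03", original=12500.0)
point.corrected = 1250.0
point.reason = "data processing error: decimal shifted"
```

Because the original is never overwritten, you can rerun forecasts on the raw history later and repeat the with-and-without comparison.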
Conclusion
Whether outliers are friend, enemy, or frenemy depends on your forecasting method and the results you see. For machine learning models, outliers offer valuable information. For traditional statistical algos, they can be an unwelcome visitor. When unsure, take an empirical approach and choose a strategy based on how outliers affect your objective: good forecasts.
At IBP2, we provide solid advice to define your relationship with outliers. Whether you want to embrace these statistical misfits or remove them from your dataset, IBP2 provides the insights and analytics needed to make informed, data-driven decisions. With our flexible solutions, you can strike the perfect balance for your unique forecasting environment, giving you the chance to find data happiness.