Supply Chain Luck (good and bad)


It's 7:50 am and it's not going to be a good day. While reviewing your scorecard, you notice a big hit in a key service metric. Your boss has clearly noticed it too: a meeting request just hit your calendar for 8:05. Now, you did have some problems earlier in the year, but they seemed to clear up, and for the last few months you've been enjoying the rosy glow that comes from near-perfect service levels on your products. This month, sales spiked, inventory crashed to zero and you started cutting orders. People are not happy. What changed? What went wrong?

It may be that you missed a predictable shift in demand ("Promo? What promo?"), or perhaps you do not have safety-stock levels set correctly (your bad), or, and please consider this carefully, you might just have been unlucky. Now, I'm not suggesting that every nasty surprise you encounter is bad luck; neither is every bit of good news due to your utter mastery of your supply chain, but sometimes... stuff happens.

Call it what you will - luck, chance, risk, volatility, variability, uncertainty, randomness or noise - stuff happens. However good your sales forecasting may be, it will be wrong. Even with robust planning, you will occasionally have issues with supply, weather events, product quality or equipment failure. You can and should minimize it, mitigate it, expedite around it, yes, even measure it, but you cannot, ever, ever, eliminate it.

Bottom line - if, for any metric, you can't separate "unusual, but still within the expectations of the system" from "now that's really weird", you can spend a lot of time and money trying (and failing) to control it.

I came across a great example of this yesterday, looking at the results of a simulation study for a supply chain inventory/replenishment system. The chart below shows days 200 to 1000 of a simulation run.

The 4 component charts (top to bottom) show:

  • Daily demand for a product with a reasonable level of variation. This is sampled from the exact same statistical distribution throughout - those occasional spikes are purely random and within the bounds of "reasonable" for this distribution.
  • The resulting inventory. The solid blue line shows on-hand inventory; the dashed line is on-hand and on-order together. You can see that as total inventory drops below the re-order point an order is triggered, on-order inventory shoots up and some days later it is transferred into on-hand inventory. You can also see that inventory sometimes drops below the safety-stock line, sometimes stays above it, and occasionally drops to 0, at which point orders are cut.
  • Shorts, which are order-cuts due to insufficient inventory. Shorts can only happen at the end of a replenishment cycle, and in fact for most of this 800-day period there are no shorts at all. Some of them are fairly small (fewer than 5); others not so much.
  • Fill-rate, calculated on a moving 180-day window. E.g. the fill-rate metric shown on day 400 is based on performance for the prior 180 days (221 to 400). For this model, the re-order point (ROP) was set to meet a 97.5% fill-rate (in the long term) but in this chart, it's all over the place.
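For anyone who wants to play with this, here is a minimal sketch of that kind of simulation. It is not the model behind the chart: the gamma demand distribution, the 5-day lead time, the order quantity of 200 and the ROP of 60 are all assumptions I picked so that the system orders roughly every 20 days. The function names `simulate` and `rolling_fill_rate` are made up for illustration.

```python
import random
from collections import deque

def simulate(days=1000, rop=60, order_qty=200, lead_time=5,
             mean_demand=10.0, seed=1):
    """Simulate a re-order point system; return per-day (demand, shipped) lists.

    All parameter values are illustrative assumptions, not the original model.
    """
    rng = random.Random(seed)
    on_hand = rop + order_qty
    pipeline = deque()  # open orders as (arrival_day, quantity)
    demand_log, shipped_log = [], []
    for day in range(days):
        # Receive any orders due today.
        while pipeline and pipeline[0][0] <= day:
            on_hand += pipeline.popleft()[1]
        # Demand comes from the same distribution every day (assumed gamma).
        demand = round(rng.gammavariate(2.0, mean_demand / 2.0))
        shipped = min(demand, on_hand)  # unmet demand is cut, not backordered
        on_hand -= shipped
        demand_log.append(demand)
        shipped_log.append(shipped)
        # Re-order when on-hand plus on-order falls to or below the ROP.
        on_order = sum(qty for _, qty in pipeline)
        if on_hand + on_order <= rop:
            pipeline.append((day + lead_time, order_qty))
    return demand_log, shipped_log

def rolling_fill_rate(demand, shipped, window=180):
    """Fill-rate over the trailing `window` days, for each day with a full window."""
    rates = []
    for end in range(window, len(demand) + 1):
        total_demand = sum(demand[end - window:end])
        total_shipped = sum(shipped[end - window:end])
        rates.append(total_shipped / total_demand if total_demand else 1.0)
    return rates

demand, shipped = simulate()
rates = rolling_fill_rate(demand, shipped)
print(f"180-day fill-rate ranged from {min(rates):.2%} to {max(rates):.2%}")
```

Change the `seed` and run it again: the demand distribution has not moved, but the range of the 180-day metric will.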

Now, you may think 180 days is long-term, but with this system ordering approximately every 20 days, it's only had about 9 chances to test that ROP level in any given 180-day window. That's clearly not enough to get a stable metric. Note that if you average the result across the whole 1000 days, roughly 50 order-cycles, you get 98.1%. That's still over 0.5 percentage points off from what I intended and, actually, luck was on our side: it could have been much further from the "truth".

When I repeat this simulation 100 times (with different random numbers), I get a wide variety of results for the 1000-day fill-rate.

It's centered around the right place, so the good news is that my math is working, but for any 1000-day run, random variation could lead me into thinking I have a problem when the underlying system is in control and no corrective action is needed. (Anyone want to explain an average 96% fill-rate for 3 years to the boss?)
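The repeat-it-100-times experiment can be sketched in a few lines. Again, this is a stand-in and not the original model: the demand distribution, lead time, order quantity and ROP are assumed values, and `fill_rate_for_run` is a hypothetical helper, so the numbers it prints will not match the ones quoted above.

```python
import random
import statistics

def fill_rate_for_run(seed, days=1000, rop=60, order_qty=200, lead_time=5):
    """Fill-rate over one full run of an assumed re-order point system."""
    rng = random.Random(seed)
    on_hand, on_order, arrival = rop + order_qty, 0, None
    demand_total = shipped_total = 0
    for day in range(days):
        if arrival == day:  # the open order lands today
            on_hand, on_order, arrival = on_hand + on_order, 0, None
        demand = round(rng.gammavariate(2.0, 5.0))  # assumed: mean 10 per day
        shipped = min(demand, on_hand)              # the rest is cut
        on_hand -= shipped
        demand_total += demand
        shipped_total += shipped
        if arrival is None and on_hand + on_order <= rop:
            on_order, arrival = order_qty, day + lead_time
    return shipped_total / demand_total

# Same system, same settings, 100 different streams of random numbers.
rates = [fill_rate_for_run(seed) for seed in range(100)]
print(f"mean {statistics.mean(rates):.2%}, "
      f"min {min(rates):.2%}, max {max(rates):.2%}")
```

Nothing about the system changes between runs; the gap between the minimum and the maximum is pure luck.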

Dealing with uncertainty is the realm of statistics and I know that many of you do not want to go there. I do understand your pain (even if I don't feel it so much): trust me that in writing this post my largest issue has been the amount of text I have written, re-written and deleted because it contains too many statistical references.

Let's keep it simple then: generally, larger samples give you a tighter estimate of a metric than small ones do. For example, your company-wide fill-rate metric for the month is probably quite stable. The further down you drill into metrics for specific products, locations or dates, the smaller your sample and the less stable the metric becomes. Looking at an individual product for last week (or perhaps even the last 6 months) is not giving you reliable information as to whether your supply chain is under control, so try not to treat every little bit of bad news like a fire drill.
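If you would like to see the sample-size effect without any formulas, here is a small sketch. It makes a deliberately crude assumption: every order is filled independently with the same true probability of 97.5%, which is much simpler than a real replenishment system. The helper name `spread_of_measured_rate` is made up for this example.

```python
import random
import statistics

def spread_of_measured_rate(sample_size, true_rate=0.975, trials=1000, seed=0):
    """Standard deviation of the measured fill-rate across many repeated samples.

    Assumes each order is filled independently with probability `true_rate`.
    """
    rng = random.Random(seed)
    measured = []
    for _ in range(trials):
        filled = sum(rng.random() < true_rate for _ in range(sample_size))
        measured.append(filled / sample_size)
    return statistics.stdev(measured)

spreads = {n: spread_of_measured_rate(n) for n in (10, 100, 1000)}
for n, sd in spreads.items():
    print(f"sample of {n:>4} orders: measured fill-rate wobbles by about {sd:.2%}")
```

The true rate never moves, yet the small samples wobble far more than the large ones. That wobble is what you are looking at when you drill down to one product for one week.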