AI / Supply Chain · Case Study

Predictive Analytics Engine

A 200-store retail chain was losing $25M annually to inventory mismanagement — overstocking perishables that expired unsold while simultaneously running out of high-demand items. We built a predictive analytics engine that forecasts demand at the SKU-store-day level with 92% accuracy.

Key metrics:

- 34% less waste
- 92% forecast accuracy
- $8.7M annual savings
- 15 min pipeline cycle
- 1,200 SKUs optimized
The Challenge

The retailer's existing replenishment system used simple moving averages and safety stock formulas that hadn't changed in 15 years. Store managers supplemented with manual overrides based on intuition, creating inconsistent ordering patterns across locations. Perishable goods had a 23% waste rate, while stockout events were costing an estimated $12M in lost sales annually.

The data landscape was fragmented: POS transactions in one system, inventory levels in another, supplier lead times in spreadsheets, and promotional calendars in email threads. Weather data, local events, and competitive pricing — all known demand drivers — were not incorporated into any forecasting process.

[Diagram: ML data pipeline funnel. Four live data streams (POS data, weather, events, promotions) converge through feature engineering into model training (LightGBM), producing predictions that fork into order optimization and the store dashboard.]
Our Approach

We built an end-to-end ML pipeline on AWS SageMaker with Apache Airflow orchestration. The feature engineering layer integrates POS data, inventory snapshots, weather forecasts, local event calendars, promotional schedules, and historical demand patterns into a unified feature store managed with dbt and Snowflake.
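At its core, the feature-engineering layer is a join of several source feeds on a (store, SKU, day) key. The sketch below illustrates that merge step with hypothetical record shapes and field names; the production models live in dbt/Snowflake, so none of these schemas are the real ones.

```python
from datetime import date

# Illustrative source feeds, keyed the way each system naturally keys them.
# POS is per (store, sku, day); weather per (store, day); promos per (sku, day).
pos = {("S001", "SKU42", date(2024, 3, 1)): {"units_sold": 18}}
weather = {("S001", date(2024, 3, 1)): {"max_temp_c": 21.5, "precip_mm": 0.0}}
promos = {("SKU42", date(2024, 3, 1)): {"discount_pct": 10}}

def build_feature_row(store, sku, day):
    """Merge per-source records into one unified feature row.

    Missing POS data is treated as zero sales and a missing promo as no
    discount -- a simplifying assumption for this sketch.
    """
    row = {"store": store, "sku": sku, "day": day}
    row.update(pos.get((store, sku, day), {"units_sold": 0}))
    row.update(weather.get((store, day), {}))
    row.update(promos.get((sku, day), {"discount_pct": 0}))
    return row

row = build_feature_row("S001", "SKU42", date(2024, 3, 1))
print(row["units_sold"], row["max_temp_c"], row["discount_pct"])  # → 18 21.5 10
```

In the real pipeline this join happens in SQL inside the warehouse rather than in application code, which keeps the feature definitions versioned alongside the dbt models.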

The forecasting model is a gradient-boosted ensemble (LightGBM) trained at the SKU-store-day level across 1,200 SKUs and 200 stores. We chose gradient boosting over deep learning for interpretability — store managers need to understand why the system recommends specific order quantities. The model retrains daily on the latest 18 months of sales data, with the full pipeline completing in under 15 minutes.
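Headline figures like "92% forecast accuracy" are commonly reported as 1 − WMAPE (weighted mean absolute percentage error), which weights each item's error by its actual sales volume. The case study does not state its exact definition, so the sketch below assumes that convention.

```python
def forecast_accuracy(actuals, forecasts):
    """Accuracy as 1 - WMAPE: total absolute error divided by total actual
    demand, subtracted from 1. Assumed metric; the case study does not
    specify how its accuracy figure is computed."""
    total_error = sum(abs(a - f) for a, f in zip(actuals, forecasts))
    total_actual = sum(actuals)
    return 1 - total_error / total_actual

# Four SKU-store-day actuals vs. forecasts (illustrative numbers)
actuals = [120, 80, 40, 160]
forecasts = [110, 88, 42, 150]
print(round(forecast_accuracy(actuals, forecasts), 3))  # → 0.925
```

Volume weighting matters here: a plain per-SKU MAPE would let tiny-volume items with wild percentage errors dominate the score, which is misleading when order quantities scale with volume.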

The optimization layer converts demand forecasts into recommended order quantities that minimize a weighted cost function: waste cost, stockout cost, and holding cost. Each store can adjust the cost weights based on their local constraints. A Streamlit dashboard gives store managers visibility into forecasts, recommendations, and model confidence, with the ability to override with documented reasoning.
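The optimization step can be sketched as a newsvendor-style search: score each candidate order quantity against a set of demand scenarios using the weighted cost function, then pick the cheapest. The weights and scenario values below are illustrative, not the production formulation.

```python
def expected_cost(order_qty, demand_scenarios, w_waste, w_stockout, w_holding):
    """Average weighted cost of one order quantity across demand scenarios."""
    total = 0.0
    for demand in demand_scenarios:
        waste = max(order_qty - demand, 0)     # perishable units left unsold
        stockout = max(demand - order_qty, 0)  # demand that goes unmet
        holding = min(order_qty, demand)       # units carried through to sale
        total += w_waste * waste + w_stockout * stockout + w_holding * holding
    return total / len(demand_scenarios)

def recommend_order(demand_scenarios, w_waste=2.0, w_stockout=5.0, w_holding=0.1):
    """Exhaustively search candidate quantities for the lowest expected cost.

    Weights are per-store tunables in the real system; these defaults are
    made up for the example.
    """
    candidates = range(max(demand_scenarios) + 1)
    return min(candidates, key=lambda q: expected_cost(
        q, demand_scenarios, w_waste, w_stockout, w_holding))

scenarios = [8, 10, 10, 12, 14]  # e.g. samples around the point forecast
print(recommend_order(scenarios))  # → 12
```

Because stockouts are weighted more heavily than waste here, the recommendation lands above the median demand scenario; flipping the weights for a highly perishable SKU would pull it back down.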

Results & Impact

Perishable waste dropped from 23% to 15.2% — a 34% reduction. Stockout events decreased by 41%, recovering an estimated $4.9M in previously lost sales. Combined with waste reduction savings, the platform delivers $8.7M in annual value, achieving full ROI within 5 months of deployment.

Forecast accuracy at the SKU-store-day level reached 92%, up from 61% under the previous moving average system. Store manager adoption is at 87%, with override rates declining from 45% in month one to 12% in month six as trust in the model grew. The pipeline processes 240,000 daily forecasts across all store-SKU combinations in under 15 minutes.

Technology Stack
Python · PyTorch · Apache Airflow · Snowflake · dbt · Streamlit · AWS SageMaker · FastAPI