Predictive Analytics Engine
A 200-store retail chain was losing $25M annually to inventory mismanagement — overstocking perishables that expired unsold while simultaneously running out of high-demand items. We built a predictive analytics engine that forecasts demand at the SKU-store-day level with 92% accuracy.
The retailer's existing replenishment system used simple moving averages and safety stock formulas that hadn't changed in 15 years. Store managers supplemented with manual overrides based on intuition, creating inconsistent ordering patterns across locations. Perishable goods had a 23% waste rate, while stockout events were costing an estimated $12M in lost sales annually.
The data landscape was fragmented: POS transactions in one system, inventory levels in another, supplier lead times in spreadsheets, and promotional calendars in email threads. Weather data, local events, and competitive pricing — all known demand drivers — were not incorporated into any forecasting process.
We built an end-to-end ML pipeline on AWS SageMaker with Apache Airflow orchestration. The feature engineering layer integrates POS data, inventory snapshots, weather forecasts, local event calendars, promotional schedules, and historical demand patterns into a unified feature store managed with dbt and Snowflake.
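As a rough illustration of what the feature engineering layer does, the sketch below joins a few sources into one row per SKU, store, and day. It uses plain Python rather than dbt and Snowflake, and every field name and sample value here (units, temp_c, rain_mm, on_promo) is a hypothetical placeholder, not the retailer's actual schema.

```python
from datetime import date

# Hypothetical extracts from three of the source systems (made-up values).
pos_sales = [
    {"sku": "A1", "store": 17, "day": date(2024, 3, 1), "units": 42},
    {"sku": "A1", "store": 17, "day": date(2024, 3, 2), "units": 35},
]
weather = {
    (17, date(2024, 3, 1)): {"temp_c": 11.0, "rain_mm": 0.0},
    (17, date(2024, 3, 2)): {"temp_c": 8.5, "rain_mm": 4.2},
}
promos = {("A1", date(2024, 3, 2))}  # (sku, day) pairs under promotion


def build_features(pos_sales, weather, promos):
    """Return one feature row per SKU-store-day, keyed off the POS records."""
    rows = []
    for sale in pos_sales:
        # Weather is keyed by store and day; missing entries become None.
        w = weather.get((sale["store"], sale["day"]), {})
        rows.append({
            "sku": sale["sku"],
            "store": sale["store"],
            "day": sale["day"],
            "units": sale["units"],
            "temp_c": w.get("temp_c"),
            "rain_mm": w.get("rain_mm"),
            "on_promo": (sale["sku"], sale["day"]) in promos,
            "day_of_week": sale["day"].weekday(),  # Monday = 0
        })
    return rows


features = build_features(pos_sales, weather, promos)
```

Keying every source on some combination of SKU, store, and day is what lets each new demand driver be added as one more lookup rather than a new pipeline.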
The forecasting model is a gradient-boosted ensemble (LightGBM) trained at the SKU-store-day level across 1,200 SKUs and 200 stores. We chose gradient boosting over deep learning for interpretability — store managers need to understand why the system recommends specific order quantities. The model retrains daily on the latest 18 months of sales data, with the full pipeline completing in under 15 minutes.
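The daily retrain on a rolling 18-month window, and the size of the forecast grid, can be sketched as follows. This covers only the window selection, not the LightGBM training itself. The helper names (months_back, training_rows) and the row layout are assumptions made for illustration.

```python
import calendar
from datetime import date

N_SKUS = 1_200
N_STORES = 200
TRAINING_MONTHS = 18


def months_back(as_of, months):
    """Return the date `months` calendar months before `as_of`.

    The day is clamped to the last valid day of the target month.
    """
    total = as_of.year * 12 + (as_of.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(as_of.day, last_day))


def training_rows(rows, as_of):
    """Keep rows whose day falls in the 18 months before `as_of`."""
    start = months_back(as_of, TRAINING_MONTHS)
    return [r for r in rows if start <= r["day"] < as_of]


# One forecast per SKU-store pair per day.
daily_forecasts = N_SKUS * N_STORES

sample_rows = [
    {"day": date(2023, 2, 28), "units": 10},  # outside the window
    {"day": date(2023, 3, 1), "units": 12},   # first day inside the window
    {"day": date(2024, 8, 31), "units": 9},   # last day inside the window
]
window = training_rows(sample_rows, as_of=date(2024, 9, 1))
```

With 1,200 SKUs and 200 stores, daily_forecasts works out to 240,000 forecasts per day.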
The optimization layer converts demand forecasts into recommended order quantities by minimizing a weighted cost function that combines waste cost, stockout cost, and holding cost. Each store can adjust the cost weights to reflect its local constraints. A Streamlit dashboard gives store managers visibility into forecasts, recommendations, and model confidence, and lets them override a recommendation with documented reasoning.
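One way to picture the optimization step is as a search over candidate order quantities. The sketch below is a minimal version under stated assumptions: the cost is a weighted sum, demand uncertainty is a list of equally likely scenarios, waste applies to unsold units, and holding cost applies to every unit stocked. The function names and weights are hypothetical, and the production cost function may differ.

```python
def expected_cost(order_qty, demand_scenarios, on_hand,
                  w_waste, w_stockout, w_holding):
    """Average weighted cost of an order across equally likely demand scenarios."""
    available = on_hand + order_qty
    total = 0.0
    for demand in demand_scenarios:
        leftover = max(available - demand, 0)   # unsold units (assumed wasted)
        shortfall = max(demand - available, 0)  # unmet demand (stockout)
        total += w_waste * leftover + w_stockout * shortfall
    # Holding cost is charged once on all stocked units (an assumption).
    return total / len(demand_scenarios) + w_holding * available


def recommend_order(demand_scenarios, on_hand, w_waste, w_stockout, w_holding):
    """Return the order quantity with the lowest expected cost."""
    # Ordering beyond the highest scenario can only add waste and holding cost.
    max_order = max(max(demand_scenarios) - on_hand, 0)
    return min(
        range(max_order + 1),
        key=lambda q: expected_cost(q, demand_scenarios, on_hand,
                                    w_waste, w_stockout, w_holding),
    )


scenarios = [8, 10, 12, 14, 16]  # made-up demand scenarios for one SKU-store-day

# A store that fears stockouts more than waste orders more...
stockout_averse = recommend_order(scenarios, on_hand=2,
                                  w_waste=1.0, w_stockout=3.0, w_holding=0.1)
# ...while a store that fears waste more orders less.
waste_averse = recommend_order(scenarios, on_hand=2,
                               w_waste=3.0, w_stockout=1.0, w_holding=0.1)
```

With these toy inputs the stockout-averse weights recommend 12 units and the waste-averse weights recommend 8, which shows how per-store weights shift the recommendation without changing the forecast.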
Perishable waste dropped from 23% to 15.2%, a 34% relative reduction. Stockout events decreased by 41%, recovering an estimated $4.9M in previously lost sales. Combined with waste reduction savings, the platform delivers $8.7M in annual value and recouped its full investment within 5 months of deployment.
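The headline figures can be cross-checked with a few lines of arithmetic. The inputs are the numbers reported in this write-up. The waste-savings figure is only implied by the totals, not stated directly, and the check assumes recovered sales scale linearly with the drop in stockout events.

```python
# Waste rate before and after, as fractions of perishable goods.
waste_before = 0.23
waste_after = 0.152
relative_reduction = (waste_before - waste_after) / waste_before

# Lost sales from stockouts ($M per year) and the reported drop in stockout events.
stockout_loss_musd = 12.0
stockout_drop = 0.41
recovered_sales_musd = stockout_drop * stockout_loss_musd

# Total annual value ($M) minus recovered sales gives the implied waste savings.
total_value_musd = 8.7
implied_waste_savings_musd = total_value_musd - 4.9

print(f"Relative waste reduction: {relative_reduction:.1%}")
print(f"Recovered sales: ${recovered_sales_musd:.2f}M")
print(f"Implied waste savings: ${implied_waste_savings_musd:.1f}M")
```

The relative reduction rounds to 34% and the recovered sales to $4.9M, consistent with the reported results. The remaining $3.8M is the share attributable to waste reduction.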
Forecast accuracy at the SKU-store-day level reached 92%, up from 61% under the previous moving-average system. Store manager adoption is at 87%, with override rates declining from 45% in month one to 12% in month six as trust in the model grew. The pipeline processes 240,000 daily forecasts across all store-SKU combinations in under 15 minutes.
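The accuracy metric behind these percentages is not specified here. One common convention in retail forecasting, shown below purely as an assumption, is one minus the weighted absolute percentage error (WAPE). The sample numbers are invented and do not come from the project.

```python
def forecast_accuracy(actuals, forecasts):
    """Return 1 - WAPE: one minus total absolute error over total actual demand."""
    if len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must have the same length")
    total_actual = sum(actuals)
    if total_actual == 0:
        raise ValueError("accuracy is undefined when total actual demand is zero")
    abs_error = sum(abs(a - f) for a, f in zip(actuals, forecasts))
    return 1.0 - abs_error / total_actual


# Made-up daily unit sales and forecasts for a single SKU at a single store.
actuals = [10, 20, 30, 40]
forecasts = [12, 18, 33, 35]
accuracy = forecast_accuracy(actuals, forecasts)
```

Weighting by total demand keeps slow-moving SKUs with near-zero sales from dominating the score, which plain per-row percentage errors tend to do at the SKU-store-day level.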
