Domain: Aerospace MRO, engine shop for CFM56/LEAP turbofan performance restoration (€220 million yearly turnover, approx. 1,800 shop visits a year; AI rolled out since late 2025 to predict HPT module rework needs from borescope images, oil debris analysis, and in-service data)

Specific AI-enabled process: Predictive HPT Blade Rework Forecasting

The AI recommends whether the module needs full blade rework, partial rework (tips only), or none, with the goal of eliminating unnecessary shop time and expense without compromising the zero-escape target on critical parts. It went live on all CFM56/LEAP visits in Q1 2026 and initially delivered an average 18% reduction in TAT on HPT modules.

How we ensure and monitor that the process continues to deliver intended outcomes

We treat this AI-human decision loop as a live control system that we keep developing over time, not as a one-time install. The focus is on sustainable business value: TAT savings, falling cost per visit, safety, and zero quality escapes.

What we monitor (daily / weekly / monthly)

1. Leading indicators (daily dashboard – shop floor + engineering)
· Prediction accuracy of the AI vs. the actual rework result (confusion matrix updated every 50 engines).
· Override rate of AI suggestions by technicians / engineers (accept, tweak, or reject the AI recommendation).
· Confidence score variation (how often is the model <80% sure?)
· Data drift indicators: distributional shift of input variables (e.g. iron particles in oil, borescope crack density, EGT margin, and so on)

2. Lagging business outcomes (weekly review – operations + finance)
· HPT module turnaround time (TAT) variance (target <35 days).
· Rework cost per engine vs. baseline
· Escape rate / quality holds on HPT (target 0)
· Spare parts consumption vs. forecast (over/under-stocking signals)

3. Model health metrics (monthly deep dive – MBB + data team)
· Population stability index (PSI) on key inputs (>0.25 = moderate drift, >0.5 = severe).
· Calibration plot (predicted probability vs. observed rework rate)
· Feature importance drift (which inputs matter most to the model now vs. at launch)

How we react when the going gets tough

We have a three-level escalation protocol:

Level 1 – Minor drift (weekly trigger)
· Override rate >25% or confidence <75% on >20% of cases.
Response:
· Immediate feedback loop: every override by engineers requires a one-click reason (dropdown + optional voice note).
· Retrain the model on the last 100 engines plus the override justifications.
· Notify the shop team lead; this usually resolves within 1-2 weeks.

Level 2 – Business impact emerging (weekly trigger)
· TAT +3 days or rework cost +8% vs. rolling 4-week average
· OR an escape / hold on HPT (even one)
Response:
· Hold AI recommendations; return to manual disposition within 48 hours.
· Root-cause A3 with MBB: Data drift? New failure mode? Change in user behavior?
· Temporary rule: AI confidence >90% required for auto-accept
· Full model retrain + validation on a hold-out set before re-release

Level 3 – Systemic failure (monthly or immediate on escape)
· PSI >0.5 on critical inputs OR calibration slope deviates >15%
· OR sustained TAT/cost deviation >15%
Response:
· Full pause of AI in production
· Independent audit: data lineage, labeling drift, concept drift
· Notification to the regulator of any escape that occurred
· Re-baseline from scratch or switch to a fallback approach (manual disposition and the old rules)
· Post-mortem shared across sites. We’ve had one Level 3 (new low-sulfur fuel changed oil debris patterns in Q3 2026).

Practical setup we use today
· Automated alerts via Teams/Slack when a threshold is breached
· Monthly “AI Health Review” (30-min standing meeting: MBB, ops manager, data lead)
· Quarterly external benchmark against OEM data (CFM/Pratt)
· Annual review of AI usage (EASA Part-145 requirement)

Bottom line from the teardown bay

AI drift isn’t an ‘if’ but a ‘when’. In MRO, the price of slow degradation can be a long
turnaround time, excessive spares, or even a failure in service. We monitor our AI the way we would monitor an engine: routine checks every day, and grounding it completely only when we have to. The process stays alive because we never treat the model as “set and forget”.
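
For readers who want to see the monitoring logic in concrete form, here is a minimal sketch of how the PSI check and the three escalation levels described earlier could be encoded. It is an illustration under stated assumptions, not a description of any production system. The names psi, WeeklySnapshot and escalation_level are hypothetical. The thresholds are the ones quoted in the escalation protocol. Reading "+3 days" and "+8%" as inclusive bounds, and mapping a single escape or hold to Level 2, are interpretations.

```python
import math
from dataclasses import dataclass


def psi(expected, actual, eps=1e-6):
    """Population stability index between two binned distributions.

    expected: bin proportions at launch (baseline), summing to about 1.
    actual:   bin proportions in the current window, same bins.
    eps guards against log(0) when a bin is empty (an assumption of this sketch).
    """
    if len(expected) != len(actual):
        raise ValueError("expected and actual must use the same bins")
    total = 0.0
    for e, a in zip(expected, actual):
        e = max(e, eps)
        a = max(a, eps)
        total += (a - e) * math.log(a / e)
    return total


@dataclass
class WeeklySnapshot:
    """Hypothetical container for the monitored figures (all default to 0)."""
    override_rate: float = 0.0              # share of AI recommendations overridden
    low_confidence_share: float = 0.0       # share of cases with confidence < 75%
    tat_delta_days: float = 0.0             # TAT change vs. rolling 4-week average
    cost_delta_pct: float = 0.0             # rework cost change vs. rolling 4-week average
    hpt_escapes_or_holds: int = 0           # escapes or quality holds on HPT
    max_psi: float = 0.0                    # worst PSI across critical inputs
    calibration_slope_dev_pct: float = 0.0  # deviation of the calibration slope
    sustained_deviation_pct: float = 0.0    # sustained TAT/cost deviation


def escalation_level(s):
    """Return 0 (no action) or the highest escalation level that is triggered."""
    # Level 3: systemic failure
    if (s.max_psi > 0.5
            or s.calibration_slope_dev_pct > 15
            or s.sustained_deviation_pct > 15):
        return 3
    # Level 2: business impact emerging (inclusive bounds are an assumption)
    if (s.tat_delta_days >= 3
            or s.cost_delta_pct >= 8
            or s.hpt_escapes_or_holds >= 1):
        return 2
    # Level 1: minor drift
    if s.override_rate > 0.25 or s.low_confidence_share > 0.20:
        return 1
    return 0


if __name__ == "__main__":
    # Made-up bin proportions for one input, purely for illustration.
    baseline = [0.25, 0.75]
    current = [0.50, 0.50]
    drift = psi(baseline, current)
    print(f"PSI = {drift:.3f}")
    print(escalation_level(WeeklySnapshot(override_rate=0.30)))        # 1
    print(escalation_level(WeeklySnapshot(hpt_escapes_or_holds=1)))    # 2
    print(escalation_level(WeeklySnapshot(max_psi=0.6)))               # 3
```

Checking the levels from most to least severe means a week that breaches several thresholds is always reported at the highest applicable level, which matches the intent of grounding the model only when the evidence calls for it.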