Benchmark Six Sigma Forum

Leaderboard

Popular Content

Showing content with the highest reputation on 07/28/2025 in Posts

  1. Thaiyeb Hussain has provided the best answer to this question, including a relevant example. The answer from Yuvaraj is also a must-read. My 2 cents: since the output is system-generated (in most cases from a black box), I would recommend conducting Logical Validation at regular intervals to ensure that the AI brain and the human brain stay calibrated.
  2. AI systems can degrade over time, and this can impact workflows without anyone noticing. Take a customer support process as an example, where an AI model is implemented to classify and route incoming tickets. The early signs of AI degradation are:
     1. Drop in accuracy and confidence scores - The model's predictions become less accurate and carry lower confidence scores.
     2. Increase in manual effort for routing and overrides - Agents frequently correct AI decisions, and the number of escalated tickets increases.
     3. Change in input - Changes caused by new product lines or language shifts.
     4. CSAT decline - Negative feedback increases and the customer satisfaction KPI drops.
     5. Lead time increases - Model response time increases, which reduces customer satisfaction. WIP or queues grow and the throughput rate for incident resolution declines.
     6. SLA breaches start increasing - As tickets are routed incorrectly, their resolution lead time increases and FTR decreases.
     These can be monitored in the following ways:
     - Data drift detection - Alerts when the input distribution deviates from the training baseline.
     - Model performance monitoring - Measures whether confidence drops by a predefined x percent against the baseline set for system stability.
     - HIL (Human In Loop) feedback - Tracks the override rate and agent corrections, and alerts if the HIL override rate exceeds the x percent set as the baseline.
     - Business KPIs - KPIs such as CSAT, NPS and other customer feedback keep the system under monitoring.
     - Systemic testing - Periodically test known inputs and check the response, a type of poison test.
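To make the "data drift detection" point above a little more concrete, here is a minimal sketch of one way to compare the current input distribution against the training baseline. It uses the Population Stability Index (PSI), which is my own choice of statistic and not something the post prescribes. The product-line data, the `category_psi` helper and the 0.2 alert level are all assumptions for illustration.

```python
import math
from collections import Counter


def category_psi(baseline, current, eps=1e-6):
    """Population Stability Index between two samples of a categorical input."""
    base_counts, cur_counts = Counter(baseline), Counter(current)
    psi = 0.0
    for category in set(baseline) | set(current):
        # eps keeps the log finite when a category is absent from one sample.
        p = max(base_counts[category] / len(baseline), eps)
        q = max(cur_counts[category] / len(current), eps)
        psi += (q - p) * math.log(q / p)
    return psi


# Made-up data: the product line mentioned in each incoming ticket.
training_baseline = ["router"] * 50 + ["modem"] * 40 + ["cable"] * 10
last_week = ["router"] * 30 + ["modem"] * 30 + ["cable"] * 10 + ["mesh_kit"] * 30

PSI_ALERT = 0.2  # assumed alert level; tune it against your own history
psi = category_psi(training_baseline, last_week)
if psi > PSI_ALERT:
    print(f"Input drift alert: PSI={psi:.2f}")
```

A new product line that never appeared in training (here the hypothetical `mesh_kit`) dominates the score, which matches the "change in input" warning sign described in the post.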
  3. Unlike traditional systems, which often fail in a recognizable pattern, AI systems fail quietly, without displaying obvious signs of failure. Detecting AI failures requires proactive monitoring to identify when behavior became inconsistent, when results started to deviate, when the system began to drift out of compliance, and when corruption of intent started.
     Example: A multi-model AI chatbot handling critical application information.
     Warning signs of AI failure:
     - The model starts giving wrong information about applications.
     - The model starts giving low confidence on queries that it previously handled well.
     - Non-compliant data is ingested after the initial load.
     - Low confidence indicates that the prompt is becoming misaligned.
     - Escalations start to appear, or users ask for human intervention.
     - The model fails to handle queries effectively.
     - Hallucinations increase.
     - Sentiment analysis becomes poor.
     - Answers to queries degrade because users start ingesting data in a format different from the initial training data.
     - Users do not ingest new data when the system changes.
     - New policies are not ingested into the system.
     To mitigate the above signs of failure, we need to implement a monitoring strategy:
     - The most important thing to monitor is the system's performance: check the accuracy of answers and set thresholds to track performance.
     - Always have a base KPI to measure the performance.
     - Check user escalations. Review the number of escalations occurring daily, weekly, and monthly, and treat an increase of 10-15% over the baseline as the point to act.
     - Keep track of how often human intervention occurs.
     - Monitor the time it takes to obtain accurate data about applications and compare it to the baseline.
     - Check the quality of the results.
     - Prompt: check how many tokens it is using now and what the latency is.
     Risks of AI automation. AI poses unique risks in automation:
     - AI will fail without any notice.
     - Trust without validation leads to team failure.
     - An AI system will lose its learning ability when the feedback loop breaks.
  4. While businesses today want to be agile and implement several AI systems in their processes, it is also important to put due diligence into how and when an AI system can fail, how to mitigate the risks of such a failure, and how quickly the business can come back up after the failure with lessons learned. This is a key topic because an AI's failure, unlike that of other systems, is silent and confident. It doesn't raise any alarm; it simply provides a wrong assessment.
     In my view, the failure of an AI starts with the design of the AI system in the first place. Though no system is perfect by itself, a well-designed system will not only be high in cognitive prowess but also highly accurate and reliable. An AI system starts to fail when its accuracy and reliability go down. That is why a big chunk of time is spent on testing AI models before implementation. The system goes through rigorous testing with high-quality data and robust algorithms that minimize errors and biases. Though every business does this (at least on paper, sometimes), there are still chances of failure.
     The common cause of an AI failure is drift, which is nothing but the degradation of an AI system's performance over time. This happens when the data the AI system was trained on differs from the real-world data it sees in operation. Drift is of two types, data drift and concept drift. Data drift happens when the statistical properties of the input data change relative to the data sets the model was trained on. Concept drift happens when the relationship between the input data and the output variable changes. The key is to have sufficient monitoring systems in place: identify the drifts, analyze, retrain and re-induct. This should be a never-ending cycle, as the business environment is never static.
     I can think of an example where an AI system can undergo drift, and how to devise a strategy to monitor and catch the drifts early and mitigate the risks.
     Example: Many companies integrate AI with ERP systems to make use of its predictive analysis prowess. Let's say a company integrates AI with its ERP for demand forecasting, so as to optimize inventory for the predicted sales. This system has to analyze the historical sales data, seasonality, marketing promotions, supplier lead times for different supplies, etc., predict future demand for all products, and automatically create purchase orders for the raw materials required to manufacture those products. All is good until a new government regulation says that imports of specialty raw materials will attract more tax. Because of this, the business decides to source some of the key raw materials locally, where the lead time is low, instead of from the usual sister company in Germany. The AI system still looks at the previous Bill of Materials required to manufacture the same product, calculates the demand and places the order with the previous overseas supplier, which has a high lead time. As a result, the wrong supplier gets the order for the raw material, and the lead time is very high, whereas in reality it should be shorter and the cost significantly lower because the raw material would be locally sourced. This is a concept drift and an external factor omission, as the new supplier and the local sourcing decision were not accounted for by the AI system.
     This is one small example showing how an AI system can drift. In such cases it is always important to have a human element overseeing the approval of the purchase order and cancelling it or redirecting it to the appropriate supplier. The number of such manual overrides will by itself be a good indicator of how well the AI is functioning or drifting. Based on this, the business can put definitive measures in place to monitor and catch the drift early and take the necessary steps to prevent it in further scenarios.
     Businesses also use Mean Absolute Percentage Error, which measures the AI's forecast against actuals. Its stability over a period of time is a direct measure of the AI's stability. To sum up, though AI systems are becoming more powerful and autonomous, I still see the human as the master of the AI system, there to design, train, correct and improve it at all times. Without that, it will be chaos.
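For readers who want to see the Mean Absolute Percentage Error check in practice, here is a small sketch. The demand figures are made up, and `mape_is_stable` with its 5-point tolerance is a hypothetical helper of mine, not something from the post. Note that MAPE is undefined for actual values of zero, so this version skips them.

```python
def mape(actuals, forecasts):
    """Mean Absolute Percentage Error in percent; zero actuals are skipped."""
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


def mape_is_stable(baseline_mape, current_mape, tolerance_points=5.0):
    """True while the current MAPE stays within the tolerance of the baseline."""
    return current_mape - baseline_mape <= tolerance_points


# Made-up demand (units) for three products in two different periods.
baseline = mape(actuals=[100, 200, 400], forecasts=[110, 180, 400])
latest = mape(actuals=[100, 200, 400], forecasts=[150, 120, 400])
print(f"baseline MAPE {baseline:.1f}%, latest MAPE {latest:.1f}%")
print("stable" if mape_is_stable(baseline, latest) else "investigate drift")
```

Tracking this number per period, next to the count of manual purchase order overrides, gives two independent signals for the same drift.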
  5. In the Business Process Outsourcing domain, we can think of an AI-driven process used for automated email triage and classification of tickets in customer service.
     Failure Scenario: The AI reads the incoming emails and categorizes them into different buckets like billing, support, etc. It also routes the emails to the available associates based on category / topic and priority. The performance of the AI might degrade for various reasons:
     - Change in input pattern due to new terminology used by customers
     - Changes or additions in services where the model hasn't been updated with the current inputs
     - Outdated prompts or logic leading to incorrect decisions
     - Changes in the ticket integrations due to updated prioritization logic, updated fields, etc.
     Early Warning Signs: AI models might not crash immediately, but they degrade slowly with visible warnings:
     - Reduced efficiency of ticket resolution due to incorrect routing
     - Increased cases of re-assignment, with tickets shifting between teams
     - Unusual patterns of spikes or drops in categories
     - Increase in customer follow-ups or escalations due to inappropriate resolution
     Monitoring Strategies: It's always important to detect process degradation early to avoid customer impact.
     - Use statistical thresholds to validate a sample of auto-classified tickets against the results of manual classification.
     - Validate text embeddings across tickets to detect any anomalies.
     - Calibrate agents with the test cases using Attribute analysis and validate the results.
     - Compare the model's performance with an initial sandbox version to understand the difference.
     Risks with AI Automation:
     - Spikes in volume might draw focus away from categorization issues.
     - Keep an eye on people bypassing the AI's flaws without escalating them.
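The first monitoring strategy above (validating a sample of auto-classified tickets against manual classification) could look roughly like this. It is only a sketch: the 90% agreement threshold, the `agreement_report` helper and the labels are assumptions, and a real setup would also need a proper sampling plan.

```python
from collections import Counter


def agreement_report(ai_labels, manual_labels, min_agreement=0.90):
    """Compare AI categories with manual categories for the same sampled tickets."""
    if not ai_labels or len(ai_labels) != len(manual_labels):
        raise ValueError("Both label lists must be non-empty and of equal length")
    matches = sum(a == m for a, m in zip(ai_labels, manual_labels))
    agreement = matches / len(ai_labels)
    # Count (manual, ai) pairs that disagree to see where routing goes wrong.
    confusions = Counter(
        (m, a) for a, m in zip(ai_labels, manual_labels) if a != m
    )
    return {
        "agreement": agreement,
        "below_threshold": agreement < min_agreement,
        "top_confusions": confusions.most_common(3),
    }


# Made-up weekly sample of five tickets.
ai = ["billing", "support", "billing", "support", "billing"]
manual = ["billing", "support", "support", "support", "billing"]
report = agreement_report(ai, manual)
print(report)
```

The `top_confusions` list is there because a falling agreement rate alone does not say which category the model has started to misread.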
  6. Observations from AI in Our Appeals Process
     In our RCM process, we have been using an AI tool to draft appeal letters, which reduces the time needed to prepare them manually. It collects data like denial reasons, patient details, and payer rules from the available supporting documents for the first draft. When we rolled it out, it did a good job. The letters were mostly spot-on and helped us get quite a few denials overturned. But over time, we noticed a few things started to slip, and not in a way that set off alarms. Even though no errors were reported, we could tell the quality wasn’t what it used to be. A few examples:
     - The prompts feeding the tool were not updated for two quarters, so the letters felt outdated or missed key points.
     - There were changes in the denial codes or policies from payers, but the AI was not able to recognize them or respond in the right way.
     - Our Corporate Compliance teams made updates to how letters should be written, such as changes in wording or layout, but the AI continued using the older style.
     Because the system does not throw any errors, these kinds of changes quietly affect how well the AI performs. If no one’s checking, it can decline over weeks or months before it starts showing up in results.
     Signs That Something’s Off
     Here are some early signals we’ve learned to watch for:
     - Appeal overturn rates drop, even though the types of denials and the volume haven’t really changed.
     - Team members are spending more time editing or fixing the letters before they go out.
     - QA or nurse audits start flagging the same types of issues: problems with wording, missing clinical reasoning, or formatting mistakes.
     - During one of the MBRs (Monthly Business Reviews), we received comments that payers were saying the letters are not clear. Our client expressed concern over this.
     How We Stay Ahead of It
     1. Regular Reviews
     Every week, we extract a handful of AI-generated letters and assign them to our senior QA to go through manually. It’s not a big batch, but it gives us a sense of whether anything’s off. We also compare them with manually written letters to see if there’s a pattern or quality gap.
     2. What We Track
     Metric                       | When to Act                              | Action Taken
     Appeal success rate          | Drops more than 10% below recent average | Dig into specific cases, check AI inputs
     Manual edits per 100 letters | More than 15% need rework                | Review prompt logic and update if needed
     3. Spotting Data Drift
     We track changes in denial codes, document types, or shifts in payer policies. If something new starts appearing frequently, we take it as a cue to check whether the AI needs to be retrained or adjusted.
     4. Real-Time User Feedback
     We have added a simple tagging feature in the letter review system. If our team sees something off, like a wrong justification or missing details, they flag it immediately. We go through those tags monthly to spot any recurring themes and take action accordingly.
     Why AI Needs Closer Watch
     With regular systems, it’s usually easier to spot when something’s not right: you get an error message, or a result looks obviously wrong. But AI is different. Once it’s set up and passes UAT, we tend to assume it will just keep working fine. Most importantly, AI-generated content is often taken to be correct, so we stop double-checking. That is when small mistakes start to creep in. And if there is no governance, errors like outdated formats or missing key updates can cause bigger problems, especially when they affect compliance or our payer communication.
     AI is definitely useful, and it saves time, no doubt about that. But it’s not something we can leave on autopilot. A bit of regular review, some honest feedback from the people using it, and tracking a few basic metrics can really help. It does not take much, just some weekly attention to catch things early before they turn into real issues.
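The two rows of the "What We Track" table can be expressed as simple rules, as in the sketch below. Two things are my assumptions and not stated in the post: "10% below recent average" is read as a relative drop (not percentage points), and the function and parameter names are hypothetical.

```python
def review_actions(recent_success_rates, current_success_rate,
                   letters_sent, letters_reworked):
    """Return the actions from the tracking table whose trigger has fired."""
    actions = []
    recent_avg = sum(recent_success_rates) / len(recent_success_rates)
    # Assumption: "10% below recent average" means a relative drop of 10%.
    if current_success_rate < recent_avg * 0.90:
        actions.append("Dig into specific cases, check AI inputs")
    # More than 15 reworked letters per 100 sent.
    if letters_reworked / letters_sent > 0.15:
        actions.append("Review prompt logic and update if needed")
    return actions


# Made-up figures: overturn rates for the last three months, then this month.
print(review_actions([0.60, 0.62, 0.58], 0.50, letters_sent=100, letters_reworked=20))
print(review_actions([0.60, 0.62, 0.58], 0.59, letters_sent=100, letters_reworked=10))
```

Keeping the rules this explicit makes it easy to review the thresholds themselves in the same monthly cycle as the feedback tags.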
  7. As an example, consider an organization applying AI to customer support for handling complaints or queries. Over time, the solution can degrade silently for many reasons, such as changes in customer culture, customer behavior, customer skill, or the product or service. Regardless of the reason, the organization should develop warning signs that alert it when the AI solution is not working properly. Some examples of warning signs:
     1) Tracking of performance metrics: Track the metrics that were defined at the beginning, such as accuracy, speed, precision, etc. The organization should always track these metrics and take immediate action if performance shifts.
     2) Monitor data input by the customer or end user: The organization should monitor the data input by customers, ensure the data is handled consistently, and ensure the input data is correct. Statistical tools and metrics like mean, variance, and correlation can also be used.
     3) Monitoring data output: Just as the organization monitors the data input, it should also monitor the data output to ensure it is consistent, and detect any cause of degradation.
     Therefore, the organization needs to catch issues before they cause real harm, through the following:
     1) Develop automated alerts: Manual monitoring is difficult, so the organization should develop automated alerts that notify it of any drift or shift. The company can develop internal automated alerts or use an automated platform or tool available in the market.
     2) Detect abnormal or drifted data: The organization can use, for example, 8n8 to detect drifted data, so that it can take action immediately and evaluate the AI solution.
     3) Regular AI solution assessment: The organization can arrange regular meetings with all concerned to discuss and evaluate the AI solution: the data input and output, the performance, the metrics, and the whole system, to ensure its suitability.
     In many cases, the organization doesn't need to remove the AI solution. Through the above actions, we can enhance, upgrade and improve the system so that its performance always satisfies the organization's objectives and strategy.
  8. Detecting early signs of AI process failure is crucial to prevent incorrect output.
     Early Signs
     1. Operational anomalies - Infrastructure failure can degrade performance.
     2. Early warning system - Automatic triggers when performance drops below a limit, plus continuous retraining simulations to test robustness under changing conditions.
     3. Monitor data quality and input drift - A change in input data patterns can degrade performance. It's important to set alerts for missing values and outliers in real-time inputs.
     4. Proactive measures - Training the model regularly and conducting regular audits can help reduce the chances of AI process failure.
     5. Track performance continuously - A model might perform well initially but deteriorate over time. Track KPIs and monitor error rates.
     By combining these strategies, we can detect early signs of AI process failure.
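As an illustration of point 3 above (alerts for missing values and outliers in real-time inputs), here is a minimal sketch. The field name `amount`, the 3-standard-deviation rule and the `input_quality` helper are assumptions chosen for the example; the post does not specify how outliers should be defined.

```python
import statistics


def input_quality(records, field, baseline_values, z_limit=3.0):
    """Missing-value rate and outliers for one numeric field of incoming records."""
    values = [record.get(field) for record in records]
    missing_rate = sum(v is None for v in values) / len(values)
    mean = statistics.mean(baseline_values)
    stdev = statistics.stdev(baseline_values)
    # Outlier: further than z_limit baseline standard deviations from the mean.
    outliers = [
        v for v in values
        if v is not None and abs(v - mean) > z_limit * stdev
    ]
    return {"missing_rate": missing_rate, "outliers": outliers}


# Made-up baseline and a small batch of live inputs.
baseline = [10, 12, 11, 9, 10, 11, 12, 10]
live_batch = [{"amount": 11}, {"amount": None}, {"amount": 250}, {}]
print(input_quality(live_batch, "amount", baseline))
```

An alert would then fire when `missing_rate` or the number of outliers crosses whatever limit the team has agreed on.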
  9. AI systems may degrade quietly. We have a customer support system that uses a bot for the initial chat communication. Degradation may show up in the following ways, which help you identify whether the AI is working up to expectations:
     1. Drop in customer score / feedback score
     2. Increase in call logs for simple queries that could have been managed by the AI
     3. Change in human behaviour, such as the use of new jargon
     4. Escalations raised because the AI tool gave misleading answers on certain queries
     Proactive measures help to avoid the degradation of an AI application:
     1. Testing AI - Shadow Mode Testing: Run new versions alongside the live one and compare outcomes before full deployment.
     2. Incorporate new behaviour - Rolling Retraining and Prompt Refreshes: Regularly update prompts and models to incorporate fresh vocabulary and changing user behaviours.
     3. Frequent Audit - Feedback Loop Integration: Incorporate human-in-the-loop validation in edge cases and retrain models using flagged interactions.
     4. Dynamic Thresholding: Adjust thresholds based on seasonal trends or campaign pushes to avoid false positives.
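A rough sketch of the shadow mode testing idea above: the candidate version sees the same inputs as the live one, only the live answer is served, and disagreements are logged for review. The two keyword-based "models" are toy stand-ins I made up; in practice they would be calls to the live and candidate bots.

```python
def shadow_compare(inputs, live_model, candidate_model):
    """Run both models on the same inputs and collect where they disagree."""
    disagreements = []
    for text in inputs:
        live_out = live_model(text)         # this answer is served to the user
        shadow_out = candidate_model(text)  # this answer is only logged
        if live_out != shadow_out:
            disagreements.append((text, live_out, shadow_out))
    return len(disagreements) / len(inputs), disagreements


def live_router(text):
    """Toy stand-in for the live bot's intent routing."""
    return "billing" if "refund" in text else "support"


def candidate_router(text):
    """Toy stand-in for the new version, which also knows the word 'invoice'."""
    return "billing" if "refund" in text or "invoice" in text else "support"


chats = ["I need a refund", "Where is my invoice?", "App keeps crashing", "Reset my password"]
rate, diffs = shadow_compare(chats, live_router, candidate_router)
print(f"disagreement rate: {rate:.0%}")
for text, live_out, shadow_out in diffs:
    print(f"  {text!r}: live={live_out}, candidate={shadow_out}")
```

The disagreement list is what a human reviewer would go through before deciding on full deployment.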
  10. A periodic update and maintenance strategy becomes an integral part of the implementation and control strategy when implementing any AI solution. Without a monitoring strategy, accuracy and compliance can suffer. Let's consider the failure scenario below from the Accounts Payable process: an AI solution is used to extract and validate invoice data like vendor name, amount, date, etc. from email and PDF copies and save those copies in a designated OneDrive folder. Over a period of time, performance may degrade due to various factors like:
      - Prompt updates not considered for new invoice templates, format changes, or changes in business requirements
      - Input scenarios not updated periodically
      - Changes in workflow, like changes in approval flow and vendor policies
      Early Warning Signs:
      - Increase in manual corrections by the AP staff
      - Payment delays
      - Vendor complaints
      - Mismatch between PO and invoice data
      - Drop rate / error rate in processing
      Monitoring:
      - Set an alert if the drop rate or error rate changes by 10%, and review the scenarios
      - Track manual corrections using a tracker
      - Monitor trends
      - Monitor complaints from vendors and AP staff
This leaderboard is set to Kolkata/GMT+05:30
