Benchmark Six Sigma Forum


When Should People Trust an AI’s Recommendation — and When Should They Override It?


Q843

In many processes, AI systems provide recommendations rather than final decisions. Over time, teams may either trust the AI too much or ignore it altogether, both of which can hurt outcomes. Think of a specific process in your domain where AI provides guidance, predictions, or recommendations. How should people decide when to accept the AI’s recommendation and when to override it? What cues, safeguards, or rules of thumb would help maintain the right balance?

⚠️ Any answer that is generic or does not connect with a specific process will not be approved.

🏆 The best answer will be selected on the basis of:

  • Relevance of the chosen process

  • Thoughtfulness in defining trust vs. override conditions

  • Practicality of the decision guidelines


Solved by Ankit Kulkarni

When considering an AI's recommendation, I see three options: fully accept it, partially accept it, or completely disregard it. Before adopting any AI recommendation, I would cross-verify the facts presented to me. The primary rule is to accept only those outputs that come with facts or proof that can be verified.

If I split my process into a 20-60-20 weighting across the initial, middle, and final stages, I would recommend that human intervention dominate the initial 20 and the final 20. This gives a clear direction in the initial stages and an accurate decision with a human element in the final stages. The weighting can be adjusted according to the severity of the outcome.

Domain: ITIL, cloud, digital services, cybersecurity and consulting based out of 26 countries

I use AI in my ITIL work as an assistant that supports me but does not replace my judgment. I trust the AI when the situation is stable, normal, and based on repetitive issues or good data. For example, if the AI looks at past tickets and tells me an incident is likely related to a network outage, and the symptoms match what I have seen before, I usually follow it. On routine tasks such as suggesting the correct ticket category, predicting which team should handle the issue, or recommending a known solution, AI is usually very reliable because these tasks follow patterns from history. Another example is problem management. Sometimes the AI analyzes logs and says most similar cases were caused by a memory leak. If the logs support this, I trust the AI and investigate that direction first. It saves time compared to checking everything manually.

But there are times when I need to override the AI, for example when the system is business critical, like a payment service or a customer-facing portal, or when the industry would be critically impacted, like railways or airlines. If the AI suggests restarting a service during business hours, I may ignore that advice because it could cause downtime. My experience tells me that even if the AI thinks it is safe, a small mistake can trigger major incidents and escalations. I also override the AI when the situation is unusual or new. Once a ticket came in with very strange symptoms. The AI predicted a common root cause, but I knew from experience that this did not make sense because the system had recently been updated. The AI did not know about the update, so I relied on my judgment; we spoke to the SMEs and the process owners and later found the real issue.

My simple rule is this: I trust AI when the risk is low, the task is routine, and the outcome is familiar. I override it when the risk is high, the data is unclear, or we have no SMEs, process experts, process owners, or delivery leads to support and validate the case. AI helps me finish work faster, but I take responsibility for the important decisions. Moreover, we are not allowed to upload real data and numbers directly to the AI tool, so the answers are often generic. With the current dynamics of work, I see AI as a great helping hand in the near future.

Domain--> SAP BW Finance Reporting - Daily Revenue and Margin Decision Support.

Overview--> Here we discuss when Finance Controllers and Sales Leaders can rely on AI recommendations in SAP BW reports and when AI should be overridden by human decisions.

Business Context--> The business reviews daily margin and revenue using BW reports.

These reports are used by Finance controllers and Sales Leaders to check

·       The daily revenues

·       Any Abnormal discounts

·       Corrective Actions

An AI layer is embedded into SAP BW dashboards to

·       Detect abnormalities in revenue or margin

·       Predict short term trends

·       Recommend actions

The recommendations provided by AI directly affect business decisions, so it is important to know when to rely on AI and when not to.

When to Trust AI

You should trust AI suggestions when:

·       SAP BW data is clean and complete.

·       The pattern is consistent and based on solid historical data.

·       The AI's recommendation matches past business behaviour.

·       Previous AI alerts of the same type were accurate.

 

When to Override AI

·       There are business events not yet captured in data like new contracts.

·       Recent SAP/BW configuration changes may affect numbers.

·       The AI explanation is unclear or incomplete.

·       The decision has a large financial impact.

 

Override Rules are

·       Document the reason for overriding.

·       Overrides above a financial limit need manager approval.

·       Overrides are logged and reviewed weekly.
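As a rough illustration, the three override rules above could be captured in a small log structure like the one below. This is only a sketch: the field names, the `APPROVAL_LIMIT` value, and the `record_override` helper are assumptions made for this example and are not part of SAP BW.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date

# Hypothetical financial limit above which a manager must approve the override.
APPROVAL_LIMIT = 50_000.0


@dataclass
class Override:
    """One human override of an AI recommendation."""
    user: str
    recommendation: str
    reason: str
    financial_impact: float
    logged_on: date = field(default_factory=date.today)

    def __post_init__(self) -> None:
        # Rule 1: every override must document its reason.
        if not self.reason.strip():
            raise ValueError("An override needs a documented reason.")

    @property
    def needs_manager_approval(self) -> bool:
        # Rule 2: overrides above the financial limit need manager approval.
        return self.financial_impact > APPROVAL_LIMIT


# Rule 3: overrides are kept in a log so they can be reviewed weekly.
override_log: list[Override] = []


def record_override(entry: Override) -> Override:
    """Append the override to the log and return it."""
    override_log.append(entry)
    return entry
```

The weekly review can then simply walk through `override_log` and pick out the entries where `needs_manager_approval` is true.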

 

How Overrides can improve AI

·       Feeds into model training.

·       Adjusts rules and thresholds.

·       Reduces false alerts.

Governance Model

·       AI gives the first recommendation and humans validate exceptions.

·       Define clear rules when AI can act automatically.

·       Have a weekly review of the decisions and overrides.

·       AI should clearly explain what changed and why.

Business Benefits

·       Quick daily decision making

·       More trust in SAP BW reports

·       Reduced manual analysis effort

·       Improved collaboration between Finance and Sales

Summary

Use AI for repetitive, data-driven patterns in SAP BW reporting. Override AI when a high-impact decision needs to be taken. The goal is not blind trust in AI but a balance between AI and people.

We need people to trust their expertise and judgement while working with AI tools. AI provides a lot of information in a digestible format, which makes the content look easy to believe. We should look for:

  • Historical references - whether data or information provided by the AI has proven valuable in the past.

  • Human in the loop - Cross-check the information with subject matter experts or other relevant tools. If the data provided is not of good quality or has gaps, review the output before putting it to use. In scenarios where historical data is not available, AI models will not work accurately.

 

If we believe the data provided to the AI was good and the output is in line with what we expected, then we can use the output.

Narratives + Berkie AI MarketPlace (Internal AI)

  1. Berkie pulls data where possible and drafts/reshapes sections (Property Overview, Market Overview, Borrower & Sponsor Overview, Maps & Aerials, Strengths & Weaknesses).

  2. The analyst has the final say on what to keep, change or delete before anything goes to Mortgage Bankers and Underwriters.

Here AI is clearly recommending content and structure, not deciding. The risk is that:

  1. Some analysts might over-trust the AI (pass AI text through with minimal checks).

  2. Others might ignore the AI (rewrite everything) and lose the time savings we are aiming for.

Below is how we defined when to accept vs. when to override, and the practical rules of thumb to keep the balance.

AI assisted Narrative drafting

For clarity, here is exactly what AI is recommending:

  1. Data interpretation and summarisation: From property details, sales/loan data, crime reports, comps, market stats and sponsor info, AI proposes a property overview, a market overview, a sponsor description, and initial strengths and weaknesses wording (based on the inputs shared by analysts).

  2. Language polishing and standardisation: AI rewrites the analyst's bullet points into narrative form and enforces US English, clarity, tone, and structure.

  3. Section organisation & emphasis: AI may choose what to lead with, how much weight to give to crime vs. amenities, etc., based on the prompt.

The AI recommendation is: "Given the provided inputs and prompts, this is the narrative you can send to the Mortgage Banker."

The human decision should be: "Is this narrative accurate, appropriate, and aligned with banker expectations for this specific deal?"

Decide when to accept and when to override

We trained analysts to make the accept/override decisions in three simple phases.

  1. Facts & input checks: "Is everything that looks factual actually true?"

    Default stance: NEVER blindly accept AI's factual statements. Verify first.

    Accept AI's wording when every fact and statement is directly traceable to Berkie data, analyst-provided notes, or attached reports (crime, market study, rent roll, etc.). In that case AI is merely rephrasing things you have already confirmed (e.g. summarising a crime report the analyst has already read).

    Override or heavily edit when:

    1. AI infers anything beyond the input, e.g. a) "The sponsor is highly experienced" when the input only says "2 prior deals"; b) "The area is considered safe" based on limited or ambiguous crime info; c) "The market is booming" without strong support in the market data.

    2. Numbers are rounded or changed, or additional figures appear that are not in the reports.

    3. AI uses vague, unanchored claims such as "strong demographics" or "high demand" without references.

    Rule of thumb: If we cannot pinpoint the exact source, we do not accept the statement as-is. Either adjust it to match what we can back up with a source, or remove it.

  2. Emphasis & framing: "Is the story this draft tells right for this deal?"

    Here the AI is recommending how the deal is positioned, not just the facts.

    Accept AI's framing when a) the draft matches the analyst's judgement of the deal's story: for a strong deal, it should lead with genuine strengths and mention balanced risks; for a challenging deal, it should acknowledge core risks clearly but professionally; b) the ordering feels right: high-impact positives are clearly highlighted, and major risks are not buried at the bottom or softened to the point of misleading.

    Override and reshape when a) the AI is too optimistic relative to the risk profile (e.g. crime or sponsor concerns are downplayed); b) important banker-specific angles are missing, for example the MB has explicitly asked to stress the sponsor's track record or focus on a particular market risk; c) you read the narrative and feel, "If I were the banker, this would give me the wrong impression."

    Rule of thumb: Ask, "If the banker reads the page only once, would they walk away with the right high-level story?" If the answer is anything other than yes, we must override AI's framing in that section.

  3. Language, style and efficiency: "Is the AI doing better than I would manually?"

    Here AI is at its strongest, and we want to avoid overriding out of habit.

    1. Accept AI's language when a) we have already validated the facts and framing (phases 1 and 2); b) the text uses US English, is grammatically correct and clear, and avoids redundancy and overly salesy language; c) our edits would be purely minor preference tweaks (e.g. swapping a word, reordering a clause).

    2. Override or partially edit when a) the tone does not match Berkadia style, e.g. too casual, too promotional or too negative; b) there are repeated phrases or clunky sentences that could confuse a banker; c) the sentence would be flagged by the error matrix for language/format.

    Rule of thumb: Once facts and framing are correct, bias toward accepting the AI's language unless it clearly violates style guidelines or would confuse the reader. Don't rewrite just because we would phrase it differently.
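One part of the phase 1 check, spotting figures in the draft that do not appear in any source, lends itself to a simple automated aid. The sketch below is an assumption about how such a helper might look, not a description of the Berkie tool: the function names are hypothetical, and it only compares numeric tokens as exact strings, so a rounded figure is flagged just like an invented one.

```python
from __future__ import annotations

import re

# Matches integers and decimals, optionally with thousands separators.
NUMBER = re.compile(r"\d[\d,]*(?:\.\d+)?")


def extract_numbers(text: str) -> set[str]:
    """Return the numeric tokens in text, with thousands separators removed."""
    return {m.group().replace(",", "") for m in NUMBER.finditer(text)}


def unsupported_numbers(draft: str, sources: list[str]) -> set[str]:
    """Numbers that appear in the AI draft but in none of the source texts."""
    supported: set[str] = set()
    for source in sources:
        supported |= extract_numbers(source)
    return extract_numbers(draft) - supported
```

For example, `unsupported_numbers("The property has 120 units at 95% occupancy.", ["Rent roll: 118 units, occupancy 95%"])` returns `{"120"}`, which tells the analyst exactly which figure to trace back or remove. It does not replace reading the draft; it only narrows down where to look.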

Safeguards to prevent over trust

To stop analysts from rubber-stamping AI output, we must have:

  1. Mandatory review checkpoints in the tool. Before marking a draft as "ready for MB review", analysts should check boxes like a) "I have verified the factual accuracy of the AI-generated section" and b) "I have ensured that key risks are neither omitted nor downplayed." This keeps responsibility clearly with the analyst.

  2. System warnings for high-risk sections. We flagged certain sections as "always require extra check": a) sponsor description (risk of overclaiming experience or financial strength); b) crime/safety commentary; c) any inference-heavy market commentary. In the UI, these sections are a) highlighted and b) carry a small message: "High judgement area - verify carefully, do not rely on AI alone."

  3. Random sampling and feedback. Once a week, a random request from each analyst is reviewed to see a) where analysts accepted AI text verbatim and b) whether those sentences were indeed fully supported by sources. If patterns of over-trust are detected during audits, we use them in training and, if necessary, tweak the prompts to be more conservative.
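The weekly sampling step in the third safeguard could be sketched as follows. The function name, the input shape (analyst name mapped to that week's request IDs), and the optional `seed` parameter are assumptions made for illustration.

```python
from __future__ import annotations

import random
from typing import Optional


def weekly_audit_sample(requests_by_analyst: dict[str, list[str]],
                        seed: Optional[int] = None) -> dict[str, str]:
    """Pick one random request per analyst for the weekly review."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return {
        analyst: rng.choice(requests)
        for analyst, requests in requests_by_analyst.items()
        if requests  # skip analysts with no requests this week
    }
```

Passing a seed is optional; recording it alongside the audit makes it possible to show later that the sample was drawn at random rather than hand-picked.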

This way, in Narratives + Berkie MarketPlace (the in-house AI of Berkadia):

  1. Analysts know exactly when to trust AI (polishing, structuring, supported summaries).

  2. They know exactly when to override (judgement-heavy areas, sponsor risk, crime, deal story).

  3. And we keep a practical balance between quality, speed, and human responsibility, instead of drifting towards blind trust or total disregard of the AI.

In my experience and in the few AI projects I have worked on, AI tools are very good at summarising content, and the summary is agreeable most of the time. If a process follows a checklist used by humans, AI can follow it too. Such processes are usually rule-based, follow repeatable patterns, and have structured inputs.

In my personal experience, we implemented an AI solution integrated with APIs, in which the AI extracts data from documents and feeds it into a workflow tool for further processing. Before deploying this solution, we had to perform extensive testing by uploading documents in various formats and refining the prompt until it captured accurate data. Hence, it is imperative to provide AI with large, clean data sets, set accurate rules, and test the output across multiple scenarios in order to develop trust in the AI's output.

Domain : Manufacturing-Oils and Gases

Context : In an air separation unit, the high-capacity compressor and high-capacity turbine are very sensitive equipment operating at very high pressures, speeds, temperatures and expansion ratios. The key is to operate this high-precision process stably, absorbing all external and in-process noise, and to deliver consistent output parameter values so that the process requirement is met. Unless this condition is met, process control is always unpredictable and quality varies.

Intent :

The intent was to build an Artificial Intelligence predictive and control model that works from the start of the process and then maintains stable end-to-end process parameters, leading to better temperature and flow distribution and pressure ratios, to attain the desired cryogenic product output.

It was a clear business decision and direction to deploy a BB project to first achieve a consistent sigma value greater than or equal to 3.0, and later to review further improvement.

How the AI Model was built and what considerations were made :

AI has emerged as an assistant, guide and consultant that reviews the current operating conditions of the process based on real online data, analyses them in real time within a few seconds, and makes a suitable decision to apply a process bias so that the process output reflects the intended output.

To build the operators' and process operation team's trust in the AI model, we involved them in the design, considered all technical details and design conditions of the air compressor and turbine, set up communication flow to end-to-end stakeholders, and took on board all suggestions that called for inclusion in the AI model. Operators and process owners are the face of the process and close to reality, so they know the process very well.

When Should People ''Trust'' an AI’s Recommendation :

I will give a real example : The operators and shop-floor people are the owners of the process. They believe the AI model only when it delivers the results they expect for their process objectives and for the centre-line management KPI given for each parameter.

First Confidence with Simulation : With this in mind, we engaged the shop-floor team and operators in design and development, so that they could see how the model was built and we could earn their trust first, incorporating their suggestions. The same operators and process owners then participated in the simulation of the AI model. They ran it themselves, saw the output, and gave feedback for corrections and fine-tuning; many times they also accepted the results as they were.

''First Trust'' after 2nd-Simulation & Dry Run :

The logic and model were corrected based on an end-to-end review, and the operators and process owners were asked to perform the simulation once again in the presence of technical experts. The operators and process owners found the results favourable.

''Evident proof of trust'' on AI Model during Commissioning and Go-Live & Post Go-Live :

The plant, air compressor and multiple turbines were started and taken in line with the AI model predictive and control system. The results were favourable: the expected, desirable outcome with stable process parameters within control limits. We observed the simulation nodes and the possible deviations that would lead to failure, along with the bias, took control to verify the observed deviations, and optimized further for consistent stability of the parameters.

Positive Business outcome :

For Air Compressor : Achieved a process capability of 1.25 with a sigma value of 3.75.

For Turbine : Achieved a process capability of 1.42 with a sigma value of 4.26.

Customer satisfaction from improved product quality and consistency in online supply to the customer.

Savings in power consumption.

Savings from reduced idle running and deterioration of equipment.

Realization of product within a few hours of factory start-up.

Measures taken to sustain the developed Trust :

Regular Audits & AI performance review Documentation until complete sustenance

  • System alarm tracking/decision lists of shifts

  • MOC, Management of Change method implementation

  • Clear accountability (list of monitors, approver, in charges, especially on Logical changes)

  • Review for bias, fairness, and Legal & regulatory compliance

  • Maintain deviation logbooks for each parameter, failures against the target values

AI Model performance review, correction, maintenance  and development  :

Review the performance of the AI system on a continuous basis with the team and adapt the model to new dynamics as needed. Take all the learnings from the feedback given by system alarms and by people.

Training and communication

Ensure each and every change in the model is communicated to the end-to-end team members and provide full training on the actions to take. For example, if the AI model is to be overridden by manual control, or vice versa from manual control back to the AI model, ensure hands-on experience is provided.

2nd part of the Question :  When should the operator and process owners override the AI Model?

The AI model does not provide a permanent fix; it is not a solution or application immune to external or internal factors. With this in mind, we designed an "alarm" system in which the AI predicts a deviation that is outside its control and informs the user to take the necessary action, either online with the AI or by switching the AI to OFF mode and overriding manually to control the process until the external or internal factors are nullified. The AI model is brought back in line once the noise factor has really disappeared.

One example of an external noise factor for which the AI model has to be taken off and manual human override has to happen:

The AI model cannot accommodate dynamic downstream or upstream process variation caused by sudden pressure fluctuation from a control valve malfunction or by power frequency variation. A surge in power frequency or power factor leads to a sudden dip in compressor pressure and RPM, and in turbine RPM and expansion ratio.

If the process is not brought under control in such a situation, it will destabilise the whole plant, and the plant will shut down on various out-of-control settings. It is therefore necessary to take the process out of AI predictive control, operate manually to bring the process under control, and hand control back to the AI model after manual stabilisation.

Conclusion :

AI has emerged as an assistant, guide and consultant that reviews the current operating conditions of the process based on real online data, analyses them in real time within a few seconds, and makes a suitable decision to apply a process bias so that the process output reflects the intended output.

But the AI model does not provide a permanent fix; it is not a solution or application immune to external or internal factors, and it is not something to be installed or invested in once and then forgotten. It has to be tracked and treated as an evolving element within a process of continuous improvement, so that the system does not fall behind as the AI solution keeps moving toward new horizons.

As additional details : how the above AI-enabled model was developed, and the major steps followed :

Risk & Bias : The whole process was studied for severity of failure of the air compressor and multiple turbines, occurrence, and detection, especially detection failure at each step of the process and at each instrument. While incorporating the logic for the predictive model, bias limits were identified, and precision modelling was used to arrive at the accuracy of the flow meters and control valves and at the power and power factor tuning for ramp-up and ramp-down.

Model definition : The AI predictive model was developed by considering historical manual operations data and risk thresholds, keeping a continuous, sustainable focus on model metrics, logical validation, and establishing the stability and calibration of all instruments and analysers.

Simulation & Dry Run : The developed model was tested in a dry-run simulation for the air compressor and multiple turbines to see whether the model was a good fit and where it needed correction and fine-tuning for the desired output. The simulation run was executed to identify all the predictive proposals, based on assumptions for the defined output of each process parameter: power settings, pressure outputs, temperatures, turbine speeds, inlet and outlet pressures, valves, pressure relief valves, and sensors with reduced bias.

Process conditions of the air compressor and multistage turbines and the instrumentation outputs were measured and noted, along with temperature and pressure requirements at each process/instrument output; the logic was corrected, replaced, and fine-tuned for the desired output.

Commissioning and Go-Live & Intended sustenance over time : The plant was taken in line with the AI model predictive and control system. We observed the simulation nodes and the possible deviations that would lead to failure, along with the bias, took control to verify the observed deviations, and optimized further for consistent stability of the parameters of the air compressor and multiple turbines. The air compressor achieved a process capability of 1.25 with a sigma value of 3.75, and the turbine parameters achieved a process capability of 1.42 with a sigma value of 4.26.
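The reported sigma values are consistent with the common convention that the sigma level is three times the process capability index. The post does not say which index (Cp or Cpk) was used or whether any shift was applied, so the short check below simply assumes that convention; the function name is hypothetical.

```python
def sigma_level(capability_index: float) -> float:
    """Sigma level under the assumed convention: 3 x capability index."""
    return 3.0 * capability_index


TARGET_SIGMA = 3.0  # first-level target stated in the post

for name, capability in [("Air compressor", 1.25), ("Turbine", 1.42)]:
    sigma = sigma_level(capability)
    status = "meets" if sigma >= TARGET_SIGMA else "misses"
    print(f"{name}: capability {capability:.2f} -> sigma {sigma:.2f} ({status} the target)")
```

Under this assumption, 1.25 gives 3.75 and 1.42 gives 4.26, matching the figures above, and both exceed the first-level target of 3.0.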

The AI we use in my organisation mainly reports a recall rate, that is, how likely the model is to catch what a human would have found to be true had a human carried out the same exercise. Model monitoring is a critical process, and it helps us understand whether the AI is working as expected.

The models are trained over a period of time on baseline data. With larger data sets, the model's recall rate will be more than 99%.

The other metric we use is a confidence level of 95%, meaning the model is allowed to make 5% errors. These errors are manually investigated, and the model is retrained on the basis of the results of these failures.

This is how the model is groomed on a day-to-day basis. To summarise:

  1. Capture metrics in a Tableau/dashboard view and read the recall rate and confidence levels.

  2. Tune the recall rate to 99% and confidence to 95%.

  3. If confidence falls below the expected level, manually investigate the errors and retrain the model on the outcomes of the manual investigation.

  4. This is how the model gets a new baseline.
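The monitoring loop summarised above can be sketched as a simple threshold check. The code reads the post's "confidence level of 95%" as a 5% error budget, which is an interpretation; the function names and the count-based inputs are assumptions for this example.

```python
RECALL_TARGET = 0.99   # from the post
MAX_ERROR_RATE = 0.05  # from the post: the model is "allowed to make 5% errors"


def recall(true_positives: int, false_negatives: int) -> float:
    """Share of truly positive cases that the model caught."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("No positive cases to measure recall on.")
    return true_positives / total


def needs_investigation(true_positives: int, false_negatives: int,
                        errors: int, predictions: int) -> bool:
    """True when metrics miss their targets and errors should be reviewed manually."""
    if predictions == 0:
        raise ValueError("No predictions to evaluate.")
    error_rate = errors / predictions
    return (recall(true_positives, false_negatives) < RECALL_TARGET
            or error_rate > MAX_ERROR_RATE)
```

When `needs_investigation` returns true, the errors go to manual investigation and the outcomes feed the next retraining run, which sets the new baseline.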


Process Context

My team also manages the central master data management for 50+ plants today, and this will grow to 70+ plants by 2027. The entire fleet data management is handled by two people in my team.

Every time we commission or acquire a new plant, we need to align its material master with our fleet database, to avoid duplication, planning errors, and wrong spares being introduced into SAP.

In a typical post-commissioning or acquisition exercise, we review a minimum of 5,000 incoming material records against an existing fleet master of 345,000 items.

Practically for my team, each item review takes at least 6 minutes without AI.

The AI-enabled Process

To solve this, I built a Python + AI solution using a MiniLM semantic model, combined with rule based checks.

The program classifies each incoming item into one of three categories. Auto: a high-confidence match, mapped and uploaded directly into SAP. Review: an ambiguous match, reviewed by the master data team. Reject: no valid match; the program generates a new master data creation template for my team to load directly into SAP.

As you can see, the AI does not create master data blindly; it recommends, and the team decides.

When We Trust The AI

I defined clear rules after testing the model for almost 10 days with millions of lines. An item goes to Auto when semantic similarity is high and critical identifiers (model number, size, rating) match. The program also checks that descriptions and attributes are complete and consistent. One more rule I set up restricts Auto to standard, low-risk categories and excludes verified MRP items. Items meeting these rules flow straight into the Auto category and are uploaded without manual touch.

When We Override The AI

The team deliberately reviews an item when similarity scores are close across multiple candidates, or when technical digits conflict even if text similarity is high. We also check whether the item is maintenance critical or safety critical, and we review items with poor descriptions as well.

In all such cases, the team's priority is correctness, not speed.
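The trust and override rules above can be sketched as a small triage function. This is an illustration only, not the actual program: the thresholds are made-up values (the post does not state them), the similarity scores are assumed to have been computed already by the semantic model, and the category and MRP rules are folded into a single `critical` flag for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical thresholds; the post does not give the real values.
AUTO_MIN_SIMILARITY = 0.90
REJECT_MAX_SIMILARITY = 0.60
MIN_MARGIN = 0.05  # required gap between the best and second-best candidate


@dataclass
class Candidate:
    similarity: float        # semantic similarity to the incoming item, 0..1
    identifiers_match: bool  # model number, size and rating all agree


def triage(candidates: list[Candidate], critical: bool,
           description_complete: bool) -> str:
    """Return "Auto", "Review" or "Reject" for one incoming material record."""
    if not candidates:
        return "Reject"
    ranked = sorted(candidates, key=lambda c: c.similarity, reverse=True)
    best = ranked[0]
    if best.similarity < REJECT_MAX_SIMILARITY:
        return "Reject"  # no valid match: create new master data
    runner_up = ranked[1].similarity if len(ranked) > 1 else 0.0
    clear_winner = best.similarity - runner_up >= MIN_MARGIN
    if (best.similarity >= AUTO_MIN_SIMILARITY and best.identifiers_match
            and clear_winner and description_complete and not critical):
        return "Auto"
    return "Review"  # anything ambiguous or high-risk goes to the team
```

The design choice worth noting is that Auto requires every condition to hold at once, while a single doubt (a close runner-up, conflicting identifiers, a critical item, a poor description) is enough to send the item to Review.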

 

Safeguards That Keep The Balance

We have built simple controls to avoid both blind trust and excessive overrides: strict thresholds for Auto classification, mandatory team review for all Review cases, spot audits of Auto mappings, and tracking and analysis of override patterns to improve the program. We also have clear ownership: AI suggests, the team decides.

Impact In Real Numbers

With this program, my team completes a 5,000-item migration in 10 days in total instead of 2 months.

The 10 days break down as follows:

Data setup + AI pre-load + first analysis is done in 0.5 day

SAP mapping for Auto category takes 1 day

Manual review is done for Review category in 7 days

New MD setup for Reject category is done in 1.5 days
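As a quick check of these figures, the arithmetic can be worked through directly from the numbers given in the post (5,000 items, 6 minutes per item, and the four-step breakdown). The variable names below are only for illustration.

```python
# Figures taken from the post above.
ITEMS = 5000
MINUTES_PER_ITEM = 6

# Pure review effort without AI, in hours.
manual_hours = ITEMS * MINUTES_PER_ITEM / 60

# Breakdown of the AI-assisted migration, in working days.
breakdown_days = {
    "Data setup + AI pre-load + first analysis": 0.5,
    "SAP mapping for Auto category": 1.0,
    "Manual review of Review category": 7.0,
    "New MD setup for Reject category": 1.5,
}
total_days = sum(breakdown_days.values())
review_share = breakdown_days["Manual review of Review category"] / total_days

print(f"Manual review effort without AI: {manual_hours:.0f} hours")
print(f"AI-assisted total: {total_days:.1f} days, {review_share:.0%} spent on manual review")
```

The manual effort comes to 500 hours of item review, while the AI-assisted steps sum to the stated 10 days, of which the human review of ambiguous items still takes the largest share.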

This has really improved my team's output and bandwidth and reduced the onboarding risk for new plants. Best of all, it allows two people to scale this work for our growing fleet.

Bottom Line

I trust AI where signals are strong and mistakes are low impact; I override it where ambiguity or risk is high. We are improving the overall process. The idea isn't to remove people from the process; it's to make sure people spend time only where judgement actually matters.

When should people believe an AI Recommendation - and when should they ignore it?

 

Balancing the Score: How To Rely on AI in the Customer Service Escalation Process in a BPO.

Efficiency and accuracy are the two forces behind success in the dynamic world of Business Process Outsourcing (BPO). One place where this tension is felt is customer service escalation management.

Here, AI systems are increasingly used to screen incoming customer touch-points such as emails, chats, and call transcripts, and to decide which cases should be forwarded to a human specialist and which should be handled by a first-level agent or an automated system.

The project: Adopting an AI-based "Smart Routing" solution in a global BPO providing technical support for a consumer-electronics brand. This is a highly relevant process: proper escalation results in higher first-contact resolution and customer retention, whereas improper decisions cause expensive specialist bottlenecks, frustration among agents, and customer attrition.

The issue is not necessarily the accuracy of the AI, which can be more than 85 percent confident, but the human behaviour surrounding it. Teams may lose their skills through casual rubber-stamping, automatically believing the AI's tag, whether "ESCALATE" or "DO NOT ESCALATE". On the other hand, having been burned by past misjudgments, an agent may doubt even recommendations that are right and override the system despite good advice. To avoid either outcome, we have to establish clear, considered conditions for trust and override that are grounded not in gut feel but in observable signals.

Under what conditions is it reasonable to rely on the AI Recommendation?

One can be more trusting when the AI is operating within its intended area of competence and when the setting fits its training. The major signals for acceptance are:

1.      High Confidence Score and reasoning: The Smart Routing AI must provide its confidence percentage and, most importantly, indicate the key phrases or sentiment triggers on which its conclusion was based (e.g., 95% confidence: ESCALATE because phrases such as "data loss" and "legal action" and extreme negative sentiment were detected). When the AI is confident and its logic is clear and corresponds to the familiar escalation guidelines, one can trust it.

2.      Pattern Recognition at Scale: The AI is highly accurate at identifying subtle patterns across thousands of interactions that a human may overlook, for instance a specific product model name paired with an obscure error notification that has historically preceded a significant failure. Unless the agent can identify a clear reason to mistrust this pattern-based flag, he or she should accept it.

3.      High-Volume and Routine issue types: For common, well-defined issues where the AI's training data is strong (e.g., password reset, warranty status query), its "DO NOT ESCALATE" suggestion should be followed to preserve process efficiency.

 

When to Override the AI:

Responsible overriding is not rebellion; it is triggered by specific safeguard indicators that signal the limits of the AI.

· Missed Cultural or Contextual Cues: The AI may read an email as very aggressive and suggest escalation because of its strong language. A human agent, however, may recognize the wording as culturally normal for a region, or may notice desperation hidden beneath an indifferent tone.

Rule of Thumb: Override when you recognize meaningful contextual, cultural, or emotional tones that the model fails to detect.

· Novelty or Uncertainty: A customer reports an issue with a new product or uses very vague, non-technical language. An AI trained on past data may misclassify it.

Rule of Thumb: Override where the case is truly new, where there is substantial uncertainty, or where relevant information is not accessible to the AI (e.g., a note on an earlier call logged in a different system).

· Contradictory Evidence: The AI suggests non-escalation, but on reading the message the agent finds a passing reference to a safety issue or a regulatory complaint buried in the text.

Rule of Thumb: Override when you have or discover clear, factual information that contradicts the reasoning the AI gives.

· Low Confidence and Unclear Justification: When the AI gives a low confidence rating (below 70%) and the explanations it cites appear weak or vague, that is a clear signal that a human should make the judgment.

 

Procedural Rules to Stay Balanced:

To make this operational, the initiative must build these signals into an easy-to-use, practical framework:

· The Three-Check Flowchart: A visual checklist that agents follow:

(1) Check Confidence and Reasoning: Are they high and clear? Proceed. If low or unclear, override.

(2) Check Contradiction/Uniqueness: Are there obvious contradictory facts, or is the case unique? If yes, override.

(3) Human Judgment Test: Is human intervention desirable here? If yes, override; if not, trust.

· Required Override Fields: Every override must carry a short, predefined reason selected from a drop-down menu (e.g., "Cultural tone," "New issue," "Against policy XYZ"). This reduces non-essential overrides and generates the feedback data needed to retrain the AI.

· Weekly Calibration Sessions: Supervisors and agents review a sample of overridden and accepted cases together. Was the override justified? Did trust in the AI pay off? This is a self-refining feedback loop that makes both the AI's performance and human judgment more accurate.
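The three checks and the required override reason can be sketched in a few lines of Python. This is only an illustrative sketch, not the project's actual routing logic: the `Recommendation` fields, the `three_check` function, its boolean flags, and the extra `LOW_CONFIDENCE` reason are hypothetical names introduced here, and the mapping from each check to a drop-down reason is an assumption. The 70% cut-off is the low-confidence figure mentioned above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class OverrideReason(Enum):
    # First three values are the drop-down examples from the post;
    # LOW_CONFIDENCE is a hypothetical extra value for check (1).
    CULTURAL_TONE = "Cultural tone"
    NEW_ISSUE = "New issue"
    AGAINST_POLICY = "Against policy XYZ"
    LOW_CONFIDENCE = "Low confidence or unclear reasoning"


@dataclass
class Recommendation:
    label: str            # "ESCALATE" or "DO NOT ESCALATE"
    confidence: float     # 0.0 to 1.0
    reasons: List[str]    # key phrases or sentiment triggers cited by the AI


LOW_CONFIDENCE_CUTOFF = 0.70  # ratings below 70% are treated as low


def three_check(
    rec: Recommendation,
    contradicting_fact: bool = False,   # agent found evidence against the AI's reasoning
    novel_case: bool = False,           # new product, or context the AI cannot see
    needs_human_nuance: bool = False,   # cultural or emotional tone the model may miss
) -> Optional[OverrideReason]:
    """Return None to trust the AI, or the reason to record for the override."""
    # Check 1: confidence and reasoning must both be present and strong.
    if rec.confidence < LOW_CONFIDENCE_CUTOFF or not rec.reasons:
        return OverrideReason.LOW_CONFIDENCE
    # Check 2: contradiction or uniqueness.
    if contradicting_fact:
        return OverrideReason.AGAINST_POLICY
    if novel_case:
        return OverrideReason.NEW_ISSUE
    # Check 3: is human judgment desirable?
    if needs_human_nuance:
        return OverrideReason.CULTURAL_TONE
    return None
```

For example, `three_check(Recommendation("ESCALATE", 0.95, ["data loss", "legal action"]))` returns `None` (trust), while the same recommendation with `novel_case=True` returns `OverrideReason.NEW_ISSUE`. Returning the reason rather than a bare yes/no means every override automatically produces the feedback record used in the calibration sessions.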

 

Result of the Initiative:

Using this balance model, the "Smart Routing" project achieved a measurably well-balanced outcome. Escalation accuracy increased by 22 percent in six months, which means specialists' time was spent more efficiently. Importantly, there was a reduction of 30 in the number of harmful overrides (agents wrongly rejecting a correct AI recommendation) and a decrease of 40 in the number of blind-trust errors (agents wrongly acting on an incorrect AI suggestion). Rather than hiding its logic, the system helped people understand it and contribute to it. Employees did not feel replaced by the AI, and its pattern recognition improved progressively thanks to the high-quality override-reason data.

 

Finally, the aim of escalation management in a BPO domain is not human versus AI, but human with AI. Trust the AI when it is recognizing routine, known patterns. Override it where human judgment is needed for uniqueness, tone, and deeper understanding. The balance rests on clear controls, built-in safeguards, and a habit of treating each override as the system's most valuable learning opportunity rather than as its failure.

It is critical to understand that AI doesn't think the way a human does. AI is like an automatic machine that has been programmed: its objective is to check thousands of files or activities quickly and produce output the way it was programmed, which adds convenience and improves capability and efficiency. AI is essentially cross-checking its own memory and functions against pre-defined rules and the way it was trained and programmed. This adds value; however, if the rules are misconfigured or updated incorrectly, the result can severely damage the business and the company's reputation. AI should therefore be used to gain efficiency, but it should not replace human judgment.

Real scenario in my office: I want to highlight one scenario from a chat process. Due to state regulatory compliance specific to one US state (say, XYZ state), a patient's medical information was not supposed to be provided, in line with HIPAA compliance. The bot was programmed to route those chats to a service representative, who would provide the information only if a signed authorization letter from the patient permitting disclosure of the medical records was available. A new update was then received, and the bot was supposed to be programmed accordingly for a different state (say, ABC state). However, the same rule was erroneously applied to XYZ state as well, and the bot started providing patients' medical information there. This was incorrect and non-compliant, and it exposed the company to fines, penalties, and possible litigation if patients found that their medical information had been shared without their consent. As part of routine auditing, QA picked some random samples, spotted the variance, and highlighted it, after which the concerned team was informed and the bot was reprogrammed to restore the earlier rule.

AI recommendations and outputs can be trusted in low-risk scenarios where the rules are clear, outcomes are predictable, there are no ambiguities, and the results match historical patterns. We need to be mindful of maintaining the right balance between trusting AI outcomes and understanding when to intervene, which requires good judgment and strong risk governance. In the scenario above, the high-risk actions should require human review, and overrides should be monitored continuously to help improve the system and keep its output accurate. AI needs to operate as decision support, not as a decision authority.

Cues, safeguards, or rules of thumb needed to maintain the right balance

1) Reserve a high level of trust for tasks that are simple and well suited to automation.

2) As in the state regulatory compliance scenario above, route tasks for human review according to the severity and impact of potential errors.

3) Carry out regular spot checks, targeted audits, and sample checks to confirm that the rules are operating as programmed.

As a rule of thumb, simple, repetitive tasks with clearly defined rules and low error impact can be automated using AI. However, tasks where errors would have a high impact or harm the business's reputation must remain under human control. This way AI remains an important support mechanism and tool without compromising compliance, trust, or responsibility.
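The two safeguards above, routing high-risk chats to a human and auditing a random sample of bot-handled chats, could look roughly like the following Python sketch. The topic tags, function names, and sample size are hypothetical and chosen only for illustration; they do not describe the actual bot in the scenario.

```python
import random
from typing import Iterable, List, Optional

# Hypothetical topic tags that must always go to a service representative.
HIGH_RISK_TOPICS = {"medical_records", "regulatory_compliance"}


def route_chat(topic: str) -> str:
    """Send high-impact topics to a human; let the bot handle the rest."""
    if topic in HIGH_RISK_TOPICS:
        return "human_review"
    return "bot"


def pick_audit_sample(
    chat_ids: Iterable[int], sample_size: int, seed: Optional[int] = None
) -> List[int]:
    """Draw a random sample of bot-handled chats for the routine QA spot check."""
    ids = list(chat_ids)
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable for review
    return rng.sample(ids, min(sample_size, len(ids)))
```

Keeping the high-risk list in one explicit place makes a rule change, such as the state update in the scenario, easy to review, and the random sample is what lets QA notice when a rule no longer behaves as programmed.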

Case Study: Smartphone Discount Optimization - Amazon Great Indian Festival 2023 vs. 2025

Executive Summary: During Amazon's Great Indian Festival (AGIF), smartphones represent the highest-revenue category, contributing approximately 35-40% of total electronics sales. This case study examines how AI recommendations guided discount strategies, where they succeeded, where human override was critical, and the resulting business impact.

Amazon faces a multi-dimensional optimization problem during AGIF smartphone sales: (1) Brand partnerships: negotiated discount caps with OEMs (Samsung, Apple, OnePlus, Xiaomi); (2) Margin protection: maintaining profitability despite aggressive discounts; (3) Inventory management: balancing stock across fulfillment centers; (4) Competitive pressure: matching or beating Flipkart's Big Billion Days pricing; (5) Customer acquisition: converting first-time smartphone buyers; (6) Same-Day Delivery promise: ensuring SDD fulfillment at scale.

AI decisions and suggestions were trusted for hourly price adjustments (real-time data processing), inventory positioning (pattern detection at scale), and customer segmentation (discovery of hidden behavior).

Human decisions (overriding the AI) were applied to brand-specific strategy (relationship and negotiation factors), margin protection (multi-stakeholder trade-offs), discount timing psychology (behavioral economics), statistical interpretation (sample size and context awareness), and competitive response (real-time market intelligence).

Quantified Impact Summary

Value Created by Trusting AI:

| AI-Driven Decision | Revenue Impact | Margin Impact |
|---|---|---|
| Dynamic hourly pricing | +$89M | +$7.1M |
| Inventory optimization | +$42M | +$3.8M |
| Personalized discounts | +$67M | +$5.4M |
| SDD routing efficiency | +$12M | +$4.2M |
| Total AI Trust Value | +$210M | +$20.5M |

 

Value Created by Human Override

| Human Override Decision | Revenue Impact | Margin Impact |
|---|---|---|
| Apple strategic discount | +$107M | +$3.2M |
| Xiaomi margin protection | -$38M | +$4.25M |
| Flagship discount timing | +$61M | +$2.8M |
| Total Override Value | +$130M | +$10.25M |

 

Combined Optimization – Total Incremental Values (2023 → 2025)

| Incremental Values | Revenue in millions | Margin in millions |
|---|---|---|
| AI-Trusted Decisions | $210 | $20.5 |
| Human Override Decisions | $130 | $10.25 |
| Total - Combined Impact | $340 | $30.75 |

 Attribution Analysis:

  1. Revenue Growth Attributed to Framework: 95% of USD 356 million total growth

  2. Margin Improvement: 0.9 percentage points (8.2% to 9.1%)

  3. AI Contribution: 62% of total value created

  4. Human Override Contribution: 38% of total value created
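The totals and attribution percentages above can be re-derived from the line items in the tables. The short Python check below uses only the figures stated in this case study (in USD millions); the variable names are illustrative.

```python
# (revenue, margin) in USD millions, copied from the tables above.
ai_trusted = {
    "Dynamic hourly pricing": (89, 7.1),
    "Inventory optimization": (42, 3.8),
    "Personalized discounts": (67, 5.4),
    "SDD routing efficiency": (12, 4.2),
}
human_override = {
    "Apple strategic discount": (107, 3.2),
    "Xiaomi margin protection": (-38, 4.25),
    "Flagship discount timing": (61, 2.8),
}

ai_revenue = sum(rev for rev, _ in ai_trusted.values())
ai_margin = round(sum(m for _, m in ai_trusted.values()), 2)
override_revenue = sum(rev for rev, _ in human_override.values())
override_margin = round(sum(m for _, m in human_override.values()), 2)

total_revenue = ai_revenue + override_revenue
total_margin = round(ai_margin + override_margin, 2)

# Share of incremental revenue from each side, as whole percentages.
ai_share = round(100 * ai_revenue / total_revenue)
override_share = round(100 * override_revenue / total_revenue)

# Framework share of the stated USD 356 million total growth.
framework_share = round(100 * total_revenue / 356, 1)

print(ai_revenue, ai_margin, override_revenue, override_margin)
print(total_revenue, total_margin, ai_share, override_share, framework_share)
```

The sums reproduce the table totals (210 and 20.5, 130 and 10.25, 340 and 30.75), the revenue split works out to 62% and 38%, and 340 out of 356 is about 95.5%, in line with the roughly 95% attributed to the framework.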

Key Learnings:

When to Trust vs. Override

Trust AI When:

1.       Processing millions of transactions in real-time

2.       Detecting patterns across pin codes

3.       Optimizing within defined constraints

4.       Speed of decision exceeds human capacity

5.       Historical patterns are strong predictors

6.       Objective function is clearly defined

 

Override AI When:

1.       Strategic relationships are at stake

2.       Competitive dynamics are fluid/real-time

3.       Behavioral psychology factors apply

4.       Statistical conclusions contradict business materiality

5.       Multi-stakeholder trade-offs exist

6.       Novel situations outside training data

7.       Qualitative factors can't be quantified

 

MBB Reliability Framework:

1. Data collection: AI systems can be used to gather and study transaction logs, clickstream, and inventory data.

2. Pattern detection: AI models can be used for customer segments and price elasticity.

3. Statistical validation (t-tests, regression, significance testing) is the joint responsibility of the AI and the MBB.

4. At the Contextual Interpretation stage, the MBB holds the major responsibility.

5. During Strategic Integration, both the MBB and leadership hold major responsibility (market share vs. margin trade-off).

6. Reliability Assurance is finalized by the MBB after validating AI outputs against business reality.

 

Conclusion: In smartphone discount optimization during AGIF 2025, AI recommendations delivered $210M in incremental value through pattern detection and real-time optimization. However, human override decisions contributed an additional $130M by incorporating strategic relationships, competitive intelligence, behavioral psychology, and contextual interpretation of statistical results. The optimal approach is neither blind trust nor reflexive override, but a structured framework that assigns decisions to the entity—human or AI—best equipped to handle them.

  • Author

🏆 Best Answer
Ankit Kulkarni
Exceptional response with a complete “benchmark” structure: clear process context (SAP master data alignment), explicit decision categories (Auto / Review / Reject), strong trust vs. override logic (similarity thresholds + critical identifier checks), and practical governance (spot audits, ownership, override analysis). Also includes quantified impact with a credible breakdown. This is exactly what the question asked for.


Approved

Smitha Muralidharan
Excellent, high-quality framing and strong operational logic. Clear trust vs. override cues, good balance between AI assistance and human judgment, and a mature safeguard mindset. Very strong response overall.

Taby Sheikh
Very well structured BPO escalation-routing case. Strong focus on confidence + reasoning, contextual override conditions (culture/tone), and good governance mechanisms like override reason capture and calibration sessions. The quantified outcome adds credibility.

Bharath CN
Strong manufacturing example with real operational depth. Good linkage between simulation → trust building → commissioning → sustained performance. Also clearly answers the override portion using alarm systems and manual takeover during external noise. Very credible.

vijay gonsalves
Good compliance-driven example (HIPAA/regulatory). Clearly highlights why AI must not be blindly trusted in high-risk settings and reinforces the role of audits and human control. Practical and relevant.


🟡 Conditionally Approved
(Good intent and relevance, but future responses should consistently anchor on a clearly defined process, explicit decision thresholds, and measurable business impact.)

Preethi Bijesh
Good finance process context and sensible trust vs. override cues. Future responses should strengthen decision thresholds and define more explicit operational triggers for override.

Aloke Biswas
Strong ITIL framing and good common-sense override logic. Future responses should be more structured (explicit decision rules, confidence cues, and measurable process impact).

Anoop Krishnakumar
Useful principles and a sensible “partial vs full accept” idea, but the response is not anchored in a specific domain process. Needs a concrete example with explicit metrics and triggers.

Manish_Gupta_Tpgl
Contains good general points, but remains generic and not tied to a specific process. Needs clearer operational cues and business impact.

Himanshu_Lohani_WpY8
Relevant automation example, but it is more deployment-focused than decision-focused. Needs explicit accept/override triggers and a clearer process framing.

Dhruva Kapur
Has a valid monitoring mindset (recall rate, confidence levels, dashboards), but the response is too generic and not anchored to a defined business process. Needs clearer override logic, business risk framing, and practical governance structure.

Vijay Yivaturi
High effort and structured with tables and a clear “trust vs override” split. However, the example feels more like a strategic business write-up than a real process-based operational AI recommendation scenario. Future responses should reduce assumed numbers, tighten realism, and focus more on practical override governance instead of only business impact.
