Benchmark Six Sigma Forum

Message added by Nisusho Zhimomi,

AI, or Artificial Intelligence, is a self-learning and/or self-rewriting technology that mimics the human mind, intelligence and decision making. It can evolve and learn based on the responses it receives in different situations. As per IEEE SA, AI is “the combination of cognitive automation, machine learning (ML), reasoning, hypothesis generation and analysis, natural language processing and intentional algorithm mutation producing insights and analytics at or above human capability.”

 

An application-oriented question on the topic along with responses can be seen below. The best answer was provided by Shashi Prakash on 20 October 2025.

 

Applause for all the respondents - Adil Khan, Manik Sood, Shashi Prakash, Sanjib Ghosal

How Confident Should AI Be Before It Acts?

Featured Replies

Q 816.
In many AI-driven processes, a model or agent produces a result with a certain level of confidence — but confidence doesn’t always mean correctness.

If AI acts too early, it risks errors; if it waits too long, it may slow down the process or frustrate users.

 

Think of one process in your domain where AI makes or recommends decisions.

How should the system decide when it’s confident enough to act automatically, and when it should pause or escalate to a human?

What factors — such as data quality, past accuracy, or risk level — should influence that threshold?

 

⚠️ Note: Any answer that is generic or does not connect with a specific, relevant process will not be approved.

 

🏆 The best answer will be selected on the basis of:

 

  • Relevance of the chosen process scenario
  • Clarity and depth in defining confidence thresholds
  • Practicality of balancing speed, accuracy, and trust

 


Solved by Shashi Prakash

Domain: Quality Assurance in Manufacturing

 

How Confident Should AI Be Before It Acts?

In a Quality Department, the confidence level is not about a percentage threshold (>80% GO / <80% NO-GO); it is about risk, reversibility and responsibility.


AI is now embedded in many day-to-day QA activities, from in-line inspection and NCR tracking to performance monitoring of critical suppliers.
While final accountability stays with humans, AI can make informed decisions in defined, low-risk zones that help speed up the process without compromising integrity.


The Process: In-Process Inspection and Defect Flagging

On the shop floor, AI-integrated 3D laser scanners detect surface marks, scratches or stains.
Here is how it plays out:

  • For critical sealing surfaces or safety features, even if the AI detects a high probability of a defect, it should only alert and hold the parts, never proceed to reject or scrap them.
    QA must validate before any action is taken.

For example, an AI-integrated 3D laser scanner may flag a white patch mark under 20X magnification. Is the white patch below or above the coating surface? The rejection criteria are “visible to the naked eye, not under 20X magnification” and “fluorescent under UV light in a dark room”, and only QA can check and decide these. No matter how high the confidence percentage derived from correlation with historical data, AI should not decide.

AI’s confidence is information, not authority. It only raises the alarm; it does not pull the plug.

  • For non-functional or cosmetic areas, if the detection system has proven reliable over time, AI can auto-tag parts for recheck or rework and allow the line to continue.
    These are safe, reversible decisions where AI genuinely adds value.

When Can AI Decide on Its Own?

AI should act automatically only in areas where:

  1. The risk is minimal: e.g., non-functional cosmetic issues or repetitive, well-understood defects. For example, a dent of up to 1 x 1 mm in 10 cm² is permitted. If the laser scanner measures a dent within this acceptance limit, AI can decide to accept the part and save the pictures for future reference.
  2. The action is reversible: e.g., routing a part for re-inspection or initiating re-measurement.
  3. The decision is data-driven and routine, for example:
    • Auto-adjusting sampling plans when process stability is proven.
    • Triggering calibration reminders when measurement drift from the mean is detected.
    • Flagging repeated rejections to tighten incoming checks and supplier outgoing inspection.

These are operational support decisions, not customer-impacting ones.
They save time, reduce fatigue, and allow engineers to focus on complex, judgment-based issues.
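
The three conditions above can be read as a gate that is checked before confidence is considered at all. The sketch below is one possible reading, not the author's implementation: the `Finding` fields, the `Action` names and the 0.9 default for `min_confidence` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    AUTO_ACT = "auto_act"        # AI proceeds on its own (e.g. tag for rework)
    HOLD_FOR_QA = "hold_for_qa"  # alert and hold; a human decides


@dataclass
class Finding:
    zone: str          # "critical" (sealing/safety feature) or "cosmetic"
    reversible: bool   # can the proposed action be undone later?
    routine: bool      # repetitive, well-understood defect type?
    confidence: float  # model confidence in [0, 1]


def decide(finding: Finding, min_confidence: float = 0.9) -> Action:
    """Check risk, reversibility and routine first; confidence comes last."""
    if finding.zone == "critical":
        # Confidence is information, not authority: never auto-decide here.
        return Action.HOLD_FOR_QA
    if not (finding.reversible and finding.routine):
        return Action.HOLD_FOR_QA
    if finding.confidence < min_confidence:  # assumed cut-off, not from the post
        return Action.HOLD_FOR_QA
    return Action.AUTO_ACT
```

With this ordering, a 99% confident detection on a sealing surface is still held for QA, while a routine cosmetic finding with a reversible action can proceed automatically.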


 

The Balance

AI should be trusted to act where it can’t harm and required to ask where it can harm.
It can manage data, spot deviations and trigger safe, reversible actions, but the moment a decision touches safety, customer experience or compliance, the final say belongs to humans. That’s the real balance:

 

AI ensures speed and consistency, humans ensure wisdom and accountability.
Together, they create a Quality system that is fast, reliable and deeply human at its core.

 

  • Solution

AI-Based Quality Inspection for Tea Bags – Ekaterra Lipton - UAE

I am sharing this from my training and consulting experience at Ekaterra Lipton, UAE. Like most modern manufacturing plants, Ekaterra Lipton UAE uses AI-powered vision systems to inspect tea bags for defects such as empty tea bags, torn or damaged tea bags, missing tags and smears. Once filled, the tea bags roll along the conveyor belt before packaging, where the screener continuously scans every tea bag. Their in-house AI model assigns each item a defect probability score (0–1) based on image analysis.

 

Confidence Threshold Logic

| AI Confidence Level | System Action | Rationale |
| --- | --- | --- |
| > 0.90 (High confidence: clear defect or no defect) | AI acts automatically: accepts or rejects the item | Their model has historically achieved >95% accuracy in these cases, allowing fast throughput without delaying the process. |
| 0.5–0.90 (Moderate confidence) | AI flags the item for human re-inspection | Mixed visual indicators from the screener (e.g., slight smear, tags) reduce the certainty needed to accept or reject; a quality inspector validates the decision. |
| < 0.5 (Low confidence) | AI pauses and escalates for manual inspection and potential model retraining with new information | New defect patterns, image noise or faulty sensors make autonomous action risky. |
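
The three bands above map directly onto a small routing function. This is a minimal sketch that uses the cut-offs stated in the answer; the function name and the returned labels are hypothetical, and the plant's actual system is not described in enough detail to reproduce.

```python
def route_item(confidence: float) -> str:
    """Map a confidence score in [0, 1] to one of the three actions above."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if confidence > 0.90:
        return "auto_decide"         # AI accepts or rejects on its own
    if confidence >= 0.50:
        return "human_reinspection"  # a quality inspector validates
    return "escalate_manual"         # pause; candidate for retraining data
```

For the two cases in the answer's example, `route_item(0.97)` gives `"auto_decide"` and `route_item(0.66)` gives `"human_reinspection"`.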

 

Factors Influencing Confidence Thresholds

  1. Sensor Conditions & Calibration
    The most common causes are poor lighting, dust accumulation on the camera or screener lens, and camera calibration issues. These can distort images and lower the confidence score, thereby triggering more human reviews.
  2. Past Model Accuracy & Drift
    Thresholds are dynamic: the system gives more weight to the most recent trend and less to older results. If the false rejection rate over the rolling window of the 60 most recent data points exceeds 2.5%, the system self-corrects and tightens the auto-decision range to prevent wastage.
  3. Continuous Learning Loop
    All human re-inspection results are fed back in real time to retrain the model. Over time, this makes the confidence scores more reliable and reduces manual interventions.
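
The drift rule in point 2 could be sketched roughly as follows. The 60-point window and the 2.5% limit come from the answer; the class name, the 0.01 step, the 0.99 ceiling and the choice to restart the window after each adjustment are assumptions made for illustration. The recency weighting mentioned above is not modelled here.

```python
from collections import deque


class AdaptiveThreshold:
    """Raise the auto-decision threshold when recent false rejections are too frequent."""

    def __init__(self, threshold: float = 0.90, window: int = 60,
                 max_false_reject_rate: float = 0.025,
                 step: float = 0.01, ceiling: float = 0.99):
        self.threshold = threshold
        self.max_false_reject_rate = max_false_reject_rate
        self.step = step        # assumed adjustment size
        self.ceiling = ceiling  # assumed upper bound
        self._outcomes = deque(maxlen=window)  # True = false rejection

    def record(self, was_false_rejection: bool) -> float:
        """Log one human-verified outcome and return the current threshold."""
        self._outcomes.append(was_false_rejection)
        if len(self._outcomes) == self._outcomes.maxlen:
            rate = sum(self._outcomes) / len(self._outcomes)
            if rate > self.max_false_reject_rate:
                # Tighten the auto-decision range, then start a fresh window
                # so the same evidence is not counted twice.
                self.threshold = min(self.ceiling,
                                     round(self.threshold + self.step, 4))
                self._outcomes.clear()
        return self.threshold
```

With these defaults, two false rejections in 60 items (about 3.3%) raise the threshold from 0.90 to 0.91, while one in 60 (about 1.7%) leaves it unchanged.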

 

Balancing Speed, Accuracy, and Trust

  • Speed: AI handles routine inspections in real time, maintaining production flow.
  • Accuracy: Ambiguous cases are reviewed by experts, reducing false positives.
  • Trust: Operators see AI as a collaborator, not a replacement — decisions are explainable, auditable, and based on confidence logic.

Example

In tea bag production at Ekaterra Lipton, the AI system detects a torn tea bag.
AI confidence score = 0.97: clear auto-reject case.
Another image shows a minor smear (confidence = 0.66): flagged for human review.
If confirmed defective, the feedback improves future accuracy.

 

Consider the use of AI in content moderation within the publishing industry, where AI reviews manuscripts submitted by authors before publication. It automatically flags plagiarism, ethical issues, or copyright violations. Confidence is just one signal. AI should weigh confidence against consequence, context, and historical reliability. A well-designed publishing AI will not just ask “Am I confident?”—it will ask “Am I confident enough, given what is at stake?”

To determine when the AI should act automatically versus when it should escalate to a human, the AI needs a decision framework built around several key factors, as follows:

1. Risk associated with a decision: High-risk decisions should always involve human review, regardless of AI’s confidence level. For example, publishing a medical research article on COVID-19 carries far more risk than approving a mathematics book’s manuscript.

2. Ability to give reasoning with sources: If the AI can clearly explain its reasoning and cite its reference sources, its decisions can be trusted more readily, e.g., “This manuscript matches 95% of this article, and this is the website link”.

3. Past feedback through human overrides: The system should learn from past human overrides. If humans have often reversed the AI’s decisions in a specific domain, that domain should be flagged for mandatory human review until the model's accuracy improves.

4. Novelty of articles: AI algorithms are best at handling familiar patterns taught to them. If the manuscript content includes emerging technologies or an unknown subject, the system should escalate to humans.

5. Matters that can have legal consequences: When it comes to sensitive topics like politics or religion, even if the AI has a high confidence level, the reputational and legal risks are too substantial to automate these decisions, and they must be escalated to humans.
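
The five factors above could be combined into a simple escalation check. This is only a sketch: the `ManuscriptCheck` fields, the 0.90 confidence floor and the 0.20 override-rate limit are hypothetical values, not figures from the answer.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ManuscriptCheck:
    confidence: float        # model confidence in its own verdict, 0..1
    high_risk_topic: bool    # e.g. medical research
    legally_sensitive: bool  # e.g. politics, religion
    novel_subject: bool      # outside the patterns the model has seen
    sources: List[str] = field(default_factory=list)  # evidence for the verdict
    domain_override_rate: float = 0.0  # share of past verdicts humans reversed


def escalation_reasons(check: ManuscriptCheck,
                       min_confidence: float = 0.90,
                       max_override_rate: float = 0.20) -> List[str]:
    """Return every reason to involve a human; an empty list means auto-act."""
    reasons = []
    if check.high_risk_topic:
        reasons.append("high-risk topic")
    if not check.sources:
        reasons.append("no supporting sources")
    if check.domain_override_rate > max_override_rate:
        reasons.append("frequent human overrides in this domain")
    if check.novel_subject:
        reasons.append("novel subject")
    if check.legally_sensitive:
        reasons.append("legal or reputational sensitivity")
    if check.confidence < min_confidence:
        reasons.append("low confidence")
    return reasons
```

Returning the list of reasons, instead of a bare yes/no, keeps the decision explainable to the editor who receives the escalation. Note that confidence is checked last and cannot override any of the other factors.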

In manufacturing, especially in multi-stakeholder environments like cold rolling steel strip manufacturing, AI must act only when its confidence level is high enough to avoid costly errors. Confidence here refers to the probability that the AI's prediction or decision is correct. The correct confidence level is not a single static number. It depends on the risk and impact of the action on final product quality. The higher the potential cost of an error (a false positive or a false negative), the higher the confidence level the AI should require before acting.

  • Low confidence -> AI should alert an operator or wait for more data.
  • Medium confidence -> AI can suggest actions but not execute them.
  • High confidence (typically >95%) -> AI can act automatically, especially in routine or time-sensitive operations.
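
One way to make "the higher the cost of an error, the higher the required confidence" concrete is a simple expected-cost comparison. This model is an assumption added for illustration and does not come from the answer: the AI acts on its own only when the expected cost of being wrong, (1 - p) × cost of error, is lower than the cost of waiting for a human.

```python
def required_confidence(cost_of_error: float, cost_of_delay: float) -> float:
    """Smallest confidence p at which (1 - p) * cost_of_error <= cost_of_delay.

    Both costs are hypothetical inputs in the same unit (e.g. currency).
    """
    if cost_of_error <= 0 or cost_of_delay < 0:
        raise ValueError("cost_of_error must be > 0 and cost_of_delay >= 0")
    return max(0.0, 1.0 - cost_of_delay / cost_of_error)
```

Under this model, an error that costs 20 times as much as a delay needs at least 95% confidence, and one that costs 200 times as much needs 99.5%. If waiting costs as much as the error itself, no minimum confidence is required.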

Cold rolling is a continuous process in which steel strips are passed through rollers to reduce thickness and improve surface finish. A key challenge is strip breaks: tears in the steel that stop production and damage costly machinery. Let’s consider the strip break classification process as an example; it is one of the most critical and high-impact areas in the cold rolling process.

If the AI predicts with 99% confidence that a strip will break due to high tension, it slows the rollers to adjust the tension automatically. If its confidence is only 70%, it alerts the operator to check manually and validate with a functional expert.

Example: Strip Break Classification

Problem Impact:

  • Strip breaks cause a 4%–5% production loss annually, worth a few crores.
  • Diagnosing the causes manually takes a few weeks to conclude.
  • Unplanned downtime and production loss

AI Solution:

  • AI analyses sensor data (tension, torque, electric voltage) every 10 milliseconds.
  • It classifies strip breaks in real-time, reducing downtime and manual effort.

Confidence Threshold:

  • AI acts only when its confidence in the classification is >95%, since a confidence of 95% or above is taken as strong enough to justify the decision.
  • If confidence is <95%, it flags the event (or potential break) for manual review and RCA.
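
The threshold rule above can be sketched as a single decision function. The 95% cut-off and the two responses come from the answer; the function name and return labels are hypothetical, and treating a confidence of exactly 95% as a review case is an assumption, since the answer does not cover that boundary.

```python
def strip_break_response(break_confidence: float,
                         act_threshold: float = 0.95) -> str:
    """Choose a response to a predicted strip break from model confidence."""
    if not 0.0 <= break_confidence <= 1.0:
        raise ValueError("break_confidence must be between 0 and 1")
    if break_confidence > act_threshold:
        # High confidence: act without waiting for the operator.
        return "slow_rollers_and_adjust_tension"
    # Otherwise keep a human in the loop and log the event for RCA.
    return "alert_operator_for_manual_review"
```

For the two cases described earlier, a 99% prediction triggers the automatic tension adjustment and a 70% prediction alerts the operator.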
