Project: Reduce cycle time and improve quality of Narrative drafts for property offering final memos.

Problem statement: The current average time to produce a first-draft Narrative (Property Overview, Market Overview, Borrower Sponsor, Maps & Aerials, Demographics, Crime Reports, Strengths & Weaknesses) is 5 hours per deal, with 16% of drafts requiring major rework due to:
- Missing or inconsistent data
- Grammar errors and UK/US English used interchangeably
- Misalignment across sections (e.g. strengths not matching the market facts)

Goal: Reduce average drafting time by 30-40% and reduce rework by 20-30% while maintaining or improving Narrative quality (as rated by originators and underwriters).

Where AI fits:
- Auto-generating data from known sources (internal systems and vetted external websites)
- Generating a first draft of: Property Overview, Market Overview, Borrower Sponsor, Maps & Aerials, Demographics, Crime Reports, Strengths & Weaknesses
- Improving grammar, UK/US English usage, and overall stylistic alignment

The analyst still: confirms source credibility, checks factual accuracy, adds nuance specific to the deal, and approves/edits strengths and weaknesses.

Hypothesis formation: In this project, AI outputs are best treated as structured hypotheses, for example:
- The property's location near XYZ business park is a key strength due to strong employment growth.
- Crime rates in the submarket have declined over the past five years, supporting stability of the asset.
- Market rent growth has moderated but remains above long-term averages.

From an MBB standpoint:
- These are not facts; they are claims to be tested.
- AI's role in Define/Measure is to propose: "Here is what might be true; here are patterns implied by the data I see."

Guidance to analysts:
- Treat AI text as hypotheses rather than validated conclusions.
- Use prompts that frame AI output as provisional.
For example: "List 5 potential strengths of this property based on the data below, and mark each as High confidence or Needs verification with a short reason."

Hypothesis testing: AI can help structure tests, but LSS will be led by humans. For Narrative quality, the hypotheses will look like:
H1: Using AI to draft the Market Overview reduces average drafting time by 30% without lowering quality scores.
H2: AI-generated strengths/weaknesses will be at least 80% aligned with analysts' final conclusions.

AI can help design experiments or sampling plans (e.g. "How many deals do we need to test to detect a 20% reduction in cycle time at a 95% confidence level?") and suggest metrics to track (time saved, number of factual errors, number of grammar corrections, etc.). But the actual tests (data collection, calculations, significance tests) should be performed in transparent tools such as Excel or Minitab, where we can see the formulas and logic. AI-derived statistics (e.g. "This is 90% likely") should not be treated as valid inferential statistics unless we have a clearly documented model and method behind them.

Hence the MBB needs a rule of thumb:
- AI is acceptable in forming hypotheses and in helping us set up tests.
- AI is not acceptable as the only source of evidence that a hypothesis is confirmed.
- Statistical validation must be done through transparent methods.

Statistical confidence: We are building a marketplace within Berkie (in-house AI) for all the prompts to be used for Narrative drafting. In the marketplace initiative, there are two layers of confidence:
- Confidence about the workflow change (process improvement): Does using AI actually improve quality/productivity?
- Confidence about individual deal output (deal-level narrative accuracy): Is the AI-generated narrative for this deal accurate enough to use?

Confidence in workflow change: Here, as an MBB, we should use classic LSS experiments.
Define metrics: drafting cycle time per deal, number of factual corrections per Narrative, number of grammar/style fixes, and originator and analyst satisfaction scores.
Design: a pre/post study (N deals before AI, N deals after AI), or a concurrent comparison in which some analysts use AI and some don't over the same period.
Analyze: use standard tools (t-tests, control charts) to confirm:
1. Is there a statistically significant decrease in drafting time?
2. Is quality maintained or improved?
Document: effect sizes, confidence intervals, and any risks observed (e.g. common types of AI errors).

Confidence about individual deal output: Here, instead of 95% confidence in a strict statistical sense, we define operational acceptance criteria. For example, for an AI-assisted Narrative to be acceptable as a draft, we require:
1. 0 critical factual errors (data that could mislead a credit decision)
2. 2 or fewer minor factual discrepancies (currently aiming for 97% accuracy; acceptable error is three low-criticality errors)
3. Grammar and UK English usage errors below X per 1,000 words
4. The analyst can review and finalize within 30 minutes for 90% of deals

From an MBB perspective, we can use sampling and inspection:
1. Randomly sample AI drafts
2. Score them against a checklist
3. Track defects per unit and apply control charts

Over time we can characterize AI as a process with a certain defect rate, just like any human process.
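
Two of the calculations referred to in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's method: it sizes N for a two-group comparison using the standard normal-approximation formula (which slightly understates what a t-based calculation would give), and it computes c-chart limits for defects per sampled draft. The 1.5-hour standard deviation, the 80% power, and the defect counts are made-up values for illustration only, and the function names are hypothetical. Both formulas are simple enough to reproduce in Excel or Minitab for the actual study.

```python
import math
from statistics import NormalDist, mean


def deals_per_group(sigma_hours, delta_hours, alpha=0.05, power=0.80):
    """Approximate N per group for a two-sided, two-sample comparison of means.

    Uses n = 2 * ((z_(1-alpha/2) + z_power) * sigma / delta)^2.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sigma_hours / delta_hours) ** 2
    return math.ceil(n)


def c_chart_limits(defects_per_draft):
    """Return (LCL, centre line, UCL) for a c-chart of defect counts per draft."""
    c_bar = mean(defects_per_draft)
    spread = 3 * math.sqrt(c_bar)
    return max(0.0, c_bar - spread), c_bar, c_bar + spread


# --- Sample sizing (all inputs below are assumptions for illustration) ---
baseline_hours = 5.0            # current average drafting time, from the problem statement
delta = 0.20 * baseline_hours   # the 20% reduction we want to be able to detect
assumed_sigma = 1.5             # ASSUMED std. dev. of drafting time, in hours
n_per_group = deals_per_group(assumed_sigma, delta)

# --- Control chart on sampled AI drafts (made-up defect counts) ---
sample_defects = [1, 0, 2, 1, 0, 1, 3, 0, 1, 1]
lcl, centre, ucl = c_chart_limits(sample_defects)

# A new draft with 5 checklist defects would fall outside the limits.
new_counts = sample_defects + [5]
out_of_control = [i for i, c in enumerate(new_counts) if c > ucl or c < lcl]

print(f"Deals needed per group: {n_per_group}")
print(f"c-chart: LCL={lcl}, centre={centre}, UCL={ucl}")
print(f"Out-of-control draft indexes: {out_of_control}")
```

With these assumed inputs the sketch gives 36 deals per group and control limits of 0 to 4 defects per draft; the real values depend on the measured variation in drafting time and on the defect counts actually observed.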
Assessing data quality and credibility when AI is in the loop

This is critical because the data for the Narrative is pulled from multiple sources (internal systems, public websites, vetted websites, etc.).

A) Separate source credibility from AI credibility: As MBB, we should enforce a clear hierarchy.

Source credibility rules (upstream of AI):
- Internal: Salesforce/Omniview, Berkadia internal property/loan systems, and internal market research have the highest trust.
- Trusted external: official census data, reputable third-party market reports, and public crime data portals have high trust.
- Open web or generic searches (e.g. Google search) have low trust unless specifically validated.

AI usage rules: AI is allowed to summarize and rephrase known, structured inputs from trusted sources and to highlight patterns or inconsistencies across those sources. However, AI is not allowed to freely invent data not present in the supplied sources or to act as the primary source of record for quantitative facts.

B) Data lineage and traceability: To keep Lean Six Sigma-like rigor, we need traceability.
- For each factual statement in the AI narrative, analysts should be able to trace the source and any transformation.
- We should ask AI to annotate paragraphs with references, e.g. (Source: Crime Report 2024 Q2) or (Source: Internal rent roll 2025-01), or to generate a separate evidence log for the narrative.
From an MBB angle, this is like keeping data collection forms and measurement system documentation for our Y's and X's.

C) Measurement system analysis for AI and analysts: We will treat AI and analysts as measurement devices for Narrative content.
We will run a Gage R&R-style exercise. We will select a set of deals, and for each we will have:
a) AI-only draft (first pass)
b) Analyst-only draft (without AI)
c) AI + analyst (AI draft, then analyst edit)
Independent reviewers (seasoned staff and quality specialists) will rate accuracy, clarity, grammar and UK/US English usage, and alignment between sections. We will assess the variation attributable to AI vs. human vs. the combination, and identify where AI improves repeatability/consistency (e.g. style/alignment) vs. where it adds risk (e.g. subtle factual errors). This quantifies data quality and helps establish where AI is safe vs. risky.

Where AI should accelerate decisions, and where traditional validation is non-negotiable

A) Areas where AI can safely accelerate and automate: In the Narrative AI marketplace, AI is well suited for:

Drafting and rephrasing:
a) Converting bullet research into fluent US English paragraphs
b) Enforcing consistent style and structure

Summarization: Summarizing long third-party reports such as crime reports, market reports, sponsor histories, etc.

Formatting and aligning:
a) Aligning terminology across sections (e.g. consistently referring to submarket names and asset class labels)
b) Ensuring internal consistency between sections (e.g. strengths and weaknesses must refer to facts already stated in the narrative)

Highlighting potential strengths and weaknesses:
a) Suggesting deal strengths/risks based on the data provided
b) Tagging statements as "requires analyst verification" where sources are weaker or ambiguous

Quality checks:
a) Flagging likely grammar issues
b) Enforcing US spelling conventions
c) Checking for obvious contradictions (e.g. stating that crime is low in one section and high in another section of the Narrative)

Here an MBB can reasonably reduce manual effort while keeping risk low, especially with defined guardrails and human review.
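
As one concrete illustration of the low-risk quality checks listed above, the US spelling check can be done deterministically, without AI judgment. The following is a minimal Python sketch: the five-word UK-to-US map and the function name are hypothetical placeholders, and a real word list would need to be agreed and maintained by the team. It only flags candidates for the analyst and does not rewrite text, since a UK spelling can be legitimate inside a proper noun.

```python
import re

# HYPOTHETICAL, deliberately tiny UK -> US map for illustration only.
UK_TO_US = {
    "colour": "color",
    "centre": "center",
    "neighbourhood": "neighborhood",
    "analyse": "analyze",
    "programme": "program",
}

# Whole-word, case-insensitive match on any UK spelling in the map.
_UK_PATTERN = re.compile(r"\b(" + "|".join(UK_TO_US) + r")\b", re.IGNORECASE)


def flag_uk_spellings(text):
    """Return (position, word found, suggested US spelling) for each match."""
    return [
        (m.start(), m.group(0), UK_TO_US[m.group(0).lower()])
        for m in _UK_PATTERN.finditer(text)
    ]


draft = "The neighbourhood retail centre draws steady traffic."
flags = flag_uk_spellings(draft)
for position, found, suggestion in flags:
    print(f"char {position}: '{found}' -> consider '{suggestion}'")
```

The count of flags per 1,000 words could feed the grammar and usage acceptance criterion defined earlier, which keeps that metric reproducible from one reviewer to the next.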
B) Areas where traditional validation is non-negotiable: From a Lean Six Sigma and risk perspective, the following should not be delegated solely to AI.

Quantitative facts that influence credit/investment decisions: e.g. occupancy rates, rent levels, and reports pulled from trusted sources.

Causal statements and forward-looking claims: e.g. "Because of X, we expect Y" or "Crime reductions mean lower risk for the asset." AI may propose such statements; however, the analyst must evaluate causality using domain knowledge and data, applying traditional reasoning and, where relevant, simple statistical checks (e.g. trend analysis) rather than trusting AI's intuition.

Compliance, regulatory, and reputational risk: Any statement that touches on fair housing, sensitive demographic descriptions, or legal/regulatory interpretations must be reviewed and, where necessary, crafted by analysts. AI may help draft neutral language, but it should not be the final authority.

Final sign-off and accountability: The accountable owner of the narrative is the analyst (and ultimately the deal team), not the AI. MBB guidance should codify that:
a) AI is a tool in the Measure/Analyze steps
b) Approve/reject decisions remain with analysts

In conclusion, AI should be treated as a powerful assistant, not a replacement for Lean Six Sigma rigor. It can speed up research and drafting of the Narrative, but the standards for data validity, statistical confidence, and final judgment must remain with the analysts. Where the impact is high, traditional verification is still essential; where risk is low, AI can safely help us work faster and more consistently.