Message added by Mayank Gupta, October 25, 2024

Named Entity Recognition (NER) is a subtask of Natural Language Processing (NLP) that identifies entities in a text and classifies them into predefined categories such as names of people, organizations, locations, dates, and monetary values. This helps extract meaningful information from unstructured text data.

An application-oriented question on the topic along with responses can be seen below. The best answer was provided by Sachin Tanwar on 23rd Oct 2024.

Applause for all the respondents - Narendra Purushothama, Sachin Tanwar, Deep Dave.

Named Entity Recognition (NER)

October 22, 2024

Q 714. How do Named Entity Recognition (NER) systems handle ambiguous terms, and what techniques can enhance their accuracy in real-world applications? Try running this through different large language models (LLMs) and share the varied responses as examples. Feel free to compare their outputs for added insights!

Note for website visitors -

This platform hosts two weekly questions, one on Tuesday and the other on Friday.
All previous questions can be found here: https://www.benchmarksixsigma.com/forum/lean-six-sigma-business-excellence-questions/.
To participate in the current question, please visit the forum homepage at https://www.benchmarksixsigma.com/forum/.
The question will be open until Tuesday or Friday at 5 PM Indian Standard Time, depending on the launch day.
Responses will not be visible until they are reviewed, and only non-plagiarised answers with less than 5-10% plagiarism will be approved.
If you are unsure about plagiarism, please check your answer using a plagiarism checker tool such as https://smallseotools.com/plagiarism-checker/ before submitting.
All correct answers shall be published, and the top-rated answer will be displayed first. The author will receive an honorable mention in our Business Excellence dictionary at https://www.benchmarksixsigma.com/forum/business-excellence-dictionary-glossary/ along with the related term.
Some people seem to be using AI platforms to find forum answers. This is a risky approach as AI responses are error prone as our questions are application-oriented (they are never straightforward). Have a look at this funny example - https://www.benchmarksixsigma.com/forum/topic/39458-using-ai-to-respond-to-forum-questions/
We also use an AI content detector at https://quillbot.com/ai-content-detector. Only answers with less than 45-50% AI-generated content will be approved.

Solved by Sachin Tanwar

October 23, 2024

NER systems often encounter ambiguous terms, which can lead to incorrect entity classifications. These ambiguities can arise for several reasons:

Polysemy: A word has multiple related meanings. For example, in customer complaint analysis, "satisfied" can refer to the customer service representative's response or to the product or technology, and it can even signal dissatisfaction when the customer writes "not satisfied".
Homonymy: Words with the same spelling have unrelated meanings. "Firm" can refer to a business, such as a financial institution, or to something being strong and steady.
Contextual ambiguity: The meaning of a term depends on the surrounding context. In image-to-text extraction, when we try to extract a storefront name from a name plate, "The Burger House" might refer to the store name, while "The Burger House Special" can refer to a dish on the menu.

Techniques mentioned by different LLMs

GPT: Transfer learning, domain-specific training and data augmentation

Gemini: Contextual (window-based features, Long Short-Term Memory), lexical (gazetteers, part-of-speech tagging)

Claude: Contextual (window-based approaches examining surrounding words, long-range dependencies using attention mechanisms, and syntactic parsing to understand grammatical relationships), disambiguation strategies (statistical modeling of entity co-occurrence, domain-specific rules and gazetteers, and word sense disambiguation techniques), accuracy enhancement methods, etc.

The real-world example where we used this is image-to-text extraction to match storefront names, using LSTM and gradient boosting methods to eliminate contextual ambiguity. Text with a low benchmark score is eliminated in each iteration, and we finally end up with words matching at 80%+ accuracy.

Example: Extract the text from the image using OCR --> Build a bag of words --> Eliminate all usual abbreviations and other probably incorrect words --> Build a correlation model based on the type of business to eliminate contextual ambiguity --> Finally arrive at useful text and match it with storefront names.
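
The flow above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation described in the answer: the OCR output is assumed to be already available as a list of strings, the noise-word list and the storefront names are hypothetical, and `difflib.SequenceMatcher` from the Python standard library stands in for the LSTM and gradient boosting scoring. The 0.8 threshold simply mirrors the 80%+ figure mentioned above.

```python
from difflib import SequenceMatcher

# Hypothetical noise words often seen on signboards (assumption for illustration only).
NOISE_WORDS = {"open", "closed", "ltd", "pvt", "est", "tel", "www"}


def clean_tokens(ocr_lines):
    """Build a bag of words from OCR lines, dropping noise words and non-alphabetic tokens."""
    tokens = []
    for line in ocr_lines:
        for raw in line.lower().split():
            word = raw.strip(".,:;-()")
            if word.isalpha() and word not in NOISE_WORDS:
                tokens.append(word)
    return tokens


def best_storefront_match(ocr_lines, known_names, threshold=0.8):
    """Return (name, score) for the closest known storefront, or None if no score reaches the threshold."""
    candidate = " ".join(clean_tokens(ocr_lines))
    best_name, best_score = None, 0.0
    for name in known_names:
        # Character-level similarity ratio between the cleaned OCR text and a known name.
        score = SequenceMatcher(None, candidate, name.lower()).ratio()
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return best_name, best_score
    return None
```

For instance, calling `best_storefront_match(["The Burgr House", "OPEN", "Tel: 12345"], ["The Burger House", "City Pharmacy"])` would drop "OPEN", "Tel:" and the digits before comparing the remaining text against each known name. A business-type correlation model, as mentioned in the answer, would replace or complement the simple string similarity used here.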

October 23, 2024

Solution

Named Entity Recognition systems identify specific entities in text, such as people, places, or organizations. More often than not, though, these systems are constrained by ambiguity: a word can carry more than one meaning, and the system may be uncertain which meaning applies in a particular context.

Strategies for Handling Ambiguity:

Contextual Analysis: NER systems take into account the words around a potentially ambiguous term to determine what that term means. Consider the word "Orange", which could refer to either a fruit or a company providing logistics support. If it is surrounded by words like "warehouse" or "inventory", it is more likely to be identified as the logistics company.

Gazetteers: These are lists of entities along with their types. If a word is found in a gazetteer, the system is more likely to identify it as the listed entity.

Machine Learning: Advanced NER systems use machine learning algorithms to learn from large, labeled text datasets. The algorithms identify patterns and relationships that allow the system to make better predictions.
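
To make the first two strategies concrete, here is a minimal sketch that combines a gazetteer lookup with a context window, using the "Orange" example above. Everything in it is an assumption made for illustration: the `GAZETTEER` entries, the cue words, the label names (including the non-standard `FRUIT` label) and the `classify_term` helper are hypothetical, and a production system would typically learn such cues from labeled data rather than hard-code them.

```python
# Hypothetical gazetteer: surface form -> candidate labels, each with context cue words.
GAZETTEER = {
    "orange": {
        "ORGANIZATION": {"warehouse", "inventory", "shipment", "logistics"},
        "FRUIT": {"juice", "peel", "eat", "fresh"},
    },
    "apple": {
        "ORGANIZATION": {"iphone", "manufactures", "shares", "ceo"},
        "FRUIT": {"pie", "eat", "tree", "juice"},
    },
}


def classify_term(tokens, index, window=3):
    """Label tokens[index] by counting cue words within `window` tokens on either side."""
    candidates = GAZETTEER.get(tokens[index].lower())
    if candidates is None:
        return None  # term is not listed in the gazetteer
    start = max(0, index - window)
    neighbours = tokens[start:index] + tokens[index + 1:index + 1 + window]
    context = {t.lower().strip(".,") for t in neighbours}
    # Score each candidate label by how many of its cue words appear in the context.
    scores = {label: len(cues & context) for label, cues in candidates.items()}
    best_label = max(scores, key=scores.get)
    return best_label if scores[best_label] > 0 else "AMBIGUOUS"
```

With this sketch, "Orange moved the inventory to a new warehouse" would resolve to the organization because "inventory" falls inside the window, while a sentence with no cue words nearby is reported as ambiguous rather than guessed.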

Techniques for Improving Accuracy:

Quality of Training Data: The quality of the training data is critical. If noisy and inconsistent data is fed to the system, it will most likely produce incorrect results.

Feature Engineering: Building informative features helps the system better understand the context in which a word is used. Useful features include the word's part of speech, whether it is capitalized, and its distance from other entities.

Ensemble Methods: Accuracy can be improved by combining multiple NER systems. Each system has its own strengths and weaknesses, and combining them reduces the errors made by any individual system.

Domain Knowledge: In specialized domains such as medicine or law, adding domain knowledge helps the system understand the nuances of the language used.
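
As a rough illustration of the feature engineering and ensemble ideas above, the sketch below builds a small feature dictionary for one token and combines the outputs of several taggers by majority vote. The function names, the feature names and the tie-breaking rule are assumptions chosen for this example. The part-of-speech tags and the per-tagger predictions are assumed to be supplied by other tools.

```python
from collections import Counter


def token_features(tokens, pos_tags, index, entity_positions):
    """Feature dict for one token: part of speech, capitalization, distance to the nearest other entity."""
    token = tokens[index]
    distances = [abs(index - p) for p in entity_positions if p != index]
    return {
        "lower": token.lower(),
        "pos": pos_tags[index],
        "is_capitalized": token[:1].isupper(),
        "is_sentence_start": index == 0,
        # -1 signals that no other entity position is known.
        "dist_to_nearest_entity": min(distances) if distances else -1,
    }


def majority_vote(predictions):
    """Combine per-token labels from several taggers; ties go to the earliest-listed tagger."""
    combined = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        top = max(counts.values())
        # Walk the labels in tagger order and keep the first one with the top count.
        combined.append(next(label for label in labels if counts[label] == top))
    return combined
```

In practice the feature dictionaries would be fed to a sequence model, and weighted voting or stacking may work better than a plain majority when the individual taggers differ a lot in quality.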

By employing these strategies and techniques, NER systems can become more accurate and reliable in real-world applications.

October 25, 2024

Named Entity Recognition (NER) - A technology used in Natural Language Processing (NLP) to identify and classify entities in text. For example, given the sentence “Narendra Modi was born in Vadnagar”, NER should identify “Narendra Modi” as a person and “Vadnagar” as a location.

That was a simple example, so let’s add some ambiguity. Suppose the text is “Apple manufactures iPhone”. Here, NER should identify “Apple” as the name of an organization and not as a fruit. Hence, NER tools should be able to identify entities such as names of people, places, organizations and dates in the right context, even when a term is ambiguous as in this example.

A human reader can distinguish Apple the fruit from Apple the mobile manufacturing company by looking at the context in which the word is used. Let’s see how NER systems in Natural Language Processing deal with ambiguity, according to different LLMs:

1. GPT-4: Contextual analysis using pre-trained language models such as BERT or GPT-3, through which context is understood and accuracy is improved.

2. Google Bard: Data augmentation & context-aware models.

3. LLaMA: Heuristic rules

4. Claude: Contextual embeddings, attention mechanisms, multi-task learning and knowledge graph

To summarize, the common basis across all the LLM responses is contextual analysis, pre-trained models, heuristic rules and attention mechanisms.

October 25, 2024

While none of the answers is complete in the true sense, the one that comes closest is from Sachin Tanwar, and it has therefore been selected as the winner.

I recommend going through all answers to get a more complete perspective.
