Benchmark Six Sigma Forum

Message added by Mayank Gupta,

A Large Language Model (LLM) is an advanced statistical model used in artificial intelligence (AI) to learn and process human language and subsequently generate responses in the same format. LLMs are foundational to natural language processing tasks such as speech recognition, machine translation and text generation, and are built using neural networks with millions or even billions of parameters.

 

Statistical Analysis is the process of collecting, organizing, interpreting, and presenting data to uncover patterns, trends, relationships, and insights. Statistical analysis can be Descriptive, Diagnostic, Predictive or Prescriptive.

 

An application-oriented question on the topic along with responses can be seen below. The best answer was provided by Rahul Das on 25th Dec 2024.

 

Applause for all the respondents - Rahul Das, Puneet Kumar, R Rajesh, Radhakrishnan Annamalai, Dieylani LO.

Using LLM for Statistical Analysis

Featured Replies

Q 732. If you upload datasets in an Excel file and use an LLM like ChatGPT to analyze whether there is a significant difference between two sets of continuous data, what are the potential pitfalls or errors that could occur during the analysis?

 


Solved by RD RD

  • Solution

Below are the potential pitfalls or errors that could occur when we use an LLM like ChatGPT to analyze whether there is a significant difference between two sets of continuous data:

1.       Firstly, LLMs may not understand which statistical characteristic is being analysed, for example the mean or the median.

2.       LLMs may run the analysis directly and skip the pre-checks, such as detecting outliers or handling missing values. Because these checks are not performed automatically, the results can be misleading.

3.       LLMs may lack context, so there is a chance of choosing an inappropriate alpha value. Choosing a smaller alpha value is critical in high-stakes scenarios. In drug testing, for example, the continuous data might require a smaller alpha value to minimize Type I errors.

4.       LLMs may also fail to distinguish which statistical test is appropriate for the given data: a paired t-test (used for comparing the means of two related groups) or an independent t-test (used for comparing the means of two independent groups).

5.       Validating assumptions before performing any statistical test, such as checking for normality or equal variance, is another area where LLMs may fall short. While they may provide numerical summaries in response, they often do not generate the graphical summaries necessary for a thorough validation.

6.       Additionally, LLMs might proceed with a non-parametric test without verifying whether the data actually requires it. Applying non-parametric tests unnecessarily can result in a less powerful or less meaningful analysis, particularly when parametric tests are suitable for the data.
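Point 4 above can be made concrete with a small sketch. It uses hypothetical before/after measurements (not data from this thread) and illustrative function names, and it computes the independent test in its Welch (unequal variance) form. The same numbers give very different t statistics depending on whether they are treated as paired or independent, which is why the choice should not be left to the LLM's guess. In practice a library routine such as scipy.stats.ttest_rel or scipy.stats.ttest_ind would also return the p-value.

```python
import math
import statistics


def welch_t(a, b):
    """Independent-samples (Welch) t statistic and approximate degrees of freedom."""
    va, vb = statistics.variance(a), statistics.variance(b)
    na, nb = len(a), len(b)
    se2 = va / na + vb / nb  # squared standard error of the mean difference
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df


def paired_t(a, b):
    """Paired t statistic: a one-sample t-test on the per-pair differences."""
    if len(a) != len(b):
        raise ValueError("paired data needs two samples of equal length")
    diffs = [x - y for x, y in zip(a, b)]
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical measurements on the same five units, before and after a change.
before = [10.1, 12.3, 9.8, 11.5, 13.0]
after = [10.6, 12.9, 10.1, 12.2, 13.4]

t_ind, df_ind = welch_t(before, after)
t_pair, df_pair = paired_t(before, after)
print(f"independent (Welch): t = {t_ind:.2f}, df = {df_ind:.1f}")
print(f"paired:              t = {t_pair:.2f}, df = {df_pair}")
```

Here the paired statistic is far larger in magnitude because pairing removes the unit-to-unit variation. Treating paired data as independent would hide a real shift.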

Based on my limited experience with LLMs, my extensive experience with Excel, and some experiments with continuous data, I feel that the quality of the output depends primarily on the prompts. I am using Copilot for this. While the results from Excel are fine, Copilot's results are quite patchy. Even simple calculations on continuous data, such as correlation, did not work; I had to review all output carefully and spotted multiple errors. Also, though this may have been an issue only for me, Copilot worked only on files stored on a shared drive or SharePoint. It did not work with a locally saved Excel file, and a basic feature like AutoSave had to be enabled.

 

When I tried this with ChatGPT, these were the results:

 

Analyse data set
ChatGPT said:

The uploaded Excel file contains three sheets: Sheet1, Sheet2, and Sheet3. Which sheet would you like me to analyze, or should I analyze all of them?

You said:
Sheet1
ChatGPT said:

The data in Sheet1 appears to have a header row shifted or extra blank columns. Here's the structure:

Unnamed: 0    Name of the Employee    Salary
NaN           A                       2000
NaN           B                       2200
NaN           C                       2400
NaN           D                       2600

Would you like me to clean this data by removing unnecessary columns and correcting the header row?

You said:
Yes
ChatGPT said:

The data has been cleaned and now looks like this:

Name of the Employee    Salary
A                       2000
B                       2200
C                       2400
D                       2600
E                       2800

Would you like to perform any specific analysis or visualization on this data?

You said:
Calculate standard deviation
ChatGPT said:

It seems like I can’t do more advanced data analysis right now. Please try again later. If there's anything else you'd like to discuss or analyze, let me know!

 

The above exercise took almost 10 minutes and produced no result.
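For comparison, the calculation requested in the transcript takes a few lines when run locally. This is a minimal sketch that assumes the five salary rows shown in the cleaned table are the whole column; the real sheet may have held more rows.

```python
import statistics

# Salaries as they appear in the cleaned table in the transcript.
# Assumption: these five rows are the complete column.
salaries = [2000, 2200, 2400, 2600, 2800]

sample_sd = statistics.stdev(salaries)       # n - 1 denominator, like Excel's STDEV.S
population_sd = statistics.pstdev(salaries)  # n denominator, like Excel's STDEV.P

print(f"sample SD:     {sample_sd:.2f}")      # 316.23
print(f"population SD: {population_sd:.2f}")  # 282.84
```

The two values differ, so a request such as "Calculate standard deviation" is itself ambiguous. Stating sample or population in the prompt removes one source of error.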

 

It seems that although the LLM can understand the language of the data, it cannot interpret formats or relationships.

 

Overall, I feel LLMs cannot yet match the accuracy and reliability of Excel for complex analyses.

Edited by Puneet K
Tried on ChatGPT as well

I put the data below in an Excel sheet and compared the respective datasets using ChatGPT. It returned two comparisons. Comparison 1 said that the units used for measuring these factors were inconsistent. Comparison 2 said that each of the factors could have its own specific unit.

[Image: the data as entered in the Excel sheet]

 

Some pitfalls that I see here:
1.    Comparison 2 (option 2) suggested that the unit for the value ‘80’ could be ‘cm’ (centimetres) and the unit for 174 could be ‘Kg’. In fact, 80 was a misplaced value that was supposed to be under ‘Weight’, but the comparison focused only on the unit of measurement. ChatGPT tried to identify unit discrepancies but did not consider that values might have been entered in the wrong columns. So I feel reasoning ability might be one challenge.
2.    Over-reliance on patterns: I guess the comparison between the two datasets was driven purely by pattern matching, and sometimes this may not yield the right outcome.

 


Conclusion:

ChatGPT relies on a lot of existing data, pre-trained knowledge and pattern matching to be useful. In this case, either the pattern matching or the pre-trained data was probably off, so it pointed to a unit specification issue or discrepancy (both height and weight were entered as bare numbers without units), whereas in the original file the height and weight values had actually been interchanged. Looking at the values in both columns, ChatGPT could have asked whether such an unusual height and weight were plausible at all.
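The kind of plausibility check described above can be sketched as a simple range test run before any comparison. The ranges and the function name are hypothetical choices for illustration, assuming adult height in centimetres and weight in kilograms; only the values 80 and 174 come from the example above.

```python
# Hypothetical plausibility ranges for adults; adjust to the population at hand.
HEIGHT_CM = (120, 220)
WEIGHT_KG = (30, 200)


def in_range(value, bounds):
    low, high = bounds
    return low <= value <= high


def check_row(height, weight):
    """Classify one record as 'ok', 'possibly swapped' or 'implausible'."""
    if in_range(height, HEIGHT_CM) and in_range(weight, WEIGHT_KG):
        return "ok"
    # The values fail as entered but fit if the two columns are exchanged.
    if in_range(weight, HEIGHT_CM) and in_range(height, WEIGHT_KG):
        return "possibly swapped"
    return "implausible"


print(check_row(height=174, weight=80))  # ok
print(check_row(height=80, weight=174))  # possibly swapped
```

A check like this flags the record for a human to review; it does not decide on its own which column is wrong.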

 

Therefore, these are the findings with my limited LLM knowledge

Large Language Models (LLMs) are widely used in data analytics. One common issue when uploading datasets in an Excel file to ChatGPT is that the tool has trouble reading or accessing the file. This is mainly due to:

 

- Irregular formatting

- Inconsistent data structure

- Delimiters (like , .) present in the data itself

- Formulas, macros, conditional formatting in the file itself

- File size limitations
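Some of these structural problems can be cleaned up before the file ever reaches the LLM. The sketch below is one hedged example: it drops columns that are entirely empty, like the blank "Unnamed: 0" column in the ChatGPT transcript earlier in this thread. The function name is illustrative, the rows are assumed to be already loaded into Python lists, and every row is assumed to have as many cells as the header.

```python
import math


def is_missing(cell):
    """Treat None, NaN and blank text as missing."""
    if cell is None:
        return True
    if isinstance(cell, float) and math.isnan(cell):
        return True
    return str(cell).strip() == ""


def drop_empty_columns(header, rows):
    """Remove columns in which every data cell is missing."""
    if not rows:
        return header, rows
    keep = [i for i in range(len(header))
            if not all(is_missing(row[i]) for row in rows)]
    new_header = [header[i] for i in keep]
    new_rows = [[row[i] for i in keep] for row in rows]
    return new_header, new_rows


# Rows shaped like the sheet shown in the transcript.
header = ["Unnamed: 0", "Name of the Employee", "Salary"]
rows = [[None, "A", 2000], [None, "B", 2200], [None, "C", 2400]]

new_header, new_rows = drop_empty_columns(header, rows)
print(new_header)
print(new_rows)
```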

When using an LLM like ChatGPT to analyze data and check whether there is a significant difference between two sets of data, there are a few potential pitfalls or errors to be aware of:

Data Quality Issues: If your data isn’t clean, for example because it has missing values, wrong entries, or a disorganized layout, the results might be inaccurate. Think of it like trying to bake a cake with spoiled ingredients. It won’t turn out right.

Incorrect Test Selection: An LLM like ChatGPT might suggest a statistical test, but if the wrong test is used for the data, the results could be misleading. For example, using a t-test when the data isn’t normally distributed could lead to a false conclusion. An LLM like ChatGPT may not always check the data’s distribution before recommending a test.

Skipping Assumptions: Many statistical tests, like the t-test, assume certain things about your data, such as it being normally distributed. If you don’t check whether your data meets these assumptions, you might end up with wrong results. An LLM like ChatGPT might not always remind you to verify these assumptions, so it’s easy to miss them.

Misunderstanding the Results: A common pitfall is not fully understanding the results, especially something like the p-value. If you don’t know what the p-value means (the probability of seeing a difference at least as large as the observed one if there were truly no difference), you could misinterpret the results and draw a wrong conclusion. An LLM like ChatGPT might simplify the explanation too much, making it harder to get the full picture.

Over-Simplification: LLMs like ChatGPT do a great job of making complicated topics easier to understand, but sometimes they might miss important details. For example, they could overlook outliers (those odd data points that don’t quite fit with the rest) or hidden factors that might influence your results.

Lack of Domain-Specific Context: Another potential pitfall is that ChatGPT doesn’t have specific expertise in your field. It can help with general analysis, but it might miss specialized knowledge or details that are important for your analysis. Relying solely on the LLM without considering domain-specific knowledge might result in an incomplete or inaccurate conclusion.

In summary, while an LLM like ChatGPT can be helpful, it’s important to double-check the data, ensure the right statistical test is used, confirm the assumptions, and fully understand the results. Make sure to fill in any gaps with your own expertise to avoid these common pitfalls.
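To make the meaning of a p-value more tangible, here is a minimal sketch of a permutation test for a difference in means. This is a swapped-in technique, not the t-test discussed above: it makes few distributional assumptions, so it is also a useful cross-check on whatever test an LLM reports. The data, the function name and the choice of 2000 resamples are hypothetical.

```python
import random
import statistics


def permutation_p_value(a, b, n_resamples=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(statistics.fmean(pooled[:len(a)])
                   - statistics.fmean(pooled[len(a):]))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical measurements from two independent groups.
group_a = [5.1, 4.9, 5.3, 5.0, 5.2, 4.8]
group_b = [5.9, 6.1, 5.8, 6.2, 6.0, 5.7]

alpha = 0.05  # chosen before looking at the data; use a smaller value if stakes are high
p_value = permutation_p_value(group_a, group_b)
print(f"p = {p_value:.4f}, significant at alpha = {alpha}: {p_value < alpha}")
```

The p-value here is the share of random relabelings that produce a difference at least as large as the observed one, which is the "if there were truly no difference" scenario in plain terms.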

 

When we analyze datasets using an LLM like ChatGPT, below are the potential pitfalls or errors that could occur:

1.      The LLM can misinterpret the dataset if the Excel file is not structured correctly, for example if headers are missing or the format is inconsistent.

2.      Duplicate records might not be identified automatically, leading to skewed results.

3.      If the LLM is not able to identify outliers, this could lead to incorrect results.

4.      If the LLM has not completely understood the context of the data or the specific research question, it will give an inappropriate analysis or irrelevant conclusions.

5.      At times the LLM's explanations might not be detailed enough for users to fully understand the analysis.

6.      The LLM might not function properly if the dataset is too large and requires heavy computation.
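Points 2 and 3 can be checked independently of the LLM. The sketch below is one hedged way to do it: exact-match duplicate detection and the common 1.5 × IQR rule for outliers. The data and function names are hypothetical, and the IQR rule is only one of several conventions.

```python
import statistics
from collections import Counter


def find_duplicates(rows):
    """Return each row that appears more than once (exact match on all cells)."""
    counts = Counter(tuple(row) for row in rows)
    return [list(row) for row, n in counts.items() if n > 1]


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical records: one row is repeated, one salary has an extra zero.
records = [["A", 2000], ["B", 2200], ["A", 2000]]
salaries = [2000, 2200, 2400, 2600, 2800, 28000]

print(find_duplicates(records))
print(iqr_outliers(salaries))
```

Whether a flagged value is a typo or a genuine extreme still needs a person who knows the data.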

There are some interesting answers to this question. Two answers stand out: those from Rahul Das and Dieylani LO. Rahul's answer has been selected as the winner because of its well-structured format; otherwise, the two answers are pretty close.
