
Benchmark Six Sigma Forum

Sample Size

 

Sample size is the number of observations (data points or objects) in a sample. An adequate sample size is a key element in hypothesis testing, because it determines whether inferences can be made about the population. In practice, the right sample size balances the cost and time involved in data collection against the need for statistical significance. Statistically, sample size is affected by the following parameters:
a. Significance level (α), the maximum allowed probability of committing a Type I error
b. Power of the test (1-β), where β is the maximum allowed probability of committing a Type II error
c. Minimum difference (in the test statistic) to be detected

 

An application-oriented question on the topic along with responses can be seen below. The best answer was provided by Prashanth Datta on 12th April 2019.

 

Applause for the respondent: Prashanth Datta

Featured Replies

Q. 150  Using the Sample Size Calculator for 1 Sample T-Test at https://www.benchmarksixsigma.com/calculators/sample-size-calculator-for-1-sample-t-test, highlight the factors which affect the sample size determination by using some examples.  

 

  

Note for website visitors - Two questions are asked every week on this platform. One on Tuesday and the other on Friday.

 

Solved by Prashanth Datta

  • Solution

Simply put, the One-Sample T-test compares the mean of our sample data to a known value. For example, if we want to measure the Intelligence Quotient [IQ] of a selected group of people in India, we compute their IQ using a set of predefined tests (mapped to global standards). The results give us the individual IQ scores as well as the average IQ of the group. This group average can then be compared to a known value of 82, which is the average IQ of Indians [already computed by accredited testing organizations]. Further, an average score of < 70 means poor IQ; this lower threshold is also computed and made available through global studies. In this case, we have two reference values against which the group's average IQ score can be compared to draw some meaningful conclusions, i.e. a group score < 70 is poor, a score close to 82 matches the Indian average, and a score greater than 82 implies the group has some really intelligent folks. While you may want to strengthen your argument with further statistical analysis, this serves as a starting point for discussion.

 

The One-Sample T-test is used when we don't know the population standard deviation. Like any other statistical test, it works on certain assumptions. To sum up the assumptions:

  • The dependent variable Y should be of a continuous data type
  • The observations analysed should be independent of each other
  • There should be no significant outliers in the data, as we are using the mean as the reference here
  • The data should be normally distributed
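To make the comparison concrete, here is a minimal sketch of how the one-sample t statistic is computed. The scores below are hypothetical values invented for illustration (they are not data from the example above), and only the Python standard library is used.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical IQ scores for a small group (illustrative values only).
scores = [80, 82, 84, 86, 88]
reference_mean = 82  # known value the group average is compared against

n = len(scores)
sample_mean = mean(scores)
sample_sd = stdev(scores)  # sample standard deviation (n - 1 in the denominator)
standard_error = sample_sd / sqrt(n)

# One-sample t statistic, evaluated with n - 1 degrees of freedom.
t_stat = (sample_mean - reference_mean) / standard_error
print(f"mean={sample_mean}, sd={sample_sd:.3f}, t={t_stat:.3f}, df={n - 1}")
```

The statistic would then be compared with the critical t value for n - 1 degrees of freedom at the chosen significance level.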

Further, statistical hypothesis testing is one of the techniques we use to identify critical X's. An interesting question we then run into is how much data the sample should contain to arrive at a meaningful conclusion, i.e. one that leads to root cause identification, so that we can build more effective solutions during the Improve phase.

 

The sample size calculator at https://www.benchmarksixsigma.com/calculators/sample-size-calculator-for-1-sample-t-test is a good place to start.

 

Before I delve into the calculator specifics, let me take an example. A diabetic clinic keeps HbA1C readings as the baseline measurement for its patients. While I am not getting into the technicalities of how HbA1C readings work, a global average of 8 is taken as acceptable. The clinic has been running tests on its patients and recording the results. The sample data gave an average HbA1C reading of 8.04 with a standard deviation of 0.34. The clinic had introduced a new alternative diabetic drug by the time this sample average of 8.04 was computed. The clinic now wants to run a hypothesis test to see whether the new drug has really helped bring diabetes under control.

 

Before we define the problem statement and use hypothesis testing to see whether the change in drug is a critical X, the first step is to establish how much data needs to be evaluated. The calculator helps us with this critical step. Looking at the parameters of the calculator:

  1. Confidence Level (guards against Type I error) - A Type I error means rejecting the null hypothesis when it is actually true. As a rule of thumb, a 5% chance of this error is acceptable, which corresponds to a 95% confidence level. Let's keep this value at 95%.
  2. Power of the Test (guards against Type II error) - A Type II error means failing to reject the null hypothesis when it is false. As a rule of thumb, we allow a 10% chance of this error, which corresponds to a power of 90%. [The Type I and Type II error limits can change based on the risk appetite of the producer and the consumer.]
  3. Reference Mean Value - we will keep it at 8, as this is the globally accepted normal HbA1C value.
  4. Sample standard deviation - we will keep it at 0.34, based on the samples.
  5. Sample mean value - 8.04, as arrived at from the tests.

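With the above data, the calculator shows that we need 759 samples to check whether our mean differs from the reference mean. With fewer samples, the test would have less than the chosen 90% power to detect a difference as small as 0.04, so the result would not support a meaningful inference.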
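As a rough cross-check of that figure, the sketch below applies the common normal-approximation formula n = ((z for confidence + z for power) × standard deviation / difference)², using a two-sided test. This is an assumption on my part: the calculator's exact method is not stated on its page, and the function name `approx_sample_size` is made up for this illustration.

```python
from statistics import NormalDist

def approx_sample_size(confidence, power, std_dev, reference_mean, sample_mean):
    """Normal-approximation sample size for a two-sided one-sample test.

    Assumed textbook formula; not necessarily what the calculator implements.
    """
    alpha = 1 - confidence                       # allowed Type I error
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)         # power = 1 - Type II error
    difference = abs(sample_mean - reference_mean)
    return ((z_alpha + z_beta) * std_dev / difference) ** 2

# Inputs from the clinic example above.
n = approx_sample_size(confidence=0.95, power=0.90, std_dev=0.34,
                       reference_mean=8.0, sample_mean=8.04)
print(round(n))
```

Under these assumptions the result comes out just above 759, which agrees with the calculator's figure when rounded to the nearest whole number. Rounding up instead, which is the more conservative convention, would give 760.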

 

An interesting thing to observe is that if we relax our Type I and Type II error limits, i.e. accept more producer and consumer risk, the required sample size falls. How much relaxation is acceptable depends on the industry in which you are analyzing the data. While medical and critical research industries will not accept high allowances, other industries may allow some tolerance.
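This effect can be sketched with the same normal-approximation formula (again an assumption, not the calculator's documented method). The helper `approx_n` and the three risk scenarios are hypothetical; the standard deviation of 0.34 and difference of 0.04 come from the clinic example.

```python
from statistics import NormalDist

def approx_n(alpha, power, std_dev=0.34, difference=0.04):
    """Assumed normal-approximation sample size for a two-sided test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ((z_alpha + z_beta) * std_dev / difference) ** 2

# (alpha, power) pairs from strict to relaxed; illustrative choices only.
scenarios = [(0.01, 0.95), (0.05, 0.90), (0.10, 0.80)]
sizes = {s: approx_n(*s) for s in scenarios}

for (alpha, power), n in sizes.items():
    print(f"alpha={alpha:.2f}, power={power:.2f} -> n is about {n:.0f}")

# Doubling the difference to be detected cuts the sample size to a quarter.
ratio = approx_n(0.05, 0.90, difference=0.08) / approx_n(0.05, 0.90)
print(f"n ratio when the difference doubles: {ratio:.2f}")
```

The required sample size shrinks as the error limits are relaxed, and it is even more sensitive to the size of the difference we want to detect, since that term is squared.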

 

In summary, this calculator helps identify the sample size when a sample mean is available. It is especially useful in service sectors for starting some basic analytics in the Analyze phase and arriving at sound inferences for solution building activities.

Prashanth has answered the question in detail. His answer provides a complete understanding of the factors impacting the sample size when comparing the average of a population with an external standard. Great job!
