
Number of Samples for Hypothesis Test

Vishwadeep Khatri

Message added by Mayank Gupta

Hypothesis Testing - it is the process of using statistical tests to determine whether the observed differences between two or more samples are statistically significant or not. A null hypothesis (Ho) is a stated assumption that there is no difference, or that the difference is due to random chance, while the alternate hypothesis (Ha) is a statement that there is a true difference. With the help of hypothesis testing, we arrive at one of the following conclusions.

1. Fail to reject the Null hypothesis (accept the Null hypothesis)
2. Reject the Null hypothesis (accept the Alternate hypothesis)

From a practical point of view, hypothesis testing allows us to collect samples and make decisions based on facts, and it takes away decisions based on gut feeling, experience or common sense. You have statistical proof of whatever you "feel" or "think" is right.


Sample Size is the number of observations, data points or objects in a sample. Sufficiency of sample size is a key element in hypothesis testing to be able to make inferences about the population. The right sample size is primarily dependent on the cost & time involved in data collection and the need for statistical significance. Statistically, sample size is affected by the following parameters:
a. Significance Level (α), the maximum allowed probability of committing a Type I error
b. Power of the test (1-β), where β is the maximum allowed probability of committing a Type II error
c. Minimum difference (in the test statistic) to be detected
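To illustrate how these three parameters drive the required sample size, here is a minimal sketch using the standard normal-approximation formula for comparing two means. The setup is an assumption for illustration (a two-sample z-test with known, equal standard deviations); the function name and numbers are not from the original post.

```python
from math import ceil
from statistics import NormalDist

def sample_size_two_means(alpha, power, sigma, delta):
    """Per-group n for a two-sample z-test that detects a mean difference `delta`
    with Type I risk `alpha` (two-sided) and the requested power (1 - beta)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for power = 1 - beta
    return ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)

# Detect a half-sigma shift at 5% significance with 80% power
n = sample_size_two_means(alpha=0.05, power=0.80, sigma=1.0, delta=0.5)
print(n)  # 63 per group
```

Note how a smaller minimum difference or a higher power immediately inflates n, which is exactly the cost/significance trade-off described above.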


An application-oriented question on the topic along with responses can be seen below. The best answer was provided by Rajesh Patwardhan on 27th November 2019.


Applause for all the respondents - Rajesh Patwardhan, Abhishek Mitra, Nilesh Gham, Mukul Kandpal, Deepak Pardasani, Shashank Parihar


Q 213. It is common to see two approaches in Hypothesis testing. These are mentioned below. 


  • Take a certain number of samples (whatever is feasible practically), and test the decided hypothesis. If statistical significance is not proved, take some more samples. Keep increasing samples as long as possible and stop if significance is proved. This approach keeps the initial cost low.  
  • Decide the number of samples using a sample size calculation (considering alpha, beta, the difference to be detected, the hypothesized value, etc.) and take a decision based on the outcome of hypothesis testing. This approach provides a decision in one go. 


Which of the two approaches for Hypothesis testing will you select? Why? 


Note for website visitors - Two questions are asked every week on this platform. One on Tuesday and the other on Friday.


7 answers to this question

Recommended Posts


The second approach is more scientific. The first approach is ad hoc and effectively assumes the result of the hypothesis test. While the first approach may seem to require a lower initial investment, since you are trying to prove significance you keep increasing the sample size, and you may end up spending more due to the iterative approach. 


In case of the second one, we are adopting an unbiased approach to problem solving. The second approach clearly defines the sample size needed to reach a conclusion at a certain confidence level. It may seem costlier, but in the end it will save a lot of money and provide an unbiased outcome that can be relied upon.
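The bias in the first approach can actually be simulated. The sketch below is an assumed setup (a z-test with known sigma, data generated under a TRUE null, and a "peek" at the data after every batch of samples): stopping as soon as significance appears rejects a true null far more often than the nominal 5%, which is why the iterative approach effectively assumes its own conclusion.

```python
import random
from statistics import NormalDist

random.seed(7)
Z_CRIT = NormalDist().inv_cdf(0.975)  # two-sided z-test at alpha = 0.05

def peeks_reject(looks=(10, 20, 30, 40, 50)):
    """One experiment under a true null (mean 0, sigma 1): test after every batch,
    stop the moment 'significance is proved' -- approach 1 from the question."""
    data = []
    for n in looks:
        while len(data) < n:
            data.append(random.gauss(0, 1))
        z = sum(data) / (len(data) ** 0.5)  # z statistic with known sigma = 1
        if abs(z) > Z_CRIT:
            return True                     # stops early, declares a difference
    return False

runs = 2000
false_positives = sum(peeks_reject() for _ in range(runs)) / runs
print(false_positives)  # noticeably above the nominal 0.05
```

With five peeks the realized Type I error rate is well above the 5% the experimenter believes they are using; a pre-computed fixed sample size (approach 2) does not suffer from this inflation.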


The purpose of a hypothesis test is to validate an assumption. Statistical significance is the indicator of the validation. Adequate sample size is the pre-requisite. Minimizing cost while doing a hypothesis test may be an ask, but minimizing risk or damage is a need. Keeping all these factors in mind, let's now address the question.


An adequate sample is a pre-requisite, therefore my choice of approach will depend on the availability of samples. But minimizing risk is a need, so I want to be able to validate my assumption with the fewest casualties. 


So if I were testing a drug's effectiveness through a clinical trial on monkeys, I would not take a pre-determined sample size, as I would like to minimize casualties. I would start with the least possible sample size and increase slowly to be able to reach a decision point. Such a trial will stop immediately at the first evidence of failure, to minimize future casualties. That is the tolerance level of such a trial, which is defined at the onset. Please note that a casualty may not mean death, but the surfacing of unexpected symptoms beyond the perceived effects of the drug.


Now if I were testing a new version of a website to see if it generates more footfall, I would typically use the same incremental approach, but this time I would potentially create different treatment segments, and whenever a specific treatment reaches statistical significance, I would stop the trial. That minimizes cost. 


Lastly, if I were to conduct a hypothesis test to predict which party would win an election, and if I had the ability to pull the required sample without much cost or risk, I would take the second approach due to availability and less exposure to cost and risk. 


However, having said all that, we need to be cognizant of the alpha and the difference to be detected. If I were to prove something with the best possible certainty, and I know what difference is critical, we need to be able to bear the cost of a larger sample, while casualty minimization remains a priority and a potential show stopper.


I would choose the second one, mainly owing to the statistical power I desire. 

Statistical power, i.e. the probability of rejecting an incorrect null hypothesis, usually expressed as (1-beta), is evaluated under the assumption that the alternative hypothesis is true. Ways to increase statistical power include:

1. The direction of the hypothesis: a uni-directional (one-tailed) hypothesis would be stronger, as it concentrates the rejection region on only one side of the curve

2. The alpha level I choose: I may get more statistical power if I choose a "relaxed" alpha (e.g. 0.1, which may be suitable for some experiments)

3. The sample size: a higher sample size reduces the standard error, making the sampling distributions narrower

4. Standard deviation: a lower standard deviation would have the same effect as point 3. 


Keeping an eye on the statistical power is a far better way to choose a sample size (as in point 3) than the ad hoc method in option 1. Even if option 1 is used for feasibility reasons, it is still better to keep an eye on the statistical power.

Also, an inexperienced experimenter, given option 1, may lose track of the critical value as well!
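The effect of each of these levers can be checked numerically. A rough sketch, using a one-sample two-sided z-test approximation (the far-tail contribution is ignored; the function and figures are illustrative, not from the original post):

```python
from math import sqrt
from statistics import NormalDist

def power_one_sample_z(n, delta, sigma=1.0, alpha=0.05):
    """Approximate power of a two-sided one-sample z-test detecting a shift
    `delta` with known standard deviation `sigma` (ignores the far tail)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(delta * sqrt(n) / sigma - z_crit)

# Each lever raised in isolation, relative to a small baseline study:
print(power_one_sample_z(16, 0.5))              # baseline
print(power_one_sample_z(64, 0.5))              # more samples -> more power
print(power_one_sample_z(16, 0.5, alpha=0.10))  # relaxed alpha -> more power
print(power_one_sample_z(16, 0.5, sigma=0.5))   # lower sigma -> more power
```

Running the four variants confirms the direction of every lever listed above, which is exactly why sizing the study around a target power beats the ad hoc option.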




The second approach would be better because in the first scenario only the initial cost is low; there is no guarantee of achieving significance initially. After that, with every added sample the cost increases (and the timeline increases with each step), and there is still no guarantee of success at each step until final results are achieved.

Whereas if we select the second option, the initial cost might be a bit high, but the test is sized to reach a conclusion in one go, which saves both cost and time. And time becomes more critical depending upon the process.

So my answer will be the second one. 


Hypothesis testing is a procedure of making inferences about the population based on the information derived from the samples. It requires making some initial assumptions and then you try to prove your assumption or hypothesis with the support of statistical tests.


You are given a practical problem; you convert that problem into a statistical problem and try to find a statistical solution to that problem based on sample data and statistical tests. Then you convert this statistical solution into a practical solution.


You collect samples from the population under study based on certain assumptions. There are two methods in this regard:

1.    Trial and error method

2.    Scientific method


The trial and error method is based on making assumptions or hypotheses about the subject under study and an appropriate sampling procedure: taking samples again and again, and drawing inferences again and again, until statistical significance is proved.


Statistical significance is based on:

      i.        Type I error rate

    ii.        Type II error rate

   iii.        Selection of the correct type of test based on hypothesis formulation.

   iv.        Value of your test statistic, which is obtained from sample data by applying the appropriate statistical test to that data.

    v.        P-value or the critical value


If the P-value is less than the level of significance, we reject our assumption; equivalently, if the value of the test statistic falls in the region of rejection, we reject our initial assumption.
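This decision rule can be sketched for a simple case. The setup below is hypothetical (a one-sample z-test with known sigma; all numbers are made up for illustration):

```python
from math import sqrt
from statistics import NormalDist

def z_test_p_value(xbar, mu0, sigma, n):
    """Two-sided p-value for a one-sample z-test with known sigma."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical data: sample mean 2.1 vs hypothesized mean 2.0, sigma 0.5, n = 100
p = z_test_p_value(2.1, 2.0, 0.5, 100)
alpha = 0.05
print(p, "reject H0" if p < alpha else "fail to reject H0")
```

Here the p-value (about 0.046) falls just below alpha, so the initial assumption is rejected; nudge the sample mean to 2.05 and the same rule fails to reject.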


In the trial and error method, we take samples again and again until statistical significance is proved, but we are not sure about the Type I and Type II error rates.


Because, if we take too few samples, we have to compromise on the error rates, and if we take large enough samples, the error rates drop further, any outlier present is likely to be detected, and the difference between the population parameter and the test statistic becomes small. Taking the optimum number of samples means fixing the Type I error rate and reducing the Type II error rate.


We may prove statistical significance, but is it justifiable?

How will we justify our conclusion? Are our conclusions reliable?

Are we able to detect a difference that really exists, or have we detected a difference that does not really exist?

Initially the cost of conducting such a study is low, but as we move forward taking more and more samples successively, the study tends to become costly: taking samples costs time and money, sometimes samples are destroyed during testing, sampling error might enter our study, and due to fatigue we might make errors in sample selection and/or measurement. Also, we do not have a correct idea of how reliable our estimate is (how close it is to the population parameter).


The scientific method is based on several factors:

      i.        Type I error

    ii.        Type II error

   iii.        Selection of the correct type of test based on hypothesis formulation.

   iv.        Value of the test statistic

    v.        E = margin of error (how much difference we want to be able to detect)

   vi.        Power of the test

  vii.        P-value


Based on this information, we calculate the optimum sample size for our study.

Thus, we will be able to control the Type I and Type II error rates and will be able to detect a difference if it really exists. The cost of the procedure will be optimum and we will get an optimum solution.
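For the margin-of-error factor (E) listed above, the classic sizing formula for estimating a mean, n = (z·σ/E)², can be sketched as follows (the σ and E values are illustrative assumptions, not from the original post):

```python
from math import ceil
from statistics import NormalDist

def n_for_margin(sigma, margin, alpha=0.05):
    """Samples needed to estimate a mean to within +/- `margin` units
    at confidence level 1 - alpha, given a known (or assumed) sigma."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil((z * sigma / margin) ** 2)

# e.g. sigma = 15 assumed from historical data; we want the mean within +/- 3
print(n_for_margin(sigma=15, margin=3))  # 97
```

Halving the margin roughly quadruples n, which is the cost/precision trade-off the scientific method makes explicit up front.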

