Benchmark Six Sigma Forum

Message added by Mayank Gupta,

A q-value is a p-value adjusted for multiple testing. It gives the expected proportion of false positives among all the positive (significant) results of a set of hypothesis tests, which is why it is interpreted in terms of the False Discovery Rate (FDR). The q-value is preferred over the p-value in areas where we run multiple hypothesis tests.

 

An application-oriented question on the topic along with responses can be seen below. The best answer was provided by Rahul Arora on 3rd Jul 2022.

 

Applause for all the respondents - Shraddha Sequeira, Anshul Vaidya, Rahul Arora, Sohan Subhash Mirajkar, Chandra Shekhar Chauhan.

q-value

Featured Replies

Q 483. q value - A q-value is a p-value that has been adjusted for the False Discovery Rate (FDR). Explain q-value with an example. While doing hypothesis testing, under what circumstances would you prefer to work with a q-value instead of p-value?

 

Note for website visitors - Two questions are asked every week on this platform. One on Tuesday and the other on Friday.

Solved by Rahul.Arora2

 

A p-value is a statistical measure used to test a hypothesis against observed data. It is the probability of obtaining a result at least as extreme as the one observed, assuming the null hypothesis is true. It is used to judge statistical significance in hypothesis testing. If the p-value is low, the null hypothesis is rejected; if the p-value is high, we fail to reject the null hypothesis. A high p-value indicates that the data show no strong evidence of an effect on the population: an effect may exist, but the evidence for it is not strong. The conventional threshold for the p-value is 5%.

 

The concept of the False Discovery Rate (FDR) is used when multiple tests are conducted. The FDR is the ratio of the number of false positive results to the total number of positive test results.

 

A p-value threshold of 0.05 means that 5% of all tests in which the null hypothesis is true will result in false positives. An FDR-adjusted p-value (also called a q-value) of 0.05 indicates that 5% of the tests called significant will be false positives. In other words, an FDR of 5% means that, among all results called significant, only 5% are truly null.

 

We would use the q-value when many tests are run and the number of false positives is likely to be high, i.e. tests show a significant difference when in reality none exists. If we only set the p-value cut-off to 5%, there is a possibility that we may not make the right decision. An FDR-adjusted cut-off of 5% means that we are willing to accept that 5% of the tests called significant are false positives.

 

 

Assurance activities in many experimental setups apply a measure of statistical significance, the p-value, to the test statistic. The p-value is defined as “the probability of observing a test statistic at least as extreme as the one obtained, when the null hypothesis is true”.
Hypothesis testing commonly involves weighing a null hypothesis against an alternative hypothesis, using the p-value to determine whether the data obtained from an activity or procedure are a correct representation or consequence of the experiment. Thus, a null and an alternative hypothesis are framed for the test statistic, with the baseline:

A.      Null hypothesis that “the set of resultant data values is representative of the experiment”, demonstrated using test statistics on characteristics of the sample population.

B.      Alternative hypothesis that “the set of resultant data values is not representative of the experiment”, demonstrated using test statistics on characteristics of the sample population.

Subsequently, a Type-I error is defined as “rejecting the null hypothesis when the null hypothesis is true”, and a Type-II error is defined as “failing to reject the null hypothesis when the null hypothesis is false”.

The positive False Discovery Rate (pFDR) is the term assigned to the “rate of occurrence of Type-I errors among the positive results of multiple tests, i.e. the ratio of the count of false positive observations to the total count of positive observations”.

Or pFDR = False Positive Observations / (False Positive Observations + True Positive Observations)

A false positive occurrence refers to an observation whose p-value falls below the targeted significance threshold (for example p-value = 0.05, i.e. 95% confidence) even though the null hypothesis is true. Such observations are treated as qualifying results for the test statistic and as a valid consequence of the experiment design. However, these false positive values should ideally have been rejected during analysis, recording and reporting.

Consequently, in the case of multiple test runs, a limit of 5% on the occurrence of false positives may be set, suggesting that 95% of results are accurate and truly representative of the experimental design. One might therefore infer that 5% of the result observations will be false positives. This interpretation is biased, however, because the count of false positives under this logic is estimated from the accepted p-value threshold for the data set rather than from the results actually declared positive.

The researcher may restrict the positive False Discovery Rate to a low band; however, in a low-cost experimental setup or a research-centric design, a more elastic pFDR may be set up and monitored to generate meaningful insights from the research.

An alternative approach to account for false positives is to adjust the p-values of the tests mathematically, so that they are effectively compared against stricter (lower) alpha thresholds. The q-value is the adjusted value used in place of the p-value when false positives across multiple tests are a concern in an experimental setup. The q-value is estimated using the Benjamini-Hochberg procedure given hereunder:

1.       In multiple hypothesis testing, conduct all the hypothesis tests and find the p-value for each test.

2.       Rank and arrange the p-values in order from smallest to largest.

3.       Calculate the Benjamini-Hochberg critical value for each p-value using the formula (i/m) * Q, where:

i = rank of the p-value

m = total number of hypothesis tests (the number of test runs)

Q = chosen false discovery rate

4.       Find the largest p-value in the test results that is smaller than its corresponding Benjamini-Hochberg critical value.

5.       Consider p-values that are smaller than their corresponding Benjamini-Hochberg critical values as significant.

6.       Also consider all p-values smaller than the largest p-value obtained in step 4 as significant, even if they are not smaller than their own critical values.
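The six steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name and the sample p-values are hypothetical, and the comparison uses "less than or equal to" as an assumption (the steps above say "smaller than").

```python
def benjamini_hochberg_reject(p_values, fdr=0.05):
    """Return a list of booleans: True where the test is declared significant.

    p_values: p-values of the m tests (any order).
    fdr: the chosen false discovery rate, Q in the steps above.
    """
    m = len(p_values)
    # Step 2: rank the p-values from smallest to largest, remembering positions.
    order = sorted(range(m), key=lambda idx: p_values[idx])
    # Steps 3-4: find the largest rank i whose p-value is <= (i / m) * Q.
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= (rank / m) * fdr:
            largest_rank = rank
    # Steps 5-6: every p-value ranked at or below that rank is significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_rank:
            significant[idx] = True
    return significant


# Hypothetical p-values from four tests, checked at Q = 0.05.
print(benjamini_hochberg_reject([0.2, 0.03, 0.001, 0.035], fdr=0.05))
```

In this made-up example the p-value 0.03 (rank 2) is above its own critical value of 0.025, but it is still declared significant because 0.035 (rank 3) is below its critical value of 0.0375. That is the effect of step 6.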

A priori probability is defined as “the chance of an event occurring, given that there is a limited number of outcomes and each outcome is equally likely to occur.” Suppose there are x outcomes of interest within a set of y total outcomes; then the a priori probability of the event is x/y. For example, the a priori probability of getting heads in a single toss of a coin is 0.5.

pFDR is used in conjunction with sensitivity analysis based on the prior probability technique in biological studies. The researcher sets the false discovery rate to limit the number of false positives in the sample data while drawing inferences about the count of infected persons in the sample universe. When a less stringent False Discovery Rate threshold is used, both the number of detections in patient data and the expected number of false positive detections increase. It can be shown empirically that, if the prior probability is low, the False Discovery Rate will be high for a given p-value threshold, and vice versa.
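The relationship between prior probability and False Discovery Rate can be illustrated with a small calculation. The sketch below is not from the original answer: it assumes that "prior" means the share of tested hypotheses where a real effect exists, and it assumes a test power of 0.8, a value chosen only for illustration.

```python
def expected_fdr(prior, alpha=0.05, power=0.8):
    """Expected share of false positives among significant results.

    prior: assumed fraction of tests where a real effect exists.
    alpha: p-value threshold used for each test.
    power: assumed probability of detecting a real effect (hypothetical value).
    """
    false_pos = alpha * (1.0 - prior)  # true nulls wrongly called significant
    true_pos = power * prior           # real effects correctly detected
    return false_pos / (false_pos + true_pos)


for prior in (0.01, 0.1, 0.5):
    print(f"prior={prior:.2f}  expected FDR={expected_fdr(prior):.2f}")
```

Under these assumptions the expected FDR is roughly 0.86 when the prior is 1%, 0.36 when it is 10% and 0.06 when it is 50%, so the same p-value threshold gives very different false discovery rates.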

  • Solution

The Concept of q-value

 
Whenever we run a hypothesis test on a sample of data values drawn from a population, there is a chance that we will get a test statistic that is very extreme compared to the hypothesized value. This suggests that the sample belongs to a different population while in reality it belongs to the same population from which it was drawn. This extreme result is termed a False Positive (FP), and the probability of getting such a false positive, or Type-I error, on a single test is governed by the p-value threshold for that test. The threshold is generally kept at 0.05 (i.e. the significance level), and a result is called significant when its p-value is less than 0.05 or 5%.
 
Let us now apply this to a multiple testing scenario, where we perform multiple hypothesis tests taking different samples from the same population. If for all these tests we set a threshold p-value of 0.05, about 5% of all the tests conducted will result in false positives (FP). For example, if we conduct 1000 tests, generating 1000 p-values, we can expect about 50 of those tests to give a p-value < 0.05 although all the samples belong to the same population. If we plot the p-values from all the tests in a histogram, the shape will be similar to a uniform distribution, where each bin represents a p-value range and the frequency represents the number of tests whose p-values fall within this range.
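The 1000-test scenario can be checked with a short simulation. The sketch below is an illustration under stated assumptions, not part of the original answer: it uses a two-sided z-test on samples drawn from a standard normal population (mean 0, standard deviation 1, treated as known), and the sample size of 30 and the seed are arbitrary choices.

```python
import math
import random
from statistics import NormalDist


def null_p_values(n_tests=1000, sample_size=30, seed=0):
    """Two-sided z-test p-values when every sample truly has mean 0 and sd 1."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    p_values = []
    for _ in range(n_tests):
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        # z statistic for H0: mean = 0, with known standard deviation 1.
        z = (sum(sample) / sample_size) * math.sqrt(sample_size)
        p_values.append(2.0 * (1.0 - std_normal.cdf(abs(z))))
    return p_values


p_values = null_p_values()
# Every "significant" result here is a false positive by construction.
false_positives = sum(p < 0.05 for p in p_values)
# Ten equal-width bins; roughly equal counts indicate a uniform shape.
bins = [0] * 10
for p in p_values:
    bins[min(int(p * 10), 9)] += 1
print("false positives:", false_positives)
print("histogram counts:", bins)
```

The count of false positives should land near 50, and the ten bin counts should each be near 100, which is the flat histogram described above.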
 
If instead the tests are conducted on samples from two different populations and we plot the histogram of the p-values generated from these tests, the histogram will be right skewed, as most of the p-values will fall within p < 0.05 and will thus be true positives. So let's say that we conducted 1000 tests and found that 80% of them, i.e. 800, are significant (having p-value < 0.05) and 200 of them are not significant. Among these 800 significant tests there are false positives as well, and it is not possible to identify them with the p-value alone.
 
To overcome this limitation we use the q-value, which builds on the False Discovery Rate (FDR), defined as FDR = FP / (FP + TP). q-values are basically p-values that are adjusted using an FDR approach. A q-value of 0.05 means that 5% of the results called significant after performing multiple tests are actually false positives. Taking the above scenario, if 800 tests are initially identified as significant, then 5% of them, i.e. 40, are expected to be false positives at a q-value of 0.05. q-values lie between 0 and 1.
 
Generally the cut off for significance for FDR is < 0.05 which means less than 5% of the significant results will be false positives.
 
One popular method is the Benjamini-Hochberg method, which adjusts the p-values in a way that limits the number of false positives reported as significant. It does so by making the p-values larger: for example, before the FDR correction the p-value may be 0.04 (significant), and after the FDR correction it may become 0.06 (not significant).
 
Let us now see an example in order to understand the above method of adjusting the p-value:- 
 
Let’s say we have conducted 10 tests, taking a pair of samples each time from the same distribution, and below are the p-values obtained for each test:
 
0.91 0.11 0.71 0.31 0.51 0.41 0.61 0.21 0.81 0.01
 
First step is to order the p-values from smallest to largest & rank these values as shown below:
 
p-values :  0.01  0.11  0.21  0.31  0.41  0.51  0.61  0.71  0.81  0.91
rank     :  1     2     3     4     5     6     7     8     9     10
 
Here we have one false positive, the first one (p = 0.01, which is < 0.05), since all the samples came from the same distribution. Let us now see whether this false positive remains significant after the adjustment.
 
The next step is to calculate the adjusted values, starting from the last value (ranked 10th).
Here the largest FDR-adjusted p-value is the same as the largest p-value.
 
For the next largest adjusted p-value we choose the smaller of two options: the previous adjusted value, or the current p-value x (total number of p-values / rank of the current p-value).
Thus the adjusted p-value for the 9th ranked p-value is the smaller of the previous adjusted p-value, 0.91, and the current p-value at 9th rank, 0.81 x (10 / 9) = 0.90. Between 0.91 and 0.90 we choose the smaller value, i.e. 0.90.
 
Similarly we will be repeating this for the other ranked p-values as well & the final output will be as shown below:- 
 
adjusted p-values : 0.10 0.55 0.70 0.77 0.82 0.85 0.87 0.89 0.90 0.91
 
Now as you can see the false positive value of 0.01 is now converted into an equivalent adjusted p-value or in other words q-value of 0.10 which is no longer significant as it is now > 0.05.
 
Thus we can see how we can leverage q-value in order to perform adjustments on p-value in order to separate the true positives from the false positives.
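The adjustment walked through in this example can be reproduced with a short script. This is a minimal sketch of the calculation described above; the function name is hypothetical, and statistical packages provide their own versions of this adjustment.

```python
def bh_adjusted_p_values(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda idx: p_values[idx])
    adjusted = [0.0] * m
    previous = 1.0  # an adjusted value can never exceed 1
    # Walk from the largest p-value (rank m) down to the smallest (rank 1).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        # Keep the smaller of the previous adjusted value and the candidate.
        previous = min(previous, candidate)
        adjusted[idx] = previous
    return adjusted


# The ten p-values from the example above, in the order they were obtained.
p_values = [0.91, 0.11, 0.71, 0.31, 0.51, 0.41, 0.61, 0.21, 0.81, 0.01]
q_values = bh_adjusted_p_values(p_values)
for p, q in sorted(zip(p_values, q_values)):
    print(f"p={p:.2f}  adjusted={q:.3f}")
```

Starting `previous` at 1.0 makes the largest adjusted value equal to the largest p-value (0.91 x 10 / 10), matching the first step of the example.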

A q-value is a p-value that has been adjusted for the False Discovery Rate (FDR). The False Discovery Rate is the proportion of false positives you can expect among the significant results of a set of tests. A p-value gives you the probability of a false positive on a single test; if you are running hundreds or thousands of tests on small samples, as is common in fields like genomics, you can use q-values instead.

 

Usually we need to decide ahead of time the level of false positives we are willing to accept, for which under 5% is the norm. This means that we run the risk of getting a false statistically significant result 5% of the time. False positives are a fact of life and are unavoidable. While 5% might be an acceptable false positive rate for running one test, it becomes completely unacceptable if we run thousands of tests on the same small data set.

 

Circumstances to prefer to work with a q-value instead of p-value
Imagine we are playing a scratch-off lotto, and each ticket has a 5% chance of being a winner. One lotto ticket gives us a 5% chance, but if we buy enough lotto tickets, probability tells us that we will eventually get a winner (buying 1000 lotto tickets should do the trick and will in fact give us, on average, 50 winning tickets).

 

The same is also true for laboratory test results.

  • In the first test on our lab data, we have a 5% chance of a false positive.
  • In the second test on our lab data, we have another 5% chance of a false positive.
  • In the thousandth test on our lab data, we have had a 5% chance of a false positive a thousand times.

If we run enough tests, we will get a false positive, i.e. a false significant result. For example, at a 5% false positive rate we expect about 5 false results for every 100 tests we run in which no real effect exists, or 50 for every thousand. This is a pretty high number. This is known as the multiple testing problem.
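The arithmetic behind the lotto analogy can be written out directly. The sketch below assumes the tests are independent and that no real effect exists in any of them; the function names are hypothetical.

```python
def expected_false_positives(n_tests, alpha=0.05):
    """Average number of false positives when every null hypothesis is true."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance of one or more false positives across independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


for n in (1, 100, 1000):
    print(
        f"tests={n:5d}  "
        f"expected false positives={expected_false_positives(n):6.1f}  "
        f"P(at least one)={prob_at_least_one_false_positive(n):.3f}"
    )
```

With a single test the chance of a false positive is 5%, but by 100 tests at least one false positive is almost certain, which is why the per-test threshold alone is not enough.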

The False Discovery Rate (FDR) approach to p-values assigns an adjusted p-value for each test. This is called the “q-value.”

What is Q value? 

Q Value is a p-value that has been adjusted for the false discovery rate (FDR). The false discovery rate is the proportion of false positives we can expect to get from a test. A p-value gives us the probability of a false positive on one sample test. If we are running hundreds or thousands of tests from small samples, we should use q-values.  


 

Why are q-values necessary during hypothesis testing ? 

Generally, we decide ahead of time the level of false positives we are willing to accept. In hypothesis testing, under 5% is the norm. This means that we run the risk of getting a false significant result 5% of the time. We can't escape this fact when running tests, as false positives are a fact of life and are unavoidable. While 5% might be an acceptable false positive rate for running one test, it becomes completely unacceptable if we run thousands of tests on the same small data set.

Assume that we are playing a scratch-off lottery and each ticket has a 5% chance of winning. One ticket gives us a 5% chance, but if we buy enough tickets, probability tells us that we will eventually get a winner (buying 1000 lottery tickets should do the trick and will in fact give us, on average, 50 winning tickets). The same is true for lab tests.

 

  • In the first test on our data, we have a 5% chance of a false positive.
  • In the second test on our data, we have another 5% chance of a false positive.
  • In the thousandth test on our data, we have had a 5% chance of a false positive a thousand times.

 

Therefore, if we run enough tests, we will get a false positive, i.e. a false significant result. In fact, at a 5% false positive rate, we will get about 5 false results for every 100 tests we run in which no real effect exists, or 50 for every thousand. That's pretty high. This is called the multiple testing problem.

The False Discovery Rate approach to p-values assigns an adjusted p-value for each test. This is the q-value. A p-value of 5% means that 5% of all tests will result in false positives. A q-value of 5% means that 5% of significant results will be false positives. q-values usually result in much smaller numbers of false positives, although this is not always the case.

The q-value is not the same as the q we sometimes see in statistics. That q refers to the proportion of elements in a set that don't have a particular attribute. For example, let's say we had 100 students and 55 of them like Math. The proportion of students who like Math is p = 0.55; therefore q = 0.45, which is just 1 - p.

Rahul Arora has provided the best answer to this tricky question. Well done!
