February 6, 2009

Here is a simple quiz...

When we compute the standard deviation of an entire population, we divide by N:

Sigma = Sqrt[ ((x1 - xbar)^2 + (x2 - xbar)^2 + ...) / N ]

However, when we compute the standard deviation of a sample, we divide by N-1:

S = Sqrt[ ((x1 - xbar)^2 + (x2 - xbar)^2 + ...) / (N-1) ]

where x1, x2, x3, ... are the individual data points, xbar is the average of the data points, and N is the number of data points. In both cases xbar is given by

xbar = (x1 + x2 + x3 + ...) / N

Why is this so?

SJ.

This quiz was started in Yahoo Groups on Fri Sep 12, 2008
___________________
THE FOLLOWING CORRECT RESPONSES WERE RECEIVED

Hi SJ,

Here's my explanation. Degrees of freedom (df) can be defined as the number of values we can choose freely.

Assume that we are dealing with two sample values, a and b, and we know that they have a mean of 18. Symbolically, the situation is (a + b)/2 = 18. What values can a and b take in this situation? The answer is that a and b can be any two values whose sum is 36, because 36/2 = 18.

Suppose we learn that a has a value of 10. Now b is no longer free to take on any value but must have the value of 26, because if a = 10 then (10 + b)/2 = 18, so 10 + b = 36, and therefore b = 26.

This example shows that when there are two elements in a sample and we know the sample mean of these two elements, we are free to specify only one of the elements, because the other element is determined by the fact that the two elements sum to twice the sample mean. In the same way, once xbar has been computed from a sample of n values, only n-1 of the deviations from xbar are free to vary, which is why the sum of squared deviations is divided by n-1.

We also use df when we select a t distribution to estimate a population mean, and there we use n-1 df, where n is the sample size. For example, if we use a sample of 20 to estimate a population mean, we use 19 df to select the appropriate t distribution.
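The two-value example can be checked with a short Python sketch. The helper names `std_dev` and `last_value` are made up for this illustration and are not from the thread; the standard library's `statistics.pstdev` and `statistics.stdev` are used only as a cross-check on the divide-by-N and divide-by-(N-1) formulas.

```python
import math
import statistics


def std_dev(data, ddof):
    """Square root of the summed squared deviations divided by (N - ddof)."""
    n = len(data)
    xbar = sum(data) / n
    sum_sq = sum((x - xbar) ** 2 for x in data)
    return math.sqrt(sum_sq / (n - ddof))


def last_value(known_values, mean, n):
    """Given n-1 known values and the mean, the remaining value is forced."""
    return n * mean - sum(known_values)


# The example from the reply: two values with mean 18, and a = 10.
b = last_value([10], mean=18, n=2)
data = [10, b]

# Deviations from the mean always sum to zero, so only n-1 of them are free.
deviations = [x - 18 for x in data]

sigma = std_dev(data, ddof=0)  # divide by N
s = std_dev(data, ddof=1)      # divide by N-1

print("b =", b)
print("sum of deviations =", sum(deviations))
print("divide by N:   ", sigma, "vs statistics.pstdev:", statistics.pstdev(data))
print("divide by N-1: ", s, "vs statistics.stdev: ", statistics.stdev(data))
```

With only two data points the gap between the two formulas is large; it shrinks as N grows.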
Regards,
Dilip Kumar
______________
Hello all:

If the denominator N is used for the sample S.D., it becomes a biased estimator of the population S.D. That is, if you consider all possible samples of size N and average all of the resulting variances, this average will not be the population variance. When this happens, the statistic is termed "biased", and the last thing you want is a biased approach. In theory, the multiplier N/(N-1), applied to the variance, corrects this bias. In statistics this correction is tied to the 'degrees of freedom'. (The actual population S.D. is, on average, higher than the sample S.D. computed with N.)

When N is fairly large, the difference between the two formulas is small and trivial. Using the N-1 version of the formula, we still interpret the standard deviation as roughly the average amount by which scores in a distribution differ from the mean.

Thanks
Philip
_____________
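The "average over all possible samples" argument can be verified exactly on a tiny case. The sketch below is only an illustration under stated assumptions: the population `[2, 4, 6, 8]` is made up, the sample size is 2, and samples are drawn with replacement so that the draws are independent. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product


def variance(values, denominator):
    """Sum of squared deviations from the mean of `values`, over `denominator`."""
    mean = Fraction(sum(values), len(values))
    return sum((x - mean) ** 2 for x in values) / denominator


population = [2, 4, 6, 8]  # hypothetical population, chosen for easy arithmetic
n = 2                      # sample size

# Population variance: divide by the population size.
pop_var = variance(population, len(population))

# Every possible ordered sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

# Average the sample variances computed with each denominator.
avg_div_n = sum(variance(s, n) for s in samples) / len(samples)
avg_div_n_minus_1 = sum(variance(s, n - 1) for s in samples) / len(samples)

print("population variance:        ", pop_var)
print("average variance, divide N: ", avg_div_n)
print("average variance, divide N-1:", avg_div_n_minus_1)
print("N/(N-1) times the biased average:", avg_div_n * Fraction(n, n - 1))
```

In this small case, dividing by N gives an average that falls short of the population variance by exactly the factor (N-1)/N, and dividing by N-1 recovers the population variance. Note that this holds for the variance; the square root of the N-1 variance is still a slightly biased estimate of the standard deviation itself.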