During the first decade of the 20th century, an interesting observation was made while simulating the standard normal distribution, which is defined as having a mean of zero and a standard deviation of one. Over many trials, the means of the random normal deviates did approximate the expected mean of zero, and their standard deviations approximated the expected standard deviation of one. However, when the sample sizes became very small, the standard deviations of the random normal deviates were consistently less than one, even though their means still approximated the expected mean of zero. During these simulation experiments, the true variance was defined as
$$\sigma^2 = \frac{\sum (X - \bar{X})^2}{n}$$
The true standard deviation was defined as the square root of the above equation. A question naturally arises whether some index other than the true variance could approximate the expected value better. Since the expected standard deviation was consistently underestimated only for small samples, a prime candidate for a new variance index was a variance defined as
$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$
since division by a smaller value makes the value of the fraction larger. Decreasing n by 1 seemed a logical candidate: for large n, dividing by n - 1 instead of n produces only a small, negligible increase in the value of the fraction, whereas for small n the increase can be large.
For example, let the sum of squared deviation scores in the numerator of the variance expression be some arbitrary value, say 10. Dividing 10 by 30 gives .33; dividing 10 by 29 gives .34. Decrementing n by 1 increased the fraction by about .01. Now divide 10 by 5: the result is 2.00. Divide 10 by 4: the result is 2.50. Decrementing n by 1 increased the fraction by .50.
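The arithmetic above can be checked with a few lines of Python. This is only a small sketch: the function name `divisor_effect` and the constant `SUM_OF_SQUARES` are made up for illustration, and the value 10 is the arbitrary sum of squared deviation scores used in the example.

```python
# Arbitrary sum of squared deviation scores, as in the example in the text.
SUM_OF_SQUARES = 10.0


def divisor_effect(n, sum_of_squares=SUM_OF_SQUARES):
    """Return (SS / n, SS / (n - 1), increase) for one sample size n."""
    by_n = sum_of_squares / n
    by_n_minus_1 = sum_of_squares / (n - 1)
    return by_n, by_n_minus_1, by_n_minus_1 - by_n


for n in (30, 5):
    by_n, by_n_minus_1, increase = divisor_effect(n)
    print(f"n = {n:2d}: SS/n = {by_n:.2f}, "
          f"SS/(n - 1) = {by_n_minus_1:.2f}, increase = {increase:.2f}")
```

Trying other values of n in the loop shows how quickly the increase shrinks as n grows.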
When the definition of the variance was changed so that the sum of the squared deviation scores was divided by n - 1, the standard deviations of the random normal deviates started to approximate the expected value of one even for small sample sizes. The new index was called the unbiased variance. Its square root was called the unbiased standard deviation.
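The reason the n - 1 divisor works can be stated compactly. The notation here is an assumption introduced only for this note: SS stands for the sum of squared deviation scores around the sample mean, and $\sigma_0^2$ stands for the variance of the population being sampled (equal to 1 in the simulations described above). For independent observations, the standard result is

$$E\left[\frac{SS}{n}\right] = \frac{n - 1}{n}\,\sigma_0^2, \qquad E\left[\frac{SS}{n - 1}\right] = \sigma_0^2$$

so division by n underestimates the population variance by the factor (n - 1)/n on average. That factor is 2/3 for n = 3 but .99 for n = 100, which matches the pattern of a shortfall that shows up only in very small samples.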
Simulations using random number generators are often called Monte Carlo experiments, since Monte Carlo is to Europe what Las Vegas is to the United States. In our Monte Carlo experiment, let's generate sets of random variables with an expected mean of 0 and an expected variance of 1. Results of this simulation experiment are shown below for n equal to 100, 30, 10, 5, and 3. As you may observe, for ns greater than 30, the differences between the true and unbiased variances are negligible. Even for ns as small as 10, the differences are very small. However, for ns of 5 and 3, the differences are substantial. Considering that very few real-life experiments are done with groups of subjects so small, why even bother to introduce a new index for the variance? You are absolutely right. The unbiased variance index is not a necessary alternative to the true index of variance. What is necessary within the context of inferential statistics, however, is the concept of degrees of freedom.
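A Monte Carlo experiment of this kind might be sketched in Python as follows. This is not the original simulation: the function name `simulate`, the number of trials (5000), and the fixed seed are assumptions chosen for illustration, and only the standard library's `random` module is used.

```python
import random


def simulate(n, trials=5000, seed=0):
    """Average the divide-by-n and divide-by-(n - 1) variances over many samples.

    Each trial draws n standard normal deviates (mean 0, standard deviation 1).
    The trial count and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    total_by_n = 0.0
    total_by_n_minus_1 = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(sample) / n
        sum_of_squares = sum((x - mean) ** 2 for x in sample)
        total_by_n += sum_of_squares / n                  # "true" variance
        total_by_n_minus_1 += sum_of_squares / (n - 1)    # "unbiased" variance
    return total_by_n / trials, total_by_n_minus_1 / trials


for n in (100, 30, 10, 5, 3):
    by_n, by_n_minus_1 = simulate(n)
    print(f"n = {n:3d}: divide by n -> {by_n:.3f}, "
          f"divide by n - 1 -> {by_n_minus_1:.3f}")
```

The exact figures depend on the seed and the number of trials, so none are quoted here. The averages for the n - 1 divisor should stay close to 1 at every sample size, while those for the n divisor should fall noticeably below 1 only for the smallest ns.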

The degrees of freedom, ν, depending on circumstances, often equal n - 1, k - 1, or n - k. Remember how, within the context of the analysis of variance, we defined one of the variance components as the variance between the means? While sample sizes small enough to warrant the use of the unbiased variance virtually never occur in the course of statistical measurement, experiments comparing two or three means are common. Consider an experiment involving a control group and an experimental group, with the number of groups designated by k. Computing the variance between the means with k (2) or with k - 1 (1) as the divisor results in a difference in the variance estimates that is definitely not negligible.
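To see how much the divisor matters when k = 2, consider a small Python sketch. The two group means (10 and 14) are hypothetical values invented for this illustration, the helper `between_means_variance` is likewise made up, and the groups are assumed to be of equal size so that the grand mean is the simple average of the group means.

```python
def between_means_variance(means, divisor):
    """Sum of squared deviations of the group means from their grand mean,
    divided by the supplied divisor (k or k - 1)."""
    grand_mean = sum(means) / len(means)
    sum_of_squares = sum((m - grand_mean) ** 2 for m in means)
    return sum_of_squares / divisor


# Hypothetical means for a control group and an experimental group.
group_means = [10.0, 14.0]
k = len(group_means)

by_k = between_means_variance(group_means, k)
by_k_minus_1 = between_means_variance(group_means, k - 1)
print(by_k, by_k_minus_1)  # prints: 4.0 8.0
```

With two groups, dividing by k - 1 rather than k doubles the estimate, whatever the actual values of the means.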
We strive for the most parsimonious description of the general linear model of statistics and thus use the true variance, introducing the unbiased variance only when necessary,