Intelligent use of the t-test requires knowledge of its assumptions and limitations, the conventions of its interpretation, and the circumstances necessary for its proper use. The assumptions about the properties of data to which the t-test is applied are shared with those of the coefficient of correlation, since the ratio of the coefficients of determination and alienation is at its core. These are the assumptions of linearity, normality, and homoscedasticity.
It is impossible to violate the assumption of linearity when using the t-test, just as it is when using the point biserial coefficient of correlation. The numerator of the t-test is the point biserial, indexing values of all possible differences between group means. Within the framework of bivariate regression, the slope of the regression line for the point biserial is determined by the means of both groups, indexed by parent vector X. These circumstances are depicted in the figure below.

Two points representing the group means define the regression line. Within the framework of the bivariate regression, the assumption of linearity cannot be violated.
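This claim can be checked numerically. The sketch below is an illustration only: the scores are hypothetical, and the helper names (`pearson_r`, `regression_line`, `pooled_t`) are not taken from the text. It codes group membership as 0 and 1, fits the bivariate regression line, and converts the point biserial r into t through the standard identity t = r * sqrt(n - 2) / sqrt(1 - r^2).

```python
from math import sqrt
from statistics import mean

# Hypothetical scores; group membership is dummy-coded 0/1 in x.
group0 = [4, 5, 6, 4, 5, 6]
group1 = [6, 7, 8, 7, 8, 9]
x = [0] * len(group0) + [1] * len(group1)
y = group0 + group1


def pearson_r(x, y):
    """Pearson r; with a 0/1 predictor this is the point biserial."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def regression_line(x, y):
    """Least-squares slope and intercept of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx


def pooled_t(g0, g1):
    """Independent-groups t with the pooled variance estimate."""
    n0, n1 = len(g0), len(g1)
    m0, m1 = mean(g0), mean(g1)
    ss0 = sum((v - m0) ** 2 for v in g0)
    ss1 = sum((v - m1) ** 2 for v in g1)
    pooled_var = (ss0 + ss1) / (n0 + n1 - 2)
    return (m1 - m0) / sqrt(pooled_var * (1 / n0 + 1 / n1))


r = pearson_r(x, y)
n = len(y)
t_from_r = r * sqrt(n - 2) / sqrt(1 - r ** 2)
slope, intercept = regression_line(x, y)

print("line at X=0:", intercept, " mean of group 0:", mean(group0))
print("line at X=1:", intercept + slope, " mean of group 1:", mean(group1))
print("t from r:", t_from_r, " pooled t:", pooled_t(group0, group1))
```

With 0/1 coding the intercept equals the mean of group 0 and the intercept plus the slope equals the mean of group 1, so the fitted line necessarily passes through both group means, and the t obtained from r agrees with the pooled-variance t.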
Most computer programs routinely check for equality of variances for both groups before computing a t-test, using the formula
$$F = \frac{s_0^2}{s_1^2}$$
The use of subscripts 0 and 1 in the above formula is arbitrary. The larger variance, whether subscripted 0 or 1, is always placed in the numerator so that the obtained F is greater than or equal to 1.00. If the obtained F is greater than or equal to the tabulated value, we may conclude that the variances are not equal.
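The variance-ratio check can be sketched in a few lines. The scores below are hypothetical and the function name `variance_ratio` is an assumption made for illustration; the sketch only computes F and its degrees of freedom, leaving the comparison with the tabulated value to the reader.

```python
from statistics import variance

# Hypothetical scores for two groups of six subjects each.
control = [4, 5, 6, 4, 5, 6]
experimental = [1, 2, 3, 7, 8, 9]


def variance_ratio(a, b):
    """Return (F, df_numerator, df_denominator).

    The larger sample variance always goes in the numerator,
    so the returned F is never below 1.
    """
    var_a, var_b = variance(a), variance(b)
    if var_a >= var_b:
        return var_a / var_b, len(a) - 1, len(b) - 1
    return var_b / var_a, len(b) - 1, len(a) - 1


f_ratio, df_num, df_den = variance_ratio(control, experimental)
print(f"F({df_num}, {df_den}) = {f_ratio:.2f}")
```

The obtained F is then compared with the tabulated F for the same pair of degrees of freedom.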
Let us consider an example of an experiment in which the variances of the control and experimental groups markedly differ, as suggested in the following table.

The plot of the above data reveals no difference between the means (M0 = 5 and M1 = 5) and a large difference in variability between the groups.

This means that the treatment had different effects on individual subjects. The factors responsible for these differential treatment effects should be identified and explicated. The data set can be reanalyzed as shown below.

If found, the differences between the control group (M0 = 5.00) and the new experimental groups one (M1 = 2.00) and two (M2 = 8.00) should be tested.
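A reanalysis of this kind can be sketched as follows. The scores and the factor labels "a" and "b" are hypothetical, chosen only so that the group means match the values quoted above (5, 2, and 8); the helper `split_by_factor` is likewise an assumption, not part of the original analysis.

```python
from statistics import mean, variance

# Hypothetical control scores (mean 5).
control = [4, 5, 6, 4, 5, 6]

# Hypothetical experimental scores, each tagged with the level of a
# factor suspected of moderating the treatment effect.
experimental = [(1, "a"), (2, "a"), (3, "a"), (7, "b"), (8, "b"), (9, "b")]


def split_by_factor(records):
    """Group scores by the level of the suspected factor."""
    groups = {}
    for score, level in records:
        groups.setdefault(level, []).append(score)
    return groups


subgroups = split_by_factor(experimental)
all_experimental = [score for score, _ in experimental]

print("control:", mean(control), variance(control))
print("experimental, pooled:", mean(all_experimental), variance(all_experimental))
for level, scores in sorted(subgroups.items()):
    print(f"experimental, factor {level}:", mean(scores), variance(scores))
```

Pooled, the experimental group has the same mean as the control group but a much larger variance; once split, each subgroup has a small variance and a mean that clearly departs from the control mean.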
The differential effect of the treatment in the above example was exaggerated to stress the point. However, if the difference in variability between the groups is real, then an explicit description of the reasons for this difference should be an integral part of the interpretation of the results.
Another possibility arises when the variability of the experimental group is constricted or expanded, and differentiation into separate subgroups is not indicated. In this instance, the search for the causal factors behind the differing variability of the groups is more difficult. Whether these factors can be identified depends on the circumstances surrounding each particular experiment.
For unequal group variances, various methods of separate variance estimates were proposed by Cochran and Cox, Behrens and Fisher, and Welch to compensate for the lack of homoscedasticity. The method of Welch gained wide recognition, perhaps because it is implemented by most computer packages for statistical analysis, notably SPSS.
When the group distributions are not homoscedastic, Welch proposes an elaborate correction for the degrees of freedom as

The use of Welch's correction often results in estimates of the degrees of freedom that are not integers. Rather than applying these arbitrary remedies, a sensible approach to violation of the homoscedasticity assumption is to look for the factors causing the variance to expand or constrict, as discussed in the previous section.
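To illustrate the non-integer degrees of freedom, the sketch below uses the Welch-Satterthwaite form of the correction as it is commonly given in textbooks. That form is an assumption here, since the formula intended by the text is not reproduced; the data and the function name `welch_t` are hypothetical as well.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical groups with equal means and very unequal variances.
control = [4, 5, 6, 4, 5, 6]
experimental = [1, 2, 3, 7, 8, 9]


def welch_t(a, b):
    """Separate-variance t and Welch-Satterthwaite degrees of freedom.

    Assumed (textbook) form of the correction:
    df = (va/na + vb/nb)^2 / ((va/na)^2/(na-1) + (vb/nb)^2/(nb-1))
    """
    n_a, n_b = len(a), len(b)
    se2_a = variance(a) / n_a  # squared standard error of mean a
    se2_b = variance(b) / n_b  # squared standard error of mean b
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


t, df = welch_t(control, experimental)
print(f"t = {t:.3f}, df = {df:.3f}")
```

For these data the corrected degrees of freedom fall between 5 (the smaller of the two n - 1 values) and 10 (the pooled n0 + n1 - 2) and are not a whole number.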
The t-test is robust with respect to violations of the assumption of normality. Statistical inference assumes a population of events, and our observations or experiments sample from that population. What is likely to happen in the course of this sampling? Let us define a population X = [1 2 3] and plot its histogram.
The mean of this population equals 2, and its variance equals 0.67. Let us draw from this population all possible samples of size 2, sampling with replacement, and compute their means.
| | First value | Second value: 1 | Second value: 2 | Second value: 3 |
|---|---|---|---|---|
| Samples 1-3 | 1 | (1,1) | (1,2) | (1,3) |
| Means | | 1 | 1.5 | 2 |
| Samples 4-6 | 2 | (2,1) | (2,2) | (2,3) |
| Means | | 1.5 | 2 | 2.5 |
| Samples 7-9 | 3 | (3,1) | (3,2) | (3,3) |
| Means | | 2 | 2.5 | 3 |
It is important to keep in mind that in this case n (the sample size) equals 2, not 3. Let us plot the distribution of these sampled means and ascertain its properties. The distribution of the sampled means is symmetric and unimodal, approximating the binomial distribution, and this fact is quite remarkable if one considers that the distribution of the sampled population [1 2 3] is definitely rectangular. This observation also lends credence to the often-repeated statement that the sampling distribution of means is robust with respect to the assumption of normality.
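The enumeration is small enough to reproduce directly. The following sketch, with variable names chosen for illustration, lists all nine ordered samples of size 2 drawn with replacement from [1 2 3], tabulates the frequency of each sample mean, and compares the variance of the means with the population variance.

```python
from collections import Counter
from itertools import product
from statistics import mean, pvariance

population = [1, 2, 3]

# All ordered samples of size n = 2, drawn with replacement (3 * 3 = 9).
samples = list(product(population, repeat=2))
sample_means = [sum(s) / len(s) for s in samples]

# Frequency of each distinct sample mean.
distribution = Counter(sample_means)
for value in sorted(distribution):
    print(value, "*" * distribution[value])

print("population mean and variance:", mean(population), pvariance(population))
print("mean and variance of sample means:", mean(sample_means), pvariance(sample_means))
```

The means 1, 1.5, 2, 2.5, and 3 occur 1, 2, 3, 2, and 1 times, a peaked and symmetric shape even though the population itself is flat. The mean of the sample means equals the population mean of 2, and their variance is the population variance divided by n = 2.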