Double Classification Analysis of Variance
In the previous chapter, we discussed partitioning the total variance
into the column component of variance and the residual component. In this
chapter, we add another variance component, capturing variance due to the row
marginal referents of a data matrix. In social science research, the row
component of variance typically captures variance among subjects. Using the
same data sets, you may observe that while the total and column variance
components remain the same, the variance captured by the row component
reduces the residual component of variance.
Another prototypical scientific experiment involves subjecting a
group of subjects to two types of experimental treatment. All subjects are
subjected to all conditions of the experiment. An idealized arrangement of
subjects prior to the onset of an experiment is outlined as
| | Y0 | Y1 | Y |
| Allen | 1 | 1 | 1 |
| Becky | 2 | 2 | 2 |
| Cathy | 3 | 3 | 3 |
| Allen | | | 1 |
| Becky | | | 2 |
| Cathy | | | 3 |
| M | 2 | 2 | 2 |
| σ² | .67 | .67 | .67 |
As in the independent groups design, we initially have no reason to
assume that the means and variances of the two groups differ. Following some
type of intervention induced by the experiment,
| | Y0 | Y1 | Y |
| Allen | 1 | 1+2=3 | 1 |
| Becky | 2 | 2+0=2 | 2 |
| Cathy | 3 | 3+1=4 | 3 |
| Allen | | | 3 |
| Becky | | | 2 |
| Cathy | | | 4 |
| M | 2 | 3 | 2.5 |
| σ² | .67 | .67 | .92 |
ideally, the variances of the control and experimental groups should
not change (assumption of homoscedasticity), and the change in the total group's
variance should be due to the variance between the group means, one of which
changed from 2 to 3.
Let us illustrate this conjecture with the following table, containing the
means of the two groups
| | M |
| Y0 | 2 |
| Y1 | 3 |
| M | 2.5 |
| σ² | .25 |
Note that the variance of
any three consecutive integers equals .67 and the variance of any two
consecutive integers equals .25. Thus, the variance of the means for our
example (2 and 3) equals .25, and we can partition the variance of the total variable
as .92 = .67 + .25.
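The partition above can be checked numerically. The following is a minimal sketch, assuming population variances (dividing by N, as the tables do) and equal group sizes; the variable names are illustrative only.

```python
from statistics import mean, pvariance

# Scores after the intervention, taken from the table above.
y0 = [1, 2, 3]        # control condition
y1 = [3, 2, 4]        # experimental condition
total = y0 + y1       # the stacked variable Y

# Population variance (divide by N), which is what the tables report.
# Averaging the two group variances is valid here because the groups are equal in size.
within = mean([pvariance(y0), pvariance(y1)])
between = pvariance([mean(y0), mean(y1)])   # variance of the two group means

print(round(within, 2), round(between, 2), round(pvariance(total), 2))
```

The within-group and between-group parts add up to the variance of the stacked variable.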
Let us extend this conjecture to the rows, as shown in the following table,
containing the sums and means of the rows
| | Y0 | Y1 | Y0 + Y1 | M |
| Allen | 1 | 3 | 4 | 2 |
| Becky | 2 | 2 | 4 | 2 |
| Cathy | 3 | 4 | 7 | 3.5 |
| M | 2 | 3 | 5 | 2.5 |
| σ² | .67 | .67 | 2 | .50 |
By finding the variance of the row means, we can partition the total
variance of the data into its column, row, and remaining components as
.92 = .25 + .50 + .17. Let us see whether the two-way analysis of variance
will support this conjecture.
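Before turning to the spreadsheet, the conjectured partition can be sketched directly from the row and column means. This assumes a balanced data matrix (one score per subject per condition) and population variances; the residual is obtained by subtraction.

```python
from statistics import mean, pvariance

# Rows are subjects (Allen, Becky, Cathy); columns are conditions (Y0, Y1).
data = [[1, 3],
        [2, 2],
        [3, 4]]

scores = [x for row in data for x in row]
row_means = [mean(row) for row in data]
col_means = [mean(col) for col in zip(*data)]

total = pvariance(scores)         # variance of all six scores
columns = pvariance(col_means)    # variance of the condition means
rows = pvariance(row_means)       # variance of the subject means
residual = total - columns - rows

for name, value in [("columns", columns), ("rows", rows),
                    ("residual", residual), ("total", total)]:
    print(f"{name:9s}{value:.2f}")
```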
For the example, the spreadsheet for the one-way analysis of
variance, described in the previous chapter, is
| | Data | Sums | Squares | Corrections |
| Data | 1 3 2 2 3 4 | | | |
| Sums | 6 9 | 15 | 43 | |
| Squares | 36 81 | 5.5 | 225 | |
| Corrections | 12 27 | 39 | 1.5 | 37.5 |
Adding sums for the rows of the data matrix can further partition
the variance of the data
| | Data | Sums | Squares | Corrections |
| Data | 1 3 2 2 3 4 | 4 4 7 | | |
| Sums | 6 9 | 15 | 43 | |
| Squares | 36 81 | 5.5 | 225 | |
| Corrections | 12 27 | 39 | 1.5 | 37.5 |
and their squares
| | Data | Sums | Squares | Corrections |
| Data | 1 3 2 2 3 4 | 4 4 7 | 16 16 49 | |
| Sums | 6 9 | 15 | 43 | |
| Squares | 36 81 | 5.5 | 225 | |
| Corrections | 12 27 | 39 | 1.5 | 37.5 |
Dividing the squared row sums by their respective ns, for the
example (16 / 2), (16 / 2), and (49 / 2), results in
| | Data | Sums | Squares | Corrections |
| Data | 1 3 2 2 3 4 | 4 4 7 | 16 16 49 | 8 8 24.5 |
| Sums | 6 9 | 15 | 43 | |
| Squares | 36 81 | 5.5 | 225 | |
| Corrections | 12 27 | 39 | 1.5 | 37.5 |
Summing the corrected values of the row sums, for the example 8 +
8 + 24.5, and entering this sum into the spreadsheet, fills the last but one
empty field of the spreadsheet, as
| | Data | Sums | Squares | Corrections |
| Data | 1 3 2 2 3 4 | 4 4 7 | 16 16 49 | 8 8 24.5 |
| Sums | 6 9 | 15 | 43 | 40.5 |
| Squares | 36 81 | 5.5 | 225 | |
| Corrections | 12 27 | 39 | 1.5 | 37.5 |
To obtain the row variance
component, subtract the correction term from this sum (40.5 - 37.5), as
| | Data | Sums | Squares | Corrections |
| Data | 1 3 2 2 3 4 | 4 4 7 | 16 16 49 | 8 8 24.5 |
| Sums | 6 9 | 15 | 43 | 40.5 |
| Squares | 36 81 | 5.5 | 225 | 3 |
| Corrections | 12 27 | 39 | 1.5 | 37.5 |
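The spreadsheet steps can be retraced in a few lines of code. This is a sketch under the assumption that every subject contributes one score per condition; the variable names are chosen here for illustration and do not come from the spreadsheet.

```python
# Rows are subjects, columns are conditions (same data as the spreadsheet).
data = [[1, 3],
        [2, 2],
        [3, 4]]
n_rows, n_cols = len(data), len(data[0])
n_total = n_rows * n_cols

col_sums = [sum(col) for col in zip(*data)]             # 6, 9
row_sums = [sum(row) for row in data]                   # 4, 4, 7
grand_sum = sum(row_sums)                               # 15
raw_squares = sum(x * x for row in data for x in row)   # 43

# Correction term: squared grand sum divided by the number of scores.
correction = grand_sum ** 2 / n_total                   # 225 / 6

ss_total = raw_squares - correction
ss_cols = sum(s * s / n_rows for s in col_sums) - correction
ss_rows = sum(s * s / n_cols for s in row_sums) - correction
ss_resid = ss_total - ss_cols - ss_rows

print(ss_cols, ss_rows, ss_resid, ss_total)
```

Each squared column sum is divided by the number of rows it spans, and each squared row sum by the number of columns, which mirrors the "Corrections" entries.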
In tabular representation,
| Source of Variance | Extended Variance Components | Standard Variance Components | Variance Components |
| Columns | 1.5 | .27 | .25 |
| Rows | 3.0 | .55 | .50 |
| Residual | 1.0 | .18 | .17 |
| Total | 5.5 | 1.00 | .92 |
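The three columns of this table are rescalings of one another. As a sketch, assuming the extended components are the sums of squares found in the spreadsheet, the standard components divide by the total sum of squares and the variance components divide by the number of scores:

```python
# Sums of squares ("extended" variance components) from the table above.
extended = {"Columns": 1.5, "Rows": 3.0, "Residual": 1.0}
n_total = 6                          # number of scores in the data matrix
ss_total = sum(extended.values())    # 5.5

components = {}
for source, ss in extended.items():
    standard = ss / ss_total         # proportion of the total sum of squares
    variance = ss / n_total          # population variance component
    components[source] = (standard, variance)
    print(f"{source:9s}{ss:5.1f}{standard:6.2f}{variance:6.2f}")
```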
Note that the amount of information accounted for increased. The
column variance component, due to the experimental treatment, remained the
same. The residual component decreased by the row component of variance, due to
variability among subjects.
In the course of the previous analysis, we were searching for the
sums of squares to be entered into a table such as the one outlined below.
| Source of Variance | Degrees of Freedom | Sums of Squares | Mean Square | F | Probability |
| Columns | k-1 | SS ? | SS / df | MS / MS_RES | p ? |
| Rows | n-1 | SS ? | SS / df | MS / MS_RES | p ? |
| Residual | (k-1)(n-1) | SS ? | SS / df | | |
| Total | nk-1 | SS ? | | | |
For the example, the traditional summary table of the analysis of
variance is
| Source of Variance | Degrees of Freedom | Sums of Squares | Mean Square | F | Probability |
| Columns | 1 | 1.5 | 1.5 | 3.0 | .2254 |
| Rows | 2 | 3.0 | 1.5 | 3.0 | |
| Residual | 2 | 1.0 | .5 | | |
| Total | 5 | 5.5 | | | |
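The entries of the summary table follow mechanically from the sums of squares. The sketch below assumes k = 2 conditions and n = 3 subjects. The helper `p_value_df1_is_1` is a hypothetical name introduced here; it relies on the fact that an F ratio with one numerator degree of freedom is the square of a t statistic, and integrates the Student t density numerically, so it applies to the columns effect only.

```python
import math

ss = {"Columns": 1.5, "Rows": 3.0, "Residual": 1.0}   # sums of squares from above
k, n = 2, 3                                            # conditions, subjects
df = {"Columns": k - 1, "Rows": n - 1, "Residual": (k - 1) * (n - 1)}

ms = {source: ss[source] / df[source] for source in ss}
f_cols = ms["Columns"] / ms["Residual"]
f_rows = ms["Rows"] / ms["Residual"]


def p_value_df1_is_1(f, df2, steps=2000):
    """Upper-tail probability of F(1, df2), using F = t**2 with df2 degrees of freedom.

    Integrates the Student t density from 0 to sqrt(f) with Simpson's rule
    (steps must be even). Valid only for one numerator degree of freedom.
    """
    t = math.sqrt(f)
    const = math.gamma((df2 + 1) / 2) / (math.sqrt(df2 * math.pi) * math.gamma(df2 / 2))

    def density(x):
        return const * (1 + x * x / df2) ** (-(df2 + 1) / 2)

    h = t / steps
    area = density(0) + density(t)
    area += sum((4 if i % 2 else 2) * density(i * h) for i in range(1, steps))
    return 1 - 2 * area * h / 3


p_cols = p_value_df1_is_1(f_cols, df["Residual"])
print(df, ms, f_cols, f_rows, round(p_cols, 4))
```

A statistics library would normally supply the F distribution directly; the numerical integration is used here only to keep the sketch free of dependencies.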
Using the standard components of variance, the above table can be
conceptualized
| Source of Variance | df | Standard Variance Components | Unbiased Variance Components | F | Probability |
| Columns | 1 | .27 | .27 | 3.0 | .2254 |
| Rows | 2 | .55 | .27 | 3.0 | |
| Residual | 2 | .18 | .09 | | |
| Total | 5 | 1.00 | | | |
into a form that, as in the case of the one-way analysis of
variance, is more informative and easier to interpret. In the summary table for
the single-classification analysis of variance, the residual source of variance
consists of the residual plus row sources of variance. Because the
double-classification analysis of variance removes the row source from the
residual, it is more likely to find the relationships to be statistically
significant.
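The effect of moving the row source out of the residual can be sketched by computing the column F ratio both ways. This assumes the same sums of squares as above and the usual degrees of freedom (nk - k for the one-way residual); note that the two-way residual also has fewer degrees of freedom, so a larger F does not by itself guarantee a smaller probability.

```python
ss_cols, ss_rows, ss_resid = 1.5, 3.0, 1.0
k, n = 2, 3                         # conditions, subjects

ms_cols = ss_cols / (k - 1)

# Two-way: subject (row) variability is removed from the error term.
ms_resid_two_way = ss_resid / ((k - 1) * (n - 1))
f_two_way = ms_cols / ms_resid_two_way

# One-way: row variability stays inside the error term.
ms_resid_one_way = (ss_resid + ss_rows) / (n * k - k)
f_one_way = ms_cols / ms_resid_one_way

print(f_one_way, f_two_way)
```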
Just as the independent-measures t-test is a special case of the one-way
analysis of variance, the t-test for the repeated-measures design is a special
case of the two-way analysis of variance. For our example, the conceptual
framework of the correlated t-test is shown in the following table
| | Y1 | Y2 | Y1 + Y2 |
| S1 | 1 | 3 | 4 |
| S2 | 2 | 2 | 4 |
| S3 | 3 | 4 | 7 |
| M | 2 | 3 | 5 |
| σ² | .67 | .67 | 2 |
| σ² / k²σy² | | | .55 |
This ratio can also be conceptualized within the correlational framework.
In the above table, k is a constant equal to the number of groups. For the
example, rxy equals .50 and the k²σy² expression equals 2²(.92).
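As a sketch of the correlational reading, assuming only the standard identity for the variance of a sum of two variables, the last row of the table can be expanded as follows. The subscripted symbols $\sigma_1$, $\sigma_2$, and $r_{12}$ (the standard deviations of the two columns and their correlation) are notation introduced here for illustration.

```latex
\sigma^2_{(Y_1+Y_2)} = \sigma^2_1 + \sigma^2_2 + 2\,r_{12}\,\sigma_1\sigma_2
\qquad\Longrightarrow\qquad
\frac{\sigma^2_{(Y_1+Y_2)}}{k^2\sigma^2_y}
  = \frac{\sigma^2_1 + \sigma^2_2 + 2\,r_{12}\,\sigma_1\sigma_2}{k^2\sigma^2_y}
```

The stronger the correlation between the two columns, the larger the variance of their sum, and therefore the larger the share of variance attributed to subjects.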
Let us recall the
idealized form of a repeated measures design
| | Y0 | Y1 | Y |
| Allen | 1 | 1+2=3 | 1 |
| Becky | 2 | 2+0=2 | 2 |
| Cathy | 3 | 3+1=4 | 3 |
| Allen | | | 3 |
| Becky | | | 2 |
| Cathy | | | 4 |
| M | 2 | 3 | 2.5 |
| σ² | .67 | .67 | .92 |
with variances
due to changed column means computed as
| | M |
| Y0 | 2 |
| Y1 | 3 |
| M | 2.5 |
| σ² | .25 |
and the
variances due to changed row means computed as
| | Y0 | Y1 | Y0 + Y1 | M |
| Allen | 1 | 3 | 4 | 2 |
| Becky | 2 | 2 | 4 | 2 |
| Cathy | 3 | 4 | 7 | 3.5 |
| M | 2 | 3 | 5 | 2.5 |
| σ² | .67 | .67 | 2 | .50 |
We can also
directly find the residual component of variance by subtracting the Y0
and Y1 variables
| | Y0 | Y1 | Y0 - Y1 | M |
| S1 | 1 | 3 | -2 | -1 |
| S2 | 2 | 2 | 0 | 0 |
| S3 | 3 | 4 | -1 | -.5 |
| M | 2 | 3 | -1 | -.5 |
| σ² | .67 | .67 | .67 | .17 |
In the above tables, the variances of the means of sums and the means of
differences could have been computed directly from the variances of the sums
and differences. To do that, recall that the variance of a variable divided by
a constant equals the variance of that variable divided by the square of that
constant. Thus, the variance components of two variables can be found as
| Source of Variance | Variance Components |
| Columns | (M0 - M1)² / k² |
| Rows | σ²(0+1) / k² |
| Residual | σ²(0-1) / k² |
| Total | σ²(0..1) |
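These formulas can be tried out on the example data. The following is a minimal sketch, assuming two conditions (k = 2) and population variances; it first checks the scaling rule for a variable divided by a constant and then computes each component.

```python
from statistics import mean, pvariance

y0 = [1, 2, 3]
y1 = [3, 2, 4]
k = 2                                       # number of conditions

sums = [a + b for a, b in zip(y0, y1)]      # 4, 4, 7
diffs = [a - b for a, b in zip(y0, y1)]     # -2, 0, -1

# Dividing a variable by k divides its variance by k squared.
scaled = pvariance([s / k for s in sums])
assert abs(scaled - pvariance(sums) / k**2) < 1e-12

columns = (mean(y0) - mean(y1)) ** 2 / k**2
rows = pvariance(sums) / k**2
residual = pvariance(diffs) / k**2
total = pvariance(y0 + y1)                  # variance of all six scores

print(round(columns, 2), round(rows, 2), round(residual, 2), round(total, 2))
```

The three components computed this way add up to the total variance of the stacked scores.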
For the example
| | Y0 | Y1 | Y0+Y1 | Y0-Y1 |
| Allen | 1 | 3 | 4 | -2 |
| Becky | 2 | 2 | 4 | 0 |
| Cathy | 3 | 4 | 7 | -1 |
| M | 2 | 3 | 5 | -1 |
| σ² | .67 | .67 | 2 | .67 |
| σ² / k² | | | .50 | .17 |
summarized as
| Source of Variance | Variance Components |
| Columns | .25 |
| Rows | .50 |
| Residual | .17 |
| Total | .92 |
The standard variance components of two variables can be computed as
| Source of Variance | ν | Standard Variance Components | Correlational Framework |
| Experimental Effect | k-1 | | |
| Subjects | n-1 | | |
| Residual | (n-1)(k-1) | | |
| Total | nk-1 | | 1.00 |
For the example
| | Y0 | Y1 | Y0+Y1 | Y0-Y1 |
| Allen | 1 | 3 | 4 | -2 |
| Becky | 2 | 2 | 4 | 0 |
| Cathy | 3 | 4 | 7 | -1 |
| M | 2 | 3 | 5 | -1 |
| σ² | .67 | .67 | 2 | .67 |
| σ² / k²σ²(0..1) | | | .55 | .18 |
the standard
variance components can be summarized as
| Source of Variance | Standard Variance Components |
| Columns | .27 |
| Rows | .55 |
| Residual | .18 |
| Total | 1.00 |
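The standardized values can be reproduced by dividing each quantity by k² times the total variance, as the last row of the example table indicates. This is a sketch under the same assumptions as before (two conditions, population variances); the dictionary keys simply mirror the row labels of the summary.

```python
from statistics import mean, pvariance

y0, y1, k = [1, 2, 3], [3, 2, 4], 2

# Common divisor: k squared times the variance of all six scores.
scale = k**2 * pvariance(y0 + y1)

standard = {
    "Columns": (mean(y0) - mean(y1)) ** 2 / scale,
    "Rows": pvariance([a + b for a, b in zip(y0, y1)]) / scale,
    "Residual": pvariance([a - b for a, b in zip(y0, y1)]) / scale,
}

for source, value in standard.items():
    print(f"{source:9s}{value:.2f}")
print(f"{'Total':9s}{sum(standard.values()):.2f}")
```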
The sequential presentation of the independent-measures and
repeated-measures t-tests within the one-way and two-way analysis of variance
methods leads directly to computation of both the strength and significance of
the observed relationships and provides for a gradual transition to the more
general analysis of variance methods.