Single Classification Analysis of Variance

Partitioning variance into its components is a central concept of statistical data analysis. The variance components can be computed for the obtained scores, the deviation scores, or the standard scores. The variance of the obtained scores can take on any non-negative value, and it is identical to the variance of the deviation scores; the variance of the standard scores is always one. Under certain conditions the variance components are additive, and when they are expressed as standard variance components they can be interpreted directly as proportions of variance accounted for. Aside from the obtained-score, deviation-score, and standard-score frameworks, there are two additional frameworks in which data can be partitioned: into extended ('sum of squares') components and into expanded components. Whichever framework we use to describe the partitioning of variance into its additive (orthogonal) components, we can always standardize the obtained components by dividing them by the total variance component. In this chapter, we describe the partitioning of variance into the extended variance components within the framework of the single classification (one-way) analysis of variance.

Idealized Experimental Design

A prototype of a scientific experiment involves two groups of subjects, divided randomly into a control and an experimental group. Subjects are assumed to have no relationship to each other and different subjects are used for the different conditions of the experiment. An idealized outline of the arrangement of subjects prior to the onset of an experiment is presented in the following table.

 

 

 

            Y0     Y1     Y
Allen       1             1
Becky       2             2
Cathy       3             3
Debra              1      1
Edgar              2      2
Francis            3      3
M           2      2      2
σ²          .67    .67    .67

Initially, we have no reason to assume that the means and variances of the two groups will differ. There is also no reason to assume that the means and variances of the combined groups will differ from those of the groups considered separately. The scores in the above table are not actual scores; they are hypothetical assumptions about the scores that could be expected in the absence of the experimental treatment. Changes in these idealized scores following the introduction of an experimental treatment are presented in the following table.

 

 

            Y0     Y1           Y
Allen       1                   1
Becky       2                   2
Cathy       3                   3
Debra              1+2=3        3
Edgar              2+0=2        2
Francis            3+1=4        4
M           2      3            2.5
σ²          .67    .67          .92

       Since the variances of the control and experimental groups did not change because of the experiment, the increase of variance for the total group should be due to the variance between the changed means.
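The means and variances in the table above can be checked with a few lines of Python. This is a minimal sketch, assuming the population form of the variance (division by n), which is the form that reproduces the tabled values; the variable names are illustrative only.

```python
from statistics import mean, pvariance

# Scores after the treatment, taken from the table above.
control = [1, 2, 3]          # Y0: Allen, Becky, Cathy
experimental = [3, 2, 4]     # Y1: Debra, Edgar, Francis
combined = control + experimental  # Y: all six subjects

for label, scores in [("Y0", control), ("Y1", experimental), ("Y", combined)]:
    # pvariance divides by n (population variance), matching the table.
    print(label, round(mean(scores), 2), round(pvariance(scores), 2))
```

The two group variances stay at .67 while the variance of the combined scores rises to .92, which is the increase the text attributes to the changed means.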

Variance Due to Changed Means

The following table illustrates this conjecture. It contains the means of the control and experimental groups (M), their deviations from the grand mean (m), and the squared deviations (m²), together with the mean and variance of the group means.

 

 

            M      m      m²
Y0          2      -.5    .25
Y1          3      .5     .25
M           2.5    0      .25
σ²                        .25

The remaining variance should be the variance among subjects. Thus, we can summarize the results of this experiment as in the following table.

 

Source of Variance    Variance Components    Standard Variance Components
Between Means         .25                    .27
Within Subjects       .67                    .73
Total                 .92                    1.00

The standard variance components were obtained by dividing all variance components by their total sum, for the example equal to .92.
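The partition in the table above can be reproduced as follows. This is a sketch under two assumptions: population variances (division by n) are used throughout, and the groups are of equal size, so the between-means component is simply the variance of the two group means and the within component is the average of the two group variances.

```python
from statistics import mean, pvariance

groups = {"Y0": [1, 2, 3], "Y1": [3, 2, 4]}
all_scores = [y for scores in groups.values() for y in scores]

# Between-means component: population variance of the group means.
# (Equal group sizes are assumed, so each mean carries the same weight.)
between = pvariance([mean(s) for s in groups.values()])

# Within component: average of the within-group population variances.
within = mean(pvariance(s) for s in groups.values())

total = pvariance(all_scores)

print(round(between, 2), round(within, 2), round(total, 2))
# Standard variance components: each component divided by the total.
print(round(between / total, 2), round(within / total, 2))
```

With unequal group sizes the means and variances would have to be weighted by group size for the two components to add up to the total.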

Regression Analysis of Idealized Experimental Designs

These variance components could also have been obtained by regression analysis, summarized for the example as

 

 

 

The experimental treatment thus accounted for 27 percent of the variance. The remaining 73 percent of the variance is due to variability between subjects that stems from other factors. In formal notation, the partitioning of variance table can be written as

 

Source of Variance    Variance Components    Standard Variance Components
Information
Residual
Total

Since

 

and

 

the above table can be also written as

 

Source of Variance    Standard Variance Components    Variance Components
Information
Residual
Total

These equivalencies are important for understanding the principles of analysis of variance and the mutual relationships between the principal models used for this task.
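The regression view of the example can be sketched numerically. The code below assumes a dummy-coded predictor X (0 for the control group, 1 for the experimental group) and assumes that the 'Information' and 'Residual' components correspond to r² and 1 - r² in standard form and to those proportions multiplied by the total variance in raw form. These are illustrative assumptions rather than the chapter's own notation.

```python
from statistics import mean, pvariance

# Dummy-coded treatment (0 = control, 1 = experimental) and the outcome.
x = [0, 0, 0, 1, 1, 1]
y = [1, 2, 3, 3, 2, 4]

mx, my = mean(x), mean(y)
# Population covariance and variances (division by N throughout).
cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(y)
r_squared = cov_xy ** 2 / (pvariance(x) * pvariance(y))

var_total = pvariance(y)
var_information = r_squared * var_total        # variance accounted for by X
var_residual = (1 - r_squared) * var_total     # variance not accounted for

print(round(r_squared, 2), round(1 - r_squared, 2))
print(round(var_information, 2), round(var_residual, 2), round(var_total, 2))
```

Under these assumptions the regression components agree with the between-means and within-subjects components obtained earlier (.25, .67, and .92).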

Algebraic Substratum of Extended Component of Variance

The key to understanding the traditional approach to the analysis of variance, extracting the extended components of variance (also called the sums of squares), is to realize that this solution is based on the variance formula

 

 

 

algebraically manipulated as

 

 

and

 

 

The expression on the left of the above equation represents the 'sums of squares.' The sum of squares is simply the sum of squared deviation scores. The first term on the right side is the sum of squared obtained scores and the last term on the right hand side of the above equation is the correction term. Since we are using the obtained scores to compute the sum of squared deviation scores, we must subtract the correction term from our calculations.
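One way to write out the manipulation described above, assuming the population form of the variance with N scores, obtained scores Y, and deviation scores y = Y - M (these symbols are chosen for illustration and may differ from the original notation):

```latex
\sigma^2 = \frac{\sum Y^2}{N} - \left(\frac{\sum Y}{N}\right)^2
\quad\Longrightarrow\quad
N\sigma^2 = \sum Y^2 - \frac{\left(\sum Y\right)^2}{N}
\quad\Longrightarrow\quad
\sum y^2 = \sum Y^2 - \frac{\left(\sum Y\right)^2}{N}
```

For the six scores used in this chapter, the sum of squared obtained scores is 43 and the correction term is 15² / 6 = 37.5, so the sum of squared deviation scores is 43 - 37.5 = 5.5.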

The basic problem of the analysis of variance is to compute variance in a form that is comparable across sources. Variances may not be directly comparable when the variance coefficients are based on different numbers of data elements; the 'sum of squares' method circumvents this difficulty.

The Microsoft Excel Framework

The above example is a prototype of a scientific experiment, involving two groups of subjects, divided randomly into a control and an experimental group. Different subjects are used for the different conditions of the experiment.

Suppose our example concerns the effect of the consumption of alcohol on reaction time. Using a placebo and an alcoholic beverage, the control group is given placebo while the experimental group is given the alcoholic beverage. Reaction time measurements, taken one hour after the placebo and alcohol were consumed, are presented in the following table.

 

 

 

To compute the analysis of variance by using a spreadsheet, enter data into the data area and compute its column sums and the grand sum, as

 

 

               Data      Sums    Squares    Corrections
Data           1   3
               2   2
               3   4
Sums           6   9     15
Squares
Corrections

Next, square the sums, as shown in the following table.

 

 

 

               Data      Sums    Squares    Corrections
Data           1   3
               2   2
               3   4
Sums           6   9     15
Squares        36  81            225
Corrections

Then divide the squared sums by their corresponding ns. For the example, (36/3), (81/3), and (225/6), as

 

 

               Data      Sums    Squares    Corrections
Data           1   3
               2   2
               3   4
Sums           6   9     15
Squares        36  81            225
Corrections    12  27                       37.5

The value in the lower right corner of the spreadsheet is called the correction term. To obtain the extended variance components, this correction term must be subtracted from the obtained intermediate values, computed as follows.

Column Component of Variance

For the column variance component, add the corrected values for the data columns, for the example (12 + 27)

 

 

               Data      Sums    Squares    Corrections
Data           1   3
               2   2
               3   4
Sums           6   9     15
Squares        36  81            225
Corrections    12  27    39                 37.5

Then subtract the correction term from this sum: 39 - 37.5 equals 1.5. Enter this value into the above table, between the values 39 and 37.5, and into the variance table, as

 

Source of Variance    Variance Components    Standard Variance Components    Extended Variance Components
Columns               .25                    .27                             1.5
Residual              .67                    .73
Total                 .92                    1.00
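The column computation can be written out step by step. The sketch below mirrors the spreadsheet cells; the variable names are illustrative and not part of the spreadsheet itself.

```python
# Data columns from the spreadsheet: control and experimental groups.
columns = [[1, 2, 3], [3, 2, 4]]

# Squared column sum divided by the column's n (12 and 27 in the example).
column_terms = [sum(col) ** 2 / len(col) for col in columns]

# Correction term: squared grand sum divided by the total number of scores.
n_total = sum(len(col) for col in columns)
grand_sum = sum(sum(col) for col in columns)
correction = grand_sum ** 2 / n_total

# Extended (sum of squares) component for columns.
ss_columns = sum(column_terms) - correction
print(column_terms, correction, ss_columns)  # [12.0, 27.0] 37.5 1.5
```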

 

Total Component of Variance

To obtain the total variance component, square and sum all elements of the data (1² + 2² + 3² + 3² + 2² + 4²) and enter this sum (43) into the spreadsheet, as

 

 

               Data      Sums    Squares    Corrections
Data           1   3
               2   2
               3   4
Sums           6   9     15      43
Squares        36  81            225
Corrections    12  27    39      1.5        37.5

Subtract the correction term from this value (43 - 37.5 = 5.5) and enter the result into the appropriate cell of the spreadsheet,

 

 

               Data      Sums    Squares    Corrections
Data           1   3
               2   2
               3   4
Sums           6   9     15      43
Squares        36  81    5.5     225
Corrections    12  27    39      1.5        37.5

as well as into the appropriate cell of the summary variance table

 

Source of Variance    Variance Components    Standard Variance Components    Extended Variance Components
Columns               .25                    .27                             1.5
Residual              .67                    .73
Total                 .92                    1.00                            5.5

Residual Component of Variance

The residual term for the extended variance components can be obtained by subtracting the column component from the total component (5.5 - 1.5 = 4.0). Enter the result into the variance table, as

 

Source of Variance    Variance Components    Standard Variance Components    Extended Variance Components
Columns               .25                    .27                             1.5
Residual              .67                    .73                             4.0
Total                 .92                    1.00                            5.5

The extended variance components can be standardized by dividing them by the extended total variance component. For the example, (1.5 / 5.5), (4.0 / 5.5), and (5.5 / 5.5) equal .27, .73, and 1.00. The experimental treatment thus accounted for 27 percent of the total component of variance.
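The whole spreadsheet procedure can be collected into one small function. This is a sketch, not a general-purpose routine: the function name `extended_components` is hypothetical, and the input is assumed to be a list of data columns, one per group.

```python
def extended_components(columns):
    """Return (columns, residual, total) sums of squares for a one-way layout."""
    scores = [y for col in columns for y in col]
    # Correction term: squared grand sum divided by the number of scores.
    correction = sum(scores) ** 2 / len(scores)
    ss_total = sum(y ** 2 for y in scores) - correction
    ss_columns = sum(sum(col) ** 2 / len(col) for col in columns) - correction
    # The residual is what remains of the total after removing the columns.
    return ss_columns, ss_total - ss_columns, ss_total


ss_col, ss_res, ss_tot = extended_components([[1, 2, 3], [3, 2, 4]])
# Standardize by dividing each component by the total component.
standard = [round(ss / ss_tot, 2) for ss in (ss_col, ss_res, ss_tot)]
print(ss_col, ss_res, ss_tot, standard)  # 1.5 4.0 5.5 [0.27, 0.73, 1.0]
```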

Summary Tables

Results of the analyses of variance are traditionally reported in a tabular form, such as outlined in the table below.

 

Source of Variance    Degrees of Freedom    Sums of Squares    Mean Square    F              Probability
Columns               k-1                   SS ?               SS / df        MS / MSRES     p ?
Residual              k(n-1)                SS ?               SS / df
Total                 nk-1                  SS ?

The degrees of freedom for the column source of variance are the number of columns of the data matrix minus one (k - 1), and for the total source of variance they are the total number of elements in the data matrix minus one (nk - 1). The degrees of freedom for the residual term can be obtained either directly as k(n - 1) or by subtracting the degrees of freedom for columns from the total degrees of freedom. For the example, the traditional summary table of the analysis of variance is

Source of Variance    Degrees of Freedom    Sums of Squares    Mean Square    F      Probability
Columns               1                     1.5                1.5            1.5    .1440
Residual              4                     4.0                1.0
Total                 5                     5.5

Using the standard components of variance, the above table can be recast into a form that is more informative and easier to interpret:

Source of Variance    df    Standard Variance Components    Unbiased Variance Components    F      Probability
Columns               1     .27                             .27                             1.5    .1440
Residual              4     .73                             .18
Total                 5     1.00
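The entries of both summary tables, apart from the probability, can be derived from the data in a few steps. The sketch below assumes equal group sizes and uses illustrative variable names; it also shows that the F ratio is the same whether it is formed from the mean squares or from the unbiased (standard component divided by df) variance components.

```python
columns = [[1, 2, 3], [3, 2, 4]]
k = len(columns)
n_total = sum(len(col) for col in columns)
scores = [y for col in columns for y in col]

# Sums of squares via the correction-term method.
correction = sum(scores) ** 2 / n_total
ss_total = sum(y ** 2 for y in scores) - correction
ss_columns = sum(sum(col) ** 2 / len(col) for col in columns) - correction
ss_residual = ss_total - ss_columns

# Degrees of freedom: k - 1, nk - 1, and their difference.
df_columns = k - 1
df_total = n_total - 1
df_residual = df_total - df_columns

# Traditional table: mean squares and their ratio.
ms_columns = ss_columns / df_columns
ms_residual = ss_residual / df_residual
f_ratio = ms_columns / ms_residual

# Standardized table: standard components divided by their df.
unbiased_columns = (ss_columns / ss_total) / df_columns
unbiased_residual = (ss_residual / ss_total) / df_residual
f_standard = unbiased_columns / unbiased_residual

print(df_columns, df_residual, df_total, ms_columns, ms_residual, f_ratio)
print(round(unbiased_columns, 2), round(unbiased_residual, 2), round(f_standard, 2))
```

Dividing every sum of squares by the same total leaves the ratio of the two mean squares unchanged, which is why both tables report F = 1.5.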

Independent Measures t-Test as a Special Case of One-Way Analysis of Variance

Let us reconsider the current example of the effect of the consumption of alcohol on reaction time within the framework of the independent measures t-test. Reaction times of the group of subjects that received the placebo (X0) and the group that received alcohol (X1) are recorded as the dependent variable Y in the following table.

 

 

 

The coefficient of determination for variables X and Y is .27 and the coefficient of alienation is .73. The t-square

 

for 4 degrees of freedom (N - 2) is computed as (.27 / .73) × 4, which equals 1.50 when the unrounded proportions (1.5 / 5.5 and 4.0 / 5.5) are used. Note that for the special case of two groups

 

 

The t equals 1.22 and the probability associated with the t-ratio is .2880.
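The relationship between the t-ratio and the one-way analysis of variance can be checked numerically. The sketch below assumes 4 degrees of freedom (N - 2 for six subjects in two groups) and uses the closed-form cumulative distribution of Student's t that holds for exactly 4 degrees of freedom, so that no statistical library is needed. The helper name `t_cdf_4df` is hypothetical.

```python
import math

r_squared = 1.5 / 5.5          # coefficient of determination (.27)
df = 6 - 2                     # N - 2 for two independent groups

# t-square from the proportions of variance; equals the F ratio (1.5).
t_squared = r_squared / (1 - r_squared) * df
t = math.sqrt(t_squared)


def t_cdf_4df(t):
    """Cumulative distribution of Student's t for exactly 4 degrees of freedom."""
    u = t * t / (1 + t * t / 4)
    return 0.5 + 0.375 * (t / math.sqrt(1 + t * t / 4)) * (1 - u / 12)


p_two_tailed = 2 * (1 - t_cdf_4df(t))
print(round(t_squared, 2), round(t, 2), round(p_two_tailed, 3))
```

The two-tailed probability comes out at about .288, in line with the value reported for the t-ratio. Half of it, about .144, is the one-tailed probability, which corresponds to the probability listed in the summary tables.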