Coded regression analysis partitions variance within various experimental designs, where coded predictor variables index the membership of subjects in the conditions of the experiment. Whenever possible, the conditions of the experiment should be coded with orthogonal predictor variables: the values of each coding vector should sum to zero, and so should the elementwise products of any two coding vectors. One algorithm for generating such orthogonal variables is Helmert's procedure.
To obtain k mutually orthogonal variables with Helmert's procedure, outline a matrix with k columns and k+1 rows and enter the elements column-wise. Place k as the first element of the first column and fill the rest of the column with -1s. Place 0 as the first element of the second column, k-1 as its second element, and fill the rest of the column with -1s. Create the third column by entering two 0s, then k-2, and filling the remaining elements with -1s. Continue in this fashion, entering 0s, decrementing k, and entering -1s until all columns are filled. The last column will consist of 0s except for its last two elements, 1 and -1. For example, when k = 4, Helmert's coefficients can be constructed as
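\[
\begin{bmatrix}
 4 &  0 &  0 &  0 \\
-1 &  3 &  0 &  0 \\
-1 & -1 &  2 &  0 \\
-1 & -1 & -1 &  1 \\
-1 & -1 & -1 & -1
\end{bmatrix}
\]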
Notice that every column of the above matrix sums to zero and that the sum of the products of any column with any other column also equals zero. Consequently, the correlation matrix associated with Helmert's coefficients is an identity matrix.
\[
\begin{bmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\]
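The column-wise procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `helmert_codes` and the helpers `column` and `dot` are hypothetical names introduced here, not part of any library.

```python
def helmert_codes(k):
    """Return Helmert coefficients as a list of k+1 rows with k entries each."""
    rows = k + 1
    matrix = [[0] * k for _ in range(rows)]
    for col in range(k):
        # Zeros above the diagonal are already in place.
        matrix[col][col] = k - col            # k, k-1, ..., 1 down the diagonal
        for row in range(col + 1, rows):
            matrix[row][col] = -1             # -1s fill the rest of the column
    return matrix


def column(matrix, j):
    """Extract column j of a row-major matrix."""
    return [row[j] for row in matrix]


def dot(u, v):
    """Sum of elementwise products of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


H = helmert_codes(4)
for row in H:
    print(row)

# Check the two orthogonality conditions stated in the text.
column_sums = [sum(column(H, j)) for j in range(4)]
cross_products = [dot(column(H, i), column(H, j))
                  for i in range(4) for j in range(i + 1, 4)]
print(column_sums)      # all zeros
print(cross_products)   # all zeros
```

Because every column sums to zero and every pair of columns has a zero cross-product, the correlation between any two distinct columns is zero, which is why the correlation matrix reduces to the identity.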
For three variables, Helmert's contrasts are
\[
\begin{bmatrix}
 3 &  0 &  0 \\
-1 &  2 &  0 \\
-1 & -1 &  1 \\
-1 & -1 & -1
\end{bmatrix}
\]
and for two variables
\[
\begin{bmatrix}
 2 &  0 \\
-1 &  1 \\
-1 & -1
\end{bmatrix}
\]
Let us consider a simple experiment. One of the symptoms of Korsakoff psychosis, a disease of chronic alcoholics, is a marked loss of short-term memory. This experiment was designed to determine whether a new experimental drug, physostigmine, improves the impaired memory of subjects suffering from Korsakoff psychosis.
In the course of this experiment, the control group was given a placebo and the first experimental group was given dexfermetrazine, a stimulant. The second experimental group was given physostigmine, the drug we expected to improve the memory of patients suffering from Korsakoff psychosis. One group of patients was given dexfermetrazine in order to differentiate the effect of physostigmine from a likely memory improvement due to general arousal of the subjects.
The criterion variable, often called the dependent variable in the context of experiments, was the number of nonsense syllables subjects remembered. The dependent variable, measured for the control group C and both experimental groups, designated by a subscripted letter E, was recorded for nine subjects, randomly assigned to each condition of the experiment, as
[Data table: number of nonsense syllables recalled by each subject in groups C, E1, and E2.]
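To code this experiment, we need two orthogonal vectors, one fewer than the number of groups coded. Using Helmert's contrasts for two variables, the groups are coded as

\[
\begin{array}{c|rr}
    & H_1 & H_2 \\ \hline
C   &  2  &  0  \\
E_1 & -1  &  1  \\
E_2 & -1  & -1
\end{array}
\]

Each subject receives the pair of codes belonging to the group to which that subject was assigned.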
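The coding step can be sketched as follows: each subject's group label is replaced by that group's pair of Helmert codes. The group sizes below (three subjects per group) are an assumption made only for illustration, and the names `CODES` and `code_subjects` are hypothetical.

```python
# Helmert codes for two variables, one (H1, H2) pair per group.
CODES = {"C": (2, 0), "E1": (-1, 1), "E2": (-1, -1)}


def code_subjects(groups):
    """Map each subject's group label to its (H1, H2) pair."""
    return [CODES[g] for g in groups]


# Assumed layout: three subjects per group (not taken from the source).
groups = ["C"] * 3 + ["E1"] * 3 + ["E2"] * 3
design = code_subjects(groups)

h1 = [pair[0] for pair in design]
h2 = [pair[1] for pair in design]

# With equal group sizes both vectors sum to zero and so does their product.
print(sum(h1), sum(h2), sum(a * b for a, b in zip(h1, h2)))
```

The zero sums hold for any balanced design; with unequal group sizes the coded vectors would no longer be exactly orthogonal across subjects.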
The analysis of variance summary table for this example is
[Analysis of variance summary table.]
The combined experimental conditions accounted for about 68% of the variability in the dependent variable. The administration of physostigmine accounted for about 23% of the total variance in the dependent variable. The differences between the compared means were in the positive direction, leading to the conclusion that physostigmine is a promising drug for the treatment of memory loss.
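The additivity of variance shares under orthogonal coding can be checked numerically. The sketch below uses hypothetical recall scores, chosen only so that the group means equal 2, 5, and 8; they are not the study's data. The helper `squared_correlation` is a name introduced here for illustration.

```python
def squared_correlation(x, y):
    """Squared Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical scores (NOT the study's data); group means are 2, 5, and 8.
scores = {"C": [1, 2, 3], "E1": [4, 5, 6], "E2": [7, 8, 9]}
codes = {"C": (2, 0), "E1": (-1, 1), "E2": (-1, -1)}

y, h1, h2 = [], [], []
for group, values in scores.items():
    for value in values:
        y.append(value)
        h1.append(codes[group][0])
        h2.append(codes[group][1])

# Two codes plus an intercept reproduce the three group means exactly,
# so R^2 of the full model equals between-group SS over total SS.
grand_mean = sum(y) / len(y)
ss_total = sum((v - grand_mean) ** 2 for v in y)
ss_between = sum(len(vals) * (sum(vals) / len(vals) - grand_mean) ** 2
                 for vals in scores.values())
r_squared = ss_between / ss_total

share_h1 = squared_correlation(h1, y)
share_h2 = squared_correlation(h2, y)
print(round(share_h1, 3), round(share_h2, 3), round(r_squared, 3))
```

With these made-up scores the two shares are 0.675 and 0.225 and they add up to the model's R² of 0.9. The shares are close to the percentages quoted above, but that should not be read as a reconstruction of the actual data.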
In the case of orthogonal predictors, the B weights reflect the differences between the compared means. Helmert codes contrast the group means as
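\[
H_1:\ \bar{Y}_{C} - \tfrac{1}{2}\left(\bar{Y}_{E_1} + \bar{Y}_{E_2}\right),
\qquad
H_2:\ \bar{Y}_{E_1} - \bar{Y}_{E_2}
\]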
For the H1 code the relevant B weights are [-3 1.5]. For the H2 code the relevant B weights are [-1.5 1.5]. The differences between the compared means are, for the first contrast
\[
-3 - 1.5 = -4.5
\]
For the second contrast
\[
-1.5 - 1.5 = -3
\]
Thus, the first orthogonal component, coded as 2 -1 -1, describes a contrast between the average performance of the patients given placebo (2) and the average performance of the patients given dexfermetrazine or physostigmine (6.5). The difference between these means is -4.5.
The second orthogonal component, coded as 0 1 -1, describes a contrast between the average performance of patients given dexfermetrazine (5) and patients given physostigmine (8). On average, patients receiving dexfermetrazine recalled 5 nonsense syllables and patients receiving physostigmine recalled 8. The difference between these means is -3.
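The link between the B weights and the mean differences can be retraced from the three group means alone. The sketch below assumes a balanced design (equal group sizes), in which the slope for an orthogonal code equals the code-weighted sum of group means divided by the sum of squared codes. It also assumes one reading of the bracketed values quoted earlier, namely that they are the B weight multiplied by the positive and the negative code value. The names `b_weight` and `weighted_codes` are hypothetical.

```python
# Group means as quoted in the text.
means = {"C": 2.0, "E1": 5.0, "E2": 8.0}
codes = {
    "H1": {"C": 2, "E1": -1, "E2": -1},
    "H2": {"C": 0, "E1": 1, "E2": -1},
}


def b_weight(code):
    """Slope of one orthogonal code, assuming equal group sizes."""
    numerator = sum(code[g] * means[g] for g in means)
    denominator = sum(c ** 2 for c in code.values())
    return numerator / denominator


def weighted_codes(code):
    """B weight times the positive and the negative code value."""
    b = b_weight(code)
    return [b * max(code.values()), b * min(code.values())]


for name, code in codes.items():
    high, low = weighted_codes(code)
    print(name, b_weight(code), [high, low], high - low)
```

Under these assumptions both slopes equal -1.5, the weighted values are [-3, 1.5] for H1 and [-1.5, 1.5] for H2, and their differences, -4.5 and -3, match the differences between the compared means.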
Sometimes researchers code the group membership by using arbitrary codes and compensate for the non-orthogonality of the coding vectors afterwards. Our example can be coded by arbitrary coding vectors as
[Table: the example coded with arbitrary, non-orthogonal coding vectors.]
Since the predictor variables are correlated, the variance contribution of each coding vector also contains a covariance term. The variance contributions [.30 1.20] of the coding vectors are therefore not additive, i.e., they do not sum to the standard variance of the criterion variable. In these cases one may compensate for the non-orthogonality of the coding vectors after the regression analysis by various correction formulae, such as those proposed by Scheffé or Tukey, or the Student-Newman-Keuls procedure, discussed elsewhere.
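The loss of additivity is easy to see with a small numerical sketch. The example below uses 0/1 dummy codes as one possible arbitrary coding; these are not the coding vectors used above, and the scores are hypothetical rather than the study's data. The helper names are introduced here for illustration only.

```python
def centered_cross_product(x, y):
    """Sum of products of deviations from the means."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y))


def correlation(x, y):
    """Pearson correlation between two equal-length lists."""
    return centered_cross_product(x, y) / (
        centered_cross_product(x, x) * centered_cross_product(y, y)) ** 0.5


# Hypothetical scores (NOT the study's data), three subjects per group.
y = [1, 2, 3, 4, 5, 6, 7, 8, 9]       # order: C, C, C, E1, E1, E1, E2, E2, E2
d1 = [0, 0, 0, 1, 1, 1, 0, 0, 0]      # 1 marks membership in E1
d2 = [0, 0, 0, 0, 0, 0, 1, 1, 1]      # 1 marks membership in E2

r_between_codes = correlation(d1, d2)
separate_shares = [correlation(d1, y) ** 2, correlation(d2, y) ** 2]

# R^2 of the full model: between-group SS over total SS.
groups = [y[0:3], y[3:6], y[6:9]]
grand_mean = sum(y) / len(y)
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
r_squared = ss_between / centered_cross_product(y, y)

print(round(r_between_codes, 3))
print([round(s, 3) for s in separate_shares], round(r_squared, 3))
```

Here the two dummy codes correlate at about -0.5, and their separate squared correlations with the criterion sum to about 0.675 rather than to the model's R² of 0.9, so the individual contributions cannot simply be added.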