Theoretically, there is no upper limit on the size of the matrix that can be analyzed by principal components analysis. The smallest matrix that can be analyzed is a matrix that contains a single coefficient of correlation. Analyzing a single coefficient of correlation can provide new insights into its properties. Consider that, unlike most statistical indices, which range from 0 to 1, the coefficient of correlation ranges from -1 to +1, suggesting that this index may encompass some more primitive relationships.
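Consider the correlation matrix R

$$R = \begin{bmatrix} 1 & r \\ r & 1 \end{bmatrix}$$

This matrix has two eigenvalues, the values of $\lambda$ for which the determinant in the characteristic equation vanishes:

$$|R - \lambda I| = 0$$

This determinant can be written as

$$\begin{vmatrix} 1 - \lambda & r \\ r & 1 - \lambda \end{vmatrix} = 0$$

and

$$(1 - \lambda)^2 - r^2 = 0$$

Expanding the binomial and rearranging,

$$\lambda^2 - 2\lambda + (1 - r^2) = 0$$

The above equation can be solved as

$$\lambda = \frac{2 \pm \sqrt{4 - 4(1 - r^2)}}{2} = 1 \pm r$$

where

$$\lambda_1 = 1 + r$$

and

$$\lambda_2 = 1 - r$$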
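The derivation above can be checked numerically. The following is a minimal sketch, assuming the characteristic equation of the 2 x 2 correlation matrix takes the form lambda^2 - 2*lambda + (1 - r^2) = 0; the function names are illustrative and are not part of the original text.

```python
import math


def eigenvalues_2x2_correlation(r):
    """Roots of lambda^2 - 2*lambda + (1 - r^2) = 0, larger root first."""
    a, b, c = 1.0, -2.0, 1.0 - r * r
    disc = math.sqrt(b * b - 4.0 * a * c)  # simplifies to 2 * |r|
    return (-b + disc) / (2.0 * a), (-b - disc) / (2.0 * a)


def char_det(r, lam):
    """Determinant of R - lam * I for R = [[1, r], [r, 1]]."""
    return (1.0 - lam) ** 2 - r * r


# For non-negative r the roots should equal 1 + r and 1 - r,
# and the determinant should vanish at both roots.
for r in (0.0, 0.25, 0.5, 0.9):
    lam1, lam2 = eigenvalues_2x2_correlation(r)
    print(r, lam1, lam2, char_det(r, lam1), char_det(r, lam2))
```

For a negative r the larger root is 1 - r, so the ordering of the two eigenvalues swaps while their values keep the same form.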
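For example, the variables X1 = [1 2 3] and X2 = [2 4 3] correlate .50, so that

$$\lambda_1 = 1 + .50 = 1.50, \qquad \lambda_2 = 1 - .50 = .50$$

and

$$\lambda_1 / 2 = .75, \qquad \lambda_2 / 2 = .25$$

where $\lambda_i / 2$ signifies the standardized variance contribution of each eigenvalue, that is, its share of the total variance of 2.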
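The example values can be reproduced from the raw data. This is a small sketch, assuming the eigenvalues of the 2 x 2 correlation matrix are 1 + r and 1 - r and that each variance contribution is the eigenvalue divided by the number of variables; `pearson_r` is a helper written for this illustration.

```python
import math

x1 = [1, 2, 3]
x2 = [2, 4, 3]


def pearson_r(a, b):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    s_ab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    s_aa = sum((u - mean_a) ** 2 for u in a)
    s_bb = sum((v - mean_b) ** 2 for v in b)
    return s_ab / math.sqrt(s_aa * s_bb)


r = pearson_r(x1, x2)

# Eigenvalues of [[1, r], [r, 1]] (assumed form: 1 + r and 1 - r).
lam1, lam2 = 1 + r, 1 - r

# Variance contribution of each eigenvalue: eigenvalue / number of variables.
k = 2
share1, share2 = lam1 / k, lam2 / k

print(r, lam1, lam2, share1, share2)
```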
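In standard scores, the variance of the sum of two variables equals two times the first eigenvalue, and the variance of their difference equals two times the second eigenvalue. For the example, with $z_1$ and $z_2$ denoting the standard scores of X1 and X2,

$$\mathrm{Var}(z_1 + z_2) = 2 + 2(.50) = 3, \qquad \mathrm{Var}(z_1 - z_2) = 2 - 2(.50) = 1$$

The first eigenvalue therefore equals 3/2, that is 1.50, and the second eigenvalue equals 1/2, that is .50. The variance contribution of the first eigenvalue equals 1.5/2, that is .75, and that of the second eigenvalue equals .5/2, that is .25. In formal notation, where k is the number of variables, equal to 2, the above observations can be expressed as

$$\lambda_1 = \frac{\mathrm{Var}(z_1 + z_2)}{k}$$

and

$$\lambda_2 = \frac{\mathrm{Var}(z_1 - z_2)}{k}$$

For the variance components,

$$\frac{\lambda_1}{k} = \frac{\mathrm{Var}(z_1 + z_2)}{k^2}$$

and

$$\frac{\lambda_2}{k} = \frac{\mathrm{Var}(z_1 - z_2)}{k^2}$$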
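The link between eigenvalues and the variances of sums and differences can be verified directly on the example data. The sketch below assumes standard scores computed with the sample standard deviation (n - 1 in the denominator), so that each standardized variable has variance 1; `standard_scores` is an illustrative helper.

```python
import statistics


def standard_scores(x):
    """Convert raw scores to z-scores using the sample standard deviation."""
    m = statistics.mean(x)
    s = statistics.stdev(x)  # n - 1 denominator
    return [(v - m) / s for v in x]


z1 = standard_scores([1, 2, 3])
z2 = standard_scores([2, 4, 3])

# Variances of the sum and of the difference of the standardized variables.
var_sum = statistics.variance([a + b for a, b in zip(z1, z2)])
var_diff = statistics.variance([a - b for a, b in zip(z1, z2)])

k = 2  # number of variables
lam1 = var_sum / k   # first eigenvalue
lam2 = var_diff / k  # second eigenvalue

print(var_sum, var_diff, lam1, lam2, lam1 / k, lam2 / k)
```

The same n - 1 convention must be used for both standardizing and computing the variances; mixing conventions would rescale the results.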
The coefficient of correlation can be defined in terms of the variance of its principal components as

$$r = \frac{\lambda_1}{k} - \frac{\lambda_2}{k}$$

For the example, the correlation between variables X1 and X2 equals .50. The first variance component equals .75 and the second variance component equals .25. The variance components sum to 1.00 and their difference equals .50, the coefficient of correlation.
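The step from the variance components to the correlation can be written out explicitly. The following short derivation assumes the eigenvalues 1 + r and 1 - r of the 2 x 2 correlation matrix and k = 2:

```latex
\begin{align*}
\frac{\lambda_1}{k} - \frac{\lambda_2}{k}
  &= \frac{1 + r}{2} - \frac{1 - r}{2} = \frac{2r}{2} = r, \\
\frac{\lambda_1}{k} + \frac{\lambda_2}{k}
  &= \frac{1 + r}{2} + \frac{1 - r}{2} = 1 .
\end{align*}
```

With r = .50 this gives .75 - .25 = .50 for the difference and .75 + .25 = 1.00 for the sum.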
In 1907, during the formative years of statistics, two seminal manuscripts reached the offices of the British Journal of Psychology. Written from different perspectives, both manuscripts pertained to the same topic and reached the same conclusion. Since Spearman's manuscript arrived in the morning mail and the manuscript written by Brown was delivered in the afternoon, the statistical index they described was named not the Brown-Spearman but the Spearman-Brown coefficient of reliability. The Spearman-Brown coefficient of reliability was defined as

$$r_{SB} = \frac{2r}{1 + r}$$

which could also have been written as

$$r_{SB} = 1 - \frac{1 - r}{1 + r}$$

In terms of our preceding observations, we can write the S-B reliability as

$$r_{SB} = 1 - \frac{\lambda_2}{\lambda_1}$$

which captures its real meaning.
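The equivalence of these forms is easy to confirm numerically. The sketch below assumes the two-part Spearman-Brown formula 2r / (1 + r) and the eigenvalue form 1 - lambda2 / lambda1, with lambda1 = 1 + r and lambda2 = 1 - r; both function names are illustrative.

```python
def spearman_brown(r):
    """Two-part Spearman-Brown reliability from the correlation r."""
    return 2 * r / (1 + r)


def spearman_brown_from_eigenvalues(lam1, lam2):
    """Same quantity expressed through the two eigenvalues (assumed form)."""
    return 1 - lam2 / lam1


r = 0.5
sb_from_r = spearman_brown(r)
sb_from_eigen = spearman_brown_from_eigenvalues(1 + r, 1 - r)

# Both routes should agree up to floating-point rounding.
print(sb_from_r, sb_from_eigen)
```

For the example with r = .50, both expressions evaluate to 2/3, the ratio by which the second eigenvalue falls short of the first.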
Considering the above relationships within the simplest possible context of eigenanalysis facilitates conceptual understanding of this key method of data analysis. Eigenvalues are related to the variances of sums and differences. Principal components analysis describes how events are similar and how they differ and, on the basis of these observations, extracts the principal components of the data. This approach to data analysis often uncovers the latent meaning of surface events.