|
Cruise Scientific Visual Statistics Studio Measurement and Scaling |
Using matrix algebra, variance can be measured by computing all possible differences between elements of a variable. A major difference of a variable X with its transpose results in a skew-symmetric matrix. Its elements describe all possible differences between values of the variable X.
Skew symmetric matrices are redundant, as the negative values can be guessed from the symmetric positive values. Removing this redundancy, a skew-asymmetric matrix can be defined as
Variance can be obtained from the sum of the squared elements of a skew-asymmetric matrix divided by the square of its order.
For the example (12 + 22 +12 + 32 + 22 + 12 ) / 42 = 1.25.
Before the computer era, computing a skew-asymmetric matrix of differences was an arduous task. To simplify it, let us consider what happens if instead of computing a major difference of a variable from its transpose we compute the major difference of a variable from its mean.
In the above matrix, the X - M difference is repeated n times in every row. This repetition is redundant and we can write
The result of the above operation is a matrix of deviation scores x, since
Thus
Since the sum of the deviation scores is always zero, a meaningful index can be obtained by the averaging the sum of its squares, for the example 5 / 4 which equals to 1.25, a value the same as obtained above. Squaring the scores is congruent with the criterion of least squares, and thus these two concepts form the basis of the general linear model of statistics.
Transformation of the variable X into its adjacent binary matrix, for our example of the variable X as
and subtraction of the transpose of this matrix from itself
results in the same skew symmetric matrix as that of the major difference of the transpose of variable X from itself. Triangulation of the above matrix from its skew symmetric to its skew asymmetric form,
provides information about the number of 0-1 changes (bits) contained within the data.
The concept of least squares was developed by Laplace (1749-1827) in his work explaining the differences in motion of Jupiter and Saturn. The concept of variance in terms of all possible differences between values of a variable was introduced by von Andrae (1872) and Helmert (1876) in a series of articles to Astronomische Nachtrichten. The convention to use the Greek lowercase character σ for the standard deviation was coined by Karl Pearson in a series of articles published in Philosophical Transactions and Biometrika between 1896 and 1906.
Using all possible differences between values of a
variable as a foundation of statistical theory was contemplated by Kendall
(1943, p. 47) who defined a coefficient, called here u,
as
![]()
For the discontinuous infinite case, the above equation can be written as
![]()
and for the finite case as

where the summed term in the above equation is a vector of all possible
differences between elements of variable x. Pointing out that the value of the u
coefficient is dependent on the spread of the variate-values among
themselves and not on the deviations from some central value, Kendall
(1943, p.47) shows that u
= 2σ
,
concludes that the initial defining formula is nothing but twice the
variance, and abandons the idea. One can only wonder which direction
statistics could have taken if
The unbiased variance can be expressed in the matrix algebra notation as
and in the algebraic notation as