Elements of Statistics in Matrix Algebra Notation

The central statistics of the general linear model are the algebraic mean and the variance, followed by the coefficients of covariance and correlation. The concepts of correlational analysis can be extended to include the statistics of significance testing, such as t and F, obtained in the course of computing the t-test and the analysis of variance. These univariate and bivariate statistical methods are usually expressed in summation notation, but they can be expressed equally well in matrix algebra notation.

Algebraic Mean

Using summation notation, the algebraic mean of a variable X can be written as

$$\bar{X} = \frac{\sum X}{n}$$

where n signifies the number of cases. In the notation of matrix algebra, the mean of the vector X can be written as

 

where 0 is a null vector and n is the length of both the X and 0 vectors.

Consider the vector X = [1 2 3 4 5]. Its length equals 5 and its arithmetic mean is computed as

 

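Alternatively,

$$\bar{X} = \frac{1'X}{1'1}$$

where 1 is a unit vector. Thus,

$$\bar{X} = \frac{[1\ 1\ 1\ 1\ 1]\,[1\ 2\ 3\ 4\ 5]'}{[1\ 1\ 1\ 1\ 1]\,[1\ 1\ 1\ 1\ 1]'} = \frac{15}{5} = 3$$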
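The unit-vector form of the mean can be checked with a few lines of Python. This is only a minimal sketch, assuming the mean is taken as the product 1'X divided by 1'1; the helper name `dot` and the variable `ones` are illustrative choices, not part of the original text.

```python
# Mean of X as the ratio 1'X / 1'1, using plain Python lists.

def dot(u, v):
    """Inner product u'v of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

X = [1, 2, 3, 4, 5]
ones = [1] * len(X)        # unit vector 1, same length as X

# Numerator 1'X sums the scores; denominator 1'1 counts them.
mean = dot(ones, X) / dot(ones, ones)
print(mean)                # 3.0
```

The denominator 1'1 simply reproduces n, so the expression agrees with the summation form.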

True Variance

Using the notation of obtained (raw) scores, the true variance of a variable X is

$$\sigma^2 = \frac{n\sum X^2 - \left(\sum X\right)^2}{n^2}$$

For the example variable X = [1 2 3 4 5], the variance can be computed using the obtained scores and their squares:

     X    X^2
     1      1
     2      4
     3      9
     4     16
     5     25
    --    ---
    15     55

With n = 5, the sum of scores equal to 15 and the sum of squared scores equal to 55, the variance of the variable X can be computed as (5(55) - (15)^2) / 25, which equals (275 - 225) / 25, which, in turn, equals 2.
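The raw-score computation can be reproduced in Python as a quick check. This is a sketch using only the standard library; the variable names are illustrative, and `statistics.pvariance` is included only as an independent cross-check of the divide-by-n variance.

```python
import statistics

X = [1, 2, 3, 4, 5]
n = len(X)

sum_x = sum(X)                     # sum of obtained scores: 15
sum_x2 = sum(x * x for x in X)     # sum of squared scores: 55

# True (population) variance from raw scores: (n*sum(X^2) - (sum X)^2) / n^2
variance = (n * sum_x2 - sum_x ** 2) / n ** 2
print(variance)                    # 2.0

# Cross-check against the standard library's population variance.
assert statistics.pvariance(X) == variance
```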

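In the notation of matrix algebra, the same expression can be written as

$$\sigma^2 = \frac{1'\left[\Delta\left(1X' - X1'\right)\right]^{(2)} 1}{\left(1'1\right)^2}$$

where 1 is a unit vector, the Greek letter delta signifies triangulation of the skew-symmetric matrix into a skew-positive matrix, and the superscript (2) signifies squaring of the individual matrix elements. For the vector X = [1 2 3 4 5], the above expression is written as

$$\sigma^2 = \frac{1'\left[\Delta\left(1\,[1\ 2\ 3\ 4\ 5] - [1\ 2\ 3\ 4\ 5]'\,1'\right)\right]^{(2)} 1}{5^2}$$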

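Subtracting the X' - X expression within the parentheses,

$$\begin{bmatrix} 0 & 1 & 2 & 3 & 4 \\ -1 & 0 & 1 & 2 & 3 \\ -2 & -1 & 0 & 1 & 2 \\ -3 & -2 & -1 & 0 & 1 \\ -4 & -3 & -2 & -1 & 0 \end{bmatrix}$$

and triangulating the skew-symmetric matrix,

$$\begin{bmatrix} 0 & 1 & 2 & 3 & 4 \\ 0 & 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

Squaring the matrix elements,

$$\begin{bmatrix} 0 & 1 & 4 & 9 & 16 \\ 0 & 0 & 1 & 4 & 9 \\ 0 & 0 & 0 & 1 & 4 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

and summing them gives 50. With the squared number of cases equal to 25, the true variance of the variable X can be computed as 50/25, that is, 2.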
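The pairwise-difference route to the variance can be traced step by step in Python. This is a sketch under one stated assumption: "triangulation" is read here as keeping the elements above the principal diagonal and setting all others to zero. The names `diff`, `upper` and `squared` are illustrative.

```python
X = [1, 2, 3, 4, 5]
n = len(X)

# Skew-symmetric matrix of pairwise differences: element (i, j) is X[j] - X[i].
diff = [[X[j] - X[i] for j in range(n)] for i in range(n)]

# Assumed meaning of triangulation: keep the elements above the diagonal.
upper = [[diff[i][j] if j > i else 0 for j in range(n)] for i in range(n)]

# Square element by element, then add up every element of the matrix
# (the effect of pre- and post-multiplying by the unit vector).
squared = [[v * v for v in row] for row in upper]
total = sum(sum(row) for row in squared)

variance = total / n ** 2          # 50 / 25
print(total, variance)             # 50 2.0
```

Each pair of cases contributes once, which is why the sum of squared differences equals n times the sum of squares minus the squared sum of scores (275 - 225 = 50 in this example).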

Covariance

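Using summation notation, the covariance of variables X and Y can be written as

$$\sigma_{XY} = \frac{\sum \left(X - \bar{X}\right)\left(Y - \bar{Y}\right)}{n} = \frac{\sum xy}{n}$$

where x and y are the deviation scores of X and Y, and n is the number of cases.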

Consider the following example

 

 

 

Using the deviation scores, covariance can be computed as 16/4, which equals 4.0. In the notation of matrix algebra, the covariance of the matrix X can be written as

$$C = \frac{D'D}{n}$$

where D signifies the matrix X, linearly transformed into deviation scores, and n is the number of rows in the matrix D. For the above example,

 

 

 

 

 

Consider another example, where the matrix X equals

 

 

 

and its corresponding matrix of deviation scores D is

 

 

The number of rows equals 5 and the covariance of the matrix X is computed as

 

 

The matrix C is also called the variance-covariance matrix, since the variance of each variable is in its principal diagonal and the covariance among its variables is in the off-diagonal elements.
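The product of the transposed deviation-score matrix with itself, divided by the number of rows, can be sketched in Python as follows. The data matrix below is hypothetical: it was made up for illustration so that the deviation cross-products sum to 16 over 4 cases, and it is not the example data of the original text. The helpers `transpose` and `matmul` are illustrative names.

```python
def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    """Return the matrix product AB for matrices stored as lists of rows."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

# Hypothetical data matrix: 4 cases (rows) by 2 variables (columns).
X = [[1, 1],
     [3, 5],
     [5, 3],
     [7, 7]]
n = len(X)

# Deviation scores D: subtract each column mean from its column.
means = [sum(col) / n for col in transpose(X)]
D = [[x - m for x, m in zip(row, means)] for row in X]

# Variance-covariance matrix C = D'D / n.
DtD = matmul(transpose(D), D)
C = [[v / n for v in row] for row in DtD]
print(C)   # [[5.0, 4.0], [4.0, 5.0]]
```

The variances (5.0) sit on the principal diagonal and the covariance (16/4 = 4.0) sits off the diagonal, as described above.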

Correlation

Using summation notation, the correlation of variables X and Y can be written as

$$r_{XY} = \frac{\sum z_X z_Y}{n}$$

where $z_X$ and $z_Y$ are standard scores, obtained from the deviation scores x and y corresponding to variables X and Y by the linear transformations $z_X = x / \sigma_X$ and $z_Y = y / \sigma_Y$. The n signifies the number of cases. In the notation of matrix algebra, the correlation matrix R corresponding to the data matrix X can be written as

$$R = \frac{Z'Z}{n}$$

where Z signifies the matrix X, linearly transformed into standard scores, and n is the number of rows in the matrix Z. Consider matrix X

 

 

and its corresponding matrix of standard scores Z

 

 

The number of rows of both matrices equals 5 and the correlation matrix R corresponding to the matrix X is computed as

 

Multiplying matrices in the numerator

 

and dividing by the scalar number in the denominator

 

 

It is important to realize that all the operations discussed, done with respect to the columns (attributes) of the data matrix, can also be done with respect to its rows (entities).
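The correlation matrix as the cross-product of standard scores divided by the number of rows can be sketched in Python. The data matrix is hypothetical and was made up for illustration; it is not the example of the original text. The function names are illustrative, and the standard deviation is assumed to be the divide-by-n form used for the true variance.

```python
import math

def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    """Return the matrix product AB for matrices stored as lists of rows."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def standard_scores(X):
    """Convert each column of X to standard scores (deviation / sigma)."""
    n = len(X)
    z_cols = []
    for col in transpose(X):
        mean = sum(col) / n
        dev = [v - mean for v in col]
        sd = math.sqrt(sum(d * d for d in dev) / n)   # divide-by-n sigma
        z_cols.append([d / sd for d in dev])
    return transpose(z_cols)

def correlation_matrix(X):
    """Return R = Z'Z / n for the columns of X."""
    n = len(X)
    Z = standard_scores(X)
    ZtZ = matmul(transpose(Z), Z)
    return [[v / n for v in row] for row in ZtZ]

# Hypothetical data matrix: 4 cases (rows) by 2 variables (columns).
X = [[1, 1],
     [3, 5],
     [5, 3],
     [7, 7]]

R = correlation_matrix(X)
print([[round(v, 3) for v in row] for row in R])   # [[1.0, 0.8], [0.8, 1.0]]
```

Calling `correlation_matrix(transpose(X))` would correlate the rows (entities) instead of the columns, provided every row has a nonzero variance; in this small made-up matrix the first and last rows are constant, so it would need different data.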

Summary

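The statistical formulae for the mean and the variance, in both the summation and the matrix notation, are summarized in the following table.

|          | Summation Notation | Matrix Algebra Notation |
|----------|--------------------|-------------------------|
| Mean     | $\bar{X} = \frac{\sum X}{n}$ | $\bar{X} = \frac{1'X}{1'1}$ |
| Variance | $\sigma^2 = \frac{n\sum X^2 - (\sum X)^2}{n^2}$ | $\sigma^2 = \frac{1'[\Delta(1X' - X1')]^{(2)} 1}{(1'1)^2}$ |

For covariance and correlation, the key formulae are summarized as

|             | Summation Notation | Matrix Algebra Notation |
|-------------|--------------------|-------------------------|
| Covariance  | $\sigma_{XY} = \frac{\sum xy}{n}$ | $C = \frac{D'D}{n}$ |
| Correlation | $r_{XY} = \frac{\sum z_X z_Y}{n}$ | $R = \frac{Z'Z}{n}$ |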