Matrix Algebra

A matrix is an array of values; for example, a table of data.

Matrix X

	Quartz	Feldspar	Rock Fragments	Matrix
1	50.0	10.0	30.0	10.0
2	40.0	20.0	10.0	10.0
3	50.0	20.0	10.0	20.0
4	60.0	10.0	30.0	0.0

The columns of the table contain the values of the variables and the rows contain the samples studied. The matrix is labeled X and is described by giving its dimensions: the number of rows and the number of columns. X is a 4 by 4 matrix. The individual entries in the matrix are its elements. A particular element is specified by giving its coordinates (row, column). X(2,4) is 10.0 -- the amount of matrix (the 4th column) in the second sample (the 2nd row).
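As a rough illustration of dimensions and (row, column) addressing, the table can be stored as a list of rows in Python. The helper names dimensions and element are hypothetical, chosen only for this sketch. Python counts from 0, so the helper subtracts 1 to match the 1-based X(2,4) notation used in the text.

```python
# Matrix X from the table: rows are samples, columns are variables.
variables = ["Quartz", "Feldspar", "Rock Fragments", "Matrix"]
X = [
    [50.0, 10.0, 30.0, 10.0],  # sample 1
    [40.0, 20.0, 10.0, 10.0],  # sample 2
    [50.0, 20.0, 10.0, 20.0],  # sample 3
    [60.0, 10.0, 30.0, 0.0],   # sample 4
]


def dimensions(matrix):
    """Return (number of rows, number of columns)."""
    return len(matrix), len(matrix[0])


def element(matrix, row, column):
    """Return the element at 1-based (row, column), as in X(2,4)."""
    return matrix[row - 1][column - 1]


print(dimensions(X))     # (4, 4)
print(element(X, 2, 4))  # 10.0
```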

The data table can be thought of as N row vectors (the samples or objects) and M column vectors (the variables). These vectors can be interpreted algebraically or geometrically.

The simple summary statistics already introduced (mean, standard deviation, etc.) can be thought of as algebraic descriptors of the properties of a column vector. Covariances and correlation coefficients are algebraic measures that describe the pairwise behavior of column vectors.
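The sketch below computes these descriptors for two column vectors of X, Quartz and Feldspar. It assumes the sample (N - 1) form of the variance and covariance; the text does not say which form is intended, and the function names are hypothetical.

```python
import math

X = [
    [50.0, 10.0, 30.0, 10.0],
    [40.0, 20.0, 10.0, 10.0],
    [50.0, 20.0, 10.0, 20.0],
    [60.0, 10.0, 30.0, 0.0],
]


def column(matrix, j):
    """Return column j (0-based) of the matrix as a list."""
    return [row[j] for row in matrix]


def mean(v):
    return sum(v) / len(v)


def covariance(u, v):
    """Sample covariance of two equal-length vectors (N - 1 divisor, an assumption)."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def std_dev(v):
    """Sample standard deviation: square root of the covariance of v with itself."""
    return math.sqrt(covariance(v, v))


def correlation(u, v):
    """Correlation coefficient: covariance scaled by both standard deviations."""
    return covariance(u, v) / (std_dev(u) * std_dev(v))


quartz = column(X, 0)
feldspar = column(X, 1)

print(mean(quartz), mean(feldspar))               # 50.0 15.0
print(round(std_dev(quartz), 3))                  # 8.165
print(round(covariance(quartz, feldspar), 3))     # -33.333
print(round(correlation(quartz, feldspar), 4))    # -0.7071
```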

One of the problems with using a computer application is that the machine is capable of doing something that does not make sense. Suppose, for example, that you had a matrix consisting of the long axes of 100 pebbles (measured in centimeters) and their weights (measured in grams). For the variables you could compute the summary statistics introduced previously. However, although you could compute the mean of a row vector by adding up the values in each row and dividing by 2, would this make sense given that the units of measurement are different -- 20 cm + 98 grams = ?? You must always consider the units of measurement before subjecting data to a transformation or computation which could be inappropriate.

Variable Space

A vector is a directed line segment. If the columns (variables) are selected as the axes of reference, the objects can be located with respect to their coordinates in 4-dimensional space. Clearly, this calls for dealing with abstract space and, as long as the user does not insist on a picture of this space in 2-d or 3-d, the geometrical and algebraic concepts that follow hold.

Comparing vectors requires deciding on the basis for comparison. For example, we could elect to compare the vectors in the following diagram on the basis of the distance between their end points. The distance between vectors 2 and 3 is shorter than the distance between vectors 3 and 4. Therefore, vectors 2 and 3 are more similar to each other than vectors 3 and 4 are. We could also compare vectors on the basis of the angle (Theta) between them. A very short vector might point in nearly the same direction as a very long vector. The two vectors would be very similar on the basis of the angle between them but very different on the basis of the distance between their end points. In general, the investigator must decide which is the appropriate measure of similarity. As noted previously, however, some transformations are inappropriate for data measured in different units. These considerations will be taken up later on in multivariate applications.
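The contrast between the two measures of similarity can be sketched with a short and a long vector that point in the same direction. The vectors (1,1) and (10,10) are made up for this illustration and do not come from the diagram; the function names are likewise hypothetical.

```python
import math


def length(v):
    """Square root of the sum of the squared elements."""
    return math.sqrt(sum(a * a for a in v))


def distance(u, v):
    """Distance between the end points of u and v."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def angle_degrees(u, v):
    """Angle (Theta) between u and v, in degrees."""
    cosine = sum(a * b for a, b in zip(u, v)) / (length(u) * length(v))
    # Guard against rounding that pushes the cosine just outside [-1, 1].
    cosine = max(-1.0, min(1.0, cosine))
    return math.degrees(math.acos(cosine))


short_vector = (1.0, 1.0)    # hypothetical short vector
long_vector = (10.0, 10.0)   # hypothetical long vector, same direction

# Far apart by end-point distance (about 12.73) ...
print(distance(short_vector, long_vector))
# ... but essentially identical by angle (about 0 degrees).
print(angle_degrees(short_vector, long_vector))
```

By distance the two vectors look very different; by angle they look the same, which is why the choice of measure has to be made deliberately.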

Object Space

If the rows (samples or objects) are selected as the axes of reference, the variables can be located with respect to their coordinates in 4-dimensional space, with one axis for each of the 4 samples.

Thus, a data matrix can be viewed as either:

a set of M mutually perpendicular axes of reference which locate objects or
a set of N mutually perpendicular axes of reference which locate variables

In multivariate statistics we will have occasions in which we want to work in either or both spaces.
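One way to picture the two spaces, as a sketch only: the rows of X give each object's coordinates on the M variable axes, and the rows of the transposed matrix give each variable's coordinates on the N object axes. The transpose helper below is a hypothetical name used for this illustration.

```python
X = [
    [50.0, 10.0, 30.0, 10.0],
    [40.0, 20.0, 10.0, 10.0],
    [50.0, 20.0, 10.0, 20.0],
    [60.0, 10.0, 30.0, 0.0],
]


def transpose(matrix):
    """Swap rows and columns: element (i, j) moves to (j, i)."""
    return [list(col) for col in zip(*matrix)]


X_t = transpose(X)

# Variable space: sample 2 located by its M = 4 variable coordinates.
print(X[1])    # [40.0, 20.0, 10.0, 10.0]

# Object space: the variable "Matrix" located by its N = 4 sample coordinates.
print(X_t[3])  # [10.0, 10.0, 20.0, 0.0]
```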

Geometrical properties of vectors are useful in working with multivariate statistical applications.

The resultant vector is given by adding the two vectors together. Vectors can be added if they have the same number of elements: (2,1) + (1,3) = (3,4). The difference vector is given by subtracting one vector from another: (1,3) - (2,1) = (-1,2). These vectors are plotted in the figure given above.
The length of a vector is the square root of the sum of the squares of its elements. The length of (2,1) is the square root of (4 + 1), or 2.236, and the length of (1,3) is the square root of 10, or 3.162.
The Euclidean distance between two points is the square root of the sum of the squares of the elements of the difference vector. For the difference vector (-1,2), the distance between the two vectors is the square root of 5, or 2.236.
The cosine of the angle between two vectors is the sum of the products of the corresponding elements of the two vectors, divided by the product of the lengths of the two vectors. For the vectors (2,1) and (1,3) the numerator is (2)(1) + (1)(3) = 5 and the denominator is 2.236 x 3.162 = 7.071, so the cosine is 0.7071 and the angle is 45 degrees.
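The arithmetic for the vectors (2,1) and (1,3) can be checked with a few lines of Python. This is a minimal sketch using only the standard math module; the function names are hypothetical and chosen for this example.

```python
import math


def add(u, v):
    """Resultant vector: element-by-element sum."""
    return tuple(a + b for a, b in zip(u, v))


def subtract(u, v):
    """Difference vector: element-by-element u minus v."""
    return tuple(a - b for a, b in zip(u, v))


def length(v):
    """Square root of the sum of the squared elements."""
    return math.sqrt(sum(a * a for a in v))


def distance(u, v):
    """Euclidean distance: length of the difference vector."""
    return length(subtract(u, v))


def cosine(u, v):
    """Sum of element products divided by the product of the lengths."""
    return sum(a * b for a, b in zip(u, v)) / (length(u) * length(v))


a = (2, 1)
b = (1, 3)

print(add(a, b))                  # (3, 4)
print(subtract(b, a))             # (-1, 2)
print(round(length(a), 3))        # 2.236
print(round(length(b), 3))        # 3.162
print(round(distance(a, b), 3))   # 2.236
print(round(cosine(a, b), 4))     # 0.7071
print(round(math.degrees(math.acos(cosine(a, b))), 1))  # 45.0
```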