Least squares. The system AX = C (A is m by n, X is an n by 1 vector of variables, and C is an m by 1 vector of constants) has a solution (i.e., is consistent) if and only if C is in the image of A. If C is not in the image of A, but C' denotes its projection on the image of A (which is the closest vector to C that is in the image of A), then a vector X* satisfying AX* = C' is called a "least squares" solution of AX = C. Finding a least squares solution is equivalent to solving the normal equations (A^T A)X* = A^T C, and this is the recommended method, since it avoids the necessity of computing the projection C'.
Note that if the columns of A are independent, then A^T A is invertible and the least squares solution is unique.
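For concreteness, here is a small numerical sketch of solving the normal equations (Python with NumPy; the data, a least squares line through the made-up points (0,6), (1,0), (2,0), is only an illustration):

    # Least squares for an inconsistent 3 by 2 system via the normal equations.
    import numpy as np

    A = np.array([[1.0, 0.0],
                  [1.0, 1.0],
                  [1.0, 2.0]])          # independent columns, so A^T A is invertible
    C = np.array([6.0, 0.0, 0.0])

    X_star = np.linalg.solve(A.T @ A, A.T @ C)
    print(X_star)                       # [ 5., -3.], same as np.linalg.lstsq(A, C, rcond=None)[0]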
In the rest of this summary, all matrices are assumed square, unless otherwise specified.
Determinants. Know how to expand by minors with respect to any row or column. One should know:
(1) Switching two rows (or columns) reverses the sign.
(2) Multiplying a row (or column) by a scalar multiplies the determinant by the same scalar.
(3) If all columns but one are held fixed, the determinant is a linear function of the column that is allowed to vary. (The same holds for rows.)
(4) Adding or subtracting a multiple of one row to or from a different row does not change the determinant. (The same holds for columns.)
(5) det(A) = det(A^T).
(6) det(A) = 0 if and only if the rows are dependent if and only if the columns are dependent if and only if A is not invertible.
(7) In particular, if two rows of A are equal, det(A) = 0. The same holds if two columns of A are equal.
(8) det(AB) = det(A) det(B). If A has an inverse, det(A^{-1}) = 1/det(A).
(9) The determinant of an orthogonal matrix is either 1 or -1.
(10) The determinant of an upper or lower triangular matrix is the product of the diagonal entries.
Frequently, the best way to compute a determinant is to do row reduction (at least partially) to bring the matrix to triangular form. One must keep track of the effect of row switches on the sign, and of any multiplication of a row by a scalar.
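A sketch of this procedure (Python with NumPy; the 3 by 3 matrix is made up, and a library routine is used only as a check):

    import numpy as np

    def det_by_elimination(A):
        # Row-reduce to upper triangular form, tracking sign changes from row swaps
        # (property (1)); row replacements (property (4)) do not change the determinant.
        A = A.astype(float)
        n = len(A)
        sign = 1.0
        for j in range(n):
            p = j + np.argmax(np.abs(A[j:, j]))        # choose a pivot row
            if A[p, j] == 0.0:
                return 0.0                             # no pivot: the rows are dependent
            if p != j:
                A[[j, p]] = A[[p, j]]                  # swap rows: flip the sign
                sign = -sign
            A[j+1:] -= np.outer(A[j+1:, j] / A[j, j], A[j])
        return sign * np.prod(np.diag(A))              # property (10)

    A = np.array([[2, 1, 0], [1, 3, 1], [0, 1, 4]])
    print(det_by_elimination(A), np.linalg.det(A))     # both approximately 18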
Let T be a linear transformation from R^n to R^n with matrix A. Then |det(A)| is the volume of the parallelepiped determined by the columns of A, which is the image of the unit n-cube under T. If S is any subset of R^n having n-dimensional volume V, then T(S) has volume |det(A)|V. If M is an n by k matrix, the k-volume of the parallelepiped in R^n determined by the columns of M is the square root of the determinant of M^T M.
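For example (Python with NumPy; the two column vectors are made up), the area of a parallelogram in R^3:

    # 2-volume (area) of the parallelogram in R^3 spanned by the columns of M,
    # computed as sqrt(det(M^T M)).
    import numpy as np

    M = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]])
    print(np.sqrt(np.linalg.det(M.T @ M)))   # sqrt(3), the length of the cross product of the columns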
The inverse of A is 1/det(A) times the transpose of the matrix of cofactors, i.e., the matrix whose (i,j) entry is (-1)^{i+j} det(A_{ji}), where A_{ji} is the matrix obtained from A by deleting row j and column i. To solve a linear system of equations AX = C, where A is an n by n invertible matrix, one may use Cramer's rule: the value of x_j is det(B_j)/det(A), where B_j is the matrix obtained by replacing the j-th column of A by C.
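A direct transcription of Cramer's rule (Python with NumPy; the 3 by 3 system is made up, and a library solver is used only as a check):

    # Cramer's rule: x_j = det(B_j)/det(A), with column j of A replaced by C.
    import numpy as np

    A = np.array([[2.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 4.0]])
    C = np.array([1.0, 2.0, 3.0])

    detA = np.linalg.det(A)
    x = np.empty(3)
    for j in range(3):
        B = A.copy()
        B[:, j] = C                    # build B_j
        x[j] = np.linalg.det(B) / detA
    print(x)
    print(np.linalg.solve(A, C))       # the two answers agree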
Eigenvalues. We use x here instead of lambda for the variable in the characteristic polynomial, and we often use c to denote a single eigenvalue. The scalar c is an eigenvalue of the square matrix A if there is a vector v, not 0, such that Av = cv. We call the nonzero vector v an eigenvector of A for the eigenvalue c. We can discuss both real eigenvalues and complex eigenvalues. Note that c is an eigenvalue if and only if cI - A has a nonzero kernel, which is called the eigenspace for c.
The characteristic polynomial of a square matrix A is det(xI - A), where I is an identity matrix the same size as A. This polynomial has leading coefficient one. Its roots are the eigenvalues of A (the complex roots are eigenvalues in the sense that they will have complex eigenvectors.)
The algebraic multiplicity of an eigenvalue c is the biggest integer m such that (x-c)^m divides the characteristic polynomial. The geometric multiplicity of c is the dimension of the kernel of cI - A, i.e., the dimension of the subspace of eigenvectors for the eigenvalue c. The geometric multiplicity of an eigenvalue is always at least one and less than or equal to the algebraic multiplicity. The number of eigenvalues over the complex numbers is always equal to n, the size of the matrix, if one counts them with multiplicities. That is, the sum of the algebraic multiplicities of the eigenvalues, including all complex eigenvalues, is n.
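A standard 2 by 2 example where the two multiplicities differ (checked here in Python with NumPy, purely as an illustration):

    # A = [[1, 1], [0, 1]]: characteristic polynomial (x - 1)^2, so the eigenvalue 1
    # has algebraic multiplicity 2, but Ker(1*I - A) is only 1-dimensional.
    import numpy as np

    A = np.array([[1.0, 1.0],
                  [0.0, 1.0]])
    print(np.poly(A))                                  # [1., -2., 1.], i.e. x^2 - 2x + 1
    print(np.linalg.eigvals(A))                        # [1., 1.]
    print(2 - np.linalg.matrix_rank(np.eye(2) - A))    # geometric multiplicity: 1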
Every real matrix of odd size has at least one real eigenvalue. The complex eigenvalues of a real matrix occur in conjugate pairs. Every real square matrix A either has a real eigenvector or else there is a plane W such that AW is contained in W. (Choose a complex eigenvalue of A with eigenvector u + vi, where u, v are real vectors, not both 0. Then A takes W = Span{u,v} into itself.)
Eigenvectors for mutually distinct eigenvalues are automatically independent. An n by n matrix over R has a real eigenbasis if there is a basis for R^n consisting of eigenvectors. This is the case if and only if the characteristic polynomial has n real roots (counting multiplicities) and, for every root, the geometric multiplicity is equal to the algebraic multiplicity. To find an eigenbasis, for each eigenvalue c find a basis for Ker(cI - A) (i.e., for the eigenspace of c). These bases combine to give an eigenbasis for R^n precisely when the sum of the geometric multiplicities is n.
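A sketch of this recipe (Python with NumPy; the symmetric 3 by 3 matrix is made up, and the kernel bases are read off from an SVD rather than by hand row reduction):

    # For each eigenvalue c, a basis for the eigenspace Ker(cI - A).
    import numpy as np

    def eigenspace_basis(A, c, tol=1e-8):
        # rows of Vt whose singular values are (numerically) zero span Ker(cI - A)
        _, s, Vt = np.linalg.svd(c * np.eye(len(A)) - A)
        return Vt[s < tol].T

    A = np.array([[3.0, 1.0, 1.0],
                  [1.0, 3.0, 1.0],
                  [1.0, 1.0, 3.0]])
    for c in np.unique(np.round(np.linalg.eigvals(A), 6)):
        print(c, eigenspace_basis(A, c).shape[1])   # eigenvalue 2: 2-dimensional; eigenvalue 5: 1-dimensional
    # the geometric multiplicities add up to 3, so the bases combine to an eigenbasis of R^3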
Complex numbers. Complex numbers have the form a+bi where a, b are real and i is a square root of -1. In this paragraph a,b,c,d etc. are real and y,z, etc. are complex. We think of real numbers as complex numbers in which b = 0. The conjugate of the complex number z, written with a line over z or as z*, is a-bi if z = a+bi. Note that
(a+bi) + (c+di) = (a+c) + (b+d)i
(a+bi)(c+di) = (ac-bd)+(ad+bc)i
and that 1/(a+bi) = (1/(a^2 + b^2))(a - bi).
We define |a+bi| as the square root of a^2+b^2. Then |z|^2 = zz*. Note that (y+z)* = y* + z* and (yz)* = y*z*. z = z* if and only if z is a real number. The polar form of a complex number z is r(cos t + i sin t) where r = |z| and cos t + i sin t is a complex number of absolute value 1. For typographical reasons, we are using "t" instead of the more common theta.
If z = 0, any value of t can be used. If z is not 0, then t is uniquely determined up to multiples of 360 degrees. cos t + i sin t is abbreviated cis t. Note that if z = a+bi, then (r, t) are polar coordinates for (a,b). Multiplication satisfies (r cis t)(r' cis t') = (rr') cis (t+t') and, hence, (r cis t)^n = r^n cis nt. From this one can deduce that if r is not 0, r cis t has n distinct n-th roots, and they are given by r^{1/n} cis (t/n + 360k/n) as k runs through the values 0, 1, 2, ..., n-1. Over the complex numbers, every polynomial of degree n has n complex roots (counted with multiplicity), i.e., it can be factored into n linear factors. If the coefficients are real, the complex roots occur in conjugate pairs.
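As a quick check of the root formula (Python; the specific number 8 = 8 cis 0 is just an example):

    # The n distinct n-th roots of r cis t: r^(1/n) cis(t/n + 360k/n), k = 0, ..., n-1.
    import cmath, math

    r, t, n = 8.0, 0.0, 3            # cube roots of 8
    for k in range(n):
        angle_deg = (t + 360 * k) / n
        root = r ** (1 / n) * cmath.rect(1.0, math.radians(angle_deg))
        print(root, root ** 3)       # 2, -1 + sqrt(3) i, -1 - sqrt(3) i; each cubes back to 8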
Application of eigenvalues to discrete dynamical systems. Suppose that a system (described by a vector in R^n which gives the state of the system) is changing in such a way that if the state at time 0 is v, then the state at time t is A^t v, where A is an n by n matrix. If A has an eigenbasis (even over the complex numbers) consisting of eigenvectors v_1, ..., v_n for eigenvalues c_1, ..., c_n, one can find a formula for A^t v as follows. Write v as a linear combination of the eigenvectors: say v = a_1 v_1 + ... + a_n v_n. (If S is the matrix whose columns are v_1, ..., v_n, then a_1, ..., a_n as a column is the solution of SX = v, i.e., the a_j are the entries of S^{-1} v.) Then A^t v = a_1(c_1)^t v_1 + ... + a_n(c_n)^t v_n. If all of the eigenvalues are less than one in absolute value, then A^t v approaches the 0 vector as t approaches infinity, and we say that 0 is an asymptotically stable equilibrium state of the system. (This criterion is necessary and sufficient, and can be shown to be correct even if there is no eigenbasis.)
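A small sketch of this computation (Python with NumPy; the 2 by 2 matrix and initial state are made up):

    # A^t v = a_1 c_1^t v_1 + ... + a_n c_n^t v_n, where the a_j solve S X = v.
    import numpy as np

    A = np.array([[0.5, 0.3],
                  [0.2, 0.4]])
    v = np.array([1.0, 1.0])

    c, S = np.linalg.eig(A)      # eigenvalues c_j, eigenvectors as the columns of S
    a = np.linalg.solve(S, v)    # coordinates of v in the eigenbasis
    t = 20
    print(S @ (a * c**t))                        # the eigenvalue formula
    print(np.linalg.matrix_power(A, t) @ v)      # agrees with direct computation
    # here |c_1|, |c_2| < 1, so the state tends to the zero vector as t grows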
Change of basis. Let W = v_1, ..., v_n be a basis for R^n. (The book usually uses a script letter for a basis -- we cannot do that here, for typographical reasons.) Each element v in R^n can be written uniquely as a_1 v_1 + ... + a_n v_n, and the column vector with entries a_1, a_2, ..., a_n is the coordinate vector of v with respect to the basis W, denoted [v]_W. The ORDER of the elements in W MATTERS.
If W is the standard basis e_1, ..., e_n, then [v]_W = v.
Let S be the matrix whose columns are v_1, ..., v_n. Multiplication by S converts the W-coordinates of v to standard coordinates, i.e.,
v = S [v]_W.
S is invertible, and we get
[v]_W = S^{-1} v,
i.e., multiplication by S^{-1} converts standard coordinates to W-coordinates.
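For instance (Python with NumPy; the basis and the vector are made up):

    # W-coordinates: v = S [v]_W, so [v]_W = S^{-1} v.
    import numpy as np

    S = np.column_stack([[1.0, 1.0], [1.0, -1.0]])   # columns v_1, v_2 of the basis W
    v = np.array([3.0, 1.0])

    v_W = np.linalg.solve(S, v)   # solving S x = v instead of forming S^{-1} explicitly
    print(v_W)                    # [2., 1.]: indeed v = 2 v_1 + 1 v_2
    print(S @ v_W)                # back to standard coordinates: [3., 1.]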
Let T be a linear transformation from R^n to R^n. The matrix [T]_W of T with respect to the basis W is the matrix B with the property that for all v in R^n,
B [v]_W = [T(v)]_W.
That is, B describes how T acts on a vector given in W-coordinates, and gives the answer in W-coordinates. If A is the matrix of T with respect to the standard basis, this implies:
B = S^{-1} A S (and so A = S B S^{-1}).
Also, the columns of B are
[T(v_1)]_W, [T(v_2)]_W, ..., [T(v_n)]_W.
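A small illustration (Python with NumPy; T, taken here to be reflection across the line y = x, and the basis W are made up):

    # B = S^{-1} A S, the matrix of T in the basis W = {(1,1), (1,-1)}.
    import numpy as np

    A = np.array([[0.0, 1.0],
                  [1.0, 0.0]])                       # standard matrix of the reflection
    S = np.column_stack([[1.0, 1.0], [1.0, -1.0]])   # columns are the basis vectors

    B = np.linalg.solve(S, A @ S)    # S^{-1} A S, without forming the inverse
    print(B)                         # [[1, 0], [0, -1]]: the basis vectors are eigenvectors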
If A, B are two n by n matrices (over R) then B is similar to A if there exists an invertible matrix S such that B = S^{-1} A S.
A matrix is similar to itself, B is similar to A if and only if A is similar to B, and if A is similar to B and B is similar to C then A is similar to C. Similar matrices have the same
rank,
characteristic polynomial,
eigenvalues,
algebraic and geometric multiplicities for each eigenvalue,
trace, and
determinant.
One can also consider similarity over the complex numbers, and the same statements hold.
Diagonalizability. A matrix is diagonalizable over R if and only if it has all real eigenvalues and an eigenbasis, i.e., if and only if all eigenvalues are real and the geometric multiplicity of every eigenvalue is equal to the algebraic multiplicity. A matrix is diagonalizable if and only if it is similar to a diagonal matrix. (If one takes S to have columns that form an eigenbasis, then S^{-1} A S is diagonal, with the eigenvalues of A on the diagonal.)
The same criteria work over the complex numbers, except that one allows complex eigenvalues, which increases the number of real matrices that are diagonalizable.
The spectral theorem. Let A be a real symmetric n by n matrix. Then if V is a subspace of R^n such that AV is contained in V (V is said to be stable under A), the orthogonal complement of V has the same property. If v, w are eigenvectors of A corresponding to distinct eigenvalues, then v is perpendicular to w.
Moreover, all eigenvalues of A are real, and there is an ORTHONORMAL basis of R^n consisting of real eigenvectors of A. Thus, there is an orthogonal matrix S such that S^{-1} A S is diagonal. In fact, a matrix is symmetric if and only if it can be written S D S^{-1}, where D is diagonal and S is orthogonal (so that S^{-1} = S^T).
To find an orthonormal eigenbasis, find the eigenvalues of A, and for each eigenvalue c, find a basis for Ker (cI- A). Use Gram-Schmidt to convert it to an orthonormal basis. (If there is only one element in the basis, just divide it by its length.) Put the bases for the eigenspaces together to get an eigenbasis for A.
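As a numerical sketch (Python with NumPy; the symmetric matrix is made up, and eigh is used in place of doing Gram-Schmidt by hand):

    # Spectral theorem: for symmetric A there is an orthogonal Q with Q^T A Q diagonal.
    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])
    vals, Q = np.linalg.eigh(A)   # orthonormal eigenvectors as the columns of Q
    print(vals)                   # [1., 3.]
    print(Q.T @ Q)                # the identity: the columns are orthonormal
    print(Q.T @ A @ Q)            # diagonal, with the eigenvalues on the diagonal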
Abstract linear spaces. Emphasis on linear spaces consisting of matrices and linear spaces consisting of polynomials. Determining a basis for a subspace of a linear space of this type.
Practice Problems
4.4 19, 21
5.1 21, 25
5.2 5, 7
5.3 13, 23
6.1 1, 3, 15
6.2 2, 4, 10
6.3 5, 15, 17
6.4 21, 23
6.5 1, 3, 5
7.1 1, 3
7.2 3, 9, 10
7.3 7, 11
9.1 25, 27