SIMILARITY OF MATRICES OVER THE COMPLEX NUMBERS

We want to show here that every matrix over the complex numbers is similar (over C) to an upper triangular matrix. The key point is the fundamental theorem of algebra: every polynomial of positive degree over C has a root in C. Applied to the characteristic polynomial, this means that over C a matrix always has at least one eigenvalue, and hence at least one eigenvector.
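For instance, a real rotation matrix has no real eigenvalues, but over C it has two. Here is a quick numerical illustration (a minimal sketch, assuming numpy is available):

import numpy as np

# A 90-degree rotation of the plane: no real eigenvalues
A = np.array([[0.0, -1.0],
              [1.0,  0.0]])

w, v = np.linalg.eig(A)
print(w)                                          # [0.+1.j 0.-1.j]
assert np.allclose(A @ v[:, 0], w[0] * v[:, 0])   # an eigenvector over C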

Here's why:
Suppose the result is false, and look at a matrix A of smallest size for which it fails, say of size n. Then n > 1, since any 1 by 1 matrix is already upper triangular.

Pick an eigenvector v = v_1 and extend it to a basis for C^n. Call the corresponding eigenvalue c, i.e., Av = cv. Consider the matrix of the transformation given by A with respect to this basis. Since Av = cv, the first column of the new matrix is

c
0
0
.
.
.
0

and so A is similar to a matrix A' in block form:

c B
0 D

where [c] is a 1 by 1 block, B is a 1 by n-1 block, 0 denotes an n-1 by 1 block, and D is an n-1 by n-1 block.

Because D has size n-1, which is smaller than n, the result holds for D (A was chosen of smallest size among the failures), so we can choose an invertible matrix Q (size n-1 by n-1) such that Q^{-1} D Q is upper triangular. Let S be the block matrix

1 0
0 Q

where the block of 0's in the first row is 1 by n-1. Then S^{-1} =

1   0 
0 Q^{-1} 

and it is easy to check that

S^{-1} A' S =

c     E
0 Q^{-1} D Q 

where E = BQ. This matrix is upper triangular, since Q^{-1} D Q is. Thus A is similar to an upper triangular matrix after all, contradicting the choice of A, and so there is no size for which the result fails. QED

The elements on the diagonal of the similar upper triangular matrix will, of course, be the eigenvalues of the original matrix.
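For numerical work, this triangularization is exactly what the complex Schur decomposition delivers (it even makes the change of basis unitary). Here is a sketch of a check, assuming numpy and scipy are available:

import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

# Complex Schur form: A = Z T Z^H with T upper triangular
T, Z = schur(A, output='complex')

assert np.allclose(A, Z @ T @ Z.conj().T)
# The diagonal of T consists of the eigenvalues of A
assert np.allclose(np.sort_complex(np.diag(T)),
                   np.sort_complex(np.linalg.eigvals(A)))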

Fact: An n by n matrix A is nilpotent (some power of it equals 0) if and only if all of its eigenvalues are 0. In this case, A^n = 0.

Here's why:
The issue is unaffected by replacing the matrix by a similar matrix, so we may assume that the matrix A is upper triangular. If some eigenvalue c is not 0, the matrix cannot be nilpotent: c^t is an eigenvalue of A^t, so A^t cannot be 0. Therefore we need only show that an upper triangular matrix whose diagonal is all 0 has n th power equal to 0. The condition implies that e_1 maps to 0, e_2 maps into Span{e_1}, e_3 maps into Span{e_1, e_2}, etc. From this one checks that the image of A^k is contained in Span{e_1, e_2, ..., e_{n-k}} for each k = 1, 2, ..., n-1. E.g., the image of A^{n-1} is contained in Span{e_1}, and applying A one more time produces 0.
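Here is a quick numerical check of this fact (a minimal sketch assuming numpy; the entries above the diagonal are chosen arbitrarily):

import numpy as np
from numpy.linalg import matrix_power

n = 4
# Upper triangular with zero diagonal: all eigenvalues are 0
N = np.triu(np.ones((n, n)), k=1)

# N^n = 0, as the fact predicts ...
assert np.allclose(matrix_power(N, n), 0)
# ... while N^{n-1} is not yet 0
assert not np.allclose(matrix_power(N, n - 1), 0)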

Jordan Canonical Form

A much more precise result, called Jordan canonical form, is known. A Jordan block of size k is an upper triangular matrix with the following properties:

(1) the main diagonal consists of a single constant c,
(2) the diagonal just above the main diagonal consists of k-1 1's,
(3) all other entries are 0.

For example, a Jordan block looks like

c          (k = 1) or

c 1
0 c        (k = 2) or

c 1 0 0 0
0 c 1 0 0
0 0 c 1 0
0 0 0 c 1
0 0 0 0 c  (k = 5) 
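In code, a Jordan block takes one line to build (a minimal sketch assuming numpy; the helper name jordan_block is our own):

import numpy as np

def jordan_block(c, size):
    """size by size Jordan block: c on the diagonal, 1's just above it."""
    return c * np.eye(size) + np.eye(size, k=1)

print(jordan_block(2.0, 3))
# [[2. 1. 0.]
#  [0. 2. 1.]
#  [0. 0. 2.]]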

The main theorem is that every square matrix over C is similar to a matrix which in block form looks like


A_1  0   0  0 ...  0
 0  A_2  0  0 ...  0
 0   0  A_3 0 ...  0     
 .   .   .  . ...  . 
 .   .   .  . ...  .
 .   .   .  . ...  .
 0   0   0  . ... A_h

where every A_s is a Jordan block. The Jordan form is unique except for the order in which the Jordan blocks are "strung out" along the main diagonal. Thus, every matrix is similar to a matrix in Jordan form, and two matrices in Jordan form are similar if and only if they have the same blocks, although not necessarily arranged in the same order.

E.g., if there are two blocks of size 1 (with eigenvalues a and b), one of size 2 (eigenvalue c), and one of size 3 (eigenvalue d), the matrix looks like this:

a 0 0 0 0 0 0
0 b 0 0 0 0 0
0 0 c 1 0 0 0
0 0 0 c 0 0 0
0 0 0 0 d 1 0
0 0 0 0 0 d 1
0 0 0 0 0 0 d

When a matrix is in Jordan form, it is upper triangular. Any nonzero entries not on the main diagonal must be 1's, and these must be just above the main diagonal.
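Assembling a matrix in Jordan form from its blocks is just a block-diagonal stack. Here is a sketch reproducing the 7 by 7 example above, assuming scipy and taking a = 5, b = -1, c = 2, d = 3:

import numpy as np
from scipy.linalg import block_diag

def jordan_block(c, size):
    return c * np.eye(size) + np.eye(size, k=1)

# Two blocks of size 1, one of size 2, one of size 3
J = block_diag(jordan_block(5, 1), jordan_block(-1, 1),
               jordan_block(2, 2), jordan_block(3, 3))
print(J)   # 7 by 7, upper triangular, 1's only just above the diagonal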

One can say more. One can write a matrix in Jordan form as the sum of a diagonal matrix D with the same diagonal entries and an upper triangular matrix N with all 0's on the diagonal. If the size of the matrix is n, then N is nilpotent: in fact, N^n = 0. Moreover, D and N commute: DN = ND.
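Concretely, D is the diagonal part of the Jordan-form matrix and N is whatever is left over. A quick check (a minimal sketch assuming numpy; the Jordan-form matrix is chosen arbitrarily):

import numpy as np
from numpy.linalg import matrix_power

# A 3 by 3 matrix in Jordan form: a 2 by 2 block for 2 and a 1 by 1 block for 3
J = np.array([[2., 1., 0.],
              [0., 2., 0.],
              [0., 0., 3.]])

D = np.diag(np.diag(J))   # diagonal matrix with the same diagonal entries
N = J - D                 # zero diagonal, 1's just above it

assert np.allclose(D @ N, N @ D)            # DN = ND
assert np.allclose(matrix_power(N, 3), 0)   # N^n = 0 with n = 3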

The punchline:
Every square matrix A over the complex numbers can be written as S (D + N) S^{-1}, where D is diagonal, N is nilpotent, and DN = ND.

It is important to note that when A is diagonalizable, the Jordan form of A is simply a diagonal matrix with the eigenvalues of A in some order on the diagonal. The Jordan blocks all have size 1 by 1, and N = 0.

There is an algorithmic process for finding a matrix S such that S^{-1} A S is in Jordan form, but we shall not give the details here: the method, along with an explanation of uniqueness, can be found in any text for a second course in linear algebra. However, we shall explain some applications.
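For exact arithmetic, sympy implements such an algorithm: its jordan_form method returns a matrix P and the Jordan form J with A = P J P^{-1}. A minimal sketch:

from sympy import Matrix

# A 2 by 2 matrix with repeated eigenvalue 2 and only one independent eigenvector
A = Matrix([[3, 1],
            [-1, 1]])

P, J = A.jordan_form()
print(J)                      # Matrix([[2, 1], [0, 2]]): one Jordan block of size 2
assert A == P * J * P.inv()   # the similarity that produces it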

High Powers of Matrices

By computing the Jordan form, one can give a "closed" expression for the high powers of a matrix A, which is what one wants for studying a discrete dynamical system. The idea is that, using Jordan form, one has

A = S (D+N) S^{-1}

where D is diagonal, N is nilpotent, and DN = ND.

Then A^t = S (D+N)^t S^{-1}

To carry this further, we illustrate what happens if, say, N^3 = 0. There are similar formulas no matter which power of N is 0. Because D and N commute, one can apply the binomial theorem.

Thus, (D + N)^t = D^t + t D^{t-1} N + (1/2)t(t-1) D^{t-2} N^2,

and all other terms are 0 because they involve N^3 or a higher power of N.

Because D is diagonal, it is easy to find powers of D.
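Here is a numerical check of this formula (a minimal sketch assuming numpy; we take S = I for simplicity and use one 3 by 3 Jordan block, so that N^3 = 0):

import numpy as np
from numpy.linalg import matrix_power

D = 2.0 * np.eye(3)   # diagonal part of a 3 by 3 Jordan block
N = np.eye(3, k=1)    # nilpotent part: N^3 = 0, and DN = ND
A = D + N

t = 10
# (D + N)^t = D^t + t D^{t-1} N + (1/2) t (t-1) D^{t-2} N^2
At = (matrix_power(D, t)
      + t * matrix_power(D, t - 1) @ N
      + 0.5 * t * (t - 1) * matrix_power(D, t - 2) @ N @ N)

assert np.allclose(At, matrix_power(A, t))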

Exponentials of Matrices

Suppose we want to find e^A, the exponential of the square matrix A. As before, we can write

A = S (D + N) S^{-1}

with D diagonal, N nilpotent, and DN = ND.

Then e^{D + N} = e^D e^N (this uses DN = ND). Both factors can be calculated: e^D is diagonal (exponentiate each diagonal entry of D), and e^N is a polynomial in N, since the exponential series terminates once it reaches the high powers of N that are 0.

Thus, e^A = S (e^D e^N) S^{-1} can be calculated. If u is a variable taking real (or complex) values, we have uA = S (uD + uN) S^{-1} as well, where uD is diagonal, uN is nilpotent, and the two still commute. Thus, we get the formula

e^{uA} = S (e^{uD} e^{uN}) S^{-1}

as well, valid for every value of the scalar u.
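A quick check of this recipe against a library routine (a sketch assuming scipy; again S = I for simplicity, so A = D + N directly):

import numpy as np
from scipy.linalg import expm

D = np.diag([2.0, 2.0, 2.0])   # diagonal
N = np.eye(3, k=1)             # nilpotent, N^3 = 0, commutes with D
A = D + N

u = 0.5
euD = np.diag(np.exp(u * np.diag(D)))               # exponentiate each diagonal entry
euN = np.eye(3) + u * N + (u * N) @ (u * N) / 2.0   # series stops: N^3 = 0

assert np.allclose(euD @ euN, expm(u * A))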

First Order Linear Systems of Differential Equations

Let y_1, ..., y_n be functions of a variable x. For the moment we let n = 2.

We may have a linear system of differential equations like:

dy_1/dx = 3 y_1 + 4 y_2

dy_2/dx = 7 y_1 - 5 y_2

If we make the unknown functions into a vector

Y =

y_1
y_2

this can be rewritten as

dY/dx = AY

where A =

3 4
7 -5

More generally, we consider systems

dY/dx = AY

where Y =

y_1
y_2
.
.
.
y_n

and A is an n by n matrix of real scalars.

If n = 1, we just have the simple differential equation dy/dx = ay, and the solution is

y = e^{xa}c

where c is y(0).

It turns out that in the general case the linear system

dY/dx = AY

has the solution

Y = e^{xA}C

where C = Y(0) is a vector. Thus, the solution of such linear systems reduces to finding the Jordan form of A, which, when A is diagonalizable over C, comes down to diagonalizing A. This explains the crucial importance of eigenvalues in solving linear systems of differential equations.
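For the 2 by 2 system above, here is a sketch of this solution checked against direct numerical integration (assuming scipy; the initial vector C is chosen arbitrarily):

import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

A = np.array([[3.0, 4.0],
              [7.0, -5.0]])
C = np.array([1.0, 2.0])      # C = Y(0), chosen arbitrarily

x1 = 0.3
Y_closed = expm(x1 * A) @ C   # Y = e^{xA} C, evaluated at x = x1

# Direct numerical integration of dY/dx = AY for comparison
sol = solve_ivp(lambda x, Y: A @ Y, (0.0, x1), C, rtol=1e-10, atol=1e-12)
assert np.allclose(Y_closed, sol.y[:, -1])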