Appendix B
Cayley-Hamilton Theorem

The Cayley-Hamilton Theorem can be used to quickly generate the powers of a matrix. For a matrix \(A\), the characteristic equation is a polynomial in \(x\) defined as the determinant of \(A-xI\) where \(I\) is the identity matrix

\begin{equation}\tag{B.1} g(x) = \mathrm{det}(A-xI) = 0 \end{equation}

The Cayley-Hamilton Theorem simply says that the matrix \(A\) satisfies its characteristic equation

\begin{equation}\tag{B.2} g(A) = 0 \end{equation}

We will not prove the theorem since it is well outside the scope of this book but we will show how to use it to calculate the powers of a 2 by 2 matrix.

It is easy to show that the characteristic equation of a 2 by 2 matrix is

\begin{equation}\tag{B.3} g(x) = x^2 - \mathrm{Tr}(A)x + \mathrm{det}(A) \end{equation}

where \(\mathrm{Tr}(A)\) is the trace of the matrix (the sum of the diagonal elements) and \(\mathrm{det}(A)\) is the determinant of the matrix. The Cayley-Hamilton Theorem then says that

\begin{equation}\tag{B.4} A^2 - \mathrm{Tr}(A)A + \mathrm{det}(A)I = 0 \end{equation}

From this equation you can generate all the powers of the matrix. For example with \(t=\mathrm{Tr}(A)\) and \(d=\mathrm{det}(A)\), equation B.4 can be rearranged to give

\begin{equation}\tag{B.5} A^2 = tA - dI \end{equation}

To get \(A^3\) just multiply this equation by \(A\) and substitute back in for \(A^2\)

\begin{eqnarray}\tag{B.6} A^3 & = & tA^2 - dA\\ & = & t(tA-dI) - dA\nonumber\\ & = & (t^2-d)A - tdI\nonumber \end{eqnarray}

Continuing this recursively you can generate any power of \(A\). It is clear that in general you can express the \(n^{th}\) power of \(A\) as follows

\begin{equation}\tag{B.7} A^n = a(n)A + b(n)I \end{equation}

where \(a(n)\) and \(b(n)\) are two scalar functions of \(n\). To find expressions for these functions, start by letting \(S\) be the matrix that diagonalizes \(A\) so that

\begin{equation}\tag{B.8} S^{-1}AS = D = \begin{pmatrix}x_1&0\\0&x_2\end{pmatrix} \end{equation}

where \(x_1\) and \(x_2\) are the eigenvalues of \(A\) (these are the roots of the characteristic equation given by equation B.3. Operating on both sides of equation B.7 with \(S\) from the left and \(S^{-1}\) from the right will transform it into

\begin{equation}\tag{B.9} D^n = a(n)D + b(n)I \end{equation}

which in matrix form is

\begin{eqnarray}\tag{B.10} \begin{pmatrix}x_1^n&0\\0&x_2^n\end{pmatrix} & = & a(n)\begin{pmatrix}x_1&0\\0&x_2\end{pmatrix}\\ & &+b(n)\begin{pmatrix}1&0\\0&1\end{pmatrix}\nonumber \end{eqnarray}

This provides 2 equations in the unknowns \(a(n)\) and \(b(n)\) which can be solved to give

\begin{eqnarray}\tag{B.11} a(n) & = & \frac{x_1^n-x_2^n}{x_1-x_2}\\ b(n) & = & \frac{x_1x_2^n-x_1^nx_2}{x_1-x_2}\nonumber \end{eqnarray}

As an example we will calculate the powers of the binary Markov matrix

\begin{equation}\tag{B.12} M=\begin{pmatrix}\frac{1}{2}+a&\frac{1}{2}-a\\\frac{1}{2}-b&\frac{1}{2}+b\end{pmatrix} \end{equation}

The trace and determinant are

\begin{eqnarray}\tag{B.13} t & = & \mathrm{Tr}(M) = 1+a+b\\ d & = & \mathrm{det}(M) = a+b\nonumber \end{eqnarray}

The characteristic equation is then

\begin{equation}\tag{B.14} g(x) = x^2 - (1+a+b)x + a + b = 0 \end{equation}

and the eigenvalues are

\begin{equation}\tag{B.15} x_1 = 1,\;\;\;\;\;\;x_2=a+b \end{equation}

The functions \(a(n)\) and \(b(n)\) are then

\begin{eqnarray}\tag{B.16} a(n) & = & \frac{1-(a+b)^n}{1-(a+b)} = \frac{1-d^n}{1-d}\\ b(n) & = & \frac{(a+b)^n-(a+b)}{1-(a+b)} = \frac{d^n-d}{1-d}\nonumber \end{eqnarray}

The \(n^{th}\) power of \(M\) is therefore

\begin{eqnarray}\tag{B.17} & & M^n =\\ & & {\Tiny \frac{1}{1-d} \begin{pmatrix} \frac{1}{2}-b+(\frac{1}{2}-a)d^n&(\frac{1}{2}-a)(1-d^n)\\ (\frac{1}{2}-b)(1-d^n)&\frac{1}{2}-a+(\frac{1}{2}-b)d^n \end{pmatrix} } \end{eqnarray}