请输入您要查询的字词:

 

单词 LectureNotesOnTheCayleyHamiltonTheorem
释义

lecture notes on the Cayley-Hamilton theorem


1 Overview

You should all know about the characteristic polynomialMathworldPlanetmathPlanetmath of a squarematrixMathworldPlanetmath, A. To calculate the characteristic polynomial of A, onesubtracts a variable, say t, from the diagonal entries of A, andthen takes the determinantMathworldPlanetmath of the result. In other words, lettingpA(t) denote the characteristic polynomial of A, one has

pA(t)=det(A-tI),

where I as usual denotes the identity matrixMathworldPlanetmath.For example, set

A=[1230-21110].

Evaluating the determinant of A-tI one gets

pA(t)=-t3-t2+6t+7.

Now the interesting thing about square matrices is that one can doalgebra with them. So if A is a 3×3 matrix then A2,A3, indeed every power of A will also be a 3×3 matrix.Indeed, one can take any polynomialPlanetmathPlanetmath p(t), and happily plug A intoit. The result will be some other 3×3 matrix. The obviousquestion now is: what will happen when one plugs a square matrix Ainto its own characteristic polynomial? Let’s see what happens forthe sample 3×3 matrix above. Straightforward calculations showthat

A2=[41515-2104],A3=[91113-1-108563],

Next adding the various powers of A with the coefficients ofcharacteristic polynomial (note that one uses the identity matrix inplace of the constants) one gets

-A3-A26A7I-[91113-1-108563]-[41515-2104]+6[1230-21110]+7[100010001]
=[000000000]

Zero! One gets zero. This seemingly miraculous answer is not acoincidence. Indeed one gets zero regardless of what matrix onestarts with. I encourage you to try this with a few examples of your own.

Theorem 1 (Cayley-Hamilton).

Let A be an n×n matrix, and let pA(t)=det(A-tI) be thecorresponding characteristic polynomial. Then, pA(A)=0.

The goal of these notes will be to explain and prove the abovetheorem. There are various hidden reasons that make theCayley-Hamilton theoremMathworldPlanetmath work. It is the purpose of these notes tobring these reasons into the open.

2 The Gist of the Matter.

Indeed there are two factors that make the Cayley-Hamilton theoremsuch a striking and interesting result. Recall that if U is ann-dimensional vector spaceMathworldPlanetmath, and 𝐮1,𝐮2,𝐮3,,𝐮n+1, are any n+1 vectors in U, then there will some kind ofa linear relationMathworldPlanetmath between the 𝐮i’s, i.e. for some choice ofscalars a1,a2,,an+1 one will have

a1𝐮1+a2𝐮2+a3𝐮3+an+1𝐮n+1=0.

Now the space of 3×3 matrices is 9-dimensional. Thereforefor every 3×3 matrix A there must be a linear relationshipbetween the 10 different matrix powers

A9,A8,A7,A6,A5,A4,A3,A2,A1=A,A0=I.

The “miracle” of the Cayley-Hamilton theorem is twofold. First, alinear relation arises already for the powers A3,A2,A1,A0.Second, the coefficients for this linear relation are precisely thecoefficients of the characteristic polynomial of A.

Let’s put it another way. Look at the first column vectorsMathworldPlanetmath of thematrices A3,A2,A1,A0, i.e. the vectors

[9-15],[411],[101],[100].

Now 3 is a 3-dimensional vector space, and so thereshould be a linear relation between the above 4 vectors. Indeedthere is: the coefficients of the linear relation are -1,-1,6,7( i.e. -1 times the first vector, plus -1 times the second, plus6 times the third, plus 7 times the fourth is equal to zero —try it yourself!). What about the second column vectors of A3,A2,A1,A0? Now the vectors in question are

[11-106],[150],[2-21],[010].

Again, we have here 4 vectors from a 3-dimensional vectorsspace, and therefore there should be a linear relation between thevectors. However by some miracle the coefficients of the linearrelation for the second column vectors are the same as thecoefficients of the linear relation between the first column vectors,namely -1,-1,6,7. Furthermore, these coefficients are preciselythe coefficients of the characteristic polynomial: -t3-t2+6t+7.Needless to say the third column vectors are joined in a linearrelation with the same coefficients: -1,-1,6,7. Why is thishappening?

3 The Cyclic Basis

Let’s look again at the first column vectors of the matrices A0,A1,A2 (recall that A0 is just the identity matrix):

𝐮0=[100],𝐮1=[101],𝐮2=[411],

and let’s take these 3 vectors as a new basis, 𝐁. A basisobtained in this fashion, i.e. by starting with a vector andsuccessively applying a matrix to it, is called a cyclic basis. Whatwill be the representation of the matrix A relative to this cyclicbasis? Now 𝐮0 is just the first elementary vector, 𝐞1.Furthermore, note that 𝐮1 is nothing but A𝐮0, and that𝐮2=A𝐮1. Now A𝐮2 is the first column vector of A3 andwe already determined the linear relation between the first columnvectors of A3,A0. The bottom line is that

A𝐮2=-(𝐮2-6𝐮1-7𝐮0),

and consequently A will havethe following appearance relative to the basis 𝐁:

[A]𝐁=[00710601-1]

The transition matrix P from 𝐁 to the standard basis 𝐞1,𝐞2,𝐞3 is given by

P=[114001011].

Of course P is relevant to our discussion precisely because

[A]𝐁=P-1AP.
Proposition 1.

Let A be an n×n matrix, P a non-singular n×nmatrix, and set B=P-1AP. The matrices A and B havethe same characteristic polynomials.

Proof.

The characteristic polynomial of B is given as

det(B-tI)=det(P-1AP-tI)=det(P-1(A-tI)P).

Recall that the determinant of a productPlanetmathPlanetmath is the product of thedeterminants, and that the determinant of an inversePlanetmathPlanetmathPlanetmathPlanetmathPlanetmath is the inverse ofthe determinant. Therefore

det(B-tI)=det(P-1)det(A-tI)det(P)=det(A-tI).

In other words, according to the above theorem we should expect thecharacteristic polynomial of [A]𝐁 to be equal to thecharacteristic polynomial of A. Let’s check this using a co-factorexpansion.

|-t071-t601-1-t|=-t(t2+t-6)+71=-t3-t2+6t+7

Also note that the last column of [A]𝐁 contains all but one ofthe coefficients of the characteristic polynomial. This too is not acoincidence.

Proposition 2.

Consider an n×n matrix B such that the jthcolumn vector for j=1,2,n-1 is the basic vectorej+1, while the last column of B is the vector [-b0,-b1,,-bn-1]T. In other words B has the followingform:

B=[0000-b01000-b10100-b20010-b30001-bn-1]

Then, the characteristic polynomial of B is given by

(-1)npB(t)=tn+bn-1tn-1++b2t2+b1t+b0.
Proof.

We will calculate the determinant of B-tI by doing a co-factorexpansion along the first row. Let B1 be the matrix obtained bydeleting the first row and the first column from B-tI, and let Dbe the matrix obtained by deleting the first row and the last columnfrom B-tI. Doing a co-factor expansion along the top row, it iseasy to see that

det(B-tI)=-tdet(B1)+(-1)nb0det(D).

Now D is anupper triangular matrixMathworldPlanetmath with ones on the diagonal, and thereforedet(D)=1. The matrix B1, on the other hand has the samestructureMathworldPlanetmath as B-tI, only it’s 1 size smaller. To that end letB2 be the matrix obtained by deleting the first two rows andcolumns from B-tI. By the same reasoning as above it’s easy tosee that

det(B1)=-tdet(B2)+(-1)n-1b1,

and therefore

det(B-tI)=(-1)nb0-t((-1)n-1b1-tdet(B2)).

Continuing inductively we see that for even n,the determinant of B-tI will have the form:

b0-t(-b1-t(b2-t(-b3-t())))=b0+b1t+b2t2+b3t3++bn-1tn-1+tn

For odd n, det(B-tI) will be just like theformulaMathworldPlanetmath above, but multiplied through by a negative sign.∎

4 Putting it all together

Thanks to PropositionsPlanetmathPlanetmath 1 and 2 we arenow in a position to understand and to prove theCayley-Hamilton Theorem. Let A be an n×n matrix. Start bysetting 𝐮0=𝐞1, and then create a sequenceMathworldPlanetmath of vectors bysuccessively applying A, i.e. 𝐮1=A𝐮0, 𝐮2=A𝐮1, etc.Notice that 𝐮k=Ak𝐮0; in other words, 𝐮k is the firstcolumn of the matrix Ak.

Next, suppose that the n vectors 𝐮0,𝐮1,𝐮2,,𝐮n-1 form a basis, 𝐁, of n (There are matrices Afor which this doesn’t happen, but we’ll consider this possibilitylater.) There will therefore exist scalars b0,b1,bn-1such that

𝐮n+bn-1𝐮n-1++b1𝐮1+b0𝐮0=0.

Now the representation of A relative to the cyclic basis 𝐁 willhave the form

[A]𝐁=[0000-b01000-b10100-b20010-b30001-bn-1]

By Proposition 1 the characteristic polynomial of[A]𝐁 is equal to the characteristic polynomial of A.Furthermore, by Proposition 2 the characteristicpolynomial of [A]𝐁 is equal to

±(tn+bn-1tn-1+b1t+b0).

Only one conclusionMathworldPlanetmathis possible: b0,b1,,bn-1 must be precisely thecoefficients of the characteristic polynomial of A. Let ussummarize these findings.

Proposition 3.

Let A be an n×n matrix, with characteristic polynomial

pA(t)=±(tn+bn-1tn-1++b1t+b0).

Fix anumber k between 1 and n, and let uj be the kth column of the matrix Aj. If the vectorsu0,u1,,un-1 form a basis of Rn, thenthe vectors u0,u1,,un-1,un satisfy thelinear relation:

𝐮n+bn-1𝐮n-1++b1𝐮1+b0𝐮0=0.

5 A Complication

We are almost done with the proof of the Cayley-HamiltonTheorem. First, however, we must deal with the possibility that thesquare matrix A is such that the column vectors of A0,A1,,An-1 do not form a basis. Consider, for example

A=[1-1214-412-2]

An easy calculation shows that the characteristic polynomial is givenby

pA(t)=t3-3t2+t+2.

Writing down the sequence of powers of A:

A3A2A1A0[3-24215-1427-6][2-1217-613-2][1-1214-412-2][100010001]

we notice that the first columns do, in fact, obey a linear relationwith the coefficients of the characteristic polynomial:

[322]-3[211]+[111]+2[100]=[000].(1)

However these first column vectors do not form a basis of3, and therefore Proposition 3 is not enoughto explain why these vectors obey the above linear relation.

In order to find an explanation, let us proceed as follows. Just asbefore, start by setting 𝐮0=𝐞1, and 𝐮1=A𝐮0. If wetake 𝐮2=A𝐮1, then 𝐁=(𝐮0,𝐮1,𝐮2) will not form abasis, so instead, let us choose 𝐮2 that is linearly independentMathworldPlanetmathfrom 𝐮0 and 𝐮1, thereby ensuring that 𝐁 is a basis.There are many, many possible such choices for 𝐮2. To keep thediscussion concrete, let us take 𝐮2=𝐞3=[0,0,1]T. Note that

A𝐮0=𝐮1
A𝐮1=[2,1,1]T=𝐮0+𝐮1.
A𝐮2=[2,-4,2]T=6𝐮0-4𝐮1+2𝐮2.

Therefore, representing A relative to the basis 𝐁 we obtain

[A]𝐁=[01611-4002]

By Proposition 1, we know that the characteristicpolynomial of [A]𝐁 is equal to the characteristic polynomial ofA. However, we know much more.

Proposition 4.

Let B be an n×n matrix of the form

B=[B1B2𝟎B3]

whereB1 is a k×k matrix, B2 is a k×(n-k) matrix, andB3 is a (n-k)×(n-k) matrix. Then, the characteristicpolynomial of B is the product of the characteristic polynomials ofB1 and B3, i.e. pB(t)=pB1(t)×pB3(t).

Proof.

Note that

B-tI=[B1-tI1B2𝟎B3-tI3]

where I1 is the k×k identity matrix, and I3 is the(n-k)×(n-k) identity matrix. The Proposition now follows fromthe fact that the determinant of a matrix whose shape is like B isthe determinant of the upper-left block times the determinant of thelower-right block.∎

Thanks to Proposition 4 we know that thecharacteristic polynomial of [A]𝐁 is a product of thecharacteristic polynomial of the 2×2 matrix

[0111]

and the characteristic polynomial of the 1×1 matrix [2]. Inother words,

p[A]𝐁(t)=(t2-t-1)(t-2).

Furthermore by Proposition2 we know that A𝐮1, 𝐮1, 𝐮0, i.e. thefirst column vectors of A2,A1,A0, obey a linear relation withthe coefficients of the polynomial t2-t-1:

A𝐮1-𝐮1-𝐮0=𝟎,
[211]-[111]-[100]=[000].(2)

Multiplying this relation through by A wed deduce that the firstcolumn vectors of A3,A2,A1 obey the same linear relation:

[322]-[211]-[111]=[000].(3)

Next think about what it means to multiply a polynomial such ast2-t-1 by another polynomial such as t-2. Indeed, one canstructure the multiplication by multiplying the first polynomialthrough by t, then multiplying it through by -2, and then addingthe two terms:

t3-t2-t-2t22t2t3-3t2+t+2

The bottom line is, of course, just the characteristic polynomialof A, and the whole idea behind the above calculation is thatpA(t) can be “formed out of” the polynomial t2-t-1. Thisshows that we can combine relations (2) and(3) and produce in the end the desired relation(1). All we have to do is take relation (3),and add to it -2 times the relation (2). This explainswhy the first column vectors of A3,A2,A1,A0 obey a linearrelation whose coefficients come from the characteristic polynomial ofA.

Proposition 5.

Let A be an n×n matrix, with characteristic polynomial

pA(t)=±(tn+bn-1tn-1++b1t+b0).

Fix anumber k between 1 and n, and let uj be the kth column of the matrix Aj.The vectors u0,u1,,un-1,un satisfy thelinear relation:

𝐮n+bn-1𝐮n-1++b1𝐮1+b0𝐮0=0,

even if the vectors u0,u1,,un-1 do not form abasis of Rn.

Proof.

Suppose that there is a number m<n such that 𝐮m can be givenas a linear combinationMathworldPlanetmath of 𝐮0,𝐮1,,𝐮m-1; let’ssay

𝐮m+cm-1𝐮m-1++c1𝐮1+c0𝐮0=0.

Choose vectors 𝐯m,𝐯m+1,,𝐯n-1 so that thelist

𝐁=(𝐮0,𝐮1,,𝐮m-1,𝐯m,𝐯m+1,,𝐯n-1)

forms a basis of n. Relative to this basis, A will havethe form

A=[A1A20A3],

where the upper-left block has the form

A1=[000-c0100-c1010-c2001-cm-1]

By Proposition 2,

pA1(t)=±(tm+cm-1tm-1++c1t+c0)

.Let ±(tn-m+dn-m-1tm-1+d1t+d0) denote the characteristic polynomial of A3.By Proposition 4, the characteristic polynomial of[A]𝐁 (which is equal to the characteristic polynomial of A) isthe product pA1(t)×pA3(t), and therefore pA(t) canbe obtained by taking linear combinations of pA1(t) timesvarious powers of t:

d0×tm+cm-1tm-1++c2t2+c1t+c0d1×tm+1+cm-1tm+cm-2tm-1++c1t2+c0t1×tn+cm-1tn-1++c1tn-m+1+c0tn-mtn+bn-1tn-1+bn-2tn-2++b2t2+b1t+b0

Corresponding to the above polynomials are the relations

𝐮m+cm-1𝐮m-1++c2𝐮2+c1𝐮1+c0𝐮0=0𝐮m+1+cm-1𝐮m+cm-2𝐮m-1++c1𝐮2+c0𝐮1=0𝐮n+cm-1𝐮n-1++c1𝐮n-m+1+c0𝐮n-m=0

Adding these relations in the same way as the polynomials yieldsthe desired relation:

𝐮n+bn-1𝐮n-1++b1𝐮1+b0𝐮0=0,

Now we really are finished. Thanks to Propositions 3and 5 we know that for every k=1,,n, the kth column vectors of the matrices

An,An-1,,A1,A0

obey a linear relation with the coefficients of the characteristicpolynomial of A. Since this is true for every column of the abovematrices, it is certainly true for the full matrix — and that is theprecisely the conclusion of the Cayley-Hamilton theorem.

随便看

 

数学辞典收录了18232条数学词条,基本涵盖了常用数学知识及数学英语单词词组的翻译及用法,是数学学习的有利工具。

 

Copyright © 2000-2023 Newdu.com.com All Rights Reserved
更新时间:2025/5/5 0:32:51