Lagrange multipliers on manifolds
We discuss in this article the theoretical aspects of the Lagrange multiplier method. To enhance understanding, proofs and intuitive explanations of the Lagrange multiplier method will be given from several different viewpoints, both elementary and advanced.
Contents:
- 1 Statements of theorem
- 1.1 Formulation with differential forms
- 1.2 Formulation with gradients
- 1.3 Formulation with tangent maps
- 2 Proofs
- 2.1 Beautiful abstract proof
- 2.2 Clumsy, but down-to-earth proof
- 3 Intuitive interpretations
- 3.1 Normals to tangent hyperplanes
- 3.2 With infinitesimals
- 3.3 As rates of substitution
- 4 Stationary points
1 Statements of theorem
Let $M$ be an $n$-dimensional differentiable manifold (without boundary), and let $f \colon M \to \mathbb{R}$ and $g_i \colon M \to \mathbb{R}$, for $i = 1, \ldots, k$, be continuously differentiable. Set $N = \{\, p \in M : g_1(p) = \cdots = g_k(p) = 0 \,\}$.
1.1 Formulation with differential forms
Theorem 1.
Suppose $dg_1, \ldots, dg_k$ are linearly independent at each point of $N$. If $p \in N$ is a local minimum or maximum point of $f$ restricted to $N$, then there exist Lagrange multipliers $\lambda_1, \ldots, \lambda_k \in \mathbb{R}$, depending on $p$, such that

$$df(p) = \lambda_1 \, dg_1(p) + \cdots + \lambda_k \, dg_k(p).$$
Here, $d$ denotes the exterior derivative.
Of course, as in one-dimensional calculus, the condition $df(p) = \lambda_1 \, dg_1(p) + \cdots + \lambda_k \, dg_k(p)$ by itself does not guarantee that $p$ is a minimum or maximum point, even locally.
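One simple example of this caveat: take $M = \mathbb{R}^2$, $f(x,y) = x^3 - y$, and a single constraint $g_1(x,y) = y$, so that $N$ is the $x$-axis. At $p = (0,0)$ we have $df(p) = (3x^2\,dx - dy)\big|_{(0,0)} = -dy = -\,dg_1(p)$, so the condition holds with $\lambda_1 = -1$, yet $f$ restricted to $N$ is $x \mapsto x^3$, which has neither a local minimum nor a local maximum at $0$.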
1.2 Formulation with gradients
The version of Lagrange multipliers typically used in calculus is the special case $M = \mathbb{R}^n$ in Theorem 1. In this case, the conclusion of the theorem can also be written in terms of gradients instead of differential forms:
Theorem 2.
Suppose $\nabla g_1, \ldots, \nabla g_k$ are linearly independent at each point of $N$. If $p \in N$ is a local minimum or maximum point of $f$ restricted to $N$, then there exist Lagrange multipliers $\lambda_1, \ldots, \lambda_k \in \mathbb{R}$, depending on $p$, such that

$$\nabla f(p) = \lambda_1 \nabla g_1(p) + \cdots + \lambda_k \nabla g_k(p).$$
This formulation and the first one are equivalent since the 1-form $df$ can be identified with the gradient $\nabla f$ via the formula $\langle \nabla f(p), v \rangle = df(p)(v)$ for tangent vectors $v$ at $p$.
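A standard worked example of Theorem 2: maximize $f(x,y) = x + y$ on $\mathbb{R}^2$ subject to the single constraint $g_1(x,y) = x^2 + y^2 - 1 = 0$. The condition $\nabla f(p) = \lambda_1 \nabla g_1(p)$ reads

$$(1, 1) = \lambda_1 (2x, 2y),$$

so $x = y = 1/(2\lambda_1)$, and the constraint $x^2 + y^2 = 1$ forces $\lambda_1 = \pm 1/\sqrt{2}$. The candidate points are $\pm(1/\sqrt{2}, 1/\sqrt{2})$, and the maximum of $f$ on the circle is attained at $(1/\sqrt{2}, 1/\sqrt{2})$ with $\lambda_1 = 1/\sqrt{2}$.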
1.3 Formulation with tangent maps
The functions $g_1, \ldots, g_k$ can also be coalesced into a vector-valued function. Then we have:
Theorem 3.
Let $g = (g_1, \ldots, g_k) \colon M \to \mathbb{R}^k$. Suppose the tangent map $Dg$ is surjective at each point of $N$. If $p \in N$ is a local minimum or maximum point of $f$ restricted to $N$, then there exists a Lagrange multiplier vector $\lambda \in (\mathbb{R}^k)^*$, depending on $p$, such that

$$df(p) = Dg(p)^*(\lambda).$$
Here, $Dg(p)^* \colon (\mathbb{R}^k)^* \to T_p^* M$ denotes the pullback (http://planetmath.org/DualHomomorphism) of the linear transformation $Dg(p) \colon T_p M \to \mathbb{R}^k$.
If $Dg$ is represented by its Jacobian matrix, then the condition that it be surjective is equivalent to the Jacobian matrix having full rank.
Note the deliberate use of the space $(\mathbb{R}^k)^*$, rather than $\mathbb{R}^k$ (to which the former is isomorphic), for the Lagrange multiplier vector. It turns out that the Lagrange multiplier vector naturally lives in the dual space and not the original vector space $\mathbb{R}^k$. This distinction is particularly important in the infinite-dimensional generalizations of Lagrange multipliers. But even in the finite-dimensional setting, we see hints that the dual space has to be involved, because a transpose is involved in the matrix expression for Lagrange multipliers.
If the expression $df(p) = Dg(p)^*(\lambda)$ is written out in coordinates, then it is apparent that the components $\lambda_1, \ldots, \lambda_k$ of the vector $\lambda$ are exactly the Lagrange multipliers from Theorems 1 and 2.
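To spell this out (for the case $M = \mathbb{R}^n$): identifying $\lambda$ with the column vector $(\lambda_1, \ldots, \lambda_k)^{\mathrm{T}}$ and $df(p)$ with $\nabla f(p)$, the equation $df(p) = Dg(p)^*(\lambda)$ becomes the matrix equation

$$\nabla f(p) = Dg(p)^{\mathrm{T}} \lambda, \qquad \text{i.e.} \quad \frac{\partial f}{\partial x_j}(p) = \sum_{i=1}^{k} \lambda_i \, \frac{\partial g_i}{\partial x_j}(p), \quad j = 1, \ldots, n,$$

which exhibits the transpose mentioned above and reproduces the conclusions of Theorems 1 and 2.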
2 Proofs
The proof of the Lagrange multiplier theorem is surprisingly short and elegant, when properly phrased in the language of abstract manifolds and differential forms. However, for the benefit of readers not versed in these topics, we provide, in addition to the abstract proof, a concrete translation of the arguments in the more familiar setting of $\mathbb{R}^n$.
2.1 Beautiful abstract proof
Proof.
Since $dg_1, \ldots, dg_k$ are linearly independent at each point of $N$, $N$ is an embedded submanifold of $M$, of dimension $m = n - k$. Let $\alpha \colon U \to N$, with $U$ open in $\mathbb{R}^m$, be a coordinate chart for $N$ such that $\alpha(0) = p$. Then $f \circ \alpha$ has a local minimum or maximum at $0$, and therefore $0 = d(f \circ \alpha)(0) = df(p) \circ D\alpha(0)$. But $D\alpha(0) \colon \mathbb{R}^m \to T_p N$ is an isomorphism, so the preceding equation says that $df(p)$ vanishes on $T_p N$.
Now, by the definition of $N$, we have $g_i \circ \alpha = 0$, so $0 = d(g_i \circ \alpha)(0) = dg_i(p) \circ D\alpha(0)$. So, like $df(p)$, each $dg_i(p)$ vanishes on $T_p N$.
In other words, $df(p), dg_1(p), \ldots, dg_k(p)$ all lie in the annihilator (http://planetmath.org/AnnihilatorOfVectorSubspace) $(T_p N)^0$ of the subspace $T_p N \subseteq T_p M$. Since $T_p N$ has dimension $m = n - k$, and $T_p M$ has dimension $n$, the annihilator $(T_p N)^0$ has dimension $k$. Now $dg_1(p), \ldots, dg_k(p)$ are linearly independent, so they must in fact be a basis for $(T_p N)^0$. But we had argued that $df(p) \in (T_p N)^0$. Therefore $df(p)$ may be written as a unique linear combination of the $dg_i(p)$:

$$df(p) = \lambda_1 \, dg_1(p) + \cdots + \lambda_k \, dg_k(p).$$

∎
The last paragraph of the previous proof can also be rephrased, based on the same underlying ideas, to make evident the fact that the Lagrange multiplier vector lives in the dual space $(\mathbb{R}^k)^*$.
Alternative argument.
A general theorem in linear algebra states that for any linear transformation $L \colon V \to W$ of finite-dimensional vector spaces, the image of the pullback $L^* \colon W^* \to V^*$ is the annihilator of the kernel of $L$. Since $\ker Dg(p) = T_p N$ and $df(p) \in (T_p N)^0$, it immediately follows that there exists $\lambda \in (\mathbb{R}^k)^*$ such that $df(p) = Dg(p)^*(\lambda)$. ∎
Yet another proof could be devised by observing that the result is obvious if $M = \mathbb{R}^n$ and the constraint functions are just coordinate projections on $\mathbb{R}^n$:

$$g_i(x) = x_i, \quad i = 1, \ldots, k.$$

We clearly must have

$$\frac{\partial f}{\partial x_{k+1}}(p) = \cdots = \frac{\partial f}{\partial x_n}(p) = 0$$

at a point $p$ that minimizes $f(x)$ over $N = \{x : x_1 = \cdots = x_k = 0\}$. The general case can be reduced to this one by a coordinate change:
Alternate argument.
Since $dg_1(p), \ldots, dg_k(p)$ are linearly independent, we can find a coordinate chart for $M$ about the point $p$, with coordinate functions $x_1, \ldots, x_n$ such that $x_i = g_i$ for $i = 1, \ldots, k$. Then

$$df = \frac{\partial f}{\partial x_1}\, dg_1 + \cdots + \frac{\partial f}{\partial x_k}\, dg_k + \frac{\partial f}{\partial x_{k+1}}\, dx_{k+1} + \cdots + \frac{\partial f}{\partial x_n}\, dx_n,$$

but $\partial f / \partial x_{k+1} = \cdots = \partial f / \partial x_n = 0$ at the point $p$. Set $\lambda_i = \partial f / \partial x_i$ at $p$. ∎
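To see this argument in a concrete one-constraint case: take $M = \mathbb{R}^2$, $g_1(x,y) = x^2 + y^2 - 1$, and $p = (1,0)$. The functions $x_1 = g_1(x,y)$ and $x_2 = y$ form a coordinate chart near $p$, since their Jacobian at $p$,

$$\begin{pmatrix} 2x & 2y \\ 0 & 1 \end{pmatrix}\bigg|_{(1,0)} = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix},$$

is invertible. In these coordinates $df = \frac{\partial f}{\partial x_1}\, dg_1 + \frac{\partial f}{\partial x_2}\, dx_2$, and at a constrained extremum $\partial f / \partial x_2$ vanishes at $p$, leaving $\lambda_1 = \partial f / \partial x_1$ evaluated at $p$.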
2.2 Clumsy, but down-to-earth proof
Proof.
We assume that $M = \mathbb{R}^n$. Consider the vector-valued function $g = (g_1, \ldots, g_k)$ discussed earlier, and its $k \times n$ Jacobian matrix $Dg$ in Euclidean coordinates. The $i$th row of this matrix is

$$\begin{bmatrix} \dfrac{\partial g_i}{\partial x_1} & \cdots & \dfrac{\partial g_i}{\partial x_n} \end{bmatrix} = (\nabla g_i)^{\mathrm{T}}.$$

So the matrix $Dg$ has full rank (i.e. $\operatorname{rank} Dg = k$) if and only if the gradients $\nabla g_1, \ldots, \nabla g_k$ are linearly independent.
Consider each solution $q \in N$ of the equation $g(q) = 0$. Since $Dg(q)$ has full rank, we can apply the implicit function theorem, which states that there exist smooth solution parameterizations $\alpha \colon U \to N$ around each point $q \in N$. (Here $U$ is an open set in $\mathbb{R}^m$, $m = n - k$.) These $\alpha$ are the coordinate charts which give $N = g^{-1}(0)$ a manifold structure.
We now consider specially the point $q = p$; without loss of generality, assume $\alpha(0) = p$. Then $f \circ \alpha$ is a function on Euclidean space having a local minimum or maximum at $0$, so its derivative vanishes at $0$. Calculating by the chain rule, we have

$$0 = D(f \circ \alpha)(0) = Df(p)\, D\alpha(0).$$

In other words, $\ker Df(p) \supseteq \operatorname{image} D\alpha(0) = T_p N$. Intuitively, this says that the directional derivatives of $f$ at $p$, in directions lying in the tangent space $T_p N$ of the manifold $N$, vanish.
By the definition of $N$ and $\alpha$, we have $g \circ \alpha = 0$. By the chain rule again, we derive $0 = Dg(p)\, D\alpha(0)$.
Let the columns of $D\alpha(0)$ be the column vectors $v_1, \ldots, v_m$, which span the $m$-dimensional space $T_p N$, and look at the matrix equation $0 = Df(p)\, D\alpha(0)$ again. The equation for each entry of this matrix, which consists of only one row, is

$$Df(p)\, v_j = \nabla f(p) \cdot v_j = 0, \quad j = 1, \ldots, m.$$

In other words, $\nabla f(p)$ is orthogonal to $v_1, \ldots, v_m$, and hence it is orthogonal to the entire tangent space $T_p N$.
Similarly, the matrix equation $0 = Dg(p)\, D\alpha(0)$ can be split into individual scalar equations:

$$\nabla g_i(p) \cdot v_j = 0, \quad i = 1, \ldots, k, \; j = 1, \ldots, m.$$

Thus each $\nabla g_i(p)$ is orthogonal to $T_p N$. But the $\nabla g_i(p)$ are, by hypothesis, linearly independent, and there are $k$ of these gradients, so they must form a basis for the orthogonal complement of $T_p N$, of dimension $n - m = k$. Hence $\nabla f(p)$ can be written as a unique linear combination of the $\nabla g_i(p)$:

$$\nabla f(p) = \lambda_1 \nabla g_1(p) + \cdots + \lambda_k \nabla g_k(p).$$

∎
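To make the orthogonality argument tangible, return to the example $f(x,y) = x + y$, $g_1(x,y) = x^2 + y^2 - 1$ from Section 1.2, at the maximizer $p = (1/\sqrt{2}, 1/\sqrt{2})$. One parameterization of $N$ with $\alpha(0) = p$ is $\alpha(t) = (\cos(\tfrac{\pi}{4} + t), \sin(\tfrac{\pi}{4} + t))$, so the single column of $D\alpha(0)$ is $v_1 = (-1/\sqrt{2}, 1/\sqrt{2})$. Indeed

$$\nabla f(p) \cdot v_1 = (1,1) \cdot (-\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}) = 0, \qquad \nabla g_1(p) \cdot v_1 = (\sqrt{2}, \sqrt{2}) \cdot (-\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}) = 0,$$

so both gradients lie in the one-dimensional orthogonal complement of $T_p N$, and $\nabla f(p) = \tfrac{1}{\sqrt{2}} \nabla g_1(p)$.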
3 Intuitive interpretations
We now discuss the intuitive and geometric interpretations of Lagrange multipliers.
3.1 Normals to tangent hyperplanes
Each equation $g_i(x) = 0$ defines a hypersurface $N_i$ in $\mathbb{R}^n$, a manifold of dimension $n-1$. If we consider the tangent hyperplane at $p$ of each of these hypersurfaces, $T_p N_i$, the gradient $\nabla g_i(p)$ gives the normal vector to these hyperplanes.
The manifold $N$ is the intersection of the hypersurfaces $N_i$. Presumably, the tangent space $T_p N$ is the intersection of the $T_p N_i$, and the subspace perpendicular to $T_p N$ would be spanned by the normals $\nabla g_1(p), \ldots, \nabla g_k(p)$. Now, the directional derivatives at $p$ of $f$ with respect to each vector in $T_p N$, as we have proved, vanish. So the direction of $\nabla f(p)$, the direction of the greatest change in $f$ at $p$, should be perpendicular to $T_p N$. Hence $\nabla f(p)$ can be written as a linear combination of the $\nabla g_i(p)$.
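A quick two-constraint example of this picture: in $\mathbb{R}^3$, let $g_1(x,y,z) = z$ and $g_2(x,y,z) = x^2 + y^2 + z^2 - 1$, so that $N$ is the unit circle in the $xy$-plane. At $p = (x, y, 0) \in N$,

$$\nabla g_1(p) = (0, 0, 1), \qquad \nabla g_2(p) = (2x, 2y, 0),$$

and $T_p N$ is spanned by $(-y, x, 0)$, which is indeed orthogonal to both normals; the two gradients span the plane perpendicular to the circle at $p$.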
Note, however, that this geometric picture, and the manipulations with the gradients $\nabla f(p)$ and $\nabla g_i(p)$, do not carry over to abstract manifolds. The notions of gradients and normals to surfaces depend on the inner product structure of $\mathbb{R}^n$, which is not present in an abstract manifold (without a Riemannian metric).
On the other hand, this explains the mysterious appearance of annihilators in the last paragraph of the abstract proof. Annihilators and dual space theory serve as the proper tools to formalize the manipulations we made with the matrix equations $0 = Df(p)\, D\alpha(0)$ and $0 = Dg(p)\, D\alpha(0)$, without resorting to Euclidean coordinates, which, of course, are not even defined on an abstract manifold.
3.2 With infinitesimals
If we are willing to interpret the quantities $df$ and $dg_i$ as infinitesimals, even the abstract version of the result has an intuitive explanation. Suppose we are at the point $p$ of the manifold $M$, and consider an infinitesimal movement $\Delta p$ about this point. The infinitesimal movement $\Delta p$ is a vector in the tangent space $T_p M$, because, near $p$, $M$ looks like the linear space $T_p M$. And as $p$ moves, the function $f$ changes by a corresponding infinitesimal amount $df$ that is approximately linear in $\Delta p$.
Furthermore, the change $df$ may be decomposed as the sum of a change as $p$ moves along the manifold $N$, and a change as $p$ moves out of the manifold $N$. But if $f$ has a local minimum at $p$, then there cannot be any change of $f$ along $N$; thus $f$ only changes when moving out of $N$. Now $N$ is described by the equations $g_i = 0$, so a movement out of $N$ is described by the infinitesimal changes $dg_1, \ldots, dg_k$. As $df$ is linear in the change $\Delta p$, we ought to be able to write it as a weighted sum of the changes $dg_i$. The weights are, of course, the Lagrange multipliers $\lambda_i$.
The linear algebra performed in the abstract proof can be regarded as the precise, rigorous translation of the preceding argument.
3.3 As rates of substitution
Observe that the formula for Lagrange multipliers is formally very similar to the standard formula for expressing a differential form in terms of a basis:

$$df = \frac{\partial f}{\partial x_1}\, dx_1 + \cdots + \frac{\partial f}{\partial x_n}\, dx_n.$$

In fact, if $dg_1(p), \ldots, dg_k(p)$ are linearly independent, then they do form a basis for $(T_p N)^0$, which can be extended to a basis for $T_p^* M$. By the uniqueness of the basis representation, we must have

$$\lambda_i = \frac{\partial f}{\partial g_i}(p).$$

That is, $\lambda_i$ is the differential of $f$ with respect to changes in $g_i$.
In applications of Lagrange multipliers to economic problems, the multipliers $\lambda_i$ are rates of substitution: they give the rate of improvement in the objective function $f$ as the constraints $g_i$ are relaxed.
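A small example of this interpretation: maximize $f(x, y) = xy$ subject to $g_1(x, y) = x + y - c = 0$ for a fixed $c > 0$. The multiplier condition $(y, x) = \lambda_1 (1, 1)$ gives $x = y = \lambda_1 = c/2$, and the optimal value is $f^* = c^2/4$. Relaxing the constraint to $g_1 = \varepsilon$ (that is, $x + y = c + \varepsilon$) gives the optimal value $(c + \varepsilon)^2/4$, whose derivative at $\varepsilon = 0$ is $c/2 = \lambda_1$: the multiplier measures the rate at which the optimum improves as the constraint is relaxed.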
4 Stationary points
In applications, we are sometimes interested in finding stationary points $p$ of $f$, defined as points $p$ such that $df(p)$ vanishes on $T_p N$, or equivalently, such that the Taylor expansion of $f$ at $p$, under any system of coordinates for $N$, has no terms of first order. The Lagrange multiplier method works for this situation too.
The following theorem incorporatesthe more general notion of stationary points.
Theorem 4.
Let $M$ be an $n$-dimensional differentiable manifold (without boundary), and let $f \colon M \to \mathbb{R}$ and $g_i \colon M \to \mathbb{R}$, for $i = 1, \ldots, k$, be continuously differentiable. Suppose $p \in N = \{\, q \in M : g_1(q) = \cdots = g_k(q) = 0 \,\}$, and $dg_1(p), \ldots, dg_k(p)$ are linearly independent.
Then $p$ is a stationary point (e.g. a local extremum point) of $f$ restricted to $N$ if and only if there exist $\lambda_1, \ldots, \lambda_k \in \mathbb{R}$ such that

$$df(p) = \lambda_1 \, dg_1(p) + \cdots + \lambda_k \, dg_k(p).$$
The Lagrange multipliers $\lambda_i$, which depend on $p$, are unique when they exist.
In this formulation, $N$ is not necessarily a manifold, but it is one when intersected with a sufficiently small neighborhood about $p$. So it makes sense to talk about $T_p N$, although we are abusing notation here. The subspace in question can be more accurately described as the subspace of $T_p M$ annihilated by $dg_1(p), \ldots, dg_k(p)$.
It is also enough that $dg_1, \ldots, dg_k$ be linearly independent only at the point $p$. For $dg_1, \ldots, dg_k$ are continuous, so they will be linearly independent at points near $p$ anyway, and we may restrict our viewpoint to a sufficiently small neighborhood around $p$; the proofs then carry through.
The proof involves only simple modifications to that of Theorem 1. For instance, the converse implication follows because we have already proved that the $dg_i(p)$ form a basis for the annihilator of $T_p N$, independently of whether or not $p$ is a stationary point of $f$ on $N$.