Leibniz notation for vector fields

Consider a vector field $V$, such as

V(x,y)=(y,-x)\\qquad\\text{for $(x,y)\\in\\mathbb{R}^{2}$.}

Oftentimes $V$ will be expressed in the following Leibnizian notation

y\\frac{\\partial}{\\partial x}-x\\frac{\\partial}{\\partial y}\\,,

which is extremely strange the first time one encounters it. After all, why not just write

V(x,y)=ye_{1}-xe_{2}

where $e_{1}=(1,0)$, $e_{2}=(0,1)$? What do partial derivatives have to do with anything? (That was the question I asked.)

However, if viewed in the right way, it is not so strange. For if a vector field $V$ is given, then one natural thing that can be done with it is to differentiate a scalar-valued function $f$ in the direction of $V$. In functional notation, this would be

\\operatorname{D}f(p)\\cdot V(p)

where $p$ is a point, $V(p)$ is the particular vector of the vector field at the point $p$, and $\\operatorname{D}f(p)\\cdot v$ denotes the directional derivative of $f$ at $p$ with respect to the direction $v$.

For our example, $\\operatorname{D}f(p)\\cdot V(p)$ (where $p=(x,y)$) equals

$\\displaystyle\\operatorname{D}_{1}f(x,y)\\cdot y+\\operatorname{D}_{2}f(x,y)\\cdot(-x)=\\frac{\\partial f}{\\partial x}y-\\frac{\\partial f}{\\partial y}x$
$\\displaystyle{}=y\\frac{\\partial f}{\\partial x}-x\\frac{\\partial f}{\\partial y}$
$\\displaystyle{}=\\left(y\\frac{\\partial}{\\partial x}-x\\frac{\\partial}{\\partial y}\\right)[f]=V[f]\\,.$

We have written out the steps explicitly to show where the partial derivative notation for $V$ comes from. At the second-to-last step we considered $\\partial/\\partial x$ and $\\partial/\\partial y$ as operators acting on the function $f$.
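The identity $\\operatorname{D}f(p)\\cdot V(p)=V[f]$ can also be checked numerically. The following is a minimal sketch, not part of the original derivation: the test function `f`, the sample point and the step size `h` are arbitrary choices made only for illustration, and the directional derivative is estimated by a central difference.

```python
import math

def V(x, y):
    """The example vector field V(x, y) = (y, -x)."""
    return (y, -x)

def f(x, y):
    """A hypothetical test function, chosen only for illustration."""
    return x**2 * y + math.sin(y)

def directional_derivative(f, p, v, h=1e-6):
    """Central-difference estimate of Df(p).v, the derivative of f at p along v."""
    (x, y), (a, b) = p, v
    return (f(x + h*a, y + h*b) - f(x - h*a, y - h*b)) / (2*h)

def leibniz_form(p):
    """y * df/dx - x * df/dy, with the partials of the test f written out by hand."""
    x, y = p
    df_dx = 2*x*y
    df_dy = x**2 + math.cos(y)
    return y*df_dx - x*df_dy

p = (0.7, -1.3)
lhs = directional_derivative(f, p, V(*p))  # functional notation: Df(p).V(p)
rhs = leibniz_form(p)                      # Leibniz notation: (y d/dx - x d/dy)[f]
print(lhs, rhs)
```

The two printed numbers should agree up to the error of the finite-difference estimate.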

The Leibniz notation on manifolds

But there is more. We can consider a more general situation, where $V$ is a vector field on a manifold. In this case, because the tangent space to a manifold varies with each point $p$, we cannot fix certain basis vectors $e_{i}$ to describe our vector field anymore. Loosely speaking, the basis vectors now have to vary smoothly.

Suppose we have a coordinate system $\\{x^{i}\\}_{i=1,\\ldots,n}$ on the manifold. Then the tangent vector on the manifold corresponding to an infinitesimal change in $x^{i}$ is often written

\\frac{\\partial}{\\partial x^{i}}\\,.

This makes total sense, because one of the favorite ways to define tangent vectors on abstract manifolds is to identify them with directional derivatives. The basis $\\partial/\\partial x^{i}$ (which varies smoothly with $p$) then becomes the replacement for the fixed basis $e_{i}$ in Euclidean space. Note that this is consistent with the calculus notation $\\partial f/\\partial x^{i}$ for the partial derivative of $f$ with respect to the $x^{i}$ variable, because partial derivatives are merely directional derivatives with respect to the direction $e_{i}$.

If $\\{y^{j}\\}$ is another coordinate system for the manifold, then we have the formula (for the derivation, see the entry on vector fields (http://planetmath.org/VectorField))

\\frac{\\partial}{\\partial y^{j}}=\\frac{\\partial x^{i}}{\\partial y^{j}}\\frac{\\partial}{\\partial x^{i}}\\,.

(1)

We have used the Einstein summation convention above to emphasize the mnemonic cancelling of fractions.

The quantity $\\partial x^{i}/\\partial y^{j}$ is the directional derivative in the direction $\\partial/\\partial y^{j}$ of the $i$th coordinate function, mapping a point $p$ on the manifold to the coordinate $x^{i}$. Notice the subtle change of viewpoint here: we are not considering $x^{i}$ as mere “variables”, but as functions of the point $p$.

Formula (1), which is a linear combination of vectors in a tangent space, of course looks like the chain rule learned in elementary multivariate calculus, but it is much more than that: we are saying that the formula holds for general curvilinear coordinate systems on manifolds! This is definitely one of the virtues of the Leibnizian notation: it makes advanced concepts look simple.

As a simple example to get used to this notation, consider the function $f\\colon\\mathbb{R}^{3}\\to\\mathbb{R}$ defined by $f(x,y,z)=x^{2}+y^{2}+z^{2}$. The Euclidean space $\\mathbb{R}^{3}$ can also be thought of as a manifold, and suppose we use a spherical coordinate system $(r,\\theta,\\phi)$ on it.

Let us compute $\\partial f/\\partial r$. If $f$ is “viewed as a function of $(r,\\theta,\\phi)$”, then $f(r,\\theta,\\phi)=r^{2}$, so we certainly hope that $\\partial f/\\partial r=2r$ with our definition of directional derivatives. This easily follows from the formula for differential forms on manifolds:

df=\\frac{\\partial f}{\\partial r}\\,dr+\\frac{\\partial f}{\\partial\\theta}\\,d\\theta+\\frac{\\partial f}{\\partial\\phi}\\,d\\phi\\,.

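Indeed, $df=d(r^{2})=2r\\,dr$, and comparing coefficients of $dr$ gives $\\partial f/\\partial r=2r$. But let us see this by calculating from (1) too. We have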

x=r\\sin\\theta\\cos\\phi\\,,\\quad y=r\\sin\\theta\\sin\\phi\\,,\\quad z=r\\cos\\theta\\,,

\\frac{\\partial x}{\\partial r}=\\sin\\theta\\cos\\phi\\,,\\quad\\frac{\\partial y}{\\partial r}=\\sin\\theta\\sin\\phi\\,,\\quad\\frac{\\partial z}{\\partial r}=\\cos\\theta\\,,

and substituting in (1),

$\\displaystyle\\frac{\\partial f}{\\partial r}=\\frac{\\partial x}{\\partial r}\\frac{\\partial f}{\\partial x}+\\frac{\\partial y}{\\partial r}\\frac{\\partial f}{\\partial y}+\\frac{\\partial z}{\\partial r}\\frac{\\partial f}{\\partial z}$
$\\displaystyle{}=(\\sin\\theta\\cos\\phi)(2x)+(\\sin\\theta\\sin\\phi)(2y)+(\\cos\\theta)(2z)$
$\\displaystyle{}=2r^{-1}(xr\\sin\\theta\\cos\\phi+yr\\sin\\theta\\sin\\phi+zr\\cos\\theta)$
$\\displaystyle{}=2r^{-1}(x^{2}+y^{2}+z^{2})$
$\\displaystyle{}=2r\\,.$
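The spherical-coordinate computation of $\\partial f/\\partial r$ can be replayed numerically. The sketch below applies formula (1) to $f$ at one sample point; the function names and the sample values of $(r,\\theta,\\phi)$ are hypothetical choices for illustration only.

```python
import math

def to_cartesian(r, theta, phi):
    """Spherical coordinates (r, theta, phi) -> Cartesian point (x, y, z)."""
    return (r*math.sin(theta)*math.cos(phi),
            r*math.sin(theta)*math.sin(phi),
            r*math.cos(theta))

def grad_f(x, y, z):
    """Cartesian partials of f = x^2 + y^2 + z^2."""
    return (2*x, 2*y, 2*z)

def d_dr_components(r, theta, phi):
    """(dx/dr, dy/dr, dz/dr): the Cartesian components of d/dr."""
    return (math.sin(theta)*math.cos(phi),
            math.sin(theta)*math.sin(phi),
            math.cos(theta))

def df_dr(r, theta, phi):
    """Formula (1) applied to f: sum over i of (dx^i/dr) * (df/dx^i)."""
    point = to_cartesian(r, theta, phi)
    coeffs = d_dr_components(r, theta, phi)
    return sum(c*g for c, g in zip(coeffs, grad_f(*point)))

r, theta, phi = 1.7, 0.6, 2.1
value = df_dr(r, theta, phi)
print(value, 2*r)  # the two numbers should agree up to rounding
```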

Needless to say, this calculation can be done with the usual functional notation, but it will be somewhat clumsy. We have to say: let $\\alpha$ be the spherical coordinate parametrization $(r,\\theta,\\phi)\\mapsto(x,y,z)$; then $\\operatorname{D}_{1}(f\\circ\\alpha)(r,\\theta,\\phi)=\\operatorname{D}f(\\alpha(r,\\theta,\\phi))\\cdot\\operatorname{D}_{1}\\alpha(r,\\theta,\\phi)$ would be the quantity $\\partial f/\\partial r$.

That is not to say Leibnizian notation has no disadvantages. For example, one of the typical objections to the Leibnizian formula

\\frac{df}{dx}=\\frac{df}{dy}\\frac{dy}{dx}

is that the function $f$ means something different on the two sides of the equation. However, formula (1) partly escapes this objection: we can consider $f$ to be a function on a manifold, ignoring the vector space structure of $\\mathbb{R}^{3}$. Cartesian coordinates $(x,y,z)$ simply become one coordinate chart; spherical coordinates constitute another. Then $\\partial f/\\partial r$ (i.e. the directional derivative $\\partial/\\partial r$ applied to $f$) is a natural quantity to consider, rather than “the derivative of $f\\circ\\alpha$ with respect to the first variable” (which is what the functional notation says).

Physicists seem to grasp the Leibnizian formalism very readily, and it is a shame that many calculus textbooks do not fully explain the logic behind the formalism, probably because it looks unrigorous. The point we are trying to drive home here is that when differentials are suitably interpreted, they are rigorous.

The dual to the tangent vectors $\\partial/\\partial x^{i}$

There is a subtle ambiguity in the Leibniz notation that we should also discuss here. Suppose we are on a two-dimensional manifold with coordinates $u$ and $v$. The notation $\\partial f/\\partial u$ is ambiguous, because it implicitly depends on the $v$ coordinate as well. What we really mean when we refer to $\\partial f/\\partial u$ is the derivative of $f$ along a displacement in which $u$ changes at a uniform rate of 1 and $v$ does not change at all.

Say, for some bizarre reason, I decided to use a coordinate system on the Euclidean plane made up of the Euclidean $x$ coordinate and the radial $r$ coordinate. Now when I write $\\partial f/\\partial x$, I mean something quite different from what I mean when I write $\\partial f/\\partial x$ relative to the Euclidean coordinates. In the first instance $r$ is held fixed, so $y$ must change at the rate $\\partial y/\\partial x=-x/y$ (differentiate $x^{2}+y^{2}=r^{2}$), and the derivative is with respect to the vector field

e_{1}-(x/y)e_{2}\\,.

In the second instance, the derivative is with respect to the vector field

e_{1}+0e_{2}\\,.
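The difference between the two meanings of $\\partial f/\\partial x$ can be seen numerically. The sketch below is only an illustration under stated assumptions: it takes the hypothetical test function $f=y$, works in the upper half-plane $y>0$ so that $(x,r)$ determine the point, and uses central differences. With $r$ held fixed, $y$ changes at rate $-x/y$; with $y$ held fixed, it does not change at all.

```python
import math

def f(x, y):
    """Hypothetical test function; here simply f = y."""
    return y

def point_from_x_r(x, r):
    """Point with coordinates (x, r), assuming the upper half-plane y > 0."""
    return (x, math.sqrt(r*r - x*x))

def d_dx_holding_r(f, x, r, h=1e-6):
    """Estimate of df/dx in the (x, r) system: vary x, keep r fixed."""
    return (f(*point_from_x_r(x + h, r)) - f(*point_from_x_r(x - h, r))) / (2*h)

def d_dx_holding_y(f, x, y, h=1e-6):
    """Estimate of df/dx in the Euclidean system: vary x, keep y fixed."""
    return (f(x + h, y) - f(x - h, y)) / (2*h)

x, y = 0.6, 0.8
r = math.hypot(x, y)
with_r_fixed = d_dx_holding_r(f, x, r)  # close to -x/y
with_y_fixed = d_dx_holding_y(f, x, y)  # zero, since f = y does not see x
print(with_r_fixed, with_y_fixed)
```

Since $f=y$, each printed value is just the $e_{2}$ component of the corresponding vector field at the sample point.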

On closer thought, we can see that $\\partial/\\partial x$ in the elementary calculus interpretation has the same ambiguity, but the problem is so trivial that we often forget that it exists. For instance, if we have a function $f=xyz$, and we stipulate that also $y=x^{2}$, then obviously $\\partial f/\\partial x\\neq yz$, because $y$ is changing at the same time as $x$. The definition of a partial derivative with respect to $x$ is the derivative when $x$ changes and all the other variables are held fixed. So this rule should be applied when working with the tangent vectors $\\partial/\\partial x^{i}$ on a manifold too.

If we agree to use different letters for each coordinate system, and not mix them up (always a reasonable thing to do), then we will not make any mistakes arising from this ambiguity with the Leibniz notation.

Another way to understand the ambiguity is as follows. In a vector space, there is no natural isomorphism between it and its dual space (unless we bring in extra structure such as an inner product). On manifolds, the role of the dual space is taken by the space of differential one-forms. (See differential forms (http://planetmath.org/DifferentialForms) for the rigorous details.) A basis for this dual space is $dx^{j}$ for $j=1,\\ldots,n$. The basis of the tangent space that is dual to the $dx^{j}$ consists of the $\\partial/\\partial x^{i}$:

dx^{j}\\left(\\frac{\\partial}{\\partial x^{i}}\\right)=\\delta_{i}^{j}\\quad\\text{(Kronecker delta).}

(2)

But if we are given a lone element $dx^{j}$, we cannot produce a unique vector $\\partial/\\partial x^{j}$ from it (i.e. there is no isomorphism); we need to be given the entire basis $\\{dx^{j}\\}_{j=1,\\ldots,n}$. So it is not surprising that each vector $\\partial/\\partial x^{i}$ should depend on the whole coordinate system.

Motivation for the notation of differential forms

Incidentally, the formula (2) explains the following identity involving differential forms (often seen in calculus textbooks with hardly any explanation of what it means):

df=\\frac{\\partial f}{\\partial x^{j}}\\,dx^{j}\\,.

(3)

The various $d$’s floating around obscure the essential idea somewhat, but the derivation of this formula is basic linear algebra. Since $df$ is a linear functional on the tangent space, defined by $df(V)=V[f]$, it can be written as a linear combination of the dual basis $dx^{j}$. That is, for some $a_{j}$, we have

df=a_{j}\\,dx^{j}\\,.

And these $a_{j}$ are solved for by evaluating at the tangent vectors $V=\\partial/\\partial x^{i}$:

\\frac{\\partial f}{\\partial x^{i}}=\\frac{\\partial}{\\partial x^{i}}[f]=df\\left(\\frac{\\partial}{\\partial x^{i}}\\right)=a_{j}\\,dx^{j}\\left(\\frac{\\partial}{\\partial x^{i}}\\right)=a_{j}\\delta_{i}^{j}=a_{i}\\,,

giving formula (3).
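The linear algebra behind formula (3) can be mimicked in a few lines. The sketch below works at a single point of the plane and represents tangent vectors by their components in the basis $\\partial/\\partial x^{i}$; the test function $f(x,y)=x^{2}y$, the point and the tangent vector are hypothetical choices for illustration.

```python
def partials(p):
    """Partials of the hypothetical f(x, y) = x^2 * y at p, written out by hand."""
    x, y = p
    return (2*x*y, x**2)

def dx(j):
    """Coordinate one-form dx^j: picks out the j-th component of a tangent vector."""
    return lambda v: v[j]

def basis_vector(i, n=2):
    """Components of d/dx^i in the coordinate basis."""
    return tuple(1.0 if k == i else 0.0 for k in range(n))

def df(p):
    """df at p, acting on tangent vectors by df(V) = V[f]."""
    grads = partials(p)
    return lambda v: sum(g*c for g, c in zip(grads, v))

p = (1.5, -2.0)

# Formula (2): dx^j(d/dx^i) is the Kronecker delta.
delta = [[dx(j)(basis_vector(i)) for i in range(2)] for j in range(2)]

# Solve for the coefficients a_i by evaluating df on the basis vectors.
a = [df(p)(basis_vector(i)) for i in range(2)]

# Rebuild df as a_j dx^j and compare with df on an arbitrary tangent vector.
v = (0.3, 4.0)
rebuilt = sum(a[j]*dx(j)(v) for j in range(2))
print(delta, a, rebuilt, df(p)(v))
```

The coefficients `a` come out equal to the partial derivatives of `f` at `p`, which is exactly the content of formula (3).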

For those who are not familiar with the language of differential forms, the definition $df(V)=V[f]$ just used might seem to be somewhat artificial, designed solely to make the classical formula (3) work out. The following comment by Spivak [2] might help clarify matters:

Classical differential geometers (and classical analysts) did not hesitate to talk about “infinitely small” changes $dx^{i}$ of the coordinates $x^{i}$, just as Leibniz had. No one wanted to admit that this was nonsense, because true results were obtained when these infinitely small quantities were divided into each other (provided one did it in the right way). Eventually it was realized that the closest one can come to describing an infinitely small change is to describe a direction in which this change is supposed to occur, i.e. a tangent vector. Since $df$ is supposed to be the infinitesimal change of $f$ under an infinitesimal change of the point, $df$ must be a function of this change, which means that $df$ should be a function on tangent vectors. The $dx^{i}$ themselves then metamorphosed into functions, and it became clear that they must be distinguished from the tangent vectors $\\partial/\\partial x^{i}$. Once this realization came, it was only a matter of making new definitions, which preserved the old notation, and waiting for everybody to catch up. In short, all classical notions involving infinitely small quantities became functions on tangent vectors, like $df$, except for quotients of infinitely small quantities, which became tangent vectors, like $dc/dt$.

We can also give an analogy as follows (also from [2]). Suppose $f$ is a function on a manifold. Let “$x^{i}=x^{i}(t)$” be a curve on this manifold (this is the classical notation). Then by the chain rule,

\\frac{df}{dt}=\\frac{\\partial f}{\\partial x^{i}}\\frac{dx^{i}}{dt}\\,,

where $f$ on the left side really means $f(x^{i}(t))$. The formal identity obtained by multiplying both sides by $dt$,

df=\\frac{\\partial f}{\\partial x^{i}}\\,dx^{i}\\,,

means that “true results are obtained by dividing by $dt$ again, no matter what the functions $x^{i}(t)$ are.” Also, the left- and right-hand sides individually do not depend on any particular curve $x^{i}(t)$ at all, but really just on the tangent vectors $dx^{i}/dt$ to that curve. Again, this leads us to the realization that $df$ and $dx^{i}$ should be treated as functions of a tangent vector.
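The claim that $df/dt$ depends only on the tangent vector, and not on the particular curve, can be tested numerically. In the sketch below, the test function and the two curves are hypothetical examples: both curves pass through the point $(1,2)$ at $t=0$ with velocity $(3,-1)$ but differ at second order, and $df/dt$ is estimated by a central difference.

```python
import math

def f(x, y):
    """Hypothetical test function on the plane."""
    return x*y + math.exp(x)

def curve_a(t):
    """Straight line through (1, 2) with velocity (3, -1)."""
    return (1 + 3*t, 2 - t)

def curve_b(t):
    """A different curve with the same position and velocity at t = 0."""
    return (1 + 3*t + 5*t**2, 2 - t + math.sin(t)**2)

def rate_along(curve, h=1e-6):
    """Central-difference estimate of d/dt f(curve(t)) at t = 0."""
    return (f(*curve(h)) - f(*curve(-h))) / (2*h)

def df_on_tangent(p, v):
    """(df/dx^i)(dx^i/dt), using the hand-written partials of f."""
    x, y = p
    return (y + math.exp(x))*v[0] + x*v[1]

rate_a = rate_along(curve_a)
rate_b = rate_along(curve_b)
expected = df_on_tangent((1.0, 2.0), (3.0, -1.0))
print(rate_a, rate_b, expected)
```

All three numbers should agree up to the finite-difference error, even though the two curves are different.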

References

1 Vladimir I. Arnol’d (trans. Roger Cooke). Ordinary Differential Equations. Springer-Verlag, 1992.
2 Michael Spivak. A Comprehensive Introduction to Differential Geometry, Volume I. Publish or Perish, 1979.
3 Michael Spivak. Calculus on Manifolds. Perseus, 1965.
