conditional distribution of multi-variate normal variable


Theorem.

Let $X$ be a random variable taking values in $\mathbb{R}^n$, normally distributed with a non-singular covariance matrix $\Sigma$ and a mean of zero.

Suppose $Y$ is defined by $Y = B^{*}X$ for some linear transformation $B\colon \mathbb{R}^k \to \mathbb{R}^n$ of maximum rank. (Here ${}^{*}$ denotes the transpose operator.)

Then the distribution of $X$ conditioned on $Y$ is multi-variate normal, with conditional mean and covariance:

$$\mathbb{E}[X \mid Y] = \Sigma B (B^{*}\Sigma B)^{-1} Y, \qquad \operatorname{Var}[X \mid Y] = \Sigma - \Sigma B (B^{*}\Sigma B)^{-1} (\Sigma B)^{*}.$$

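If $k = 1$, so that $B$ is simply a vector in $\mathbb{R}^n$, these formulas reduce to:

$$\mathbb{E}[X \mid Y] = \frac{\Sigma B\, Y}{\operatorname{Var}[Y]}, \qquad \operatorname{Var}[X \mid Y] = \Sigma - \frac{\Sigma B B^{*} \Sigma}{\operatorname{Var}[Y]}.$$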

If $X$ does not have zero mean, then the formula for $\mathbb{E}[X \mid Y]$ is modified by adding $\mathbb{E}[X]$ and replacing $Y$ by $Y - \mathbb{E}[Y]$, and the formula for $\operatorname{Var}[X \mid Y]$ is unchanged.
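As a small illustration of the $k = 1$ formulas, the following sketch evaluates the conditional mean and covariance in plain Python. The function name `conditional_given_scalar`, the helper names, and the example values of `sigma`, `b`, and `y` are assumptions made for this illustration and are not part of the original entry.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def conditional_given_scalar(sigma, b, y, mean=None):
    """Conditional mean and covariance of X given the scalar Y = b . X = y.

    Implements the k = 1 case: E[X|Y] = E[X] + Sigma b (y - E[Y]) / Var[Y]
    and Var[X|Y] = Sigma - (Sigma b)(Sigma b)^T / Var[Y].
    """
    n = len(b)
    mean = mean or [0.0] * n
    sb = mat_vec(sigma, b)            # Sigma B
    var_y = dot(b, sb)                # Var[Y] = B* Sigma B
    y_centered = y - dot(b, mean)     # Y - E[Y]
    cond_mean = [m + s * y_centered / var_y for m, s in zip(mean, sb)]
    cond_cov = [[sigma[i][j] - sb[i] * sb[j] / var_y for j in range(n)]
                for i in range(n)]
    return cond_mean, cond_cov


# Hypothetical example: Y = X1 + X2 observed to equal 4.
sigma = [[2.0, 1.0], [1.0, 2.0]]
b = [1.0, 1.0]
cond_mean, cond_cov = conditional_given_scalar(sigma, b, y=4.0)
```

In this example the conditional covariance is singular in the direction of `b`, as one would expect, since $Y = B^{*}X$ is known exactly once we condition on it.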

Proof.

We split up $X$ into two stochastically independent parts, the first part containing exactly the information embodied in $Y$. Then the conditional distribution of $X$ given $Y$ is simply the unconditional distribution of the second part that is independent of $Y$.

To this end, we first change variables to express everything in terms of a standard multi-variate normal $Z$. Let $A\colon \mathbb{R}^n \to \mathbb{R}^n$ be a "square root" factorization of the covariance matrix $\Sigma$, so that:

$$AA^{*} = \Sigma, \qquad Z = A^{-1}X, \qquad X = AZ, \qquad Y = B^{*}AZ.$$
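One concrete way to obtain such a square root is the Cholesky factorization. The sketch below is an illustrative plain-Python version; the function name `cholesky` and the example matrix are assumptions for this illustration, and any other factor with the same product (for instance the symmetric square root) would serve the proof equally well.

```python
import math


def cholesky(sigma):
    """Return a lower-triangular A with A A^T = sigma.

    Assumes sigma is symmetric and positive definite.
    """
    n = len(sigma)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Contribution of the columns already computed.
            s = sum(a[i][k] * a[j][k] for k in range(j))
            if i == j:
                a[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                a[i][j] = (sigma[i][j] - s) / a[j][j]
    return a


# Hypothetical example covariance matrix.
sigma = [[4.0, 2.0], [2.0, 5.0]]
A = cholesky(sigma)

# Reconstruct A A^T to confirm it reproduces sigma.
reconstructed = [[sum(A[i][k] * A[j][k] for k in range(2)) for j in range(2)]
                 for i in range(2)]
```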

We let $H\colon \mathbb{R}^n \to \mathbb{R}^n$ be the orthogonal projection onto the range of $A^{*}B\colon \mathbb{R}^k \to \mathbb{R}^n$, and decompose $Z$ into orthogonal components:

$$Z = HZ + (I-H)Z.$$

It is intuitively obvious that orthogonality of the two random normal vectors implies their stochastic independence. To show this formally, observe that the Gaussian density function for $Z$ factors into a product:

$$(2\pi)^{-n/2} \exp\!\left(-\tfrac{1}{2}\|z\|^2\right) = (2\pi)^{-n/2} \exp\!\left(-\tfrac{1}{2}\|Hz\|^2\right) \exp\!\left(-\tfrac{1}{2}\|(I-H)z\|^2\right).$$

We can construct an orthonormal system of coordinates on $\mathbb{R}^n$ under which the components for $Hz$ are completely disjoint from those components of $(I-H)z$. On the other hand, the densities for $Z$, $HZ$, and $(I-H)Z$ remain invariant even after changing coordinates, because they are radially symmetric. Hence the variables $HZ$ and $(I-H)Z$ are separable in their joint density and they are independent.

$HZ$ embodies the information in the linear combination $Y = B^{*}AZ$. For we have the identity:

$$Y = (B^{*}A)Z = (B^{*}A)\bigl(HZ + (I-H)Z\bigr) = (B^{*}A)HZ + 0.$$

The last term is null because $(I-H)Z$ is orthogonal to the range of $A^{*}B$ by definition. (Equivalently, $(I-H)Z$ lies in the kernel of $(A^{*}B)^{*} = B^{*}A$.) Thus $Y$ can always be recovered by a linear transformation on $HZ$.

Conversely, $Y$ completely determines $HZ$, from the analytical expression for $H$ that we now give. In general, the orthogonal projection onto the range of an injective transformation $T$ is $T(T^{*}T)^{-1}T^{*}$. Applying this to $T = A^{*}B$, we have

$$H = A^{*}B(B^{*}AA^{*}B)^{-1}B^{*}A = A^{*}B(B^{*}\Sigma B)^{-1}B^{*}A.$$

We see that $HZ = A^{*}B(B^{*}\Sigma B)^{-1}Y$.
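The projection formula can be checked exactly on a small case. The sketch below uses rational arithmetic from Python's `fractions` module with $k = 1$; the matrices `A` and `B` and the helper names `transpose` and `matmul` are assumptions chosen for this illustration, not values taken from the entry.

```python
from fractions import Fraction as F


def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(p, q):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*q)]
            for row in p]


# Assumed example: A is a square root of Sigma = [[4, 2], [2, 5]].
A = [[F(2), F(0)], [F(1), F(2)]]
B = [[F(1)], [F(1)]]               # n x k with k = 1, so Y = X1 + X2

T = matmul(transpose(A), B)        # T = A* B, an n x 1 column
TtT = matmul(transpose(T), T)      # T* T = B* Sigma B, a 1 x 1 matrix
inv = F(1) / TtT[0][0]             # inverse of that 1 x 1 matrix
H = [[inv * x for x in row] for row in matmul(T, transpose(T))]

I = [[F(1), F(0)], [F(0), F(1)]]
I_minus_H = [[I[i][j] - H[i][j] for j in range(2)] for i in range(2)]

# Properties used in the proof:
idempotent = matmul(H, H) == H                    # H^2 = H
symmetric = transpose(H) == H                     # H = H*
annihilated = matmul(transpose(T), I_minus_H)     # B* A (I - H), should be zero
```

Exact fractions are used here so that the identities $H^2 = H = H^{*}$ and $B^{*}A(I-H) = 0$ hold as equalities rather than up to rounding.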

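We have proved that conditioning on $Y$ and conditioning on $HZ$ are equivalent, and so:

$$\mathbb{E}[Z \mid Y] = \mathbb{E}[Z \mid HZ] = \mathbb{E}[HZ + (I-H)Z \mid HZ] = HZ + 0,$$

and

$$\begin{aligned}
\operatorname{Var}[Z \mid Y] = \operatorname{Var}[Z \mid HZ] &= \operatorname{Var}[HZ + (I-H)Z \mid HZ] \\
&= 0 + \operatorname{Var}[(I-H)Z] \\
&= \mathbb{E}[(I-H)ZZ^{*}(I-H)^{*}] \\
&= (I-H)(I-H)^{*} \\
&= I - H - H^{*} + HH^{*} = I - H,
\end{aligned}$$

using the defining property $H^2 = H = H^{*}$ of orthogonal projections.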

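Now we express the result in terms of $X$, and remove the dependence on the transformation $A$ (which is not uniquely defined from the covariance matrix):

$$\mathbb{E}[X \mid Y] = A\,\mathbb{E}[Z \mid Y] = AHZ = \Sigma B(B^{*}\Sigma B)^{-1}Y$$

and

$$\operatorname{Var}[X \mid Y] = A \operatorname{Var}[Z \mid Y] A^{*} = AA^{*} - AHA^{*} = \Sigma - \Sigma B(B^{*}\Sigma B)^{-1}B^{*}\Sigma.$$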
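To illustrate that the final covariance does not depend on which square root $A$ is used, the following sketch computes $A(I-H)A^{*}$ for two different factors of the same $\Sigma$ and compares both with the closed form. The matrices `A1`, `Q`, `B` and the function name `cond_cov_via_root` are assumptions for this example; exact fractions are used so the comparison is an equality.

```python
from fractions import Fraction as F


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(p, q):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*q)]
            for row in p]


def cond_cov_via_root(A, B):
    """A (I - H) A*, with H the projection onto range(A* B); k = 1 only."""
    n = len(A)
    T = matmul(transpose(A), B)                      # n x 1 column
    scale = F(1) / matmul(transpose(T), T)[0][0]     # (B* Sigma B)^{-1}
    i_minus_h = [[(1 if i == j else 0) - scale * T[i][0] * T[j][0]
                  for j in range(n)] for i in range(n)]
    return matmul(matmul(A, i_minus_h), transpose(A))


A1 = [[F(2), F(0)], [F(1), F(2)]]                    # one square root of Sigma
Q = [[F(3, 5), F(-4, 5)], [F(4, 5), F(3, 5)]]        # a rational rotation matrix
A2 = matmul(A1, Q)                                   # another square root
B = [[F(1)], [F(1)]]

Sigma = matmul(A1, transpose(A1))
SB = matmul(Sigma, B)                                # Sigma B
var_y = matmul(transpose(B), SB)[0][0]               # B* Sigma B
direct = [[Sigma[i][j] - SB[i][0] * SB[j][0] / var_y for j in range(2)]
          for i in range(2)]

same_for_A1 = cond_cov_via_root(A1, B) == direct
same_for_A2 = cond_cov_via_root(A2, B) == direct
```

Since `A2` differs from `A1` by an orthogonal factor, both satisfy $AA^{*} = \Sigma$, and the two intermediate projections $H$ differ even though the resulting conditional covariance agrees.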

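Of course, conditional on $Y$ the first term of $X = AHZ + A(I-H)Z$ is fixed, so the conditional distribution of $X$ given $Y$ is that of an affine image of the normal vector $(I-H)Z$, which is again multi-variate normal.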
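The splitting of $X$ into a part determined by $Y$ and an independent remainder can also be checked empirically. The sketch below is a rough Monte Carlo illustration using only the standard library; the covariance matrix, the choice $B = (1, 1)$, the sample size, and the seed are all assumptions made for the example. It simulates $X = AZ$, subtracts the conditional mean $\Sigma B\,Y / \operatorname{Var}[Y]$, and estimates the covariance of the remainder with $Y$ and with itself.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

# Assumed example: Sigma = [[4, 2], [2, 5]] with square root A, and B = (1, 1).
A = [[2.0, 0.0], [1.0, 2.0]]
SIGMA_B = [6.0, 7.0]   # Sigma B
VAR_Y = 13.0           # B* Sigma B

N = 20000
ys, residuals = [], []
for _ in range(N):
    z = [random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)]   # standard normal Z
    x = [A[0][0] * z[0] + A[0][1] * z[1],
         A[1][0] * z[0] + A[1][1] * z[1]]                   # X = A Z
    y = x[0] + x[1]                                         # Y = B* X
    ys.append(y)
    # Remainder after removing the conditional mean E[X | Y].
    residuals.append([x[i] - SIGMA_B[i] * y / VAR_Y for i in range(2)])


def sample_cov(u, v):
    """Unbiased sample covariance of two equally long sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


r0 = [r[0] for r in residuals]
r1 = [r[1] for r in residuals]
cov_r0_y = sample_cov(r0, ys)     # theory: 0
cov_r1_y = sample_cov(r1, ys)     # theory: 0
var_r0 = sample_cov(r0, r0)       # theory: 16/13
cov_r0_r1 = sample_cov(r0, r1)    # theory: -16/13
```

The theoretical values in the comments come from $\Sigma - \Sigma B B^{*}\Sigma / \operatorname{Var}[Y]$ for this particular $\Sigma$ and $B$; the sample estimates should only be expected to match them up to sampling error.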

The formula in the statement of this theorem, for the single-dimensional case, follows from substituting in $\operatorname{Var}[Y] = \operatorname{Var}[B^{*}X] = B^{*}\Sigma B$. The formula for when $X$ does not have zero mean follows from applying the base case to the shifted variable $X - \mathbb{E}[X]$. ∎
