2. Stochastic maps


Any conditional distribution $p(y|x)$ on finite sets $X$ and $Y$ can be represented as a matrix as follows. Let $\mathcal{V}X=\{\varphi:X\to\mathbb{R}\}$ denote the vector space of real-valued functions on $X$, and similarly for $Y$. $\mathcal{V}X$ is equipped with the Dirac basis $\{\delta_x:X\to\mathbb{R}\mid x\in X\}$, where

$$\delta_x(x')=\begin{cases}1 & \text{if } x=x'\\ 0 & \text{else.}\end{cases}$$

Given a conditional distribution $p(y|x)$, construct the matrix $\mathfrak{m}_p$ with entry $p(y|x)$ in column $\delta_x$ and row $\delta_y$. Matrix $\mathfrak{m}_p$ is stochastic: it has nonnegative entries and its columns sum to 1. Conversely, given a stochastic matrix $\mathfrak{m}:\mathcal{V}X\to\mathcal{V}Y$, we can recover the conditional distribution. The Dirac basis induces the Euclidean metric

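$$\langle\bullet|\bullet\rangle:\mathcal{V}X\otimes\mathcal{V}X\to\mathbb{R}:\Big\langle\sum_x\alpha_x\delta_x\,\Big|\,\sum_x\beta_x\delta_x\Big\rangle=\sum_x\alpha_x\beta_x \tag{1}$$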

which identifies vector spaces with their duals, $\mathcal{V}X\cong(\mathcal{V}X)^*$. Let $p_{\mathfrak{m}}(y|x):=\langle\delta_y|\mathfrak{m}(\delta_x)\rangle$.
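The correspondence between conditional distributions and stochastic matrices can be sketched in a few lines of Python. This is only an illustration: the sets `X`, `Y`, the probabilities in `p`, and all function names are assumptions introduced here, and exact fractions are used so that the column sums can be checked without rounding.

```python
from fractions import Fraction as F

# Hypothetical finite sets and conditional distribution p(y|x); values are illustrative.
X = ["x0", "x1", "x2"]
Y = ["y0", "y1"]
p = {
    "x0": {"y0": F(1), "y1": F(0)},
    "x1": {"y0": F(1, 2), "y1": F(1, 2)},
    "x2": {"y0": F(1, 4), "y1": F(3, 4)},
}

def matrix_of(p, X, Y):
    """Matrix m_p with entry p(y|x) in row delta_y and column delta_x."""
    return [[p[x][y] for x in X] for y in Y]

def is_stochastic(m):
    """Nonnegative entries and every column sums to 1."""
    nonneg = all(e >= 0 for row in m for e in row)
    return nonneg and all(sum(col) == 1 for col in zip(*m))

def dirac(i, n):
    """Dirac basis vector delta_i of an n-dimensional function space."""
    return [F(int(j == i)) for j in range(n)]

def act(m, v):
    """Matrix-vector product m(v)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def inner(u, v):
    """Euclidean pairing <u|v> induced by the Dirac basis, equation (1)."""
    return sum(a * b for a, b in zip(u, v))

m = matrix_of(p, X, Y)

def recovered(y, x):
    """p_m(y|x) := <delta_y | m(delta_x)>."""
    return inner(dirac(Y.index(y), len(Y)), act(m, dirac(X.index(x), len(X))))
```

Here `recovered(y, x)` returns the same value as `p[x][y]` for every pair, which is the round trip described above.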

Definition 2.

The category of stochastic maps $\mathtt{Stoch}$ has function spaces $\mathcal{V}X$ for objects and stochastic matrices $\mathfrak{m}:\mathcal{V}X\to\mathcal{V}Y$ with respect to Dirac bases for arrows. We identify $(\mathcal{V}X)^*$ with $\mathcal{V}X$ using the Dirac basis without further comment below.

Definition 3.

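The dual of a surjective stochastic map $\mathfrak{m}:\mathcal{V}X\to\mathcal{V}Y$ is the composition $\mathfrak{m}^{\natural}:=\mathcal{V}Y\xrightarrow{\;\mathfrak{m}^{*}\circ\,\mathrm{ren}\;}\mathcal{V}X$, where $\mathrm{ren}$ is the unique map making the defining diagram commute (diagram not reproduced).

The stochastic dual $\mathfrak{m}^{\natural}$ is the transpose $\mathfrak{m}^{*}$ of $\mathfrak{m}$ with columns renormalized to sum to 1: precomposing $\mathfrak{m}^{*}$ with $\mathrm{ren}$ performs the renormalization. (If $\mathfrak{m}$ is not surjective, i.e. if one of its rows has all zero entries, then the renormalization is not well-defined.) The stochastic dual of a stochastic transform is stochastic; further, $(\mathfrak{m}^{\natural})^{\natural}=\mathfrak{m}$ whenever all rows of $\mathfrak{m}$ have the same sum, for instance when $\mathfrak{m}$ is doubly stochastic.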
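A minimal sketch of the stochastic dual, assuming matrices are stored as lists of rows with rows indexed by $Y$ and columns by $X$. The function name `dual` and the example matrix are illustrative choices, not taken from the source.

```python
from fractions import Fraction as F

def dual(m):
    """Transpose m, then renormalize each column of the transpose to sum to 1.

    A column of the transpose is a row of m, so a zero row of m
    (m not surjective) cannot be renormalized.
    """
    row_sums = [sum(row) for row in m]
    if any(s == 0 for s in row_sums):
        raise ValueError("m is not surjective: the dual is not well-defined")
    # Entry (x, y) of the dual is m[y][x] divided by the y-th row sum of m.
    return [[m[y][x] / row_sums[y] for y in range(len(m))]
            for x in range(len(m[0]))]

# Hypothetical stochastic matrix (columns sum to 1, no zero row).
m = [[F(1), F(1, 2)],
     [F(0), F(1, 2)]]

d = dual(m)   # [[2/3, 0], [1/3, 1]]; each column again sums to 1
```

Dividing by the row sums of `m` is the same as renormalizing the columns of the transpose, which is why the surjectivity check looks at rows.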

Category $\mathtt{Stoch}$ is described in terms of braid-like generators and relations in [2]. A more general, but also more complicated, category of conditional distributions was introduced by Giry [3]; see [5].

Example 1 (deterministic functions).

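Let $\mathtt{FSet}$ be the category of finite sets. Define the faithful functor $\mathcal{V}:\mathtt{FSet}\to\mathtt{Stoch}$ taking set $X$ to $\mathcal{V}X$ and function $f:X\to Y$ to the stochastic map $\mathcal{V}f:\mathcal{V}X\to\mathcal{V}Y:\delta_x\mapsto\delta_{f(x)}$. It is easy to see that $\mathcal{V}(X\times Y)=\mathcal{V}X\otimes\mathcal{V}Y$ and $\mathcal{V}(X\sqcup Y)=\mathcal{V}X\times\mathcal{V}Y$.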
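To make the functor concrete, the sketch below builds the 0/1 matrix of a function and checks that composition of functions corresponds to matrix multiplication. The sets, the functions `f` and `g` (stored as dictionaries), and the helper names are hypothetical.

```python
from fractions import Fraction as F

def V(f, X, Y):
    """Stochastic matrix of f: X -> Y, sending delta_x to delta_{f(x)}."""
    return [[F(int(f[x] == y)) for x in X] for y in Y]

def matmul(a, b):
    """Plain matrix product of two lists-of-rows matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

X, Y, Z = [0, 1, 2], ["a", "b"], ["u", "v"]
f = {0: "a", 1: "a", 2: "b"}      # f: X -> Y
g = {"a": "v", "b": "u"}          # g: Y -> Z
gf = {x: g[f[x]] for x in X}      # composite g after f: X -> Z

Vf, Vg, Vgf = V(f, X, Y), V(g, Y, Z), V(gf, X, Z)
```

Each column of `Vf` contains a single 1, so `Vf` is stochastic, and `matmul(Vg, Vf)` agrees with `Vgf`.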

We introduce special notation for commonly used functions:

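  • Set inclusion. For any inclusion $i:X\hookrightarrow Y$ of sets, let $\iota:=\mathcal{V}i:\mathcal{V}X\to\mathcal{V}Y$ denote the corresponding stochastic map. Two important examples are

    • Point inclusion. Given $x\in X$, define $\iota_x:\mathbb{R}\to\mathcal{V}X:1\mapsto\delta_x$.

    • Diagonal map. Inclusion $\Delta:X\hookrightarrow X\times X:x\mapsto(x,x)$ induces $\iota_\Delta:\mathcal{V}X\to\mathcal{V}X\otimes\mathcal{V}X:\delta_x\mapsto\delta_x\otimes\delta_x$.

  • Terminal map. Let $\omega_X:\mathcal{V}X\to\mathbb{R}:\delta_x\mapsto 1$ denote the terminal map induced by $X\to\{\bullet\}$.

  • Projection. Let $\pi_{XY,X}:\mathcal{V}X\otimes\mathcal{V}Y\to\mathcal{V}X:\delta_x\otimes\delta_y\mapsto\delta_x$ denote the projection induced by $\mathrm{pr}_{X\times Y,X}:X\times Y\to X:(x,y)\mapsto x$.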

Proposition 1 (dual is Bayes over uniform distribution).

The dual of a stochastic map applies Bayes' rule to compute the posterior distribution $\langle\mathfrak{m}^{\natural}(\delta_y)|\delta_x\rangle=p_{\mathfrak{m}}(x|y)$ using the uniform probability distribution.

Proof: The uniform distribution is the dual $\omega_X^{\natural}:\mathbb{R}\to\mathcal{V}X:1\mapsto\frac{1}{|X|}\sum_x\delta_x$ of the terminal map $\omega_X:\mathcal{V}X\to\mathbb{R}$. It assigns equal probability $p_\omega(x)=\frac{1}{|X|}$ to all of $X$'s elements, and can be characterized as the maximally uninformative distribution [4]. Let $\mathfrak{m}:\mathcal{V}X\to\mathcal{V}Y$. The normalized transpose is

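$$\mathfrak{m}^{\natural}(\delta_y)=\sum_x\frac{p_{\mathfrak{m}}(y|x)}{\sum_{x'}p_{\mathfrak{m}}(y|x')}\,\delta_x=\sum_x\frac{p_{\mathfrak{m}}(y|x)\,p_\omega(x)}{\sum_{x'}p_{\mathfrak{m}}(y|x')\,p_\omega(x')}\,\delta_x=\sum_x p_{\mathfrak{m}}(x|y)\,\delta_x.$$

Remark 1.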

Note that $p_{\mathfrak{m}}(x|y):=\langle\mathfrak{m}^{\natural}(\delta_y)|\delta_x\rangle\neq\langle\delta_y|\mathfrak{m}(\delta_x)\rangle=:p_{\mathfrak{m}}(y|x)$. Dirac's bra-ket notation must be used with care since stochastic matrices are not necessarily symmetric [1].
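The following sketch checks Proposition 1 numerically on a small example: the renormalized transpose coincides with the Bayes posterior computed from a uniform prior, and the posterior $p(x|y)$ generally differs from the likelihood $p(y|x)$. The matrix `m` and the function names are illustrative assumptions.

```python
from fractions import Fraction as F

def dual(m):
    """Transpose of m with columns renormalized to sum to 1 (m assumed surjective)."""
    row_sums = [sum(row) for row in m]
    return [[m[y][x] / row_sums[y] for y in range(len(m))]
            for x in range(len(m[0]))]

def bayes_uniform(m):
    """Posterior p(x|y) from likelihood m[y][x] = p(y|x) and uniform prior 1/|X|."""
    n = len(m[0])
    prior = F(1, n)
    return [[m[y][x] * prior / sum(m[y][k] * prior for k in range(n))
             for y in range(len(m))]
            for x in range(n)]

# Hypothetical p(y|x): rows indexed by y, columns by x.
m = [[F(1), F(1, 2), F(1, 4)],
     [F(0), F(1, 2), F(3, 4)]]

posterior = dual(m)   # posterior[x][y] is p(x|y)
```

For this `m`, `posterior[0][0]` is 4/7 while `m[0][0]` is 1, a concrete instance of the asymmetry noted in Remark 1.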

Corollary 2 (preimages).

The dual $(\mathcal{V}f)^{\natural}:\mathcal{V}Y\to\mathcal{V}X$ of the stochastic map $\mathcal{V}f:\mathcal{V}X\to\mathcal{V}Y$ is the conditional distribution

$$p_{\mathcal{V}f}(x|y)=\begin{cases}\dfrac{1}{|f^{-1}(y)|} & \text{if } f(x)=y\\[4pt] 0 & \text{else.}\end{cases} \tag{2}$$

Proof: By the proof of Proposition 1,

$$(\mathcal{V}f)^{\natural}(\delta_y)=\frac{1}{|f^{-1}(y)|}\sum_{\{x\,|\,f(x)=y\}}\delta_x.$$

The support of $p_{\mathcal{V}f}(X|y)$ is $f^{-1}(y)$. Elements in the support are assigned equal probability, thereby treating them as an undifferentiated list. The dual $(\mathcal{V}f)^{\natural}$ thus generalizes the inverse image $f^{-1}:Y\to 2^X$. Conveniently, however, the dual $(\mathcal{V}f)^{\natural}$ simply flips the domain and range of $\mathcal{V}f$, whereas the inverse image maps to the powerset $2^X$, an entirely new object.
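As a small check of Corollary 2, the sketch below takes a surjective function on a three-element set, forms the dual of its matrix, and compares the support of each column with the corresponding preimage. The function `f` and the helper names are hypothetical.

```python
from fractions import Fraction as F

def V(f, X, Y):
    """Stochastic matrix of f: X -> Y, sending delta_x to delta_{f(x)}."""
    return [[F(int(f[x] == y)) for x in X] for y in Y]

def dual(m):
    """Transpose of m with columns renormalized to sum to 1 (m assumed surjective)."""
    row_sums = [sum(row) for row in m]
    return [[m[y][x] / row_sums[y] for y in range(len(m))]
            for x in range(len(m[0]))]

def preimage(f, y):
    """Inverse image of y under f, as a subset of the domain."""
    return {x for x in f if f[x] == y}

X, Y = [0, 1, 2], ["a", "b"]
f = {0: "a", 1: "a", 2: "b"}   # surjective, so the dual is defined

d = dual(V(f, X, Y))           # d[i][j] = p(x_i | y_j)

# Support of p(. | y), read off from the columns of the dual matrix.
support = {y: {X[i] for i in range(len(X)) if d[i][j] > 0}
           for j, y in enumerate(Y)}
```

The column for `"a"` puts probability 1/2 on each of the two elements of its preimage, and the column for `"b"` puts probability 1 on the single element mapped to it.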

Corollary 3 (marginalization with respect to uniform distribution).

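Precomposing $\mathfrak{m}:\mathcal{V}X\otimes\mathcal{V}Y\to\mathcal{V}Z$ with the dual $\pi_X^{\natural}$ of the projection $\pi_X:\mathcal{V}X\otimes\mathcal{V}Y\to\mathcal{V}X$ marginalizes $p_{\mathfrak{m}}(z|x,y)$ over the uniform distribution on $Y$.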

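Proof: By Corollary 2 we have $\pi_X^{\natural}:\mathcal{V}X\to\mathcal{V}X\otimes\mathcal{V}Y:\delta_x\mapsto\frac{1}{|Y|}\sum_{y\in Y}\delta_x\otimes\delta_y$. It follows immediately that

$$p_{\mathfrak{m}\circ\pi_X^{\natural}}(z|x)=\frac{1}{|Y|}\sum_{y\in Y}p_{\mathfrak{m}}(z|x,y).$$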

Precomposing with $\pi_X^{\natural}$ treats inputs from $Y$ as extrinsic noise. Although duals can be defined so that they implement Bayes' rule with respect to other probability distributions, this paper restricts attention to the simplest possible renormalization of columns, Definition 3. The uniform distribution is convenient since it uses minimal prior knowledge (it depends only on the number of elements in the set) to generalize pre-images to the stochastic case, Corollary 2.
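Corollary 3 can be checked on a toy example. In the sketch below, the basis of $\mathcal{V}X\otimes\mathcal{V}Y$ is indexed by pairs $(x,y)$, and the conditional distribution `p` (a noisy AND gate) is a hypothetical choice made only for illustration, as are the helper names.

```python
from fractions import Fraction as F
from itertools import product

def dual(m):
    """Transpose of m with columns renormalized to sum to 1 (m assumed surjective)."""
    row_sums = [sum(row) for row in m]
    return [[m[y][x] / row_sums[y] for y in range(len(m))]
            for x in range(len(m[0]))]

def matmul(a, b):
    """Plain matrix product of two lists-of-rows matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

X, Y, Z = [0, 1], [0, 1], [0, 1]
XY = list(product(X, Y))          # basis delta_x (x) delta_y, indexed by pairs

def p(z, x, y):
    """Hypothetical p(z|x,y): z equals x AND y, flipped with probability 1/4."""
    return F(3, 4) if z == (x & y) else F(1, 4)

m = [[p(z, x, y) for (x, y) in XY] for z in Z]             # VX (x) VY -> VZ
pi_X = [[F(int(x == x2)) for (x2, _) in XY] for x in X]    # projection onto VX

marginal = matmul(m, dual(pi_X))                           # VX -> VZ
```

Each entry `marginal[z][x]` equals the average of `p(z, x, y)` over `y`, so the `Y` input is treated as uniformly distributed noise.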

References

  • 1 P A M Dirac (1958): The Principles of Quantum Mechanics. Oxford University Press.
  • 2 Tobias Fritz (2009): A presentation of the category of stochastic matrices. arXiv:0902.2554v1.
  • 3 M Giry (1981): A categorical approach to probability theory. In B Banaschewski, editor: Categorical Aspects of Topology and Analysis, Springer.
  • 4 E T Jaynes (1957): Information theory and statistical mechanics. Phys. Rev. 106(4), pp. 620–630.
  • 5 P Panangaden (1998): Probabilistic relations. In C Baier, M Huth, M Kwiatkowska & M Ryan, editors: PROBMIV'98, pp. 59–74.