2. Stochastic maps
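Any conditional distribution $p(y|x)$ on finite sets $X$ and $Y$ can be represented as a matrix as follows. Let $\mathcal{V}X = \{\varphi: X \to \mathbb{R}\}$ denote the vector space of real valued functions on $X$, and similarly for $Y$. $\mathcal{V}X$ is equipped with the Dirac basis $\{\delta_x \mid x \in X\}$, where
$$\delta_x(x') = \begin{cases} 1 & \text{if } x = x' \\ 0 & \text{otherwise.} \end{cases}$$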
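Given a conditional distribution $p(y|x)$, construct the matrix $\mathfrak{m}_p: \mathcal{V}X \to \mathcal{V}Y$ with entry $p(y|x)$ in column $\delta_x$ and row $\delta_y$. Matrix $\mathfrak{m}_p$ is stochastic: it has nonnegative entries and its columns sum to 1. Alternatively, given a stochastic matrix $\mathfrak{m}: \mathcal{V}X \to \mathcal{V}Y$, we can recover the conditional distribution. The Dirac basis induces the Euclidean metric
$$\langle \delta_x | \delta_{x'} \rangle = \begin{cases} 1 & \text{if } x = x' \\ 0 & \text{otherwise,} \end{cases} \tag{1}$$
which identifies vector spaces with their duals, $\mathcal{V}X \cong (\mathcal{V}X)^*$. Let $p_{\mathfrak{m}}(y|x) := \langle \delta_y | \mathfrak{m}(\delta_x) \rangle$.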
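The correspondence between conditional distributions and stochastic matrices can be sketched in a few lines. This is a minimal illustration only: the helper names (`stochastic_matrix`, `is_stochastic`, `conditional`), the nested-dictionary encoding of the conditional distribution, and the example values are assumptions made for the sketch, not notation from the text.

```python
from fractions import Fraction

def stochastic_matrix(p, xs, ys):
    """Matrix with entry p(y|x) in the column for x and the row for y."""
    return [[Fraction(p[x][y]) for x in xs] for y in ys]

def is_stochastic(m):
    """Nonnegative entries and every column sums to 1."""
    nonnegative = all(e >= 0 for row in m for e in row)
    return nonnegative and all(sum(col) == 1 for col in zip(*m))

def conditional(m, xs, ys):
    """Recover p(y|x) from the matrix by reading off its columns."""
    return {x: {y: m[i][j] for i, y in enumerate(ys)} for j, x in enumerate(xs)}

# Hypothetical conditional distribution on X = {a, b, c}, Y = {0, 1}.
xs, ys = ["a", "b", "c"], [0, 1]
p = {"a": {0: Fraction(1), 1: Fraction(0)},
     "b": {0: Fraction(1, 2), 1: Fraction(1, 2)},
     "c": {0: Fraction(1, 4), 1: Fraction(3, 4)}}
m = stochastic_matrix(p, xs, ys)
```

Exact rationals are used so that the column sums can be compared with 1 without rounding error.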
Definition 2.
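The category of stochastic maps $\mathtt{Stoch}$ has function spaces $\mathcal{V}X$ for objects and stochastic matrices $\mathfrak{m}: \mathcal{V}X \to \mathcal{V}Y$ with respect to Dirac bases for arrows. We identify $(\mathcal{V}X)^*$ with $\mathcal{V}X$ using the Dirac basis without further comment below.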
Definition 3.
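The dual of a surjective stochastic map $\mathfrak{m}: \mathcal{V}X \to \mathcal{V}Y$ is the composition $\mathfrak{m}^\natural := \mathfrak{m}^* \circ \mathrm{renorm}: \mathcal{V}Y \to \mathcal{V}X$, where $\mathfrak{m}^*$ is the transpose of $\mathfrak{m}$ and $\mathrm{renorm}: \mathcal{V}Y \to \mathcal{V}Y$ is the unique diagonal map such that precomposing $\mathfrak{m}^*$ with it renormalizes the columns of $\mathfrak{m}^*$ to sum to 1. If $\mathfrak{m}$ is not surjective, i.e. if one of the rows has all zero entries, then the renormalization is not well-defined. The stochastic dual of a stochastic transform is stochastic; further, if $\mathfrak{m}^*$ is stochastic then $\mathfrak{m}^\natural = \mathfrak{m}^*$.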
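As a rough sketch of the definition, the dual can be computed by transposing a matrix and dividing each column of the transpose by its sum. The function names and the example matrix are assumptions for illustration; matrices are stored as lists of rows of exact fractions.

```python
from fractions import Fraction

F = Fraction

def transpose(m):
    return [list(col) for col in zip(*m)]

def renormalize_columns(m):
    """Divide each column by its sum; undefined when a column sums to 0."""
    sums = [sum(col) for col in zip(*m)]
    if any(s == 0 for s in sums):
        raise ValueError("a zero column cannot be renormalized")
    return [[e / s for e, s in zip(row, sums)] for row in m]

def dual(m):
    """Transpose m, then renormalize the columns of the transpose.

    Requires m to have no all-zero row, i.e. m surjective.
    """
    return renormalize_columns(transpose(m))

# Hypothetical stochastic matrix with 3 inputs (columns) and 2 outputs (rows).
m = [[F(1), F(1, 2), F(1, 4)],
     [F(0), F(1, 2), F(3, 4)]]
d = dual(m)   # 3 rows, 2 columns; each column sums to 1
```

A matrix with an all-zero row raises `ValueError` here, which mirrors the surjectivity requirement. A permutation matrix, whose transpose is already stochastic, is returned as its plain transpose.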
Category $\mathtt{Stoch}$ is described in terms of braid-like generators and relations in [2]. A more general, but also more complicated, category of conditional distributions was introduced by Giry [3]; see [5].
Example 1 (deterministic functions).
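Let $\mathtt{FSet}$ be the category of finite sets. Define faithful functor $\mathcal{V}: \mathtt{FSet} \to \mathtt{Stoch}$ taking set $X$ to $\mathcal{V}X$ and function $f: X \to Y$ to stochastic map $\mathcal{V}f: \mathcal{V}X \to \mathcal{V}Y: \delta_x \mapsto \delta_{f(x)}$. It is easy to see that and .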
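A small sketch of how a function between finite sets becomes a 0/1 stochastic matrix, together with a check that composing functions corresponds to multiplying matrices. The helper names, the dictionary encoding of functions, and the example sets are assumptions made for this illustration.

```python
from fractions import Fraction

def stochastic_map(f, xs, ys):
    """Matrix sending the basis vector of x to the basis vector of f(x).

    One column per element of xs, one row per element of ys.
    """
    return [[Fraction(1 if f[x] == y else 0) for x in xs] for y in ys]

def compose(a, b):
    """Matrix product a * b, i.e. apply b first and then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

xs, ys, zs = [0, 1, 2], ["a", "b"], ["u", "v"]
f = {0: "a", 1: "a", 2: "b"}        # hypothetical f: X -> Y
g = {"a": "v", "b": "u"}            # hypothetical g: Y -> Z
gf = {x: g[f[x]] for x in xs}       # g after f

Vf = stochastic_map(f, xs, ys)
Vg = stochastic_map(g, ys, zs)
Vgf = stochastic_map(gf, xs, zs)
```

Every column of such a matrix contains a single 1, so deterministic functions are exactly the stochastic matrices with 0/1 entries.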
We introduce special notation for commonly used functions:
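- Set inclusion. For any inclusion $i: X \hookrightarrow Y$ of sets, let $\iota := \mathcal{V}i: \mathcal{V}X \to \mathcal{V}Y$ denote the corresponding stochastic map. Two important examples are
  - Point inclusion. Given $x \in X$ define $\delta_x := \mathcal{V}(\{x\} \hookrightarrow X): \mathbb{R} \to \mathcal{V}X$.
  - Diagonal map. Inclusion $\Delta: X \hookrightarrow X \times X: x \mapsto (x, x)$ induces $\iota_\Delta: \mathcal{V}X \to \mathcal{V}(X \times X)$.
- Terminal map. Let $\omega_X: \mathcal{V}X \to \mathbb{R}$ denote the terminal map induced by $X \to \{\bullet\}$.
- Projection. Let $\pi_{XY,X}: \mathcal{V}(X \times Y) \to \mathcal{V}X$ denote the projection induced by $pr_{XY,X}: X \times Y \to X$.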
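The maps in the list above can be written out concretely as 0/1 matrices. The sketch below assumes small example sets and a helper `stochastic_map` that takes a Python callable; the one-point set is represented by the list `["*"]`. All of these names are illustrative choices.

```python
from fractions import Fraction
from itertools import product

def stochastic_map(f, xs, ys):
    """0/1 matrix of a function f: one column per x in xs, one row per y in ys."""
    return [[Fraction(1 if f(x) == y else 0) for x in xs] for y in ys]

X, Y = ["a", "b"], [0, 1, 2]
XX = list(product(X, X))   # ordered as (a,a), (a,b), (b,a), (b,b)
XY = list(product(X, Y))   # ordered as (a,0), (a,1), (a,2), (b,0), (b,1), (b,2)

point = stochastic_map(lambda _: "b", ["*"], X)       # point inclusion of "b"
diagonal = stochastic_map(lambda x: (x, x), X, XX)    # x -> (x, x)
terminal = stochastic_map(lambda _: "*", X, ["*"])    # everything -> the single point
projection = stochastic_map(lambda xy: xy[0], XY, X)  # (x, y) -> x
```

The terminal map is a single row of ones, and the projection has one block of ones per element of `X`.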
Proposition 1 (dual is Bayes over uniform distribution).
The dual of a stochastic map applies Bayes rule to compute theposterior distribution using the uniform probability distribution.
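Proof: The uniform distribution is the dual $\omega_X^\natural: \mathbb{R} \to \mathcal{V}X$ of the terminal map $\omega_X: \mathcal{V}X \to \mathbb{R}$. It assigns equal probability $\frac{1}{|X|}$ to all of $X$'s elements, and can be characterized as the maximally uninformative distribution [4]. Let $\mathfrak{m}: \mathcal{V}X \to \mathcal{V}Y$ be a surjective stochastic map with entries $p_{\mathfrak{m}}(y|x)$. The normalized transpose is
$$p_{\mathfrak{m}^\natural}(x|y) = \frac{p_{\mathfrak{m}}(y|x)}{\sum_{x' \in X} p_{\mathfrak{m}}(y|x')} = \frac{p_{\mathfrak{m}}(y|x) \cdot \frac{1}{|X|}}{\sum_{x' \in X} p_{\mathfrak{m}}(y|x') \cdot \frac{1}{|X|}},$$
which is Bayes' rule with the uniform prior $\frac{1}{|X|}$ on $X$.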
Remark 1.
Note that . Dirac's bra-ket notation must be used with care since stochastic matrices are not necessarily symmetric [1].
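Proposition 1 can be checked numerically on a small example. The sketch below compares the column-renormalized transpose with a direct application of Bayes' rule, first with a uniform prior and then with a non-uniform one. The function names, the example matrix, and the priors are assumptions chosen for illustration.

```python
from fractions import Fraction

F = Fraction

def dual(m):
    """Transpose m and renormalize each column of the transpose to sum to 1."""
    t = [list(col) for col in zip(*m)]
    sums = [sum(col) for col in zip(*t)]
    return [[e / s for e, s in zip(row, sums)] for row in t]

def bayes_posterior(m, prior):
    """p(x|y) = p(y|x) prior(x) / sum over x' of p(y|x') prior(x').

    Returned as a matrix with one row per x and one column per y.
    """
    joint = [[m[y][x] * prior[x] for y in range(len(m))] for x in range(len(prior))]
    evidence = [sum(col) for col in zip(*joint)]
    return [[e / s for e, s in zip(row, evidence)] for row in joint]

# Hypothetical stochastic matrix: 3 inputs (columns), 2 outputs (rows).
m = [[F(1), F(1, 2), F(1, 4)],
     [F(0), F(1, 2), F(3, 4)]]
uniform = [F(1, 3)] * 3
skewed = [F(1, 2), F(1, 4), F(1, 4)]

# The dual of the terminal map (a single row of ones) is the uniform distribution.
uniform_from_terminal = dual([[F(1), F(1), F(1)]])
```

With the uniform prior the two computations agree entry by entry; with the skewed prior they differ, which is why the dual corresponds specifically to the uniform case.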
Corollary 2 (preimages).
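The dual of the stochastic map $\mathcal{V}f: \mathcal{V}X \to \mathcal{V}Y$ induced by a surjective function $f: X \to Y$ is the conditional distribution
$$p_{(\mathcal{V}f)^\natural}(x|y) = \begin{cases} \frac{1}{|f^{-1}(y)|} & \text{if } f(x) = y \\ 0 & \text{otherwise.} \end{cases} \tag{2}$$
Proof: By the proof of Proposition 1,
$$p_{(\mathcal{V}f)^\natural}(x|y) = \frac{p_{\mathcal{V}f}(y|x)}{\sum_{x' \in X} p_{\mathcal{V}f}(y|x')},$$
where $p_{\mathcal{V}f}(y|x)$ equals 1 if $f(x) = y$ and 0 otherwise, so the denominator is $|f^{-1}(y)|$.
The support of $p_{(\mathcal{V}f)^\natural}(x|y)$ is $f^{-1}(y)$. Elements in the support are assigned equal probability, thereby treating them as an undifferentiated list. Dual $(\mathcal{V}f)^\natural$ thus generalizes the inverse image $f^{-1}: Y \to 2^X$. Conveniently, however, the dual simply flips the domain and range of $\mathcal{V}f$, whereas the inverse image maps to the powerset $2^X$, an entirely new object.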
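To illustrate Corollary 2, the sketch below builds the stochastic map of a surjective function, takes its dual, and compares the result with the uniform distribution on each preimage. The helper names and the example function are assumptions for the illustration.

```python
from fractions import Fraction

def stochastic_map(f, xs, ys):
    """0/1 matrix of f: one column per x, one row per y."""
    return [[Fraction(1 if f[x] == y else 0) for x in xs] for y in ys]

def dual(m):
    """Transpose m and renormalize each column of the transpose to sum to 1."""
    t = [list(col) for col in zip(*m)]
    sums = [sum(col) for col in zip(*t)]
    return [[e / s for e, s in zip(row, sums)] for row in t]

def preimage_distribution(f, xs, ys):
    """1/|preimage of y| if f(x) == y, else 0 (one row per x, one column per y)."""
    size = {y: sum(1 for x in xs if f[x] == y) for y in ys}
    return [[Fraction(1, size[y]) if f[x] == y else Fraction(0) for y in ys]
            for x in xs]

xs, ys = [0, 1, 2, 3], ["a", "b"]
f = {0: "a", 1: "a", 2: "a", 3: "b"}   # surjective, so every preimage is nonempty

d = dual(stochastic_map(f, xs, ys))
```

Here the column for `"a"` spreads probability 1/3 over the three elements mapped to `"a"`, and the column for `"b"` puts all its mass on the single element mapped to `"b"`.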
Corollary 3 (marginalization with respect to uniform distribution).
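Precomposing $\mathfrak{m}: \mathcal{V}(X \times Y) \to \mathcal{V}Z$ with the dual $\pi_{XY,X}^\natural$ to the projection $\pi_{XY,X}: \mathcal{V}(X \times Y) \to \mathcal{V}X$ marginalizes $p_{\mathfrak{m}}(z|x,y)$ over the uniform distribution on $Y$.
Proof: By Corollary 2 we have $p_{\pi_{XY,X}^\natural}(x', y|x) = \frac{1}{|Y|}$ if $x' = x$ and 0 otherwise. It follows immediately that
$$p_{\mathfrak{m} \circ \pi_{XY,X}^\natural}(z|x) = \frac{1}{|Y|} \sum_{y \in Y} p_{\mathfrak{m}}(z|x,y).$$
Precomposing with $\pi_{XY,X}^\natural$ treats inputs from $Y$ as extrinsic noise. Although duals can be defined so that they implement Bayes' rule with respect to other probability distributions, this paper restricts attention to the simplest possible renormalization of columns, Definition 2. The uniform distribution is convenient since it uses minimal prior knowledge (it depends only on the number of elements in the set) to generalize pre-images to the stochastic case, Proposition 2.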
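Corollary 3 can be illustrated with a two-input example: composing a stochastic matrix on pairs with the dual of the projection averages over the second input. The matrix `m`, the sets, and the helper names below are hypothetical choices made for the sketch.

```python
from fractions import Fraction
from itertools import product

F = Fraction

def dual(m):
    """Transpose m and renormalize each column of the transpose to sum to 1."""
    t = [list(col) for col in zip(*m)]
    sums = [sum(col) for col in zip(*t)]
    return [[e / s for e, s in zip(row, sums)] for row in t]

def compose(a, b):
    """Matrix product a * b, i.e. apply b first and then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

X, Y = [0, 1], [0, 1]
XY = list(product(X, Y))   # column order of m: (0,0), (0,1), (1,0), (1,1)
projection = [[F(1 if xy[0] == x else 0) for xy in XY] for x in X]

# Hypothetical p(z | x, y) with two outputs z; each column sums to 1.
m = [[F(1), F(1, 2), F(0), F(1, 4)],
     [F(0), F(1, 2), F(1), F(3, 4)]]

marginal = compose(m, dual(projection))

# Direct average of p(z | x, y) over y with weight 1/|Y|.
expected = [[sum(m[z][XY.index((x, y))] for y in Y) / len(Y) for x in X]
            for z in range(len(m))]
```

The composite depends only on the first input; the second input has been averaged out with equal weights.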
References
- 1 P A M Dirac (1958): The Principles of Quantum Mechanics. Oxford University Press.
- 2 Tobias Fritz (2009): A presentation of the category of stochastic matrices. arXiv:0902.2554v1.
- 3 M Giry (1981): A categorical approach to probability theory. In B Banaschewski, editor: Categorical Aspects of Topology and Analysis, Springer.
- 4 E T Jaynes (1957): Information theory and statistical mechanics. Phys. Rev. 106(4), pp. 620–630.
- 5 P Panangaden (1998): Probabilistic relations. In C Baier, M Huth, M Kwiatkowska & M Ryan, editors: PROBMIV'98, pp. 59–74.