A consistent way of ensuring the i.i.d. condition for infinitely many random variables

In probability theory, the term "i.i.d. condition" loosely describes random variables that are independent of each other and identically distributed. This is equivalent to saying that there is a joint distribution whose marginals are identical and independent of each other. In the infinite-dimensional setting, the Daniell-Kolmogorov theorem is invoked to explore the conditions under which such a joint distribution can be established.

Main question about the i.i.d. condition

In an elementary probability class, we learn that $F(x)$ is defined to be the probability that a certain random variable is less than or equal to some value $x$. Can we reverse this? In other words, given a distribution function $F$, can we ensure that there is a corresponding random variable $X$ satisfying $\mathbb{P}(\{w \mid X(w)\leq x\})=F(x)$?

Well, this follows from the existence of the Lebesgue-Stieltjes measure, a consequence of the renowned Carathéodory extension theorem: as long as $F$ is right-continuous, non-decreasing, $\lim_{x\to\infty}F(x)=1$ and $\lim_{x\to-\infty}F(x)=0$, the existence is ensured.

The Carathéodory extension theorem proceeds by initially assuming a pre-measure on an algebra, a collection of sets closed under complements and finite unions. This pre-measure is then extended to an outer measure defined on all subsets of a given set $\Omega$. This outer measure qualifies as a measure on the collection of Carathéodory-measurable sets, which form a $\sigma$-algebra. The theorem additionally stipulates that $\sigma$-finiteness makes this extension unique.

Now, going back to our case of finding a random variable $X$, we can simply put $(\Omega,\mathcal{F},\mathbb{P})=(\mathbb{R},\mathcal{B}(\mathbb{R}),\mu_F)$ and set $X(w)=w$, where $\mu_F$ is the Lebesgue-Stieltjes measure induced by $F$. Then we have $\mathbb{P}(w \mid X(w)\leq x)=\mu_F((-\infty,x])=F(x)$.
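As a concrete sanity check of this construction, here is a minimal numerical sketch, assuming the exponential distribution $F(x)=1-e^{-x}$ purely for illustration. Sampling through the quantile (generalized inverse) function is the practical counterpart of the abstract Lebesgue-Stieltjes construction:

```python
import numpy as np

# Realize a random variable with a prescribed distribution function F,
# here the exponential CDF F(x) = 1 - exp(-x), chosen only as an example.

def F(x):
    return np.where(x < 0, 0.0, 1.0 - np.exp(-x))

def F_inverse(u):
    # Quantile function (generalized inverse) of the exponential CDF.
    return -np.log(1.0 - u)

rng = np.random.default_rng(0)
u = rng.uniform(size=100_000)
x = F_inverse(u)  # samples of X with P(X <= t) = F(t)

# Empirical check: the empirical CDF should be close to F.
for t in [0.5, 1.0, 2.0]:
    print(t, (x <= t).mean(), float(F(t)))
```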

More generally, this tells us how to construct a random vector $(X_1,X_2,\dots,X_n)$ out of $F_{X_1,X_2,\dots,X_n}(x_1,x_2,\dots,x_n)$.

How about an entire sequence $X_1,X_2,\dots$ for which we prescribe the finite-dimensional distribution functions $F_{X_1,X_2,\dots,X_n}(x_1,x_2,\dots,x_n)=\prod_{i=1}^{n} F_{X_i}(x_i)$ for every $n\in\mathbb{N}$?

In probability theory textbooks, this condition is loosely referred to as the "i.i.d. condition" for infinitely many random variables. In order to show that it is feasible, one way is to establish a probability measure $\mathbb{P}$ whose marginal distributions correspond to every random variable in the sequence.
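As a quick empirical illustration of this product structure (assuming standard normal marginals, chosen only for concreteness), the joint distribution function of i.i.d. draws should factor into the product of the marginals:

```python
import numpy as np
from math import erf, sqrt

# Sanity check (not a proof): for i.i.d. standard normal draws, the
# empirical joint CDF at (x1, ..., xn) is close to F(x1) * ... * F(xn).

def Phi(x):
    # Standard normal CDF.
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

rng = np.random.default_rng(1)
n, m = 3, 200_000                    # dimension and sample size
X = rng.standard_normal((m, n))

point = np.array([0.3, -0.5, 1.0])
empirical = np.all(X <= point, axis=1).mean()
product = np.prod([Phi(t) for t in point])
print(empirical, product)            # the two should roughly agree
```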

Daniell-Kolmogorov theorem

Let $(F_\tau)_{\tau\in\mathbb{T}}$ be a given family of finite-dimensional probability distribution functions, and denote by $(\mu_\tau)_{\tau\in\mathbb{T}}$ the corresponding (induced) distributions. If these distributions satisfy the consistency conditions

1) If $s=(t_{i_1},\dots,t_{i_n})$ is a permutation of $\tau=(t_1,\dots,t_n)$, then for any Borel sets $A_1,\dots,A_n$ of the real line we have $$\mu_\tau(A_1\times\dots\times A_n)=\mu_s(A_{i_1}\times\dots\times A_{i_n})$$

2) If $\tau=(t_1,\dots,t_n)\in\tau_n$ and $s=(t_1,\dots,t_n,t_{n+1})\in\tau_{n+1}$, then for any Borel set $B\in\mathcal{B}(\mathbb{R}^n)$ we have $$\mu_\tau(B)=\mu_s(B\times\mathbb{R})$$

then on the "canonical" space $(\Omega,\mathcal{F})$ there exists a probability measure $\mathbb{P}$ such that $\mu_{(t_1,\dots,t_n)}(A)=\mathbb{P}((X_{t_1},\dots,X_{t_n})\in A)$ for every $A\in\mathcal{B}(\mathbb{R}^n)$ and every $n\in\mathbb{N}$, where $X_t$ denotes the coordinate mapping $X_t(w)=w(t)$.
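To make the two consistency conditions concrete, here is a toy numerical check (an illustration only, not a proof, since the theorem needs the conditions for all Borel sets) for the product family $\mu_\tau=\mu\otimes\dots\otimes\mu$, with $\mu$ an arbitrarily chosen probability vector on $\{0,1,2\}$:

```python
import numpy as np
from itertools import permutations

mu = np.array([0.2, 0.5, 0.3])            # a probability measure on {0, 1, 2}

def mu_tau(n):
    # n-fold product measure, stored as an n-dimensional array.
    out = mu
    for _ in range(n - 1):
        out = np.multiply.outer(out, mu)
    return out

m3 = mu_tau(3)

# Condition 1 (permutation invariance): permuting the factors of a
# rectangle A1 x A2 x A3 does not change its measure.
A = ([0, 1], [2], [0, 2])
for perm in permutations(range(3)):
    lhs = m3[np.ix_(*A)].sum()
    rhs = m3[np.ix_(*(A[i] for i in perm))].sum()
    assert np.isclose(lhs, rhs)

# Condition 2 (marginal consistency): integrating out the last coordinate
# of the 4-fold product recovers the 3-fold product.
assert np.allclose(mu_tau(4).sum(axis=-1), m3)
print("consistency conditions hold for the product family")
```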

The proof follows a natural logic of reduction, detailed here, where the author conveniently assumes that $\tau$ consists of indices in the time domain $\mathbb{R}_{\geq 0}$. It largely proceeds in the following four steps.

1) We begin by defining $\mu(C_\tau(B)):=\mu_\tau(B)$ for $B\in\mathcal{B}(\mathbb{R}^n)$, where the cylinder set is defined by $C_\tau(B)=\{f\in\mathcal{A}(\mathbb{R}_{\geq 0},\mathbb{R}) \mid (f(t_1),\dots,f(t_n))\in B\}$ and $\mathcal{A}(\mathbb{R}_{\geq 0},\mathbb{R})$ denotes the set of all functions from $\mathbb{R}_{\geq 0}$ to $\mathbb{R}$. Then we have $\mu(\mathcal{A}(\mathbb{R}_{\geq 0},\mathbb{R}))=1$, and $\mu$ is well-defined thanks to conditions 1 and 2 (think of the case when $B$ is a rectangular measurable set; the permutation condition is necessary because the same cylinder set admits different representations).

2) We can check that the cylinder sets satisfy the set-theoretic conditions of the Carathéodory extension theorem, and since $\mu$ has total mass 1, it is $\sigma$-finite. Therefore, it suffices to show that $\mu$ is countably additive.

3) We reformulate countable additivity as showing that $\mu(D_n)\to 0$ for every decreasing sequence of cylinder sets $D_n$ with empty intersection. Writing $D_n=\{f:(f(t_1),\dots,f(t_n))\in B_n\}$ for some "decreasing" sequence of sets $B_n\in\mathcal{B}(\mathbb{R}^n)$ reduces the problem to one involving Borel sets, which makes it easier to handle.

4) We can approximate each $B_n$ from inside by a compact set $K_n$. This approximation allows us to use a limiting argument: assuming that $\mu(D_n)$ does not converge to zero, we can extract (by a diagonal argument) a sequence of points satisfying $(x_1,\dots,x_n)\in K_n$ for every $n\in\mathbb{N}$, and this leads to a contradiction because $\{f\in\mathcal{A}(\mathbb{R}_{\geq 0},\mathbb{R}) \mid (f(t_1),\dots,f(t_N))=(x_1,\dots,x_N)\}\subseteq D_N$, which implies the absurd statement $\bigcap_{N\in\mathbb{N}} D_N\neq \emptyset$.

Continuing with i.i.d.

Let $\mathbb{T}=\mathbb{N}$ and define $F_\tau(x_1,\dots,x_n)=F(x_1)\cdots F(x_n)$ for $(x_1,\dots,x_n)\in\mathbb{R}^n$, with $\tau=(t_1,\dots,t_n)\in\tau_n$ and $n\in\mathbb{N}$. This is a probability distribution function on $\mathbb{R}^n$ and induces a probability measure $\mu_\tau=\mu_{F_\tau}=\mu\otimes\dots\otimes\mu$, the $n$-fold product measure of $\mu=\mu_F$ with itself on $\mathcal{B}(\mathbb{R}^n)$. It is clear that the family $(\mu_\tau)_{\tau\in\mathbb{T}}$ satisfies the consistency conditions of the theorem above.

According to the theorem, there exists a probability measure $\mathbb{P}$ on $(\Omega,\mathcal{F})$ with $\mathbb{P}[w\in\Omega: w(t_1)\in A_1,\dots,w(t_n)\in A_n]=\mu_{(t_1,\dots,t_n)}(A_1\times\dots\times A_n)=\mu(A_1)\cdots\mu(A_n)=\prod^n_{j=1}\mathbb{P}(w\in\Omega: w(t_j)\in A_j)$ for every $A_1,\dots,A_n$ in $\mathcal{B}(\mathbb{R})$ and $n\in\mathbb{N}$. But this means that the coordinate random variables $X_n(w):=w(n)$, $n\in\mathbb{N}$, are independent, and each has distribution $\mu$, so they are i.i.d.
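A finite-dimensional caricature of this coordinate-mapping picture, taking $\mu$ to be the uniform distribution on $[0,1]$ purely for illustration: a sample "path" $w$ is just a vector of coordinates, $X_j(w)=w_j$, and the coordinates are independent.

```python
import numpy as np

# Each row of W plays the role of a (truncated) sample point w, and the
# coordinate maps X_0(w) = w[0], X_1(w) = w[1] should be independent.

rng = np.random.default_rng(2)
W = rng.uniform(size=(500_000, 2))

in_A0 = (0.1 < W[:, 0]) & (W[:, 0] < 0.4)   # event {X_0 in (0.1, 0.4)}
in_A1 = (0.5 < W[:, 1]) & (W[:, 1] < 0.9)   # event {X_1 in (0.5, 0.9)}

joint = (in_A0 & in_A1).mean()
product = in_A0.mean() * in_A1.mean()
print(joint, product)   # approximately equal: coordinates are independent
```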

Application of Daniell-Kolmogorov in the construction of Brownian motion

On a side note, this is very similar to how Brownian motion is constructed.

Consider

$$C=\{w\in\mathbb{R}^{[0,\infty)} \mid (w(t_1),\dots,w(t_n))\in A\}, \qquad A\in\mathcal{B}(\mathbb{R}^n),\ t_1,\dots,t_n\geq 0,$$

and let $\mathcal{C}$ denote the field of all such sets. Further, let the $\sigma$-algebra generated by this field be denoted by $\mathcal{B}(\mathbb{R}^{[0,\infty)})$.

Using the Daniell-Kolmogorov theorem, we can establish a probability measure $\mathbb{P}$ on $(\mathbb{R}^{[0,\infty)},\mathcal{B}(\mathbb{R}^{[0,\infty)}))$ under which the coordinate mapping process $B_t(w)=w(t)$, $w\in\mathbb{R}^{[0,\infty)}$, $t\geq 0$, has stationary, independent increments. Further, an increment $B_t-B_s$ with $0\leq s<t$ is normally distributed with mean zero and variance $t-s$ (we can arrange this by explicitly writing down the joint density function).
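The finite-dimensional distributions prescribed this way can be sampled directly by summing independent Gaussian increments. The following sketch, on an arbitrarily chosen time grid, checks empirically that $\mathrm{Cov}(B_s,B_t)\approx\min(s,t)$, which is exactly what mean-zero, stationary, independent Gaussian increments imply:

```python
import numpy as np

# Sample (B_t1, ..., B_t4) by cumulative sums of independent N(0, dt)
# increments, then verify Cov(B_s, B_t) ~ min(s, t).

rng = np.random.default_rng(3)
t = np.array([0.5, 1.0, 2.0, 3.0])        # time grid (for illustration)
dt = np.diff(np.concatenate(([0.0], t)))

m = 200_000
increments = rng.normal(0.0, np.sqrt(dt), size=(m, len(t)))
B = np.cumsum(increments, axis=1)         # rows: samples of (B_t1,...,B_t4)

cov = np.cov(B, rowvar=False)
print(np.round(cov, 2))                    # entry (i, j) ~ min(t_i, t_j)
```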

In order to establish continuity, we invoke the modification theorem of Kolmogorov and Čentsov, whose proof is detailed in Karatzas and Shreve:

Suppose that a process $X=\{X_t;\ 0\leq t\leq T\}$ on a probability space $(\Omega,\mathcal{F},\mathbb{P})$ satisfies the condition $E(|X_t-X_s|^\alpha)\leq C|t-s|^{1+\beta}$, $0\leq s,t\leq T$, for some positive constants $\alpha$, $\beta$, and $C$. Then there exists a continuous modification $\tilde X=\{\tilde X_t;\ 0\leq t\leq T\}$ of $X$, which is locally Hölder-continuous with exponent $\gamma$ for every $\gamma\in(0,\beta/\alpha)$, i.e.,

$$\mathbb{P}\left[w;\ \sup_{0<t-s<h(w),\ s,t\in[0,T]}\frac{|\tilde X_t(w)-\tilde X_s(w)|}{|t-s|^\gamma}\leq \delta\right]=1$$

where $h(w)$ is an a.s. positive random variable and $\delta>0$ is an appropriate constant.

In our case, the condition holds with $\alpha=4$, $\beta=1$, and $C=3$: since $B_t-B_s$ is Gaussian with mean zero and variance $t-s$, we have $E(|B_t-B_s|^4)=3|t-s|^2$. (Note that $\alpha=2$, $\beta=0$ would not suffice, as the theorem requires $\beta>0$.) The modification is therefore locally Hölder-continuous with every exponent $\gamma\in(0,1/4)$, and since Hölder continuity implies continuity, we are done with the construction of Brownian motion.
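A quick Monte Carlo check of the moment identity behind this choice of constants (Gaussian increments, so $E|B_t-B_s|^4=3(t-s)^2$):

```python
import numpy as np

# For each gap t - s, the fourth moment of the Gaussian increment should
# be close to 3 * (t - s)^2, i.e., alpha = 4, beta = 1, C = 3 above.

rng = np.random.default_rng(4)
for gap in [0.1, 0.5, 1.0]:
    incr = rng.normal(0.0, np.sqrt(gap), size=1_000_000)
    print(gap, (incr**4).mean(), 3 * gap**2)   # the two should agree
```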

By the way, there are other ways of constructing Brownian motion. One of them constructs Brownian motion on the dyadic rationals of the interval $[0,1]$ (by explicitly writing down the joint distribution), fills the gaps (using uniform continuity), and patches the rest; this is also elaborated in Karatzas and Shreve.
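For a flavor of that approach, here is a rough sketch of the dyadic step via midpoint (Brownian bridge) refinement on $[0,1]$. The bridge recursion below is one concrete way to realize "construct on the dyadics", not Karatzas and Shreve's exact presentation: given the values at the dyadics of level $k$, each new midpoint is the average of its neighbors plus an independent $N(0,2^{-(k+2)})$ perturbation.

```python
import numpy as np

def brownian_on_dyadics(levels, rng):
    # Values at t = 0 and t = 1.
    B = np.array([0.0, rng.normal(0.0, 1.0)])
    for k in range(levels):
        # Midpoints: neighbor average plus N(0, 2^-(k+2)) noise, which is
        # the conditional law of the bridge at the midpoint.
        mid = 0.5 * (B[:-1] + B[1:])
        mid += rng.normal(0.0, np.sqrt(2.0 ** -(k + 2)), size=mid.shape)
        out = np.empty(2 * len(B) - 1)
        out[0::2], out[1::2] = B, mid
        B = out
    return B   # values of the path at j / 2**levels, j = 0, ..., 2**levels

rng = np.random.default_rng(5)
path = brownian_on_dyadics(10, rng)   # 2**10 + 1 dyadic points
print(path[:5])
```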