Martingales, Probability Theory

The star-ship Enterprise problem

The navigation system on the star-ship Enterprise is broken. There is nothing one can do but Chief Engineer Scott is able to fix a distance to be traveled, then the star-ship will make a space-hop with an uniformly random direction. Assume that the current distance from the star-ship to the Sun is {R} and the radius of the Solar System is {r<R}. The star-ship will make consecutive space-hops with independent uniformly random directions until it reaches Solar System. How large is the probability that the star-ship eventually reaches the Solar System after a finite number of space-hops? How to maximize the chance to accomplish this trip?


Assume that {\Theta_1, \Theta_2, \dots } are i.i.d. uniformly random point on the unit sphere

\displaystyle \partial B(0,1)=\{x\in \mathbb R^3: \|x\|=1\},

which are also random directions that the star-ship will travel along. Let {O=(0,0,0)} be the position of the Sun and {X_n} be the position of the star-ship after {n} space-hops. We can set {X_0=(R,0,0)} being the initial position of the star-ship. Set {\mathcal{F}_n=\sigma(X_0,X_1,\dots, X_n)}. We have

\displaystyle X_{n}=X_{n-1}+C_n \Theta_n,

where {C_n} is the size of the {n}-th space-hop which is a {\mathcal F_{n-1}}-measurable random variable. We call the random sequence {(C_n)_{n\ge0}} a strategy.

Proposition 1. Set {M_n=1/\|X_n\|}. Then {(M_n)_{n\ge0}} is a supermartingale which converges with probability 1 to some {M_{\infty}} as {n\rightarrow\infty}. It is a martingale if and only if {C_n\le \|X_{n-1}\|} for all {n\ge 1}.

Proof: We have

\displaystyle \mathbb E[M_n |\mathcal{F}_{n-1}]= \mathbb E\left[\frac{1}{\|X_{n-1}+C_n\Theta_n\|}|\mathcal{F}_{n-1}\right]=\frac{1}{4\pi}\int_{\partial B(0,1)}\frac{1}{\|X_{n-1}+C_n \theta\|} {\rm d}\sigma(\theta),

where {\sigma} is the Lebesgue measure on the unit sphere {\partial B(0,1)}. Applying the mean value property to the harmonic function {f(x)=1/\|x\|} with {\|x\|>\rho}, we have

\displaystyle \frac{1}{4\pi}\int_{\partial B(0,1)}\frac{1}{\|x+\rho y\|}\text{d}\sigma(y)= \frac{1}{\|x\|}.

If {\|x\|\le \rho}, by the maximum principle, we have

\displaystyle \frac{1}{4\pi}\int_{\partial B(0,1)}\frac{1}{\|x+\rho y\|}\text{d}\sigma(y)= \frac{1}{\rho}.


\displaystyle \mathbb E[M_n |\mathcal{F}_{n-1}] =\frac{1}{\|X_{n-1}\|\vee C_n} \le \frac{1}{\|X_{n-1}\|}=M_{n-1}.

Hence {(M_n)_{n\ge0}} is a positive supermartingale. By Doob’s convergence theorem, {M_n} tends to some {M_{\infty}} with probability 1. It is also obvious that {(M_n)_{n\ge0}} is a martingale if and only if {C_n\le \|X_{n-1}\|} for all {n\ge 1}. \Box

Denote by {\tau=\inf\{n\ge 1: \| X_n\|\le r \}} the first time that the star-ship reaches the Solar System.

Proposition 2. The probability that the star-ship ever reaches the Solar System is strictly smaller than {r/R}, i.e.

\displaystyle \mathbb P(\tau<\infty)< \frac{r}{R}.

Proof: Note that {(M_{n\wedge \tau})_{n\ge 0}} is also a non-negative supermartingale. We thus have

\displaystyle \mathbb E[ M_{n\wedge \tau}]\le \mathbb E[ M_{0}]=\frac{1}{R}.

On the event {\{\tau\le n\}}, we have {M_{\tau}={1}/{\|X_{\tau}\|}\ge1/r}. Therefore,

\displaystyle \mathbb E[ M_{n\wedge \tau}]= \mathbb E[ M_{\tau}\mathbf{1}_{\{\tau\le n\}}]+\mathbb E[ M_{n}\mathbf{1}_{\{\tau> n\}}]\ge \frac{1}{r}\mathbb E[ \mathbf{1}_{\{\tau\le n\}}]= \frac{1}{r} \mathbb P(\tau\le n).

It immediately follows that {\mathbb P(\tau\le n)\le r/R}. Since {\{\tau\le n\}} defines a sequence of increasing events, we have

\displaystyle \mathbb P(\tau<\infty)=\mathbb P\left(\bigcup_{n\ge 1}\{\tau \le n\} \right)=\lim_{n\rightarrow\infty}\mathbb P(\tau\le n)\le \frac{r}{R}.

We now show that {r/R} is actually a strict upper bound for {\mathbb{P}(\tau<\infty)}. Notice that the distribution of {X_n} is non-atomic for all {n\ge 1}. Indeed, assuming that {\mathbb P(X_{n-1}=z)=0} for all {z\in \mathbb R^3}, we have that {X_{n}} conditioning on {\mathcal{F}_{n-1}} is uniformly distributed on the sphere {\partial B(X_{n-1},C_n)}. It follows that {\mathbb P(X_n=z|\mathcal{F}_{n-1})=0} and thus {\mathbb P(X_n=z)=0} for all {z\in \mathbb R^3}. It is also clear that {X_1} is non-atomic. By the principle of induction, {X_n} is non-atomic for all {n\ge 1}. Therefore, on the event {\{\tau<\infty\}},

\displaystyle M_{\tau}=\frac{1}{\|X_{\tau}\|}=\sum_{n=1}^{\infty}\frac{1}{\|X_n\|}\mathbf{1}_{\{\tau=n\}}>\frac{1}{r}.

It follows that { M_{\tau}\ge M_{\tau} \mathbf{1}_{\{\tau<\infty\}}> {1}/{r}\mathbf{1}_{\{\tau<\infty\}}}. By Fatou’s lemma, we thus have

\displaystyle \frac{1}{r}\mathbb{P}(\tau<\infty)< \mathbb E[M_{\tau}]=\mathbb E\left[\lim_{n\rightarrow\infty}M_{n\wedge\tau}\right]\le \liminf_{n\rightarrow\infty}\mathbb E[M_{n\wedge \tau}]\le \frac{1}{R}.


Let m>R be a fixed integer. Set {\eta_m=\inf\{n\ge 1: \|X_n\|> m\}}.

Proposition 3. Assume that {C_n=\epsilon} with {\epsilon>0} for all {n\ge 1}. Then for all {m}, {\mathbb{P}( \eta_m< \infty)=1}, i.e. the star-ship will reaches to a position with an arbitrary large distance from the Sun after a finite number of space-hops with probability 1.

Proof: Let {X_n^1} the first coordinate of {X_n}. Note that {X_n^1} is a sum of i.i.d. random symmetric variables. We thus have

\displaystyle \mathbb P(\eta_m =\infty) = \mathbb P\left(\sup_{n\ge1} \|X_n\|\le m \right)\le \mathbb P\left(\sup_{n\ge1} |X_n^1|\le m \right)=0.


Proposition 4. For each {\epsilon\in (0,r)}, there exists a strategy {(C_n^{\epsilon})_{n\ge1}} such that

\displaystyle \mathbb P(\tau<\infty)\ge \frac{r-\epsilon}{R}.

Proof: Set {C_n=\epsilon} with {\epsilon\in (0,r)} for all {n\ge 1}. Note that {(M_{n\wedge \tau })_{n\ge 1}} is a positive martingale since {\|X_{n-1}\|> r> \epsilon = C_{n}} for all {n\le \tau}. We also note that for {n\le \tau}, we have {\|X_n\|=\|X_{n-1}+C_n \Phi_n\|\ge \|X_{n-1}\|-C_n\ge r-\epsilon}. Thus for all {n\ge1},

\displaystyle M_{\tau\wedge n}= \frac{1}{\|X_{\tau\wedge n}\|}\le \frac{1}{r-\epsilon}.

Applying the optional stopping theorem to the martingale {(M_{n\wedge\tau})_{n\ge0}} and the stopping time {\eta_m}, we have

\displaystyle \frac{1}{R}=\mathbb E[M_{0}]=\mathbb E[M_{\tau\wedge \eta_m}]= \mathbb E[M_{\tau} \mathbf{1}_{\{\tau\le \eta_m\}}] +\mathbb E[M_{\eta_m} \mathbf{1}_{\{\tau>\eta_m\}}]\le \frac{1}{r-\epsilon} \mathbb{P}(\tau\le\eta_m) + \frac{1}{m} \mathbb P(\tau>\eta_m).

It immediately follows that

\displaystyle \mathbb{P}(\tau\le \eta_m)\ge \frac{1/R- 1/m}{1/(r-\epsilon)-1/m}.

Taking {m\rightarrow\infty}, we obtain that {\mathbb{P}(\tau<\infty)\ge(r-\epsilon)/R}.


Martingales, Probability Theory

Wright-Fisher model

Consider a population of {N} haploid individuals. The size of this population remains constant over time. Each individual has two possible gene types (alleles): {A} or {B}. In each generation, each individual independently inherits its genetic material from a uniformly chosen individual from the previous generation. This model is named after Sewall Wright and Ronald Fisher.

Let {X_n} be the number of individuals with allele {A} in generation {n}. If {X_n\in\{0, N\}} then there is no further mutations. This phenomenon is called fixation. We will compute probability that a particular allele is fixated and estimate how fast the fixation occurs by using martingale theory.

Proposition 1. {(X_n)_{n\ge 0}} is a martingale and it converges with probability 1 to some {X_{\infty}} as {n\rightarrow\infty}.

Proof: Given {\mathcal{F}_n=\sigma(X_0,X_1,\dots, X_n)}, we have {X_{n+1}\sim \text{Binomial}(N, X_n/N)}. We thus have

\displaystyle \mathbb P( X_{n+1}=j|\mathcal{F}_n)=\binom{N}{j} \left(\frac{X_n}{N}\right)^j \left(\frac{N-X_n}{N}\right)^{N-j}.

(Note that {(X_n)_{n\ge0}} is actually a Markov chain, i.e. the distribution of {X_{n+1}} depends only on {X_n}). In particular, we have that

\displaystyle \mathbb E[X_{n+1}|\mathcal{F}_n]=N\cdot \frac{X_n}{N}= X_n.

Thus {(X_n)_{n\ge0}} is a non-negative martingale. By Doob’s martingale convergence theorem, {X_{n}} converges almost surely to some {X_{\infty}} as {n\rightarrow\infty}. \Box

Proposition 2. Denote {M_n=\left(\frac{N}{N-1}\right)^n X_n(N-X_n)}. Then {(M_n)_{n\ge0}} is a martingale.

Proof: We have that

\displaystyle \begin{aligned} \mathbb E[M_{n+1}|\mathcal{F}_n] & =\left(\frac{N}{N-1}\right)^{n+1}\left(N \mathbb E[X_{n+1}|\mathcal{F}_n]-\mathbb E[X_{n+1}^2|\mathcal{F}_n] \right)\\ & = \left(\frac{N}{N-1}\right)^{n+1}\left(N X_n- N\cdot \frac{X_n}{N}\left(1-\frac{X_n}{N}+ N\cdot \frac{X_n}{N}\right) \right)\\ & = \left(\frac{N}{N-1} \right)^n X_n(N-X_n)= M_n. \end{aligned}


Now denote by {\tau=\inf\{n\ge 0: X_0=0 \text{\ or\ } N\}} the time of fixation.

Proposition 3. We have that {\mathbb P(\tau<\infty)=1}, i.e. one of the alleles eventually disappears in finite time. Furthermore, the probability that the allele B disappears is equal to

\displaystyle \mathbb P(X_{\tau }=N)=\frac{X_0}{N}.

Proof: Since {(M_n)_{n\ge0}} is a martingale, {(M_{n\wedge \tau})_{n\ge0}} is also a martingale. In particular, we have

\displaystyle \begin{aligned} X_0 (N-X_0) & =\mathbb E[M_0]= \mathbb E[M_{n\wedge\tau }]= \mathbb{E} [M_\tau \mathbb \mathbf{1}_{\{\tau\le n\}}]+ \mathbb E\left[M_n \mathbf{1}_{\{\tau>n\}}\right]\\ &= 0 + \left(\frac{N}{N-1}\right)^n \mathbb E\left[ X_n(1-X_n)\mathbf{1}_{\{\tau>n\}} \right].\end{aligned}

Note that on the event \{\tau>n\}, we have X_n(N-X_n)\ge 1. Thus

\begin{aligned}\left(\frac{N-1}{N}\right)^n X_0(N-X_0) & =\mathbb E[X_n(N-X_n)\mathbf{1}_{\{\tau>n\}}]\ge \mathbb E[\mathbf{1}_{\{\tau>n\}}]\\ & = \mathbb{P}(\tau>n).\end{aligned}

Taking {n\rightarrow\infty}, we obtain {\mathbb{P}(\tau=\infty)=0} and thus the stopping time {\tau} is finite with probability 1. On the other hand, the martingale {(X_n)_{n\ge0}} is bounded by {N}. By optional stopping theorem, we have that

\displaystyle X_0=\mathbb E[X_{0}]=\mathbb{E}[X_{\tau}]= N\cdot \mathbb{P}(X_{\tau}=N) +0\cdot \mathbb{P}(X_{\tau}=0).

It immediately follows that {\mathbb{P}(X_{\tau}=N)=X_0/N}. \Box

Proposition 4. We have

\displaystyle \mathbb{E}[\tau]=\frac{2 X_0(N-X_0)}{N-1}.

Proof: There are several ways to compute {\mathbb{E}[\tau]}. We will present a simple proof by using a concept of population genetics. Assume that two distinct individuals are uniformly randomly chosen from the population in generation {n}. The probability that they don’t have the same gene type is called the genetic variability in generation {n}, denoted by {H_n}. We have that

\displaystyle H_n=\frac{X_n}{N}\cdot \frac{N-X_n}{N-1}+\frac{N-X_n}{N}\frac{X_n}{N-1}=\frac{2X_n(N-X_n)}{N(N-1)}.

We label each individual in a generation by numbers from {1} to {N}. Set {C^n(k)=(C^n_0(k), C^n_1(k), \dots,C^n_n(k))} where {C^n_m(k)\in \{1,2,\dots, N\}} is the label of the ancestor in generation {m} of individual {k} in generation {n}. We call {C^n(k)} the backward ancestral path of the {k}-th individual in generation {n}.

Backward ancestral paths C^4(2), C^4(4) and C^4(5).

Note that the distribution of {(C^n_{j-1}(1), C^n_{j-1}(2),\dots, C^n_{j-1}(N))} depends only on {(C^n_{j}(1), C^n_{j}(2),\dots, C^n_{j}(N))}.

For each {k\neq h} in {\{1,2,\dots, N\}}, we have

\displaystyle \mathbb{P}\left(C_{j-1}^n(k)\neq C_{j-1}^n(h)\ |\ C_{j}^n(k)\neq C_{j}^n(h) \right)= 1-\frac{1}{N}.

Note that {C_0^n(h)=C_0^n(k)} if and only if the backward ancestral paths {C^n(h)} and {C^n(k)} have coalesced in some generation {j} with {0\le j\le n}. We thus have

\displaystyle \mathbb{P}(C_0^n(h)\neq C_0^n(k))=\prod_{j=1}^n\mathbb{P}\left(C_{j-1}^n(k)\neq C_{j-1}^n(h)\ |\ C_{j}^n(k)\neq C_{j}^n(h) \right)= \left(1-\frac{1}{N} \right)^n.


\displaystyle \mathbb{P}(\tau>n) =\mathbb{P}(H_n\neq 0)= \left(1-\frac{1}{N} \right)^n H_0.


\displaystyle \mathbb{E}[\tau]=\sum_{n=0}^{N} \mathbb{P}(\tau>n) = \sum_{n=0}^{N}\left(1-\frac{1}{N} \right)^n H_0=N H_0.


Proposition 5. We have {\mathbb E[H_{Nt}]\to H_0 e^{-t}} as {N\rightarrow\infty}.

Proof: Note that

\displaystyle H_n=\frac{2}{N(N-1)}\left(\frac{N-1}{N}\right)^n M_n.

We thus have

\displaystyle \mathbb E[H_{Nt}]=\frac{2}{N(N-1)}\left(\frac{N-1}{N}\right)^n \mathbb E[M_{Nt}]=\frac{2 X_0(N-X_0)}{N(N-1)}\left(1-\frac{1}{N}\right)^{Nt} = H_0 \left(1-\frac{1}{N}\right)^{Nt}\rightarrow H_0 e^{-t}

as {N\rightarrow\infty}. \Box

Martingales, Probability Theory

Bellman’s optimality principle

Let us consider a betting game as follows. Let {(X_n)_{n\ge1}} be a sequence of i.i.d. random variables taking values in {\{1,-1\}} such that {\mathbb{P}(X_1=1)=p>\frac12} and {\mathbb{P}(X_1=-1)=1-p=:q>0}. Suppose that at the initial time your capital is {Y_0}. We start a sequence of betting rounds. At time {n}, you either win your stake if {X_n=1}, or you loose if {X_n=-1}. Let {Y_n} and {C_n} respectively stand for your capital and your stake at time {n}. Then {(Y_n)_{n\ge1}} can be recursively defined by

\displaystyle Y_{n}=Y_{n-1}+ C_n X_n=\left\{ \begin{matrix}Y_{n-1}+C_n& \text{if you win,}\\ Y_{n-1}-C_n & \text{if you loose}.\end{matrix}\right.

Let {\epsilon} be a fixed (small) number in (0,1). Assume that {0\le C_n\le (1-\epsilon)Y_{n-1}} and {C_n} is predictable, i.e. {C_n} is {\mathcal{F}_{n-1}}-measurable with {\mathcal{F}_n=\sigma(X_1,X_2,\dots, X_n)}. You certainly wish to find an optimal strategy of stakes to maximize your fortune. This is equivalent to the maximization of the expected interest rate given by

\displaystyle \mathbb{E}\left[\log\left(\frac{ Y_N}{Y_0}\right)\right],

where {N} is the length of the game. For {n\ge1}, set {R_n={C_n}/{Y_{n-1}}}, which is a {\mathcal{F}_{n-1}}-measurable random variable. The sequence {(R_1,R_2,...,R_N)} is called a strategy.

Proposition 1 Let

\displaystyle M_n=\log\left(\frac{Y_n}{Y_0}\right)-n\alpha

with {\alpha=p\log(2p)+q\log(2q)}. Then {(M_n)_{n\ge0}} is a supermartingale.

Proof: We fisrt show that {\mathbb{E}|M_n|<\infty} for all {n}. Indeed, we notice that

\displaystyle M_n=\log\left(\frac{ Y_n}{Y_{n-1}}\right)+\log\left(\frac{Y_{n-1}}{Y_0}\right)-n\alpha=\log(1+R_nX_n)+M_{n-1}-\alpha.

We also have {\epsilon \le 1+R_n X_n\le \log(2-\epsilon))} since {0\le R_n={C_n}/{Y_{n-1}}\le 1-\epsilon}. It immediately follows that

\displaystyle |M_n|\le |\log(1+R_nX_n)|+|M_{n-1}|+\alpha\le |\log(\epsilon )|+|\log(2-\epsilon )|+\alpha+|M_{n-1}| =\alpha+\log\left(\frac{2-\epsilon}{\epsilon}\right)+|M_{n-1}|.

Hence, by the principle of induction, we obtain

\displaystyle |M_n|\le n \left(\alpha+\log\left(\frac{2-\epsilon}{\epsilon}\right) \right).

and thus {\mathbb{E}|M_n|<\infty}. On the other hand, we have

\mathbb{E} [M_{n}|\mathcal F_{n-1}] =\mathbb{E}\left[\log(1+R_nX_n)|\mathcal F_{n-1}\right]-\alpha+M_{n-1}
=p\log\left( \frac{1+R_n}{2p}\right)+q\log\left( \frac{1-R_n}{2q}\right)+M_{n-1}.

Applying Jensen’s inequality {p\log(x)+(1-p)\log(y)\le \log(p x+ (1-p)y)}, we thus have

\displaystyle \mathbb{E} [M_{n}|\mathcal F_{n-1}]\le \log \left(p\cdot\frac{1+R_n}{2p}+q\cdot\frac{1-R_n}{2q}\right)+M_{n-1}=M_{n-1}.

Hence {(M_{n})_{n\ge1}} is a supermartingale. \Box

Proposition 2 We have that {(R_1,R_2,\dots,R_N)=(2p-1,\dots, 2p-1)} is an optimal strategy to maximize the expected interest rate.

Proof: Since {(M_{n})_{n\ge0}} is a supermartingale, we have {\mathbb{E}[M_N]\le M_0=0}. Therefore,

\displaystyle \mathbb{E}\left[\log\left(\frac{ Y_N}{Y_0}\right)\right]\le N\alpha.

We note that {(M_{n})_{n\ge0}} is a martingale if

\displaystyle p\log \left(\frac{1+R_n}{2p}\right)+q\log\left(\frac{1-R_n}{2q}\right)=0

for all {n}. This is equivalent to {R_n=2p-1}. Choosing {R_n=2p-1} for all {1\le n\le N}, we must have that {\mathbb{E}[M_N]=M_0=0}. In this case, we obtain

\displaystyle \mathbb{E}\left[\log\left(\frac{ Y_N}{Y_0}\right)\right]= N\alpha.

Hence, {(2p-1,\dots, 2p-1)} is an optimal strategy. \Box