- Book
- Coalescent Theory — An Introduction
- Author
- John Wakeley
- Publisher
- Roberts & Company
- Errata
- PDF
- Presenter
- 岩嵜航
- Date
- 2015-03-20
2. Probability Theory
2.1 Fundamentals
2.1.1 Events, Probabilities, and Random Variables
2.1.2 Probability Distributions
2.2 Poisson Processes
The backbone of the neutral coalescent:
- Poisson distribution
  - the number of events that occur over a fixed period of time
- Exponential distribution
  - waiting time until the first event occurs
The Poisson process is a counting process.
\[\begin{aligned}
P[K(t) = 1] &= \lambda t + o(t) \\
P[K(t) \ge 2] &= o(t)
\end{aligned}\]
- where
  - $K(t)$ is the number of events observed by time $t$, with $K(0) = 0$
  - $\lambda$ is the rate of occurrence per unit time
  - $o(t)$ is a term that goes to zero faster than $t$ (i.e., $o(t)/t \to 0$ as $t \to 0$)
- These imply that, within a sufficiently short period of time $\delta t$,
  - $P[K(\delta t) = 1] \approx \lambda \delta t$
  - $P[K(\delta t) \ge 2]$ is negligible (i.e., two events do not occur at the same time)
The number of events from time 0 to $t$
(or from an arbitrary starting time $s$ to $s + t$) is Poisson distributed,
for $k = 0, 1, 2, …$,
\[\begin{aligned}
P[K(t) = k] \;&=\; \frac {(\lambda t)^k} {k!} e^{-\lambda t} \\
P[K(t+s) - K(s) = k] \;&=\; P[K(t) = k]
\end{aligned}\]
(Eq. 2.53)
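As a quick numerical check (not from the book), here is a minimal NumPy sketch: dividing $[0, t]$ into many short steps, each with event probability $\approx \lambda \delta t$, reproduces the Poisson counts of Eq. 2.53. The rate, horizon, and step size are illustrative.

```python
# Minimal sketch (not from the book): a Bernoulli trial with success
# probability lambda * dt in each of t/dt short steps approximates K(t).
import numpy as np
from math import exp, factorial

rng = np.random.default_rng(0)
lam, t, dt = 2.0, 3.0, 0.001        # illustrative rate, horizon, step size
n_steps = int(t / dt)

# K(t) for each replicate; Binomial(n_steps, lam * dt) -> Poisson(lam * t)
k = rng.binomial(n_steps, lam * dt, size=100_000)

for i in range(8):                  # compare with the Poisson pmf (Eq. 2.53)
    pmf = (lam * t) ** i / factorial(i) * exp(-lam * t)
    print(i, (k == i).mean(), round(pmf, 4))
```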
Waiting time to the first event is exponentially distributed,
\[
f_T(t) = \lambda e^{-\lambda t}
\]
(Eq. 2.54)
- memoryless
  - The number of coin tosses required to observe the next heads is independent of previous results.
The waiting times between successive events are i.i.d. (independent and identically distributed).
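A sketch of the memoryless property, with illustrative values for $\lambda$, $s$, and $t$:

```python
# Sketch: waiting times are Exp(lambda) and memoryless:
# P[T > s + t | T > s] = P[T > t] = e^{-lambda t}.
import numpy as np

rng = np.random.default_rng(1)
lam, s, t = 2.0, 0.5, 1.0                      # illustrative values
T = rng.exponential(1 / lam, size=1_000_000)   # numpy parameterizes by mean 1/lambda

p_uncond = (T > t).mean()                      # P[T > t]
p_cond = (T > s + t).sum() / (T > s).sum()     # P[T > s + t | T > s]
print(p_uncond, p_cond, np.exp(-lam * t))      # all three should agree
```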
The waiting time $W$ until the $n$th event
(= the sum of $n$ i.i.d. waiting times)
can be derived by $n - 1$ successive convolutions:
\[\begin{aligned}
f_{W,1}(t) &= f_T(t) = \lambda e^{-\lambda t} \\
f_{W,2}(t) &= \int _0^t \lambda e^{-\lambda (t-t')} f _{W,1}(t') dt' \\
&= \lambda ^2 e^{-\lambda t} \left[t'\right]_0^t \\
&= t\lambda^2 e^{-\lambda t} \\
f_{W,3}(t) &= \int _0^t \lambda e^{-\lambda (t-t')} f _{W,2}(t') dt' \\
&= \frac {t^2 \lambda^3 e^{-\lambda t}} {2!} \\
&\;\vdots \\
f_{W,n}(t) &= \lambda e^{-\lambda t} \frac{(\lambda t)^{n-1}} {(n - 1)!}
\end{aligned}\]
This is the gamma distribution.
(Eq. 2.55)
Because the $T_i$ are i.i.d., the mean and the variance are
\[\begin{aligned}
\operatorname{E}[W] &= \sum_{i=1}^n \operatorname{E}[T_i] \\
&= \sum_{i=1}^n \frac 1 \lambda \\
&= \frac n \lambda \\
\operatorname{Var}[W] &= \sum_{i=1}^n \operatorname{Var}[T_i] \\
&= \sum_{i=1}^n \frac 1 {\lambda^2} \\
&= \frac n {\lambda^2}
\end{aligned}\]
These results hold even when $n$ is not an integer,
if we replace the factorial $(n - 1)!$ with the gamma function
$\Gamma(n) = \int_0^\infty x^{n-1} e^{-x} dx$.
(Eq. 2.57)
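A simulation sketch of Eqs. 2.55–2.57: summing $n$ i.i.d. exponential waiting times and comparing the sample mean, variance, and density against the gamma formulas. The shape, rate, and evaluation point are illustrative.

```python
# Sketch: W = sum of n i.i.d. Exp(lambda) waiting times is gamma distributed.
import numpy as np
from math import exp, factorial

rng = np.random.default_rng(2)
n, lam = 5, 2.0                                 # illustrative shape and rate
W = rng.exponential(1 / lam, size=(500_000, n)).sum(axis=1)

print(W.mean(), n / lam)                        # E[W]   = n / lambda
print(W.var(), n / lam**2)                      # Var[W] = n / lambda^2

t, h = 2.0, 0.05                                # density estimate near t
density = ((t - h < W) & (W < t + h)).mean() / (2 * h)
f = lam * exp(-lam * t) * (lam * t) ** (n - 1) / factorial(n - 1)
print(density, f)                               # ~= f_{W,n}(t) of Eq. 2.55
```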
The coalescent considers events
that each have a very small probability of occurring in any single generation:
- coalescence (common ancestor event)
- mutation
- migration
They will each form a Poisson process.
2.2.1 Poisson Process Results for the Coalescent
The Sum of Independent Poissons
- Two independent Poisson random variables:
  - $X_1$ with occurrence rate $\lambda _1$
  - $X_2$ with occurrence rate $\lambda _2$
(Eq. 2.58)
The distribution of $Y = X_1 + X_2$ can be obtained by convolution:
\[\begin{aligned}
P[Y=k] \;&=\; \sum _{i=0}^k P[X_1=i] \; P[X_2=k-i] \\
&=\; \sum _{i=0}^k \frac {\lambda _1^i} {i!} e^{-\lambda _1}
\frac {\lambda _2^{k-i}} {(k-i)!} e^{-\lambda _2} \\
&=\; e^{-\lambda _1} e^{-\lambda _2}
\sum _{i=0}^k \frac {\lambda _1^i} {i!}
\frac {\lambda _2^{k-i}} {(k-i)!} \frac {k!}{k!} \\
&=\; \frac {e^{-(\lambda _1 + \lambda _2)}} {k!}
\sum _{i=0}^k {k \choose i} \lambda _1^i \lambda _2^{k-i} \\
&=\; \frac {e^{-(\lambda _1 + \lambda _2)}} {k!} (\lambda _1 + \lambda _2)^k \\
&=\; \text{Poisson distribution with occurrence rate } \lambda _1 + \lambda _2
\end{aligned}\]
The sum of independent Poisson processes is another Poisson process.
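A one-line check of Eq. 2.58 (illustrative rates): samples of $X_1 + X_2$ match the Poisson pmf with rate $\lambda_1 + \lambda_2$.

```python
# Sketch: the sum of two independent Poissons is Poisson(lam1 + lam2).
import numpy as np
from math import exp, factorial

rng = np.random.default_rng(3)
lam1, lam2 = 1.5, 0.7                           # illustrative rates
y = rng.poisson(lam1, 1_000_000) + rng.poisson(lam2, 1_000_000)

for k in range(6):                              # compare with Poisson(lam1 + lam2)
    pmf = (lam1 + lam2) ** k / factorial(k) * exp(-(lam1 + lam2))
    print(k, (y == k).mean(), round(pmf, 4))
```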
The Probability that the First Event Is of a Particular Type
(Eq. 2.60)
The probability that $X_1$ is observed before $X_2$
is given simply by the relative rate of the event
(i.e., as a fraction of the total rate):
\[\begin{aligned}
P[T_1 < T_2]
&= \int _0^\infty P[T_2>t]\; f _{T_1}(t) dt \\
&= \int _0^\infty \left(e^{-\lambda _2 t} \right) _\text{Eq. 2.59}\;
\lambda _1 e^{-\lambda _1 t} dt \\
&= \lambda _1 \int _0^\infty e^{-(\lambda _1 + \lambda _2) t} dt \\
&= \lambda _1 \left[-\frac {e^{-(\lambda _1 + \lambda _2)t}}
{\lambda _1 + \lambda _2} \right]_0^\infty \\
&= \frac {\lambda _1} {\lambda _1 + \lambda _2},
\end{aligned}\]
using (Eq. 2.59)
\[\begin{aligned}
P[T>t] \;&=\; \int _t^\infty \lambda e^{-\lambda s} ds \\
&=\; \left[-e^{-\lambda s} \right]_t^\infty \\
&=\; e^{-\lambda t}.
\end{aligned}\]
The Time to the First Event among Independent Poissons
(Eq. 2.61)
The distribution of $T = \min(T_1, T_2)$
\[\begin{aligned}
P[T>t] \;&=\; P[\min(T_1, T_2) > t] \\
&=\; P[T_1 > t \;\cap\; T_2 > t] \\
&=\; P[T_1 > t]\; P[T_2 > t] \\
&=\; e^{-\lambda _1 t} e^{-\lambda _2 t} \\
&=\; e^{-(\lambda _1 + \lambda _2) t}
\end{aligned}\]
Therefore, $f_ {\min(T_1, T_2)}(t) = (\lambda _1 + \lambda _2) e^{-(\lambda _1 + \lambda _2)t}$.
There is a one-to-one correspondence between the cumulative distribution function $P[T \le t]$ and the probability density $f_T(t)$.
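A sketch of Eq. 2.61: the minimum of two independent exponentials behaves as a single exponential with the total rate (illustrative rates).

```python
# Sketch: min(T1, T2) ~ Exp(lam1 + lam2).
import numpy as np

rng = np.random.default_rng(5)
lam1, lam2 = 1.0, 3.0                           # illustrative rates
t_min = np.minimum(rng.exponential(1 / lam1, 1_000_000),
                   rng.exponential(1 / lam2, 1_000_000))

print(t_min.mean(), 1 / (lam1 + lam2))          # mean of Exp(lam1 + lam2)
t = 0.3
print((t_min > t).mean(), np.exp(-(lam1 + lam2) * t))
```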
The Number of Events Required to See a Particular Outcome
$X_2, X_2, X_2, …, X_2, \boldsymbol{X_1}$, …
(Eq. 2.62)
How many $X_2$ (e.g., mutation events) occur
before $X_1$ (e.g., common ancestor event)?
\[\begin{aligned}
P[K=k] \;&=\; P[\text{first }X_1\text{ occurs at the }k\text{th trial}] \\
&=\; P[X_2\text{ occurs }k - 1\text{ times first, then }X_1\text{ occurs}] \\
&=\; \left(\frac {\lambda _2} {\lambda _1 + \lambda _2} \right)^{k-1}
\frac {\lambda _1} {\lambda _1 + \lambda _2}
\end{aligned}\]
This is the geometric distribution with success probability
$p = \frac {\lambda _1} {\lambda _1 + \lambda _2}$.
The process is just like a series of Bernoulli trials with probability of success
$p = \frac {\lambda _1} {\lambda _1 + \lambda _2}$.
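A sketch of the Bernoulli-trial view (illustrative rates; the per-replicate cap on trials is a simulation convenience, generous enough that overflow is negligible):

```python
# Sketch: each event of the combined process is X1 with probability
# p = lam1 / (lam1 + lam2); the index of the first X1 is geometric.
import numpy as np

rng = np.random.default_rng(6)
lam1, lam2 = 1.0, 3.0
p = lam1 / (lam1 + lam2)
n_reps, cap = 100_000, 100          # cap: generous bound on trials per replicate

is_x1 = rng.random((n_reps, cap)) < p
k = is_x1.argmax(axis=1) + 1        # 1-based index of the first X1

for i in range(1, 6):               # compare with the geometric pmf (Eq. 2.62)
    print(i, (k == i).mean(), (1 - p) ** (i - 1) * p)
```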
Tying All This Together: A Filtered Poisson Process
Reinterpret $f_T(t)$, the density of the waiting time to the first $X_1$ event,
with the sum rule and product rule
(see Eq. 2.7, 2.8).
\[\begin{aligned}
f_T(t)\; &=\; \sum _{k=1}^\infty f_T(t \mid K=k)\; P[K=k] \\
&=\; \sum _{k=1}^\infty (\text{Eq. }2.55) (\text{Eq. }2.62) \\
&=\; \sum _{k=1}^\infty
(\lambda _1 + \lambda _2) e^{-(\lambda _1 + \lambda _2)t}
\frac {\{(\lambda _1 + \lambda _2)t\}^{k-1}} {(k-1)!}\;
\left(\frac {\lambda _2} {\lambda _1 + \lambda _2} \right)^{k-1}
\frac {\lambda _1} {\lambda _1 + \lambda _2} \\
&=\; \lambda _1 e^{-(\lambda _1 + \lambda _2)t}\;
\sum _{k=1}^\infty \frac {(\lambda _2 t)^{k-1}} {(k-1)!} \\
&=\; \lambda _1 e^{-(\lambda _1 + \lambda _2)t}\;
\sum _{k=0}^\infty \frac {(\lambda _2 t)^k} {k!} \\
&=\; \lambda _1 e^{-(\lambda _1 + \lambda _2)t}\; e^{\lambda _2 t} \\
&=\; \lambda _1 e^{-\lambda _1 t}
\end{aligned}\]
where we used the Taylor series of $e^x$,
\[\begin{aligned}
e^x \;&=\; 1 + \frac x 1 + \frac {x^2} {2!} + \frac {x^3} {3!} + \cdots \\
&=\; \sum _{k=0}^\infty \frac {x^k} {k!}
\end{aligned}\]
- Filtered Poisson process = Poisson process with rate $\lambda p \;(= \lambda _1)$
  - total occurrence rate $\lambda = \lambda _1 + \lambda _2$
  - acceptance rate (proportion of the focal event)
    $p = \frac {\lambda _1} {\lambda _1 + \lambda _2}$
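A sketch of the filtering ("thinning") argument: run one process at the total rate and accept each event with probability $p$; the waiting time to the first accepted event comes out Exp($\lambda_1$). Rates and the event cap are illustrative.

```python
# Sketch: thinning a Poisson process.  Events occur at rate lam = lam1 + lam2;
# each is accepted with probability p = lam1 / lam.  The time to the first
# accepted event should be exponential with rate lam1.
import numpy as np

rng = np.random.default_rng(7)
lam1, lam2 = 1.0, 3.0
lam, p = lam1 + lam2, lam1 / (lam1 + lam2)
n_reps, cap = 50_000, 60            # cap: generous bound on events per replicate

gaps = rng.exponential(1 / lam, (n_reps, cap))   # i.i.d. Exp(lam) waiting times
times = gaps.cumsum(axis=1)                      # event times of the combined process
accepted = rng.random((n_reps, cap)) < p
t_first = times[np.arange(n_reps), accepted.argmax(axis=1)]

print(t_first.mean(), 1 / lam1)                  # Exp(lam1) has mean 1 / lam1
t = 0.5
print((t_first > t).mean(), np.exp(-lam1 * t))
```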
2.2.2 Convolutions of Exponential Distributions
The sum of the waiting times of $n$ events:
- If the rate does not change with each event ($\lambda _i = \lambda \text{ for all } i$),
  - → gamma distributed (Eq. 2.55)
- If the rate changes with each event ($\lambda _i \neq \lambda _j \text{ for } i \neq j$),
  - (e.g., in the study of genealogies, the rate of coalescence changes every time a coalescent event occurs)
  - → obtained by successive convolutions of exponential distributions (Eq. 2.63, 2.64)
\[\begin{aligned}
f_{T_1 + T_2}(t)
&= \int _0^t f_{T_1}(s)\; f_{T_2}(t-s)ds \\
&= \int _0^t \lambda _1 e^{-\lambda _1 s}\; \lambda _2 e^{-\lambda _2 (t-s)}ds \\
&= \lambda _1 \lambda _2 e^{-\lambda _2 t} \int _0^t e^{-(\lambda _1 -\lambda _2)s}ds \\
&= \lambda _1 \lambda _2 e^{-\lambda _2 t}
\left[-\frac 1{\lambda _1 - \lambda _2} e^{-(\lambda _1 -\lambda _2)s} \right]_0^t \\
&= \frac {\lambda _1} {\lambda _1 - \lambda _2} \lambda _2 e^{-\lambda _2 t}
\left(1 - e^{-(\lambda _1 -\lambda _2)t} \right) \\
&= \frac {\lambda _1} {\lambda _1 - \lambda _2} \lambda _2 e^{-\lambda _2 t} +
\frac {\lambda _2} {\lambda _2 - \lambda _1} \lambda _1 e^{-\lambda _1 t} \\
&= \frac {\lambda _1} {\lambda _1 - \lambda _2} f _{T_2}(t) +
\frac {\lambda _2} {\lambda _2 - \lambda _1} f _{T_1}(t) \\
(&= \text{weighted sum of the original distributions})\\[1ex]
f_{T_1 + T_2 + T_3}(t)
&= \int _0^t f_{T_1 + T_2}(s)\; f_{T_3}(t-s)ds \\
&= \frac {\lambda _2} {\lambda _2 - \lambda _1}
\frac {\lambda _3} {\lambda _3 - \lambda _1} \lambda _1 e^{-\lambda _1 t} +
\frac {\lambda _3} {\lambda _3 - \lambda _2}
\frac {\lambda _1} {\lambda _1 - \lambda _2} \lambda _2 e^{-\lambda _2 t} +
\frac {\lambda _1} {\lambda _1 - \lambda _3}
\frac {\lambda _2} {\lambda _2 - \lambda _3} \lambda _3 e^{-\lambda _3 t} \\[1ex]
f_{T_1 + T_2 + T_3 + T_4}(t) &= \cdots \\
&\;\vdots \\
f_{\sum _{i=1}^n T_i}(t)
&= \sum_{i=1}^n \lambda _i e^{-\lambda _i t}
\prod _{j=1,\;j \neq i}^n \frac {\lambda _j} {\lambda _j - \lambda _i}
\end{aligned}\]
We will use this to obtain the distribution of the total waiting time
to the MRCA of the entire sample (Eq. 3.27).
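A sketch checking Eq. 2.64 numerically, with illustrative pairwise-distinct rates (mimicking coalescence rates that drop after each event); `hypoexp_pdf` is a hypothetical helper written for this note, not from the book.

```python
# Sketch: the sum of exponentials with distinct rates follows Eq. 2.64.
import numpy as np

rng = np.random.default_rng(8)
lams = [6.0, 3.0, 1.0]                 # illustrative, pairwise distinct rates

def hypoexp_pdf(t, lams):
    """Closed-form density of sum_i T_i, T_i ~ Exp(lams[i]) (Eq. 2.64)."""
    total = 0.0
    for i, li in enumerate(lams):
        coef = np.prod([lj / (lj - li) for j, lj in enumerate(lams) if j != i])
        total += coef * li * np.exp(-li * t)
    return total

W = sum(rng.exponential(1 / l, 1_000_000) for l in lams)
t, h = 1.0, 0.05                       # density estimate near t
density = ((t - h < W) & (W < t + h)).mean() / (2 * h)
print(density, hypoexp_pdf(t, lams))   # should roughly agree
```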