# Poisson and Geometric Distributions
The Poisson and geometric distributions model discrete random variables arising from counting
processes. The Poisson distribution counts the number of rare events in a fixed interval, while the
geometric distribution counts the number of trials until the first success.
## Board Coverage

| Board | Paper | Notes |
|---|---|---|
| AQA | Paper 2 | Both Poisson and geometric in depth |
| Edexcel | S2, S3 | Poisson in S2; geometric in S3 |
| OCR (A) | Paper 2 | Poisson and geometric |
| CIE (9231) | S2 | Poisson covered; geometric not required |
:::info
The formula booklet provides the Poisson PMF. You must know when to apply each distribution and how to carry out hypothesis testing with discrete distributions. The geometric distribution has two common conventions for the support: $r = 1, 2, 3, \ldots$ (number of trials) or $r = 0, 1, 2, \ldots$ (number of failures). AQA uses $r = 1, 2, \ldots$
:::
## 1. The Poisson Distribution

### 1.1 Definition

**Definition.** A discrete random variable $X$ follows a Poisson distribution with parameter $\lambda$ (where $\lambda > 0$), written $X \sim \mathrm{Po}(\lambda)$, if

$$P(X = r) = \frac{e^{-\lambda}\lambda^r}{r!}, \quad r = 0, 1, 2, \ldots$$
The Poisson distribution models the number of events occurring in a fixed interval of time or space when:

- Events occur independently
- Events occur at a constant average rate $\lambda$
- The probability of more than one event in a sufficiently small interval is negligible
### 1.2 Derivation as a Limit of the Binomial

**Theorem.** If $n \to \infty$ and $p \to 0$ such that $np = \lambda$ remains constant, then $B(n, p) \to \mathrm{Po}(\lambda)$.

**Proof.**

$$\begin{aligned}
P(X = r) &= \binom{n}{r}p^r(1-p)^{n-r} \\
&= \frac{n(n-1)\cdots(n-r+1)}{r!}\cdot\frac{\lambda^r}{n^r}\cdot\left(1-\frac{\lambda}{n}\right)^{n-r}
\end{aligned}$$

Consider each factor as $n \to \infty$:

- $\dfrac{n(n-1)\cdots(n-r+1)}{n^r} \to 1$ since each of the $r$ factors tends to 1
- $\left(1-\dfrac{\lambda}{n}\right)^{n-r} = \left(1-\dfrac{\lambda}{n}\right)^n \cdot \left(1-\dfrac{\lambda}{n}\right)^{-r} \to e^{-\lambda} \cdot 1 = e^{-\lambda}$

Therefore:

$$P(X = r) \to \frac{1}{r!}\cdot\lambda^r \cdot e^{-\lambda} = \frac{e^{-\lambda}\lambda^r}{r!} \quad \blacksquare$$
### 1.3 Proof that $E(X) = \lambda$

**Proof.**

$$\begin{aligned}
E(X) &= \sum_{r=0}^{\infty}r\cdot\frac{e^{-\lambda}\lambda^r}{r!} = \sum_{r=1}^{\infty}\frac{e^{-\lambda}\lambda^r}{(r-1)!} \\
&= \lambda e^{-\lambda}\sum_{r=1}^{\infty}\frac{\lambda^{r-1}}{(r-1)!} = \lambda e^{-\lambda}\sum_{k=0}^{\infty}\frac{\lambda^k}{k!} \\
&= \lambda e^{-\lambda}\cdot e^{\lambda} = \lambda \quad \blacksquare
\end{aligned}$$

### 1.4 Proof that $\mathrm{Var}(X) = \lambda$

**Proof.** First compute $E(X(X-1))$:

$$\begin{aligned}
E(X(X-1)) &= \sum_{r=2}^{\infty}r(r-1)\frac{e^{-\lambda}\lambda^r}{r!} = \sum_{r=2}^{\infty}\frac{e^{-\lambda}\lambda^r}{(r-2)!} \\
&= \lambda^2 e^{-\lambda}\sum_{k=0}^{\infty}\frac{\lambda^k}{k!} = \lambda^2 e^{-\lambda}\cdot e^{\lambda} = \lambda^2
\end{aligned}$$

Since $E(X^2) = E(X(X-1)) + E(X) = \lambda^2 + \lambda$:

$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \lambda^2 + \lambda - \lambda^2 = \lambda \quad \blacksquare$$
$$\boxed{E(X) = \mathrm{Var}(X) = \lambda}$$

This is a key property of the Poisson distribution: the mean equals the variance.
### 1.5 Additivity of Poisson Distributions

If $X \sim \mathrm{Po}(\lambda)$ and $Y \sim \mathrm{Po}(\mu)$ are independent, then

$$\boxed{X + Y \sim \mathrm{Po}(\lambda + \mu)}$$
### 1.6 Cumulative Probabilities

Cumulative Poisson probabilities are found using:

$$P(X \leq r) = \sum_{k=0}^{r}\frac{e^{-\lambda}\lambda^k}{k!}$$

These are typically obtained from tables or a calculator. Key relationships:

- $P(X > r) = 1 - P(X \leq r)$
- $P(a \leq X \leq b) = P(X \leq b) - P(X \leq a-1)$
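These identities are easy to check numerically by summing the PMF term by term, just as the printed tables were produced; a minimal Python sketch (function names are my own):

```python
from math import exp, factorial

def poisson_pmf(lam, r):
    """P(X = r) for X ~ Po(lam)."""
    return exp(-lam) * lam**r / factorial(r)

def poisson_cdf(lam, r):
    """P(X <= r), obtained by summing the PMF as the tables do."""
    return sum(poisson_pmf(lam, k) for k in range(r + 1))

# P(X > r) = 1 - P(X <= r)
p_more_than_6 = 1 - poisson_cdf(3.2, 6)

# P(a <= X <= b) = P(X <= b) - P(X <= a - 1)
p_2_to_5 = poisson_cdf(3.2, 5) - poisson_cdf(3.2, 1)
```

With $\lambda = 3.2$ this reproduces the table value $P(X \leq 6) = 0.9554$ used later in the hypothesis-testing example.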
### 1.7 Poisson Hypothesis Testing

The procedure mirrors binomial hypothesis testing:

1. Define $X$ and state $X \sim \mathrm{Po}(\lambda_0)$ under $H_0$
2. State $H_0: \lambda = \lambda_0$ and $H_1$
3. State the significance level $\alpha$
4. Find the critical region
5. Compare the observed value
6. Conclude in context
**Example.** A call centre receives an average of 3.2 calls per minute. In a particular minute, 7 calls are received. Test at the 5% significance level whether the rate has increased.

$X \sim \mathrm{Po}(3.2)$. $H_0: \lambda = 3.2$, $H_1: \lambda > 3.2$.

$P(X \geq 7) = 1 - P(X \leq 6) = 1 - 0.9554 = 0.0446 < 0.05$.

Reject $H_0$. There is sufficient evidence that the rate has increased.
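The $p$-value in this example can be scripted; a sketch assuming a one-tailed test with $H_1: \lambda > \lambda_0$ (names are illustrative):

```python
from math import exp, factorial

def poisson_p_value_upper(lam0, observed):
    """p-value for H1: lambda > lam0, i.e. P(X >= observed) under H0."""
    cdf = sum(exp(-lam0) * lam0**k / factorial(k) for k in range(observed))
    return 1 - cdf

# Call-centre example: lambda_0 = 3.2, observed 7 calls
p_val = poisson_p_value_upper(3.2, 7)  # about 0.0446, so reject at the 5% level
```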
**Example.** Find the critical region for a two-tailed test at the 5% level with $X \sim \mathrm{Po}(5)$.

Lower tail: $P(X \leq 0) = e^{-5} \approx 0.0067 \leq 0.025$, but $P(X \leq 1) = 0.0404 > 0.025$. So the lower tail is $X \leq 0$.

Upper tail: $P(X \geq 10) = 1 - 0.9682 = 0.0318 > 0.025$, so 10 cannot be included. $P(X \geq 11) = 1 - 0.9863 = 0.0137 \leq 0.025$. So the upper tail is $X \geq 11$.

Critical region: $X \leq 0$ or $X \geq 11$.
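The same tail search can be automated for any mean; a sketch, assuming the convention above of allocating at most 2.5% to each tail:

```python
from math import exp, factorial

def poisson_cdf(lam, r):
    return sum(exp(-lam) * lam**k / factorial(k) for k in range(r + 1))

def two_tailed_critical_region(lam, alpha=0.05):
    """Return (c_low, c_high): reject H0 when X <= c_low or X >= c_high."""
    half = alpha / 2
    # Lower tail: largest c with P(X <= c) <= alpha/2 (None if no such c)
    c_low, c = None, 0
    while poisson_cdf(lam, c) <= half:
        c_low, c = c, c + 1
    # Upper tail: smallest c with P(X >= c) <= alpha/2
    c_high = 1
    while 1 - poisson_cdf(lam, c_high - 1) > half:
        c_high += 1
    return c_low, c_high
```

For $\mathrm{Po}(5)$ this returns $(0, 11)$, matching the region found by hand.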
## 2. The Geometric Distribution

### 2.1 Definition

**Definition.** A discrete random variable $X$ follows a geometric distribution with parameter $p$ (where $0 < p \leq 1$), written $X \sim \mathrm{Geo}(p)$, if $X$ is the number of the trial on which the first success occurs:

$$P(X = r) = (1-p)^{r-1}p, \quad r = 1, 2, 3, \ldots$$

Each trial is independent with probability $p$ of success.
### 2.2 Proof that $E(X) = \frac{1}{p}$

**Proof.**

$$E(X) = \sum_{r=1}^{\infty}r\,q^{r-1}p \quad \text{where } q = 1-p$$

Let $S = \sum_{r=1}^{\infty}r\,q^{r-1}$. Recall the geometric series

$$\sum_{r=0}^{\infty}q^r = \frac{1}{1-q} \quad \text{for } |q| < 1.$$

Differentiating both sides with respect to $q$:

$$\sum_{r=1}^{\infty}rq^{r-1} = \frac{1}{(1-q)^2}$$

Therefore:

$$E(X) = p \cdot \frac{1}{(1-q)^2} = p \cdot \frac{1}{p^2} = \frac{1}{p} \quad \blacksquare$$
### 2.3 Proof that $\mathrm{Var}(X) = \frac{1-p}{p^2}$

**Proof.** First compute $E(X^2) = E(X(X-1)) + E(X)$.

$$E(X(X-1)) = \sum_{r=2}^{\infty}r(r-1)q^{r-1}p = p\,q\sum_{r=2}^{\infty}r(r-1)q^{r-2}$$

Starting from $\sum_{r=0}^{\infty}q^r = \frac{1}{1-q}$ and differentiating twice:

$$\sum_{r=2}^{\infty}r(r-1)q^{r-2} = \frac{2}{(1-q)^3}$$

So $E(X(X-1)) = p\,q\cdot\dfrac{2}{(1-q)^3} = p\,q\cdot\dfrac{2}{p^3} = \dfrac{2q}{p^2}$.

$$\begin{aligned}
E(X^2) &= \frac{2q}{p^2} + \frac{1}{p} = \frac{2q + p}{p^2} = \frac{2(1-p) + p}{p^2} = \frac{2-p}{p^2} \\
\mathrm{Var}(X) &= E(X^2) - [E(X)]^2 = \frac{2-p}{p^2} - \frac{1}{p^2} = \frac{1-p}{p^2} \quad \blacksquare
\end{aligned}$$

$$\boxed{E(X) = \frac{1}{p}, \qquad \mathrm{Var}(X) = \frac{1-p}{p^2}}$$
### 2.4 The Memoryless Property

**Theorem.** The geometric distribution is the only discrete memoryless distribution:

$$P(X > m + n \mid X > m) = P(X > n)$$

**Proof.**

$$\begin{aligned}
P(X > m + n \mid X > m) &= \frac{P(X > m+n \text{ and } X > m)}{P(X > m)} \\
&= \frac{P(X > m+n)}{P(X > m)} \quad \text{(since } X > m+n \implies X > m\text{)} \\
&= \frac{1 - P(X \leq m+n)}{1 - P(X \leq m)}
\end{aligned}$$

Now $P(X \leq k) = \sum_{r=1}^{k}q^{r-1}p = p\cdot\dfrac{1-q^k}{1-q} = 1 - q^k$. Therefore:

$$\frac{1 - (1-q^{m+n})}{1 - (1-q^m)} = \frac{q^{m+n}}{q^m} = q^n = 1 - (1-q^n) = P(X > n) \quad \blacksquare$$

:::info
Given that you have already waited $m$ trials without a success, the probability of waiting at least $n$ more trials is exactly the same as if you were starting fresh. The process "forgets" its history.
:::
### 2.5 Cumulative Distribution Function

$$P(X \leq r) = 1 - q^r = 1 - (1-p)^r$$
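The closed form can be verified against direct summation of the PMF, along with the memoryless property; a short sketch (function names are illustrative):

```python
from math import isclose

def geom_pmf(p, r):
    """P(X = r): first success on trial r, for r = 1, 2, ..."""
    return (1 - p) ** (r - 1) * p

def geom_cdf(p, r):
    """Closed form P(X <= r) = 1 - (1-p)^r."""
    return 1 - (1 - p) ** r

p = 1 / 6
# Closed form agrees with summing the PMF term by term
assert isclose(sum(geom_pmf(p, r) for r in range(1, 11)), geom_cdf(p, 10))

# Memoryless: P(X > m+n | X > m) = P(X > n), using P(X > k) = (1-p)^k
m, n = 5, 3
assert isclose((1 - p) ** (m + n) / (1 - p) ** m, (1 - p) ** n)
```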
### 2.6 Geometric Hypothesis Testing

**Example.** A bag contains red and blue balls. The probability of drawing a red ball is $p$. In an experiment, the first red ball is drawn on the 10th draw. Test at the 5% level whether $p = 0.3$.

$X \sim \mathrm{Geo}(0.3)$ under $H_0$. $H_0: p = 0.3$, $H_1: p < 0.3$ (the first red ball took longer than expected, so $p$ may be smaller).

$p$-value $= P(X \geq 10) = (1-0.3)^{10-1} = 0.7^9 \approx 0.0404 < 0.05$.

Reject $H_0$. There is sufficient evidence that $p < 0.3$.

**Critical region approach.** For $H_1: p < 0.3$ at the 5% level, find $c$ such that $P(X \geq c) \leq 0.05$:

$P(X \geq 9) = 0.7^8 \approx 0.0576 > 0.05$ and $P(X \geq 10) = 0.7^9 \approx 0.0404 < 0.05$.

Critical region: $X \geq 10$.
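The critical-value search for this lower-tail geometric test can be expressed directly, since $P(X \geq c) = (1-p)^{c-1}$; a sketch with illustrative names:

```python
def geom_tail(p, c):
    """P(X >= c) = (1-p)**(c-1) for X ~ Geo(p)."""
    return (1 - p) ** (c - 1)

def critical_value(p, alpha=0.05):
    """Smallest c with P(X >= c) <= alpha (reject H0 when X >= c)."""
    c = 1
    while geom_tail(p, c) > alpha:
        c += 1
    return c
```

For $p = 0.3$ this returns 10, matching the region $X \geq 10$ above.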
## 3. Modelling with Poisson and Geometric Distributions

### 3.1 When to Use Each

| Situation | Distribution |
|---|---|
| Number of events in a fixed interval; rare events | Poisson $\mathrm{Po}(\lambda)$ |
| Number of trials until the first success | Geometric $\mathrm{Geo}(p)$ |
| Fixed number of trials, counting successes | Binomial $B(n, p)$ |
### 3.2 Poisson as an Approximation to the Binomial

When $n$ is large and $p$ is small such that $np \leq 10$:

$$B(n, p) \approx \mathrm{Po}(np)$$

**Example.** $X \sim B(200, 0.02)$. Then $\lambda = np = 4$, so $X \approx \mathrm{Po}(4)$.

$$P(X \leq 2) \approx e^{-4}\left(1 + 4 + \frac{16}{2}\right) = 13e^{-4} \approx 0.2381$$
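A quick numerical comparison shows how close the approximation is for these parameters; a sketch:

```python
from math import comb, exp, factorial

n, p = 200, 0.02
lam = n * p  # 4.0

# Exact binomial and approximate Poisson values of P(X <= 2)
binom_le2 = sum(comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(3))
poisson_le2 = sum(exp(-lam) * lam**r / factorial(r) for r in range(3))
# binom_le2 is about 0.235, poisson_le2 about 0.238: close agreement
```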
### 3.3 Conditions Check

Before applying the Poisson distribution, verify:

- Events occur at a constant average rate
- Events are independent
- At most one event can occur in a sufficiently small sub-interval

:::warning
Do not confuse the conditions for this Poisson approximation with those for the normal approximation to the binomial, which requires $np > 5$ and $n(1-p) > 5$.
:::
## Problems

**Problem 1.** A factory produces items with defects occurring at an average rate of 2.5 per hour. Find the probability of exactly 4 defects in a given hour, and the probability of more than 6 defects in a 2-hour period.
**Solution 1.** For one hour: $X \sim \mathrm{Po}(2.5)$.

$P(X=4) = \dfrac{e^{-2.5}(2.5)^4}{4!} = \dfrac{0.08209 \times 39.0625}{24} \approx 0.1336$.

For two hours: $Y \sim \mathrm{Po}(5)$ (by additivity).

$P(Y > 6) = 1 - P(Y \leq 6) = 1 - 0.7622 = 0.2378$.

If you get this wrong, revise: Cumulative probabilities — Section 1.6.
**Problem 2.** A die is rolled repeatedly until a 6 appears. Find the probability that the first 6 appears on the 5th roll, and the probability that it takes more than 10 rolls.
**Solution 2.** $X \sim \mathrm{Geo}(1/6)$.

$P(X=5) = \left(\dfrac{5}{6}\right)^4 \cdot \dfrac{1}{6} = \dfrac{625}{1296} \cdot \dfrac{1}{6} \approx 0.0804$.

$P(X > 10) = 1 - P(X \leq 10) = 1 - \left(1 - q^{10}\right) = q^{10} = \left(\dfrac{5}{6}\right)^{10} \approx 0.1615$.

If you get this wrong, revise: Cumulative distribution function — Section 2.5.
**Problem 3.** Prove that $E(X) = \lambda$ for $X \sim \mathrm{Po}(\lambda)$, showing all steps of the summation.
**Solution 3.**

$$E(X) = \sum_{r=0}^{\infty}r\cdot\frac{e^{-\lambda}\lambda^r}{r!} = \sum_{r=1}^{\infty}\frac{e^{-\lambda}\lambda^r}{(r-1)!} = \lambda e^{-\lambda}\sum_{r=1}^{\infty}\frac{\lambda^{r-1}}{(r-1)!}$$

Substituting $k = r-1$:

$$= \lambda e^{-\lambda}\sum_{k=0}^{\infty}\frac{\lambda^k}{k!} = \lambda e^{-\lambda}\cdot e^{\lambda} = \lambda \quad \blacksquare$$

If you get this wrong, revise: Proof that $E(X) = \lambda$ — Section 1.3.
**Problem 4.** The number of emails received per hour follows $\mathrm{Po}(8)$. Find the probability of receiving between 6 and 12 emails (inclusive) in a given hour.
**Solution 4.** $X \sim \mathrm{Po}(8)$.

$P(6 \leq X \leq 12) = P(X \leq 12) - P(X \leq 5)$.

From tables, $P(X \leq 12) \approx 0.9362$ and $P(X \leq 5) \approx 0.1912$, so

$P(6 \leq X \leq 12) \approx 0.9362 - 0.1912 = 0.7450$.

If you get this wrong, revise: Cumulative probabilities — Section 1.6.
**Problem 5.** A manufacturer claims that on average 1 in 20 items is defective. In a batch of 500 items, use the Poisson approximation to find the probability of at most 35 defectives.
**Solution 5.** $X \sim B(500, 1/20)$, so $\lambda = np = 500/20 = 25$ and $X \approx \mathrm{Po}(25)$.

$$P(X \leq 35) = \sum_{r=0}^{35}\frac{e^{-25}(25)^r}{r!} \approx 0.9776$$

If you get this wrong, revise: Poisson as approximation to Binomial — Section 3.2.
**Problem 6.** Prove the memoryless property of the geometric distribution: $P(X > m+n \mid X > m) = P(X > n)$.
**Solution 6.**

$$P(X > m+n \mid X > m) = \frac{P(X > m+n)}{P(X > m)} = \frac{q^{m+n}}{q^m} = q^n = P(X > n)$$

This uses $P(X > k) = q^k = (1-p)^k$, which follows from $P(X \leq k) = 1 - q^k$. $\blacksquare$

If you get this wrong, revise: The memoryless property — Section 2.4.
**Problem 7.** A shop receives an average of 6 customers per 30 minutes. Find the critical region for a test at the 5% significance level of $H_0: \lambda = 6$ against $H_1: \lambda > 6$, where $X$ is the number of customers in a 30-minute period.
**Solution 7.** Under $H_0$: $X \sim \mathrm{Po}(6)$.

$P(X \geq 10) = 1 - P(X \leq 9) = 1 - 0.9161 = 0.0839 > 0.05$.

$P(X \geq 11) = 1 - P(X \leq 10) = 1 - 0.9574 = 0.0426 < 0.05$.

Critical region: $X \geq 11$. Actual significance level: 4.26%.

If you get this wrong, revise: Poisson hypothesis testing — Section 1.7.
**Problem 8.** $X \sim \mathrm{Geo}(p)$. Find $P(X = 3 \mid X > 1)$ and show it equals $P(X = 2)$.
**Solution 8.**

$$P(X = 3 \mid X > 1) = \frac{P(X = 3)}{P(X > 1)} = \frac{q^2 p}{q} = qp = P(X = 2)$$

This is a direct consequence of the memoryless property: given that the first trial was a failure, the distribution of the number of further trials needed is the same as starting fresh.

If you get this wrong, revise: The memoryless property — Section 2.4.
**Problem 9.** The number of accidents per week at a junction follows $\mathrm{Po}(3)$. After new traffic lights are installed, 8 accidents are observed in one week. Test at the 5% level whether the rate has increased.
**Solution 9.** $X \sim \mathrm{Po}(3)$. $H_0: \lambda = 3$, $H_1: \lambda > 3$. $\alpha = 0.05$.

$p$-value $= P(X \geq 8) = 1 - P(X \leq 7) = 1 - 0.9881 = 0.0119 < 0.05$.

Reject $H_0$. There is sufficient evidence that the accident rate has increased.

Alternatively, using the critical region: $P(X \geq 7) = 1 - 0.9665 = 0.0335 < 0.05$ and $P(X \geq 6) = 1 - 0.9165 = 0.0835 > 0.05$, so the critical region is $X \geq 7$. Since $X = 8 \geq 7$, reject $H_0$.

If you get this wrong, revise: Poisson hypothesis testing — Section 1.7.
**Problem 10.** If $X \sim \mathrm{Geo}(p)$, find $E(X(X-1))$ and hence verify that $\mathrm{Var}(X) = \dfrac{1-p}{p^2}$.
**Solution 10.**

$$E(X(X-1)) = \sum_{r=2}^{\infty}r(r-1)q^{r-1}p = pq\sum_{r=2}^{\infty}r(r-1)q^{r-2}$$

Since $\sum_{r=0}^{\infty}q^r = \dfrac{1}{1-q}$, differentiating twice gives $\sum_{r=2}^{\infty}r(r-1)q^{r-2} = \dfrac{2}{(1-q)^3}$, so

$$E(X(X-1)) = pq \cdot \frac{2}{p^3} = \frac{2q}{p^2}$$

Then $E(X^2) = E(X(X-1)) + E(X) = \dfrac{2q}{p^2} + \dfrac{1}{p} = \dfrac{2q+p}{p^2} = \dfrac{2-p}{p^2}$ and

$$\mathrm{Var}(X) = \frac{2-p}{p^2} - \frac{1}{p^2} = \frac{1-p}{p^2} \quad \blacksquare$$

If you get this wrong, revise: Proof that $\mathrm{Var}(X) = \frac{1-p}{p^2}$ — Section 2.3.
## 7. Advanced Worked Examples

### Example 7.1: Poisson approximation to binomial

**Problem.** A factory produces items with a defect rate of 0.02. In a batch of 200 items, find the probability of exactly 3 defective items using (a) the binomial distribution and (b) the Poisson approximation.

**Solution.** (a) Binomial: $X \sim \mathrm{Bin}(200, 0.02)$.

$$P(X = 3) = \binom{200}{3}(0.02)^3(0.98)^{197} = \frac{200 \times 199 \times 198}{6} \times 8 \times 10^{-6} \times (0.98)^{197} \approx 0.1963$$

(b) Poisson approximation: $\lambda = np = 200 \times 0.02 = 4$, so $X \approx \mathrm{Po}(4)$.

$$P(X = 3) = \frac{e^{-4} \cdot 4^3}{3!} = \frac{64}{6e^4} = \frac{32}{3e^4} \approx 0.1954$$

The approximation is valid since $n \geq 50$ and $p \leq 0.1$.
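Both answers can be reproduced numerically; a sketch:

```python
from math import comb, exp, factorial

n, p = 200, 0.02
lam = n * p

binom_p3 = comb(n, 3) * p**3 * (1 - p) ** (n - 3)  # exact binomial
poisson_p3 = exp(-lam) * lam**3 / factorial(3)     # Poisson approximation
```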
### Example 7.2: Geometric distribution and memoryless property

**Problem.** A fair die is rolled until a 6 appears. Find the probability that more than 4 rolls are needed. Verify the memoryless property: $P(X > m + n \mid X > m) = P(X > n)$.

**Solution.** $X \sim \mathrm{Geo}(1/6)$.

$$P(X > 4) = \left(\frac{5}{6}\right)^4 = \frac{625}{1296} \approx 0.4823$$

Memoryless property:

$$P(X > m + n \mid X > m) = \frac{P(X > m + n)}{P(X > m)} = \frac{(5/6)^{m+n}}{(5/6)^m} = \left(\frac{5}{6}\right)^n = P(X > n) \quad \blacksquare$$
### Example 7.3: Cumulative Poisson probabilities

**Problem.** Calls arrive at a call centre at a rate of 2.5 per minute. Find the probability that more than 5 calls arrive in a 3-minute period.

**Solution.** For a 3-minute period: $\lambda = 2.5 \times 3 = 7.5$, so $X \sim \mathrm{Po}(7.5)$.

$$P(X > 5) = 1 - P(X \leq 5) = 1 - \sum_{k=0}^{5}\frac{e^{-7.5}(7.5)^k}{k!}$$

$$= 1 - e^{-7.5}\!\left(1 + 7.5 + \frac{7.5^2}{2} + \frac{7.5^3}{6} + \frac{7.5^4}{24} + \frac{7.5^5}{120}\right)$$

$$= 1 - e^{-7.5}\!\left(1 + 7.5 + 28.125 + 70.3125 + 131.836 + 197.754\right)$$

$$= 1 - e^{-7.5} \times 436.527 \approx 1 - 0.2414 = 0.7586$$
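Summing the six PMF terms by machine confirms the value; a sketch:

```python
from math import exp, factorial

lam = 7.5  # 2.5 calls/min over 3 minutes
p_at_most_5 = sum(exp(-lam) * lam**k / factorial(k) for k in range(6))
p_more_than_5 = 1 - p_at_most_5  # about 0.7586
```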
### Example 7.4: Hypothesis testing with the Poisson distribution

**Problem.** A traffic survey records the number of cars passing a point in 10-second intervals. The observed frequencies for $k$ cars are compared with the expected frequencies under $H_0$: $X \sim \mathrm{Po}(3)$. Calculate the expected frequency for each value of $k$ if 200 intervals were observed.

**Solution.** Under $H_0$: $P(X = k) = \dfrac{e^{-3} \cdot 3^k}{k!}$.

| $k$ | $P(X = k)$ | Expected frequency ($\times 200$) |
|---|---|---|
| 0 | $e^{-3} = 0.0498$ | 9.96 |
| 1 | $3e^{-3} = 0.1494$ | 29.87 |
| 2 | $4.5e^{-3} = 0.2240$ | 44.81 |
| 3 | $4.5e^{-3} = 0.2240$ | 44.81 |
| 4 | $3.375e^{-3} = 0.1680$ | 33.61 |
| 5 | $2.025e^{-3} = 0.1008$ | 20.16 |
| $\geq 6$ | $1 - \sum_{k=0}^{5} P(X = k) = 0.0839$ | 16.78 |
### Example 7.5: Fitting a Poisson distribution

**Problem.** The number of email messages received per hour is recorded over 100 hours: $\{0\colon 5,\ 1\colon 15,\ 2\colon 25,\ 3\colon 30,\ 4\colon 15,\ 5\colon 7,\ 6\colon 3\}$. Estimate the parameter $\lambda$ and calculate the expected frequencies.

**Solution.**

$$\bar{x} = \frac{0(5) + 1(15) + 2(25) + 3(30) + 4(15) + 5(7) + 6(3)}{100} = \frac{0 + 15 + 50 + 90 + 60 + 35 + 18}{100} = \frac{268}{100} = 2.68$$

So $\hat{\lambda} = 2.68$, and the expected frequency for each $k$ is $100 \times \dfrac{e^{-2.68}(2.68)^k}{k!}$:

| $k$ | Expected frequency |
|---|---|
| 0 | $100e^{-2.68} = 6.86$ |
| 1 | $100 \times 2.68\,e^{-2.68} = 18.38$ |
| 2 | $100 \times 3.591\,e^{-2.68} = 24.62$ |
| 3 | $100 \times 3.208\,e^{-2.68} = 22.00$ |
| 4 | $100 \times 2.149\,e^{-2.68} = 14.74$ |
| 5 | $100 \times 1.152\,e^{-2.68} = 7.90$ |
| 6 | $100 \times 0.515\,e^{-2.68} = 3.53$ |
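The fitting procedure (estimate $\hat{\lambda}$ by the sample mean, then scale the PMF) is a few lines of code; a sketch using the data above:

```python
from math import exp, factorial

observed = {0: 5, 1: 15, 2: 25, 3: 30, 4: 15, 5: 7, 6: 3}
n_hours = sum(observed.values())  # 100

# Estimate lambda by the sample mean
lam_hat = sum(k * f for k, f in observed.items()) / n_hours

# Expected frequency for each k under Po(lam_hat)
expected = {k: n_hours * exp(-lam_hat) * lam_hat**k / factorial(k)
            for k in observed}
```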
Example 7.6: Conditional probability with the geometric distribution

Problem. In a game, the probability of winning each round is $p = 0.3$ independently. Given that a player has not won in the first 5 rounds, find the probability that they win within the next 3 rounds.

Solution. $X \sim \mathrm{Geo}(0.3)$. By the memoryless property:

$$P(X \leq 8 \mid X > 5) = P(X \leq 3) = 1 - (0.7)^3 = 1 - 0.343 = 0.657$$
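The memoryless step can be verified by computing the conditional probability directly from the tail formula $P(X > n) = q^n$; a minimal sketch:

```python
# Geo(0.3), X = number of rounds until the first win: P(X > n) = q^n
p = 0.3
q = 1 - p

# Direct conditional probability P(X <= 8 | X > 5)
direct = (q ** 5 - q ** 8) / q ** 5
# Memoryless shortcut P(X <= 3)
shortcut = 1 - q ** 3
```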
Example 7.7: Sum of independent Poisson variables

Problem. $X \sim \mathrm{Po}(3)$ and $Y \sim \mathrm{Po}(5)$ are independent. State the distribution of $X + Y$ and find $P(X + Y = 6)$.

Solution. $X + Y \sim \mathrm{Po}(3 + 5) = \mathrm{Po}(8)$.

$$P(X + Y = 6) = \frac{e^{-8} \cdot 8^6}{6!} = \frac{262144\,e^{-8}}{720} \approx 0.1221$$
Example 7.8: Poisson as a limiting case

Problem. Prove that if $X \sim \mathrm{Bin}(n, p)$ with $\lambda = np$ fixed as $n \to \infty$, then $P(X = k) \to \dfrac{e^{-\lambda}\lambda^k}{k!}$.

Solution.

$$P(X = k) = \binom{n}{k}p^k(1-p)^{n-k} = \frac{n!}{k!(n-k)!}\cdot\frac{\lambda^k}{n^k}\cdot\left(1-\frac{\lambda}{n}\right)^{n-k}$$

$$= \frac{\lambda^k}{k!}\cdot\frac{n(n-1)\cdots(n-k+1)}{n^k}\cdot\left(1-\frac{\lambda}{n}\right)^{n-k}$$

As $n \to \infty$: $\dfrac{n(n-1)\cdots(n-k+1)}{n^k} \to 1$ and $\left(1-\dfrac{\lambda}{n}\right)^{n-k} \to e^{-\lambda}$.

Therefore $P(X = k) \to \dfrac{e^{-\lambda}\lambda^k}{k!}$. $\blacksquare$
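The convergence can also be illustrated numerically: with $\lambda = 2$ and $k = 3$ fixed, the binomial probability approaches the Poisson value as $n$ grows. A sketch (not part of the proof):

```python
import math

def binom_pmf(n, p, k):
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(lam, k):
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam, k = 2.0, 3
target = poisson_pmf(lam, k)   # limiting value, about 0.1804
# Absolute error of B(n, lam/n) against Po(lam) for growing n
errors = [abs(binom_pmf(n, lam / n, k) - target) for n in (10, 100, 1000)]
```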
8. Connections to Other Topics

8.1 Poisson distribution and exponential distribution

If events occur according to a Poisson process with rate $\lambda$, the time between consecutive events follows the exponential distribution $\mathrm{Exp}(\lambda)$. See Exponential and Continuous Random Variables.

8.2 Geometric distribution and series summation

The probability generating function $G_X(t) = \dfrac{pt}{1-qt}$ of the geometric distribution connects to the summation of geometric series. See Further Algebra.

8.3 Poisson and hypothesis testing

Goodness-of-fit tests using the chi-squared statistic compare observed and expected (Poisson) frequencies. See Chi-Squared Tests.
9. Additional Exam-Style Questions

Question 11

A shop receives on average 4 customers per hour. Find the probability that: (a) exactly 3 customers arrive in a given hour; (b) more than 2 customers arrive in a 30-minute period.

Solution. (a) $X \sim \mathrm{Po}(4)$.

$$P(X = 3) = \frac{e^{-4}\cdot 64}{6} = \frac{32}{3e^4} \approx 0.1954$$

(b) For 30 minutes: $Y \sim \mathrm{Po}(2)$.

$$P(Y > 2) = 1 - P(Y \leq 2) = 1 - e^{-2}(1 + 2 + 2) = 1 - 5e^{-2} \approx 0.3233$$
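Both parts can be checked numerically (stdlib only; `poisson_pmf` is our own helper, not a library call):

```python
import math

def poisson_pmf(lam, k):
    return math.exp(-lam) * lam ** k / math.factorial(k)

# (a) X ~ Po(4): exactly 3 customers in one hour
p_a = poisson_pmf(4, 3)
# (b) Y ~ Po(2) for a 30-minute period: P(Y > 2)
p_b = 1 - sum(poisson_pmf(2, k) for k in range(3))
```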
Question 12

A coin is tossed until the first head appears. The probability of heads is $p$.
(a) Find $E(X)$ and $\mathrm{Var}(X)$, where $X$ is the number of tosses needed.
(b) Find the probability that $X$ is even.

Solution. (a) $X \sim \mathrm{Geo}(p)$: $E(X) = 1/p$, $\mathrm{Var}(X) = (1-p)/p^2$.

(b) Writing $q = 1 - p$:

$$P(X \text{ is even}) = qp + q^3p + q^5p + \cdots = qp(1 + q^2 + q^4 + \cdots) = \frac{qp}{1 - q^2} = \frac{qp}{(1-q)(1+q)} = \frac{q}{1+q}$$
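The closed form $q/(1+q)$ agrees with a numerical partial sum of the even-index terms; for instance with the arbitrary illustrative value $p = 0.4$:

```python
p = 0.4
q = 1 - p

# Partial sum of P(X = 2) + P(X = 4) + ... for Geo(p);
# terms beyond n = 200 are negligibly small
partial = sum(p * q ** (n - 1) for n in range(2, 200, 2))
closed_form = q / (1 + q)
```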
Question 13

Prove that if $X_1, X_2, \ldots, X_n$ are independent with $X_i \sim \mathrm{Po}(\lambda_i)$, then $S = \sum X_i \sim \mathrm{Po}\!\left(\sum \lambda_i\right)$.

Solution. The probability generating function of $X_i \sim \mathrm{Po}(\lambda_i)$ is $G_{X_i}(t) = e^{\lambda_i(t-1)}$.

For independent random variables, the PGF of the sum is the product:

$$G_S(t) = \prod_{i=1}^{n}e^{\lambda_i(t-1)} = e^{(t-1)\sum\lambda_i}$$

This is the PGF of $\mathrm{Po}\!\left(\sum\lambda_i\right)$, so $S \sim \mathrm{Po}\!\left(\sum\lambda_i\right)$. $\blacksquare$
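The additive property can also be confirmed numerically by convolving the two pmfs directly, as in Example 7.7 ($\mathrm{Po}(3) + \mathrm{Po}(5)$ evaluated at $k = 6$); a sketch:

```python
import math

def poisson_pmf(lam, k):
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam1, lam2, k = 3.0, 5.0, 6
# P(X + Y = k) by direct convolution of the two pmfs
convolved = sum(poisson_pmf(lam1, j) * poisson_pmf(lam2, k - j)
                for j in range(k + 1))
# The claimed distribution of the sum, evaluated at k
direct = poisson_pmf(lam1 + lam2, k)
```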
Question 14

A typist makes on average 2 errors per page. Find the probability that a particular page has: (a) no errors; (b) at most 3 errors; (c) exactly 2 errors given that it has at most 3 errors.

Solution. $X \sim \mathrm{Po}(2)$.

(a) $P(X = 0) = e^{-2} \approx 0.1353$.

(b) $P(X \leq 3) = e^{-2}(1 + 2 + 2 + 4/3) = \frac{19}{3}e^{-2} \approx 0.8571$.

(c)

$$P(X = 2 \mid X \leq 3) = \frac{P(X = 2)}{P(X \leq 3)} = \frac{2e^{-2}}{19e^{-2}/3} = \frac{6}{19} \approx 0.3158$$
Question 15

The number of radioactive decays per second from a sample is modelled by $X \sim \mathrm{Po}(\lambda)$. Over 50 seconds, 145 decays are observed.
(a) Estimate $\lambda$.
(b) Using your estimate, find the probability of observing exactly 3 decays in a 1-second interval.

Solution. (a) $\hat{\lambda} = 145/50 = 2.9$ per second.

(b)

$$P(X = 3) = \frac{e^{-2.9}(2.9)^3}{6} = \frac{24.389\,e^{-2.9}}{6} \approx 0.2237$$
10. Advanced Worked Examples

Example 10.1: Poisson as a limit of the binomial

Problem. A factory produces items with a defect rate of 0.002. In a batch of 1000, find the probability of exactly 3 defects using (a) the binomial distribution and (b) the Poisson approximation.
Solution. (a) $X \sim B(1000, 0.002)$:

$$P(X=3) = \binom{1000}{3}(0.002)^3(0.998)^{997} \approx 0.1806$$

(b) $\lambda = np = 2$, so approximately $X \sim \mathrm{Po}(2)$:

$$P(X=3) = \frac{e^{-2} \cdot 8}{6} \approx 0.1804$$

The approximation is excellent (error of about $0.1\%$).
Example 10.2: Sum of independent Poisson random variables

Problem. Emails arrive at a rate of 5 per hour and texts at 3 per hour. Find the probability that the total number of messages in a 2-hour period exceeds 20.

Solution. In 2 hours: emails $\sim \mathrm{Po}(10)$, texts $\sim \mathrm{Po}(6)$, so the total $X \sim \mathrm{Po}(10+6) = \mathrm{Po}(16)$.

$$P(X > 20) = 1 - P(X \leq 20) = 1 - \sum_{k=0}^{20} \frac{e^{-16} \cdot 16^k}{k!} \approx 1 - 0.8688 = 0.131$$
Example 10.3: Conditional probability with the geometric distribution

Problem. $X \sim \mathrm{Geo}(0.3)$. Find $P(X > 4 \mid X > 2)$.

Solution. The geometric distribution has the memoryless property:

$$P(X > 4 \mid X > 2) = P(X > 2) = (1-0.3)^2 = 0.49$$

Verification: $P(X > 4) = 0.7^4 = 0.2401$ and $P(X > 2) = 0.49$, so $P(X>4 \mid X>2) = \dfrac{0.2401}{0.49} = 0.49$. ✓
Example 10.4: Poisson hypothesis testing

Problem. A call centre claims an average of 6 calls per minute. In a 10-minute period, 72 calls are received. Test at the 5% level whether the rate has increased.

Solution. $H_0\colon \lambda = 6$ per minute; $H_1\colon \lambda > 6$.

Under $H_0$, the total number of calls in 10 minutes is $\sim \mathrm{Po}(60)$. For large $\lambda$, approximate with $N(60, 60)$. Using a continuity correction:

$$P(X \geq 72) \approx P\!\left(Z \geq \frac{71.5 - 60}{\sqrt{60}}\right) = P(Z \geq 1.485) = 1 - 0.9311 = 0.069$$

Since $0.069 > 0.05$: do not reject $H_0$. There is insufficient evidence that the rate has increased.
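For reference, the exact Poisson tail can be computed without the normal approximation by summing pmf terms iteratively (so no large factorials are needed); the exact p-value also exceeds 0.05, leaving the conclusion unchanged. A sketch:

```python
import math

# Exact P(X >= 72) for X ~ Po(60), via the cdf up to 71
lam = 60.0
term = math.exp(-lam)   # P(X = 0)
cdf = term
for k in range(1, 72):
    term *= lam / k     # P(X = k) from P(X = k - 1)
    cdf += term
p_exact = 1 - cdf
```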
Example 10.5: Mode of the Poisson distribution

Problem. Find the mode of the Poisson distribution with parameter $\lambda$.

Solution. The mode $m$ satisfies $P(X = m) \geq P(X = m-1)$ and $P(X = m) \geq P(X = m+1)$.

$$\frac{e^{-\lambda}\lambda^m}{m!} \geq \frac{e^{-\lambda}\lambda^{m-1}}{(m-1)!} \implies \frac{\lambda}{m} \geq 1 \implies m \leq \lambda$$

$$\frac{e^{-\lambda}\lambda^m}{m!} \geq \frac{e^{-\lambda}\lambda^{m+1}}{(m+1)!} \implies \frac{m+1}{\lambda} \geq 1 \implies m \geq \lambda - 1$$

So $\lambda - 1 \leq m \leq \lambda$, meaning the mode is $\lfloor\lambda\rfloor$ (and when $\lambda$ is an integer, $\lambda - 1$ is a joint mode, since then $P(X = \lambda - 1) = P(X = \lambda)$).
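The result is easy to spot-check by brute-force maximisation of the pmf for a few non-integer $\lambda$ values:

```python
import math

def poisson_pmf(lam, k):
    return math.exp(-lam) * lam ** k / math.factorial(k)

def poisson_mode(lam):
    # The pmf is unimodal, so searching a range well past lam suffices
    return max(range(int(lam) + 20), key=lambda k: poisson_pmf(lam, k))

# For non-integer lam the mode should equal floor(lam)
modes = {lam: poisson_mode(lam) for lam in (0.5, 2.68, 7.9)}
```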
Example 10.6: Relationship between Poisson and exponential

Problem. Events occur according to a Poisson process with rate $\lambda = 4$ per hour. Find the probability that the time between two consecutive events exceeds 30 minutes.

Solution. For a Poisson process with rate $\lambda$, the inter-arrival time $T \sim \mathrm{Exp}(\lambda)$.

$$P(T > 0.5) = e^{-4 \times 0.5} = e^{-2} \approx 0.135$$
Example 10.7: Variance of the geometric distribution

Problem. Derive $\mathrm{Var}(X)$ for $X \sim \mathrm{Geo}(p)$, defined as the number of trials until the first success.

Solution. $E(X) = \dfrac{1}{p}$. Using $\mathrm{Var}(X) = E(X^2) - [E(X)]^2$:

$$E(X^2) = \sum_{k=1}^{\infty} k^2 p(1-p)^{k-1}$$

Using the identity $\displaystyle\sum_{k=1}^{\infty} k^2 r^{k-1} = \frac{1+r}{(1-r)^3}$ with $r = 1-p$:

$$E(X^2) = \frac{p(2-p)}{p^3} = \frac{2-p}{p^2}$$

$$\mathrm{Var}(X) = \frac{2-p}{p^2} - \frac{1}{p^2} = \frac{1-p}{p^2}$$
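The closed forms for $E(X)$, $E(X^2)$ and $\mathrm{Var}(X)$ can be confirmed by partial sums of the defining series (here with the illustrative value $p = 0.3$):

```python
p = 0.3
q = 1 - p

# Partial sums; terms beyond k = 500 are negligible for p = 0.3
ex = sum(k * p * q ** (k - 1) for k in range(1, 500))
ex2 = sum(k ** 2 * p * q ** (k - 1) for k in range(1, 500))
var = ex2 - ex ** 2
```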
11. Common Pitfalls

| Pitfall | Correct approach |
|---|---|
| Confusing the two definitions of the geometric distribution | "Number of trials until first success": $E(X) = 1/p$; "number of failures before first success": $E(X) = (1-p)/p$ |
| Using the Poisson approximation when $np > 10$ or $n < 20$ | The Poisson approximation requires $n$ large and $p$ small, with $np$ moderate |
| Forgetting that Poisson probabilities sum to 1 only over all $k$ from 0 to $\infty$ | Never truncate without adjusting |
| Applying the Poisson to events that are not independent | The Poisson process requires independent events at a constant average rate |
12. Additional Exam-Style Questions

Question 16

A typist makes an average of 2 errors per page. Find the probability that a 3-page document contains exactly 5 errors.

Solution. The total number of errors is $\sim \mathrm{Po}(6)$.

$$P(X = 5) = \frac{e^{-6} \cdot 6^5}{120} = \frac{7776\,e^{-6}}{120} \approx 0.1606$$
Question 17

Prove that for $X \sim \mathrm{Po}(\lambda)$, $E(X) = \lambda$.

Solution.

$$E(X) = \sum_{k=0}^{\infty} k \cdot \frac{e^{-\lambda}\lambda^k}{k!} = \sum_{k=1}^{\infty} \frac{e^{-\lambda}\lambda^k}{(k-1)!}$$

Let $j = k-1$:

$$= \lambda e^{-\lambda} \sum_{j=0}^{\infty} \frac{\lambda^j}{j!} = \lambda e^{-\lambda} \cdot e^{\lambda} = \lambda \qquad \blacksquare$$
Question 18

$X \sim \mathrm{Geo}(0.25)$. Find $P(X \leq 5)$ and $P(X > 3)$.

Solution.

$$P(X \leq 5) = 1 - P(X > 5) = 1 - 0.75^5 = 1 - 0.2373 = 0.7627$$

$$P(X > 3) = 0.75^3 = 0.4219$$
13. Connections to Other Topics

13.1 Poisson process and exponential distribution

The inter-arrival times of a Poisson process follow the exponential distribution: if events occur at rate $\lambda$ per unit time, the time between consecutive events is $\mathrm{Exp}(\lambda)$. See Exponential and Continuous Random Variables.

13.2 Poisson and binomial

The Poisson distribution approximates the binomial when $n$ is large and $p$ is small, with $\lambda = np$.

13.3 Poisson and chi-squared tests

The chi-squared goodness-of-fit test is used to test whether data follows a Poisson or geometric distribution. See Chi-Squared Tests.
14. Key Results Summary

| Distribution | PMF | $E(X)$ | $\mathrm{Var}(X)$ |
|---|---|---|---|
| $\mathrm{Po}(\lambda)$ | $P(X=x) = \dfrac{e^{-\lambda}\lambda^x}{x!}$ | $\lambda$ | $\lambda$ |
| $\mathrm{Geo}(p)$ (trials) | $P(X=x) = p(1-p)^{x-1}$ | $\dfrac{1}{p}$ | $\dfrac{1-p}{p^2}$ |
| $\mathrm{Geo}(p)$ (failures) | $P(X=x) = p(1-p)^x$ | $\dfrac{1-p}{p}$ | $\dfrac{1-p}{p^2}$ |

| Property | Poisson | Geometric |
|---|---|---|
| Memoryless | No | Yes |
| Additive: $X_1+X_2$ | $\mathrm{Po}(\lambda_1+\lambda_2)$ if independent | Not simple |
| PMF tail behaviour | Decays faster than geometric | Slower decay |
15. Further Exam-Style Questions

Question 19

A shop receives customers at a rate of 8 per hour. Find the probability that: (a) exactly 5 customers arrive in a 30-minute period; (b) more than 10 customers arrive in an hour; (c) the time between two consecutive arrivals exceeds 20 minutes.

Solution. (a) $\lambda = 8 \times 0.5 = 4$.

$$P(X=5) = \frac{e^{-4} \cdot 1024}{120} \approx 0.1563$$

(b) $\lambda = 8$.

$$P(X > 10) = 1 - P(X \leq 10) = 1 - \sum_{k=0}^{10}\frac{e^{-8} \cdot 8^k}{k!} \approx 1 - 0.8159 = 0.184$$

(c) The inter-arrival time $T \sim \mathrm{Exp}(8)$, so $P(T > 1/3) = e^{-8/3} \approx 0.0695$.
Question 20

Prove that for $X \sim \mathrm{Po}(\lambda)$, $\mathrm{Var}(X) = \lambda$.

Solution.

$$E(X^2) = \sum_{k=0}^{\infty} k^2 \cdot \frac{e^{-\lambda}\lambda^k}{k!} = \sum_{k=1}^{\infty} k \cdot \frac{e^{-\lambda}\lambda^k}{(k-1)!} = \lambda e^{-\lambda}\sum_{j=0}^{\infty}(j+1)\frac{\lambda^j}{j!}$$

$$= \lambda e^{-\lambda}\!\left(\sum_{j=0}^{\infty} j\,\frac{\lambda^j}{j!} + \sum_{j=0}^{\infty}\frac{\lambda^j}{j!}\right) = \lambda e^{-\lambda}(\lambda e^{\lambda} + e^{\lambda}) = \lambda(\lambda+1) = \lambda^2+\lambda$$

$$\mathrm{Var}(X) = E(X^2)-[E(X)]^2 = \lambda^2+\lambda-\lambda^2 = \lambda \qquad \blacksquare$$
16. Advanced Topics

16.1 Superposition of Poisson processes

If events of type $A$ occur at rate $\lambda_A$ and type $B$ at rate $\lambda_B$, independently, then the combined event process is Poisson with rate $\lambda_A + \lambda_B$.
16.2 Poisson distribution and the Poisson point process

A Poisson point process in 2D with rate $\lambda$ per unit area has the property that the number of points in a region of area $A$ follows $\mathrm{Po}(\lambda A)$.
16.3 The geometric distribution as a special case of the negative binomial

The negative binomial distribution counts the number of trials until $r$ successes; the geometric distribution is the case $r = 1$. For $\mathrm{NegBin}(r, p)$:

$$P(X = n) = \binom{n-1}{r-1}p^r(1-p)^{n-r} \quad \text{for } n = r, r+1, \ldots$$

16.4 Relationship to exponential families

Both the Poisson and geometric distributions belong to the exponential family of distributions, whose PDF/PMF has the form $f(x;\theta) = h(x)\exp(\eta(\theta)T(x) - A(\theta))$.
17. Further Exam-Style Questions

Question 21

A radioactive source emits particles at a rate of 12 per minute. Find the probability that in a 2-minute period, the number of particles emitted is between 20 and 30 (inclusive).

Solution. $\lambda = 24$ per 2 minutes, so $X \sim \mathrm{Po}(24)$ and $P(20 \leq X \leq 30) = P(X \leq 30) - P(X \leq 19)$.

Using the normal approximation $X \approx N(24, 24)$ with a continuity correction:

$$P(19.5 < X < 30.5) \approx P\!\left(\frac{19.5-24}{\sqrt{24}} < Z < \frac{30.5-24}{\sqrt{24}}\right) = P(-0.919 < Z < 1.327)$$

$$= \Phi(1.327) - \Phi(-0.919) = 0.908 - 0.179 = 0.729$$
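The normal approximation can be compared with the exact Poisson sum, computed iteratively to avoid large factorials; the two agree closely for $\lambda = 24$. A sketch:

```python
import math

# Exact P(20 <= X <= 30) for X ~ Po(24)
lam = 24.0
term = math.exp(-lam)   # P(X = 0)
probs = [term]
for k in range(1, 31):
    term *= lam / k     # P(X = k) from P(X = k - 1)
    probs.append(term)
p_exact = sum(probs[20:31])
```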
Question 22

Prove that if $X$ and $Y$ are independent with $X \sim \mathrm{Geo}(p)$ and $Y \sim \mathrm{Geo}(p)$, then $\min(X,Y) \sim \mathrm{Geo}(1-(1-p)^2)$.

Solution.

$$P(\min(X,Y) > n) = P(X > n)P(Y > n) = (1-p)^n \cdot (1-p)^n = (1-p)^{2n}$$

$$P(\min(X,Y) = n) = P(\min > n-1) - P(\min > n) = (1-p)^{2(n-1)} - (1-p)^{2n} = (1-p)^{2n-2}\left[1-(1-p)^2\right]$$

This is the geometric pmf with success probability $1-(1-p)^2$, so $\min(X,Y) \sim \mathrm{Geo}(1-(1-p)^2)$. $\blacksquare$
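The identity can be spot-checked numerically for the first few values of $n$ (with an arbitrary illustrative $p$):

```python
p = 0.35
q = 1 - p
p_min = 1 - q ** 2   # success probability of min(X, Y)

# Compare P(min(X,Y) = n) from the tail formula with the Geo(p_min) pmf
max_diff = max(
    abs((q ** (2 * (n - 1)) - q ** (2 * n)) - p_min * (1 - p_min) ** (n - 1))
    for n in range(1, 12)
)
```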
18. Further Advanced Topics

18.1 Definition of a Poisson process

A Poisson process with rate $\lambda$ is a counting process $N(t)$ satisfying:

- $N(0) = 0$
- Independent increments
- $N(t+s) - N(s) \sim \mathrm{Po}(\lambda t)$ for all $s, t \geq 0$
18.2 Conditional distributions

For independent $X \sim \mathrm{Po}(\lambda_1)$ and $Y \sim \mathrm{Po}(\lambda_2)$:

$$P(X = k \mid X + Y = n) = \binom{n}{k}\left(\frac{\lambda_1}{\lambda_1+\lambda_2}\right)^k\left(\frac{\lambda_2}{\lambda_1+\lambda_2}\right)^{n-k}$$

This is $\mathrm{Bin}(n, \lambda_1/(\lambda_1+\lambda_2))$: the conditional distribution is binomial.
18.3 The negative binomial distribution

The number of trials until the $r$-th success follows $\mathrm{NegBin}(r, p)$:

$$P(X = n) = \binom{n-1}{r-1}p^r(1-p)^{n-r} \quad \text{for } n = r, r+1, \ldots$$

$E(X) = \dfrac{r}{p}$, $\mathrm{Var}(X) = \dfrac{r(1-p)}{p^2}$. The geometric distribution is $\mathrm{NegBin}(1, p)$.

18.4 Poisson goodness-of-fit

To test whether data follows $\mathrm{Po}(\lambda)$:

1. Estimate $\hat{\lambda} = \bar{x}$
2. Calculate expected frequencies using $\hat{\lambda}$
3. Apply the chi-squared test
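A small helper makes the special case concrete: with $r = 1$ the negative binomial pmf reduces to the geometric pmf term by term (the function name `negbin_pmf` is ours, not a library call):

```python
import math

def negbin_pmf(r, p, n):
    # P(X = n): the n-th trial yields the r-th success
    return math.comb(n - 1, r - 1) * p ** r * (1 - p) ** (n - r)

p = 0.25
# r = 1 should reproduce the geometric pmf p(1-p)^(n-1)
max_diff = max(abs(negbin_pmf(1, p, n) - p * (1 - p) ** (n - 1))
               for n in range(1, 10))
# The pmf should (numerically) sum to 1; partial sum for r = 3
total = sum(negbin_pmf(3, p, n) for n in range(3, 300))
```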
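The three steps above can be sketched in a few lines. The observed counts here are illustrative, not from the text, and the final cell pools $k \geq 5$ so that expected frequencies are not too small:

```python
import math

def poisson_pmf(lam, k):
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Illustrative observed counts for k = 0..5 (treating 5 as exact for the mean)
obs = {0: 10, 1: 27, 2: 30, 3: 18, 4: 10, 5: 5}
n = sum(obs.values())
lam_hat = sum(k * f for k, f in obs.items()) / n   # step 1: estimate lambda

# Step 2: expected frequencies, k = 0..4 individually, then k >= 5 pooled
exp_f = [n * poisson_pmf(lam_hat, k) for k in range(5)]
exp_f.append(n * (1 - sum(poisson_pmf(lam_hat, k) for k in range(5))))
obs_f = [obs[k] for k in range(5)] + [obs[5]]

# Step 3: chi-squared statistic; compare with the chi-squared distribution on
# 6 - 1 - 1 = 4 degrees of freedom (one parameter estimated from the data)
chi2 = sum((o - e) ** 2 / e for o, e in zip(obs_f, exp_f))
```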
19. Further Exam-Style Questions

Question 23

Calls arrive at rate 3 per hour. Find the probability that the third call arrives before time $t = 1$ hour.

Solution. The arrival time of the 3rd call is $\mathrm{Gamma}(3, 3)$ (a sum of 3 independent $\mathrm{Exp}(3)$ variables), and $T_3 < 1$ exactly when at least 3 calls arrive in the first hour:

$$P(T_3 < 1) = P(\text{at least 3 calls in 1 hour}) = 1 - P(X \leq 2) = 1 - e^{-3}\!\left(1 + 3 + \frac{9}{2}\right) = 1 - 8.5\,e^{-3} \approx 0.577$$
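The duality "third call before time 1 if and only if at least 3 calls in the first hour" makes the computation a one-liner:

```python
import math

# P(T_3 < 1) = P(at least 3 events in [0, 1]) for a rate-3 Poisson process
lam = 3.0
p_at_most_2 = math.exp(-lam) * (1 + lam + lam ** 2 / 2)
p_third_before_1 = 1 - p_at_most_2
```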
Question 24

Prove that for $X \sim \mathrm{Geo}(p)$, the moment generating function is $M_X(t) = \dfrac{pe^t}{1-(1-p)e^t}$ for $t < -\ln(1-p)$.

Solution.

$$M_X(t) = \sum_{n=1}^{\infty} e^{tn}\, p(1-p)^{n-1} = \frac{p}{1-p}\sum_{n=1}^{\infty} \left[(1-p)e^t\right]^n = \frac{p}{1-p} \cdot \frac{(1-p)e^t}{1-(1-p)e^t} = \frac{pe^t}{1-(1-p)e^t}$$

The series converges when $|(1-p)e^t| < 1$, i.e. $t < -\ln(1-p)$. $\blacksquare$
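The closed form can be checked against a partial sum of the defining series for a $t$ inside the region of convergence (illustrative values):

```python
import math

p = 0.4
t = 0.2   # valid since t < -ln(1 - p), about 0.511

mgf_closed = p * math.exp(t) / (1 - (1 - p) * math.exp(t))
# Partial sum of E(e^{tX}); the ratio (1-p)e^t is about 0.73, so 400 terms suffice
mgf_series = sum(math.exp(t * n) * p * (1 - p) ** (n - 1) for n in range(1, 400))
```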