
Poisson and Geometric Distributions


The Poisson and geometric distributions model discrete random variables arising from counting processes. The Poisson distribution counts the number of rare events in a fixed interval, while the geometric distribution counts the number of trials until the first success.

Board Coverage

| Board | Paper | Notes |
|---|---|---|
| AQA | Paper 2 | Both Poisson and geometric in depth |
| Edexcel | S2, S3 | Poisson in S2; geometric in S3 |
| OCR (A) | Paper 2 | Poisson and geometric |
| CIE (9231) | S2 | Poisson covered; geometric not required |
:::note
The formula booklet provides the Poisson PMF. You must know when to apply each distribution and how to carry out hypothesis testing with discrete distributions. The geometric distribution has two common conventions for the support: $r = 1, 2, 3, \ldots$ (number of trials) or $r = 0, 1, 2, \ldots$ (number of failures). AQA uses $r = 1, 2, \ldots$.
:::


1. The Poisson Distribution

1.1 Definition

Definition. A discrete random variable $X$ follows a Poisson distribution with parameter $\lambda$ (where $\lambda > 0$), written $X \sim \mathrm{Po}(\lambda)$, if

$$P(X = r) = \frac{e^{-\lambda}\lambda^r}{r!}, \quad r = 0, 1, 2, \ldots$$

The Poisson distribution models the number of events occurring in a fixed interval of time or space when:

  • Events occur independently
  • Events occur at a constant average rate $\lambda$
  • The probability of more than one event in a sufficiently small interval is negligible
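The PMF is straightforward to evaluate directly; a minimal sketch (the helper name `poisson_pmf` is our own):

```python
from math import exp, factorial

def poisson_pmf(r: int, lam: float) -> float:
    """P(X = r) for X ~ Po(lam)."""
    return exp(-lam) * lam**r / factorial(r)

# The probabilities over r = 0, 1, 2, ... sum to 1; for lam = 2.5,
# 40 terms capture the total mass to well within floating-point accuracy.
total = sum(poisson_pmf(r, 2.5) for r in range(40))
```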

1.2 Derivation as a Limit of the Binomial

Theorem. If $n \to \infty$ and $p \to 0$ such that $np = \lambda$ remains constant, then $\mathrm{B}(n, p) \to \mathrm{Po}(\lambda)$.

Proof

$$\begin{aligned} P(X = r) &= \binom{n}{r}p^r(1-p)^{n-r} \\ &= \frac{n(n-1)\cdots(n-r+1)}{r!}\cdot\frac{\lambda^r}{n^r}\cdot\left(1-\frac{\lambda}{n}\right)^{n-r} \end{aligned}$$

Consider each factor as $n \to \infty$:

  • $\dfrac{n(n-1)\cdots(n-r+1)}{n^r} \to 1$ since each of the $r$ factors tends to 1
  • $\left(1-\dfrac{\lambda}{n}\right)^{n-r} = \left(1-\dfrac{\lambda}{n}\right)^n \cdot \left(1-\dfrac{\lambda}{n}\right)^{-r} \to e^{-\lambda} \cdot 1 = e^{-\lambda}$

Therefore:

$$P(X = r) \to \frac{1}{r!}\cdot\lambda^r \cdot e^{-\lambda} = \frac{e^{-\lambda}\lambda^r}{r!} \quad \blacksquare$$
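The limit can also be observed numerically: holding $np = \lambda$ fixed while $n$ grows, the binomial probability at a fixed $r$ approaches the Poisson value. A sketch with illustrative values:

```python
from math import comb, exp, factorial

def binom_pmf(r: int, n: int, p: float) -> float:
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(r: int, lam: float) -> float:
    """P(X = r) for X ~ Po(lam)."""
    return exp(-lam) * lam**r / factorial(r)

# Fix lam = np = 4 and watch the error at r = 3 shrink as n increases.
lam, r = 4.0, 3
errors = [abs(binom_pmf(r, n, lam / n) - poisson_pmf(r, lam))
          for n in (10, 100, 1000, 10000)]
```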

1.3 Proof that $E(X) = \lambda$

Proof

$$\begin{aligned} E(X) &= \sum_{r=0}^{\infty} r\cdot\frac{e^{-\lambda}\lambda^r}{r!} = \sum_{r=1}^{\infty}\frac{e^{-\lambda}\lambda^r}{(r-1)!} \\ &= \lambda e^{-\lambda}\sum_{r=1}^{\infty}\frac{\lambda^{r-1}}{(r-1)!} = \lambda e^{-\lambda}\sum_{k=0}^{\infty}\frac{\lambda^k}{k!} \\ &= \lambda e^{-\lambda}\cdot e^{\lambda} = \lambda \quad \blacksquare \end{aligned}$$

1.4 Proof that $\mathrm{Var}(X) = \lambda$

Proof

First compute $E(X(X-1))$:

$$\begin{aligned} E(X(X-1)) &= \sum_{r=2}^{\infty} r(r-1)\frac{e^{-\lambda}\lambda^r}{r!} = \sum_{r=2}^{\infty}\frac{e^{-\lambda}\lambda^r}{(r-2)!} \\ &= \lambda^2 e^{-\lambda}\sum_{k=0}^{\infty}\frac{\lambda^k}{k!} = \lambda^2 e^{-\lambda}\cdot e^{\lambda} = \lambda^2 \end{aligned}$$

Since $E(X^2) = E(X(X-1)) + E(X) = \lambda^2 + \lambda$:

$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \lambda^2 + \lambda - \lambda^2 = \lambda \quad \blacksquare$$

$$\boxed{E(X) = \mathrm{Var}(X) = \lambda}$$

This is the defining property of the Poisson distribution: the mean equals the variance.
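Both moments can be confirmed from a truncated sum of the PMF (the truncation at 60 terms is arbitrary but ample for a small parameter such as $\lambda = 3.5$):

```python
from math import exp, factorial

lam = 3.5
pmf = [exp(-lam) * lam**r / factorial(r) for r in range(60)]

# Mean and variance computed directly from the (truncated) distribution.
mean = sum(r * p for r, p in enumerate(pmf))
var = sum(r**2 * p for r, p in enumerate(pmf)) - mean**2
```

Both come out as 3.5 to floating-point accuracy, matching $E(X) = \mathrm{Var}(X) = \lambda$.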

1.5 Additivity of Poisson distributions

If $X \sim \mathrm{Po}(\lambda)$ and $Y \sim \mathrm{Po}(\mu)$ are independent, then

$$\boxed{X + Y \sim \mathrm{Po}(\lambda + \mu)}$$

1.6 Cumulative probabilities

Cumulative Poisson probabilities are found using:

$$P(X \leq r) = \sum_{k=0}^{r}\frac{e^{-\lambda}\lambda^k}{k!}$$

These are typically obtained from tables or a calculator. Key relationships:

$$P(X > r) = 1 - P(X \leq r), \qquad P(a \leq X \leq b) = P(X \leq b) - P(X \leq a-1)$$
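These relationships translate directly into a cumulative-probability helper (a sketch; `poisson_cdf` is our own name):

```python
from math import exp, factorial

def poisson_cdf(r: int, lam: float) -> float:
    """P(X <= r) for X ~ Po(lam); returns 0 for r < 0."""
    return sum(exp(-lam) * lam**k / factorial(k) for k in range(max(r, -1) + 1))

lam = 8.0
p_gt = 1 - poisson_cdf(5, lam)                           # P(X > 5)
p_interval = poisson_cdf(12, lam) - poisson_cdf(5, lam)  # P(6 <= X <= 12)
```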

1.7 Poisson hypothesis testing

The procedure mirrors binomial hypothesis testing:

  1. Define $X$ and state $X \sim \mathrm{Po}(\lambda_0)$ under $H_0$
  2. State $H_0: \lambda = \lambda_0$ and $H_1$
  3. State the significance level $\alpha$
  4. Find the critical region
  5. Compare the observed value
  6. Conclude in context

Example. A call centre receives an average of 3.2 calls per minute. In a particular minute, 7 calls are received. Test at the 5% significance level whether the rate has increased.

$X \sim \mathrm{Po}(3.2)$. $H_0: \lambda = 3.2$, $H_1: \lambda > 3.2$.

$P(X \geq 7) = 1 - P(X \leq 6) = 1 - 0.9554 = 0.0446 < 0.05$.

Reject $H_0$. There is sufficient evidence that the rate has increased.

Example. Find the critical region for a two-tailed test at the 5% level with $X \sim \mathrm{Po}(5)$.

Lower tail: $P(X \leq 0) = e^{-5} \approx 0.0067 \leq 0.025$, but $P(X \leq 1) = 0.0404 > 0.025$. So the lower critical region is $X \leq 0$.

Upper tail: $P(X \geq 10) = 1 - 0.9682 = 0.0318 > 0.025$, but $P(X \geq 11) = 1 - 0.9863 = 0.0137 \leq 0.025$. So the upper critical region is $X \geq 11$.

Critical region: $X \leq 0$ or $X \geq 11$.
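The tail searches above can be automated; a sketch that reproduces the $\mathrm{Po}(5)$ critical region (function names are our own):

```python
from math import exp, factorial

def poisson_cdf(r: int, lam: float) -> float:
    """P(X <= r) for X ~ Po(lam), r >= 0."""
    return sum(exp(-lam) * lam**k / factorial(k) for k in range(r + 1))

def two_tailed_critical_region(lam: float, alpha: float):
    """Largest lower and smallest upper critical values, each tail <= alpha/2."""
    lower = -1  # -1 signals an empty lower tail
    while poisson_cdf(lower + 1, lam) <= alpha / 2:
        lower += 1
    upper = int(lam) + 1
    while 1 - poisson_cdf(upper - 1, lam) > alpha / 2:
        upper += 1
    return lower, upper

region = two_tailed_critical_region(5, 0.05)  # (0, 11): X <= 0 or X >= 11
```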


2. The Geometric Distribution

2.1 Definition

Definition. A discrete random variable $X$ follows a geometric distribution with parameter $p$ (where $0 < p \leq 1$), written $X \sim \mathrm{Geo}(p)$, if $X$ is the number of the trial on which the first success occurs:

$$P(X = r) = (1-p)^{r-1}p, \quad r = 1, 2, 3, \ldots$$

Each trial is independent with probability $p$ of success.
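A quick numerical check of the PMF (the helper name `geometric_pmf` is ours; the truncation at 400 terms is arbitrary but ample):

```python
def geometric_pmf(r: int, p: float) -> float:
    """P(X = r) for X ~ Geo(p), r = 1, 2, 3, ... (number-of-trials convention)."""
    return (1 - p)**(r - 1) * p

# First 6 on the 5th roll of a fair die, and a check that the PMF sums to 1.
prob = geometric_pmf(5, 1 / 6)
total = sum(geometric_pmf(r, 1 / 6) for r in range(1, 400))
```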

2.2 Proof that $E(X) = \frac{1}{p}$

Proof

$$E(X) = \sum_{r=1}^{\infty} r\,q^{r-1}p, \quad \text{where } q = 1-p$$

Recall the geometric series $\sum_{r=0}^{\infty} q^r = \dfrac{1}{1-q}$ for $|q| < 1$.

Differentiating both sides with respect to $q$:

$$\sum_{r=1}^{\infty} rq^{r-1} = \frac{1}{(1-q)^2}$$

Therefore:

$$E(X) = p \cdot \frac{1}{(1-q)^2} = p \cdot \frac{1}{p^2} = \frac{1}{p} \quad \blacksquare$$

2.3 Proof that $\mathrm{Var}(X) = \frac{1-p}{p^2}$

Proof

First compute $E(X^2) = E(X(X-1)) + E(X)$.

$$E(X(X-1)) = \sum_{r=2}^{\infty} r(r-1)q^{r-1}p = p\,q\sum_{r=2}^{\infty} r(r-1)q^{r-2}$$

Starting from $\sum_{r=0}^{\infty} q^r = \dfrac{1}{1-q}$ and differentiating twice:

$$\sum_{r=2}^{\infty} r(r-1)q^{r-2} = \frac{2}{(1-q)^3}$$

So $E(X(X-1)) = p\,q\cdot\dfrac{2}{(1-q)^3} = p\,q\cdot\dfrac{2}{p^3} = \dfrac{2q}{p^2}$.

$$\begin{aligned} E(X^2) &= \frac{2q}{p^2} + \frac{1}{p} = \frac{2q + p}{p^2} = \frac{2(1-p) + p}{p^2} = \frac{2-p}{p^2} \\[4pt] \mathrm{Var}(X) &= E(X^2) - [E(X)]^2 = \frac{2-p}{p^2} - \frac{1}{p^2} = \frac{1-p}{p^2} \quad \blacksquare \end{aligned}$$

$$\boxed{E(X) = \frac{1}{p}, \qquad \mathrm{Var}(X) = \frac{1-p}{p^2}}$$

2.4 The memoryless property

Theorem. The geometric distribution is the only discrete memoryless distribution:

$$P(X > m + n \mid X > m) = P(X > n)$$

Proof

$$\begin{aligned} P(X > m + n \mid X > m) &= \frac{P(X > m+n \text{ and } X > m)}{P(X > m)} \\ &= \frac{P(X > m+n)}{P(X > m)} \quad \text{(since } X > m+n \implies X > m\text{)} \\ &= \frac{1 - P(X \leq m+n)}{1 - P(X \leq m)} \end{aligned}$$

Now $P(X \leq k) = \sum_{r=1}^{k} q^{r-1}p = p\cdot\dfrac{1-q^k}{1-q} = 1 - q^k$.

Therefore:

$$\frac{1 - (1-q^{m+n})}{1 - (1-q^m)} = \frac{q^{m+n}}{q^m} = q^n = 1 - (1 - q^n) = P(X > n) \quad \blacksquare$$
:::info
Given that you have already waited some number of trials without a success, the probability of waiting at least $n$ more trials is exactly the same as if you were starting fresh. The process "forgets" its history.
:::
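The property is easy to confirm numerically using $P(X > k) = (1-p)^k$ (the values of $p$, $m$ and $n$ below are illustrative):

```python
def geo_tail(k: int, p: float) -> float:
    """P(X > k) = (1-p)^k for X ~ Geo(p)."""
    return (1 - p)**k

p, m, n = 0.3, 5, 3
# P(X > m+n | X > m) should equal P(X > n).
conditional = geo_tail(m + n, p) / geo_tail(m, p)
```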

2.5 Cumulative distribution function

$$P(X \leq r) = 1 - q^r = 1 - (1-p)^r$$
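A sketch comparing this closed form with a direct term-by-term sum of the PMF:

```python
def geometric_cdf(r: int, p: float) -> float:
    """P(X <= r) = 1 - (1-p)^r for X ~ Geo(p)."""
    return 1 - (1 - p)**r

p = 0.25
# P(X <= 5) summed term by term versus the closed form.
direct = sum((1 - p)**(r - 1) * p for r in range(1, 6))
closed = geometric_cdf(5, p)
```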

2.6 Geometric hypothesis testing

Example. A bag contains red and blue balls. The probability of drawing a red ball is $p$. In an experiment, the first red ball is drawn on the 10th draw. Test at the 5% level whether $p = 0.3$.

$X \sim \mathrm{Geo}(0.3)$. $H_0: p = 0.3$, $H_1: p < 0.3$ (the first red ball took longer than expected, so $p$ may be smaller).

$p$-value $= P(X \geq 10) = (1-0.3)^{10-1} = 0.7^9 \approx 0.0404 < 0.05$.

Reject $H_0$. There is sufficient evidence that $p < 0.3$.

Critical region approach. For $H_1: p < 0.3$ at the 5% level, find the smallest $c$ such that $P(X \geq c) \leq 0.05$:

$P(X \geq 9) = 0.7^8 \approx 0.0576 > 0.05$; $P(X \geq 10) = 0.7^9 \approx 0.0404 < 0.05$.

Critical region: $X \geq 10$.


3. Modelling with Poisson and Geometric Distributions

3.1 When to use each

| Situation | Distribution |
|---|---|
| Number of events in a fixed interval, rare events | Poisson $\mathrm{Po}(\lambda)$ |
| Number of trials until first success | Geometric $\mathrm{Geo}(p)$ |
| Fixed number of trials, counting successes | Binomial $\mathrm{B}(n, p)$ |

3.2 Poisson as approximation to Binomial

When $n$ is large and $p$ is small such that $np \leq 10$:

$$\mathrm{B}(n, p) \approx \mathrm{Po}(np)$$

Example. $X \sim \mathrm{B}(200, 0.02)$. Then $\lambda = np = 4$, so $X \approx \mathrm{Po}(4)$.

$P(X \leq 2) \approx e^{-4}\left(1 + 4 + \frac{16}{2}\right) = 13e^{-4} \approx 0.2381$.
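The quality of the approximation can be checked side by side (a sketch using the same $n$ and $p$ as the example):

```python
from math import comb, exp, factorial

n, p = 200, 0.02
lam = n * p  # 4.0

# P(X <= 2) under the exact binomial and under the Poisson approximation.
binom = sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(3))
poisson = sum(exp(-lam) * lam**r / factorial(r) for r in range(3))  # 13 e^{-4}
```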

3.3 Conditions check

Before applying the Poisson distribution, verify:

  1. Events occur at a constant average rate
  2. Events are independent
  3. At most one event can occur in a sufficiently small sub-interval
:::warning
Do not confuse this with the normal approximation to the binomial, which requires $np > 5$ and $n(1-p) > 5$.
:::


Problems

Problem 1 A factory produces items with defects occurring at an average rate of 2.5 per hour. Find the probability of exactly 4 defects in a given hour, and the probability of more than 6 defects in a 2-hour period.

Solution 1 For one hour: $X \sim \mathrm{Po}(2.5)$. $P(X=4) = \dfrac{e^{-2.5}(2.5)^4}{4!} = \dfrac{0.08209 \times 39.0625}{24} \approx 0.1336$.

For two hours: $Y \sim \mathrm{Po}(5)$ (by additivity). $P(Y > 6) = 1 - P(Y \leq 6) = 1 - 0.7622 = 0.2378$.

If you get this wrong, revise: Cumulative probabilities — Section 1.6.

Problem 2 A die is rolled repeatedly until a 6 appears. Find the probability that the first 6 appears on the 5th roll, and the probability that it takes more than 10 rolls.

Solution 2 $X \sim \mathrm{Geo}(1/6)$. $P(X=5) = \left(\dfrac{5}{6}\right)^4 \cdot \dfrac{1}{6} = \dfrac{625}{1296} \cdot \dfrac{1}{6} \approx 0.0804$.

$P(X > 10) = 1 - P(X \leq 10) = 1 - (1 - q^{10}) = q^{10} = \left(\dfrac{5}{6}\right)^{10} \approx 0.1615$.

If you get this wrong, revise: Cumulative distribution function — Section 2.5.

Problem 3 Prove that $E(X) = \lambda$ for $X \sim \mathrm{Po}(\lambda)$, showing all steps of the summation.

Solution 3 $E(X) = \sum_{r=0}^{\infty} r\cdot\dfrac{e^{-\lambda}\lambda^r}{r!} = \sum_{r=1}^{\infty}\dfrac{e^{-\lambda}\lambda^r}{(r-1)!} = \lambda e^{-\lambda}\sum_{r=1}^{\infty}\dfrac{\lambda^{r-1}}{(r-1)!}$

Substituting $k = r-1$: $= \lambda e^{-\lambda}\sum_{k=0}^{\infty}\dfrac{\lambda^k}{k!} = \lambda e^{-\lambda}\cdot e^{\lambda} = \lambda$. $\blacksquare$

If you get this wrong, revise: Proof that $E(X) = \lambda$ — Section 1.3.

Problem 4 The number of emails received per hour follows $\mathrm{Po}(8)$. Find the probability of receiving between 6 and 12 emails (inclusive) in a given hour.

Solution 4 $X \sim \mathrm{Po}(8)$. $P(6 \leq X \leq 12) = P(X \leq 12) - P(X \leq 5)$.

$P(X \leq 12) \approx 0.9362$, $P(X \leq 5) \approx 0.1912$, so $P(6 \leq X \leq 12) \approx 0.9362 - 0.1912 = 0.7450$.

If you get this wrong, revise: Cumulative probabilities — Section 1.6.

Problem 5 A manufacturer claims that on average 1 in 20 items is defective. In a batch of 500 items, use the Poisson approximation to find the probability of at most 35 defectives.

Solution 5 $X \sim \mathrm{B}(500, 1/20)$. $\lambda = np = 500/20 = 25$, so $X \approx \mathrm{Po}(25)$.

$P(X \leq 35) = \sum_{r=0}^{35}\dfrac{e^{-25}(25)^r}{r!} \approx 0.977$.

If you get this wrong, revise: Poisson as approximation to Binomial — Section 3.2.

Problem 6 Prove the memoryless property of the geometric distribution: $P(X > m+n \mid X > m) = P(X > n)$.

Solution 6 $P(X > m+n \mid X > m) = \dfrac{P(X > m+n)}{P(X > m)} = \dfrac{q^{m+n}}{q^m} = q^n = P(X > n)$.

This uses $P(X > k) = q^k = (1-p)^k$, which follows from $P(X \leq k) = 1 - q^k$. $\blacksquare$

If you get this wrong, revise: The memoryless property — Section 2.4.

Problem 7 A shop receives an average of 6 customers per 30 minutes. Find the critical region for a test at the 5% significance level of $H_0: \lambda = 6$ against $H_1: \lambda > 6$, where $X$ is the number of customers in a 30-minute period.

Solution 7 Under $H_0$: $X \sim \mathrm{Po}(6)$.

$P(X \geq 10) = 1 - P(X \leq 9) = 1 - 0.9161 = 0.0839 > 0.05$; $P(X \geq 11) = 1 - P(X \leq 10) = 1 - 0.9574 = 0.0426 < 0.05$.

Critical region: $X \geq 11$. Actual significance level: 4.26%.

If you get this wrong, revise: Poisson hypothesis testing — Section 1.7.

Problem 8 $X \sim \mathrm{Geo}(p)$. Find $P(X = 3 \mid X > 1)$ and show it equals $P(X = 2)$.

Solution 8 $P(X = 3 \mid X > 1) = \dfrac{P(X = 3)}{P(X > 1)} = \dfrac{q^2 p}{q} = qp = P(X = 2)$.

This is a direct consequence of the memoryless property: given that the first trial was a failure, the distribution of the number of further trials needed is the same as starting fresh.

If you get this wrong, revise: The memoryless property — Section 2.4.

Problem 9 The number of accidents per week at a junction follows $\mathrm{Po}(3)$. After new traffic lights are installed, 8 accidents are observed in one week. Test at the 5% level whether the rate has increased.

Solution 9 $X \sim \mathrm{Po}(3)$. $H_0: \lambda = 3$, $H_1: \lambda > 3$. $\alpha = 0.05$.

$p$-value $= P(X \geq 8) = 1 - P(X \leq 7) = 1 - 0.9881 = 0.0119 < 0.05$.

Reject $H_0$. There is sufficient evidence that the accident rate has increased.

Alternatively, by the critical region: $P(X \geq 7) = 1 - 0.9665 = 0.0335 < 0.05$ and $P(X \geq 6) = 1 - 0.9161 = 0.0839 > 0.05$, so the critical region is $X \geq 7$. Since $X = 8 \geq 7$, reject $H_0$.

If you get this wrong, revise: Poisson hypothesis testing — Section 1.7.

Problem 10 If $X \sim \mathrm{Geo}(p)$, find $E(X(X-1))$ and hence verify that $\mathrm{Var}(X) = \dfrac{1-p}{p^2}$.

Solution 10 $E(X(X-1)) = \sum_{r=2}^{\infty} r(r-1)q^{r-1}p = pq\sum_{r=2}^{\infty} r(r-1)q^{r-2}$.

Since $\sum_{r=0}^{\infty} q^r = \dfrac{1}{1-q}$, differentiating twice gives $\sum_{r=2}^{\infty} r(r-1)q^{r-2} = \dfrac{2}{(1-q)^3}$.

$E(X(X-1)) = pq \cdot \dfrac{2}{p^3} = \dfrac{2q}{p^2}$.

$E(X^2) = E(X(X-1)) + E(X) = \dfrac{2q}{p^2} + \dfrac{1}{p} = \dfrac{2q+p}{p^2} = \dfrac{2-p}{p^2}$.

$\mathrm{Var}(X) = \dfrac{2-p}{p^2} - \dfrac{1}{p^2} = \dfrac{1-p}{p^2}$. $\blacksquare$

If you get this wrong, revise: Proof that $\mathrm{Var}(X) = \dfrac{1-p}{p^2}$ — Section 2.3.


7. Advanced Worked Examples

Example 7.1: Poisson approximation to binomial

Problem. A factory produces items with a defect rate of 0.02. In a batch of 200 items, find the probability of exactly 3 defective items using (a) the binomial distribution and (b) the Poisson approximation.

Solution. (a) Binomial: $X \sim \mathrm{Bin}(200, 0.02)$.

$$P(X = 3) = \binom{200}{3}(0.02)^3(0.98)^{197} = \frac{200 \times 199 \times 198}{6} \times 8 \times 10^{-6} \times (0.98)^{197} \approx 0.1963$$

(b) Poisson approximation: $\lambda = np = 200 \times 0.02 = 4$. $X \approx \mathrm{Po}(4)$.

$$P(X = 3) = \frac{e^{-4} \cdot 4^3}{3!} = \frac{64}{6e^4} = \frac{32}{3e^4} \approx 0.1954$$

The approximation is valid since $n \geq 50$ and $p \leq 0.1$.

Example 7.2: Geometric distribution and memoryless property

Problem. A fair die is rolled until a 6 appears. Find the probability that more than 4 rolls are needed. Verify the memoryless property: $P(X > m + n \mid X > m) = P(X > n)$.

Solution. $X \sim \mathrm{Geo}(1/6)$.

$$P(X > 4) = \left(\frac{5}{6}\right)^4 = \frac{625}{1296} \approx 0.4823$$

Memoryless property:

$$P(X > m + n \mid X > m) = \frac{P(X > m + n)}{P(X > m)} = \frac{(5/6)^{m+n}}{(5/6)^m} = \left(\frac{5}{6}\right)^n = P(X > n) \quad \blacksquare$$

Example 7.3: Cumulative Poisson probabilities

Problem. Calls arrive at a call centre at a rate of 2.5 per minute. Find the probability that more than 5 calls arrive in a 3-minute period.

Solution. For a 3-minute period: $\lambda = 2.5 \times 3 = 7.5$. $X \sim \mathrm{Po}(7.5)$.

$$P(X > 5) = 1 - P(X \leq 5) = 1 - \sum_{k=0}^{5}\frac{e^{-7.5}(7.5)^k}{k!}$$

$$= 1 - e^{-7.5}\!\left(1 + 7.5 + \frac{7.5^2}{2} + \frac{7.5^3}{6} + \frac{7.5^4}{24} + \frac{7.5^5}{120}\right)$$

$$= 1 - e^{-7.5}\!\left(1 + 7.5 + 28.125 + 70.3125 + 131.836 + 197.754\right)$$

$$= 1 - e^{-7.5} \times 436.53 \approx 1 - 0.2414 = 0.7586$$

Example 7.4: Hypothesis testing with the Poisson distribution

Problem. A traffic survey records the number of cars passing a point in 10-second intervals. The observed frequencies for $k$ cars are compared with the expected frequencies under $H_0$: $X \sim \mathrm{Po}(3)$. Calculate the expected frequency for each value of $k$ if 200 intervals were observed.

Solution. Under $H_0$: $P(X = k) = \dfrac{e^{-3} \cdot 3^k}{k!}$.

| $k$ | $P(X = k)$ | Expected freq ($\times 200$) |
|---|---|---|
| 0 | $e^{-3} = 0.0498$ | 9.96 |
| 1 | $3e^{-3} = 0.1494$ | 29.87 |
| 2 | $4.5e^{-3} = 0.2240$ | 44.81 |
| 3 | $4.5e^{-3} = 0.2240$ | 44.81 |
| 4 | $3.375e^{-3} = 0.1680$ | 33.60 |
| 5 | $2.025e^{-3} = 0.1008$ | 20.17 |
| $\geq 6$ | $1 - \sum_{k=0}^{5} P(X = k) = 0.0839$ | 16.78 |

Example 7.5: Fitting a Poisson distribution

Problem. The number of email messages received per hour is recorded over 100 hours: $\{0: 5,\ 1: 15,\ 2: 25,\ 3: 30,\ 4: 15,\ 5: 7,\ 6: 3\}$. Estimate the parameter $\lambda$ and calculate expected frequencies.

Solution. $\bar{x} = \dfrac{0(5) + 1(15) + 2(25) + 3(30) + 4(15) + 5(7) + 6(3)}{100} = \dfrac{268}{100} = 2.68$, so $\hat{\lambda} = 2.68$.

Expected frequency for $k$ messages: $100 \times \dfrac{e^{-2.68}(2.68)^k}{k!}$.

| $k$ | Expected |
|---|---|
| 0 | $100e^{-2.68} = 6.86$ |
| 1 | $100 \times 2.68 e^{-2.68} = 18.38$ |
| 2 | $100 \times 3.59 e^{-2.68} = 24.64$ |
| 3 | $100 \times 3.21 e^{-2.68} = 22.02$ |
| 4 | $100 \times 2.15 e^{-2.68} = 14.76$ |
| 5 | $100 \times 1.15 e^{-2.68} = 7.91$ |
| 6 | $100 \times 0.51 e^{-2.68} = 3.52$ |
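The estimate $\hat{\lambda}$ is simply the sample mean of the recorded frequencies; a short sketch (the dictionary layout is our own):

```python
# Observed data: k messages per hour -> number of hours with that count.
freq = {0: 5, 1: 15, 2: 25, 3: 30, 4: 15, 5: 7, 6: 3}

n = sum(freq.values())                             # total hours observed
lam_hat = sum(k * f for k, f in freq.items()) / n  # sample mean = 2.68
```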

Example 7.6: Conditional probability with geometric distribution

Problem. In a game, the probability of winning each round is $p = 0.3$ independently. Given that a player has not won in the first 5 rounds, find the probability that they win within the next 3 rounds.

Solution. $X \sim \mathrm{Geo}(0.3)$. By the memoryless property:

$$P(X \leq 8 \mid X > 5) = P(X \leq 3) = 1 - (0.7)^3 = 1 - 0.343 = 0.657$$

Example 7.7: Sum of independent Poisson variables

Problem. $X \sim \mathrm{Po}(3)$ and $Y \sim \mathrm{Po}(5)$ are independent. State the distribution of $X + Y$ and find $P(X + Y = 6)$.

Solution. $X + Y \sim \mathrm{Po}(3 + 5) = \mathrm{Po}(8)$.

$$P(X + Y = 6) = \frac{e^{-8} \cdot 8^6}{6!} = \frac{262144 \cdot e^{-8}}{720} \approx 0.1221$$
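Additivity can also be confirmed by convolving the two PMFs directly (a sketch; `po` is our own shorthand):

```python
from math import exp, factorial

def po(r: int, lam: float) -> float:
    """P(X = r) for X ~ Po(lam)."""
    return exp(-lam) * lam**r / factorial(r)

# P(X + Y = 6) by direct convolution of Po(3) and Po(5)...
conv = sum(po(k, 3) * po(6 - k, 5) for k in range(7))
# ...agrees with the Po(8) probability.
direct = po(6, 8)
```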

Example 7.8: Poisson as a limiting case

Problem. Prove that if $X \sim \mathrm{Bin}(n, p)$ with $\lambda = np$ fixed as $n \to \infty$, then $P(X = k) \to \dfrac{e^{-\lambda}\lambda^k}{k!}$.

Solution.

$$P(X = k) = \binom{n}{k}p^k(1-p)^{n-k} = \frac{n!}{k!(n-k)!}\cdot\frac{\lambda^k}{n^k}\cdot\left(1-\frac{\lambda}{n}\right)^{n-k}$$

$$= \frac{\lambda^k}{k!}\cdot\frac{n(n-1)\cdots(n-k+1)}{n^k}\cdot\left(1-\frac{\lambda}{n}\right)^{n-k}$$

As $n \to \infty$: $\dfrac{n(n-1)\cdots(n-k+1)}{n^k} \to 1$ and $\left(1-\dfrac{\lambda}{n}\right)^{n-k} \to e^{-\lambda}$.

Therefore $P(X = k) \to \dfrac{e^{-\lambda}\lambda^k}{k!}$. $\blacksquare$


8. Connections to Other Topics

8.1 Poisson distribution and exponential distribution

If events occur according to a Poisson process with rate $\lambda$, the time between consecutive events follows the exponential distribution $\mathrm{Exp}(\lambda)$. See Exponential and Continuous Random Variables.

8.2 Geometric distribution and series summation

The probability generating function $G_X(t) = \dfrac{pt}{1-qt}$ of the geometric distribution connects to the summation of geometric series. See Further Algebra.

8.3 Poisson and hypothesis testing

Goodness-of-fit tests using the chi-squared statistic compare observed and expected (Poisson) frequencies. See Chi-Squared Tests.


9. Additional Exam-Style Questions

Question 11

A shop receives on average 4 customers per hour. Find the probability that: (a) Exactly 3 customers arrive in a given hour. (b) More than 2 customers arrive in a 30-minute period.

Solution

(a) $X \sim \mathrm{Po}(4)$.

$$P(X = 3) = \frac{e^{-4}\cdot 64}{6} = \frac{32}{3e^4} \approx 0.1954$$

(b) For 30 minutes: $Y \sim \mathrm{Po}(2)$.

$$P(Y > 2) = 1 - P(Y \leq 2) = 1 - e^{-2}(1 + 2 + 2) = 1 - 5e^{-2} \approx 0.3233$$

Question 12

A coin is tossed until the first head appears. The probability of heads is $p$.

(a) Find $E(X)$ and $\mathrm{Var}(X)$ where $X$ is the number of tosses needed.

(b) Find the probability that $X$ is even.

Solution

(a) $X \sim \mathrm{Geo}(p)$: $E(X) = 1/p$, $\mathrm{Var}(X) = (1-p)/p^2$.

(b) $P(X \text{ is even}) = P(X = 2) + P(X = 4) + P(X = 6) + \cdots$

$$= qp + q^3p + q^5p + \cdots = qp(1 + q^2 + q^4 + \cdots) = qp \cdot \frac{1}{1 - q^2} = \frac{qp}{(1-q)(1+q)} = \frac{q}{1+q}$$

Question 13

Prove that if $X_1, X_2, \ldots, X_n$ are independent with $X_i \sim \mathrm{Po}(\lambda_i)$, then $S = \sum X_i \sim \mathrm{Po}\!\left(\sum \lambda_i\right)$.

Solution

The probability generating function of $X_i \sim \mathrm{Po}(\lambda_i)$ is $G_{X_i}(t) = e^{\lambda_i(t-1)}$.

For independent random variables, the PGF of the sum is the product:

$$G_S(t) = \prod_{i=1}^{n}e^{\lambda_i(t-1)} = e^{(t-1)\sum\lambda_i}$$

This is the PGF of $\mathrm{Po}\!\left(\sum\lambda_i\right)$. Therefore $S \sim \mathrm{Po}\!\left(\sum\lambda_i\right)$. $\blacksquare$

Question 14

A typist makes on average 2 errors per page. Find the probability that a particular page has: (a) No errors. (b) At most 3 errors. (c) Exactly 2 errors given that it has at most 3 errors.

Solution

$X \sim \mathrm{Po}(2)$.

(a) $P(X = 0) = e^{-2} \approx 0.1353$.

(b) $P(X \leq 3) = e^{-2}(1 + 2 + 2 + 4/3) = e^{-2} \cdot 19/3 \approx 0.8571$.

(c) $P(X = 2 \mid X \leq 3) = \dfrac{P(X = 2)}{P(X \leq 3)} = \dfrac{2e^{-2}}{19e^{-2}/3} = \dfrac{6}{19} \approx 0.3158$.

Question 15

The number of radioactive decays per second from a sample is modelled by $X \sim \mathrm{Po}(\lambda)$. Over 50 seconds, 145 decays are observed.

(a) Estimate $\lambda$.

(b) Using your estimate, find the probability of observing exactly 3 decays in a 1-second interval.

Solution

(a) $\hat{\lambda} = 145/50 = 2.9$ per second.

(b) $P(X = 3) = \dfrac{e^{-2.9}(2.9)^3}{6} = \dfrac{24.389 \cdot e^{-2.9}}{6} \approx 0.2237$.


10. Further Worked Examples

Example 10.1: Poisson as a limit of the binomial

Problem. A factory produces items with a defect rate of 0.002. In a batch of 1000, find the probability of exactly 3 defects using (a) the binomial distribution and (b) the Poisson approximation.

Solution. (a) $X \sim \mathrm{B}(1000, 0.002)$: $P(X=3) = \binom{1000}{3}(0.002)^3(0.998)^{997} \approx 0.1806$.

(b) $\lambda = np = 2$. $X \approx \mathrm{Po}(2)$: $P(X=3) = \dfrac{e^{-2} \cdot 8}{6} \approx 0.1804$.

The approximation is excellent: the two values differ by about 0.1%.

Example 10.2: Sum of independent Poisson random variables

Problem. Emails arrive at a rate of 5 per hour and texts at 3 per hour. Find the probability that the total number of messages in a 2-hour period exceeds 20.

Solution. In 2 hours: emails $\sim \mathrm{Po}(10)$, texts $\sim \mathrm{Po}(6)$.

Total messages $\sim \mathrm{Po}(10+6) = \mathrm{Po}(16)$.

$$P(X > 20) = 1 - P(X \leq 20) = 1 - \sum_{k=0}^{20} \frac{e^{-16} \cdot 16^k}{k!} \approx 1 - 0.8688 = \boxed{0.131}$$

Example 10.3: Conditional probability with the geometric distribution

Problem. $X \sim \mathrm{Geo}(0.3)$. Find $P(X > 4 \mid X > 2)$.

Solution. The geometric distribution has the memoryless property:

$$P(X > 4 \mid X > 2) = P(X > 2) = (1-0.3)^2 = 0.49$$

Verification: $P(X > 4) = 0.7^4 = 0.2401$, $P(X > 2) = 0.49$, and $\dfrac{0.2401}{0.49} = 0.49$. ✓

Example 10.4: Poisson hypothesis testing

Problem. A call centre claims an average of 6 calls per minute. In a 10-minute period, 72 calls are received. Test at the 5% level whether the rate has increased.

Solution. $H_0$: $\lambda = 6$ per minute. $H_1$: $\lambda > 6$.

Under $H_0$, the total number of calls in 10 minutes $\sim \mathrm{Po}(60)$.

For large $\lambda$, approximate with $N(60, 60)$. Using a continuity correction:

$$P(X \geq 72) \approx P\!\left(Z \geq \frac{71.5 - 60}{\sqrt{60}}\right) = P(Z \geq 1.485) = 1 - 0.9311 = 0.069$$

$0.069 > 0.05$: do not reject $H_0$. There is insufficient evidence that the rate has increased.

Example 10.5: Mode of the Poisson distribution

Problem. Find the mode of the Poisson distribution with parameter $\lambda$.

Solution. The mode $m$ satisfies $P(X = m) \geq P(X = m-1)$ and $P(X = m) \geq P(X = m+1)$.

$$\frac{e^{-\lambda}\lambda^m}{m!} \geq \frac{e^{-\lambda}\lambda^{m-1}}{(m-1)!} \implies \frac{\lambda}{m} \geq 1 \implies m \leq \lambda$$

$$\frac{e^{-\lambda}\lambda^m}{m!} \geq \frac{e^{-\lambda}\lambda^{m+1}}{(m+1)!} \implies \frac{m+1}{\lambda} \geq 1 \implies m \geq \lambda - 1$$

So $\lambda - 1 \leq m \leq \lambda$: for non-integer $\lambda$ the mode is $\lfloor\lambda\rfloor$, and when $\lambda$ is an integer both $\lambda - 1$ and $\lambda$ are modes.
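The result can be spot-checked by taking an argmax over the PMF (a sketch; `poisson_mode` is our own helper, and the support cut-off of 100 terms is arbitrary but sufficient for small $\lambda$):

```python
from math import exp, factorial

def po(r: int, lam: float) -> float:
    """P(X = r) for X ~ Po(lam)."""
    return exp(-lam) * lam**r / factorial(r)

def poisson_mode(lam: float, support: int = 100) -> int:
    """Argmax of the Po(lam) PMF over r = 0, ..., support - 1."""
    return max(range(support), key=lambda r: po(r, lam))

# For non-integer lam the mode is floor(lam).
modes = [poisson_mode(lam) for lam in (2.3, 4.9, 7.5)]
```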

Example 10.6: Relationship between Poisson and exponential

Problem. Events occur according to a Poisson process with rate $\lambda = 4$ per hour. Find the probability that the time between two consecutive events exceeds 30 minutes.

Solution. For a Poisson process with rate $\lambda$, the inter-arrival time $T \sim \mathrm{Exp}(\lambda)$.

$$P(T > 0.5) = e^{-4 \times 0.5} = e^{-2} \approx \boxed{0.135}$$

Example 10.7: Variance of the geometric distribution

Problem. Derive $\mathrm{Var}(X)$ for $X \sim \mathrm{Geo}(p)$, defined as the number of trials until the first success.

Solution. $E(X) = \dfrac{1}{p}$. Using $\mathrm{Var}(X) = E(X^2) - [E(X)]^2$:

$$E(X^2) = \sum_{k=1}^{\infty} k^2 p(1-p)^{k-1}$$

Using the identity $\sum_{k=1}^{\infty} k^2 r^{k-1} = \dfrac{1+r}{(1-r)^3}$ with $r = 1-p$:

$$E(X^2) = \frac{p(2-p)}{p^3} = \frac{2-p}{p^2}$$

$$\mathrm{Var}(X) = \frac{2-p}{p^2} - \frac{1}{p^2} = \boxed{\frac{1-p}{p^2}}$$


11. Common Pitfalls

| Pitfall | Correct Approach |
|---|---|
| Confusing the two definitions of the geometric distribution | "Number of trials until first success": $E(X) = 1/p$; "number of failures before first success": $E(X) = (1-p)/p$ |
| Using the Poisson approximation when $np > 10$ or $n < 20$ | The Poisson approximation requires $n$ large and $p$ small, with $np$ moderate |
| Forgetting that Poisson probabilities sum to 1 only over all $k$ from 0 to $\infty$ | Never truncate a sum without accounting for the omitted tail |
| Applying the Poisson to events that are not independent | The Poisson process requires independent events at a constant average rate |

12. Additional Exam-Style Questions

Question 16

A typist makes an average of 2 errors per page. Find the probability that a 3-page document contains exactly 5 errors.

Solution

Total errors $\sim \mathrm{Po}(6)$.

$$P(X = 5) = \frac{e^{-6} \cdot 6^5}{5!} = \frac{7776 \cdot e^{-6}}{120} \approx \boxed{0.1606}$$

Question 9

Prove that for $X \sim \mathrm{Po}(\lambda)$, $E(X) = \lambda$.

Solution

$$E(X) = \sum_{k=0}^{\infty} k \cdot \frac{e^{-\lambda}\lambda^k}{k!} = \sum_{k=1}^{\infty} \frac{e^{-\lambda}\lambda^k}{(k-1)!}$$

Let $j = k-1$:

$$= \lambda e^{-\lambda} \sum_{j=0}^{\infty} \frac{\lambda^j}{j!} = \lambda e^{-\lambda} \cdot e^{\lambda} = \lambda \qquad \blacksquare$$

Question 10

$X \sim \mathrm{Geo}(0.25)$. Find $P(X \leq 5)$ and $P(X > 3)$.

Solution

$$P(X \leq 5) = 1 - P(X > 5) = 1 - (1-0.25)^5 = 1 - 0.75^5 = 1 - 0.2373 = \boxed{0.7627}$$

$$P(X > 3) = 0.75^3 = \boxed{0.4219}$$
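Both answers follow from the geometric tail formula $P(X > n) = (1-p)^n$; a short Python check:

```python
def geo_tail(p, n):
    """P(X > n) for Geo(p), trials convention: the first n trials all fail."""
    return (1 - p) ** n

print(round(1 - geo_tail(0.25, 5), 4))  # P(X <= 5) = 0.7627
print(round(geo_tail(0.25, 3), 4))      # P(X > 3)  = 0.4219
```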


11. Connections to Other Topics

11.1 Poisson process and exponential distribution

The inter-arrival times of a Poisson process follow the exponential distribution: if events occur at rate $\lambda$ per unit time, the time between consecutive events is $\mathrm{Exp}(\lambda)$. See Exponential and Continuous Random Variables.

11.2 Poisson and binomial

The Poisson distribution approximates the binomial when $n$ is large and $p$ is small, with $\lambda = np$.

11.3 Poisson and chi-squared tests

The chi-squared goodness-of-fit test is used to test whether data follows a Poisson or geometric distribution. See Chi-Squared Tests.


12. Key Results Summary

| Distribution | PMF | $E(X)$ | $\mathrm{Var}(X)$ |
| --- | --- | --- | --- |
| $\mathrm{Po}(\lambda)$ | $P(X=x) = \dfrac{e^{-\lambda}\lambda^x}{x!}$ | $\lambda$ | $\lambda$ |
| $\mathrm{Geo}(p)$ (trials) | $P(X=x) = p(1-p)^{x-1}$ | $\dfrac{1}{p}$ | $\dfrac{1-p}{p^2}$ |
| $\mathrm{Geo}(p)$ (failures) | $P(X=x) = p(1-p)^x$ | $\dfrac{1-p}{p}$ | $\dfrac{1-p}{p^2}$ |

| Property | Poisson | Geometric |
| --- | --- | --- |
| Memoryless | No | Yes |
| Additive: $X_1+X_2$ | $\mathrm{Po}(\lambda_1+\lambda_2)$ if independent | Not simple |
| PMF tail behaviour | Decays faster than geometric | Slower decay |

13. Further Exam-Style Questions

Question 11

A shop receives customers at a rate of 8 per hour. Find the probability that: (a) exactly 5 customers arrive in a 30-minute period; (b) more than 10 customers arrive in an hour; (c) the time between two consecutive arrivals exceeds 20 minutes.

Solution

(a) $\lambda = 8 \times 0.5 = 4$. $P(X=5) = \dfrac{e^{-4} \cdot 4^5}{5!} = \dfrac{1024\,e^{-4}}{120} \approx \boxed{0.1563}$.

(b) $\lambda = 8$. $P(X > 10) = 1 - P(X \leq 10) = 1 - \displaystyle\sum_{k=0}^{10}\dfrac{e^{-8} \cdot 8^k}{k!} \approx 1 - 0.8159 = \boxed{0.1841}$.

(c) Inter-arrival time $T \sim \mathrm{Exp}(8)$; 20 minutes is $\tfrac{1}{3}$ hour, so $P(T > 1/3) = e^{-8/3} \approx \boxed{0.0695}$.
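All three parts can be verified with a few lines of Python (a sketch, with the quantities named `a`, `b`, `c` for the three parts):

```python
import math

# (a) 30 minutes at 8 per hour gives lambda = 4
lam_a = 8 * 0.5
a = math.exp(-lam_a) * lam_a ** 5 / math.factorial(5)

# (b) more than 10 arrivals in an hour, lambda = 8
b = 1 - sum(math.exp(-8) * 8 ** k / math.factorial(k) for k in range(11))

# (c) gap exceeding 20 minutes = 1/3 hour, T ~ Exp(8)
c = math.exp(-8 / 3)

print(round(a, 4), round(b, 4), round(c, 4))  # 0.1563 0.1841 0.0695
```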

Question 12

Prove that for $X \sim \mathrm{Po}(\lambda)$, $\mathrm{Var}(X) = \lambda$.

Solution

$$E(X^2) = \sum_{k=0}^{\infty} k^2 \cdot \frac{e^{-\lambda}\lambda^k}{k!} = \sum_{k=1}^{\infty} k \cdot \frac{e^{-\lambda}\lambda^k}{(k-1)!} = \lambda e^{-\lambda}\sum_{j=0}^{\infty}(j+1)\frac{\lambda^j}{j!}$$

$$= \lambda e^{-\lambda}\left(\sum_{j=0}^{\infty} j\,\frac{\lambda^j}{j!} + \sum_{j=0}^{\infty}\frac{\lambda^j}{j!}\right) = \lambda e^{-\lambda}\left(\lambda e^{\lambda} + e^{\lambda}\right) = \lambda(\lambda+1) = \lambda^2+\lambda$$

$$\mathrm{Var}(X) = E(X^2)-[E(X)]^2 = \lambda^2+\lambda-\lambda^2 = \boxed{\lambda} \qquad \blacksquare$$
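The result $E(X) = \mathrm{Var}(X) = \lambda$ can be confirmed numerically by summing the PMF; a Python sketch using the recurrence $P(X=k) = P(X=k-1)\cdot\lambda/k$ for numerical stability (the choice $\lambda = 7.5$ is arbitrary):

```python
import math

def poisson_mean_var(lam, n_terms=200):
    """E(X) and Var(X) for Po(lam), summing the PMF and truncating the negligible tail."""
    pk = math.exp(-lam)      # P(X = 0)
    ex = ex2 = 0.0
    for k in range(1, n_terms):
        pk *= lam / k        # P(X = k) from P(X = k - 1)
        ex += k * pk
        ex2 += k * k * pk
    return ex, ex2 - ex ** 2

mean, var = poisson_mean_var(7.5)
print(round(mean, 6), round(var, 6))  # both 7.5
```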


14. Advanced Topics

14.1 Superposition of Poisson processes

If events of type $A$ occur at rate $\lambda_A$ and events of type $B$ occur at rate $\lambda_B$, independently, then the combined event process is Poisson with rate $\lambda_A + \lambda_B$.

14.2 Poisson distribution and the Poisson point process

A Poisson point process in 2D with rate $\lambda$ per unit area has the property that the number of points in a region of area $A$ follows $\mathrm{Po}(\lambda A)$.

14.3 The geometric distribution as a special case of the negative binomial

The negative binomial distribution counts the number of trials until the $r$-th success; the geometric distribution is the case $r = 1$.

$\mathrm{NegBin}(r, p)$: $P(X = n) = \dbinom{n-1}{r-1}p^r(1-p)^{n-r}$ for $n = r, r+1, \ldots$

14.4 Relationship to exponential families

Both the Poisson and geometric distributions belong to the exponential family of distributions, whose PDF/PMF takes the form $f(x;\theta) = h(x)\exp\bigl(\eta(\theta)T(x) - A(\theta)\bigr)$.


15. Further Exam-Style Questions

Question 13

A radioactive source emits particles at a rate of 12 per minute. Find the probability that in a 2-minute period, the number of particles emitted is between 20 and 30 (inclusive).

Solution

$\lambda = 24$ per 2 minutes, so $X \sim \mathrm{Po}(24)$.

$$P(20 \leq X \leq 30) = P(X \leq 30) - P(X \leq 19)$$

Using the normal approximation $X \approx N(24, 24)$ with a continuity correction:

$$P(19.5 < X < 30.5) \approx P\!\left(\frac{19.5-24}{\sqrt{24}} < Z < \frac{30.5-24}{\sqrt{24}}\right)$$

$$= P(-0.919 < Z < 1.327) = \Phi(1.327) - \Phi(-0.919) = 0.908 - 0.179 = \boxed{0.729}$$
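The quality of the continuity-corrected approximation can be checked against the exact Poisson sum; a Python sketch (using `math.erf` for $\Phi$):

```python
import math

def poisson_cdf(lam, n):
    """P(X <= n) for X ~ Po(lam)."""
    pk = math.exp(-lam)
    total = pk
    for k in range(1, n + 1):
        pk *= lam / k
        total += pk
    return total

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

exact = poisson_cdf(24, 30) - poisson_cdf(24, 19)
approx = phi((30.5 - 24) / math.sqrt(24)) - phi((19.5 - 24) / math.sqrt(24))
print(round(exact, 4), round(approx, 4))
```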

Question 14

Prove that if $X$ and $Y$ are independent with $X \sim \mathrm{Geo}(p)$ and $Y \sim \mathrm{Geo}(p)$, then $\min(X,Y) \sim \mathrm{Geo}\!\left(1-(1-p)^2\right)$.

Solution

$$P(\min(X,Y) > n) = P(X > n)\,P(Y > n) = (1-p)^n (1-p)^n = (1-p)^{2n}$$

$$P(\min(X,Y) = n) = P(\min > n-1) - P(\min > n) = (1-p)^{2(n-1)} - (1-p)^{2n} = (1-p)^{2n-2}\left[1-(1-p)^2\right]$$

This is the PMF of a geometric distribution with success probability $q = 1-(1-p)^2$. $\blacksquare$
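Since a geometric distribution is determined by its tail probabilities, the identity $P(\min > n) = (1-q)^n$ can be verified directly; a Python sketch with an arbitrary $p = 0.4$:

```python
def geo_survival(p, n):
    """P(X > n) for Geo(p), trials convention."""
    return (1 - p) ** n

p = 0.4
q = 1 - (1 - p) ** 2                 # claimed success probability of min(X, Y)
for n in range(10):
    lhs = geo_survival(p, n) ** 2    # P(min(X, Y) > n), by independence
    rhs = geo_survival(q, n)         # P(M > n) for M ~ Geo(q)
    assert abs(lhs - rhs) < 1e-12
print("tail probabilities agree for n = 0..9")
```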


16. Further Advanced Topics

16.1 The Poisson process — formal definition

A Poisson process with rate $\lambda$ is a counting process $N(t)$ satisfying:

  1. $N(0) = 0$
  2. Independent increments
  3. $N(t+s) - N(s) \sim \mathrm{Po}(\lambda t)$ for all $s, t \geq 0$

16.2 Conditional distributions

For independent $X \sim \mathrm{Po}(\lambda_1)$ and $Y \sim \mathrm{Po}(\lambda_2)$:

$$P(X = k \mid X + Y = n) = \binom{n}{k}\left(\frac{\lambda_1}{\lambda_1+\lambda_2}\right)^k\left(\frac{\lambda_2}{\lambda_1+\lambda_2}\right)^{n-k}$$

This is $\mathrm{Bin}\!\left(n, \lambda_1/(\lambda_1+\lambda_2)\right)$: the conditional distribution is binomial!
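The claimed binomial form can be checked cell by cell from the definition of conditional probability; a Python sketch with arbitrary rates $\lambda_1 = 2$, $\lambda_2 = 3$:

```python
import math

def poisson_pmf(lam, k):
    return math.exp(-lam) * lam ** k / math.factorial(k)

def binom_pmf(n, k, p):
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

lam1, lam2, n = 2.0, 3.0, 6
for k in range(n + 1):
    # P(X = k | X + Y = n) from the definition of conditional probability
    cond = poisson_pmf(lam1, k) * poisson_pmf(lam2, n - k) / poisson_pmf(lam1 + lam2, n)
    assert abs(cond - binom_pmf(n, k, lam1 / (lam1 + lam2))) < 1e-12
print("matches Bin(6, 0.4)")
```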

16.3 The negative binomial distribution

The number of trials until the rr-th success follows NegBin(r,p)\mathrm{NegBin}(r, p):

P(X=n)=(n1r1)pr(1p)nrfor n=r,r+1,P(X = n) = \binom{n-1}{r-1}p^r(1-p)^{n-r} \quad \text{for } n = r, r+1, \ldots

E(X)=rpE(X) = \dfrac{r}{p}, Var(X)=r(1p)p2\mathrm{Var}(X) = \dfrac{r(1-p)}{p^2}.

The geometric distribution is NegBin(1,p)\mathrm{NegBin}(1, p).

16.4 Poisson goodness-of-fit

To test whether data follow $\mathrm{Po}(\lambda)$:

  1. Estimate $\hat{\lambda} = \bar{x}$
  2. Calculate expected frequencies using $\hat{\lambda}$
  3. Apply the chi-squared test, losing one further degree of freedom for the estimated parameter
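The three steps can be sketched in Python; the observed frequencies below are hypothetical, and the final cell is grouped as "4 or more" (treated as exactly 4 when estimating the mean, an approximation made for this sketch):

```python
import math

# Hypothetical observed frequencies: index = events per interval, last cell = "4 or more".
observed = [31, 36, 20, 9, 4]
n = sum(observed)

# Step 1: estimate lambda by the sample mean.
lam_hat = sum(k * f for k, f in enumerate(observed)) / n

# Step 2: expected frequencies under Po(lam_hat); the final cell absorbs the upper tail.
pk = math.exp(-lam_hat)
probs = [pk]
for k in range(1, len(observed) - 1):
    pk *= lam_hat / k
    probs.append(pk)
probs.append(1 - sum(probs))
expected = [n * p for p in probs]

# Step 3: chi-squared statistic, compared against (cells - 1 - 1) degrees of freedom,
# the extra 1 lost for estimating lambda from the data.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(lam_hat, 2), round(chi_sq, 3))
```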

17. Further Exam-Style Questions

Question 15

Calls arrive at rate 3 per hour. Find the probability that the third call arrives before time t=1t = 1 hour.

Solution

The time of the 3rd call is $\mathrm{Gamma}(3, 3)$ (the sum of 3 independent $\mathrm{Exp}(3)$ variables).

$$P(T_3 < 1) = P(\text{at least 3 calls in 1 hour}) = \sum_{k=3}^{\infty}\frac{e^{-3}3^k}{k!}$$

$$= 1 - P(X \leq 2) = 1 - e^{-3}\left(1 + 3 + \frac{9}{2}\right) = 1 - 8.5\,e^{-3}$$

$$\approx 1 - 0.4232 = \boxed{0.577}$$
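The duality $P(T_3 < 1) = P(\text{at least 3 events in the hour})$ makes this a one-function computation; a Python sketch:

```python
import math

def poisson_at_least(lam, m):
    """P(N >= m) for N ~ Po(lam): equivalently, the m-th arrival occurs before the interval ends."""
    pk = math.exp(-lam)
    cdf = pk
    for k in range(1, m):
        pk *= lam / k
        cdf += pk
    return 1 - cdf

print(round(poisson_at_least(3, 3), 3))  # 0.577
```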

Question 16

Prove that for $X \sim \mathrm{Geo}(p)$, the moment generating function is $M_X(t) = \dfrac{pe^t}{1-(1-p)e^t}$ for $t < -\ln(1-p)$.

Solution

$$M_X(t) = \sum_{n=1}^{\infty} e^{tn}\, p(1-p)^{n-1} = \frac{p}{1-p}\sum_{n=1}^{\infty} \left[(1-p)e^t\right]^n$$

$$= \frac{p}{1-p} \cdot \frac{(1-p)e^t}{1-(1-p)e^t} = \frac{pe^t}{1-(1-p)e^t}$$

This geometric series converges when $|(1-p)e^t| < 1$, i.e. when $t < -\ln(1-p)$. $\blacksquare$