Statistical Distributions (Extended Treatment)
This document provides rigorous coverage of the binomial, normal, and Poisson distributions, their
approximations, and hypothesis testing applications.
Always state the distribution you are using in full, including the parameter values, before
calculating probabilities. For example: "$X \sim B(20, 0.3)$".
1. The Binomial Distribution
1.1 Definition
A random variable $X$ has a binomial distribution with parameters $n$ and $p$ (written
$X \sim B(n, p)$) if:
$$P(X = r) = \binom{n}{r}p^r(1-p)^{n-r}, \quad r = 0, 1, 2, \ldots, n$$
Conditions for a binomial distribution:
A fixed number $n$ of trials.
Each trial has exactly two outcomes (success/failure).
The probability of success $p$ is constant for each trial.
Trials are independent.
1.2 Mean and variance
$$E(X) = np, \qquad \mathrm{Var}(X) = np(1-p)$$
Proof of $E(X) = np$.
$$E(X) = \sum_{r=0}^{n} r\binom{n}{r}p^r(1-p)^{n-r} = \sum_{r=1}^{n} r\binom{n}{r}p^r(1-p)^{n-r}$$
Using $r\binom{n}{r} = n\binom{n-1}{r-1}$:
$$= np\sum_{r=1}^{n}\binom{n-1}{r-1}p^{r-1}(1-p)^{n-r} = np\sum_{k=0}^{n-1}\binom{n-1}{k}p^k(1-p)^{n-1-k} = np \cdot 1 = np \quad \blacksquare$$
1.3 Cumulative probabilities
$$P(X \leq r) = \sum_{k=0}^{r}\binom{n}{k}p^k(1-p)^{n-k}$$
$$P(X \geq r) = 1 - P(X \leq r-1)$$
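The two formulas above translate directly into a few lines of Python. This is a minimal sketch using only the standard library; the function names `binom_pmf`, `binom_cdf`, and `binom_at_least` are illustrative choices, not part of any established API.

```python
from math import comb


def binom_pmf(r: int, n: int, p: float) -> float:
    """P(X = r) for X ~ B(n, p); zero outside 0..n."""
    if r < 0 or r > n:
        return 0.0
    return comb(n, r) * p**r * (1 - p) ** (n - r)


def binom_cdf(r: int, n: int, p: float) -> float:
    """P(X <= r), summing the pmf from k = 0 to k = r."""
    return sum(binom_pmf(k, n, p) for k in range(0, min(r, n) + 1))


def binom_at_least(r: int, n: int, p: float) -> float:
    """P(X >= r) = 1 - P(X <= r - 1)."""
    return 1.0 - binom_cdf(r - 1, n, p)


print(binom_pmf(7, 12, 0.5), binom_cdf(4, 12, 0.5), binom_at_least(10, 12, 0.5))
```

Summing the pmf term by term is adequate for the small values of $n$ used in these notes; for very large $n$ a library routine would be a more sensible choice.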
1.4 Worked example
Problem. A fair coin is tossed 12 times. Find the probability of getting: (a) exactly 7 heads;
(b) at most 4 heads; (c) between 5 and 9 heads inclusive.
$X \sim B(12, 0.5)$.
(a) $P(X = 7) = \dbinom{12}{7}(0.5)^{12} = \dfrac{792}{4096} = \dfrac{99}{512} \approx 0.1934$
(b) $P(X \leq 4) = \displaystyle\sum_{k=0}^{4}\dbinom{12}{k}(0.5)^{12} = \dfrac{1 + 12 + 66 + 220 + 495}{4096} = \dfrac{794}{4096} \approx 0.1938$
(c) $P(5 \leq X \leq 9) = P(X \leq 9) - P(X \leq 4) = 1 - P(X \leq 4) - P(X \geq 10)$
$P(X \geq 10) = P(X \leq 2)$ (by symmetry of $p = 0.5$) $= \dfrac{1 + 12 + 66}{4096} = \dfrac{79}{4096}$
$P(5 \leq X \leq 9) = 1 - \dfrac{794}{4096} - \dfrac{79}{4096} = \dfrac{3223}{4096} \approx 0.7869$
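Because the coin is fair, every probability in this example is a count of sequences divided by $2^{12} = 4096$, so the arithmetic can be checked exactly. The sketch below uses Python's `fractions` module; the helper name `count_at_most` is an illustrative choice.

```python
from fractions import Fraction
from math import comb

n = 12
total = 2**n  # 4096 equally likely head/tail sequences


def count_at_most(r: int) -> int:
    """Number of sequences with at most r heads."""
    return sum(comb(n, k) for k in range(r + 1))


p_a = Fraction(comb(n, 7), total)                            # exactly 7 heads
p_b = Fraction(count_at_most(4), total)                      # at most 4 heads
p_c = Fraction(count_at_most(9) - count_at_most(4), total)   # 5 to 9 heads inclusive

print(p_a, p_b, p_c)  # fractions are printed in lowest terms
```

Note that `Fraction` reduces automatically, so part (b) prints as $397/2048$, which equals $794/4096$.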
2. The Normal Distribution
2.1 Definition
A random variable $X$ has a normal distribution with parameters $\mu$ and $\sigma^2$ (written
$X \sim N(\mu, \sigma^2)$) if its probability density function is:
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\!\left(-\frac{(x - \mu)^2}{2\sigma^2}\right), \quad x \in \mathbb{R}$$
2.2 Properties
The distribution is symmetric about $x = \mu$.
The mean, median, and mode are all equal to $\mu$.
$E(X) = \mu$, $\mathrm{Var}(X) = \sigma^2$.
Approximately 68% of data lies within $\mu \pm \sigma$.
Approximately 95% of data lies within $\mu \pm 2\sigma$.
Approximately 99.7% of data lies within $\mu \pm 3\sigma$.
2.3 Standardisation
To find probabilities, we standardise to the standard normal $Z \sim N(0, 1)$:
$$Z = \frac{X - \mu}{\sigma}$$
$$P(X \leq x) = P\!\left(Z \leq \frac{x - \mu}{\sigma}\right) = \Phi\!\left(\frac{x - \mu}{\sigma}\right)$$
where $\Phi(z)$ denotes the cumulative distribution function of the standard normal.
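When tables are not to hand, $\Phi$ can be evaluated through the error function, using the standard identity $\Phi(z) = \tfrac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$. The following is a small sketch of standardisation in Python; `phi` and `normal_cdf` are illustrative names.

```python
from math import erf, sqrt


def phi(z: float) -> float:
    """Standard normal CDF via the identity Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """P(X <= x) for X ~ N(mu, sigma^2), computed by standardising first."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return phi((x - mu) / sigma)


print(round(phi(1.0), 4), round(phi(-1.0), 4))
```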
2.4 Worked example
Problem. The masses of bags of sugar are normally distributed with mean $1.02\;\mathrm{kg}$ and
standard deviation $0.03\;\mathrm{kg}$. Find: (a) the probability a randomly selected bag has mass
less than $1.00\;\mathrm{kg}$; (b) the probability the mass is between $0.98$ and $1.05\;\mathrm{kg}$;
(c) the value $m$ such that 90% of bags have mass less than $m$.
$X \sim N(1.02, 0.03^2)$.
(a) $P(X \lt 1.00) = P\!\left(Z \lt \dfrac{1.00 - 1.02}{0.03}\right) = P(Z \lt -0.667) = 1 - \Phi(0.667) \approx 1 - 0.7476 = 0.2524$
(b) $P(0.98 \lt X \lt 1.05) = P\!\left(\dfrac{0.98 - 1.02}{0.03} \lt Z \lt \dfrac{1.05 - 1.02}{0.03}\right) = P(-1.333 \lt Z \lt 1.000)$
$= \Phi(1.000) - \Phi(-1.333) = 0.8413 - (1 - 0.9088) = 0.8413 - 0.0912 = 0.7501$
(c) We need $\Phi\!\left(\dfrac{m - 1.02}{0.03}\right) = 0.90$, so $\dfrac{m - 1.02}{0.03} = 1.282$.
$m = 1.02 + 0.03 \times 1.282 = 1.058\;\mathrm{kg}$
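The three parts can be cross-checked with `statistics.NormalDist` from the Python standard library, which provides `cdf` and `inv_cdf` methods. This is only a sketch; small differences from the table-based working in the fourth decimal place are expected, because the tables use rounded $z$ values.

```python
from statistics import NormalDist

X = NormalDist(mu=1.02, sigma=0.03)  # masses in kg

p_a = X.cdf(1.00)                # P(X < 1.00)
p_b = X.cdf(1.05) - X.cdf(0.98)  # P(0.98 < X < 1.05)
m = X.inv_cdf(0.90)              # 90th percentile

print(f"(a) {p_a:.4f}  (b) {p_b:.4f}  (c) {m:.3f} kg")
```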
2.5 The normal approximation to the binomial
If $X \sim B(n, p)$ and $n$ is large, then $X$ is approximately normal with:
$$X \approx N(np, np(1-p))$$
Continuity correction. Since the binomial is discrete and the normal is continuous, apply a
continuity correction, where $Y \sim N(np, np(1-p))$ is the approximating normal variable:
$$P(X \leq k) \approx P(Y \lt k + 0.5)$$
$$P(X \geq k) \approx P(Y \gt k - 0.5)$$
$$P(X = k) \approx P(k - 0.5 \lt Y \lt k + 0.5)$$
The approximation is reasonable when $np \gt 5$ and $n(1-p) \gt 5$.
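To see what the continuity correction buys, the sketch below compares an exact binomial probability with the normal approximation computed with and without the correction. The values $n = 40$, $p = 0.5$, $k = 22$ are illustrative only and do not come from the text.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, k = 40, 0.5, 22  # illustrative values (np = 20 and n(1-p) = 20, both > 5)

# Exact P(X <= k) for X ~ B(n, p)
exact = sum(comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(k + 1))

# Approximating normal Y ~ N(np, np(1-p))
Y = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
with_cc = Y.cdf(k + 0.5)  # P(Y < k + 0.5), continuity correction applied
without_cc = Y.cdf(k)     # P(Y < k), no correction

print(f"exact={exact:.4f}  with_cc={with_cc:.4f}  without_cc={without_cc:.4f}")
```

In this case the corrected value is much closer to the exact probability than the uncorrected one, which is the usual pattern when the conditions for the approximation hold.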
2.6 Worked example: normal approximation
Problem. $X \sim B(80, 0.45)$. Use a normal approximation to find $P(X \gt 35)$.
$\mu = 80 \times 0.45 = 36$, $\sigma^2 = 80 \times 0.45 \times 0.55 = 19.8$, $\sigma \approx 4.45$.
$X \approx N(36, 19.8)$.
Since $X$ takes integer values, $P(X \gt 35) = P(X \geq 36)$, so the continuity correction gives:
$P(X \gt 35) \approx P(Y \gt 35.5) = P\!\left(Z \gt \dfrac{35.5 - 36}{4.45}\right) = P(Z \gt -0.112)$
$= 1 - \Phi(-0.112) = \Phi(0.112) \approx 0.545$
3. The Poisson Distribution
3.1 Definition
A random variable $X$ has a Poisson distribution with parameter $\lambda$ (written
$X \sim \mathrm{Po}(\lambda)$) if:
$$P(X = r) = \frac{e^{-\lambda}\lambda^r}{r!}, \quad r = 0, 1, 2, \ldots$$
Conditions:
Events occur independently at a constant average rate.
The probability of more than one event in a sufficiently small interval is negligible.
Events occur singly in continuous time or space.
3.2 Mean and variance
$$E(X) = \lambda, \qquad \mathrm{Var}(X) = \lambda$$
The equality of mean and variance is a distinguishing feature of the Poisson distribution.
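The equality of mean and variance can be checked numerically by summing over a truncated range of $r$. The sketch below builds the probabilities with the recurrence $P(X = r) = P(X = r-1)\cdot\lambda/r$, which follows from the definition; the rate $\lambda = 4.5$ and the cut-off at $r = 100$ are illustrative assumptions.

```python
from math import exp


def poisson_pmf_table(lam: float, r_max: int) -> list:
    """P(X = r) for r = 0..r_max, using P(r) = P(r - 1) * lam / r."""
    probs = [exp(-lam)]
    for r in range(1, r_max + 1):
        probs.append(probs[-1] * lam / r)
    return probs


lam = 4.5                            # illustrative rate
probs = poisson_pmf_table(lam, 100)  # the tail beyond r = 100 is negligible for this rate

mean = sum(r * q for r, q in enumerate(probs))
var = sum((r - mean) ** 2 * q for r, q in enumerate(probs))

print(round(mean, 6), round(var, 6))
```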
3.3 Worked example
Problem. A call centre receives an average of 4.5 calls per minute. Assuming a Poisson model,
find: (a) the probability of exactly 6 calls in a minute; (b) the probability of at most 2 calls
in a minute; (c) the probability of more than 8 calls in a two-minute period.
$X \sim \mathrm{Po}(4.5)$.
(a) $P(X = 6) = \dfrac{e^{-4.5}(4.5)^6}{6!} = \dfrac{e^{-4.5} \times 8303.8}{720} \approx 0.1281$
(b) $P(X \leq 2) = e^{-4.5}\!\left(1 + 4.5 + \dfrac{4.5^2}{2}\right) = e^{-4.5}(1 + 4.5 + 10.125) = 15.625\,e^{-4.5} \approx 0.1736$
(c) For two minutes, $Y \sim \mathrm{Po}(9)$.
$P(Y \gt 8) = 1 - P(Y \leq 8) = 1 - e^{-9}\displaystyle\sum_{r=0}^{8}\dfrac{9^r}{r!} \approx 1 - 0.4557 = 0.5443$
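The three parts of this example can be recomputed directly from the Poisson formula. This is a minimal standard-library sketch; `poisson_pmf` and `poisson_cdf` are illustrative helper names.

```python
from math import exp, factorial


def poisson_pmf(r: int, lam: float) -> float:
    """P(X = r) for X ~ Po(lam)."""
    return exp(-lam) * lam**r / factorial(r)


def poisson_cdf(r: int, lam: float) -> float:
    """P(X <= r), summing the pmf from 0 to r."""
    return sum(poisson_pmf(k, lam) for k in range(r + 1))


p_a = poisson_pmf(6, 4.5)        # exactly 6 calls in one minute
p_b = poisson_cdf(2, 4.5)        # at most 2 calls in one minute
p_c = 1.0 - poisson_cdf(8, 9.0)  # more than 8 calls in two minutes (rate 2 * 4.5 = 9)

print(f"(a) {p_a:.4f}  (b) {p_b:.4f}  (c) {p_c:.4f}")
```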
3.4 Poisson approximation to the binomial
If $X \sim B(n, p)$ where $n$ is large and $p$ is small (so that $np$ is moderate), then:
$$X \approx \mathrm{Po}(np)$$
This is valid when $n \geq 50$ and $p \leq 0.1$ (and $np \leq 10$ as a rough guideline).
3.5 Worked example: Poisson approximation
Problem. A machine produces items with a defect rate of 0.02. In a batch of 200 items, find the
probability that exactly 3 are defective.
$X \sim B(200, 0.02)$. Since $n = 200$ is large and $p = 0.02$ is small, $X \approx \mathrm{Po}(4)$.
$$P(X = 3) = \frac{e^{-4} \cdot 4^3}{3!} = \frac{64e^{-4}}{6} = \frac{32}{3}e^{-4} \approx 0.1954$$
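It is instructive to compare the Poisson value with the exact binomial probability for the same batch. The sketch below computes both using only the standard library.

```python
from math import comb, exp, factorial

n, p, r = 200, 0.02, 3

# Exact binomial probability P(X = 3) for X ~ B(200, 0.02)
exact = comb(n, r) * p**r * (1 - p) ** (n - r)

# Poisson approximation with lambda = np = 4
lam = n * p
approx = exp(-lam) * lam**r / factorial(r)

print(f"binomial={exact:.4f}  poisson={approx:.4f}")
```

The two values agree to about two decimal places, which is typical when $n$ is large and $p$ is small.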
4. Choosing the Correct Distribution
4.1 Decision framework
| Situation | Distribution |
| --- | --- |
| Fixed trials, two outcomes, constant $p$ | Binomial $B(n, p)$ |
| Rare events, constant rate | Poisson $\mathrm{Po}(\lambda)$ |
| Continuous, symmetric, bell-shaped | Normal $N(\mu, \sigma^2)$ |
4.2 Sums of independent Poisson variables
Theorem. If $X \sim \mathrm{Po}(\lambda_1)$ and $Y \sim \mathrm{Po}(\lambda_2)$ are independent,
then $X + Y \sim \mathrm{Po}(\lambda_1 + \lambda_2)$.
Proof sketch. Using MGFs or direct convolution:
$$P(X + Y = r) = \sum_{k=0}^{r}P(X = k)P(Y = r-k) = \sum_{k=0}^{r}\frac{e^{-\lambda_1}\lambda_1^k}{k!} \cdot \frac{e^{-\lambda_2}\lambda_2^{r-k}}{(r-k)!}$$
$$= \frac{e^{-(\lambda_1+\lambda_2)}}{r!}\sum_{k=0}^{r}\binom{r}{k}\lambda_1^k\lambda_2^{r-k} = \frac{e^{-(\lambda_1+\lambda_2)}(\lambda_1+\lambda_2)^r}{r!} \quad \blacksquare$$
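The convolution identity in the proof can be checked numerically for particular rates. The sketch below compares the convolution sum with the $\mathrm{Po}(\lambda_1 + \lambda_2)$ probability for $r = 0, \ldots, 14$; the rates $3$ and $2$ are illustrative, and a numerical check of this kind is of course not a substitute for the proof.

```python
from math import exp, factorial


def poisson_pmf(r: int, lam: float) -> float:
    """P(X = r) for X ~ Po(lam)."""
    return exp(-lam) * lam**r / factorial(r)


lam1, lam2 = 3.0, 2.0  # illustrative rates

gaps = []
for r in range(15):
    # Convolution: sum over k of P(X = k) * P(Y = r - k)
    conv = sum(poisson_pmf(k, lam1) * poisson_pmf(r - k, lam2) for k in range(r + 1))
    direct = poisson_pmf(r, lam1 + lam2)
    gaps.append(abs(conv - direct))

max_gap = max(gaps)
print(f"largest discrepancy: {max_gap:.2e}")
```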
4.3 Worked example
Problem. A shop receives orders at an average rate of 3 per hour from online and 2 per hour
from walk-in customers. Find the probability of receiving more than 7 orders in a two-hour period.
Total rate per hour $= 3 + 2 = 5$. For two hours, $X \sim \mathrm{Po}(10)$.
$$P(X \gt 7) = 1 - P(X \leq 7) = 1 - e^{-10}\displaystyle\sum_{r=0}^{7}\dfrac{10^r}{r!} \approx 1 - 0.2202 = 0.7798$$
Common Pitfall
When using the Poisson approximation to the binomial, always check that the conditions are met
(large $n$, small $p$). If $p$ is close to 0.5, the normal approximation is more appropriate.
5. Practice Problems
Problem 1
$X \sim B(15, 0.35)$. Find: (a) $P(X = 5)$; (b) $P(3 \leq X \leq 7)$; (c) the most likely value
of $X$.
Solution
(a) $P(X = 5) = \dbinom{15}{5}(0.35)^5(0.65)^{10} \approx 0.2123$.
(b) $P(3 \leq X \leq 7) = P(X \leq 7) - P(X \leq 2) \approx 0.8868 - 0.0617 = 0.8251$.
(c) The mode lies near $(n+1)p = 16 \times 0.35 = 5.6$, so check $r = 5$ and $r = 6$.
$P(X = 5) \approx 0.2123$, $P(X = 6) \approx 0.1906$. The mode is $X = 5$.
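The most likely value can also be found by brute force, evaluating $P(X = r)$ for every $r$ and taking the largest. The sketch below does this and compares the result with $\lfloor (n+1)p \rfloor$, which is the mode whenever $(n+1)p$ is not an integer; `binom_mode` is an illustrative name.

```python
from math import comb, floor


def binom_mode(n: int, p: float) -> int:
    """Value of r in 0..n that maximises P(X = r) for X ~ B(n, p).

    If two adjacent values tie (which happens when (n + 1)p is an integer),
    the smaller one is returned.
    """
    pmf = [comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(n + 1)]
    return max(range(n + 1), key=pmf.__getitem__)


n, p = 15, 0.35
print(binom_mode(n, p), floor((n + 1) * p))
```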
Problem 2
The heights of men are normally distributed with mean $175\;\mathrm{cm}$ and standard deviation
$8\;\mathrm{cm}$. Find the probability that a randomly selected man is: (a) taller than $190\;\mathrm{cm}$;
(b) between $168\;\mathrm{cm}$ and $182\;\mathrm{cm}$; (c) what height is exceeded by only 5% of men?
Solution
(a) $P(X \gt 190) = P\!\left(Z \gt \dfrac{15}{8}\right) = P(Z \gt 1.875) = 1 - 0.9696 = 0.0304$.
(b) $P(168 \lt X \lt 182) = P(-0.875 \lt Z \lt 0.875) = 2\Phi(0.875) - 1 = 2(0.8092) - 1 = 0.6184$.
(c) $P(Z \gt z) = 0.05 \implies z = 1.645$. Height $= 175 + 1.645 \times 8 = 188.2\;\mathrm{cm}$.
Problem 3
The number of emails received per hour follows a Poisson distribution with mean 6. Find the
probability that: (a) exactly 4 emails are received in an hour; (b) more than 10 emails in two
hours.
Solution
(a) $P(X = 4) = \dfrac{e^{-6} \cdot 6^4}{4!} = \dfrac{1296}{24}e^{-6} = 54e^{-6} \approx 0.1339$.
(b) For two hours, $Y \sim \mathrm{Po}(12)$.
$P(Y \gt 10) = 1 - P(Y \leq 10) \approx 1 - 0.3472 = 0.6528$.
Problem 4
A die is rolled 60 times. Use a suitable approximation to find the probability that the number of
sixes is between 8 and 14 inclusive.
Solution
$X \sim B(60, 1/6)$. $\mu = 10$, $\sigma^2 = 60 \times \dfrac{1}{6} \times \dfrac{5}{6} = \dfrac{25}{3} \approx 8.333$.
$\sigma \approx 2.887$.
$P(8 \leq X \leq 14) \approx P(7.5 \lt Y \lt 14.5)$ where $Y \sim N(10, 25/3)$.
$= P\!\left(\dfrac{7.5 - 10}{2.887} \lt Z \lt \dfrac{14.5 - 10}{2.887}\right) = P(-0.866 \lt Z \lt 1.559)$
$= \Phi(1.559) - \Phi(-0.866) = 0.9405 - 0.1933 = 0.7472$.
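As a check on the quality of the approximation, the sketch below computes the exact binomial probability for $B(60, 1/6)$ alongside the continuity-corrected normal value, using only the standard library.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 60, 1 / 6

# Exact P(8 <= X <= 14) for X ~ B(60, 1/6)
exact = sum(comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(8, 15))

# Normal approximation Y ~ N(10, 25/3) with continuity correction
Y = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
approx = Y.cdf(14.5) - Y.cdf(7.5)

print(f"exact={exact:.4f}  normal approx={approx:.4f}")
```

The approximation is close but not exact, since $B(60, 1/6)$ is noticeably skewed; a discrepancy of just under 0.01 is to be expected here.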