
Statistical Distributions

Board Coverage

| Board | Paper | Notes |
| --- | --- | --- |
| AQA | Paper 1, 2 | Binomial and normal in P1; Poisson in P2 |
| Edexcel | P1, P2 | Similar |
| OCR (A) | Paper 1, 2 | Binomial in P1; normal and Poisson in P2 |
| CIE (9709) | P1, P6 | Binomial in P1; normal and Poisson in P6 |
info

The formula booklet gives the probability mass function for the Binomial and Poisson distributions, and the normal distribution function. You must know when to use each distribution and how to find probabilities.


1. Discrete Random Variables

1.1 Definition

Definition. A discrete random variable X takes values from a countable set with probabilities P(X = x_i) = p_i satisfying:

  • p_i \geq 0 for all i
  • \sum_i p_i = 1

1.2 Expectation and variance

E(X) = \mu = \sum x_i\,p_i, \qquad \mathrm{Var}(X) = \sigma^2 = E(X^2) - [E(X)]^2 = \sum x_i^2\,p_i - \mu^2
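These formulas translate directly into code. A minimal Python sketch with an illustrative pmf (the values are not from the text):

```python
# A minimal sketch: E(X) and Var(X) computed from a table of values and
# probabilities. The pmf here is illustrative, not taken from the text.
pmf = {1: 0.2, 2: 0.5, 3: 0.3}

assert abs(sum(pmf.values()) - 1.0) < 1e-12  # probabilities must sum to 1

mean = sum(x * p for x, p in pmf.items())        # E(X) = sum of x_i p_i
mean_sq = sum(x**2 * p for x, p in pmf.items())  # E(X^2) = sum of x_i^2 p_i
variance = mean_sq - mean**2                     # Var(X) = E(X^2) - [E(X)]^2

print(mean, variance)  # 2.1 and 0.49
```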


2. The Binomial Distribution

2.1 Derivation from Bernoulli trials

A Bernoulli trial is an experiment with exactly two outcomes: success (probability p) and failure (probability 1-p).

If we perform n independent Bernoulli trials, the number of successes X follows a Binomial distribution: X \sim B(n, p).

Derivation of the PMF. Each sequence of k successes and n-k failures has probability p^k(1-p)^{n-k}. The number of such sequences is \binom{n}{k} (choosing which k of the n trials are successes). Therefore:

P(X = k) = \binom{n}{k}p^k(1-p)^{n-k}, \quad k = 0, 1, \ldots, n

2.2 Proof that E(X) = np

Proof. Let X_i be the indicator variable for the i-th trial: X_i = 1 if success, 0 if failure.

X = X_1 + X_2 + \cdots + X_n.

E(X_i) = 1 \cdot p + 0 \cdot (1-p) = p.

By linearity of expectation: E(X) = \sum E(X_i) = np. \blacksquare

2.3 Proof that Var(X)=np(1p)\mathrm{Var}(X) = np(1-p)

Proof. E(Xi2)=12p+02(1p)=pE(X_i^2) = 1^2 \cdot p + 0^2 \cdot (1-p) = p.

Var(Xi)=E(Xi2)[E(Xi)]2=pp2=p(1p)\mathrm{Var}(X_i) = E(X_i^2) - [E(X_i)]^2 = p - p^2 = p(1-p).

Since the XiX_i are independent: Var(X)=Var(Xi)=np(1p)\mathrm{Var}(X) = \sum \mathrm{Var}(X_i) = np(1-p). \blacksquare
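The indicator-variable argument can be checked by simulation: build X as a sum of n Bernoulli(p) indicators and compare the sample mean and variance with np and np(1-p). A minimal sketch with illustrative parameters:

```python
import random

# A minimal simulation sketch of the indicator-variable argument above:
# X is built as a sum of n Bernoulli(p) indicators, so the sample mean
# and variance should approach np and np(1-p). Parameters are illustrative.
random.seed(0)
n, p, trials = 20, 0.3, 100_000

samples = [sum(1 for _ in range(n) if random.random() < p) for _ in range(trials)]
mean = sum(samples) / trials
var = sum((x - mean) ** 2 for x in samples) / trials

print(mean, n * p)           # sample mean ≈ 6.0 = np
print(var, n * p * (1 - p))  # sample variance ≈ 4.2 = np(1-p)
```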

2.4 Properties

  • The distribution is symmetric when p = 0.5.
  • It is skewed left when p > 0.5 and skewed right when p < 0.5.
  • The mode is at \lfloor(n+1)p\rfloor.

2.5 Direct derivation of E(X) = np from the PMF

The proofs in Sections 2.2 and 2.3 use indicator variables. Here we derive the same results directly from the probability mass function using algebraic identities.

Proof. Starting from the definition of expectation applied to the binomial PMF:

E(X) = \sum_{k=0}^{n} k \binom{n}{k}p^k(1-p)^{n-k}

The k=0 term vanishes, so begin the sum at k=1. Apply the identity k\binom{n}{k} = n\binom{n-1}{k-1}:

E(X) = \sum_{k=1}^{n} n\binom{n-1}{k-1}p^k(1-p)^{n-k} = np\sum_{k=1}^{n}\binom{n-1}{k-1}p^{k-1}(1-p)^{(n-1)-(k-1)}

Substitute j = k - 1:

E(X) = np\sum_{j=0}^{n-1}\binom{n-1}{j}p^j(1-p)^{n-1-j}

By the binomial theorem, \sum_{j=0}^{n-1}\binom{n-1}{j}p^j(1-p)^{n-1-j} = [p + (1-p)]^{n-1} = 1.

Therefore E(X) = np. \blacksquare

2.6 Direct derivation of \mathrm{Var}(X) = np(1-p) from the PMF

Proof. First compute E(X(X-1)):

E(X(X-1)) = \sum_{k=0}^{n} k(k-1)\binom{n}{k}p^k(1-p)^{n-k}

Terms with k = 0, 1 are zero. Apply the identity k(k-1)\binom{n}{k} = n(n-1)\binom{n-2}{k-2}:

E(X(X-1)) = \sum_{k=2}^{n} n(n-1)\binom{n-2}{k-2}p^k(1-p)^{n-k} = n(n-1)p^2\sum_{k=2}^{n}\binom{n-2}{k-2}p^{k-2}(1-p)^{(n-2)-(k-2)}

Substitute j = k - 2:

E(X(X-1)) = n(n-1)p^2\sum_{j=0}^{n-2}\binom{n-2}{j}p^j(1-p)^{n-2-j} = n(n-1)p^2

The final equality follows from the binomial theorem: \sum_{j=0}^{n-2}\binom{n-2}{j}p^j(1-p)^{n-2-j} = 1.

Now E(X^2) = E(X(X-1)) + E(X) = n(n-1)p^2 + np.

\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = n(n-1)p^2 + np - n^2p^2 = np - np^2 = np(1-p) \quad \blacksquare
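Both direct derivations can be verified numerically by summing k·P(X=k) and k(k-1)·P(X=k) over the support. A minimal Python check with illustrative n and p:

```python
from math import comb

# Numeric check of Sections 2.5 and 2.6: sum k*P(X=k) and k*(k-1)*P(X=k)
# over the support and compare with np and n(n-1)p^2. Parameters illustrative.
n, p = 12, 0.35
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

e_x = sum(k * pmf[k] for k in range(n + 1))
e_xx1 = sum(k * (k - 1) * pmf[k] for k in range(n + 1))

print(e_x, n * p)                             # both 4.2
print(e_xx1, n * (n - 1) * p**2)              # both 16.17
print(e_xx1 + e_x - e_x**2, n * p * (1 - p))  # Var(X): both 2.73
```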


3. The Normal Distribution

3.1 Motivation from the Central Limit Theorem

The Central Limit Theorem (CLT) states that the sum (or mean) of a large number of independent, identically distributed random variables is approximately normally distributed, regardless of the original distribution.

This is why the normal distribution appears so widely in nature: any quantity that is the sum of many small independent effects (height, measurement error, etc.) will be approximately normal.

3.2 Definition

X \sim N(\mu, \sigma^2) has PDF

f(x) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}}
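The PDF transcribes directly. A minimal Python sketch (the helper name `normal_pdf` is mine; μ = 175, σ = 8 are borrowed from Problem 2 for illustration):

```python
from math import exp, pi, sqrt

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """A direct transcription of the PDF above."""
    return exp(-(x - mu) ** 2 / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

# Density at the mean for mu = 175, sigma = 8 (the values from Problem 2):
print(normal_pdf(175, 175, 8))  # ≈ 0.0499 = 1/(8·sqrt(2π))
```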

3.3 Properties

  • Bell-shaped, symmetric about \mu.
  • E(X) = \mu, \mathrm{Var}(X) = \sigma^2.
  • Approximately 68% of data lies within \mu \pm \sigma, 95% within \mu \pm 2\sigma, and 99.7% within \mu \pm 3\sigma.

3.4 Standard normal

If X \sim N(\mu, \sigma^2), then Z = \dfrac{X - \mu}{\sigma} \sim N(0, 1).

Probabilities are found using the standard normal table or a calculator's inverse normal function.

3.5 Finding probabilities

P(a < X < b) = P\!\left(\frac{a-\mu}{\sigma} < Z < \frac{b-\mu}{\sigma}\right) = \Phi\!\left(\frac{b-\mu}{\sigma}\right) - \Phi\!\left(\frac{a-\mu}{\sigma}\right)
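In code, Φ can be expressed through the error function as Φ(z) = (1 + erf(z/√2))/2, which avoids tables entirely. A minimal sketch (the helper names `phi` and `normal_prob` are mine):

```python
from math import erf, sqrt

def phi(z: float) -> float:
    """Standard normal CDF: Φ(z) = (1 + erf(z/√2)) / 2."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_prob(a: float, b: float, mu: float, sigma: float) -> float:
    """P(a < X < b) for X ~ N(mu, sigma^2), standardising as above."""
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)

print(phi(1.96))                      # ≈ 0.9750, matching tables
print(normal_prob(167, 183, 175, 8))  # P(μ-σ < X < μ+σ) ≈ 0.6827
```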

3.6 Normal approximation to Binomial

For large n with np > 5 and n(1-p) > 5:

B(n, p) \approx N(np,\ np(1-p))

with continuity correction: P(X \leq k) \approx P\!\left(Z < \frac{k + 0.5 - np}{\sqrt{np(1-p)}}\right).
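A quick numeric check of the approximation, comparing the exact binomial CDF against the continuity-corrected normal value for B(200, 0.15), the parameters from Problem 9 below (the helper `phi` is my own shorthand for Φ):

```python
from math import comb, erf, sqrt

def phi(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Compare the exact binomial CDF with the continuity-corrected normal
# approximation for B(200, 0.15), the distribution from Problem 9 below.
n, p, k = 200, 0.15, 35
exact = sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))
approx = phi((k + 0.5 - n * p) / sqrt(n * p * (1 - p)))

print(exact, approx)  # both ≈ 0.86, agreeing to about two decimal places
```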

warning

The continuity correction is needed because you are approximating a discrete distribution (Binomial) with a continuous one (Normal). Add or subtract 0.5 depending on the inequality direction.


4. The Poisson Distribution

4.1 Definition

X \sim \mathrm{Po}(\lambda) models the number of events in a fixed interval when events occur independently at a constant average rate \lambda.

P(X = k) = \frac{e^{-\lambda}\lambda^k}{k!}, \quad k = 0, 1, 2, \ldots

4.2 Derivation as a limit of the Binomial

Theorem. If n \to \infty and p \to 0 such that np = \lambda remains constant, then B(n, p) \to \mathrm{Po}(\lambda).

Proof.

\begin{aligned} P(X = k) &= \binom{n}{k}p^k(1-p)^{n-k} \\ &= \frac{n(n-1)\cdots(n-k+1)}{k!} \cdot \frac{\lambda^k}{n^k} \cdot \left(1-\frac{\lambda}{n}\right)^{n-k} \end{aligned}

Consider each factor as n \to \infty:

  • \dfrac{n(n-1)\cdots(n-k+1)}{n^k} \to 1 (each factor n-i \approx n)
  • \left(1 - \dfrac{\lambda}{n}\right)^{n-k} \to e^{-\lambda} (using \lim_{n\to\infty}(1+a/n)^n = e^a)

Therefore:

P(X = k) \to \frac{1}{k!} \cdot \lambda^k \cdot e^{-\lambda} = \frac{e^{-\lambda}\lambda^k}{k!} \quad \blacksquare
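The convergence is easy to watch numerically: hold λ = np fixed and let n grow. A minimal sketch with illustrative values λ = 3, k = 2:

```python
from math import comb, exp, factorial

# Watch the binomial PMF converge to the Poisson PMF as n grows with
# λ = np held fixed. The values λ = 3, k = 2 are illustrative.
lam, k = 3.0, 2
poisson = exp(-lam) * lam**k / factorial(k)  # ≈ 0.2240

for n in (10, 100, 1000, 10000):
    p = lam / n
    binom = comb(n, k) * p**k * (1 - p)**(n - k)
    print(n, binom, poisson)  # the binomial column approaches the Poisson value
```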

4.3 Proof that E(X) = \lambda

Proof.

\begin{aligned} E(X) &= \sum_{k=0}^{\infty}k \cdot \frac{e^{-\lambda}\lambda^k}{k!} = \sum_{k=1}^{\infty}\frac{e^{-\lambda}\lambda^k}{(k-1)!} \\ &= \lambda e^{-\lambda}\sum_{k=1}^{\infty}\frac{\lambda^{k-1}}{(k-1)!} = \lambda e^{-\lambda}\sum_{j=0}^{\infty}\frac{\lambda^j}{j!} \\ &= \lambda e^{-\lambda} \cdot e^{\lambda} = \lambda \quad \blacksquare \end{aligned}

4.4 Proof that \mathrm{Var}(X) = \lambda

Proof. First compute E(X(X-1)):

\begin{aligned} E(X(X-1)) &= \sum_{k=2}^{\infty}k(k-1)\frac{e^{-\lambda}\lambda^k}{k!} = \sum_{k=2}^{\infty}\frac{e^{-\lambda}\lambda^k}{(k-2)!} \\ &= \lambda^2 e^{-\lambda}\sum_{j=0}^{\infty}\frac{\lambda^j}{j!} = \lambda^2 e^{-\lambda} \cdot e^{\lambda} = \lambda^2 \end{aligned}

E(X^2) = E(X(X-1)) + E(X) = \lambda^2 + \lambda.

\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \lambda^2 + \lambda - \lambda^2 = \lambda. \blacksquare
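Both results can be checked by truncating the infinite sums at a point where the tail is negligible. A minimal sketch for λ = 4.5, the rate from Problem 3 below:

```python
from math import exp, factorial

# Truncated-sum check of the two proofs: for λ = 4.5 (the rate in
# Problem 3), the series for E(X) and E(X(X-1)) give λ and λ².
# 100 terms is far past the point where the tail is negligible.
lam, N = 4.5, 100
pmf = [exp(-lam) * lam**k / factorial(k) for k in range(N)]

e_x = sum(k * pmf[k] for k in range(N))
e_xx1 = sum(k * (k - 1) * pmf[k] for k in range(N))

print(e_x)                   # ≈ 4.5 = λ
print(e_xx1 + e_x - e_x**2)  # Var(X) ≈ 4.5 = λ
```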

4.5 Additivity

If X \sim \mathrm{Po}(\lambda) and Y \sim \mathrm{Po}(\mu) are independent, then X + Y \sim \mathrm{Po}(\lambda + \mu).
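Additivity can be verified by convolution: P(X+Y = 6) summed over all ways of splitting the 6 events between X and Y must equal the Po(8) probability. A minimal sketch using the values from Problem 10 below (the helper `po_pmf` is mine):

```python
from math import exp, factorial

def po_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Po(lam)."""
    return exp(-lam) * lam**k / factorial(k)

# Convolution check of additivity, using the values from Problem 10:
# P(X + Y = 6) summed over all splits must equal the Po(8) probability.
direct = po_pmf(6, 8.0)
convolved = sum(po_pmf(i, 3.0) * po_pmf(6 - i, 5.0) for i in range(7))

print(direct, convolved)  # both ≈ 0.1221
```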

4.6 Conditions for the Poisson model

The Poisson distribution is appropriate when all of the following hold:

  • Events occur independently of one another.
  • Events occur at a constant average rate \lambda in a fixed interval of time, space, or volume.
  • The probability of more than one event occurring in a sufficiently small sub-interval is negligible.

These are sometimes called the Poisson postulates. When they are satisfied, the number of events in any interval of length t follows \mathrm{Po}(\lambda t).

Typical applications include: calls arriving at a call centre per hour, typing errors per page, radioactive decays per second, and cars passing a checkpoint per minute.

tip

Always check that the average rate \lambda is constant over the interval and that events do not cluster. If events tend to occur in bursts, the Poisson model is not appropriate.

4.7 Poisson approximation to the Binomial

Practical rule. When n > 50 and p < 0.1, we may approximate B(n, p) by \mathrm{Po}(\lambda) where \lambda = np.

Justification. The theoretical result in Section 4.2 shows that as n \to \infty and p \to 0 with np = \lambda held constant, the binomial PMF converges pointwise to the Poisson PMF. The conditions n > 50 and p < 0.1 are practical thresholds that ensure:

  1. n is large enough that the discrete binomial is well-approximated by a limit distribution.
  2. p is small enough that the "rare event" assumption of the Poisson model is satisfied.
  3. \lambda = np is moderate (typically 0 < \lambda < 10), so that neither distribution is heavily concentrated at a single point.

The approximation improves as n increases and p decreases while \lambda = np remains fixed.
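A side-by-side comparison makes the rule concrete. This sketch uses B(100, 0.04) from Problem 4 below, so λ = np = 4:

```python
from math import comb, exp, factorial

# Side-by-side check of the n > 50, p < 0.1 rule for B(100, 0.04),
# the distribution from Problem 4, so λ = np = 4.
n, p = 100, 0.04
lam = n * p

for k in range(3):
    binom = comb(n, k) * p**k * (1 - p)**(n - k)
    poisson = exp(-lam) * lam**k / factorial(k)
    print(k, binom, poisson)  # the two columns agree closely
```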

warning

If p is not small and n is large, use the normal approximation (Section 3.6) instead. The two approximations are complementary: Poisson handles the case of many trials with rare success, while normal handles the case of many trials with moderate success probability.


5. Choosing the Right Distribution

| Situation | Distribution |
| --- | --- |
| Fixed n trials, success/failure | Binomial B(n,p) |
| Events in a continuous interval, rare events | Poisson \mathrm{Po}(\lambda) |
| Continuous, bell-shaped | Normal N(\mu,\sigma^2) |

6. Coding of Random Variables

6.1 Definition

A coding (or linear transformation) of a discrete random variable X is a new random variable Y = aX + b where a and b are constants with a \neq 0.

Coding arises naturally when changing units (e.g. centimetres to metres, or Celsius to Fahrenheit) or when shifting and scaling a distribution.

6.2 Effect on expectation

Theorem. If Y = aX + b, then E(Y) = aE(X) + b.

Proof. Applying the definition of expectation to Y:

E(Y) = \sum (ax_i + b)\,p_i = a\sum x_i\,p_i + b\sum p_i = aE(X) + b \cdot 1 = aE(X) + b \quad \blacksquare

The key step is \sum p_i = 1, since the probabilities sum to 1.

6.3 Effect on variance

Theorem. If Y = aX + b, then \mathrm{Var}(Y) = a^2\mathrm{Var}(X).

Proof.

\begin{aligned} \mathrm{Var}(Y) &= E(Y^2) - [E(Y)]^2 \\ &= E[(aX + b)^2] - [aE(X) + b]^2 \\ &= E[a^2X^2 + 2abX + b^2] - \{a^2[E(X)]^2 + 2abE(X) + b^2\} \\ &= a^2E(X^2) + 2abE(X) + b^2 - a^2[E(X)]^2 - 2abE(X) - b^2 \\ &= a^2\{E(X^2) - [E(X)]^2\} \\ &= a^2\mathrm{Var}(X) \quad \blacksquare \end{aligned}

Note how the terms 2abE(X) and b^2 cancel between E(Y^2) and [E(Y)]^2.

info

Adding a constant b (a location shift) has no effect on variance. Only multiplying by a (a scale change) affects variance, and it does so by a factor of a^2. This is why variance is measured in squared units of the original variable.

6.4 Effect on standard deviation

Since \mathrm{Var}(Y) = a^2\mathrm{Var}(X), taking square roots gives:

\mathrm{SD}(Y) = |a|\,\mathrm{SD}(X)

The absolute value ensures the standard deviation remains non-negative regardless of the sign of a.
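All three coding rules can be verified exactly on a small pmf. A minimal sketch with an illustrative pmf and the coefficients a = 3, b = -2 from Problem 12 below:

```python
# Exact check of the coding rules on a small illustrative pmf:
# Y = aX + b should satisfy E(Y) = aE(X) + b and Var(Y) = a^2 Var(X).
pmf = {1: 0.2, 2: 0.5, 3: 0.3}
a, b = 3, -2  # the coefficients from Problem 12

e_x = sum(x * p for x, p in pmf.items())
var_x = sum(x**2 * p for x, p in pmf.items()) - e_x**2

e_y = sum((a * x + b) * p for x, p in pmf.items())
var_y = sum((a * x + b) ** 2 * p for x, p in pmf.items()) - e_y**2

print(e_y, a * e_x + b)      # both 4.3
print(var_y, a**2 * var_x)   # both 4.41
```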


Problem Set

Problem 1 X \sim B(10, 0.3). Find P(X = 4), P(X \leq 3), and P(X \geq 7).

Solution 1 P(X=4) = \binom{10}{4}(0.3)^4(0.7)^6 = 210 \times 0.0081 \times 0.1176 \approx 0.2001.

P(X \leq 3) = P(X=0)+P(X=1)+P(X=2)+P(X=3) \approx 0.0282 + 0.1211 + 0.2335 + 0.2668 \approx 0.6496.

P(X \geq 7) = P(X=7)+P(X=8)+P(X=9)+P(X=10) \approx 0.0090 + 0.0014 + 0.0001 + 0.0000 \approx 0.0106.

If you get this wrong, revise: The Binomial Distribution — Section 2.

Problem 2 Heights of men are normally distributed with mean 175 cm and standard deviation 8 cm. Find the probability that a randomly chosen man is taller than 185 cm.

Solution 2 X \sim N(175, 64). P(X > 185) = P\!\left(Z > \dfrac{185-175}{8}\right) = P(Z > 1.25) = 1 - \Phi(1.25) \approx 1 - 0.8944 = 0.1056.

If you get this wrong, revise: The Normal Distribution — Section 3.

Problem 3 A call centre receives an average of 4.5 calls per minute. Find the probability of receiving exactly 6 calls in a given minute, and the probability of receiving more than 8 calls.

Solution 3 X \sim \mathrm{Po}(4.5).

P(X=6) = \dfrac{e^{-4.5}(4.5)^6}{6!} = \dfrac{0.01111 \times 8303.77}{720} \approx 0.1281.

P(X > 8) = 1 - P(X \leq 8) = 1 - \sum_{k=0}^{8}\dfrac{e^{-4.5}(4.5)^k}{k!} \approx 1 - 0.9597 = 0.0403.

If you get this wrong, revise: The Poisson Distribution — Section 4.

Problem 4 X \sim B(100, 0.04). Use the Poisson approximation to find P(X \leq 2).

Solution 4 \lambda = np = 4. X \approx \mathrm{Po}(4).

P(X \leq 2) = e^{-4}\left(1 + 4 + \dfrac{16}{2}\right) = e^{-4}(1 + 4 + 8) = 13e^{-4} \approx 0.2381.

If you get this wrong, revise: Derivation as a Limit — Section 4.2.

Problem 5 Find c such that P(-c < Z < c) = 0.95 where Z \sim N(0,1).

Solution 5 P(-c < Z < c) = 2\Phi(c) - 1 = 0.95 \implies \Phi(c) = 0.975.

From tables: c \approx 1.96.

If you get this wrong, revise: Standard Normal — Section 3.4.

Problem 6 The number of emails received per hour follows \mathrm{Po}(12). Find the probability of receiving between 10 and 15 emails (inclusive) in a given hour.

Solution 6 X \sim \mathrm{Po}(12).

P(10 \leq X \leq 15) = P(X \leq 15) - P(X \leq 9).

P(X \leq 15) \approx 0.8444, P(X \leq 9) \approx 0.2424.

P(10 \leq X \leq 15) \approx 0.8444 - 0.2424 = 0.6020.

If you get this wrong, revise: The Poisson Distribution — Section 4.

Problem 7 A machine produces bolts with lengths X \sim N(50, 0.04) cm. Bolts with length less than 49.7 cm or greater than 50.3 cm are rejected. Find the proportion of bolts rejected.

Solution 7 \sigma = \sqrt{0.04} = 0.2.

P(X < 49.7) = P(Z < (49.7-50)/0.2) = P(Z < -1.5) = 0.0668.

P(X > 50.3) = P(Z > 1.5) = 0.0668.

Proportion rejected = 0.0668 + 0.0668 = 0.1336 (13.36%).

If you get this wrong, revise: Finding Probabilities — Section 3.5.

Problem 8 Prove that E(aX + b) = aE(X) + b and \mathrm{Var}(aX + b) = a^2\mathrm{Var}(X).

Solution 8 E(aX+b) = \sum(a x_i + b)p_i = a\sum x_i p_i + b\sum p_i = aE(X) + b. ✓

\mathrm{Var}(aX+b) = E[(aX+b)^2] - [E(aX+b)]^2 = E[a^2X^2 + 2abX + b^2] - [aE(X)+b]^2 = a^2E(X^2) + 2abE(X) + b^2 - a^2[E(X)]^2 - 2abE(X) - b^2 = a^2[E(X^2) - (E(X))^2] = a^2\mathrm{Var}(X). ✓

If you get this wrong, revise: Expectation and Variance — Section 1.2.

Problem 9 X \sim B(200, 0.15). Use the normal approximation with continuity correction to approximate P(X > 35).

Solution 9 \mu = 200(0.15) = 30, \sigma^2 = 200(0.15)(0.85) = 25.5, \sigma \approx 5.05.

P(X > 35) \approx P\!\left(Z > \dfrac{35.5 - 30}{5.05}\right) = P(Z > 1.089) \approx 1 - 0.8621 = 0.1379.

If you get this wrong, revise: Normal Approximation to Binomial — Section 3.6.

Problem 10 If X \sim \mathrm{Po}(3) and Y \sim \mathrm{Po}(5) are independent, find P(X + Y = 6).

Solution 10 By additivity: X + Y \sim \mathrm{Po}(3+5) = \mathrm{Po}(8).

P(X + Y = 6) = \dfrac{e^{-8}(8)^6}{6!} = \dfrac{e^{-8} \times 262144}{720} \approx \dfrac{0.000335 \times 262144}{720} \approx 0.1221.

If you get this wrong, revise: Additivity — Section 4.5.

Problem 11 Starting from the definition E(X) = \sum_{k=0}^{n} k\binom{n}{k}p^k(1-p)^{n-k}, derive E(X) = np using the identity k\binom{n}{k} = n\binom{n-1}{k-1} and the binomial theorem.

Solution 11

E(X) = \sum_{k=0}^{n} k\binom{n}{k}p^k(1-p)^{n-k} = \sum_{k=1}^{n} n\binom{n-1}{k-1}p^k(1-p)^{n-k}

= np\sum_{k=1}^{n}\binom{n-1}{k-1}p^{k-1}(1-p)^{(n-1)-(k-1)} = np\sum_{j=0}^{n-1}\binom{n-1}{j}p^j(1-p)^{n-1-j}

By the binomial theorem: \sum_{j=0}^{n-1}\binom{n-1}{j}p^j(1-p)^{n-1-j} = [p+(1-p)]^{n-1} = 1.

Therefore E(X) = np.

If you get this wrong, revise: Direct derivation of E(X) = np from the PMF — Section 2.5.

Problem 12 X \sim \mathrm{Po}(7). Let Y = 3X - 2. Find E(Y) and \mathrm{Var}(Y).

Solution 12 For X \sim \mathrm{Po}(7): E(X) = 7 and \mathrm{Var}(X) = 7.

Using the coding formulae E(aX+b) = aE(X)+b and \mathrm{Var}(aX+b) = a^2\mathrm{Var}(X):

E(Y) = 3(7) - 2 = 19.

\mathrm{Var}(Y) = 3^2 \times 7 = 63.

Note that the additive constant -2 affects the mean but not the variance.

If you get this wrong, revise: Coding of Random Variables — Section 6.

Problem 13 X \sim B(80, 0.03). State whether the Poisson approximation is valid, giving reasons. If valid, use it to find P(X \leq 1).

Solution 13 Check conditions: n = 80 > 50 and p = 0.03 < 0.1. Both conditions are satisfied, so the Poisson approximation is valid with \lambda = np = 80 \times 0.03 = 2.4.

X \approx \mathrm{Po}(2.4).

P(X \leq 1) = P(X=0) + P(X=1) = e^{-2.4}(1 + 2.4) = 3.4\,e^{-2.4} \approx 3.4 \times 0.0907 \approx 0.3084.

If you get this wrong, revise: Poisson approximation to the Binomial — Section 4.7.

Problem 14 A discrete random variable X has E(X) = 5 and \mathrm{Var}(X) = 4. Let W = 2X + 3. Find E(W) and \mathrm{Var}(W).

Solution 14 E(W) = 2E(X) + 3 = 2(5) + 3 = 13.

\mathrm{Var}(W) = 2^2 \times \mathrm{Var}(X) = 4 \times 4 = 16.

\mathrm{SD}(W) = \sqrt{16} = 4.

If you get this wrong, revise: Coding of Random Variables — Section 6.

Problem 15 Starting from E(X(X-1)) = \sum_{k=0}^{n} k(k-1)\binom{n}{k}p^k(1-p)^{n-k}, derive \mathrm{Var}(X) = np(1-p) for X \sim B(n,p).

Solution 15 Using k(k-1)\binom{n}{k} = n(n-1)\binom{n-2}{k-2}:

E(X(X-1)) = \sum_{k=2}^{n} n(n-1)\binom{n-2}{k-2}p^k(1-p)^{n-k} = n(n-1)p^2\sum_{j=0}^{n-2}\binom{n-2}{j}p^j(1-p)^{n-2-j} = n(n-1)p^2

Then E(X^2) = E(X(X-1)) + E(X) = n(n-1)p^2 + np.

\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = n(n-1)p^2 + np - n^2p^2 = np - np^2 = np(1-p).

If you get this wrong, revise: Direct derivation of \mathrm{Var}(X) = np(1-p) from the PMF — Section 2.6.

Problem 16 X \sim B(120, 0.025). (a) Show that the Poisson approximation is appropriate. (b) Use it to find P(X = 5). (c) State why the normal approximation would not be appropriate here.

Solution 16 (a) n = 120 > 50 and p = 0.025 < 0.1, so the Poisson approximation is appropriate. \lambda = np = 120 \times 0.025 = 3.

(b) X \approx \mathrm{Po}(3).

P(X = 5) = \frac{e^{-3} \times 3^5}{5!} = \frac{e^{-3} \times 243}{120} = 2.025\,e^{-3} \approx 2.025 \times 0.0498 \approx 0.1008

(c) For the normal approximation we need np > 5 and n(1-p) > 5. Here np = 3 < 5, so the normal approximation is not appropriate. The Poisson approximation is the correct choice since p is small.

If you get this wrong, revise: Poisson approximation to the Binomial — Section 4.7.

Problem 17 Temperatures in a city are modelled by X \sim N(15, 9) in degrees Celsius. The temperature in Fahrenheit is F = \frac{9}{5}X + 32. Find E(F), \mathrm{Var}(F), and P(F > 68).

Solution 17 E(F) = \frac{9}{5}E(X) + 32 = \frac{9}{5}(15) + 32 = 27 + 32 = 59^\circ\mathrm{F}.

\mathrm{Var}(F) = \left(\frac{9}{5}\right)^2 \times 9 = \frac{81}{25} \times 9 = \frac{729}{25} = 29.16.

\mathrm{SD}(F) = \sqrt{29.16} = 5.4.

P(F > 68) = P\!\left(Z > \dfrac{68 - 59}{5.4}\right) = P(Z > 1.667) \approx 1 - 0.9522 = 0.0478.

If you get this wrong, revise: Coding of Random Variables — Section 6.



tip

Ready to test your understanding of Statistical Distributions? The diagnostic test contains the hardest questions within the A-Level specification for this topic, each with a full worked solution.

Unit tests probe edge cases and common misconceptions. Integration tests combine Statistical Distributions with other topics to test synthesis under exam conditions.

See Diagnostic Guide for instructions on self-marking and building a personal test matrix.