
Probability

Board Coverage

| Board | Paper | Notes |
| --- | --- | --- |
| AQA | Paper 1, 2 | Basic probability in P1; conditional, Bayes in P2 |
| Edexcel | P1, P2 | Similar |
| OCR (A) | Paper 1, 2 | Includes Venn diagrams and tree diagrams |
| CIE (9709) | P1, P6 | Probability in P1; conditional in P6 |
info

Probability questions test logical reasoning as much as formula recall. Always define events clearly and draw a diagram before calculating.


1. Kolmogorov's Axioms

Definition. A probability function $P$ on a sample space $\Omega$ satisfies:

  1. Non-negativity: $P(A) \geq 0$ for all events $A \subseteq \Omega$.
  2. Normalisation: $P(\Omega) = 1$.
  3. Countable additivity: If $A_1, A_2, \ldots$ are mutually exclusive, then $P\!\left(\bigcup_{i}A_i\right) = \sum_i P(A_i)$.

These three axioms are the foundation of all probability theory. Every theorem in probability can be derived from them.
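
To make the axioms concrete, here is a minimal Python sketch (not part of the syllabus derivation) that spot-checks all three on a finite, equally-likely sample space; the fair-die space and the events chosen are illustrative assumptions.

```python
from fractions import Fraction

# Finite probability space with equally likely outcomes: a fair die.
omega = {1, 2, 3, 4, 5, 6}

def P(event):
    # P(A) = |A| / |Omega| for equally likely outcomes (Section 7.3).
    return Fraction(len(event & omega), len(omega))

assert all(P({w}) >= 0 for w in omega)  # axiom 1: non-negativity
assert P(omega) == 1                    # axiom 2: normalisation
A, B = {1, 2}, {5}                      # disjoint (mutually exclusive) events
assert P(A | B) == P(A) + P(B)          # axiom 3, finite case
```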


2. Basic Probability Results

2.1 Complement rule

Theorem. $P(A') = 1 - P(A)$.

Proof. $A$ and $A'$ are mutually exclusive and $A \cup A' = \Omega$.

$$P(A \cup A') = P(A) + P(A') = P(\Omega) = 1 \implies P(A') = 1 - P(A). \quad \blacksquare$$

Corollary. $P(\emptyset) = 0$.

Proof. $P(\emptyset) = P(\Omega') = 1 - P(\Omega) = 1 - 1 = 0$. $\blacksquare$

Corollary. If $A \subseteq B$, then $P(A) \leq P(B)$.

Proof. Write $B = A \cup (B \cap A')$, where the two sets are disjoint. Then $P(B) = P(A) + P(B \cap A') \geq P(A)$ since $P(B \cap A') \geq 0$. $\blacksquare$

2.2 Addition rule

Theorem. $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.

Proof. $A \cup B$ can be partitioned into three disjoint sets: $A \cap B'$, $A \cap B$, and $A' \cap B$.

$$P(A \cup B) = P(A \cap B') + P(A \cap B) + P(A' \cap B)$$

$$P(A) = P(A \cap B') + P(A \cap B) \implies P(A \cap B') = P(A) - P(A \cap B)$$

$$P(B) = P(A \cap B) + P(A' \cap B) \implies P(A' \cap B) = P(B) - P(A \cap B)$$

$$P(A \cup B) = [P(A) - P(A \cap B)] + P(A \cap B) + [P(B) - P(A \cap B)] = P(A) + P(B) - P(A \cap B). \quad \blacksquare$$

For mutually exclusive events ($A \cap B = \emptyset$): $P(A \cup B) = P(A) + P(B)$.

Corollary (Boole's inequality). For any events $A$ and $B$, $P(A \cup B) \leq P(A) + P(B)$.

Proof. Since $P(A \cap B) \geq 0$, we have $P(A \cup B) = P(A) + P(B) - P(A \cap B) \leq P(A) + P(B)$. $\blacksquare$

2.3 Multiplication rule

Theorem. For $P(A) > 0$, $P(A \cap B) = P(A) \cdot P(B|A)$.

Proof. This follows directly from the definition of conditional probability (Section 3.1). $\blacksquare$

General multiplication rule. For events $A_1, A_2, \ldots, A_n$:

$$P\!\left(\bigcap_{i=1}^{n} A_i\right) = P(A_1) \cdot P(A_2|A_1) \cdot P(A_3|A_1 \cap A_2) \cdots P(A_n|A_1 \cap \cdots \cap A_{n-1})$$
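
As a numerical illustration (the bag and numbers are our own, not from the text), the chain rule gives the probability of a sequence of dependent events as a running product of conditional probabilities:

```python
from fractions import Fraction

# P(R1 and R2 and R3) = P(R1) * P(R2|R1) * P(R3|R1 and R2):
# drawing three reds in a row from a bag of 3 red and 2 blue balls,
# without replacement, so each factor conditions on the draws so far.
p = Fraction(3, 5) * Fraction(2, 4) * Fraction(1, 3)
print(p)  # 1/10
```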


3. Conditional Probability

3.1 Definition

Definition. The conditional probability of $A$ given $B$ is

$$P(A|B) = \frac{P(A \cap B)}{P(B)} \quad \text{for } P(B) > 0$$

Intuition. $P(A|B)$ is the probability of $A$ occurring given that we already know $B$ has occurred. Knowing $B$ has happened changes our sample space from $\Omega$ to $B$, and we measure what fraction of $B$ is also in $A$.
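
This restriction-of-the-sample-space view translates directly into simulation: condition by discarding trials where $B$ did not occur. A minimal Monte Carlo sketch (the die events are illustrative assumptions):

```python
import random

# Estimate P(A|B) for one roll of a fair die, with A = "even" and
# B = "greater than 3"; the exact value is P(A|B) = 2/3.
random.seed(0)
trials = 100_000
count_B = count_A_and_B = 0
for _ in range(trials):
    roll = random.randint(1, 6)
    if roll > 3:            # keep only trials where B occurred
        count_B += 1
        if roll % 2 == 0:   # ...and count those where A also occurred
            count_A_and_B += 1

print(count_A_and_B / count_B)  # approximately 0.667
```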

3.2 Properties of conditional probability

Theorem. Conditional probability satisfies the Kolmogorov axioms for a fixed conditioning event $B$ (with $P(B) > 0$).

Proof.

  1. $P(A|B) = P(A \cap B)/P(B) \geq 0$ since $P(A \cap B) \geq 0$ and $P(B) > 0$.
  2. $P(\Omega|B) = P(\Omega \cap B)/P(B) = P(B)/P(B) = 1$.
  3. If $A_1, A_2, \ldots$ are mutually exclusive, then so are $A_1 \cap B, A_2 \cap B, \ldots$, and

$$P\!\left(\bigcup_i A_i \,\middle|\, B\right) = \frac{P\!\left(\left(\bigcup_i A_i\right) \cap B\right)}{P(B)} = \frac{\sum_i P(A_i \cap B)}{P(B)} = \sum_i P(A_i|B). \quad \blacksquare$$

Corollary. The complement rule holds for conditional probability: $P(A'|B) = 1 - P(A|B)$.

Proof. This follows from applying the complement rule within the conditional probability measure, which is justified by the theorem above. $\blacksquare$


4. Bayes' Theorem

4.1 Statement

Theorem. For events $A$ and $B$ with $P(B) > 0$:

$$P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}$$

4.2 Proof

$$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{P(B|A) \cdot P(A)}{P(B)} \quad \blacksquare$$

4.3 Law of Total Probability

If $B_1, B_2, \ldots, B_n$ partition $\Omega$ (mutually exclusive and exhaustive):

$$P(A) = \sum_{i=1}^{n}P(A|B_i)P(B_i)$$

4.4 Extended Bayes' Theorem

$$P(B_k|A) = \frac{P(A|B_k)P(B_k)}{\sum_{i=1}^{n}P(A|B_i)P(B_i)}$$

tip

Bayes' theorem is essential for "reverse" probability questions: "Given that a test is positive, what is the probability the patient actually has the disease?" Always define events clearly and identify what is given ($P(A|B)$) versus what is sought ($P(B|A)$).
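
A short sketch of this "reverse" computation in Python (the function name is ours; the numbers are the standard rare-disease scenario solved in full as Problem 2):

```python
def posteriors(priors, likelihoods):
    # Extended Bayes (Section 4.4): priors P(B_i) over a partition and
    # likelihoods P(A|B_i) give the posteriors P(B_i|A).
    total = sum(p * l for p, l in zip(priors, likelihoods))  # P(A), total probability
    return [p * l / total for p, l in zip(priors, likelihoods)]

# B_1 = "has disease", B_2 = "does not"; A = "tests positive".
print(posteriors([0.01, 0.99], [0.99, 0.01]))  # [0.5, 0.5]
```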


5. Independence

5.1 Definition

Definition. Events $A$ and $B$ are independent if and only if

$$P(A \cap B) = P(A) \cdot P(B)$$

5.2 Proof: Independence ⟺ conditional probability equals unconditional

Theorem. $A$ and $B$ are independent if and only if $P(A|B) = P(A)$ (provided $P(B) > 0$).

Proof.

($\Rightarrow$) If $P(A \cap B) = P(A)P(B)$, then $P(A|B) = \dfrac{P(A \cap B)}{P(B)} = \dfrac{P(A)P(B)}{P(B)} = P(A)$.

($\Leftarrow$) If $P(A|B) = P(A)$, then $\dfrac{P(A \cap B)}{P(B)} = P(A)$, so $P(A \cap B) = P(A)P(B)$. $\blacksquare$

Intuition. Independence means knowing $B$ occurred gives you no information about $A$. The probability of $A$ is the same whether or not $B$ has happened.
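
An exact check of the definition on a small space (the die events are our illustrative choice):

```python
from fractions import Fraction

# On a fair die, A = "even" and B = "at most 4" are independent:
# P(A and B) = 1/3 = (1/2)(2/3) = P(A)P(B).
omega = {1, 2, 3, 4, 5, 6}
A, B = {2, 4, 6}, {1, 2, 3, 4}

def P(event):
    return Fraction(len(event & omega), len(omega))  # equally likely outcomes

print(P(A & B) == P(A) * P(B))  # True: knowing B tells you nothing about A
```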

warning

If $A$ and $B$ are mutually exclusive and both have positive probability, they are not independent (since $P(A \cap B) = 0 \neq P(A)P(B)$).

5.3 Pairwise and mutual independence

Definition. Events $A_1, A_2, \ldots, A_n$ are mutually independent if for every subset $\{i_1, \ldots, i_k\} \subseteq \{1, 2, \ldots, n\}$ with $k \geq 2$:

$$P(A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}) = P(A_{i_1}) \cdot P(A_{i_2}) \cdots P(A_{i_k})$$

Definition. Events $A_1, A_2, \ldots, A_n$ are pairwise independent if every pair $(A_i, A_j)$ with $i \neq j$ is independent.

warning

Mutual independence is a stronger condition than pairwise independence. Pairwise independence does not imply mutual independence. For example, with two fair, independent coin tosses, let $A$ = "first toss is heads", $B$ = "second toss is heads", $C$ = "both tosses are the same". Then $A$, $B$, $C$ are pairwise independent but not mutually independent, since $P(A \cap B \cap C) = P(\{HH\}) = 1/4 \neq P(A)P(B)P(C) = 1/8$.
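
The counterexample can be verified exhaustively; a sketch (the encoding of the events as Python predicates is ours):

```python
from fractions import Fraction
from itertools import product

# Two fair coin tosses: A = "first is H", B = "second is H",
# C = "both tosses match". Pairwise independent, but not mutually.
omega = list(product("HT", repeat=2))

def P(pred):
    return Fraction(sum(pred(w) for w in omega), len(omega))

A = lambda w: w[0] == "H"
B = lambda w: w[1] == "H"
C = lambda w: w[0] == w[1]

for X, Y in [(A, B), (A, C), (B, C)]:
    assert P(lambda w: X(w) and Y(w)) == P(X) * P(Y)  # each pair independent

print(P(lambda w: A(w) and B(w) and C(w)))  # 1/4
print(P(A) * P(B) * P(C))                   # 1/8: not mutually independent
```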


6. Venn Diagrams and Tree Diagrams

6.1 Venn diagrams

Venn diagrams represent events as regions. Useful for visualising:

  • $A \cup B$, $A \cap B$, $A'$
  • Relationships between events
  • Applying the addition rule

6.2 Tree diagrams

Tree diagrams are useful for sequential experiments. Each branch represents a possible outcome with its probability. The probability along any path is the product of the probabilities along its branches (multiplication rule). The probability of any event is found by adding the probabilities of all paths leading to it (addition rule for mutually exclusive paths).

Example. A bag contains 3 red and 2 blue balls. Two balls are drawn without replacement.

$$P(\text{both red}) = \frac{3}{5} \times \frac{2}{4} = \frac{6}{20} = \frac{3}{10}$$

$$P(\text{one of each}) = \frac{3}{5} \times \frac{2}{4} + \frac{2}{5} \times \frac{3}{4} = \frac{6}{20} + \frac{6}{20} = \frac{12}{20} = \frac{3}{5}$$
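
The same answers fall out of brute-force enumeration, which is a useful way to check tree-diagram work; a sketch:

```python
from fractions import Fraction
from itertools import permutations

# 3 red (R) and 2 blue (B) balls; enumerate all ordered draws of two
# distinct balls, which are equally likely without replacement.
balls = ["R", "R", "R", "B", "B"]
pairs = list(permutations(range(5), 2))

both_red = sum(balls[i] == "R" and balls[j] == "R" for i, j in pairs)
one_each = sum(balls[i] != balls[j] for i, j in pairs)

print(Fraction(both_red, len(pairs)))  # 3/10
print(Fraction(one_each, len(pairs)))  # 3/5
```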


7. Counting Principles

7.1 Factorials

$n! = n(n-1)(n-2)\cdots 1$, with $0! = 1$.

7.2 Permutations and combinations

  • Permutations: ${}^n P_r = \dfrac{n!}{(n-r)!}$ (order matters)
  • Combinations: ${}^n C_r = \binom{n}{r} = \dfrac{n!}{r!(n-r)!}$ (order does not matter)

7.3 Probability with equally likely outcomes

When all outcomes are equally likely: $P(A) = \dfrac{|A|}{|\Omega|} = \dfrac{\text{number of favourable outcomes}}{\text{total number of outcomes}}$.
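
These counting functions exist directly in Python's standard library, which is handy for checking answers; the flush probability shown is the one worked in Problem 12:

```python
import math

print(math.factorial(5))  # 5! = 120
print(math.perm(5, 2))    # 5P2 = 20  (order matters)
print(math.comb(5, 2))    # 5C2 = 10  (order does not matter)

# Equally likely outcomes: P(A) = |A| / |Omega|, e.g. the probability
# that a 5-card hand is a flush: 4 * C(13,5) / C(52,5).
print(4 * math.comb(13, 5) / math.comb(52, 5))  # ~0.00198
```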


8. Venn Diagrams for Three Events

8.1 Inclusion-exclusion principle

Theorem (Inclusion-Exclusion for three events). For events $A$, $B$, $C$:

$$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$$

Proof. Apply the two-event inclusion-exclusion rule twice:

$$P(A \cup B \cup C) = P(A) + P(B \cup C) - P(A \cap (B \cup C))$$

Now $P(B \cup C) = P(B) + P(C) - P(B \cap C)$, and by the distributive law of set theory $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$, so:

$$P(A \cap (B \cup C)) = P(A \cap B) + P(A \cap C) - P(A \cap B \cap C)$$

Substituting:

$$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(B \cap C) - P(A \cap B) - P(A \cap C) + P(A \cap B \cap C). \quad \blacksquare$$

8.2 De Morgan's laws for three events

Theorem. For events $A$, $B$, $C$:

$$(A \cup B \cup C)' = A' \cap B' \cap C'$$

$$(A \cap B \cap C)' = A' \cup B' \cup C'$$

Proof. The three-event case follows by applying the two-event De Morgan law twice. For the first law:

$$(A \cup B \cup C)' = ((A \cup B) \cup C)' = (A \cup B)' \cap C' = A' \cap B' \cap C'. \quad \blacksquare$$

8.3 Working with three-event Venn diagrams

When solving problems with three events, the Venn diagram is divided into 8 regions (including the exterior). The fundamental approach is:

  1. Start from the innermost region $A \cap B \cap C$ and work outward.
  2. Use the given information to find the value of each region.
  3. Each region represents a disjoint event, so probabilities add.

Example. In a class of 40 students, 18 study Maths, 15 study Physics, and 12 study Chemistry. 5 study all three, 8 study Maths and Physics, 6 study Maths and Chemistry, and 7 study Physics and Chemistry. Let $A$, $B$, $C$ be the events that a randomly chosen student studies Maths, Physics, and Chemistry respectively.

The region for "Maths only" is $18 - 8 - 6 + 5 = 9$ (subtract the overlaps, add back the triple overlap).

| Region | Description | Calculation | Count |
| --- | --- | --- | --- |
| $A \cap B \cap C$ | All three | Given | 5 |
| $A \cap B \cap C'$ | Maths and Physics only | $8 - 5$ | 3 |
| $A \cap C \cap B'$ | Maths and Chemistry only | $6 - 5$ | 1 |
| $B \cap C \cap A'$ | Physics and Chemistry only | $7 - 5$ | 2 |
| $A \cap B' \cap C'$ | Maths only | $18 - 3 - 1 - 5$ | 9 |
| $B \cap A' \cap C'$ | Physics only | $15 - 3 - 2 - 5$ | 5 |
| $C \cap A' \cap B'$ | Chemistry only | $12 - 1 - 2 - 5$ | 4 |
| $A' \cap B' \cap C'$ | None | $40 - 29$ | 11 |

Check: $5 + 3 + 1 + 2 + 9 + 5 + 4 + 11 = 40$. $\checkmark$
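
The same inner-to-outer computation in a short script (the variable names are ours):

```python
# Work outward from the triple overlap, as step 1 above recommends.
n = 40
M, P, C = 18, 15, 12        # Maths, Physics, Chemistry totals
MP, MC, PC, MPC = 8, 6, 7, 5

maths_only   = M - (MP - MPC) - (MC - MPC) - MPC  # 18 - 3 - 1 - 5 = 9
physics_only = P - (MP - MPC) - (PC - MPC) - MPC  # 15 - 3 - 2 - 5 = 5
chem_only    = C - (MC - MPC) - (PC - MPC) - MPC  # 12 - 1 - 2 - 5 = 4
at_least_one = M + P + C - MP - MC - PC + MPC     # inclusion-exclusion: 29

print(maths_only, physics_only, chem_only, n - at_least_one)  # 9 5 4 11
```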


9. Multi-Stage Experiments and Tree Diagrams

9.1 Formal structure

A multi-stage experiment consists of a sequence of trials. A tree diagram represents this as:

  • Levels correspond to stages (trials).
  • Branches at each node represent possible outcomes at that stage.
  • Branch probabilities are the conditional probabilities of each outcome given the path so far.
  • Path probability is the product of all branch probabilities along the path.
  • Event probability is the sum of all relevant path probabilities.

9.2 With and without replacement

With replacement. At each stage, the sample space and probabilities reset. The trials are independent.

Without replacement. At each stage, the sample space shrinks. The trials are not independent; later probabilities depend on earlier outcomes.

Example. A bag contains 5 balls: 2 red and 3 blue. Three balls are drawn without replacement. Find the probability of drawing exactly 2 red balls.

There are $\binom{3}{2} = 3$ ways to arrange the two red draws among three positions: RRB, RBR, BRR.

$$P(\text{RRB}) = \frac{2}{5} \times \frac{1}{4} \times \frac{3}{3} = \frac{6}{60} = \frac{1}{10}$$

$$P(\text{RBR}) = \frac{2}{5} \times \frac{3}{4} \times \frac{1}{3} = \frac{6}{60} = \frac{1}{10}$$

$$P(\text{BRR}) = \frac{3}{5} \times \frac{2}{4} \times \frac{1}{3} = \frac{6}{60} = \frac{1}{10}$$

$$P(\text{exactly 2 red}) = \frac{1}{10} + \frac{1}{10} + \frac{1}{10} = \frac{3}{10}$$
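
An unordered counting route reaches the same answer, which makes a good cross-check (a sketch using the hypergeometric form):

```python
from fractions import Fraction
from math import comb

# Choose which 3 of the 5 balls are drawn: exactly 2 of the 2 reds
# and 1 of the 3 blues, out of C(5,3) equally likely subsets.
p = Fraction(comb(2, 2) * comb(3, 1), comb(5, 3))
print(p)  # 3/10, matching the path sum RRB + RBR + BRR
```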

9.3 At least and at most problems

For "at least kk" problems, it is often easier to compute the complement: P(atleastk)=1P(atmostk1)P(\mathrm{at least } k) = 1 - P(\mathrm{at most } k-1).

Example. A fair coin is tossed 4 times. Find $P(\text{at least 3 heads})$.

$$P(\text{at least 3 heads}) = P(\text{exactly 3 heads}) + P(\text{exactly 4 heads})$$

$$= \binom{4}{3}\left(\frac{1}{2}\right)^4 + \binom{4}{4}\left(\frac{1}{2}\right)^4 = \frac{4}{16} + \frac{1}{16} = \frac{5}{16}$$

Alternatively: $P(\text{at least 3 heads}) = 1 - P(\text{at most 2 heads}) = 1 - \frac{11}{16} = \frac{5}{16}$.
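
Both routes generalise to simple helper functions; a sketch (the function names are ours):

```python
from fractions import Fraction
from math import comb

def at_least(n, k, p):
    # Sum of binomial probabilities P(X = i) for i = k, ..., n.
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def at_most(n, k, p):
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

p = Fraction(1, 2)
print(at_least(4, 3, p))     # 5/16, the direct sum
print(1 - at_most(4, 2, p))  # 5/16 again, via the complement rule
```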

9.4 Conditional probability from tree diagrams

To find a conditional probability $P(X|Y)$ from a tree diagram (a worked sketch follows these steps):

  1. Identify all paths leading to $Y$ (the conditioning event).
  2. Sum these path probabilities to get $P(Y)$.
  3. Among those paths, identify which also satisfy $X$.
  4. Sum the relevant path probabilities to get $P(X \cap Y)$.
  5. $P(X|Y) = P(X \cap Y)/P(Y)$.
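
Here is the recipe run on the bag of Section 6.2 (3 red, 2 blue, two draws without replacement), with $X$ = "first ball red" and $Y$ = "second ball red"; the path encoding is ours:

```python
from fractions import Fraction

# Each path: (first colour, second colour, path probability).
paths = [
    ("R", "R", Fraction(3, 5) * Fraction(2, 4)),
    ("R", "B", Fraction(3, 5) * Fraction(2, 4)),
    ("B", "R", Fraction(2, 5) * Fraction(3, 4)),
    ("B", "B", Fraction(2, 5) * Fraction(1, 4)),
]
p_Y  = sum(p for a, b, p in paths if b == "R")               # steps 1-2: P(Y) = 3/5
p_XY = sum(p for a, b, p in paths if a == "R" and b == "R")  # steps 3-4: 3/10
print(p_XY / p_Y)  # step 5: P(X|Y) = 1/2
```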

10. Discrete Random Variables and Probability Mass Functions

10.1 Discrete random variables

Definition. A random variable is a function $X \colon \Omega \to \mathbb{R}$ that assigns a real number to each outcome in the sample space.

Definition. A random variable $X$ is discrete if its set of possible values is countable (i.e. finite or countably infinite).

Example. If a fair die is rolled, define $X$ = "the number shown". Then $X$ takes values in $\{1, 2, 3, 4, 5, 6\}$, so $X$ is discrete.

Example. If a coin is tossed until the first head appears, define $X$ = "number of tosses". Then $X$ takes values in $\{1, 2, 3, \ldots\}$, which is countably infinite.

10.2 Probability mass function (PMF)

Definition. The probability mass function (PMF) of a discrete random variable $X$ is the function $p(x) = P(X = x)$, defined for all $x \in \mathbb{R}$.

Properties of a PMF. A function $p \colon \mathbb{R} \to [0, 1]$ is a valid PMF if and only if:

  1. $p(x) \geq 0$ for all $x$.
  2. $\displaystyle\sum_{\text{all } x} p(x) = 1$.

Proof. Property 1 follows from the non-negativity of probability. Property 2 follows because the events $\{X = x\}$ for all possible values of $x$ form a partition of $\Omega$, so their probabilities sum to 1 by the normalisation axiom. $\blacksquare$

10.3 Cumulative distribution function (CDF)

Definition. The cumulative distribution function (CDF) of a discrete random variable $X$ is

$$F(x) = P(X \leq x) = \sum_{t \leq x} p(t)$$

The CDF is a non-decreasing, right-continuous function with $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to +\infty} F(x) = 1$.

10.4 Expectation and variance

Definition. The expected value (mean) of a discrete random variable $X$ is

$$E(X) = \mu = \sum_{\text{all } x} x \cdot p(x)$$

Definition. The variance of $X$ is

$$\mathrm{Var}(X) = \sigma^2 = E\!\left[(X - \mu)^2\right] = \sum_{\text{all } x} (x - \mu)^2 \cdot p(x)$$

An equivalent computational formula is:

$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2$$

Proof of the computational formula:

$$\mathrm{Var}(X) = E\!\left[(X - \mu)^2\right] = E(X^2 - 2\mu X + \mu^2) = E(X^2) - 2\mu E(X) + \mu^2 = E(X^2) - \mu^2. \quad \blacksquare$$

10.5 Worked example

A biased die has PMF:

| $x$ | 1 | 2 | 3 | 4 | 5 | 6 |
| --- | --- | --- | --- | --- | --- | --- |
| $p(x)$ | $1/12$ | $1/6$ | $1/4$ | $1/4$ | $1/6$ | $1/12$ |

Check: $1/12 + 1/6 + 1/4 + 1/4 + 1/6 + 1/12 = \frac{1 + 2 + 3 + 3 + 2 + 1}{12} = \frac{12}{12} = 1$. $\checkmark$

$$E(X) = 1 \cdot \tfrac{1}{12} + 2 \cdot \tfrac{1}{6} + 3 \cdot \tfrac{1}{4} + 4 \cdot \tfrac{1}{4} + 5 \cdot \tfrac{1}{6} + 6 \cdot \tfrac{1}{12}$$

$$= \tfrac{1}{12} + \tfrac{2}{6} + \tfrac{3}{4} + \tfrac{4}{4} + \tfrac{5}{6} + \tfrac{6}{12} = \tfrac{1 + 4 + 9 + 12 + 10 + 6}{12} = \tfrac{42}{12} = 3.5$$

$$E(X^2) = 1 \cdot \tfrac{1}{12} + 4 \cdot \tfrac{1}{6} + 9 \cdot \tfrac{1}{4} + 16 \cdot \tfrac{1}{4} + 25 \cdot \tfrac{1}{6} + 36 \cdot \tfrac{1}{12}$$

$$= \tfrac{1 + 8 + 27 + 48 + 50 + 36}{12} = \tfrac{170}{12} = \tfrac{85}{6}$$

$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \tfrac{85}{6} - \tfrac{49}{4} = \tfrac{170 - 147}{12} = \tfrac{23}{12} \approx 1.917$$

info

For comparison, a fair die also has mean $3.5$ but variance $\tfrac{35}{12} \approx 2.917$; the biased die above has the same mean but smaller variance, meaning its outcomes are more concentrated around the centre.
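
The worked example is easy to verify exactly; a sketch using the formulas of Section 10.4:

```python
from fractions import Fraction

# PMF of the biased die from the worked example above.
pmf = {1: Fraction(1, 12), 2: Fraction(1, 6), 3: Fraction(1, 4),
       4: Fraction(1, 4), 5: Fraction(1, 6), 6: Fraction(1, 12)}

assert sum(pmf.values()) == 1                          # valid PMF
mean = sum(x * p for x, p in pmf.items())              # E(X)
var = sum(x**2 * p for x, p in pmf.items()) - mean**2  # E(X^2) - [E(X)]^2
print(mean, var)  # 7/2 23/12
```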


Problem Set

Problem 1. Events $A$ and $B$ are such that $P(A) = 0.4$, $P(B) = 0.5$, and $P(A \cup B) = 0.7$. Find $P(A \cap B)$ and $P(A|B)$.

Solution 1. $P(A \cap B) = P(A) + P(B) - P(A \cup B) = 0.4 + 0.5 - 0.7 = 0.2$.

$P(A|B) = P(A \cap B)/P(B) = 0.2/0.5 = 0.4$.

If you get this wrong, revise: Addition Rule — Section 2.2.

Problem 2. A disease affects 1% of a population. A test is 99% accurate (both sensitivity and specificity). A person tests positive. What is the probability they actually have the disease?

Solution 2. Let $D$ = "has the disease", $T^+$ = "tests positive".

$P(D) = 0.01$, $P(T^+|D) = 0.99$, $P(T^+|D') = 0.01$.

By the law of total probability: $P(T^+) = P(T^+|D)P(D) + P(T^+|D')P(D') = 0.99(0.01) + 0.01(0.99) = 0.0099 + 0.0099 = 0.0198$.

By Bayes' theorem: $P(D|T^+) = \dfrac{P(T^+|D)P(D)}{P(T^+)} = \dfrac{0.0099}{0.0198} = 0.5$.

Even with a 99% accurate test, a positive result means only a 50% chance of actually having the disease, because the disease is so rare.

If you get this wrong, revise: Bayes' Theorem — Section 4.

Problem 3. Prove that if $A$ and $B$ are independent, then so are $A$ and $B'$.

Solution 3. $P(A \cap B') = P(A) - P(A \cap B) = P(A) - P(A)P(B)$ (by independence) $= P(A)[1 - P(B)] = P(A)P(B')$. $\blacksquare$

If you get this wrong, revise: Independence — Section 5.

Problem 4. A bag contains 4 red, 3 blue, and 2 green balls. Three balls are drawn without replacement. Find the probability that all three are different colours.

Solution 4. Total ways to choose 3 from 9: $\binom{9}{3} = 84$.

Ways to get one of each colour: $\binom{4}{1}\binom{3}{1}\binom{2}{1} = 4 \times 3 \times 2 = 24$.

$P = 24/84 = 2/7$.

If you get this wrong, revise: Counting Principles — Section 7.

Problem 5. Events $A$, $B$, $C$ are such that $P(A) = 0.3$, $P(B) = 0.4$, $P(C) = 0.5$, $P(A \cap B) = 0.1$, $P(A \cap C) = 0.15$, $P(B \cap C) = 0.2$, and $P(A \cap B \cap C) = 0.05$. Find $P(A \cup B \cup C)$.

Solution 5. By the inclusion-exclusion principle:

$$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(A \cap C) - P(B \cap C) + P(A \cap B \cap C)$$

$$= 0.3 + 0.4 + 0.5 - 0.1 - 0.15 - 0.2 + 0.05 = 0.8$$

If you get this wrong, revise: Inclusion-Exclusion — Section 8.1.

Problem 6. Two coins are tossed. Given that at least one is heads, find the probability that both are heads.

Solution 6. $\Omega = \{HH, HT, TH, TT\}$. $A = \{\text{at least one heads}\} = \{HH, HT, TH\}$, $B = \{\text{both heads}\} = \{HH\}$.

$P(B|A) = P(B \cap A)/P(A) = P(B)/P(A) = (1/4)/(3/4) = 1/3$.

If you get this wrong, revise: Conditional Probability — Section 3.

Problem 7. A fair die is rolled. Let $A$ = "even number" and $B$ = "number greater than 3". Are $A$ and $B$ independent?

Solution 7. $A = \{2, 4, 6\}$, $B = \{4, 5, 6\}$, $A \cap B = \{4, 6\}$.

$P(A) = 3/6 = 1/2$, $P(B) = 3/6 = 1/2$, $P(A \cap B) = 2/6 = 1/3$.

$P(A)P(B) = 1/4 \neq 1/3 = P(A \cap B)$. So $A$ and $B$ are not independent.

If you get this wrong, revise: Independence — Section 5.

Problem 8. In a school, 60% of students study Maths, 40% study Physics, and 25% study both. A student is chosen at random. Given that they study Physics, find the probability they study Maths.

Solution 8. $P(M) = 0.6$, $P(P) = 0.4$, $P(M \cap P) = 0.25$.

$P(M|P) = P(M \cap P)/P(P) = 0.25/0.4 = 0.625$.

If you get this wrong, revise: Conditional Probability — Section 3.

Problem 9. A box contains 10 items, 3 of which are defective. Items are inspected one by one without replacement. Find the probability that the first defective item is the third one inspected.

Solution 9. The first two must be non-defective and the third defective:

$$P = \frac{7}{10} \times \frac{6}{9} \times \frac{3}{8} = \frac{7 \times 6 \times 3}{720} = \frac{126}{720} = \frac{7}{40}$$

If you get this wrong, revise: Tree Diagrams — Section 6.2.

Problem 10. A machine produces components. 5% are defective. Components are packed in boxes of 20. Find the probability that a box contains exactly one defective component.

Solution 10. This is a binomial scenario: $X \sim B(20, 0.05)$.

$P(X=1) = \binom{20}{1}(0.05)^1(0.95)^{19} = 20 \times 0.05 \times 0.95^{19} \approx 0.3774$.

If you get this wrong, revise: Binomial Distribution — Statistical Distributions chapter.

Problem 11. Prove that $P(A \cup B') = 1 - P(A' \cap B)$.

Solution 11. By De Morgan's law: $(A \cup B')' = A' \cap B$.

So $P(A \cup B') = 1 - P((A \cup B')') = 1 - P(A' \cap B)$. $\blacksquare$

If you get this wrong, revise: Complement Rule — Section 2.1.

Problem 12. From a standard 52-card deck, 5 cards are dealt. Find the probability of getting a flush (all 5 cards of the same suit).

Solution 12. Total ways: $\binom{52}{5} = 2598960$.

Ways to get a flush: choose the suit (4 ways), then 5 cards from that suit ($\binom{13}{5} = 1287$).

Total flushes: $4 \times 1287 = 5148$.

$P(\text{flush}) = 5148/2598960 \approx 0.00198 \approx 0.2\%$.

If you get this wrong, revise: Counting Principles — Section 7.

Problem 13. A discrete random variable $X$ has PMF $p(x) = kx$ for $x \in \{1, 2, 3, 4, 5\}$ and $p(x) = 0$ otherwise. Find the constant $k$, then find $E(X)$ and $\mathrm{Var}(X)$.

Solution 13. For a valid PMF: $\sum_{x=1}^{5} kx = k(1 + 2 + 3 + 4 + 5) = 15k = 1$, so $k = 1/15$.

$$E(X) = \sum_{x=1}^{5} x \cdot \frac{x}{15} = \frac{1 + 4 + 9 + 16 + 25}{15} = \frac{55}{15} = \frac{11}{3}$$

$$E(X^2) = \sum_{x=1}^{5} x^2 \cdot \frac{x}{15} = \frac{1 + 8 + 27 + 64 + 125}{15} = \frac{225}{15} = 15$$

$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = 15 - \frac{121}{9} = \frac{135 - 121}{9} = \frac{14}{9}$$

If you get this wrong, revise: Discrete Random Variables — Section 10.

Problem 14. A bag contains 4 red and 6 blue balls. Balls are drawn one at a time without replacement until a red ball is drawn. Find the probability that exactly 3 draws are needed.

Solution 14. The first two draws must be blue and the third red:

$$P = \frac{6}{10} \times \frac{5}{9} \times \frac{4}{8} = \frac{120}{720} = \frac{1}{6}$$

If you get this wrong, revise: Multi-Stage Experiments — Section 9.

Problem 15. Prove Boole's inequality: for events $A_1, A_2, \ldots, A_n$,

$$P\!\left(\bigcup_{i=1}^{n} A_i\right) \leq \sum_{i=1}^{n} P(A_i)$$

Solution 15. By induction on $n$.

Base case ($n = 2$): $P(A_1 \cup A_2) = P(A_1) + P(A_2) - P(A_1 \cap A_2) \leq P(A_1) + P(A_2)$. $\checkmark$

Inductive step: assume $P\!\left(\bigcup_{i=1}^{k} A_i\right) \leq \sum_{i=1}^{k} P(A_i)$. Then

$$P\!\left(\bigcup_{i=1}^{k+1} A_i\right) = P\!\left(\bigcup_{i=1}^{k} A_i\right) + P(A_{k+1}) - P\!\left(\left(\bigcup_{i=1}^{k} A_i\right) \cap A_{k+1}\right)$$

$$\leq \sum_{i=1}^{k} P(A_i) + P(A_{k+1}) = \sum_{i=1}^{k+1} P(A_i). \quad \blacksquare$$

If you get this wrong, revise: Basic Probability Results — Section 2.

Problem 16. In a survey, 70% of people like tea, 50% like coffee, and 35% like both. A person is chosen at random. Given that they like at least one of the two drinks, find the probability that they like both.

Solution 16. $P(T) = 0.7$, $P(C) = 0.5$, $P(T \cap C) = 0.35$.

$P(T \cup C) = P(T) + P(C) - P(T \cap C) = 0.7 + 0.5 - 0.35 = 0.85$.

$$P(T \cap C \mid T \cup C) = \frac{P(T \cap C)}{P(T \cup C)} = \frac{0.35}{0.85} = \frac{7}{17} \approx 0.412$$

If you get this wrong, revise: Conditional Probability — Section 3.

Problem 17. A fair coin is tossed 5 times. Using the complement rule, find the probability of getting at least one head.

Solution 17. Let $A$ = "at least one head". Then $A'$ = "no heads" = "all tails".

$$P(A) = 1 - P(A') = 1 - \left(\frac{1}{2}\right)^5 = 1 - \frac{1}{32} = \frac{31}{32}$$

If you get this wrong, revise: Complement Rule — Section 2.1.

Problem 18. Two events $A$ and $B$ satisfy $P(A) = 0.6$, $P(B|A) = 0.4$, and $P(B|A') = 0.7$. Find $P(B)$ and $P(A|B)$, and determine whether $A$ and $B$ are independent.

Solution 18. By the law of total probability:

$$P(B) = P(B|A)P(A) + P(B|A')P(A') = 0.4 \times 0.6 + 0.7 \times 0.4 = 0.24 + 0.28 = 0.52$$

$$P(A \cap B) = P(B|A)P(A) = 0.4 \times 0.6 = 0.24$$

$$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{0.24}{0.52} = \frac{6}{13} \approx 0.462$$

Check independence: $P(A)P(B) = 0.6 \times 0.52 = 0.312 \neq 0.24 = P(A \cap B)$. So $A$ and $B$ are not independent.

If you get this wrong, revise: Bayes' Theorem — Section 4, and Independence — Section 5.

Problem 19. A discrete random variable $X$ has CDF $F(x) = 0$ for $x < 0$, $F(x) = x/4$ for $0 \leq x < 1$, $F(x) = 1/2$ for $1 \leq x < 2$, $F(x) = 3/4$ for $2 \leq x < 3$, and $F(x) = 1$ for $x \geq 3$. Find the PMF of $X$ and verify it sums to 1.

Solution 19. For a discrete random variable, the PMF is read off from the jumps of the CDF: $p(x) = F(x) - F(x^-)$. The jumps occur at:

  • $x = 1$: $p(1) = F(1) - F(1^-) = 1/2 - 1/4 = 1/4$
  • $x = 2$: $p(2) = F(2) - F(2^-) = 3/4 - 1/2 = 1/4$
  • $x = 3$: $p(3) = F(3) - F(3^-) = 1 - 3/4 = 1/4$

These sum to $3/4 \neq 1$. The shortfall is the linear portion $F(x) = x/4$ on $[0, 1)$, which spreads probability $1/4$ continuously rather than placing it at a point mass ($F$ is continuous at $x = 0$, so $p(0) = 0$). A purely discrete random variable must have a piecewise-constant (step) CDF, so the given $F$ in fact describes a mixed distribution, not a discrete one. If the problem intended a discrete distribution, the PMF from the jumps is

$$p(1) = \frac{1}{4}, \quad p(2) = \frac{1}{4}, \quad p(3) = \frac{1}{4}, \quad p(x) = 0 \text{ otherwise}$$

with the remaining $1/4$ of the probability sitting in the continuous part.

If you get this wrong, revise: Discrete Random Variables — Section 10.

Problem 20. Three machines $M_1$, $M_2$, $M_3$ produce items in the proportions 50%, 30%, 20%. Their defect rates are 2%, 3%, 5% respectively. An item is found to be defective. Find the probability it was produced by $M_3$.

Solution 20. Let $D$ = "defective". By the law of total probability:

$$P(D) = P(D|M_1)P(M_1) + P(D|M_2)P(M_2) + P(D|M_3)P(M_3)$$

$$= 0.02 \times 0.5 + 0.03 \times 0.3 + 0.05 \times 0.2 = 0.01 + 0.009 + 0.01 = 0.029$$

By Bayes' theorem:

$$P(M_3|D) = \frac{P(D|M_3)P(M_3)}{P(D)} = \frac{0.05 \times 0.2}{0.029} = \frac{0.01}{0.029} = \frac{10}{29} \approx 0.345$$

If you get this wrong, revise: Extended Bayes' Theorem — Section 4.4.


tip

Diagnostic Test. Ready to test your understanding of Probability? The diagnostic test contains the hardest questions within the A-Level specification for this topic, each with a full worked solution.

Unit tests probe edge cases and common misconceptions. Integration tests combine Probability with other topics to test synthesis under exam conditions.

See Diagnostic Guide for instructions on self-marking and building a personal test matrix.