Probability — Diagnostic Tests
Unit Tests
Tests edge cases, boundary conditions, and common misconceptions for probability.
UT-1: Mutually Exclusive vs Independent Events
Question:
Events A and B are such that P(A)=0.4, P(B)=0.5, and P(A∩B)=0.1.
(a) Determine whether A and B are mutually exclusive. Justify your answer.
(b) Determine whether A and B are independent. Justify your answer.
(c) A student claims: "If two events are not mutually exclusive, then they must be independent." Construct a counterexample using numerical values to show this claim is false.
(d) A second student claims: "If two events are independent, then they cannot be mutually exclusive (unless one has probability zero)." Prove this claim is true.
[Difficulty: hard. Tests the precise distinction between two concepts that students frequently confuse, and requires a proof.]
Solution:
(a) Events A and B are mutually exclusive if P(A∩B)=0.
Here P(A∩B)=0.1≠0, so A and B are not mutually exclusive.
(b) Events A and B are independent if P(A∩B)=P(A)×P(B).
P(A)×P(B)=0.4×0.5=0.20
But P(A∩B)=0.1≠0.20, so A and B are not independent.
(c) Let C and D be events with P(C)=0.3, P(D)=0.4, P(C∩D)=0.15.
Not mutually exclusive: P(C∩D)=0.15≠0.
Not independent: P(C)×P(D)=0.3×0.4=0.12≠0.15=P(C∩D).
So C and D are neither mutually exclusive nor independent, disproving the claim.
(d) Suppose A and B are independent and mutually exclusive.
By independence: P(A∩B)=P(A)×P(B).
By mutual exclusivity: P(A∩B)=0.
Therefore: P(A)×P(B)=0.
Since a product of two real numbers is zero only when at least one factor is zero, this means P(A)=0 or P(B)=0 (or both).
So if two events are both independent and mutually exclusive, at least one of them must have probability zero. This proves the claim: independent non-trivial events (with positive probability) cannot be mutually exclusive.
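As a quick numerical check of parts (a) to (c), the two definitions can be encoded directly. This is a minimal sketch: the helper name `classify` is illustrative, and exact fractions are used only so that the equality tests are not affected by floating-point rounding.

```python
from fractions import Fraction

def classify(p_a, p_b, p_ab):
    """Return (mutually_exclusive, independent) from P(A), P(B), P(A∩B)."""
    mutually_exclusive = (p_ab == 0)
    independent = (p_ab == p_a * p_b)
    return mutually_exclusive, independent

# Parts (a) and (b): P(A)=0.4, P(B)=0.5, P(A∩B)=0.1
print(classify(Fraction(2, 5), Fraction(1, 2), Fraction(1, 10)))   # (False, False)

# Part (c): P(C)=0.3, P(D)=0.4, P(C∩D)=0.15
print(classify(Fraction(3, 10), Fraction(2, 5), Fraction(3, 20)))  # (False, False)
```

Both pairs come out as neither mutually exclusive nor independent, which is exactly what the counterexample in part (c) requires.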
UT-2: Conditional Probability and the Prosecutor's Fallacy
Question:
A medical test for a disease has the following characteristics:
- If a person has the disease, the test is positive with probability 0.95.
- If a person does not have the disease, the test is positive with probability 0.02.
- The prevalence of the disease in the population is 0.01 (1%).
A person is randomly selected from the population and tests positive.
(a) Calculate the probability that the person actually has the disease. This probability is called the positive predictive value.
(b) A prosecutor argues in court: "The test is 95% accurate, so there is a 95% chance the defendant has the disease." Identify the specific error in this reasoning, naming the two probabilities that have been confused.
(c) The hospital director wants the positive predictive value to be at least 50%. Find the minimum prevalence of the disease required to achieve this, keeping the test characteristics the same.
(d) Explain why P(disease∣positive)≠P(positive∣disease) in general, and state the condition under which they would be equal.
[Difficulty: hard. Tests conditional probability, Bayes' theorem, and the critical distinction between the direction of conditioning.]
Solution:
(a) Let D = "person has the disease" and + = "test is positive".
We need P(D∣+).
By Bayes' theorem (or using a tree diagram / contingency table):
$P(D \mid +) = \frac{P(+ \mid D)\,P(D)}{P(+)}$
$P(+) = P(+ \mid D)\,P(D) + P(+ \mid D')\,P(D')$
$= (0.95)(0.01) + (0.02)(0.99) = 0.0095 + 0.0198 = 0.0293$
$P(D \mid +) = \frac{0.0095}{0.0293} = 0.3242\ldots \approx 0.324$
So there is approximately a 32.4% chance the person actually has the disease, despite the positive test.
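The Bayes' theorem calculation can be reproduced in a few lines. This is a sketch only; the function and parameter names are illustrative rather than taken from the question, and exact fractions are used so the result can be compared with the working by hand.

```python
from fractions import Fraction

def positive_predictive_value(sensitivity, false_positive_rate, prevalence):
    """P(D | +) via Bayes' theorem."""
    true_positive = sensitivity * prevalence                  # P(+ | D) P(D)
    false_positive = false_positive_rate * (1 - prevalence)   # P(+ | D') P(D')
    return true_positive / (true_positive + false_positive)

ppv = positive_predictive_value(Fraction(95, 100), Fraction(2, 100), Fraction(1, 100))
print(ppv, round(float(ppv), 3))  # 95/293 0.324
```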
(b) The prosecutor has confused:
- P(+∣D)=0.95 (probability of a positive test given the person has the disease)
- P(D∣+)≈0.324 (probability the person has the disease given a positive test)
This is the prosecutor's fallacy (also known as the base rate fallacy or confusion of the inverse). The prosecutor has transposed the conditioning, ignoring the base rate (prevalence) of the disease. When the disease is rare (1% prevalence), most positive tests are actually false positives, even with a "95% accurate" test.
(c) We need P(D∣+)≥0.5.
Let p=P(D) be the prevalence. Then:
$P(+) = 0.95p + 0.02(1-p) = 0.95p + 0.02 - 0.02p = 0.93p + 0.02$
$P(D \mid +) = \frac{0.95p}{0.93p + 0.02} \ge 0.5$
$0.95p \ge 0.5(0.93p + 0.02) = 0.465p + 0.01$
$0.95p - 0.465p \ge 0.01$
$0.485p \ge 0.01$
$p \ge \frac{0.01}{0.485} = 0.02062\ldots$
The minimum prevalence is approximately 2.06%. At this prevalence, exactly half of all positive tests are true positives.
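The same rearrangement works for any target value t of the positive predictive value: with sensitivity s and false positive rate f, the inequality sp ≥ t(sp + f(1−p)) gives p ≥ tf / (s(1−t) + tf). A short sketch, with illustrative names that are not part of the question:

```python
from fractions import Fraction

def min_prevalence(sensitivity, false_positive_rate, target_ppv):
    """Smallest prevalence p for which P(D | +) >= target_ppv."""
    numerator = target_ppv * false_positive_rate
    return numerator / (sensitivity * (1 - target_ppv) + numerator)

p_min = min_prevalence(Fraction(95, 100), Fraction(2, 100), Fraction(1, 2))
print(p_min, round(float(p_min), 4))  # 2/97 0.0206
```

For a 50% target the formula reduces to p ≥ f / (s + f) = 0.02 / 0.97, which is the same number as 0.01 / 0.485.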
(d) P(D∣+)≠P(+∣D) in general because they condition on different events. P(+∣D) is the sensitivity of the test (among people with the disease, what fraction test positive), while P(D∣+) is the positive predictive value (among people who test positive, what fraction actually have the disease). These are related by Bayes' theorem:
$P(D \mid +) = \frac{P(+ \mid D)\,P(D)}{P(+)}$
Provided P(+∣D)>0, they are equal only when P(D)=P(+), i.e., when the prevalence equals the overall probability of a positive test. This is a very specific condition that would not generally hold.
UT-3: Probability Distributions and "At Least One" Problems
Question:
A bag contains 4 red balls and 6 blue balls. Three balls are drawn at random without replacement.
(a) Find the probability distribution of X, the number of red balls drawn. Verify that your probabilities sum to 1.
(b) Calculate E(X) and Var(X) directly from the probability distribution.
(c) A student claims that the identity P(at least one red)=1−P(no red) is only valid for independent events. Prove that this identity holds for any events (dependent or independent), using the addition rule.
(d) In a modified version of the problem, balls are drawn with replacement. Find the probability that the first red ball appears on or before the 5th draw, and express your answer in terms of a geometric distribution.
[Difficulty: hard. Tests probability distribution construction, expectation/variance calculation, and a proof about a fundamental identity.]
Solution:
(a) X can take values 0, 1, 2, 3.
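$P(X=0) = \frac{\binom{6}{3}}{\binom{10}{3}} = \frac{20}{120} = \frac{1}{6}$
$P(X=1) = \frac{\binom{4}{1}\binom{6}{2}}{\binom{10}{3}} = \frac{4 \times 15}{120} = \frac{60}{120} = \frac{1}{2}$
$P(X=2) = \frac{\binom{4}{2}\binom{6}{1}}{\binom{10}{3}} = \frac{6 \times 6}{120} = \frac{36}{120} = \frac{3}{10}$
$P(X=3) = \frac{\binom{4}{3}}{\binom{10}{3}} = \frac{4}{120} = \frac{1}{30}$
Verification:
$\frac{1}{6} + \frac{1}{2} + \frac{3}{10} + \frac{1}{30} = \frac{5}{30} + \frac{15}{30} + \frac{9}{30} + \frac{1}{30} = \frac{30}{30} = 1$ ✓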
(b)
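$E(X) = \sum x \cdot P(X=x) = 0\left(\frac{1}{6}\right) + 1\left(\frac{1}{2}\right) + 2\left(\frac{3}{10}\right) + 3\left(\frac{1}{30}\right)$
$= 0 + \frac{1}{2} + \frac{6}{10} + \frac{3}{30} = \frac{1}{2} + \frac{3}{5} + \frac{1}{10} = \frac{5}{10} + \frac{6}{10} + \frac{1}{10} = \frac{12}{10} = 1.2$
Alternative check: $E(X) = n \times \frac{\text{number of red}}{\text{total}} = 3 \times \frac{4}{10} = 1.2$. This confirms our result.
$E(X^2) = 0^2\left(\frac{1}{6}\right) + 1^2\left(\frac{1}{2}\right) + 2^2\left(\frac{3}{10}\right) + 3^2\left(\frac{1}{30}\right)$
$= 0 + \frac{1}{2} + \frac{12}{10} + \frac{9}{30} = \frac{1}{2} + \frac{6}{5} + \frac{3}{10} = \frac{5}{10} + \frac{12}{10} + \frac{3}{10} = \frac{20}{10} = 2$
$\operatorname{Var}(X) = E(X^2) - [E(X)]^2 = 2 - 1.44 = 0.56$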
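The distribution, mean and variance can be cross-checked by building the probabilities from binomial coefficients. This is a sketch under the problem's stated numbers (4 red, 6 blue, 3 draws); the constant and variable names are illustrative.

```python
from fractions import Fraction
from math import comb

RED, BLUE, DRAWS = 4, 6, 3
total_ways = comb(RED + BLUE, DRAWS)  # number of ways to choose 3 balls from 10

# P(X = x): choose x red balls and DRAWS - x blue balls
pmf = {x: Fraction(comb(RED, x) * comb(BLUE, DRAWS - x), total_ways)
       for x in range(DRAWS + 1)}

mean = sum(x * p for x, p in pmf.items())
second_moment = sum(x * x * p for x, p in pmf.items())
variance = second_moment - mean ** 2

print(sum(pmf.values()))              # 1
print(float(mean), float(variance))   # 1.2 0.56
```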
(c) Let A = "at least one red ball is drawn" and B = "no red balls are drawn".
Note that A=B′ (the complement of B). This is always true: "at least one" is the complement of "none."
By the complement rule (which holds for all events):
P(A)=P(B′)=1−P(B)
This is P(at least one red)=1−P(no red).
The complement rule P(B′)=1−P(B) is derived from the addition rule: since B and B′ are mutually exclusive and exhaustive (they form a partition of the sample space):
P(B∪B′)=P(B)+P(B′)=1
So P(B′)=1−P(B).
This holds for any events, regardless of whether individual trials are independent or dependent. The identity depends only on the fact that B and B′ partition the sample space.
(d) With replacement, each draw is independent with $P(\text{red}) = \frac{4}{10} = 0.4$.
Let Y = number of draws up to and including the first red ball. Then $Y \sim \text{Geo}(0.4)$.
$P(Y \le 5) = 1 - P(Y > 5) = 1 - P(\text{first 5 draws are all blue}) = 1 - (0.6)^5$
$= 1 - 0.07776 = 0.92224 \approx 0.922$
Alternatively, summing the geometric probabilities term by term:
$P(Y \le 5) = \sum_{k=1}^{5} (0.6)^{k-1}(0.4) = 0.4 + 0.24 + 0.144 + 0.0864 + 0.05184 = 0.92224$
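Both routes to P(Y≤5) can be compared exactly. The sketch below sums the geometric probabilities and checks the total against the complement form; `geometric_cdf` is an illustrative helper, not a library function.

```python
from fractions import Fraction

def geometric_cdf(p, n):
    """P(Y <= n) where Y counts draws up to and including the first success."""
    return sum((1 - p) ** (k - 1) * p for k in range(1, n + 1))

p_red = Fraction(2, 5)
by_sum = geometric_cdf(p_red, 5)
by_complement = 1 - (1 - p_red) ** 5

print(by_sum == by_complement, float(by_sum))  # True 0.92224
```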
Integration Tests
Tests synthesis of probability with other topics. Requires combining concepts from multiple units.
IT-1: Deriving the Expected Value of a Binomial Distribution (with Statistical Distributions)
Question:
The random variable X follows a binomial distribution X∼B(n,p), where n is a positive integer and 0<p<1.
(a) By writing $X = \sum_{i=1}^{n} X_i$ where each $X_i$ is an indicator random variable for the i-th trial being a success, show that E(X)=np.
(b) Using the fact that $\operatorname{Var}(X_i) = p(1-p)$ for each $X_i$ and that the $X_i$ are independent, show that Var(X)=np(1−p).
(c) A fair coin is tossed 20 times. Using the results from parts (a) and (b), find E(X) and Var(X) where X is the number of heads. Hence find $E(X^2)$.
(d) The random variable Y=3X−5. Find E(Y) and Var(Y).
[Difficulty: hard. Derives the binomial expectation and variance from first principles using indicator variables.]
Solution:
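(a) Define $X_i$ as follows:
$X_i = \begin{cases} 1 & \text{if the } i\text{-th trial is a success} \\ 0 & \text{if the } i\text{-th trial is a failure} \end{cases}$
Each $X_i$ follows a Bernoulli distribution with parameter p:
$E(X_i) = 1 \cdot p + 0 \cdot (1-p) = p$
The total number of successes is $X = X_1 + X_2 + \cdots + X_n = \sum_{i=1}^{n} X_i$.
By the linearity of expectation (which holds regardless of independence):
$E(X) = E\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} E(X_i) = \sum_{i=1}^{n} p = np$
(b) For each $X_i$:
$E(X_i^2) = 1^2 \cdot p + 0^2 \cdot (1-p) = p$
$\operatorname{Var}(X_i) = E(X_i^2) - [E(X_i)]^2 = p - p^2 = p(1-p)$
Since the trials are independent, the $X_i$ are independent random variables. For independent random variables, the variance of the sum equals the sum of the variances:
$\operatorname{Var}(X) = \operatorname{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \operatorname{Var}(X_i) = \sum_{i=1}^{n} p(1-p) = np(1-p)$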
Note: This step requires independence. If the trials were dependent, we would need to add covariance terms, and the result would not simplify to np(1−p).
(c) For a fair coin: n=20, p=0.5.
E(X)=20×0.5=10
Var(X)=20×0.5×0.5=5
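$E(X^2) = \operatorname{Var}(X) + [E(X)]^2 = 5 + 100 = 105$
(d) Using $E(aX+b) = aE(X) + b$ and $\operatorname{Var}(aX+b) = a^2\operatorname{Var}(X)$:
$E(Y) = 3E(X) - 5 = 3(10) - 5 = 25$
$\operatorname{Var}(Y) = 3^2\operatorname{Var}(X) = 9 \times 5 = 45$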
Note that the additive constant −5 has no effect on the variance. This is a common source of error: students sometimes write Var(Y)=Var(3X)+Var(−5), which is incorrect. The variance of a constant is zero.
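The values in parts (c) and (d) can be confirmed without using the formulas np and np(1−p), by computing the moments straight from the binomial probabilities. This is an illustrative sketch; `binomial_pmf` and `mean_and_variance` are hypothetical helper names.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """Exact pmf of B(n, p) as a dict {k: P(X = k)}."""
    return {k: comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)}

def mean_and_variance(pmf, g=lambda x: x):
    """Mean and variance of g(X), computed from the definitions."""
    mean = sum(g(x) * q for x, q in pmf.items())
    variance = sum((g(x) - mean) ** 2 * q for x, q in pmf.items())
    return mean, variance

pmf = binomial_pmf(20, Fraction(1, 2))
ex, var_x = mean_and_variance(pmf)
ey, var_y = mean_and_variance(pmf, lambda x: 3 * x - 5)

print(ex, var_x, ey, var_y)  # 10 5 25 45
```

The variance of Y is 9 times the variance of X, and the shift by −5 changes only the mean.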
IT-2: Solving a Probability Equation with Algebra (with Algebra)
Question:
A fair six-sided die is rolled twice. Let A be the event "the sum of the two scores is greater than k" and let B be the event "at least one of the scores is prime."
(a) Find P(A) and P(B) when k=7.
(b) Find the value of k for which $P(A) = \frac{1}{3}$.
(c) For k=7, determine whether events A and B are independent. Show all your working.
(d) Find the range of values of k for which P(A∩B)<P(A)⋅P(B), and explain the significance of this inequality in terms of dependence.
[Difficulty: hard. Combines probability with algebraic equation solving and formal independence testing.]
Solution:
(a) The sample space has 6×6=36 equally likely outcomes.
For A (sum >7):
| Sum | Outcomes | Count |
|---|---|---|
| 8 | (2,6), (3,5), (4,4), (5,3), (6,2) | 5 |
| 9 | (3,6), (4,5), (5,4), (6,3) | 4 |
| 10 | (4,6), (5,5), (6,4) | 3 |
| 11 | (5,6), (6,5) | 2 |
| 12 | (6,6) | 1 |
$P(A) = \frac{5+4+3+2+1}{36} = \frac{15}{36} = \frac{5}{12}$
For B (at least one prime): Prime numbers on a die are 2, 3, 5.
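$P(\text{prime on one roll}) = \frac{3}{6} = \frac{1}{2}$
$P(B) = 1 - P(\text{no primes}) = 1 - \left(\frac{3}{6}\right)^2 = 1 - \frac{9}{36} = \frac{27}{36} = \frac{3}{4}$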
(Non-prime faces are 1, 4, 6, so 3 out of 6 faces are non-prime.)
(b) We need $P(\text{sum} > k) = \frac{1}{3}$.
The number of outcomes with sum >k must be $\frac{36}{3} = 12$.
Here "sum >k" means strictly greater than k. Counting outcomes for successive values of k:
| k | Sums greater than k | Count |
|---|---|---|
| 6 | 7, 8, 9, 10, 11, 12 | 21 |
| 7 | 8, 9, 10, 11, 12 | 15 |
| 8 | 9, 10, 11, 12 | 10 |
| 9 | 10, 11, 12 | 6 |
The count can only decrease as k increases, and it falls from 15 (at k=7) to 10 (at k=8) without ever equalling 12. So there is no integer value of k for which $P(A) = \frac{1}{3}$.
Allowing non-integer k does not help. The sum is always an integer, so "sum >k" describes the same event as "sum >⌊k⌋", and P(A) can only take the discrete values $\frac{21}{36}, \frac{15}{36}, \frac{10}{36}, \frac{6}{36}, \ldots$ No value of k achieves exactly $\frac{12}{36} = \frac{1}{3}$.
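The conclusion can be checked by enumerating all 36 outcomes and collecting every value that P(sum >k) can take. This is a sketch with illustrative names; integer k from 1 to 12 is enough because a non-integer k gives the same event as its floor.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered pairs of scores
outcomes = list(product(range(1, 7), repeat=2))

def prob_sum_greater(k):
    """P(sum of the two scores > k)."""
    favourable = sum(1 for a, b in outcomes if a + b > k)
    return Fraction(favourable, len(outcomes))

achievable = {prob_sum_greater(k) for k in range(1, 13)}
print(prob_sum_greater(7))            # 5/12
print(Fraction(1, 3) in achievable)   # False
```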
(c) For independence, we need $P(A \cap B) = P(A) \cdot P(B) = \frac{5}{12} \times \frac{3}{4} = \frac{5}{16}$.
We need to count outcomes where the sum is >7 AND at least one score is prime.
Listing outcomes with sum >7 and at least one prime:
- Sum 8: (2,6) [2 prime], (3,5) [both prime], (5,3) [both prime], (6,2) [2 prime]. Note: (4,4) has no prime. That is 4 outcomes.
- Sum 9: (3,6) [3 prime], (4,5) [5 prime], (5,4) [5 prime], (6,3) [3 prime]. All 4 outcomes.
- Sum 10: (4,6) [neither prime], (5,5) [prime], (6,4) [neither prime]. That is 1 outcome.
- Sum 11: (5,6) [5 prime], (6,5) [5 prime]. Both qualify, so that is 2 outcomes.
- Sum 12: (6,6) [neither prime]. That is 0 outcomes.
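$P(A \cap B) = \frac{4+4+1+2+0}{36} = \frac{11}{36}$
$P(A) \cdot P(B) = \frac{5}{12} \times \frac{3}{4} = \frac{5}{16} = \frac{11.25}{36}$
Since $\frac{11}{36} \ne \frac{11.25}{36}$, events A and B are not independent.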
(d) We need P(A∩B)<P(A)⋅P(B), which means the events are negatively dependent: knowing B occurred makes A less likely.
Computing for different values of k:
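For k=7: $P(A \cap B) = \frac{11}{36} \approx 0.306$, $P(A) \cdot P(B) = \frac{5}{16} = 0.3125$. So P(A∩B)<P(A)⋅P(B).
For k=6: $P(A) = \frac{21}{36}$. We need P(A∩B). Outcomes with sum >6 and at least one prime: to the 11 outcomes from part (c), add the sum-7 outcomes (6 of them) that have at least one prime: (1,6) no, (2,5) yes, (3,4) yes, (4,3) yes, (5,2) yes, (6,1) no. That is 4 more. Total: $\frac{11+4}{36} = \frac{15}{36}$. $P(A) \cdot P(B) = \frac{21}{36} \times \frac{3}{4} = \frac{63}{144} = \frac{15.75}{36}$. So P(A∩B)<P(A)⋅P(B).
For k=8: $P(A) = \frac{10}{36}$. P(A∩B): outcomes with sum >8 and at least one prime give $\frac{11-4}{36} = \frac{7}{36}$ (removing the 4 sum-8 outcomes from part (c)). $P(A) \cdot P(B) = \frac{10}{36} \times \frac{3}{4} = \frac{30}{144} = \frac{7.5}{36}$. So P(A∩B)<P(A)⋅P(B).
For k=9: $P(A) = \frac{6}{36}$. P(A∩B): sum >9 with at least one prime = 1+2+0=3 out of 36. $P(A) \cdot P(B) = \frac{6}{36} \times \frac{3}{4} = \frac{18}{144} = \frac{4.5}{36}$. So P(A∩B)<P(A)⋅P(B).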
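For k=10: $P(A) = \frac{3}{36}$. P(A∩B): sum >10 with at least one prime = 2+0=2 out of 36. $P(A) \cdot P(B) = \frac{3}{36} \times \frac{3}{4} = \frac{2.25}{36}$. Since 2<2.25, P(A∩B)<P(A)⋅P(B).
For k=11: $P(A) = \frac{1}{36}$. P(A∩B): sum 12 = (6,6), no prime. P(A∩B)=0. $P(A) \cdot P(B) = \frac{1}{36} \times \frac{3}{4} = \frac{0.75}{36}$. So P(A∩B)<P(A)⋅P(B).
The inequality P(A∩B)<P(A)⋅P(B) therefore holds for every k∈{6,7,8,9,10,11}. The same counting method applied to smaller values shows that it also holds for k=4 but fails for k=2, 3 and 5, where P(A∩B)>P(A)⋅P(B). For k<2 or k≥12 the event A is certain or impossible, and the two sides are equal. The absence of a single unbroken range reflects the discrete nature of the problem.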
When this inequality holds, the events are negatively dependent: knowing that at least one die shows a prime slightly decreases the probability of a large sum. This is because high sums like 10 = (4,6) or 12 = (6,6) often involve only non-prime numbers, creating a slight negative association.
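The case-by-case comparison can be checked by enumerating the 36 outcomes for each k. The sketch below uses illustrative helper names and exact fractions, and returns both sides of the comparison so they can be inspected for any k.

```python
from fractions import Fraction
from itertools import product

PRIMES = {2, 3, 5}
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on an outcome (a, b)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def compare(k):
    """Return (P(A∩B), P(A)·P(B)) for A = 'sum > k', B = 'at least one prime'."""
    in_a = lambda o: o[0] + o[1] > k
    in_b = lambda o: o[0] in PRIMES or o[1] in PRIMES
    joint = prob(lambda o: in_a(o) and in_b(o))
    return joint, prob(in_a) * prob(in_b)

for k in range(2, 12):
    joint, if_independent = compare(k)
    print(k, joint, if_independent, joint < if_independent)
```

Each printed row shows k, P(A∩B), the value P(A)⋅P(B) that independence would require, and whether the strict inequality holds.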
IT-3: Proof of the Addition Rule Using a Venn Diagram (with Proof)
Question:
(a) Using a Venn diagram, prove that for any two events A and B:
P(A∪B)=P(A)+P(B)−P(A∩B)
(b) Hence prove that for three events A, B, and C:
P(A∪B∪C)=P(A)+P(B)+P(C)−P(A∩B)−P(A∩C)−P(B∩C)+P(A∩B∩C)
(c) In a class of 40 students: 18 study Mathematics, 15 study Physics, 12 study Chemistry, 7 study both Mathematics and Physics, 5 study both Mathematics and Chemistry, 4 study both Physics and Chemistry, and 2 study all three subjects. A student is chosen at random. Find the probability that the student studies exactly one of the three subjects.
(d) A student claims that if P(A∪B)=P(A)+P(B), then A and B must be independent. Determine whether this claim is true, justifying with a counterexample or proof.
[Difficulty: hard. Requires a formal proof of the inclusion-exclusion principle and its application to a multi-set problem.]
Solution:
(a) A Venn diagram for two events A and B has four regions:
- A∩B (the overlap)
- A∩B′ (in A only)
- A′∩B (in B only)
- A′∩B′ (in neither)
The event A∪B consists of the first three regions listed above.
Now:
- P(A)=P(A∩B)+P(A∩B′): the probability of being in A is the probability of being in the overlap plus the probability of being in A only.
- P(B)=P(A∩B)+P(A′∩B): similarly for B.
- P(A∪B)=P(A∩B)+P(A∩B′)+P(A′∩B): the probability of being in A or B is the sum over the three regions that make up A∪B.
Each of these holds because the regions are mutually exclusive, so their probabilities add.
Therefore:
P(A)+P(B)=[P(A∩B)+P(A∩B′)]+[P(A∩B)+P(A′∩B)]
=P(A∩B)+P(A∩B′)+P(A′∩B)+P(A∩B)
=P(A∪B)+P(A∩B)
Rearranging:
P(A∪B)=P(A)+P(B)−P(A∩B)
This is the inclusion-exclusion principle for two events. The term P(A∩B) is subtracted because the overlap was counted twice (once in P(A) and once in P(B)).
(b) Apply the two-event formula repeatedly:
P(A∪B∪C)=P((A∪B)∪C)
=P(A∪B)+P(C)−P((A∪B)∩C)
Using the two-event formula for P(A∪B), and the distributive law (A∪B)∩C=(A∩C)∪(B∩C):
=[P(A)+P(B)−P(A∩B)]+P(C)−P((A∩C)∪(B∩C))
Using the two-event formula for P((A∩C)∪(B∩C)), and noting that (A∩C)∩(B∩C)=A∩B∩C:
P((A∩C)∪(B∩C))=P(A∩C)+P(B∩C)−P(A∩B∩C)
Substituting back:
P(A∪B∪C)=P(A)+P(B)−P(A∩B)+P(C)−P(A∩C)−P(B∩C)+P(A∩B∩C)
Rearranging:
P(A∪B∪C)=P(A)+P(B)+P(C)−P(A∩B)−P(A∩C)−P(B∩C)+P(A∩B∩C)
This is the inclusion-exclusion principle for three events.
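The three-event formula can be sanity-checked by brute force on a small finite sample space with equally likely outcomes. This does not replace the proof; the four-point sample space and the helper names are assumptions made purely for illustration.

```python
from fractions import Fraction
from itertools import combinations, product

omega = range(4)  # hypothetical sample space with 4 equally likely outcomes

# Every subset of omega is an event
events = [frozenset(c) for r in range(len(omega) + 1)
          for c in combinations(omega, r)]

def p(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(omega))

def formula_holds(a, b, c):
    lhs = p(a | b | c)
    rhs = (p(a) + p(b) + p(c)
           - p(a & b) - p(a & c) - p(b & c)
           + p(a & b & c))
    return lhs == rhs

# Check all 16^3 = 4096 triples of events
print(all(formula_holds(a, b, c) for a, b, c in product(events, repeat=3)))  # True
```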
(c) Using the inclusion-exclusion formula for three events, or equivalently, constructing the Venn diagram:
Let M = studies Mathematics, P = studies Physics, C = studies Chemistry.
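$P(M) = \frac{18}{40}$, $P(P) = \frac{15}{40}$, $P(C) = \frac{12}{40}$
$P(M \cap P) = \frac{7}{40}$, $P(M \cap C) = \frac{5}{40}$, $P(P \cap C) = \frac{4}{40}$
$P(M \cap P \cap C) = \frac{2}{40}$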
Number studying exactly one subject:
- Only Mathematics: 18−7−5+2=8
- Only Physics: 15−7−4+2=6
- Only Chemistry: 12−5−4+2=5
Total studying exactly one: 8+6+5=19
$P(\text{exactly one}) = \frac{19}{40}$
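The "only one subject" counts follow a fixed pattern: subtract the two pairwise overlaps involving that subject, then add back the triple overlap, which was subtracted twice. A small sketch of that pattern, with illustrative parameter names:

```python
from fractions import Fraction

def exactly_one(n_a, n_b, n_c, n_ab, n_ac, n_bc, n_abc):
    """Number of elements belonging to exactly one of three sets."""
    only_a = n_a - n_ab - n_ac + n_abc
    only_b = n_b - n_ab - n_bc + n_abc
    only_c = n_c - n_ac - n_bc + n_abc
    return only_a + only_b + only_c

# Mathematics, Physics, Chemistry counts from the question
count = exactly_one(18, 15, 12, 7, 5, 4, 2)
print(count, Fraction(count, 40))  # 19 19/40
```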
(d) The claim is false. If P(A∪B)=P(A)+P(B), then from the inclusion-exclusion principle:
P(A)+P(B)=P(A)+P(B)−P(A∩B)
This gives P(A∩B)=0, meaning A and B are mutually exclusive, not independent.
Counterexample: Let P(A)=0.3, P(B)=0.2, P(A∩B)=0. Then P(A∪B)=0.5=0.3+0.2. But for independence we need P(A∩B)=0.3×0.2=0.06≠0. The events are mutually exclusive but not independent.
As shown in UT-1 part (d), independent events with positive probability cannot be mutually exclusive. The student has confused two fundamentally different concepts.