Probability and Statistics: Making Sense of Randomness

What are the odds of flipping ten heads in a row? Most people guess somewhere around "basically impossible," but the real answer is (1/2)^10, exactly 1 in 1,024, or roughly 0.1%. Probability is the branch of mathematics that quantifies uncertainty, and understanding it changes how you interpret everything from medical test results to weather forecasts.

What probability actually means

Probability measures how likely an event is on a scale from 0 (impossible) to 1 (certain). You can express it as a fraction, a decimal, or a percentage — they all mean the same thing. A fair coin has a probability of 1/2 = 0.5 = 50% of landing heads.

There are two common interpretations. Frequentist probability says "if I flip this coin 10,000 times, about 5,000 will be heads." Subjective (Bayesian) probability says "given what I know, I believe there's a 50% chance of heads." Both frameworks agree on the math — they differ on philosophy.
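The frequentist reading can be checked with a short simulation. The sketch below is illustrative only: the function name heads_fraction and the fixed seed are assumptions chosen here so the run is repeatable.

```python
import random

def heads_fraction(n_flips, seed=0):
    """Simulate n_flips fair coin flips and return the fraction that land heads."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

# The fraction of heads tends to settle near 0.5 as the number of flips grows.
for n in (10, 100, 10_000):
    print(n, heads_fraction(n))
```

Short runs can land far from 0.5, while long runs rarely do. That long-run behavior is what the frequentist interpretation describes.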

Independent vs dependent events

Two events are independent when the outcome of one does not affect the other. Each coin flip is independent — the coin has no memory. Two events are dependent when the first outcome changes the second probability. Drawing cards from a deck without replacement is dependent: once you remove the ace of spades, the odds of drawing it next are zero.
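To see how dependence changes the arithmetic, the sketch below compares drawing two aces with and without replacement. The helper name p_two_aces is made up for this illustration, and exact fractions are used so no rounding is involved.

```python
from fractions import Fraction

def p_two_aces(with_replacement):
    """Probability that two cards drawn from a 52-card deck are both aces."""
    first = Fraction(4, 52)
    # With replacement the deck is restored, so the second draw is independent.
    # Without replacement one ace is gone: 3 aces remain among 51 cards.
    second = Fraction(4, 52) if with_replacement else Fraction(3, 51)
    return first * second

print(p_two_aces(True))   # 1/169
print(p_two_aces(False))  # 1/221
```

The second draw's probability depends on what the first draw removed, which is exactly what makes the events dependent.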

The AND / OR rules

These two rules cover most practical probability questions:

AND (both happen): Multiply the probabilities, provided the events are independent. The chance of rolling a 6 on both of two dice: 1/6 × 1/6 = 1/36 ≈ 2.8%.
OR (at least one happens): Add the probabilities, then subtract the overlap so it is not counted twice. The chance of drawing a heart OR a king from a deck: 13/52 + 4/52 − 1/52 = 16/52 ≈ 30.8%, where the 1/52 is the king of hearts.

P(A AND B) = P(A) × P(B)          (independent events)
P(A OR B)  = P(A) + P(B) - P(A AND B)

Example: two dice, both show 6
P = 1/6 × 1/6 = 1/36 ≈ 0.028

Example: coin flip, at least one head in 3 flips
P = 1 - P(all tails) = 1 - (1/2)^3 = 7/8 = 0.875
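One way to sanity-check these rules is to list every equally likely outcome and count the ones that match. The sketch below does that for the three examples above. The helper name probability and the deck encoding are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product

def probability(outcomes, event):
    """Fraction of equally likely outcomes for which event(outcome) is true."""
    outcomes = list(outcomes)
    hits = sum(1 for o in outcomes if event(o))
    return Fraction(hits, len(outcomes))

# AND: two dice both show 6 (36 outcomes, 1 match)
p_double_six = probability(product(range(1, 7), repeat=2), lambda d: d == (6, 6))

# OR: a heart or a king (52 cards; the king of hearts is counted only once)
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
p_heart_or_king = probability(
    product(ranks, suits), lambda card: card[1] == "hearts" or card[0] == "K"
)

# Complement: at least one head in 3 flips (8 outcomes, only TTT fails)
p_any_head = probability(product("HT", repeat=3), lambda flips: "H" in flips)

print(p_double_six)     # 1/36
print(p_heart_or_king)  # 4/13, the reduced form of 16/52
print(p_any_head)       # 7/8
```

Counting outcomes directly avoids the double-counting mistake that the subtraction in the OR rule is there to prevent.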

The normal distribution (bell curve)

Measure the heights of 10,000 adults and plot the results. You get a bell-shaped curve where most values cluster near the average and extreme values are rare. This is the normal distribution, and it shows up, at least approximately, in many places: test scores, measurement errors, even the weight of apples in a harvest. Stock returns are often modeled this way too, although real markets produce extreme moves more often than a bell curve predicts.

Standard deviation: measuring spread

The standard deviation (SD or σ) measures how spread out values are from the mean. A small SD means tightly packed data; a large SD means widely scattered data. In a normal distribution, fixed percentages of data fall within each sigma level:

Range	Coverage	Meaning
±1σ	68.27%	About 2 in 3 values
±2σ	95.45%	About 19 in 20 values
±3σ	99.73%	About 369 in 370 values
±4σ	99.994%	1 in 15,787 outside
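±6σ	99.9999998%	About 2 per billion outside

Six Sigma quality control aims for fewer than 3.4 defects per million opportunities. The name refers to keeping six standard deviations between the process mean and the nearest specification limit. The 3.4 figure is not the ±6σ tail of a normal curve, though: by convention it allows the process mean to drift by 1.5σ over time, which leaves a one-sided 4.5σ tail of about 3.4 per million.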
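The coverage percentages for a normal distribution can be computed from the error function in Python's standard library. This is a minimal sketch: the function names coverage and tail_outside are chosen here for illustration.

```python
import math

def coverage(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

def tail_outside(k):
    """Fraction outside +/- k standard deviations (both tails combined)."""
    # erfc computes 1 - erf directly, which stays accurate for very small tails.
    return math.erfc(k / math.sqrt(2))

for k in (1, 2, 3, 4, 6):
    print(f"+/-{k} sigma: {coverage(k):.7%} inside, "
          f"about 1 in {1 / tail_outside(k):,.0f} outside")
```

Using erfc for the tail avoids subtracting two nearly equal numbers, which matters once the coverage is within a few billionths of 100%.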

Z-scores: locating a value on the curve

A z-score tells you how many standard deviations a value is from the mean. The formula is simple: z = (x − μ) / σ. A z-score of 1.5 means the value is 1.5 standard deviations above average. A z-score of −2 means it is 2 standard deviations below.

Z-scores let you compare apples and oranges. If you scored 85 on a test with mean 70 and SD 10, your z-score is 1.5. If you scored 140 on another test with mean 100 and SD 25, your z-score is 1.6. The second performance was slightly more exceptional, even though the raw numbers look different.
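The two test scores above can be compared with a few lines of code. The function name z_score is an assumption for this sketch. It simply applies z = (x − μ) / σ.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (x - mean) / sd

print(z_score(85, 70, 10))    # 1.5
print(z_score(140, 100, 25))  # 1.6
```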

Confidence intervals

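When a poll reports "52% support, margin of error ±3%," that margin comes from a confidence interval. It means: if we repeated this survey many times and drew a ±3-point interval around each result, about 95% of those intervals would contain the true level of support. For this particular poll the interval runs from 49% to 55%.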

A 95% confidence interval does not mean there is a 95% chance the true value is in the range. The true value is fixed — it either is or is not in the interval. The 95% refers to the method: 95% of intervals constructed this way will capture the true value.
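The claim about the method can be illustrated by simulation. The sketch below invents a "true" support level of 52% (in a real poll this value is unknown), runs many simulated polls, and counts how often the interval built from each poll captures it. The function name, the default parameters, and the use of the simple normal-approximation interval are all assumptions of this example.

```python
import math
import random

def interval_coverage(true_p=0.52, n=1000, trials=2000, z=1.96, seed=1):
    """Fraction of simulated polls whose 95% interval contains true_p."""
    rng = random.Random(seed)
    captured = 0
    for _ in range(trials):
        # One poll: n respondents, each a supporter with probability true_p.
        p_hat = sum(rng.random() < true_p for _ in range(n)) / n
        margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            captured += 1
    return captured / trials

print(interval_coverage())  # expected to land close to 0.95
```

Each individual interval either contains 52% or it does not. The 95% describes how often the procedure succeeds across many polls.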

Sample size and margin of error

Larger samples give narrower intervals. The margin of error is inversely proportional to the square root of the sample size, so to halve the margin you need four times as many respondents. This is why national polls often survey around 1,000 people: that gives roughly a ±3% margin, and quadrupling the sample to 4,000 only improves it to about ±1.5%.

Margin of error ≈ z × sqrt(p(1-p) / n)

For 95% confidence (z = 1.96), p = 0.5, n = 1000:
ME ≈ 1.96 × sqrt(0.25 / 1000)
ME ≈ 1.96 × 0.0158
ME ≈ 0.031  →  ±3.1%
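The same calculation can be wrapped in a small function to see how the margin changes with sample size. The name margin_of_error and its defaults (p = 0.5, z = 1.96) are assumptions taken from the worked example above, not a general-purpose polling tool.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion p estimated from n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (250, 1000, 4000):
    print(n, f"+/-{margin_of_error(n):.1%}")
# 250 +/-6.2%
# 1000 +/-3.1%
# 4000 +/-1.5%
```

Each fourfold increase in sample size cuts the margin in half, which is the square-root relationship at work.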

Putting it together

Probability and statistics form a feedback loop. Probability starts with a known model (a fair die) and predicts outcomes. Statistics starts with observed outcomes (survey data) and infers the model. Standard deviation tells you how far individual values typically stray from the mean, z-scores normalize across different scales, and confidence intervals quantify the uncertainty in your conclusions.