Convergence of Random Events
Life is full of random events.
We learn that multiple coin flips are “independent events” – no matter whether the previous flip was heads or tails, the next flip is 50/50. (So why do they show the last few results at the roulette table? Hint: Don’t play roulette.) We learn that about half of babies are male and half female, so chances are 50/50 that your new little sibling will be a boy or a girl.
I found the answer to “Of my 8 children, what are the chances that 4 are girls and 4 are boys?” counterintuitive. The central limit theorem is crucial to intuition around this question.
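For reference, the exact answer can be found by counting. Assuming each child is independently a girl or a boy with probability 1/2 (the same simplification used above), the chance of exactly k girls among n children is the number of ways to choose which k are girls, divided by 2^n. Here is a minimal sketch; the helper names `choose` and `prob-exactly` are made up for illustration:

```clojure
;; Hypothetical helpers for illustration; assumes independent 50/50 births.
(defn choose
  "Number of ways to pick k items out of n."
  [n k]
  (/ (reduce * (range (inc (- n k)) (inc n)))
     (reduce * (range 1 (inc k)))))

(defn prob-exactly
  "Probability of exactly k girls among n children, as an exact ratio."
  [n k]
  (/ (choose n k) (reduce * (repeat n 2))))

(prob-exactly 8 4)          ; => 35/128
(double (prob-exactly 8 4)) ; => 0.2734375
```

So an even 4/4 split is the single most likely outcome, yet it happens only about 27% of the time.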
When I initially encountered the Monty Hall problem, the correct answer wasn’t obvious or intuitive, but the mathematical explanation is surprisingly understandable. We’ll try here to make the central limit theorem more understandable as well.
Start with a single random event – a value drawn uniformly from [0.0, 1.0):
(rand)
0.9444741798633549
One way to combine random events is to take the average:
(defn avg [nums]
  (/ (reduce + nums) (count nums)))
(avg [0.0 1.0])
0.5
Let’s try taking the average of several events together:
(avg [(rand) (rand)])
0.2260346609043261
(avg [(rand) (rand) (rand)])
0.5777898105446688
This is getting repetitive. We can make the computer repeat for us:
(avg (repeatedly 3 rand))
0.718099653009579
The more events you average, the closer the result tends to come to 0.5:
(avg (repeatedly 30 rand))
0.42449078088020803
(avg (repeatedly 300 rand))
0.5030610184636088
Let’s give these steps names: a single event, and a combined event that averages several of them:
(defn event []
  (rand))
(event)
0.3890018948970616
(defn combined-event [number-of-events]
  (avg (repeatedly number-of-events event)))
(combined-event 1)
0.30888003329596103
(combined-event 2)
0.5193090196027024
(combined-event 5)
0.41497234661446525
Let’s look at a series of these combined events:
(repeatedly 5 #(combined-event 2))
(0.40846261646800947
 0.6034609724398203
 0.5100203767753714
 0.5715178565795758
 0.7643895475048696)
(repeatedly 5 #(combined-event 5))
(0.5504639686792496
 0.29688596633947595
 0.6381304902703808
 0.46521032488771963
 0.3726026061621697)
(repeatedly 5 #(combined-event 10))
(0.31571650141595553
 0.4779976291697417
 0.5323091524540302
 0.40372033455175577
 0.48422141387833334)
As we combine a larger number of events, the values cluster more closely to the middle of the original distribution.
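The clustering can be measured rather than eyeballed. The sketch below repeats the `avg` and `combined-event` definitions so it runs on its own, and adds two helpers, `std-dev` and `spread`, whose names are invented here for illustration. It assumes the standard deviation is a reasonable measure of how tightly values cluster.

```clojure
;; Definitions repeated from above so this block runs on its own.
(defn avg [nums]
  (/ (reduce + nums) (count nums)))

(defn combined-event [number-of-events]
  (avg (repeatedly number-of-events rand)))

;; Hypothetical helper: population standard deviation of a collection.
(defn std-dev [nums]
  (let [mean (avg nums)]
    (Math/sqrt (avg (map #(let [d (- % mean)] (* d d)) nums)))))

;; Hypothetical helper: spread of `samples` combined events,
;; each one averaging n single events.
(defn spread [n samples]
  (std-dev (repeatedly samples #(combined-event n))))

(spread 1 2000)   ; around 0.29
(spread 10 2000)  ; around 0.09
(spread 100 2000) ; around 0.03
```

The exact numbers change on every run, but the pattern should hold: averaging ten times as many events shrinks the spread by a factor of roughly 3, the square root of 10.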
And regardless of the shape of the original event distribution (as long as its variance is finite), the shape of the combined result’s distribution approaches the normal distribution as more and more events are combined – it’s the one shape toward which these combinations converge.
This is true for both continuous variables (like (rand)) and discrete variables (like dice: (rand-nth [1 2 3 4 5 6])), and it’s true even for oddly shaped distributions. When you combine enough of them, they take on the character of the bell-shaped curve.
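The dice case is easy to try. The sketch below tallies the totals of several dice rolled together; `roll`, `dice-total`, and `tally` are hypothetical helper names chosen for this illustration, and totals are used instead of averages only to keep the buckets as whole numbers.

```clojure
;; Hypothetical helpers for illustration.
(defn roll []
  (rand-nth [1 2 3 4 5 6]))

(defn dice-total [number-of-dice]
  (reduce + (repeatedly number-of-dice roll)))

;; Count how often each total shows up across `samples` throws.
(defn tally [number-of-dice samples]
  (into (sorted-map)
        (frequencies (repeatedly samples #(dice-total number-of-dice)))))

(tally 1 6000)  ; roughly flat: each face about equally often
(tally 2 6000)  ; a triangle peaked at 7
(tally 10 6000) ; a bell shape centered near 35

;; Rough text histogram: one star per 20 occurrences.
(doseq [[total n] (tally 10 6000)]
  (println (format "%2d %s" total (apply str (repeat (quot n 20) "*")))))
```

One die is flat and two dice form a triangle, but by ten dice the tally already looks a lot like the bell-shaped curve.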
Learn More at 3Blue1Brown - But what is the Central Limit Theorem?