Convergence of Random Events
Life is full of random events.
We learn that multiple coin flips are “independent events”: no matter whether the previous flip was heads or tails, the next flip is 50/50. (So why do they show the last few results at the roulette table? Hint: Don’t play roulette.) We learn that about half of babies are male and half female, so chances are 50/50 that your new little sibling will be a boy or a girl.
I found the answer to “Of my 8 children, what are the chances that 4 are girls and 4 are boys?” counterintuitive. The central limit theorem is crucial to intuition around this question.
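For reference, the exact answer can be worked out by counting. The sketch below assumes each birth is an independent 50/50 event, as in the paragraph above; `choose` and `prob-exactly` are helper names made up for this illustration.

```clojure
;; Sketch: exact probability of k girls among n children,
;; assuming independent births with a 50/50 chance each.
(defn choose
  "Number of ways to pick k items out of n (binomial coefficient)."
  [n k]
  (/ (reduce * (range (inc (- n k)) (inc n)))
     (reduce * (range 1 (inc k)))))

(defn prob-exactly
  "Probability of exactly k successes in n fair 50/50 trials."
  [n k]
  (/ (choose n k) (reduce * (repeat n 2))))

(prob-exactly 8 4)          ; => 35/128
(double (prob-exactly 8 4)) ; => 0.2734375
```

An even split is the single most likely outcome, yet it happens only about 27% of the time.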
When I initially encountered the Monty Hall problem, the correct answer wasn’t obvious or intuitive, but the mathematical explanation is surprisingly understandable. We’ll try here to make the central limit theorem more understandable as well.
Start with a single random event: a value drawn uniformly from [0.0, 1.0):
(rand)
0.9103988080681965
One way to combine random events is to take the average:
(defn avg [nums]
  (/ (reduce + nums) (count nums)))
(avg [0.0 1.0])
0.5
Let’s try taking the average of several events together:
(avg [(rand) (rand)])
0.8719956298125853
(avg [(rand) (rand) (rand)])
0.31990869726481075
This is getting repetitive. We can make the computer repeat for us:
(avg (repeatedly 3 rand))
0.4500087173275038
The more events you average, the closer the result tends to come to 0.5, although any single run can still stray:
(avg (repeatedly 30 rand))
0.624231674168369
(avg (repeatedly 300 rand))
0.488293065684605
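A single run is noisy, so one way to see the trend is to repeat each experiment many times and measure the typical distance from 0.5. This is only a sketch: `mean-distance-from-half` is a hypothetical helper introduced here, and the 2000 repetitions are an arbitrary choice.

```clojure
(defn avg [nums]
  (/ (reduce + nums) (count nums)))

(defn mean-distance-from-half
  "Averages, over `trials` repetitions, how far the mean of
  `n` uniform draws lands from 0.5."
  [n trials]
  (avg (repeatedly trials
                   #(Math/abs (double (- (avg (repeatedly n rand)) 0.5))))))

;; Print the typical distance for 3, 30, and 300 averaged events.
(doseq [n [3 30 300]]
  (println n (mean-distance-from-half n 2000)))
```

The printed distances vary from run to run, but they should shrink as `n` grows.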
Let’s give these ideas names. An event is a single random draw, and a combined event is the average of several events:
(defn event []
  (rand))
(event)
0.802498673187776
(defn combined-event [number-of-events]
  (avg (repeatedly number-of-events event)))
(combined-event 1)
0.63536487577006
(combined-event 2)
0.5721153553426599
(combined-event 5)
0.28081990114408695
Let’s look at a series of these combined events:
(repeatedly 5 #(combined-event 2))
(0.9042643083011265
 0.3689283343189743
 0.45937723312214024
 0.5886647924945406
 0.06224298556658847)
(repeatedly 5 #(combined-event 5))
(0.6250064375225972
 0.5240286249661003
 0.3456921290333647
 0.3152589172232867
 0.813400369663581)
(repeatedly 5 #(combined-event 10))
(0.5793143452539514
 0.3441688533942249
 0.3658907843924105
 0.4894930460579287
 0.5601809662203656)
As we combine a larger number of events, the values cluster more closely to the middle of the original distribution.
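The clustering can be put into numbers. A uniform value on [0.0, 1.0) has variance 1/12, so the average of n independent draws has standard deviation 1/sqrt(12n). The sketch below compares that prediction with a measured spread; `std-dev`, `spread`, and `predicted-spread` are hypothetical helpers added for illustration, and the definitions of `avg` and `combined-event` are repeated so the block runs on its own.

```clojure
(defn avg [nums]
  (/ (reduce + nums) (count nums)))

(defn combined-event [number-of-events]
  (avg (repeatedly number-of-events rand)))

(defn std-dev
  "Population standard deviation of a collection of numbers."
  [nums]
  (let [mean (avg nums)
        squared-diffs (map #(let [d (- % mean)] (* d d)) nums)]
    (Math/sqrt (avg squared-diffs))))

(defn spread
  "Measured standard deviation of `samples` draws of (combined-event n)."
  [n samples]
  (std-dev (repeatedly samples #(combined-event n))))

(defn predicted-spread
  "Theoretical standard deviation of the average of n uniform draws."
  [n]
  (Math/sqrt (/ 1.0 (* 12 n))))

;; Print n, the measured spread, and the predicted spread side by side.
(doseq [n [1 2 5 10 100]]
  (println n (spread n 10000) (predicted-spread n)))
```

The measured and predicted columns should agree to about two decimal places, and both shrink in proportion to 1/sqrt(n).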
And regardless of the shape of the original event distribution (as long as its variance is finite), the distribution of the combined result approaches the normal distribution as more and more events are combined. The normal distribution is the one shape toward which these averages, suitably centered and rescaled, converge.
This is true for both continuous variables (like (rand)) and discrete variables (like dice, (rand-nth [1 2 3 4 5 6])), and it’s true even for oddly shaped distributions. When you combine enough of them, they take on the character of the bell-shaped curve.
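To see this with dice, here is a rough sketch that tallies the averages of several dice and prints a text histogram. The names `roll-die`, `dice-average`, `histogram`, and `print-histogram` are made up for this example, and the choice of one star per 50 samples is arbitrary.

```clojure
(defn avg [nums]
  (/ (reduce + nums) (count nums)))

(defn roll-die []
  (rand-nth [1 2 3 4 5 6]))

(defn dice-average [number-of-dice]
  (avg (repeatedly number-of-dice roll-die)))

(defn histogram
  "Sorted map from each observed dice average to how often it occurred
  in `samples` trials."
  [number-of-dice samples]
  (->> (repeatedly samples #(double (dice-average number-of-dice)))
       frequencies
       (into (sorted-map))))

(defn print-histogram
  "Prints one row per observed average, with one star per 50 samples."
  [number-of-dice samples]
  (doseq [[value n] (histogram number-of-dice samples)]
    (println (format "%5.2f" value) (apply str (repeat (quot n 50) "*")))))

(print-histogram 1 10000) ; a single die: roughly flat
(print-histogram 4 10000) ; four dice averaged: a bell shape begins to appear
```

With one die the rows should be about equally long; with four dice the rows near 3.5 should be the longest and the extremes near 1 and 6 nearly empty.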
Learn more at 3Blue1Brown: But what is the Central Limit Theorem?