
Chancing It

Over the years, some very smart people have thought they understood the rules of chance—only to fail dismally. Whether you call it probability, risk, or uncertainty, the workings of chance often defy common sense. Fortunately, advances in math and science have revealed the laws of chance, and understanding those laws can help in your everyday life.
In Chancing It, award-winning scientist and writer Robert Matthews shows how to understand the laws of probability and use them to your advantage. He gives you access to some of the most potent intellectual tools ever developed and explains how to use them to guide your judgments and decisions. By the end of the book, you will know:

  • How to understand and even predict coincidences
  • When an insurance policy is worth having
  • Why "expert" predictions are often misleading
  • How to tell when a scientific claim is a breakthrough or baloney
  • When it makes sense to place a bet on...

    Praise for Chancing It
    ‘Beguiling … Matthews has the knack of explaining things clearly for the non-specialist … his enthusiasm contributes to a lively and fascinating narrative’ Ian Critchley, Sunday Times
    ‘Outstanding … an extraordinary writer … At a time when mathematics needs charismatic ambassadors more than ever, Matthews has written a book of great significance.’ Oliver Moody, The Times
    ‘…an excellent handbook for addressing the seemingly illogical logic of chance’ Fortean Times
    ‘Masterly … the book equips decision-makers with the tools to arrive at rational decisions that mitigate risks and provide optimal solutions’ The Hindu
    Praise for Why Don’t Spiders Stick to Their Webs?
    ‘Matthews gives us his wisdom like a beneficent and well-read uncle, entertaining his guests at the dinner table.’ Brian Clegg, Popular Science Books
    Praise for 25 Big Ideas
    ‘Robert Matthews has a gift for finding the simple, fascinating stories at the heart of concepts transforming the modern world.’ John Rennie, former Editor, Scientific American
    By the same author
    Unravelling the Mind of God:
    Mysteries at the Frontiers of Science
    25 Big Ideas:
    The Science that’s Changing our World
    Why Don’t Spiders Stick to their Webs?
    And Other Everyday Mysteries of Science
    ROBERT MATTHEWS is one of Britain’s most experienced and successful science writers. Numerous awards for his writing include the Association of British Science Writers’ Feature Writer of the Year. His published work includes Unravelling the Mind of God: Mysteries at the Frontiers of Science (Virgin, 1992), 25 Big Ideas: The Science that’s Changing our World (Oneworld, 2005), and Why Don’t Spiders Stick to their Webs? And Other Everyday Mysteries of Science (Oneworld, 2007). He is currently a Visiting Professor at Aston University, where he specialises in probability and statistics.
    First Skyhorse Publishing edition 2017
    First published in Great Britain in 2016 by
    3 Holford Yard
    Bevin Way
WC1X 9HD
    Copyright © Robert Matthews, 2016, 2017
    Foreword © Larry Gonick, 2017
    1 3 5 7 9 10 8 6 4 2
    Typeset in Plantin by MacGuru Ltd
    The moral right of the author has been asserted.
    All rights reserved. No part of this book may be reproduced in any manner without the express written consent of the publisher, except in the case of brief excerpts in critical reviews or articles. All inquiries should be addressed to Skyhorse Publishing, 307 West 36th Street, 11th Floor, New York, NY 10018.
    Skyhorse Publishing books may be purchased in bulk at special discounts for sales promotion, corporate gifts, fund-raising, or educational purposes. Special editions can also be created to specifications. For details, contact the Special Sales Department, Skyhorse Publishing, 307 West 36th Street, 11th Floor, New York, NY 10018 or
    Skyhorse® and Skyhorse Publishing® are registered trademarks of Skyhorse Publishing, Inc.®, a Delaware corporation.
    Visit our website at
    10 9 8 7 6 5 4 3 2 1
    Library of Congress Cataloging-in-Publication Data is available on file.
    Cover design by Rain Saukas
    Cover photo credit: iStock
    ISBN 978-1-5107-2379-5
    eISBN 978-1-5107-2381-8
    Printed in the United States of America
    Foreword by Larry Gonick
    1. The coin-tossing prisoner of the Nazis
    2. What the Law of Averages really means
    3. The dark secret of the Golden Theorem
    4. The First Law of Lawlessness
    5. What are the chances of that?
    6. Thinking independently is no yolk
    7. Random lessons from the lottery
    8. Warning: there’s a lot of X about
    9. Why the amazing so often turns ho-hum
    10. If you don’t know, go random
    11. Doing the right thing isn’t always ethical
    12. How a lot of bull sparked a revolution
    13. How to beat casinos at their own game
    14. Where wise-guys go wrong
    15. The Golden Rule of Gambling
    16. Insure it – or chance it?
    17. Making better bets in the Casino of Life
    18. Tell me straight, doc – what are my chances?
    19. This is not a drill! Repeat: this is not a drill!
    20. The miraculous formula of Reverend Bayes
    21. When Dr Turing met Reverend Bayes
    22. Using Bayes to be a better judge
    23. A scandal of significance
    24. Dodging the Amazing Baloney Machine
    25. Making use of what you already know
    26. I’m sorry, professor, I just don’t buy it
    27. The Amazing Curve for Everything
    28. The dangers of thinking everything’s Normal
    29. Ugly sisters and evil twins
    30. Going to extremes
    31. See a Nicolas Cage movie and die
    32. We’ve got to draw the line somewhere
    33. Playing the markets isn’t rocket science
    34. Beware geeks bearing models
    For Denise
    The smartest person I know,
    who unaccountably took a chance on me.
    by Larry Gonick
    We are an intuitively statistical species, or so I used to think. People who drive to work have a good sense of how random fluctuations in traffic will affect their commute time. We have a pretty good sense of whether to bring an umbrella if there’s a chance of rain. We’re comfortable with random variations in the size of apples and cucumbers.
    On the other hand, I am a terrible poker player.
    As recent psychological research has revealed, my experience with poker is typical. People are lousy statisticians. Confronted with risk and uncertainty, we make terrible decisions. We might take some solace in the thought that statistics is a mathematical discipline practiced by adepts better equipped to evaluate uncertainty than the rest of us mentally unwashed, but this “fact” also turns out to be wrong. Geeks with sophisticated statistical models have blown up world markets. Social scientists and medical researchers, all trained in statistics, have published—in peer-reviewed journals!—decades’ worth of studies with breakthrough, headline-making results that later turn out to be false. Education experts have decided that small schools are better, a conclusion at odds with statistics, which would have told them, had they thought about it, that small schools are simply more variable. If a disproportionate number of the best fifty schools (by some measure) are small, well then, a disproportionate number of the worst fifty schools will also be small. The “distribution is flatter,” but this hasn’t stopped governments and foundations from pouring billions into making small schools, not so different from my uncle Harry, who blew most of his family’s assets trying to beat the stock market. Science has spoken.
Are we then no better than an army of impulsive chimpanzees with hyperdeveloped prefrontal cortexes? Is there no hope? This is the question explored by Robert Matthews, an incurable optimist who, by revealing the reasons for the crisis of replication, the case of the vanishing breakthrough, and other statistical malpractice, guides us to more sensible tools for interpreting the uncertainty we all inevitably face in life.
    The scientific conclusion that people are lousy statisticians may turn out to be false, after all.
One Sunday afternoon in April 2004, a 32-year-old Englishman walked into the Plaza Hotel & Casino in Las Vegas with all his worldly possessions. They amounted to a change of underwear and a cheque. Ashley Revell had sold everything he owned to raise the $135,300 sum printed on the cheque; even the tuxedo he wore was hired. After exchanging the cheque for a depressingly small heap of chips, Revell headed for a roulette table, and did something extraordinary. He bet the lot on a single event: that when the little white ball came to rest, it would end up on red.
    Revell’s decision to choose that colour may have been impulsive, but the event itself wasn’t. He’d planned it for months. He’d talked about it with friends, who thought it was a brilliant idea, and with his family, who didn’t. Nor did some of the casinos; they may well have been fearful of going down in Vegas folklore as The Casino Where One Man Bet Everything And Lost. The manager of the Plaza certainly looked solemn as Revell placed the chips on the table, and asked him whether he was certain he wanted to go ahead. But nothing seemed likely to deter Revell. Surrounded by a large gathering of onlookers he waited anxiously as the croupier put the ball into the wheel. Then in one swift motion he stepped forward and put all his chips down on red. He watched as the ball slowed, spiralled in and bounced in and out of various pockets, and then came to rest – in pocket number 7. Red.
    In that moment Revell doubled his net worth to $270,600. The crowd cheered, and his friends hugged him – and his father ruefully declared him ‘a naughty boy’. Most people would probably take a harsher view of Revell’s actions that day: at best ill-advised, certainly rash and possibly insane. For surely even billionaires for whom such sums are loose change would not have punted the lot on one bet. Would not any rational person have divided up such a sum into smaller wagers, to at least check whether Lady Luck was in town?
    But here’s the thing: having decided to do it, Revell had done precisely the right thing. The laws of probability show that there is no surer way of doubling your net worth at a casino than to do what he did, and bet everything on one spin of the wheel. Yes, the game is unfair: the odds in roulette are deliberately – and legally – tilted against you. Yes, there was a better than 50 per cent chance of losing the lot. Yet bizarre as it may seem, in such situations the best strategy is to bet boldly and big. Anything more timid cuts the chances of success. Revell had proved this himself in the run-up to the big bet. Over the previous few days he’d punted several thousand dollars on bets in the casino, and succeeded only in losing $1,000. His best hope of doubling his money lay in swapping ‘common sense’ for the dictates of the laws of probability.
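Revell's two options can be compared with a short simulation. This is a sketch, not a calculation from the book: it assumes an American wheel (18 red pockets out of 38) and, for the 'timid' alternative, a hypothetical strategy of betting a tenth of the starting bankroll per spin until either doubled or broke.

```python
import random

P_RED = 18 / 38  # American roulette: 18 red pockets out of 38

def bold(rng):
    """Stake the whole bankroll on a single spin: double or nothing."""
    return rng.random() < P_RED

def timid(rng):
    """Bet one-tenth of the starting bankroll per spin until doubled or broke."""
    units = 10  # bankroll in tenths; target is 20 units, ruin is 0
    while 0 < units < 20:
        units += 1 if rng.random() < P_RED else -1
    return units == 20

rng = random.Random(42)  # fixed seed so the run is reproducible
trials = 20_000
p_bold = sum(bold(rng) for _ in range(trials)) / trials
p_timid = sum(timid(rng) for _ in range(trials)) / trials
print(f"P(double your money) betting everything once: {p_bold:.3f}")
print(f"P(double your money) betting in small steps : {p_timid:.3f}")
```

The bold bet succeeds close to 18/38 of the time, while the small-stakes strategy gives the house edge many more chances to bite, and does markedly worse.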
    So should we all follow Revell’s example, sell everything we own and head for the nearest casino? Of course not; there are much better, if more boring, ways of trying to double your money. Yet one thing’s for certain: they’ll all involve probability in one of its many guises: as chance, uncertainty, risk or degree of belief.
    We all know there are few certainties in life except death and taxes. But few of us are comfortable in the presence of chance. It threatens whatever sense we have of being in control of events, suggesting we could all become what Shakespeare called ‘Fortune’s fool’. It has prompted many to believe in fickle gods, and others to deny its primacy: Einstein famously refused to believe that God plays dice with the universe. Yet the very idea of making sense of chance seems oxymoronic: surely randomness is, by definition, beyond understanding? Such logic may underpin one of the great mysteries of intellectual history: why, despite its obvious usefulness, did a reliable theory of probability take so long to emerge? While games of chance were being played in Ancient Egypt over 5,500 years ago, it wasn’t until the seventeenth century that a few daring thinkers seriously challenged the view summed up by Aristotle that ‘There can be no demonstrative knowledge of chance’.
    It hardly helps that chance so often defies our intuitions. Take coincidences: roughly speaking, what are the chances of a football match having two players with birthdays within a day of each other? As there are 365 days in a year and 22 players, one might put the chances at less than 1 in 10. In fact, the laws of probability reveal the true answer to be around 90 per cent. Don’t believe it? Then check the birthdays of those playing in some football games, and see for yourself. Even then, it is hard to avoid thinking something odd is going on. After all, if you find yourself in a similar-sized crowd and ask whether anyone shares your birthday, you’re very unlikely to find a match. Even simple problems about coin-tosses and dice seem to defy common sense. Given that a coin is fair, surely tossing heads several times on the trot makes tails more likely? If you’re struggling to see why that’s not true, don’t worry: one of the great mathematicians of the Enlightenment never got it.
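The birthday claim is easy to check without visiting a stadium. The Monte Carlo sketch below assumes 365 equally likely birthdays (no leap years) and counts 'within a day' as the same or an adjacent day, with the calendar wrapping across New Year; it produces a figure in the region the text describes, the exact value depending on how the matching rule is drawn.

```python
import random

def near_birthday_match(n_people=22, days=365, rng=random):
    """True if any two people have birthdays on the same or adjacent days."""
    bdays = sorted(rng.randrange(days) for _ in range(n_people))
    gaps = [bdays[i + 1] - bdays[i] for i in range(n_people - 1)]
    gaps.append(days - bdays[-1] + bdays[0])  # gap across New Year
    return min(gaps) <= 1

rng = random.Random(1)
trials = 20_000
hits = sum(near_birthday_match(rng=rng) for _ in range(trials))
print(f"Estimated chance of a near-shared birthday among 22 players: {hits / trials:.2f}")
```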
    One aim of this book is to show how to understand such everyday manifestations of chance by revealing their underlying laws and how to apply them. We will see how to use these laws to predict coincidences, make better decisions in business and in life, and make sense of everything from medical diagnoses to investment advice.
    But this is not just a book of top tips and handy hints. My principal goal is to show how the laws of probability are capable of so much more than just understanding chance events. They are also the weapon of choice for anyone faced with turning evidence into insight. From the identification of health risks and new drugs for dealing with them to advances in our knowledge of the cosmos, the laws of probability have proved crucial in separating random dross from evidential gold.
    Now another revolution is under way, one which centres on the laws of probability themselves. It has become clear that in the quest for knowledge these laws are even more powerful than previously thought. But accessing this power demands a radical reinterpretation of probability – one which until recently provoked bitter argument. That decades-long controversy is now fading in the face of evidence that so-called Bayesian methods can transform science, technology and medicine. So far, little of all this has reached the public. In this book I tell the often astonishing story of how these techniques emerged, the controversy they provoked and how we can all use them to make sense of everything from weather forecasts to the credibility of new scientific claims.
    Anyone wanting to wield the power-tools of probability must, however, always be aware that they can be pushed too far. Bad things happen when they’re abused. For decades, statisticians have warned about fundamental flaws in the methods used by researchers to test whether a new finding is just a fluke, or worth taking seriously. Long dismissed as pedantry, those warnings are now central to understanding a scandal that threatens the very future of scientific progress: the replication crisis. In disciplines ranging from medicine and genetics to psychology and economics, researchers are finding that many ‘statistically significant’ discoveries simply vanish when re-examined. This is now casting doubt on findings that have become embedded in the research literature, textbooks, and even government policy. This book is the first to explain both the nature of the scandal and show how to tell when research claims are being pushed too far, and what the truth is more likely to be. In doing so, it draws on my own academic research into the subject, which I began in the late 1990s after encountering the ‘vanishing breakthrough’ phenomenon as a science journalist.
    The need to understand chance, risk and uncertainty has never been more urgent. In the face of political upheaval, turmoil in financial markets and an endless litany of risks, threats and calamities, we all crave certainty. In truth, it never existed. But that is no reason for fatalism – or for refusing to accept reality.
    The central message of this book is that while we can never be free of chance, risk and uncertainty, they all follow rules which can be turned to our advantage.
    The coin-tossing prisoner of the Nazis
    In the spring of 1940, John Kerrich set out from his home to visit his in-laws – no small undertaking, given that he lived in South Africa and they were in Denmark 12,000 kilometres away. And the moment he arrived in Copenhagen he must have wished he’d stayed at home. Just days earlier, Denmark had been invaded by Nazi Germany. Thousands of troops swarmed over the border in a devastating demonstration of blitzkrieg. Within hours the Nazis had overwhelmed the opposition and taken control. Over the weeks that followed, they set about arresting enemy aliens and herding them into internment camps. Kerrich was soon among them.
    It could have been worse. He found himself in a camp in Jutland run by the Danish government, which was, he later reported, run in a ‘truly admirable way’.1 Even so, he knew he faced many months and possibly years devoid of intellectual stimulation – not a happy prospect for this lecturer in mathematics from the University of Witwatersrand. Casting around for something to occupy his time, he came up with an idea for a mathematical project that required minimal equipment but which might prove instructive to others. He decided to embark on a comprehensive study of the workings of chance via that most basic of its manifestations: the outcome of tossing a coin.
    Kerrich was already familiar with the theory developed by mathematicians to understand the workings of chance. Now, he realised, he had a rare opportunity to put that theory to the test on a lot of simple, real-life data. Then once the war was over – presuming, of course, he outlived it – he’d be able to go back to university equipped not only with the theoretical underpinning for the laws of chance, but also hard evidence for its reliability. And that would be invaluable when explaining the notoriously counter-intuitive predictions of the laws of chance to his students.
    Kerrich wanted his study to be as comprehensive and reliable as possible, and that meant tossing a coin and recording the result for as long as he could bear. Fortunately, he found someone willing to share the tedium, a fellow internee named Eric Christensen. And so together they set up a table, spread a cloth on it and, with a flick of a thumb, tossed a coin about 30 centimetres into the air.
    For the record, it came down tails.
    Many people probably think they could guess how things went from there. As the number of tosses increases, the well-known Law of Averages would ensure that the numbers of heads and tails would start to even out. And indeed, Kerrich found that by the 100th toss, the numbers of heads and tails were pretty similar: 44 heads versus 56 tails.
    But then something odd started to happen. As the hours and coin-tosses rolled by, heads started to pull ahead of tails. By the 2,000th toss, heads had built up a lead of 26 over tails. By the 4,000th toss, the difference had more than doubled, to 58. The discrepancy seemed to be getting bigger.
    By the time Kerrich called a halt – at 10,000 tosses – the coin had landed heads-up 5,067 times, exceeding the number of tails by the hefty margin of 134. Far from disappearing, the discrepancy between heads and tails had continued to grow. Was there something wrong with the experiment? Or had Kerrich discovered a flaw in the Law of Averages? Kerrich and Christensen had done their best to rule out biased tosses, and when they crunched the numbers, they found the Law of Averages had not been violated at all. The real problem was not with the coin, nor with the law, but with the commonly held view of what it says. Kerrich’s simple experiment had in fact done just what he wanted. It had demonstrated one of the big misconceptions about the workings of chance.
    Asked what the Law of Averages states, many people say something along the lines of ‘In the long run, it all evens out’. As such, the law is a source of consolation when we have a run of bad luck, or our enemies seem on the ascendant. Sports fans often invoke it when on the receiving end of anything from a lost coin-toss to a bad refereeing decision. Win some, lose some – in the end, it all evens out.
    Well, yes and no. Yes, there is indeed a Law of Averages at work in our universe. Its existence hasn’t merely been demonstrated experimentally; it’s been proved mathematically. It applies not only in our universe, but in every universe with the same rules of mathematics; not even the laws of physics can claim that. But no, the law doesn’t imply ‘it all evens out in the end’. As we’ll see in later chapters, precisely what it does mean took some of the greatest mathematicians of the last millennium a huge amount of effort to pin down. They still argue about the law, even now. Admittedly, mathematicians often demand a level of precision the rest of us would regard as ludicrously pedantic. But in this case, they are right to be picky. For knowing precisely what the Law of Averages says turns out to be one of the keys to understanding how chance operates in our world – and how to turn that understanding to our advantage. And the key to that understanding lies in establishing just what we mean by ‘It all evens out in the end’. In particular, what, exactly, is ‘it’?
    This sounds perilously like an exercise in philosophical navel-gazing, but Kerrich’s experiment points us towards the right answer. Many people think the ‘it’ which evens out in the long run is the raw numbers of heads and tails.
    So why did the coin produce far more of one outcome than another? The short answer is: because blind, random chance was acting on each coin-toss, making an exact match in the raw numbers of heads and tails ever more unlikely. So what happened to the Law of Averages? It’s alive and well; the thing is, it just doesn’t apply to the raw numbers of heads and tails. Pretty obviously, we cannot say how individual chance events will turn out with absolute certainty. But we can say something about them if we drop down to a slightly lower level of knowledge – and ask what chance events will do on average.
    In the case of the coin-toss, we cannot say with certainty when we’ll get ‘heads’ or ‘tails’, or how many we’ll get of each. But given that there are just two outcomes and they’re equally likely, we can say they should pop up with equal frequency – namely, 50 per cent of the time.
    And this, in turn, shows exactly what ‘it’ is that ‘evens out in the long run’. It’s not the raw numbers of heads and tails, about which we can say nothing with certainty. It is their relative frequencies: the number of times each pops up, as a proportion of the total number of opportunities we give them to do so.
This is the real Law of Averages, and it’s what Kerrich and Christensen saw at work in their experiment. As the tosses mounted up, the relative frequencies of heads and tails – that is, their numbers divided by the total number of tosses – got ever closer. By the time the experiment finished, these frequencies were within 1 per cent of being identical (50.67 per cent heads versus 49.33 per cent tails). In stark contrast, the raw numbers of heads and tails grew ever farther apart (see table).
The real Law of Averages, and what really ‘all evens out in the end’

Tosses     Heads    Tails    Difference    Heads (%)
   100        44       56            12        44.00
 2,000     1,013      987            26        50.65
 4,000     2,029    1,971            58        50.73
10,000     5,067    4,933           134        50.67
    The Law of Averages tells us that if we want to understand the action of chance on events, we should focus not on each individual event, but on their relative frequencies. Their importance is reflected in the fact they’re often regarded as a measure of that most basic feature of all chance events: their probability.
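The pattern Kerrich recorded can be reproduced in a few lines of simulation, a sketch using a pseudo-random 'fair coin'. The exact figures vary from run to run, but the combination of a drifting raw difference and a converging relative frequency is typical:

```python
import random

rng = random.Random(7)  # fixed seed; any seed shows the same qualitative pattern
heads = 0
checkpoints = {100, 2_000, 4_000, 10_000}  # the milestones Kerrich reported

for toss in range(1, 10_001):
    heads += rng.random() < 0.5  # True counts as 1 head
    if toss in checkpoints:
        tails = toss - heads
        print(f"{toss:>6} tosses: |heads - tails| = {abs(heads - tails):>3}, "
              f"frequency of heads = {heads / toss:.4f}")
```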
    Is a coin-toss really fair?
A coin-toss is generally regarded as random, but how the coin lands can be predicted – in theory, at least. In 2008, a team from the Technical University of Łódź, Poland,2 analysed the mechanics of a realistic coin tumbling under the influence of air resistance. The theory is very complex, but it revealed that the coin’s behaviour is predictable until it strikes the floor. Then ‘chaotic’ behaviour sets in, with just small differences producing radically different outcomes. This in turn suggested that coin-tosses caught in mid-air may have a slight bias. This possibility has also been investigated by a team led by mathematician Persi Diaconis of Stanford University.3 They found that coins that are caught do have a slight tendency to end up in the same state as they start. The bias is, however, incredibly slight. So the outcome of tossing a coin can indeed be regarded as random, whether caught in mid-air or allowed to bounce.
So, for example, if we roll a die a thousand times, random chance is very unlikely to lead to the numbers 1 to 6 appearing precisely the same number of times; that’s a statement about individual outcomes, about which we can say nothing with certainty. But, thanks to the Law of Averages, we can expect each of the six outcomes to appear in around 1/6th of all the rolls – with their relative frequencies getting ever closer to that exact proportion the more rolls we perform. That exact proportion is what we call the probability of each number appearing (though, as we’ll see later, it’s not the only way of thinking of probability). For some things – like a coin, a die or a pack of cards – we can get a handle on the probability from the fundamental properties that govern the various outcomes (the number of sides, court cards, etc.). Then we can say that, in the long run, the relative frequencies of the outcomes should get ever closer to that probability. And if they don’t, we can start to wonder about why our beliefs have proved ill-founded.
    The Law of Averages tells us that when we know – or suspect – we’re dealing with events that involve an element of chance, we should focus not on the events themselves, but on their relative frequency – that is, the number of times each event comes up as a proportion of the total number of opportunities to do so.
    What the Law of Averages really means
    The Law of Averages warns us that when dealing with chance events, it’s their relative frequencies, not their raw numbers, we should focus on. But if you’re struggling to give up the idea that it’s the raw numbers that ‘even out in the long run’, don’t beat yourself up; you’re in good company. Jean-Baptiste le Rond d’Alembert, one of the great mathematicians of the Enlightenment, was sure that a run of heads while tossing a coin made tails ever more likely.
    Even today, many otherwise savvy people throw good money after bad in casinos and bookmakers in the belief that a run of bad luck makes good luck more likely. If you’re still struggling to abandon the belief, then turn the question around, and ask yourself this: why should the raw numbers of times that, say, the ball lands in red or black in roulette get ever closer as the number of spins of the wheel increases?
    Think about what would be needed to bring that about. It would require the ball to keep tabs on how many times it’s landed on red and black, detect any discrepancy, and then somehow compel itself to land on either red or black to drive the numbers closer together. That’s asking a lot of a small white ball bouncing around at random.
    In fairness, overcoming what mathematicians call ‘The Gambler’s Fallacy’ means overcoming the wealth of everyday experiences which seem to support it. The fact is that most of our encounters with chance are more complex than mere coin-tosses, and can easily seem to violate the Law of Averages.
    For example, imagine we’re rummaging through the chaos of our sock drawer before racing off to work, looking for one of the few pairs of sensible black socks. Chances are the first few socks are hopelessly colourful. So we do the obvious thing and remove them from the drawer while we persist with our search. Now who says the Law of Averages applies, and that a run of coloured socks does not affect the chances of finding the black ones? Well, it may look vaguely similar, yet what we’re doing is wholly different from a coin-toss or a throw of the roulette ball. With the socks, we’re able to remove the outcomes we don’t like, thus boosting the proportion of black socks left in the drawer. That’s not possible with events like coin-tosses. The Law of Averages no longer applies, because it assumes each event leaves the next one unaffected.
Another hurdle we face in accepting the law is that we rarely give it enough opportunity to reveal itself. Suppose we decide to put the Law of Averages to the test, and carry out a proper scientific experiment involving tossing a coin ten times. That might seem a reasonable number of trials; after all, how many times does one usually try something out before being convinced it’s true: three times, perhaps, maybe half a dozen? In fact, ten throws is nothing like enough to demonstrate the Law of Averages with any reliability. Indeed, with so small a sample we could easily end up convincing ourselves of the fallacy about raw numbers evening out. The mathematics of coin-tosses shows that with ten tosses it’s odds-on that the numbers of heads and tails will be within two of each other; there’s even a 1 in 4 chance of a dead heat.
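These figures come straight from the binomial distribution. Since the difference between heads and tails in ten tosses is always even, 'close' here means a difference of at most two, i.e. 4, 5 or 6 heads; a few lines (in Python, for concreteness) verify both numbers:

```python
from math import comb

n = 10  # ten tosses of a fair coin; P(k heads) = C(n, k) / 2**n
p_dead_heat = comb(n, 5) / 2**n                          # exactly 5 heads, 5 tails
p_diff_at_most_2 = sum(comb(n, k) for k in (4, 5, 6)) / 2**n

print(f"P(dead heat)             = {p_dead_heat:.3f}")       # 252/1024 ≈ 0.246
print(f"P(counts differ by <= 2) = {p_diff_at_most_2:.3f}")  # 672/1024 ≈ 0.656
```

A dead heat happens about 1 time in 4, and the counts land within two of each other roughly 66 per cent of the time: odds-on, just as the text says.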
Small wonder so many of us think that ‘everyday experience proves’ it’s the raw numbers of heads and tails that even out over time, rather than their relative frequencies.
    When trying to make sense of chance events, be wary of relying on ‘common sense’ and everyday experience. As we’ll see repeatedly in this book, the laws ruling chance events lay a host of traps for those not savvy in their tricksy ways.
    The dark secret of the Golden Theorem
    Mathematicians sometimes claim they’re just like everyone else; they’re not. Forget the clichés about gaucheness and a penchant for weird attire; many mathematicians look perfectly normal. But they all share a characteristic that sets them apart from ordinary folk: an obsession with proof. This is not ‘proof’ in the sense of a court of law, or the outcome of an experiment. To mathematicians, these are risibly unconvincing. They mean absolute, guaranteed, mathematical proof.
    On the face of it, a refusal to take anyone’s word for anything seems laudable enough. But mathematicians insist on applying it to questions the rest of us would regard as blindingly, obviously true. They adore rigorous proofs of the likes of the Jordan Curve Theorem, which says that if you draw a squiggly loop on a piece of paper, it creates two regions: one inside the loop, the other outside. To be fair, sometimes their extreme scepticism turns out to be well founded. Who would have guessed, for example, that the outcome of 1 + 2 + 3 + 4 + etc., all the way to infinity could provoke controversy?1 More often, a proof confirms what they suspected anyway. But occasionally a proof of something ‘obvious’ turns out both to be amazingly hard, and to have shocking implications. Given its reputation for delivering surprises, it’s perhaps no surprise that just such a proof emerged during the first attempts to bring some rigour to the theory of chance events – and specifically, the definition of the ‘probability’ of an event.
    What does ‘60 per cent chance of rain’ mean?
    You’re thinking of taking a lunchtime walk, but you remember hearing the weather forecast warn of a 60 per cent chance of rain. So what do you do? That depends on what you think the 60 per cent chance means – and chances are it’s not what you think. Weather forecasts are based on computer models of the atmosphere, and in the early 1960s scientists discovered such models are ‘chaotic’, implying that even small errors in the data fed in can produce radically different forecasts. Worse still, this sensitivity of the models changes unpredictably – making some forecasts inherently less reliable than others. So since the 1990s, meteorologists have increasingly used so-called ensemble methods, making dozens of forecasts, each based on slightly different data, and seeing how they diverge over time. The more chaotic the conditions, the bigger the divergence, and the less precise the final forecasts. Does that mean that a ‘60 per cent chance of rain at lunchtime’ means 60 per cent of the ensemble showed rain then? Sadly not: as the ensemble is just a model of reality, its reliability is itself uncertain. So what forecasters often end up giving us is the so-called ‘Probability of Precipitation’ (PoP), which takes all this into account, plus the chances of our locality actually being rained on. They claim this hybrid probability helps people make better decisions. Perhaps it does, but in April 2009 the UK Meteorological Office certainly made a bad decision in declaring it was ‘odds on for a barbecue summer’. To those versed in the argot of probability, this just meant the computer model had indicated that the chances were greater than 50 per cent. But to almost everyone else, ‘odds on’ means ‘very likely’. Sure enough, the summer was awful and the Met Office was ridiculed – which was always a racing certainty.
    One of the most intriguing things about probability is its slippery, protean nature. Its very definition seems to change according to what we’re asking of it. Sometimes it seems simple enough. If we want to know the chances of throwing a six, it seems fine to think of probabilities in terms of frequencies – that is, the number of times we’ll get the outcome we want, divided by the total number of opportunities it has to occur. For a die, as each number takes up one of six faces, it seems reasonable to talk about the probability as being the long-term frequency of getting the number we want, which is 1 in 6. But what does it mean to talk about the chances of a horse winning a race? We can’t run the race a million times and see how many times the horse wins. And what do weather forecasters mean when they say there’s a 60 per cent chance of rain tomorrow? Surely it’ll either rain or it won’t? Or are the forecasters trying to convey their confidence in their forecast? (As it happens, it’s neither – see box on previous page.)
    Mathematicians aren’t comfortable with such vagueness – as they showed when they started taking a serious interest in the workings of chance around 350 years ago. Pinning down the concept of probability was on their to-do list. Yet the first person to make serious progress with the problem found himself rewarded with the first glimpse of the dirty secret about probability that dogs its application to this day.
    Born in Basle, Switzerland, in 1655, Jacob Bernoulli was the eldest of the most celebrated mathematical family in history. Over the course of three generations, the family produced eight brilliant mathematicians who together helped lay the foundations of applied mathematics and physics. Jacob began reading up on the newly emerging theory of chance in his twenties, and was entranced by its potential applications to everything from gambling to predicting life expectancy. But he recognised that there were some big gaps in the theory that needed plugging – not least surrounding the precise meaning of probability.2
    Around a century earlier, an Italian mathematician named Girolamo Cardano had shown the convenience of describing chance events in terms of their relative frequency. Bernoulli decided to do what mathematicians do, and see whether he could make this definition rigorous. He quickly realised, however, that this seemingly arcane task created a huge practical challenge. Clearly, if we’re trying to establish the probability of some event, the more data we have, the more reliable our estimate will be. But just how much data do we need before we can say that we ‘know’ what the probability is? Indeed, is that even a meaningful question to ask? Could it be that probability is something we can never know exactly?
    Despite being one of the most able mathematicians of his age, it took Bernoulli 20 years to answer these questions. He confirmed Cardano’s instinct that relative frequencies are what matter when making sense of chance events like coin-tosses. That is, he’d succeeded in pinning down the true identity of the ‘it’ in statements like ‘It all evens out in the long run’. As such, Bernoulli had identified and proved the correct version of the Law of Averages, which focuses on relative frequencies rather than individual events.
    But that wasn’t all. Bernoulli also confirmed the ‘obvious’ fact that when it comes to pinning down probabilities, more data are better. Specifically, he showed that as data accumulate, the risk of the measured frequencies being wildly different from the true probability gets ever smaller (if you find this less than compelling, congratulations: you’ve spotted why mathematicians call Bernoulli’s theorem the Weak Law of Large Numbers; the more impressive ‘strong’ version was only proved around a century ago).
    In a sense, Bernoulli’s theorem is a rare confirmation of a common-sense intuition concerning chance events. As he himself rather bluntly put it, ‘even the most foolish person’ knows that the more data, the better. But dig a little deeper, and the theorem reveals a typically subtle twist about chance: we can’t ever ‘know’ the true probability with utter certainty. The best we can do is collect so much data that we cut the risk of being wildly wrong to some acceptable level.
    Proving all this was a monumental achievement – as Bernoulli himself realised, calling his proof the theorema aureum: ‘Golden Theorem’. He was laying the foundations of both probability and statistics, allowing raw data subject to random effects to be turned into reliable insights.
    With his mathematician’s predilection for proof satisfied, Bernoulli began collecting his thoughts for his magnum opus, the Ars Conjectandi – the Art of Conjecturing. Keen to show the practical power of his theorem, he set about applying it to real-life problems. It was then that his theorema started to lose some of its lustre.
    Bernoulli’s theorem showed that probabilities can be pinned down to any level of reliability – given enough data. So the obvious question was: how much data was ‘enough’? For example, if we want to know the probability that someone over a certain age will die within a year, how big a database do we need to get an answer that we can be sure is, say, 99 per cent reliable? To keep things clear, Bernoulli used his theorem to tackle a very simple question. Imagine a huge jar containing a random mix of black and white stones. Suppose we’re told that the jar contains 2,000 black stones and 3,000 white ones. The probability that we’ll pick out a white stone is thus 3,000 out of a total of 5,000, or 60 per cent. But what if we don’t know the proportions – and thus the probability of picking out a white stone? How many stones would we need to extract in order to be confident of being pretty close to the true probability?
    In typical mathematician’s style, Bernoulli pointed out that before we can use the Golden Theorem, we need to pin down those two vague concepts ‘pretty close to’ and ‘confident’. The first means demanding that the data get us within, say, plus or minus 5 per cent of the true probability, or plus or minus 1 per cent, or closer still. Confidence, on the other hand, centres on how often we achieve this level of precision. We might decide we want to be confident of hitting that standard nine times out of ten (‘90 per cent confidence’) or 99 times out of 100 (‘99 per cent confidence’), or even more reliably.3 Ideally, of course, we’d like to be 100 per cent confident, but as the Golden Theorem makes clear, in phenomena affected by chance such God-like certainty isn’t achievable.
    The Golden Theorem seemed to capture the relationship between precision and confidence for the problem of randomly plucking coloured stones from not just one jar, but any jar. So Bernoulli asked it to reveal the number of stones that would have to be extracted from a jar in order to be 99.9 per cent confident of having pinned down the relative proportions of black and white stones it contains to within plus or minus 2 per cent. Plugging these figures into his theorem, he turned the mathematical handle … and a shocking answer popped out. If the problem was to be solved by taking out stones at random, over 25,500 stones would have to be examined before the relative proportions of the two colours could be pinned down to Bernoulli’s specifications.
    This wasn’t merely a depressingly large number, it was ridiculously large. It suggested that random sampling was a hopelessly inefficient way of gauging relative proportions, as even with a jar of just a few thousand stones, one would have to repeat the process of examining stones over 25,000 times to get the true proportion nailed down to Bernoulli’s standard. Clearly, it would be far quicker simply to tip the stones out and count them. Historians still argue over what Bernoulli thought of his estimate;4 disappointment seems to be the consensus. What is certain is that, after noting the answer, he added a few more lines to his great work – and then stopped. The Ars Conjectandi languished unpublished until 1713, eight years after his death. It’s hard to avoid the suspicion that Bernoulli had lost confidence in the practical value of his Golden Theorem. It’s known that he was keen to apply it to much more interesting problems, including settling legal disputes where evidence was needed to put a case ‘beyond reasonable doubt’. Bernoulli seems to have expressed his disappointment in the implications of his theorem in a letter to the distinguished German mathematician Gottfried Leibniz, where he admitted he could not find ‘suitable examples’ of such applications of his theorem.
    Whatever the truth, we now know that although Bernoulli’s theorem gave him the conceptual insights he sought, it needed some mathematical turbocharging before it was fit for use in real-life problems. This was supplied after his death by the brilliant French mathematician (and friend of Isaac Newton) Abraham de Moivre – allowing the theorem to work with far less data.5 Yet the real source of the problem lay not so much in the theorem as in Bernoulli’s expectations of it. The levels of confidence and precision he’d demanded from it may have seemed reasonable to him, but they turn out to be incredibly exacting. Even using the modern version of his theorem, pinning down the probability to the standards he set demands that around 7,000 stones be randomly chosen from a jar and their colour noted – which is still a huge amount.
    It’s odd that Bernoulli didn’t do the obvious thing and rework his calculations with less demanding levels of precision and confidence. For even in its original form, the Golden Theorem shows this has a significant impact on the amount of data required; using the modern version, the impact is pretty dramatic. Taking Bernoulli’s 99.9 per cent confidence level, but easing the precision level from plus or minus 2 per cent to 3 per cent, slashes the number of observations by more than half, to around 3,000. Alternatively, sticking with an error level of 2 per cent but reducing our confidence level to 95 per cent cuts the number of observations by even more, to around 2,500 – just 10 per cent of the amount estimated by Bernoulli. Do both – a bit less precision, a bit less confidence – and the figure plunges again, to around 1,000.
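    The modern trade-off between precision and confidence can be sketched in a few lines of Python. This uses the standard normal approximation for a sample proportion (not Bernoulli’s original bound), with the jar’s true proportion of 60 per cent white stones; the exact figures differ slightly from the rounded ones in the text, but the pattern is the same:

```python
from statistics import NormalDist

def sample_size(confidence, margin, p=0.6):
    """Observations needed to pin down a true proportion p to within
    +/- margin at the given confidence level (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided z-score
    return (z / margin) ** 2 * p * (1 - p)

print(round(sample_size(0.999, 0.02)))  # Bernoulli's standard: roughly 6,500
print(round(sample_size(0.999, 0.03)))  # ease precision: under 3,000
print(round(sample_size(0.95, 0.02)))   # ease confidence: under 2,500
print(round(sample_size(0.95, 0.03)))   # ease both: around 1,000
```

Notice how relaxing either standard slashes the data demanded; relaxing both brings Bernoulli’s 25,500-stone ordeal down to a thousand or so draws.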
    That’s far less demanding than the figure reached by Bernoulli, though admittedly we’ve paid a price in terms of the reliability of our knowledge. Perhaps Bernoulli would have baulked at lowering his standards so far; sadly, we’ll never know.
    Today, 95 per cent has become the de facto standard for confidence levels in a host of data-driven disciplines, from economics to medicine. Polling organisations have combined it with the precision of plus or minus 3 per cent to arrive at their standard polling group size of 1,000 or so. Yet while they may be widely used, we should never forget that these standards are based on pragmatism, rather than some grand consensus of what constitutes ‘scientific proof’.
    The dirty secret lurking in Bernoulli’s Golden Theorem is that when trying to gauge the effects of chance, God-like certainty is unattainable. Instead, we usually face a compromise between gathering a lot more evidence, or lowering our standard of knowledge.
    The First Law of Lawlessness
    The true meaning of the Law of Averages has been mangled and misunderstood so badly and so often that experts in probability tend to avoid the term. They prefer arguably even less helpful terms like the Weak Law of Large Numbers – which sounds like an unreliable rule about crowds. So instead, let us break apart the Law of Averages into its constituent insights, and call them the ‘Laws of Lawlessness’. The first centres on how best to think about events that involve an element of chance.
    The First Law of Lawlessness
    When trying to make sense of chance events, ignore the raw numbers. Focus instead on their relative frequency – that is, how often they occurred, divided by how often they had the opportunity to do so.
    The First Law of Lawlessness warns us to be wary of claims based purely on raw numbers of events. That makes it especially useful when confronted by media coverage of, say, people with side effects to some new treatment, or lottery wins in a specific town. Such stories are typically accompanied by pictures of the tragic victims or lucky winners. There’s no doubting the power of such stories. Even a single, shocking, real-life case can trigger historic changes in policy – as anyone who’s been through airport security after 9/11 knows. And sometimes that’s the appropriate response. But basing a decision on a handful of cases is usually a very bad idea.
    The danger is that the cases appear to be typical, when in fact they’re anything but. Indeed, the very fact they’re so shocking is often because they’re ‘outliers’ – the product of extremely rare confluences of chance.
    The First Law of Lawlessness shows that we can avoid such traps by focusing instead on relative frequencies: the raw numbers of events, divided by the relevant number of opportunities for the event to occur.
    Let’s apply the law to a real-life example: the 2008 decision by the UK government to vaccinate pre-teen girls against HPV, the virus responsible for cervical cancer. This national programme was hailed as having the potential to save the lives of hundreds of women each year. Yet shortly after its launch, the media seemed to have compelling evidence that this was a dangerously optimistic view. They reported the tragic case of Natalie Morton, a fourteen-year-old girl who died within hours of being given the vaccine. The health authorities responded by checking stocks and withdrawing the suspect batch. This was not enough for some, however: they wanted the mass vaccination programme abandoned. Was this reasonable? Some would insist on invoking the so-called Precautionary Principle, which in its most unsophisticated form amounts to ‘better safe than sorry’. The danger here lies in resolving one problem while creating another. Stopping the programme would eliminate any risk of death among its participants, but that still leaves the problem of how best to tackle cervical cancer.
    Then there’s the risk of falling for a trap that deserves to be much better known (and which we’ll encounter again in this book). Logicians call it the ‘Post hoc, ergo propter hoc’ fallacy – from the Latin for ‘After this, therefore because of this’. In the case of Natalie’s death, the trap lies in assuming that because she died after being vaccinated, the vaccination must have been the cause. Certainly, true causes always precede their effects, but reversing the logic has its dangers: people in car crashes typically put on seat belts before setting off, but that doesn’t mean putting on seat belts causes crashes.
    But let’s assume the worst: that Natalie’s death really was caused by a bad reaction to the vaccine. The First Law of Lawlessness tells us that the best way to make sense of such events is to focus not on individual cases, but instead on the relevant proportions. What are these? By the time of Natalie’s death, 1.3 million girls had been given the same vaccine. That means the relative frequency of this kind of event was around 1 in a million. It was this that persuaded the UK government, in the face of protests from anti-vaccination campaigners, to resume the programme once the suspect batch had been withdrawn. This was the rational response if Natalie had indeed fallen victim to a rare reaction to the vaccine.
    As it happens, this wasn’t the case: the media had fallen into the trap of post hoc, ergo propter hoc. At the inquest into her death, it emerged that Natalie had a malignant tumour in her chest, and her death was unconnected to the vaccination. Even so, the First Law showed that the authorities had adopted the right approach by taking out just the suspect batch, rather than abandoning the whole programme.
    Of course, the First Law isn’t guaranteed to lead straight to the truth. Natalie could have been Case Zero of a reaction to the vaccine never seen during tests. And it was clearly right to look into the causes of the case for evidence that there could be more. The role of the First Law lies in preventing us being overly impressed by individual cases, and focusing our attention instead on relative frequencies, thus putting such cases in their correct context.
    There are more general lessons here for managers, administrators and politicians determined to bring about ‘improvements’ following a handful of one-off events. If they ignore the First Law of Lawlessness, they risk taking action to deal with events that are exceedingly rare. Worse, having based the ‘improvement’ on a handful of cases, they may then decide to test it on a similarly small set of data, focus again on raw numbers rather than relative frequencies, and come to utterly erroneous conclusions. It could be anything from a spate of customer complaints to a staff suggestion about, say, a new way of doing things. They all tend to start with a few anecdotes which may or may not be significant. But the first step to finding out is to put them into their proper context – by turning them into the appropriate relative frequencies.
    Sometimes making sense of events requires a comparison of relative frequencies. In the late 1980s, UK-based defence contractor GEC-Marconi became the focus of media coverage following a spate of over twenty suicides, deaths and disappearances among technical staff. Conspiracy theories started to emerge, fuelled by the fact that some of the victims were working on classified projects. While these made for intriguing stories, the First Law tells us to ignore the anecdotes and focus instead on relative frequencies – in this case, a comparison of the relative frequency of strange events at Marconi and those we’d expect within the general population. And that immediately focuses attention on the fact that GEC-Marconi was a huge company employing over 30,000 staff, and that the deaths had been spread over eight years. This suggests that the ‘mysterious’ deaths and disappearances may not have been so surprising, given the size of the company. That at least is what the subsequent police investigation concluded, though the conspiracy theories persist to this day.
    In fairness, the importance of comparing relative frequencies is starting to catch on within the media. In 2010, France Telecom made headlines with a GEC-Marconi-like number of suicides: 30 between 2008 and 2009. The story flared again in 2014, when the company – now called Orange Telecom – saw a resurgence in suicides, with ten in just a few months. This time, the explanation du jour was work-related stress. But in contrast to the reporting of the GEC-Marconi cases, some journalists raised the key question prompted by the First Law: is the rate of suicides, rather than just the raw numbers, really all that abnormal – given that it’s a huge company with around 100,000 employees?
    That raises a tricky question that often emerges when trying to apply the First Law, however: what is the appropriate relative frequency to use in the comparison? In the case of Orange Telecom, is it the national suicide rate (which is notoriously high in France, at 40 per cent above the EU average), or something more specific, like the rate among specific age ranges (suicide is the principal cause of death among 25–34-year-olds in France) or perhaps socio-economic grouping? The jury is still out on the Orange Telecom case; while it may be a statistical blip, others insist workplace stress is the real explanation. It’s entirely possible that the truth will never be known.
    The strange case of the Bermuda Triangle
    The First Law is especially useful when trying to make sense of spooky claims and conspiracy theories. Take the notorious case of the disappearance of ships and aircraft over a patch of the western Atlantic known as the Bermuda Triangle. From the 1950s onwards, there have been countless reports that bad things happen to those who enter the triangular-shaped area between Miami, Puerto Rico and the eponymous island. Many theories have been put forward to explain the events, from UFO attacks to rogue waves. But the First Law of Lawlessness tells us to focus not on the raw numbers of ‘spooky’ disappearances (which may or may not have happened), but instead compare their relative frequency to what we’d expect from any comparable part of the ocean. Do that, and something amazing emerges: it’s entirely possible that all the unexplained disappearances really did take place. That’s because tens of thousands of ships and aircraft pass through this vast area – some 1 million square kilometres of sea and airspace – each year. Even if you include all those weird tales of the unexplained, it turns out the Bermuda Triangle is not even in the top ten of oceanic danger zones. Certainly the hard-nosed actuaries at world-renowned insurers Lloyd’s of London aren’t fazed by the raw numbers of supposedly ‘spooky’ events in the region. They don’t charge higher premiums for daring to venture into it.
    Whatever the reality, the First Law tells us where to start in making sense of such questions. It also makes a prediction: that anything that encompasses enough people – from a government health campaign to employment with a multinational – has the ability to generate headline-grabbing stories, backed up with compelling real-life anecdotes, that mean less than they seem.
    Try it yourself. Next time you hear of some national campaign that is generally a good thing but can have nasty side effects for some people – such as a mass medication campaign – make a note of it, wait for the horror stories, and then put the First Law to work.
    Chance events can shock us by their apparent improbability. The First Law of Lawlessness tells us to look beyond the raw numbers of such events, and focus instead on their relative frequencies – which gives us a handle on the probability of the event. And if low-probability events can happen, they will – given enough opportunity.
    What are the chances of that?
    Sue Hamilton was doing some paperwork in her office in Dover in July 1992 when she ran into a problem. She thought her colleague Jason might know how to solve it, but as he’d gone home, she decided to call him. She found his phone number on the office noticeboard. After apologising for disturbing him at home, she began explaining her problem, but barely had she begun when Jason interrupted to point out that he wasn’t at home. He was in a public phone box whose phone had begun to ring as he walked past, and he’d just decided to pick it up. Amazingly, it turned out that the number on the noticeboard wasn’t Jason’s home number at all. It was his employee number – which just happened to be identical to the number of the phone box he was walking past at the moment she called.
    Everyone loves stories about coincidences. They seem to hint at invisible connections between events and ourselves, governed by mysterious laws. And it’s true. There are myriad invisible connections between us, but they’re invisible primarily because we just don’t go looking for them. The laws that govern them are also mysterious – but again, that’s primarily because we rarely get told about them.
    Coincidences are manifestations of the First Law of Lawlessness, but with a twist. That’s because this law tells us what to do to make sense of chance events, while coincidences warn us of how difficult it can be to do this.
    When confronted with an ‘amazing’ coincidence, the First Law tells us to start by asking ourselves about its relative frequency – that is, the number of times such an amazing coincidence could happen, divided by the number of opportunities such events have to occur. For a truly amazing coincidence, we’d expect the resulting estimate of the probability of the event to be astoundingly low. But as soon as we try to apply the law to coincidences such as Sue Hamilton’s phone call, we run into trouble.
    How do we even begin to estimate the number of such amazing events, or the number of opportunities they get to arise? What, for that matter, constitutes ‘amazing’? This clearly isn’t something we can define objectively, which in turn means we’re on shaky ground insisting that we’ve experienced something inherently meaningful. The late, great, Nobel Prize-winning physicist Richard Feynman highlighted this common feature of coincidences with a typically down-to-earth example. During a lecture on how to make sense of evidence, he told his audience, ‘You know, the most amazing thing happened to me tonight. I was coming here, on the way to the lecture, and I came in through the parking lot. And you won’t believe what happened. I saw a car with the license plate ARW 357. Can you imagine? Of all the millions of license plates in the state, what was the chance that I would see that particular one tonight? Amazing!’
    Then there’s the awkward fact that we usually decide that a coincidence was ‘amazing’ only after we’ve experienced it, making our assessment of its significance entirely post hoc, and potentially very misleading. There’s a Monty Python sketch based on the legend of William Tell that captures the dangers of post hoc rationalisation perfectly. It shows a crowd of people gathered round our eponymous hero, as he takes careful aim at the apple sitting on the head of his son – and hits it. The crowd duly cheers … and we feel impressed too, until the camera pulls back to reveal Tell’s son riddled with arrows from all the previous failed attempts to hit the apple. Tell’s skill only appears amazing if we ignore all of these; likewise with coincidences. In reality, they are constantly occurring around us all the time, but the overwhelming majority are boring and insignificant. Every so often we’ll spot something we decide is the equivalent of an arrow splitting an apple – and declare it surprising, amazing or even spooky, having studiously ignored the myriad less interesting events.
    All this speaks to the fact that we humans are natural-born pattern seekers, prone to seeing significance in meaningless noise. Doubtless our cave-dwelling ancestors benefited from erring on the side of caution and hiding if something looked even vaguely like a predator. But this can all too easily slide into what psychologists call apophenia: a predilection for seeing patterns where none exist. We’re all especially prone to one particular form of this, known as pareidolia. Every so often the media reports claims of ‘miraculous’ cloud formations, scorch marks on toast or features on Google maps that supposedly look like Christ, Mother Teresa or Kim Kardashian. And it’s hard to disagree that they do. What we make of such ‘miracles’ depends on whether we think the chances of getting them by fluke alone are impossibly low. If we apply the First Law of Lawlessness, we have to confront the fact that the brain has myriad ways of making a face out of random swirls.
    One of the most notorious cases of pareidolia centres on the so-called Face on Mars. In 1976, one of NASA’s probes to the Red Planet sent back a picture that seemed to show the image of a huge alien face on its surface. It provoked controversy for 25 years, with most scientists dismissing it as nonsense. A few tried estimating the chances of getting so realistic a face by chance alone, but ended up mired in disputes over the figures they’d plugged into their relative frequency calculations. Finally, in 2001, the truth was revealed by sharp images taken by NASA’s Mars Global Surveyor. These showed that the ‘face’ was indeed just a rocky outcrop, just as the sceptics had claimed.
    When trying to make sense of a coincidence, it’s easy to underestimate just how common such an ‘amazing’ event is – not least by defining how amazing it is only after seeing it, which is cheating, really.
    How to predict coincidences
    One of the most perplexing demonstrations of the laws of chance is the so-called Birthday Paradox: just 23 people are needed to give better than 50:50 odds that at least two will share a birthday. You don’t need so big a group to demonstrate such coincidences, though: a random gathering of five people gives an evens chance that at least two will share the same star sign (or were born in the same month, if you’re a rational Virgo and thus prefer a less silly example). The reason so few people are needed is that you’re asking for any match between all the different ways of pairing the people in the group – which is surprisingly large: one can form 253 pairs from 23 people. This lack of specificity is key: if you demand an exact match with your birthday, you’ll need a crowd of over 250 people to give better than 50:50 odds. Being less fussy and asking for a match of any two birthdays within a day either way hugely boosts the chances of a coincidence: indeed, there’s a 90 per cent chance of finding such a ‘near miss’ coincidence among the players in any football match.1
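    The birthday and star-sign figures are easy to verify with a few lines of Python, assuming every birthday (or sign) is equally likely and everyone’s is independent of everyone else’s:

```python
from math import comb

def p_shared(n, days=365):
    # chance that at least two of n people share one of `days` categories
    p_none = 1.0
    for i in range(n):
        p_none *= (days - i) / days  # each newcomer avoids all previous picks
    return 1 - p_none

print(comb(23, 2))                     # 253 distinct pairs among 23 people
print(round(p_shared(23), 3))          # 0.507: better than 50:50 for birthdays
print(round(p_shared(5, days=12), 3))  # 0.618: star-sign match among just 5
```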
    Coincidences surprise us because we think they’re very unlikely, and so can’t be ‘mere flukes’. The First Law of Lawlessness warns us of the dangers of underestimating the chances of coincidences by deciding ourselves what counts as ‘amazing’.
    Thinking independently is no yolk
    In September 2013, John Winfield was in the kitchen of his home in Breadsall, Derbyshire, when he realised he needed some eggs. Popping out to the store, he returned with six, and began cracking them open. To his surprise, the first one had a double yolk – something he’d never seen before in his life. Then he cracked another, and saw another double yolker. Amazed, he carried on opening the eggs, and discovered every one had double yolks, including the final one – which he dropped on the floor in his excitement.
    The amazing case of the six double yolkers was picked up by journalists, who helpfully did the maths to show how unlikely the event was. According to the British Egg Information Service, on average only around 1 in 1,000 eggs produced has a double yolk. And this prompted reporters to reach for their calculators plus some half-remembered notions about how to handle probabilities. They reckoned that if the chances of getting one double yolker were just 1 in 1,000, the chances of getting six must be 1 in 1,000 multiplied by itself six times, or 1 in a billion billion. That’s an astronomical number: it implies that to witness what Mr Winfield saw just once, you’d have to have opened a box of eggs every second since the birth of the universe.
    Some journalists twigged there was something dodgy about this reasoning, however. For a start, Mr Winfield was hardly the first since the Big Bang to report such an event. A quick trawl of the web revealed several similar reports, including an identical case of six double yolkers being found in Cumbria just three years earlier. Science writer Michael Hanlon at the Daily Mail raised doubts about the 1-in-1,000 figure used in the calculation.1 He pointed out that the chances of getting multiple yolkers depend heavily on the age of the hens, with younger ones being over ten times more likely to produce them. So while the 1-in-1,000 figure might be true on average, the double-yolker rate for farms with younger birds could easily be 1 in 100 – boosting the chances of getting a six-pack from such farms at least a million-fold.
    Yet that can’t be the whole explanation, as it still leaves the chances of getting six double yolkers at around 1 in 1,000 billion. Each year the equivalent of around 2 billion half-dozen packs are consumed in the UK, so even with the hugely increased chances, we’d still expect to hear of around two cases per millennium, not two in barely three years. When a calculation gives as crazily incorrect an answer as this, it’s a sign there’s something fundamentally wrong with the assumptions behind it. And the big assumption made in this one is that the probabilities of each event occurring separately really can be multiplied together. The laws of probability show that’s only permissible if the events in question – in this case, the discovery of double yolkers – are independent of one another, so that we don’t have to correct for any outside influence.
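    The arithmetic behind both the naive figure and the young-hen correction can be laid out explicitly. (The 1-in-100 rate for young hens is the rough order-of-magnitude estimate discussed above, not a measured figure.)

```python
p_avg = 1 / 1000          # average double-yolk rate per egg
p_young = 1 / 100         # assumed rate for farms with young hens

naive = p_avg ** 6        # journalists' sum: 1 in a billion billion
young = p_young ** 6      # young-hen correction: 1 in 1,000 billion

packs_per_year = 2e9      # UK consumption, in half-dozen-pack equivalents
cases_per_millennium = young * packs_per_year * 1000
print(round(cases_per_millennium))   # 2 - still far rarer than the reports suggest
```

Even with the million-fold boost, the expected rate of around two cases per millennium falls hopelessly short of explaining two reports in three years – which is exactly why the independence assumption itself must be at fault.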
    The notion that events are independent runs deep in the theory of chance events. Many ‘textbook’ manifestations of chance – repeated tosses of a coin, say, or throws of a die – are indeed independent; there’s no reason to suspect one such event should influence any other. Yet while the assumption of independence keeps the maths simple, we must never lose sight of the fact that it’s just that: an assumption. Sometimes it’s an assumption we can safely make – as when trying to make sense of cricketer Nasser Hussain’s legendary run of ‘bad luck’ in 2001 when he lost the toss fourteen times on the trot. While the chances of that are barely 1 in 16,000, there’s no need to suspect anything strange; when one thinks of how many top cricket captains have tossed coins over the decades, it’s an event that was clearly going to happen one day. But all too often, the assumption of independence isn’t remotely justifiable. We live in a messy, interconnected world shot through with connections, links and relationships. Some are the result of the laws of physics, some of biology, some of human psychology. Whatever the cause of the connections, blithely assuming they don’t exist can lead us into trouble. Indeed, so serious are the consequences that another Law of Lawlessness is merited:
    The Second Law of Lawlessness
    When trying to understand runs of seemingly ‘random’ events, don’t automatically assume they’re independent. Many events in the real world aren’t – and assuming otherwise can lead to very misleading estimates of the chances of observing such ‘runs’.
    Applying the Second Law to the double-yolker story means thinking of ways in which finding one such egg in a pack might be linked to finding more. As we’ve seen, one way is that the contents of one box could have come from young hens, which are prone to producing double yolkers. Then there’s the possibility of double yolkers being brought together by egg-box packers, increasing the chances of getting a box full of them. Again, that’s known to occur: double yolkers tend to be relatively large eggs, and stand out among the otherwise small ones produced by young hens – and thus tend to get boxed up together. Some supermarkets even make a point of boxing up potential double yolkers together.
    There are, therefore, solid grounds for thinking that finding one double yolker increases the chances of finding another in the same box – and thus for rejecting the idea of independence and the colossal odds that implies. Like the First Law, the Second Law has myriad uses – including making sense of seemingly spooky coincidences. Take the bizarre tale of how the Titanic disaster of April 1912 was foretold in eerily accurate detail by a book written fourteen years earlier. In the short story ‘Futility’, published in 1898, the American writer Morgan Robertson told the story of John Rowland, a deckhand aboard the largest ship ever built, which sinks with huge loss of life after striking an iceberg in the North Atlantic one April night. And the name of the ship? SS Titan. The parallels don’t stop there, either. Robertson’s vessel was, at over 240 metres in length, around the same size as the Titanic, was described as ‘unsinkable’, and carried fewer than half the lifeboats needed for those aboard. It was even struck on the same side: starboard.
    This is certainly an impressive list of coincidences, and might lead one to wonder whether Robertson had based his book on a premonition. Maybe he did, but the smart money is on his plot-line being a demonstration of what coincidences emerge if events are not independent. When ‘Futility’ was published, a race to build colossal passenger ships was already well under way, driven by international competition to win the Blue Riband – the accolade awarded to the fastest Atlantic passenger liner. In the final decade of the nineteenth century, the largest vessels went from around 170 metres in length to well in excess of 200 metres – and 240 metres was patently not out of the question. As for what could wreak havoc on such leviathans, icebergs were already a recognised threat. So was the inadequate provision of lifeboats: there had been warnings that regulations had failed to keep pace with the rapid increase in the size of vessels. Clearly, correctly guessing the side hit by the iceberg was a simple 50:50 shot. Barely less surprising is Robertson’s choice of name for his doomed ship. In the search for something evocative of a colossal vessel, SS Titan is clearly more likely to feature in a list of candidates than, say, SS Midget.
    In short, Robertson’s aim of penning a tragic but plausible tale about a doomed leviathan more or less compelled him to include events and characteristics not too far from those of the Titanic. A random choice simply wouldn’t have made narrative sense.
    ‘Textbook’ manifestations of chance, such as coin-tosses, can be assumed to be independent. But in the real world, that’s often a dangerous assumption to make, even with runs of apparently rare events. The Second Law of Lawlessness warns against automatically assuming independence when estimating the chances of such a set of coincidences.
    Random lessons from the lottery
    Since it began in 1988, Florida’s state lottery has handed out over $37 billion in prizes, created over 1,300 dollar millionaires and put over 650,000 students through college. But on 21 March 2011, it turned a lot of Floridians into conspiracy theorists. After years of suspicion, that evening they believed they had finally been given proof of why they had never won anything despite years of trying: the whole lottery was a fix. Their evidence could hardly have been more impressive. Every evening, seven days a week, the lottery runs the Fantasy 5 draw, where 36 balls are put into a randomising machine and five winning balls are chosen at random. Or at least, that’s what the organisers claim. But on that day in 2011, it was obvious the fix was in. As the balls popped up out of the machine, it was clear that the process was anything but random: the winning numbers were 14, 15, 16, 17, 18. Hard-core lottery players knew that the odds against winning the jackpot with any random pick were around 1 in 377,000, so clearly something very suspicious had happened.
    In reality, something all too common had taken place: a demonstration that most of us have a less-than-perfect grasp of what randomness really is.
    We all like to think we can learn from experience. And given how common random events are in our world, you’d think people would be pretty much up to speed with what randomness can toss their way. You could hardly be more wrong. Asked simply to define randomness, people typically mention characteristics like ‘having no rhyme or reason’ and ‘patternless’ – which isn’t too bad, at least, up to a point. It’s when they’re asked to apply these intuitions to real-life problems that it all starts to go wrong.
    In the 1970s, psychologist Norman Ginsburg at McMaster University, Canada, carried out studies to see how good people are at the seemingly simple task of writing down lists of 100 random digits. Most participants came up with well-jumbled sequences of digits, with few repeated digits, runs of consecutive numbers or other numerical patterns. In other words, they did their best to ensure every digit got its ‘fair share’ of appearances in each otherwise patternless sequence. In the process, they inadvertently demonstrated a fundamental misconception about randomness.
    It’s true that there’s no rhyme or reason to randomness: by definition it cannot be the outcome of any predictable process. It’s also true that it is patternless. The problem is, that’s something that’s only guaranteed on huge (indeed, strictly speaking, infinite) scales. On every other scale, the lack of rhyme or reason of randomness is entirely capable of containing pattern-like sequences long enough to seem significant. Yet when asked to create some randomness of our own, we can’t resist trying to reflect the patternless nature of infinite randomness in even the shortest burst of the stuff.
    Clearly, what we need is regular exposure to short snatches of randomness, so we can get a feel for what it looks like on such scales. Fortunately, that’s easily achieved – indeed, millions unwittingly do it worldwide several times a week. It’s called watching lottery number draws on TV.
    Many countries have national lotteries as a means of raising money for good causes. Most people tune in to watch the draws of lottery numbers simply to see whether they’ve won the jackpot – which, given that the odds are typically millions to one against, is usually an exercise in futility. Yet there’s something to be said even for those who’ve not bought any lottery tickets tuning in occasionally, to see what randomness can do – and watch the numbers fall into what look suspiciously like patterns.
    Many lotteries (including, until recently, the UK’s national lottery) are ‘6-from-49’; that is, winning involves correctly guessing the six balls drawn from the 49 put into the randomising machine. This doesn’t sound too difficult; it’s oddly tempting to estimate that the chances of being able to guess the right set of six are 6 out of 49, or around 1 in 8. But like most forms of gambling (and that’s what lotteries are), that’s misleading, and the real chances are far lower. That 1 in 8 figure would be true if there were only six numbered balls among the 49, and we had to pick just one of the six. What we’re being asked to do is far harder: pick all six of the right balls from 49, all of which have their own numbers on. The chances of doing this are very slim indeed: around 1 in 14 million. Why so small? Because our chances of getting the first number right are 1 in 49; the chances of getting the second number correct from the 48 remaining in the machine are 1 in 48; for the third they are 1 in 47 – and so on, all the way down to getting the sixth number right, which is 1 out of the 44 remaining. As the chance of any specific ball emerging from the machine is random and thus independent of the chances for any of the others, the probability of guessing all six of any given set is all these probabilities multiplied together – which is (1/49) × (1/48) × (1/47) × (1/46) × (1/45) × (1/44) – which works out at pretty much exactly 1 in 10 billion. The organisers of lotteries do cut us a bit of slack by not demanding that we also get the exact order in which the six come out of the machine correct too. They’ll accept any of the 720 different orderings of six chosen balls (say, 2, 5, 11, 34, 41, 44, or 34, 2, 5, 11, 44, 41, etc.). So the odds of our picking a winning set are 720 in 10 billion, which comes out at around 1 in 14 million. 
Just in case you think these aren’t bad odds, picture this: they’re equivalent to the lottery organisers tipping ten 1-kilogram bags of sugar on the floor, and asking you to pick out from the heap the single grain they’ve stained black – in one go, and while wearing a blindfold. Good luck with that.
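    The chain of fractions above can be checked in a few lines, using nothing beyond the ‘6-from-49’ rules themselves:

```python
from math import comb, factorial, prod

ordered = prod(range(44, 50))     # 49 x 48 x 47 x 46 x 45 x 44
print(ordered)                    # 10,068,347,520 - 'pretty much exactly' 10 billion
print(factorial(6))               # 720 acceptable orderings of the six balls
print(ordered // factorial(6))    # 13,983,816 - the famous '1 in 14 million'
print(comb(49, 6))                # the same answer in a single step
```

The final line shows why mathematicians reach straight for the binomial coefficient C(49, 6): it rolls the whole ordered-then-unordered argument into one operation.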
    So the chances are that we’re never going to win the jackpot, even if we play for the rest of our lives. Indeed, one can show that the average lottery player in the UK has a bigger chance of dropping dead during the half-hour it takes to watch the show and make the call to claim the jackpot. But lurking in those numbers that routinely disappoint week in, week out, is an important lesson about the workings of randomness. Indeed, it’s so important, it deserves elevation to the status of a Law of Lawlessness:
    The Third Law of Lawlessness
    True randomness has no rhyme or reason to it, and is ultimately patternless. But that doesn’t mean it is devoid of all patterns at every scale. Indeed, on the scales on which we encounter it, randomness is shockingly prone to producing regularities which seduce our pattern-craving minds.
    Evidence for this law can be found by regularly watching those lottery draws on TV – or, for those who need more rapid gratification, by looking through the online archives of previous jackpot selections. Examining a few weeks’ worth of the winning six numbers for the UK national lottery at random (what else?) won’t reveal any obvious patterns – seemingly confirming our belief that randomness really does mean patternless at every scale. For example, here are the eight sets of winning numbers in the UK draw in June 2014:
    14, 19, 30, 31, 47, 48
    5, 10, 16, 23, 31, 44
    11, 13, 14, 28, 40, 42
    9, 18, 22, 23, 29, 33
    10, 11, 18, 23, 26, 37
    3, 7, 13, 17, 27, 40
    5, 15, 19, 25, 34, 36
    8, 12, 28, 30, 39, 43
    At first glance, it looks like 48 numbers with no obvious patterns, biases or sequences, just as we’d expect. But look again, this time for the most basic pattern possible with lottery balls: two consecutive numbers. Four of the eight sets contain such a ‘run’; indeed, the first set contains two such runs. Chances are you missed them because they’re such trivial patterns that they elude even the renowned pattern-spotting abilities of H. sapiens. Yet this is a hint of the patterns randomness can throw at us, and how they follow certain laws – all in seeming defiance of our beliefs about randomness. Using a notoriously tricky branch of maths called combinatorics, it’s possible to count up the ways of getting runs of different lengths among the six numbers, and it turns out that at least two consecutive numbers can be expected in half of all ‘6-from-49’ lottery draws. Thus in the eight draws during June 2014 we should expect around four to have a run of two or more numbers, and that’s just what we got – and would get most months, if we’d bothered to check.
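    Both claims in this paragraph – that four of the eight June 2014 draws contain a consecutive pair, and that such pairs turn up in about half of all draws – can be verified directly. The ‘no consecutive pair’ count uses the standard combinatorial result that there are C(44, 6) ways to choose six numbers from 49 with no two adjacent:

```python
from math import comb

draws = [
    [14, 19, 30, 31, 47, 48], [5, 10, 16, 23, 31, 44],
    [11, 13, 14, 28, 40, 42], [9, 18, 22, 23, 29, 33],
    [10, 11, 18, 23, 26, 37], [3, 7, 13, 17, 27, 40],
    [5, 15, 19, 25, 34, 36],  [8, 12, 28, 30, 39, 43],
]

def has_pair(d):
    s = sorted(d)
    return any(b - a == 1 for a, b in zip(s, s[1:]))

print(sum(has_pair(d) for d in draws))   # 4 of the 8 draws contain a run

p_pair = 1 - comb(44, 6) / comb(49, 6)   # exact probability for any one draw
print(round(p_pair, 3))                  # 0.495 - almost exactly half
```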
    Before anyone thinks this might help predict which numbers are going to win each week, don’t forget we still have no idea which two or more numbers will be paired: that’s random, and thus unpredictable. What we’ve shown is simply that it will happen to some pair or longer run of numbers. Even so, this holds some very important lessons for us about patterns in randomness. First, it shows that patterns are not only possible in randomness, they’re actually surprisingly common – and their rate of appearance can be calculated. Secondly, it highlights the fact that many samples of randomness – including lottery draws – have lots of patterns, but we miss them because we deem them ‘insignificant’. In other words, we need to be wary about seeing ‘significant’ patterns in randomness, because patterns are in the eye of the beholder. And thirdly, while being very specific about what we want from randomness slashes the chances of getting it (e.g. that single set of six jackpot-winning balls), being very vague (e.g. ‘any consecutive pairs’) greatly increases the chances.
    We can put all this to work by looking for other patterns in those samples of randomness we see in lottery draws. Viewers of the 1,310th draw for the UK national lottery on 12 July 2008 were astonished to witness no fewer than four consecutive numbers emerge among the six balls drawn from the 49 put in the randomising machine: 27, 28, 29, 30. A month later, the lottery machine was churning out other patterns, this time of three consecutive numbers among the six: 5, 9, 10, 11, 23, 26. While more striking than mere pairs, these patterns are still surprisingly common – not least because we’re not fussed which three- or four-ball runs make up the pattern. Combinatoric calculations show that even the startling run of four consecutive numbers should pop up on average around once in every 350 draws – so arguably the biggest surprise was why it had taken 1,300-odd draws to see the first (sure enough, there have been several since).
    In the light of such insights, the appearance of a complete run of five consecutive numbers in Florida’s Fantasy 5 lottery draw of 21 March 2011 should no longer seem all that shocking. Again, we’re not demanding a specific set of numbers, and that makes it easier to achieve. Indeed, it’s easy to do the sums to see this. Following the reasoning for the UK lottery, extracting any five balls from the 36 in the Florida Fantasy 5 draw in the right sequence is possible in around 45 million ways. Again, the organisers cut us some slack, and any of the 120 different orderings of five balls is acceptable as a win, so there are around 375,000 ways of matching the jackpot-winning balls. But of these, only some will be all consecutive: the first such set is {1, 2, 3, 4, 5}, then {2, 3, 4, 5, 6} all the way up to {32, 33, 34, 35, 36}. There are just 32 such consecutive sets, so the probability of the five numbers being consecutive is 32/375,000 = 1 in 12,000. As draws are held seven days a week all year round, that means we should expect a roughly 30-year gap between each example of five consecutive balls. Give randomness enough time, and it’ll come up with anything. In the event, the first popped up after 23 years, which is a little early, but not egregiously so.
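    Here is the Fantasy 5 sum in code, again using only the rules stated above:

```python
from math import comb, perm

print(perm(36, 5))            # 45,239,040 ordered ways - 'around 45 million'
print(comb(36, 5))            # 376,992 possible jackpot combinations
consecutive = 36 - 5 + 1      # {1..5}, {2..6}, ... {32..36}: 32 sets
p = consecutive / comb(36, 5)
print(round(1 / p))           # one all-consecutive jackpot per ~11,781 draws
print(round(1 / p / 365, 1))  # roughly 32 years of daily draws between examples
```

At around 32 years between expected occurrences, the 23-year wait for Florida’s first example was, as the text says, a little early but nothing remarkable.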
    There’s one more valuable lesson about randomness we can learn from lottery draws – and a case study popped up in the UK lottery midweek draw shortly after that set of four consecutively numbered balls. First came a triple of 9, 10, 11, then another – 32, 33, 34 – the following week, and then another – 33, 34, 35 – the week after that.
    This time we have a cluster of patterns. So what are we to make of this? Nothing – apart from its startling demonstration of how real randomness can throw up such clusters. Combinatoric calculations show that, in the long run, such triples will pop up in one in 26 draws from this kind of lottery. But randomness, with its customary lack of rhyme or reason, has no way of rigidly sticking to that rate. Sometimes the triples will be widely spaced, and sometimes they’ll come in clusters as they did in 2008. Only conspiracy theorists are likely to see anything in such clusters. It’s a different matter when the patterns produced by randomness represent not lottery numbers but, say, cancer cases in a town. Maybe there’s something in the patterns, maybe there isn’t, but even then we must remember that randomness is capable of producing patterns and clusters of patterns with surprising ease.
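    The rates for longer runs can be computed with a small dynamic program that counts 6-number selections from 49 whose longest block of consecutive numbers stays below a given length. A caveat: the exact figures depend on precisely how a ‘run’ is counted (for instance, whether a run of four is also treated as containing triples), so the probabilities below are a sketch and may differ somewhat from the rounded rates quoted in the text, rather than a reconstruction of the author’s own calculation:

```python
from functools import lru_cache
from math import comb

def count_max_run(n, k, m):
    """Number of k-subsets of {1..n} whose longest run of consecutive numbers is <= m."""
    @lru_cache(maxsize=None)
    def f(pos, ones, run):
        if pos == n:
            return 1 if ones == k else 0
        total = f(pos + 1, ones, 0)                 # skip this number
        if ones < k and run < m:
            total += f(pos + 1, ones + 1, run + 1)  # take it, extending the run
        return total
    return f(0, 0, 0)

total = comb(49, 6)
print(count_max_run(49, 6, 1) == comb(44, 6))  # True: cross-check against C(44,6)

p3 = 1 - count_max_run(49, 6, 2) / total       # a run of three or more
p4 = 1 - count_max_run(49, 6, 3) / total       # a run of four or more
print(round(1 / p3))                           # on the order of a few dozen draws per triple
print(round(1 / p4))                           # on the order of a few hundred per run of four
```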
    Sometimes the lottery does things that can make even mathematicians smile. After churning out the simple patterns in July and August, on 3 September 2008 the UK lottery spat out its most sophisticated pattern yet: 3, 5, 7, 9, four consecutive odd numbers. And after that, it went back to months of doing what randomness is ‘supposed’ to do: being dull, flat and patternless.
    Many mathematicians regard playing lotteries as unspeakably stupid. They point to the shockingly low odds of winning the jackpot (remember those ten bags of sugar and that single grain?) and to the fact that the organisers set up lotteries so that players typically have to spend more than the average jackpot in tickets to stand a decent chance of winning. Which is true, though one could argue that paying for a ticket boosts the chances of winning by an infinite amount, from zero to 1 in 14 million, which is a lot of bang for your buck. But as we’ve seen, while you do have to be ‘in it to win it’, some priceless lessons about randomness can be had from any lottery for free.
    Most of us think we know what randomness looks like: nice and smooth and utterly lacking in any patterns or clusters. The reality is very different – as the numbers that emerge during lottery draws show. They feature all kinds of patterns and clusters. But while the frequency of these patterns can be predicted, their precise identity never can.
    Warning: there’s a lot of X about
    In May 2014, a verdict of suicide was recorded on a sixteen-year-old who asphyxiated himself in his bedroom in Hale, Greater Manchester. William Menzies was a straight-A student with no obvious problems. But the coroner had noticed something that worried him – something that connected the tragedy to another case of teenage suicide he’d personally dealt with, along with two others he’d encountered. All the victims had killed themselves after playing a video game. And not just any video game, either, but the best-selling Call of Duty, in which players take part in virtual warfare.
    Among its millions of fans – and its critics – Call of Duty is known for its immersive realism. The notorious lone terrorist Anders Breivik claimed to have used it for training before slaughtering 77 people in Norway in one day in July 2011. Could it be that Call of Duty is so realistic it triggers the same side effects as real-life combat, such as post-traumatic stress disorder, depression and even suicidal thoughts? The coroner was sufficiently concerned by the risk that he issued a warning, urging parents to keep their kids away from such games.
    Not everyone was convinced by his logic. Among the sceptics was Dr Andrew Przybylski, an experimental psychologist at the Oxford Internet Institute. He pointed out that millions of teenagers play Call of Duty in the UK, so it should hardly be a surprise if some of those who commit suicide also own it. Dr Przybylski underlined his point with an analogy: lots of teenagers wear blue jeans, making it pretty likely that many of those who commit suicide were wearing blue jeans at the time. So does it make sense to conclude blue jeans lead to suicide?
    Put like that, it’s clear why such arguments don’t really stack up. First, they focus on only part of what’s needed to make the case for a causal link between X and Y. That is, they focus on the surprisingly high probability of teenagers who commit suicide having recently played Call of Duty. But how do we know it’s surprisingly high? The only way to tell is by putting it into context – which means comparing it to the probability of teenagers who don’t commit suicide having recently played Call of Duty. And if we’re dealing with something as ubiquitous as teenagers playing Call of Duty, you can bet a high proportion of perfectly happy teenagers will have played it too.
    This highlights a general point: be wary about believing X explains Y if X is very common. But the flip-side is also true: if some effect is very common, be wary of blaming its appearance on some specific cause – for if it’s very common, it’s likely to have multiple causes. A classic example of this has recently made headlines in connection with a major public health debate in the UK. Statins are cholesterol-lowering drugs, and they’ve been shown to reduce the risk of death among people with a relatively high risk of heart disease. This has led some medical experts to propose that even people with little or no extra risk should also take statins as a preventive measure. It’s a proposal that has sparked a huge row among experts and patients alike. Some see it as a step towards the ‘medicalization’ of the population, in which we all pop pills rather than live healthier lives. But most concern centres on the widespread reports of fatigue and muscular aches and pains among those taking statins. No one is dismissing the distress such symptoms bring – though some would argue they’re a small price to pay in return for a reduced risk of premature death. What no one could argue with, however, is the fact that such symptoms are extremely widespread. And that leads to the suspicion that the link with statins may be entirely spurious.
    This possibility has recently been put to the test by an analysis of studies collectively involving over 80,000 patients.1 These studies were ‘double blinded’, meaning that neither the patients nor the researchers knew who was getting statins and who was getting a harmless placebo. The data showed that around 3 per cent of people given statins did indeed suffer from fatigue, and a startling 8 per cent from muscle aches. All very worrying – until one learns that virtually identical proportions of those patients getting the placebo also experienced these two symptoms. In other words, there’s no reason to think that taking statins leads to an increased risk of their most ‘notorious’ side effects. They’re simply so common that there’s a relatively high chance that someone who starts taking statins will also experience an outbreak of fatigue and aches and pains – and, entirely understandably, blame the drugs.
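    The logic of that comparison – two trial arms showing near-identical symptom rates – is what a standard two-proportion z-test formalises. The counts below are illustrative stand-ins (not the pooled analysis’s actual figures), chosen to mimic ‘around 8 per cent in both arms’ of a large trial:

```python
from math import sqrt

# Hypothetical counts mimicking ~8% muscle aches in both arms of a large trial
n_statin, aches_statin = 40_000, 3_230
n_placebo, aches_placebo = 40_000, 3_185

p1 = aches_statin / n_statin
p2 = aches_placebo / n_placebo
pooled = (aches_statin + aches_placebo) / (n_statin + n_placebo)
se = sqrt(pooled * (1 - pooled) * (1 / n_statin + 1 / n_placebo))
z = (p1 - p2) / se
print(abs(z) < 1.96)   # True: well inside the usual threshold - no evidence of a difference
```

With rates this close, even 80,000 patients give no statistical grounds for blaming the drug – the symptoms are simply common in everyone.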
    Understandable, perhaps – but justifiable only if one has ruled out the risk of having mistaken ubiquity for causality. And sometimes it takes full-blown scientific studies involving huge amounts of data to do that.
    Strangely enough, an entire class of scientific studies has been justified on the basis of this kind of flawed reasoning. It concerns perhaps the most controversial issue in experimental science: the use of animals. There’s no question that experiments on animals have been important in many areas of medicine, from surgery to cancer research. Nor can anyone doubt that such use of animals has provoked strong reactions from both the pro- and anti-vivisection camps. The resulting debate has been vociferous, even violent, with each side exchanging claims and counter-claims. But for those who support the use of animals, one claim has acquired almost talismanic power: that ‘virtually every’ medical achievement of the last century has depended in some way on animal research.
    Despite being quoted by leading researchers and even the Royal Society, Britain’s premier scientific academy, the justification for this statement is far from clear. The wording comes from a claim made in an anonymous article in a newsletter circulated by the American Physiological Society about twenty years ago. It contains not a single reference to back up its impressive assertion. Even so, the implication is clear: it’s vital to continue with experiments on animals if scientists are to continue finding life-saving drugs. Yet like the supposed link between suicide and video games, this overlooks a key issue: the sheer ubiquity of animal experiments. Since the thalidomide tragedy of the 1950s, there’s been a legal requirement that every new drug undergoes testing on animals before it is allowed to be tested on human volunteers, let alone released onto the general market. As a result, every drug – regardless of whether it works in humans or not – will have been tested on animals. The fact that all the most successful drugs have been tested on animals is thus merely a truism, and tells us nothing about the causal link between the use of animals and the advancement of medicine. Claiming that it does makes as much sense as claiming that the equally ubiquitous practice of wearing lab coats is crucial to medical progress. As such, the statement endorsed by the Royal Society (among many others) is essentially vacuous. It’s important to stress, however, that this doesn’t imply animal experiments are pointless. What it does mean is that scientists need hard evidence if they are to prove the value of animal experiments. Surprisingly little work has been done in this area, and what has been done is largely not fit for purpose.2 What evidence there is points to a rather more nuanced view of animal experiments than either side of the debate seems willing to concede. 
It suggests that animal models do have some value in detecting toxicity before human trials, but are poor indicators of safety. Put more prosaically, if Fido the dog reacts badly to some compound, it’s likely humans will too. But if Fido can cope with it just fine, that says very little about what will happen in poor, delicate us.
    Proving that one thing causes something else is often tricky – and it’s fraught with danger if either the suspected cause or effect is very common. Showing that the suspected cause always precedes the effect is a start, but in such cases it’s rarely enough.
    Why the amazing so often turns ho-hum
    We see it everywhere, from breakthrough movies whose sequels suck to soaring stocks that suddenly collapse. Today’s skyrocketing successes have a habit of turning into tomorrow’s damp squibs. What’s especially galling is the way they so often lose their magic at the very moment we notice them. Our friends tell us of some absolutely amazing local restaurant they visited last week, so we give it a go – and it’s just ho-hum. We bet on a tennis player making headlines for her stellar performances – only to see her sink back into the pack. Sometimes it’s hard not to think everything’s just hype, and that most things are just, well, pretty average. And the thing is, when it comes to understanding this irksome quirk of life, you’d be on the right track.
    Everyone’s heard the line ‘Don’t believe the hype’, which of course none of us would if only we could distinguish it from reliable judgements. Hype is usually taken to mean some kind of exaggeration of the truth, but that presumes we know what the truth actually is. This is where knowing a bit of probability theory helps. First, the Law of Averages tells us that when trying to gauge the typical performance of anything that can be affected by random effects, we should collect plenty of data. Clearly, it makes little sense to expect an amazing sequel from a first-time author or tyro movie director, as both have given us just one data-point on which to judge them.
    But probability theory also warns us that collecting lots of data isn’t enough; it must also be representative. By definition data solely about exceptional performance aren’t representative. Yet that’s exactly what we’re being fed when we read rave reviews, see banner headlines, or hear pundits rave about some new soaring stock. As a result, when it comes to assessing exceptional events, we should always fear the phenomenal. Basing our judgement solely on evidence of exceptional performance makes us likely to fall prey to a tricksy effect known as regression to the mean. First identified almost 150 years ago by the English polymath Sir Francis Galton, it’s still not as widely known as it should be, despite its ubiquity.
    Perhaps the most common victims of regression to the mean are sports fans. They’ve seen it at work countless times, and may well have suspected something weird is going on – but rarely twig what it is. This is how it goes. At the start of the season, it all looks like business as usual – win some, lose some. Then the team goes off the boil, and starts heading for relegation. Action is clearly needed; heads must roll. After a run of defeats, the club gets the message and fires the manager. And sure enough, it does the trick: the team starts doing better under the new manager with his new tactics. But then it all starts going wrong again. After a run of solid performances, the team starts slipping. Just a few months after the upheaval, the team seems barely any better off – and the muttering about getting a new manager starts all over again.
    This will sound familiar even to those who wouldn’t know one end of a football from another. That’s because the same phenomenon can be seen at work everywhere from underperforming schools to tanking stocks. The basic idea behind regression to the mean is not hard to understand. The performance of a team – or a school or stock price – depends on a host of factors: some obvious, some less so, but all of which contribute to the average or ‘mean’ level. Yet at any given time, the actual performance is unlikely to be dead-on average. It will usually be a bit above or below the mean level, as a result of nothing more significant than random variation. This can be surprisingly large, and persist for a surprisingly long time, but eventually its positive and negative impacts balance out, and the performance will ‘regress’ back to its mean value. The trouble is, regression to the mean is especially strong with the most extreme events, as these are typically the most unrepresentative of all. Anyone who acts on the basis of such extreme events alone risks falling victim to the cruellest part of regression to the mean: its ability to make a bad decision initially look like a good one.
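The mechanism can be sketched in a few lines of Python. This is a toy simulation with made-up numbers, not real league data: every team here has exactly the same true skill, so any run of bad results is pure random variation – and the "worst" teams duly drift back toward the average next season.

```python
import random

random.seed(42)

# Hypothetical set-up: 100 teams share the same true (mean) performance
# level, but each season's results also include a random component.
TRUE_MEAN = 50.0
NOISE = 10.0  # spread of the random variation

season1 = [TRUE_MEAN + random.gauss(0, NOISE) for _ in range(100)]
season2 = [TRUE_MEAN + random.gauss(0, NOISE) for _ in range(100)]

# Pick the 10 worst performers of season 1 - the "compelling" evidence
# of failure on which a manager might be sacked
worst = sorted(range(100), key=lambda i: season1[i])[:10]

avg_before = sum(season1[i] for i in worst) / 10
avg_after = sum(season2[i] for i in worst) / 10

print(f"Worst teams, season 1: {avg_before:.1f}")
print(f"Same teams, season 2:  {avg_after:.1f}")
```

By construction no team is genuinely worse than any other, yet the bottom ten "improve" the following season simply because their freak bad luck is unlikely to repeat – which is exactly the effect a newly hired manager would happily take credit for.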
    So, for example, a manager brought in to run a sports team after ‘compelling’ evidence of poor performance may well benefit from a run of better performances. Yet the improvement may well be nothing more than regression to the mean, the team merely heading back towards its typical level of performance after the random bad run that cost the last manager his job. Wait long enough, and the typical level of performance will reassert itself. The first signs may well appear with players who seemed to sparkle under the new manager. They may just have had a run of luck that happened to coincide with the new manager’s arrival, and so will also experience regression to the mean – and start to look more average as time wears on. Then the apparent boost enjoyed by the whole team starts to fade too. Of course, sometimes teams underperform because managers genuinely lose their touch. Even so, research by statisticians and economists using real-life data shows that regression to the mean can and does affect sports teams, with managers hired and fired, but with little effect on overall team performance.
    Once you know about regression to the mean, you’ll start to see it everywhere. That’s because we so often focus on extremes. Take management techniques aimed at boosting performance. Many line managers are convinced fear is the best motivator – and even claim to have hard evidence to prove it. Every time their team seriously underperforms, they call them in for a kicking – and sure enough performance improves. And don’t give me all that stuff about rewarding performance, says the gung-ho manager: that’s ‘obvious’ baloney. After all, when bonuses are given to the top sales team each quarter, it has a habit of becoming ho-hum next quarter; that’s ‘obviously’ complacency.
    And yes, the performance data do seem to prove it – unless you know about regression to the mean. The trouble is, gung-ho bosses rarely welcome being told that the ‘compelling’ evidence for their effectiveness is probably nothing more than a statistical effect – which may be another reason so few know about it.
    The amazing curative powers of regression to the mean
In their quest for new therapies, medical researchers run the risk of being tricked into thinking they’ve found a miracle cure by regression to the mean. That’s because, by its very nature, the search for such treatments often focuses on patients with abnormal characteristics, such as unusually high blood pressure. Yet sometimes these abnormalities can be nothing more significant than random deviations from normality which will fade away over time. Spotting such effects is a challenge for researchers testing a new drug, as they run the risk of being fooled into thinking the drug has brought about an improvement over time, when the condition has simply regressed to the mean. They deal with it by setting up so-called randomised controlled trials, in which patients are randomly allocated either to receive the drug or a harmless placebo ‘control’. As both groups are equally likely to experience regression to the mean, its effects can be cancelled out by comparing the relative cure-rates in both groups. Unfortunately, no such safeguards are available to us when a friend recommends some remedy for, say, back pain. Lacking any comparison group, it’s hard to be sure that any benefit we get isn’t just regression to the mean. Indeed, some doctors argue that patients who believe they’ve been cured by ‘alternative medicine’ such as homeopathy have benefited from nothing more than regression to the mean. Its advocates insist, however, that studies that take this possibility into account have been carried out, and still show a net benefit.
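The trick the trial design performs can also be sketched in code. In this toy simulation (all the numbers are invented for illustration), patients are recruited only when a single blood-pressure reading looks alarmingly high – so both groups regress toward their stable levels at follow-up, and the placebo group appears to "improve" without any treatment at all. Comparing the two groups strips that shared regression out.

```python
import random

random.seed(1)

# Hypothetical sketch: each patient has a stable underlying blood
# pressure, but any single reading adds random measurement noise.
def recruit(n):
    stable = [random.gauss(150, 10) for _ in range(n * 10)]
    screened = [(s + random.gauss(0, 15), s) for s in stable]
    # enrol only patients whose screening reading looked worryingly high
    return [(reading, s) for reading, s in screened if reading > 165][:n]

drug = recruit(50)
placebo = recruit(50)

TRUE_DRUG_EFFECT = -5.0  # assumed genuine benefit of the drug

def average_fall(group, effect=0.0):
    # fall = screening reading minus a fresh reading
    # (stable level + new noise + any true drug effect)
    falls = [reading - (s + random.gauss(0, 15) + effect)
             for reading, s in group]
    return sum(falls) / len(falls)

drop_drug = average_fall(drug, TRUE_DRUG_EFFECT)
drop_placebo = average_fall(placebo)

print(f"Average fall, drug group:    {drop_drug:.1f}")
print(f"Average fall, placebo group: {drop_placebo:.1f}")
print(f"Estimated drug effect:       {drop_drug - drop_placebo:.1f}")
```

The placebo group's apparent improvement is pure regression to the mean – which is precisely what an uncontrolled before-and-after comparison, or a friend's glowing recommendation, cannot distinguish from a real cure.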
We can at least protect ourselves from self-delusion, however. For example, when it comes to making investments, we need to be very wary about the go-go stocks highlighted by financial pundits. They naturally focus on phenomenal, headline-grabbing performance – the classic breeding-ground for regression to the mean. Again, this isn’t some theoretical risk. The Princeton University economist and scourge of Wall Street Dr Burton Malkiel has made a study of what happens to those who invest in ‘obvious’ winners.1 He compiled a list of the equity funds that performed best over the five years 1990 to 1994. The top 20 of these funds outperformed the S&P 500 index by an impressive annual average of 9.5 per cent, and were ‘obvious’ winners. Malkiel then looked at how these same funds did over the next five years. Collectively, they underperformed by an average of more than 2 per cent relative to the whole stock market. The rankings of the top three slipped from 1st to 129th, 2nd to 134th and 3rd to a truly dismal 261st. Such is the power of regression to the mean to hand out lessons in humility.
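A toy simulation – not Malkiel's actual data, and with invented figures throughout – shows why his finding is exactly what chance alone would predict. Give every fund the same true expected return, rank them over one five-year stretch of pure luck, and watch where the "obvious winners" land in the next:

```python
import random

random.seed(7)

# Hypothetical sketch: 300 funds with identical true expected returns;
# any five-year performance figure is that return plus luck.
def five_year_returns(n):
    return [8.0 + random.gauss(0, 4) for _ in range(n)]

period1 = five_year_returns(300)
period2 = five_year_returns(300)

# Rank the funds by first-period performance (rank 1 = best)
order1 = sorted(range(300), key=lambda i: period1[i], reverse=True)
order2 = sorted(range(300), key=lambda i: period2[i], reverse=True)
rank2 = {fund: r + 1 for r, fund in enumerate(order2)}

top20 = order1[:20]
avg_rank_later = sum(rank2[fund] for fund in top20) / 20
print(f"Top 20 funds' average rank five years later: {avg_rank_later:.0f}")
```

Since skill plays no part here, the former stars scatter back toward the middle of the table – mirroring the collapse from 1st to 129th that Malkiel found in the real rankings.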
As with football managers, however, a handful of investment managers really do seem to know what they’re doing and achieve consistently impressive performance that can’t be dismissed as some statistical fluke. One such is former Wall Street legend Peter Lynch, whose Magellan Fund performed astoundingly well during the 1970s and 1980s. Unfortunately, the evidence suggests that most ‘star’ fund managers are just temporarily benefiting from regression to the mean, and are destined to fade after a few years.