Predicting the Future Is Hard

Building better models, from elections to financial markets

Lynne Kiesling | From the March 2013 issue

The Signal and the Noise: Why So Many Predictions Fail—But Some Don't, by Nate Silver, Penguin Press, 544 pages, $27.95

The Physics of Wall Street: A Brief History of Predicting the Unpredictable, by James Owen Weatherall, Houghton Mifflin, 304 pages, $27

Human beings naturally look for patterns in the mess of events and data that surrounds us. Groping for hidden architecture is an evolutionary response to a complex world. In general it serves us well, but we risk detecting patterns where none actually exist.

Sometimes we can learn after the fact that our pattern-based predictions were incorrect, and we update and move on, ideally with more humility and an updated mental model for the future. But biases often persist even after correction, especially when the subject of our attention is something with deep emotional roots, like the predicted outcome of an election.

Given the power of pattern recognition and our inherent biases, how do we separate the signal from the noise? That question has intrigued statisticians for centuries, including the statistician of the moment, Nate Silver. In The Signal and the Noise, the well-known New York Times poll-watcher examines the phenomenon of prediction. Silver asks how, in the face of uncertainty, we can separate meaningful patterns from the vast amount of information and data available to us.

Our innate cognitive limitations and biases, the biases arising from our use of perception, and the biases we introduce into prediction due to our interpretation and analysis all combine to distort rather than clarify. As Yogi Berra once observed, "Prediction is very hard, especially about the future."

Prediction involves a theoretical model to formulate a hypothesis, an empirical model to gather and analyze the (necessarily incomplete) data to test that hypothesis, and a method of evaluating the inferences drawn from those models to see if the theoretical and empirical models can be improved, in order to generate better future predictions.

Silver argues that better models and more successful predictions come from applying Bayesian reasoning, which revolutionized statistics in the 18th century and is used in engineering, medicine, and economics to analyze data. Bayesian reasoning involves formulating a probability of an event's occurrence, then updating that probability as new data arrive. Silver uses the example of finding a strange pair of underwear in your partner's drawer. A Bayesian analysis of whether your partner is cheating on you requires a hypothesis (cheating), an alternative hypothesis or reason why the underwear would be there, an estimate of how likely the underwear is to appear under each hypothesis, and a prior probability you would have assigned to the cheating hypothesis before finding the underwear. This prior is crucial. Given estimates of these variables, you can calculate the probability that your partner is cheating on you, which you can express as a degree of confidence in the cheating hypothesis.
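For readers who want to see the arithmetic, the underwear calculation can be sketched in a few lines of Python. This is a minimal illustration of Bayes' rule, not Silver's own code: the function name `posterior` and all three input numbers are assumptions chosen for illustration, not figures taken from this review.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' rule: probability the hypothesis is true after seeing the evidence."""
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1.0 - prior) * p_evidence_if_false
    return numerator / denominator


# Illustrative inputs (assumptions, not figures from the review):
prior_cheating = 0.04            # belief in the cheating hypothesis before the discovery
p_underwear_if_cheating = 0.50   # chance the underwear appears if the partner is cheating
p_underwear_if_faithful = 0.05   # chance it appears for some innocent reason

updated = posterior(prior_cheating, p_underwear_if_cheating, p_underwear_if_faithful)
print(f"Probability of cheating after the discovery: {updated:.2f}")
```

With these inputs the probability rises from 4 percent to about 29 percent. Start instead from a prior of 50 percent and the same evidence pushes it to about 91 percent, which is why the choice of prior matters so much.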

A fundamental Bayesian insight is that we learn about the world (and its patterns) incrementally. As we gather more data, says Silver, we get "closer and closer to the truth" (emphasis in original). Thus we can refine our models and perform better approximations, yielding more accurate estimates of our confidence in the truth of the hypothesis.

Silver has applied these techniques in formulating statistical models in poker, in baseball, and most famously in U.S. presidential elections. (In 2008 he accurately predicted the outcome in 49 out of 50 states. In 2012 he was right about all 50.)

The Bayesian approach to probability and statistics is not the only one, and it is not always intuitive. The largest debate in probability theory arises between the Bayesian and the frequentist approaches. Frequentists interpret the probability of an event as a relative frequency of its occurrence, which is defined only in reference to a base set of events (for example, the probability of heads in a large number of coin tosses). In Bayesian statistics, a probability is a subjective degree of confidence based on a subjective prior, so each person can hold a different probability of the same event occurring. That subjectivity means abandoning the idea of probability as a frequency.
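The coin-toss contrast can be made concrete with a small sketch. It assumes a Beta prior, a standard textbook choice for this problem rather than anything the review or Silver specifies; the function names, the two hypothetical observers ("skeptic" and "newcomer"), and the toss counts are all invented for illustration.

```python
def frequentist_estimate(heads, tosses):
    """Relative frequency of heads in the observed tosses."""
    return heads / tosses


def bayesian_estimate(heads, tosses, prior_heads, prior_tails):
    """Posterior mean for P(heads) under a Beta(prior_heads, prior_tails) prior."""
    return (prior_heads + heads) / (prior_heads + prior_tails + tosses)


# Two hypothetical observers see the same data but start from different priors:
# the skeptic strongly expects a fair coin, the newcomer has almost no opinion.
for heads, tosses in [(7, 10), (7000, 10000)]:
    freq = frequentist_estimate(heads, tosses)
    skeptic = bayesian_estimate(heads, tosses, prior_heads=50, prior_tails=50)
    newcomer = bayesian_estimate(heads, tosses, prior_heads=1, prior_tails=1)
    print(f"{tosses:>5} tosses: frequency={freq:.3f} "
          f"skeptic={skeptic:.3f} newcomer={newcomer:.3f}")
```

After 10 tosses the two observers disagree noticeably (roughly 0.52 versus 0.67), even though both saw seven heads. After 10,000 tosses at the same rate, both sit within a hair of the observed frequency of 0.70: the priors differ, but enough data swamps them.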

However esoteric this debate sounds, it's at the core of the different interpretations of Silver's 2012 U.S. presidential predictions. He chose as his subjective prior a set of state-level polls that in his judgment were more likely to represent underlying beliefs accurately, and that would therefore enable him to make more accurate predictions.

But by and large, people find frequentist representations more intuitive. Research from the psychologist Gerd Gigerenzer, supported by further evolutionary psychology research by Leda Cosmides and John Tooby, indicates that we tend to apply more accurate Bayesian reasoning when presented with probabilistic data in frequency form. Gigerenzer's pioneering research also shows how we use heuristics and rules of thumb to make approximations in complex situations when we cannot grasp all of the data relevant to a decision.

Silver contends correctly that such trial-and-error intuition contributes to the biases that can harm prediction, but he does not discuss the fundamental and important tradeoff that exists between the costs of those biases and the benefits that arise from informed approximation. Bayesian reasoning is itself a rule of thumb, and one that encourages us to be more systematic in our thinking about the future.

Silver develops his theme in application to several case studies told as freestanding vignettes, from political prediction to sports betting to climate. Predictions in these cases have differing degrees of success, depending on the quality of the theoretical and empirical models, as well as the availability and reliability of the data with which to test them. The quality of models depends on variables such as computing technology and how nonlinear and dynamic the underlying system is. It also depends on the judgment of the person constructing the model.

Human judgment enables a model to reflect human information, but it also introduces the potential for bias from using our perception to build models and interpret their results. This inclination can be a feature, as in weather prediction, which has improved over time as forecasters searched for patterns and tested them, or a bug, as in finance and baseball, where bias can lead to less accurate predictions.

Some of Silver's chapters cohere with the central Bayesian theme better than others do, and Silver does not consistently maintain the distinction between risk and uncertainty. Still, his skillful writing and storytelling make The Signal and the Noise an enjoyable read, even if you are not a prediction junkie. Overall it is a thoughtful, well-cited work with informative attention to detail.

Similarly, James Owen Weatherall's The Physics of Wall Street is an engaging, well-written history of the work of physicists, mathematicians, and statisticians on modeling financial markets since the late 19th century. Weatherall, a physicist, mathematician, and philosopher, unearths research from some unjustifiably underappreciated mathematicians, and he narrates a lively story about their work while making challenging ideas easier to understand.

Some of Weatherall's subjects, such as Benoit Mandelbrot, discovered entirely new fields of inquiry (in Mandelbrot's case, fractal geometry and chaos theory) as they developed theories to solve concrete problems. Others, such as physicist Fischer Black, pioneered the application of physics models to complicated finance problems like options pricing. In all cases Weatherall shows that intellectual nonconformity and interdisciplinary collaboration were key to his subjects' successes.

Weatherall's main theme is that the methodology of physics involves developing appropriately simple models, being honest about their assumptions, testing those models, and then revising them when they perform poorly or when their assumptions no longer hold. Based on this foundation, he argues that the physicists and other quants are not entirely to blame for failures to predict financial market downturns such as the 2008 crisis, nor even for having developed models and financial innovations that made financial markets more brittle and less resilient.

"Putting all of the blame for the 2007–2008 crisis on Li's model, or even securitized consumer loans, is a mistake," Weatherall writes. "The crisis was partly a failure of mathematical modeling. But even more, it was a failure of some very sophisticated financial institutions to think like physicists. The model worked well under some conditions, but like any mathematical model, it failed when its assumptions ceased to hold."

Weatherall's research and argument are broadly persuasive but incomplete. His account does not address the fact that physicists develop these models within a framework of human institutions, the sets of formal and informal rules that govern how individuals act and interact in financial markets and the broader economy. These institutions shape the incentives of all kinds of people—including quants and the people who employ them—in the complex network of markets.

So Weatherall's conclusion is accurate, but the financial crisis was largely a failure of institutions and incentives that made financial markets more brittle, not solely a failure of mathematical modeling per se. While the Warren Zevon fan in me appreciates his epilogue's invocation to "send physics, math and money!" to enable better outcomes in financial markets, it's a prescription that overlooks the distorted incentives that existed, and persist, in financial markets.

These two books have several shared attributes that make them worth reading—lively writing that humanizes a difficult topic, attempts to understand modeling and prediction in the face of uncertainty, and application to well-examined case studies in finance, weather, earthquakes, and poker. A common theme is the danger of assuming that the risks of bad outcomes are independent of each other instead of related, especially in financial markets. Having faith that your model will work regardless of conditions leads to poor predictions and unexpected outcomes. Good modeling requires constant testing and humility, even (or especially) after a spectacularly successful election prediction.
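The cost of wrongly assuming independence is easy to see with a toy calculation. The sketch below is not drawn from either book: the two-loan setup, the shared "downturn" state, and every probability in it are hypothetical numbers chosen so that each loan defaults 5 percent of the time overall.

```python
# Hypothetical setup: two loans whose fortunes both depend on whether a downturn occurs.
P_DOWNTURN = 0.10              # assumed chance of a downturn
P_DEFAULT_IN_DOWNTURN = 0.41   # assumed default chance for each loan in a downturn
P_DEFAULT_OTHERWISE = 0.01     # assumed default chance for each loan otherwise

# Overall default chance for a single loan, averaging over both states of the world.
p_default = (P_DOWNTURN * P_DEFAULT_IN_DOWNTURN
             + (1 - P_DOWNTURN) * P_DEFAULT_OTHERWISE)

# Chance both loans default if a modeler treats them as independent.
joint_if_independent = p_default ** 2

# Chance both default once the shared downturn is accounted for
# (the loans are treated as independent only within each state of the world).
joint_with_shared_shock = (P_DOWNTURN * P_DEFAULT_IN_DOWNTURN ** 2
                           + (1 - P_DOWNTURN) * P_DEFAULT_OTHERWISE ** 2)

print(f"Single-loan default chance:      {p_default:.4f}")
print(f"Both default, assumed unrelated: {joint_if_independent:.4f}")
print(f"Both default, shared downturn:   {joint_with_shared_shock:.4f}")
```

Each loan looks equally risky either way, but under these assumptions the chance of a double default is nearly seven times what the independence assumption suggests.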