Demystifying Chance: Understanding the Secrets of Probability


Ten Great Ideas about Chance. By Persi Diaconis and Brian Skyrms. Princeton University Press, Princeton, NJ, November 2017. 272 pages, $27.95.

For the better part of a decade, Persi Diaconis and Brian Skyrms taught a course at Stanford University on the history, philosophy, and common foundations of probability and statistics. With the passage of time, they realized that the story they were telling would likely be of interest to a larger audience. Thus, Ten Great Ideas about Chance was born.

As the title suggests, the book consists of 10 chapters exploring 10 significant ideas about chance. The back matter comprises an appendix offering a tutorial on probability, extensive chapter notes, an index, and an “annotated select bibliography.” The latter contains 10 numbered sections listing 41 seminal books and papers, with brief commentary on each. The entire book can be considered an extended digest of this list.

Some chapters, such as the fifth, concerning the mathematics of probability, are more or less obligatory in a book of this nature. After a few words about finite probability and a brief exposition of Borel and Cantelli’s proof of the strong law of large numbers, the authors describe the sixth of Hilbert’s 23 challenge problems. In this problem, Hilbert proposed that those physical sciences in which mathematics plays a significant role, especially the theories of probability and mechanics, be placed on a sound axiomatic basis. He was apparently thinking of Ludwig Boltzmann’s theory of gases, in which a swarm of hard spheres moves about in a rigid container; the spheres rebound off one another and the surrounding walls without losing energy. Can one demonstrate, given a plausible prior distribution on the spheres’ initial positions and momenta, that low-entropy states are likely to evolve into high-entropy states?

Little came of Hilbert’s suggestion until 1933, when Andrei Kolmogorov published his groundbreaking book [1] on the foundations of probability theory. Kolmogorov did three important things: used measure theory to place probability on a firm mathematical foundation, formalized the previously nebulous concept of conditional probability, and proved an extension theorem that shows how an infinite-dimensional stochastic process can be built up from a consistent family of finite-dimensional probability spaces. His work led to an almost immediate flowering of probability theory that continues to this day.

Equally indispensable to Diaconis and Skyrms’ purpose is a chapter on inverse inference, beginning with the question that concerned Reverend Thomas Bayes: after a coin of unknown bias has come up heads $n$ times in $N>n$ trials, what are the odds that the probability $p$ of its occurrence in a single subsequent trial lies within a given subinterval of $[0,1]$? Bayes solved this problem on the assumption that $p$ is equally likely to lie anywhere in the unit interval before trials begin. Laplace later revisited Bayes’ problem and arrived at his famous “rule of succession,” $p_{est}=\frac{n+1}{N+2}$. For large $n$ and $N$, this scarcely differs from the naïve estimate $\frac{n}{N}$. Modern critics have argued that, for an ordinary-looking coin, probabilities near the middle of $[0,1]$ seem more likely than those at either extreme. Indeed, postulating a prior beta distribution $B(x;\alpha, \beta)$ on $p$ shows that the same $n$ heads in $N$ trials leads to an updated beta distribution with parameters $\alpha+n$ and $\beta+N-n$. Hence, $p_{est}=\frac{n+\alpha}{N+\alpha+\beta}$, which again approximates $\frac{n}{N}$ for large $n$ and $N$.
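The two estimates above can be checked with a few lines of arithmetic. The following is a minimal sketch rather than anything taken from the book; the function names `posterior_mean` and `rule_of_succession` are hypothetical, and the prior parameters are assumed to be integers (or `Fraction` values) so that the results stay exact.

```python
from fractions import Fraction


def posterior_mean(n, N, alpha=1, beta=1):
    """Mean of the Beta(alpha + n, beta + N - n) posterior after n heads in N tosses.

    alpha and beta are the parameters of the assumed beta prior on p;
    alpha = beta = 1 is the uniform prior used by Bayes and Laplace.
    Pass integers or Fractions so the arithmetic stays exact.
    """
    if not 0 <= n <= N:
        raise ValueError("need 0 <= n <= N")
    return Fraction(n + alpha) / Fraction(N + alpha + beta)


def rule_of_succession(n, N):
    """Laplace's estimate (n + 1) / (N + 2), i.e. the uniform-prior case."""
    return posterior_mean(n, N, alpha=1, beta=1)


if __name__ == "__main__":
    # Illustrative counts only: 17 heads in 50 tosses.
    n, N = 17, 50
    print("naive estimate     :", Fraction(n, N))
    print("rule of succession :", rule_of_succession(n, N))
    # A prior concentrated near 1/2 (alpha = beta = 5 is an arbitrary choice).
    print("Beta(5, 5) prior   :", posterior_mean(n, N, alpha=5, beta=5))
```

Choosing a prior with larger $\alpha=\beta$ pulls the estimate toward $\frac{1}{2}$, but the pull weakens as $N$ grows, which is the sense in which all three estimates agree for large $n$ and $N$.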

Bayes’ theorem may present a valid rebuttal to philosopher David Hume’s 1748 essay, “An Enquiry Concerning Human Understanding,” which criticized conclusions drawn from records of past events. As investment advisors are honor-bound to warn potential customers that “past performance need not be indicative of future results,” predictions predicated on the assumption that the future will resemble the past are inherently risky and should not be acted upon without prior assessment of this source of risk. Hume also pointed out that randomness does not exist in nature (or did not seem to until quantum phenomena came to light) because in his day people believed that knowledge of Newton’s laws, together with the positions and momenta of every particle in the universe at one single instant, determined the entire future.

Another obligatory chapter concerns frequentism, the leading alternative to Bayesian inference, and the related notion that probability is a physical attribute observable only through repeated trials rather than a state of mind. The authors describe attempts by John Venn in 1866 and Richard von Mises in 1919 to base a coherent theory of probability on the premise that frequency testing alone can determine probabilities. Venn and von Mises also tried to expose the fallacy in Jacob Bernoulli’s argument that his weak law of large numbers makes it possible to determine the chance that a specific outcome will be forthcoming on a single trial, given the results of a sufficient number of previous trials. The authors concede that it is a subtle fallacy, yet one that notables like Borel, Kolmogorov, Paul Lévy, and Andrey Markov failed at times to avoid.

The first of the book’s 10 great ideas is the simple realization that chance can be measured. The authors reference Gerolamo Cardano’s advice to gamblers, the correspondence between Blaise Pascal and Pierre de Fermat, the work of Christiaan Huygens, and Jacob Bernoulli’s proof of the weak law of large numbers as early evidence of this fact. To be fair, the ancient Greeks and Romans were well aware of chance. Yet they tended to attribute the outcomes of wars, races, courtships, and other contested events to interventions of the gods or deeds of the so-called “Fates,” depicted as a trio of women. It remains a mystery that thinkers as perceptive as Euclid, Plato, and Archimedes never enunciated a law of large numbers or formulated a theory of discrete probability.

Diaconis and Skyrms’ second great idea is, unsurprisingly, that one can infer probabilities in situations where a priori estimates are either unavailable or unreliable. For example, if a coin turns up 17 heads in 50 tosses, it is only natural to suppose that (i) the coin is unfair and (ii) the likelihood of a head on the next toss is closer to 1/3 than 1/2. The authors cite Bruno de Finetti, Leonard “Jimmie” Savage, and Frank P. Ramsey as developers of the intuitive notion of subjective probability. The basic idea, which differs from the older concept of frequentism, is that probabilities can be inferred from a “coherent set of beliefs” concerning the possible outcomes of a particular chance event, such as a horse race or boxing match.
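One way to see why 17 heads in 50 tosses casts doubt on the coin is to ask how often a fair coin would do that badly. The sketch below is an illustration and not part of the book; it assumes independent tosses, and the function name `binomial_tail_at_most` is hypothetical.

```python
from math import comb


def binomial_tail_at_most(k, trials, p=0.5):
    """Probability of at most k heads in `trials` independent tosses,
    each landing heads with probability p."""
    return sum(
        comb(trials, i) * p**i * (1 - p) ** (trials - i)
        for i in range(k + 1)
    )


# Chance that a fair coin shows 17 or fewer heads in 50 tosses.
tail = binomial_tail_at_most(17, 50)
print(f"P(at most 17 heads in 50 fair tosses) = {tail:.4f}")

# The observed frequency 17/50 is indeed nearer 1/3 than 1/2.
print(abs(17 / 50 - 1 / 3) < abs(17 / 50 - 1 / 2))
```

A small tail probability does not by itself say what the true bias is; it only indicates that the fair-coin assumption fits the data poorly.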

A set of beliefs is coherent if it is impossible to construct a “Dutch book” predicated on them. A Dutch book is a collection of wagers with positive overall expectation. For instance, if both entries in a two-horse race go off at odds of 2-1, a Dutch book could consist of a $1 bet on each horse. The winning ticket would then return $3 on the $2 staked, while the losing ticket would return nothing. The resulting $1 gain is an expectation rather than a guarantee, since a dead heat would return only the $2 held by the track in escrow.
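The arithmetic of the two-horse example can be sketched as follows. This is an illustration under stated assumptions: “2-1” is read as odds of 2 to 1 against (a $1 stake returns $3), as the figures in the example imply, dead heats are ignored, and the names `implied_probability`, `net_gain_by_winner`, `horse_a`, and `horse_b` are hypothetical.

```python
from fractions import Fraction


def implied_probability(odds_against):
    """Win probability implied by odds of `odds_against` to 1 against."""
    return Fraction(1) / (odds_against + 1)


def net_gain_by_winner(odds_against_by_horse, stake=1):
    """Net gain from staking `stake` on every horse, keyed by the winner.

    The winning ticket returns stake * (odds + 1); all other tickets return 0.
    """
    total_staked = stake * len(odds_against_by_horse)
    return {
        winner: stake * (odds + 1) - total_staked
        for winner, odds in odds_against_by_horse.items()
    }


odds = {"horse_a": 2, "horse_b": 2}

# Implied probabilities that sum to less than 1 signal incoherent odds.
total_implied = sum(implied_probability(o) for o in odds.values())
print("sum of implied probabilities:", total_implied)

# Betting on every horse then gains whichever horse wins.
print("net gain by winner:", net_gain_by_winner(odds))
```

Whenever the implied probabilities sum to less than 1, some combination of stakes on every entry yields a gain regardless of the winner; with equal odds, equal stakes suffice.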

Amos Tversky and Daniel Kahneman advanced an equally great idea: that people are remarkably inept in their responses to chance events. Chapter three features a careful analysis of the famous Allais paradox in light of Savage’s axioms of rational decision-making. Economist Maurice Allais asked a number of respondents to choose between payoff schemes A and B in two quite similar lotteries. Whereas respondents constrained by Savage’s seemingly self-evident “axiom of independence” would choose B in both cases, Allais found that many flesh-and-blood respondents—including both Diaconis and Skyrms—chose B in the second case but A in the first. The experiment has been repeated multiple times worldwide, with quite similar results.
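The tension between such choices and expected-utility reasoning can be made concrete. The review does not reproduce the payoffs, so the sketch below assumes the commonly quoted textbook version of Allais’s two lotteries (prizes in millions of dollars), which may differ from the figures used in the book. The function names and the sample utility functions are likewise hypothetical.

```python
from math import isclose, log1p, sqrt


def expected_utility(lottery, utility):
    """Expected utility of a lottery given as (probability, prize) pairs."""
    return sum(prob * utility(prize) for prob, prize in lottery)


# Assumed textbook payoffs, in millions of dollars (not taken from the review).
LOTTERY_1A = [(1.00, 1)]
LOTTERY_1B = [(0.89, 1), (0.10, 5), (0.01, 0)]
LOTTERY_2A = [(0.11, 1), (0.89, 0)]
LOTTERY_2B = [(0.10, 5), (0.90, 0)]


def preference_gaps(utility):
    """Return (EU(1A) - EU(1B), EU(2A) - EU(2B)) for a utility function."""
    gap_first = expected_utility(LOTTERY_1A, utility) - expected_utility(LOTTERY_1B, utility)
    gap_second = expected_utility(LOTTERY_2A, utility) - expected_utility(LOTTERY_2B, utility)
    return gap_first, gap_second


# Three arbitrary increasing utility functions, chosen only for illustration.
sample_utilities = {
    "linear": lambda x: x,
    "square root": sqrt,
    "logarithmic": log1p,
}

for name, u in sample_utilities.items():
    first, second = preference_gaps(u)
    print(f"{name:12s} gap in lottery 1: {first:+.4f}   gap in lottery 2: {second:+.4f}")
```

Under these assumed payoffs both gaps reduce to $0.11\,u(1)-0.10\,u(5)-0.01\,u(0)$, so an expected-utility maximizer who prefers A in one lottery must prefer A in the other as well. Choosing A in the first and B in the second is therefore inconsistent with every utility function, not just the three sampled here.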

Shortly after Allais published his findings, Daniel Ellsberg—who later released the Pentagon Papers and became an activist against the Vietnam War—proposed a series of problems intended to illustrate the difference between choices involving risk, where objective probabilities are known, and uncertainty, where they are not. Since then, Kahneman and Tversky have described numerous situations in which the psychology of chance conflicts with its logic. The authors suggest that training in decision theory might improve real-world outcomes, especially in medical decision-making.

The book’s final four chapters are at once more mathematically challenging and more philosophically probing than their predecessors. The final chapter on inference seems to summarize all that modern scholarship has added to Hume’s timeless classic on human understanding.

Ten Great Ideas about Chance is not a book to be read in bed at night. It should be attacked with paper and pencil at hand, and a determination to backtrack early and often. The extra effort will prove rewarding to almost any reader.

References
[1] Kolmogorov, A.N. (1950). Foundations of the Theory of Probability. New York, NY: Chelsea. (Original work published 1933).

James Case writes from Baltimore, Maryland.