# Quantitative Evidence Often a Tough Sell in Court

**BOOK REVIEW:** **Math on Trial.** *By Leila Schneps and Coralie Colmez, Basic Books, New York, 2013, 255 pages, $26.99.*

This book discusses in detail ten legal cases in which mathematical testimony was introduced in evidence. Each case is assigned a “Math Error Number” and given a modestly catchy title. Most of the cases are well known and have been written about repeatedly in law reviews and books, as the authors acknowledge in their Sources section. So the book is a revisiting of some legal chestnuts (plus some lesser-known cases) in which math appears to have played a role.

The authors have evidently done their homework, and they give us a raft of new details about the cases and their context. There is room here to discuss only three of the cases.

■ ■ ■

What appears to be the first case in which a mathematician testified to mathematical probability involved one Hetty Green, the niece of Sylvia Ann Howland, a wealthy unmarried woman living in New Bedford, Massachusetts. This case the authors call Math Error Number 9: Choosing a Wrong Model. When Sylvia died, in 1865, she left a will in which she bequeathed to Hetty a life interest in a trust of her large estate. But Hetty, dissatisfied with this generosity, produced a separate one-page document, purportedly signed by Sylvia, that gave her Sylvia’s estate outright and purported to invalidate any future wills to the contrary. Hetty claimed that Sylvia had signed the document immediately before she had signed an earlier will. The executor rejected the claim on the ground, among others, that Sylvia’s signature was a forgery, and the case became a cause célèbre.

In the lawsuit that followed, mathematics entered the fray when Benjamin Peirce, a professor of mathematics at Harvard, working with his son, Charles Sanders Peirce, later a famous logician and philosopher, testified to the probability that so many of the downstrokes in the disputed and authentic signatures of Sylvia would coincide. To do this, they took 42 admittedly genuine documents signed by Sylvia and computed the number of matching downstrokes in each of 861 pairs of signatures. From this they calculated that the probability of coincidence of a downstroke in a pair of signatures was about 1/5. Since all 30 downstrokes in the disputed and authentic signatures coincided, Peirce père, using a binomial model, concluded that the probability of 30 coinciding downstrokes if both signatures were authentic was \((1/5)^{30}\) = 1 in 2,666 millions of millions of millions. The calculation is not quite right, as the authors and others have pointed out, but the major fault lies in the inappropriateness of the binomial model, with its assumptions of independence between pairs of downstrokes and a constant \(p = 1/5\) across pairs of documents.
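The binomial model is easy to reproduce. The sketch below is a minimal illustration, not the Peirces' actual worksheet; it computes the all-match probability under the model and compares it with the figure reported in court:

```python
# A minimal sketch of Peirce's binomial model: each of the 30
# downstrokes in a pair of signatures is assumed to match
# independently with probability p = 1/5.
p = 1 / 5
n = 30

# Probability that all 30 downstrokes coincide under the model.
p_all_match = p ** n
print(f"1 in {1 / p_all_match:.3e}")  # about 1 in 9.313e+20

# The figure reported in court, "1 in 2,666 millions of millions of
# millions," is 1 in 2.666e+21 -- nearly three times larger, which
# illustrates the point that the calculation was "not quite right."
print(2.666e21 * p_all_match)
```

Of course, the arithmetic is the lesser problem; as noted above, the model's independence and constant-\(p\) assumptions are the major fault.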

An 1860s lawsuit centered on the claim of a forged signature. On the left: 10 of 42 samples of the signature measured by the mathematician Charles Sanders Peirce. On the right: the signature on the first page of the will and the two disputed signatures. From *Math on Trial*.

This much is well-trodden ground. What the book adds of interest is its discussion of the case’s rich context. However, I was troubled by the authors’ statement that followed their description of Benjamin Peirce’s testimony: “What he meant, *and what the jury took him to mean* . . . [italics added]” was that the chance of two signatures being identical was negligible. How could the authors know what the jury thought in this mid-19th-century case? And then I recollected that because the case was in equity—Hetty sought specific performance of the agreement recited in the one-page document—it would have been heard by a judge without a jury. If so, the authors made up that “fact,” pure and simple. They also concluded that the judge “simply opted to reject Hetty’s testimony altogether, and the case ended with a settlement that was essentially identical to Sylvia’s latest will.” This coda leaves the impression that the court unfairly brushed aside Hetty’s claim. But a federal statute of the time provided that, in actions by or against an executor, neither party was allowed to testify against the other as to any transactions with the deceased. Because Hetty was the sole witness to the purported signing of the agreement, the circuit judges correctly dismissed her case.

These errors and infelicities are evidence of some bias (and sloppiness) on the part of the authors: They wanted to portray the mathematics as potent and the parties in these cases as victims of the misuse of mathematical models. Some of them undoubtedly were. The case of Sally Clark (Math Error Number 1: Multiplying Non-independent Probabilities), tried in England for the murder of two of her babies, is a horrible example of a miscarriage of justice that makes painful reading. Even there, however, the role of mathematics may not have been as important as the authors suggest. The defense attributed both deaths to sudden infant death syndrome, or SIDS. In addition to adducing much medical testimony, the Crown called the eminent pediatrician Roy Meadow. He testified that the risk of one such death in a family like Sally Clark’s was 1 in 8543, and that of two such deaths 1 in 8543 × 8543 = 1 in 73 million. The authors omit mention of the rebuttal testimony and the judge’s comments on the evidence, stating simply that Meadow’s figure “was accepted without question by judge and jury.”

This is somewhat misleading. There was rebuttal testimony, by a Professor Berry, who challenged the 1-in-73-million figure on the ground that “familial factors” could lead to two SIDS deaths in a family. More significantly, the judge—in commenting on the evidence, as English judges are allowed to do—cautioned the jury not to put much reliance on the statistics: “We do not convict people in these courts on statistics.” He also noted, “If there is one SIDS death in a family, it does not mean there cannot be another one in the same family.” The jury evidently questioned the evidence: two members voted against conviction (a 10–2 majority verdict sufficed to convict). On the first appeal, the court dismissed objections to the 1-in-73-million figure, pointing out that the arguments against squaring were known to the jury and that the trial judge repeated them in his summing up. On the second appeal, the prosecution announced that it would no longer defend the verdict, citing newly discovered evidence that one of the babies had died of an infection, as well as misleading statistics. The appellate court then reversed the conviction on the ground of the newly discovered evidence and, in dictum, excoriated the statistics as “manifestly wrong” and “grossly misleading.”
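Professor Berry's objection, that familial factors break the independence behind Meadow's squaring, is easy to illustrate numerically. In the sketch below the elevated conditional risk is purely hypothetical, chosen only to show the direction of the error:

```python
# Meadow's calculation squares the single-death risk, which assumes
# the two deaths are independent.
p_first = 1 / 8543
print(round(1 / p_first ** 2))  # 72982849: Meadow's "1 in 73 million"

# If familial factors raise the risk of a second SIDS death (the
# 1/100 figure is hypothetical, not from the case), the joint
# probability is orders of magnitude larger than the square.
p_second_given_first = 1 / 100
p_both = p_first * p_second_given_first
print(round(1 / p_both))  # 854300: "1 in 854,300," not 1 in 73 million
```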

A third case is the redoubtable *People v. Collins*, decided by the California Supreme Court in 1968 (Math Error Number 2: Unjustified Estimates). This is a granddaddy of legal chestnuts; almost every law school evidence course includes it. The story is quickly told: An elderly woman, while walking in an alley, was assaulted from behind and robbed. A witness said that a Caucasian woman with dark-blond hair in a ponytail ran out of the alley and entered a yellow automobile driven by a black man with a mustache and a beard. A couple answering roughly to that description (there were some variances) was subsequently arrested and tried. At the trial the prosecutor called an instructor of mathematics from a nearby college to testify to the product rule of elementary probability. He then had the witness assume the individual probabilities of six relevant characteristics, which were evidently plucked by the prosecutor from thin air (e.g., interracial couple in car—1/1000; black man with beard—1/10; woman with blond hair—1/3), and then apply the product rule to multiply them together. The witness came up with 1/12,000,000 as the probability of a couple answering to that description. The jury convicted, but on appeal the Supreme Court of California reversed the conviction on the ground that there was no evidence to support the individual probabilities and that multiplying them was clearly improper, as the factors were not independent. The authors describe the mathematical exercise in greater detail than is usual and approve the Supreme Court’s rejection of the multiplication exercise because of the lack of foundation and the obvious lack of independence of the factors. Nothing new here.
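The multiplication itself is trivial to reproduce. The sketch below uses the six figures as commonly reported from the Collins opinion (the three not quoted above are the yellow car, 1/10; man with mustache, 1/4; and woman with ponytail, 1/10); it shows the product, though of course not the missing foundation:

```python
from fractions import Fraction
from math import prod

# The six figures assumed at trial, as commonly reported from the
# Collins opinion.  The product rule requires independence, which
# these overlapping traits plainly lack (a bearded man usually has
# a mustache).
figures = {
    "partly yellow automobile":  Fraction(1, 10),
    "man with mustache":         Fraction(1, 4),
    "woman with ponytail":       Fraction(1, 10),
    "woman with blond hair":     Fraction(1, 3),
    "black man with beard":      Fraction(1, 10),
    "interracial couple in car": Fraction(1, 1000),
}

print(prod(figures.values()))  # 1/12000000
```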

A deeper point is raised by an appendix to the Supreme Court’s opinion (evidently written by Harvard law professor Laurence Tribe, who was then a clerk to the chief judge). Using a mathematical model (which I have criticized elsewhere), the appendix purports to show that if the rate of such couples was indeed 1/12,000,000 in some population, there was a 40% chance there would be two such couples. From this, the appendix concludes that “the prosecutor’s computations . . . imply a very substantial likelihood that a couple other than the Collinses was the one at the scene of the robbery.”
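The appendix's 40% figure can be reproduced under one common reconstruction of its model (an assumption on my part, not stated above): N couples, each matching the description independently with probability 1/N, conditioned on at least one match existing:

```python
from math import exp

# With N couples each matching at probability p = 1/N, the number of
# matching couples is approximately Poisson with mean N * p = 1.
lam = 1.0
p_at_least_one = 1 - exp(-lam)
p_at_least_two = 1 - exp(-lam) - lam * exp(-lam)

# Given that at least one such couple exists, the chance that there
# is more than one:
p_dup = p_at_least_two / p_at_least_one
print(round(p_dup, 2))  # 0.42, close to the appendix's ~40%
```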

The authors call the arguments in the appendix “flawless and convincing.” I do not agree. First, the rate of such couples in the population, by itself, does not give us probabilities of guilt or innocence, but only a likelihood ratio for the evidence, which is not the same thing. Second, the Bayesian case for guilt is overwhelming, and it is surprising that the authors, who belong to the Bayes in Law Research Foundation, did not at least allude to it. If there were two such couples, the likelihood ratio for this evidence would be on the order of 12,000,000 (assuming a probability of 1 that the witnesses would have so described the Collinses if they were guilty). With that enormous figure (or any figure in the millions), it would take only a wisp of other evidence, when combined with the likelihood ratio, to make the posterior odds of guilt overwhelming. And there was more than a wisp; indeed, the other evidence was rather significant (although the authors labor to disparage it). There is an important Bayesian teaching here that the appendix, the authors, and others have failed to recognize: A tell-tale trace doesn’t have to identify a suspect uniquely to make a powerful case when combined with other evidence.
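The Bayesian point can be made concrete with illustrative numbers (the prior below is hypothetical, not from the case). In odds form, Bayes' rule says posterior odds = prior odds × likelihood ratio, so even a faint prior becomes overwhelming when multiplied by a ratio in the millions:

```python
# Hypothetical prior odds from a "wisp" of other evidence: one in a
# million.  The likelihood ratio is the order-of-magnitude figure
# for the description evidence discussed above.
prior_odds = 1 / 1_000_000
likelihood_ratio = 12_000_000

# Bayes' rule in odds form: posterior odds = prior odds * LR.
posterior_odds = prior_odds * likelihood_ratio   # 12-to-1 in favor of guilt
posterior_prob = posterior_odds / (1 + posterior_odds)
print(round(posterior_prob, 3))  # 0.923
```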

■ ■ ■

In these and other cases, the authors paint mathematical evidence as extremely powerful. But in all these cases, other evidence made the role mathematics actually played in the decisions unclear. On that subject I must report my experience with a large group of moot court cases in which mathematics was the sole evidence. For many years, at the end of my course Statistics for Lawyers, at Columbia Law School (and other law schools), we have put on a moot court trial in which the students are divided into two teams and given the facts of a case and a data set. Each side also gets a professor of statistics as an expert witness. The students’ job is to prepare their witness for direct testimony, prepare to cross-examine the opposing witness, and give opening and closing arguments to the jury. The data set is designed to support a finding for the plaintiff, although with some weaknesses. Over many years of these trials, with different data sets, different expert witnesses, as well as generations of students and juries, a common theme has emerged: *The proponent of statistics almost always loses.* On questioning, jurors say in various ways that they didn’t believe the statistics. Evidently, quantitative evidence (particularly if presented in complex statistical models) is not an easy sell to legal decision makers and may have been less influential in actual cases than the authors suggest in this book.

Applied mathematicians who read the book should not take away the impression that the legal chestnuts discussed fairly represent the role of mathematics in legal proceedings today. Much has changed in the law beyond the advent of DNA evidence (which the book does discuss). Well-qualified statisticians and economists from academia and consulting organizations now routinely testify to multiple regression models in a variety of disputes, including those involving employment discrimination, voting, the death penalty, and antitrust. Binomial models are used in jury discrimination challenges. The U.S. Supreme Court has approved a number of such models and in a landmark opinion has required the district courts to determine whether proposed scientific testimony is sufficiently valid and reliable to be heard in evidence. To help the judges, the Federal Judicial Center produced the fat *Reference Manual on Scientific Evidence*, which has proved to be a best seller and is now in its third edition. Law reviews devoted to quantitative studies of the legal system have made their appearance, and there are textbooks on law and statistics (I am the co-author of one). This is a growing field, with variety and interest extending far beyond the world of the ten cases discussed in *Math on Trial*.