April 01, 2020

The Mathematical Fight for Voting Rights

State and local governments will redraw voting districts based on new information following completion of the 2020 U.S. Census. Ideally, this process ensures fair representation. In practice, however, districting often involves gerrymandering: the deliberate planning of districts to dilute the voting power of certain groups in favor of others, which violates the law.

Racial gerrymandering—drawing districts to limit the power of voters of color to select candidates they favor—is a particularly pernicious problem. Section 2 of the Voting Rights Act (VRA) of 1965 specifically prohibits this practice, but that has not stopped authorities from doing it anyway. “A number of court decisions have purposefully asked mathematicians, political scientists, and statisticians to use specific methods to try and understand racial gerrymandering,” Matt Barreto, a professor of political science and Chicana/o studies at the University of California, Los Angeles, said.

Barreto and his colleagues employ powerful statistical methods and draw on census and other public data to identify gerrymandered districts. Utilizing these tools, mathematicians can test proposed district maps or draw their own, designing them from the ground up to prevent voter dilution.

Since gerrymanderers use the same data to intentionally disenfranchise voters, the question is whether mathematical approaches alone are enough to fight the problem. Just as machine learning algorithms can “learn” racism from their training data, studies show that the results of algorithmic districting can be as bad as deliberate gerrymandering [2]. To put it another way, can math solve problems it did not create?

“Previous efforts that used mathematics were not as accurate, and they did whitewash over some of the black and brown voters living in communities,” Barreto said. “By going that extra step and purposefully trying to bring in accurate data on racial and ethnic minorities, we can go back to our trusted mathematical and statistical methods to make sure we’re getting accurate counts of people.”

Racial Polarization, Racial Gerrymandering

Figure 1. Cartoonist Elkanah Tisdale’s 1812 depiction of Massachusetts Governor Elbridge Gerry’s partisan gerrymandering in favor of the Democratic-Republican Party. Public domain image.

In 1812, cartoonist Elkanah Tisdale noticed that one of the districts created under Massachusetts Governor Elbridge Gerry looked like the mythical fire-monster salamander, so he dubbed it the “Gerry-Mander” (that arguably makes “gerrymandering” the most important legal term ever coined in a cartoon, which pleases me as a frequent comics writer). This original gerrymander is a prime example of partisan gerrymandering because it was created to favor the Democratic-Republican Party over the Federalists (see Figure 1).

Racial gerrymandering has garnered less attention than its partisan counterpart, though the two often go hand in hand. However, racial gerrymandering also happens in effective one-party regions, such as cities where the Democratic Party dominates local politics. In practice, testing for unethical districting involves looking for racially polarized voting patterns — places in which minority voters strongly prefer one candidate over another, but districts are drawn to favor white voter preferences. Chicago—with a history of just two elected African American mayors despite its large black population—is a classic example of this form of gerrymandering.

Consider an imaginary mayoral election with two candidates: Smith, who is preferred by white/Anglo voters, and Herrera, who is preferred by Latinx voters. The city is divided in such a way that Latinx voters never amount to more than 40 percent of the total population in any district, while white voters never comprise fewer than 50 percent, regardless of the city’s total racial and ethnic makeup. Racial gerrymandering ensures that Smith always wins over Herrera and Latinx preferences are never represented, which is a violation of the VRA. Perhaps the districting scheme splits apart Latinx-majority neighborhoods and lumps the fragments with white-majority areas; a more equitable and representative division would keep those neighborhoods whole, possibly even allowing for Latinx-plurality districts.
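To make the arithmetic of this hypothetical concrete, here is a minimal sketch with invented numbers: a city of five equal districts in which 38 percent of voters are Latinx. The function names and every figure are illustrative assumptions, not data from any real city. The sketch only shows that the same population can produce zero or two Latinx-majority districts, depending on where the lines fall.

```python
def latinx_shares(latinx_by_district, district_size):
    """Return the Latinx share of each equally sized district."""
    return [n / district_size for n in latinx_by_district]


def count_majority(shares, threshold=0.5):
    """Count districts whose share exceeds the threshold."""
    return sum(share > threshold for share in shares)


DISTRICT_SIZE = 1000  # hypothetical: five districts of 1,000 voters each

# Hypothetical "cracked" map: Latinx neighborhoods split across districts.
cracked = [400, 380, 380, 370, 370]
# Hypothetical alternative map: the same 1,900 voters, neighborhoods kept whole.
whole = [700, 650, 200, 200, 150]

assert sum(cracked) == sum(whole)  # same citywide population in both maps

cracked_shares = latinx_shares(cracked, DISTRICT_SIZE)
whole_shares = latinx_shares(whole, DISTRICT_SIZE)

print(max(cracked_shares))             # largest Latinx share under the cracked map
print(count_majority(cracked_shares))  # Latinx-majority districts, cracked map
print(count_majority(whole_shares))    # Latinx-majority districts, alternative map
```

Under these invented numbers, the cracked map caps the Latinx share at 40 percent everywhere, while the alternative map contains two Latinx-majority districts.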

The challenge for mathematicians involves reconstructing racial voting patterns without violating voter privacy, which is protected by law. Barreto and his collaborators use ecological inference (EI), a technique that infers individual behaviors from population-level datasets. Their EI methods involve an iterative Bayesian approach, utilizing publicly available data from petitions, voter records (which merely tabulate if a registered voter casts a ballot), and the census.

The census is the only public record that regularly includes racial information. However, it is only updated every 10 years, and citizens may relocate during that period and vote in more than one election in a given year. To infer the race of voters based on registration information, Barreto’s group employs a method called Bayesian Improved Surname Geocoding (BISG). This technique uses geographic information to assign a probability that a given surname belongs to one of the major racial/ethnic groups in America—white, black, Asian, Latinx, or other. For instance, my surname “Francis” is more likely to be shared by white people in Iowa but probably belongs to African Americans in New Orleans.
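The core of a surname-plus-geography estimate is a single application of Bayes' rule. The sketch below is an illustration under stated assumptions, not the implementation Barreto's group uses: it treats surname and location as conditionally independent given race, and every probability in it is invented for demonstration. The names `bisg_posterior`, `surname_prior`, `tract_a`, and `tract_b` are hypothetical.

```python
RACES = ["white", "black", "asian", "latinx", "other"]


def bisg_posterior(p_race_given_surname, p_geo_given_race):
    """Combine a surname prior with geography via Bayes' rule.

    posterior(r) is proportional to P(r | surname) * P(geography | r).
    """
    unnormalized = {
        r: p_race_given_surname[r] * p_geo_given_race[r] for r in RACES
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("surname and geography priors share no support")
    return {r: value / total for r, value in unnormalized.items()}


# Invented prior for one surname: P(race | surname).
surname_prior = {
    "white": 0.60, "black": 0.30, "asian": 0.02, "latinx": 0.05, "other": 0.03,
}

# Invented geography terms: P(living in this tract | race).
tract_a = {"white": 2e-5, "black": 1e-6, "asian": 1e-6, "latinx": 1e-6, "other": 1e-6}
tract_b = {"white": 1e-6, "black": 2e-5, "asian": 1e-6, "latinx": 1e-6, "other": 1e-6}

posterior_a = bisg_posterior(surname_prior, tract_a)
posterior_b = bisg_posterior(surname_prior, tract_b)

print(max(posterior_a, key=posterior_a.get))  # most probable group in tract A
print(max(posterior_b, key=posterior_b.get))  # most probable group in tract B
```

With these made-up inputs, the same surname is most likely white in one tract and most likely black in the other. That is the effect the geographic term is meant to capture.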

Barreto and his colleagues tested the BISG method using a dataset wherein people self-identified their race. By iteratively improving their Bayesian priors, their model now identifies the race of a particular voter with between 93 and 97 percent accuracy.

In the simplest case, like Herrera v. Smith, a district has two candidates and two distinct racial/ethnic groups (Latinx and Anglo/white). For every precinct \(i\) in the district, one must estimate the fraction of each group \((\beta_L^i, \beta_W^i)\) that voted for Herrera. The known quantities are the fraction of voters who cast a vote for Herrera \((T_H^i)\) and the Latinx fraction of total voters who participated in this election \((X_L^i)\), which is estimated using BISG and generally assumed to be independent of the \(\beta\) parameters. Because these quantities are all fractions, the complementary value for white participation is \(X_W^i=1-X_L^i\) and the vote fraction for Smith is \(T_S^i=1-T_H^i\).

Unfortunately, even the simplest system does not allow exact solutions, so Harvard University political scientist Gary King and his colleagues proposed the use of tomography graphs by analogy with medical imaging procedures, where one must infer three-dimensional structures from X-rays that pass through the human body [3]. Each precinct is represented by a line that accounts for all possible \((\beta_L^i, \beta_W^i)\) parameter values (as given by the linear equation), with the slope and intercept involving known quantities:

\[ \beta_W^i=\Big(\frac{T_H^i}{1-X_L^i}\Big)- \Big(\frac{X_L^i}{1-X_L^i}\Big)\beta_L^i. \]
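This line is a rearrangement of the accounting identity that every Herrera vote comes from either a Latinx or a white voter:

\[ T_H^i = \beta_L^i X_L^i + \beta_W^i (1 - X_L^i). \]

Solving for \(\beta_W^i\) gives the equation above. Because both \(\beta\) values must lie in \([0,1]\), each line is clipped to the unit square, which already restricts the possible values of \(\beta_L^i\). The following sketch computes the line and those deterministic bounds for a single precinct. The function names and the example precinct are hypothetical.

```python
def tomography_line(t_h, x_l):
    """Return (intercept, slope) of beta_W as a linear function of beta_L."""
    if not 0 < x_l < 1:
        raise ValueError("x_l must lie strictly between 0 and 1")
    x_w = 1 - x_l
    return t_h / x_w, -x_l / x_w


def beta_l_bounds(t_h, x_l):
    """Range of beta_L for which beta_W also stays inside [0, 1]."""
    if not 0 < x_l < 1:
        raise ValueError("x_l must lie strictly between 0 and 1")
    x_w = 1 - x_l
    lower = max(0.0, (t_h - x_w) / x_l)  # beta_W cannot exceed 1
    upper = min(1.0, t_h / x_l)          # beta_W cannot fall below 0
    return lower, upper


# Hypothetical precinct: 30 percent of voters are Latinx, Herrera wins 20 percent.
intercept, slope = tomography_line(t_h=0.2, x_l=0.3)
low, high = beta_l_bounds(t_h=0.2, x_l=0.3)

print(intercept, slope)  # intercept and slope of the precinct's line
print(low, high)         # admissible range for beta_L
```

In this invented precinct, \(\beta_L^i\) can be anywhere from 0 to about two-thirds, so a single precinct says little on its own. The information comes from combining many lines.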

If the data are clear-cut, the lines on the tomography graph will intersect in a well-defined region (see Figure 2). In this case, a bivariate normal distribution (restricted to \(\beta_R^i \in [0,1]\) for \(R \in \{L,W\}\)) yields the likelihood function of the best aggregate values for \((\beta_L, \beta_W)\). In contrast, less well-defined data require more complicated analyses.
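As a rough stand-in for "finding where the lines intersect," one can fit a single pair \((\beta_L, \beta_W)\) to all precincts by least squares. This is Goodman-style ecological regression, not the truncated bivariate normal likelihood described above. It assumes the \(\beta\) values are constant across precincts and can return estimates outside \([0,1]\), which the full EI model is designed to avoid. The precinct data below are synthetic and were generated from assumed true values of 0.8 and 0.3.

```python
def least_squares_betas(x_l, t_h):
    """Fit constant (beta_L, beta_W) to T = beta_L * X + beta_W * (1 - X)."""
    x_w = [1 - x for x in x_l]
    # Entries of the 2x2 normal equations.
    s_ll = sum(x * x for x in x_l)
    s_lw = sum(x * w for x, w in zip(x_l, x_w))
    s_ww = sum(w * w for w in x_w)
    s_lt = sum(x * t for x, t in zip(x_l, t_h))
    s_wt = sum(w * t for w, t in zip(x_w, t_h))
    det = s_ll * s_ww - s_lw * s_lw
    if abs(det) < 1e-12:
        raise ValueError("precincts need differing Latinx fractions")
    # Solve the 2x2 system with Cramer's rule.
    beta_l = (s_lt * s_ww - s_lw * s_wt) / det
    beta_w = (s_ll * s_wt - s_lw * s_lt) / det
    return beta_l, beta_w


# Synthetic precincts built from beta_L = 0.8 and beta_W = 0.3 (assumed values).
x_l = [0.1, 0.3, 0.5, 0.7, 0.9]
t_h = [0.8 * x + 0.3 * (1 - x) for x in x_l]

beta_l, beta_w = least_squares_betas(x_l, t_h)
print(round(beta_l, 3), round(beta_w, 3))  # recovers the assumed values
```

The fit needs precincts with different Latinx fractions. If every precinct had the same \(X_L^i\), all the lines would be parallel and no intersection would exist.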

Figure 2. Each line on these tomography graphs represents all possible fractions of black and white voters \((\beta_B, \beta_W)\) who voted for a particular candidate in each precinct. The region where these lines intersect indicates these parameters’ “true” values for the entire district. The likelihood function for the parameters is sharply peaked where the overlap region is small, as in this example. Figure adapted from [1].

While this two-candidate, two-race EI model is adequate for some parts of the country, many districts necessitate extended forms of the model. One extension is iterative: separating one racial/ethnic group or candidate at a time and comparing it to the others in aggregate, repeating this process until all groups have been analyzed. Another expansion is the \(R \times C\) model, which combines all parameters into a matrix \(\beta_{RC}^i\), with row \(R\) tabulating race/ethnicity and column \(C\) tabulating candidate. Barreto and his collaborators developed eiCompare, a freely available package for the R statistical programming language, to simultaneously calculate the different models’ parameters, compare their outcomes, and provide the best possible EI estimates in real-world elections.
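The bookkeeping behind the iterative extension can be sketched as a loop that reduces a multi-group, multi-candidate election to a series of two-by-two problems. This is an illustration of the idea only and does not reproduce eiCompare's interface. The function name `iterative_pairs`, the group and candidate labels, and all fractions are hypothetical.

```python
def iterative_pairs(group_fractions, candidate_fractions):
    """Yield (group, candidate, x, t) inputs for a sequence of 2x2 problems.

    group_fractions:     {group: [fraction of voters in each precinct]}
    candidate_fractions: {candidate: [vote fraction in each precinct]}
    Each yielded problem treats every other group and candidate in aggregate,
    so the complements are simply 1 - x and 1 - t.
    """
    for group, x in group_fractions.items():
        for candidate, t in candidate_fractions.items():
            if len(x) != len(t):
                raise ValueError("precinct counts differ")
            yield group, candidate, list(x), list(t)


# Invented data for two precincts, three groups, and three candidates.
groups = {
    "latinx": [0.5, 0.2],
    "black": [0.3, 0.1],
    "white": [0.2, 0.7],
}
candidates = {
    "A": [0.6, 0.2],
    "B": [0.3, 0.3],
    "C": [0.1, 0.5],
}

problems = list(iterative_pairs(groups, candidates))
print(len(problems))  # one 2x2 problem per (group, candidate) pair
print(problems[0])    # inputs for the first pair
```

Each tuple can then be passed to any two-group estimator, which is what distinguishes the iterative approach from the \(R \times C\) model's joint estimate of the whole matrix.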

“We’re not trying to prove that there’s always racially polarized voting,” Barreto said. “In some communities there is not, and the data will show us that.”

Accounting for Fairness

The U.S. Supreme Court laid out three criteria for demonstrating racial gerrymandering in its 1986 decision in Thornburg v. Gingles, including rules for legally proving racial polarization [4]. These “Gingles prongs” are as follows:

1. A minority racial/ethnic group is large enough to be a majority in a district.
2. This group votes in cohesive ways, tending to have preferred candidates as a bloc.
3. The white-preferred candidates are almost always able to defeat the minority-preferred candidates despite the first two criteria.

If all three conditions hold, then racial gerrymandering is present.

Any redesigned district must therefore account for these conditions to comply with the VRA. To ensure fairness, the court also instructed legislators to consult professional mathematicians and statisticians.

Tufts University mathematician Moon Duchin and her colleagues pair analysis techniques like EI with high-level geometric methods to identify where communities or individual neighborhoods define voting blocs, and generate alternative maps to eliminate gerrymandering. Duchin founded the Metric Geometry and Gerrymandering Group, which provides publicly available tools to help identify better ways of creating districts. One such tool is Districtr, an interactive online JavaScript program for drawing state-level congressional districts.

But racial gerrymandering is not the only problem that voters of color face. Polling station closures in minority-majority districts, poor polling locations (which are often exacerbated by district shape), arbitrary removal of registered voters, and inclusion of prisons comprise other issues that disproportionately affect minority voters. For instance, Duchin’s group was actively involved in the referendum when residents of Lowell, Mass., changed their polling system to ranked-choice voting. This shift provided a parallel way to identify community issues and racial polarization.

Barreto, Duchin, and like-minded researchers also use mathematical methods to break down voting patterns beyond the stereotypical white/African American dichotomy that often dominates national discourse. “There are racial power dynamics inherent in these political systems, which are also sometimes inherent in social sciences and even in math,” Barreto said. “We need to make sure that there is a perspective of black and brown scholars who are also very sophisticated statisticians that care about social policy.”

Matt Barreto and Moon Duchin presented their work during a session entitled “Gerrymandering and Mathematics: Redistricting the Nation” at the 2020 American Association for the Advancement of Science Annual Meeting, which took place this February in Seattle, Wash.

References
[1] Barreto, M., Collingwood, L., Garcia-Rios, S., & Oskooii, K. (2019). Estimating candidate support: Comparing iterative EI and EI-RC methods. Sociolog. Meth. Res., 48(4), 1-32.
[2] Hill, T.P. (2018). Slicing sandwiches, states, and solar systems. Amer. Sci., 106, 42.
[3] King, G., Rosen, O., & Tanner, M. (2004). Information in ecological inference: An introduction. In G. King, O. Rosen, & M. Tanner (Eds.), Ecological inference: New methodological strategies. New York, NY: Cambridge University Press.
[4] Thornburg v. Gingles, 478 U.S. 30 (1986).

Matthew R. Francis is a physicist, science writer, public speaker, educator, and frequent wearer of jaunty hats. His website is BowlerHatScience.org.