July 02, 2018

Detecting Gerrymandering with Mathematics

Earlier this year, federal judges struck down North Carolina’s congressional map as an unconstitutional partisan gerrymander. A few weeks later, Pennsylvania’s district maps met the same fate on similar grounds. While the Supreme Court has let the ruling against the Pennsylvania maps stand, it recently sidestepped a decision on partisan gerrymandering in Wisconsin and Maryland, leaving those maps in place for the upcoming fall elections.

Gerrymandering comes into play every ten years after completion of the census. The political party in power in state legislatures uses census information to alter congressional districts in its favor via a process called redistricting. Such fudging of maps has occurred since 1812, and has been the target of numerous lawsuits. Although the Supreme Court has ruled racial gerrymandering unconstitutional, it has so far declined to overturn gerrymandering on partisan grounds.

Justiciable Standard to Curb Gerrymandering

Partisan gerrymandering involves packing vast swathes of the opponent’s supporters into fewer districts, or cracking areas of opposition majorities across many districts — thereby diluting the majority. These actions reap benefits over several elections. While the Supreme Court’s recent ruling declared extreme partisan gerrymandering unconstitutional, a judicially manageable standard measuring the “extremeness” of a given map is still lacking.

“The Supreme Court signed up for mathematics by ruling that a partisan gerrymander is unconstitutional if it is extreme,” Eric Lander, founding director of the Eli and Edythe L. Broad Institute of MIT and Harvard, said. “There’s a constitutional right to recognizing what is too far — and that is mathematical.” Lander wrote a court document last summer supporting the use of a statistical outlier standard. Jonathan Mattingly, professor of mathematics at Duke University, served as a consultant to the document. Mattingly has spent five years mathematically dissecting the structure of a typical redistricting to identify gerrymandering.

His interest was inspired by the 2012 elections for North Carolina’s seats in the U.S. House of Representatives. “Republicans won the majority with nine out of 13 seats,” Mattingly said. “I was at a meeting where someone said that Democrats won the majority of the votes. That was shocking, since they should have had at least seven of the 13 seats.” He successfully testified about these numbers in October 2017 during Common Cause v. Rucho, the North Carolina partisan gerrymandering case.

Mattingly began to ponder the significance of seven as the magic number. “Maybe it’s not fair that Republicans won nine seats, but it could be seven or eight,” he said, highlighting the difficulty of discerning whether the number of seats won by any party is fair, given an election outcome. He also investigated the number of Democrats or Republicans that a district should have when affiliated with a particular party. Essentially, how much is too much?

Evaluating Partisan Gerrymandered Maps

Along with his postdoctoral fellow Gregory Herschlag and a team of students, Mattingly employs sampling methods to characterize the entire population of admissible redistricting plans. They accomplish this by sampling from a probability measure placed on compliant redistricting plans. Mattingly’s goal is to characterize the level of gerrymandering in a district plan by identifying ways in which the plan deviates from what is typical. The sampled plans also let the team estimate how a redistricting’s characteristics are distributed across the population and label outliers.

Districts are required to comply with certain federal and state criteria in order to be viable. To construct his model, Mattingly considers the district standards proposed by North Carolina legislation. The first of these is compactness, which enables the use of geometry to quantify a district’s shape. Mattingly defines compactness with the isoperimetric score (popular in legal literature), the ratio of the square of a district’s perimeter to its area; lower values indicate more compact districts. Compared to other measures, the isoperimetric score is less forgiving of undulating district boundaries.

Since North Carolina has 13 districts, Mattingly’s model defines the score as

\[J_I(\xi) = \sum_{i=1}^{13} \frac{\big[\mathrm{boundary}\big(\partial D_i (\xi)\big)\big]^2}{\mathrm{area}\big(D_i (\xi)\big)}, \tag{1} \]

where \(D_i (\xi)\) is the \(i\)th district and \(\partial D_i (\xi)\) denotes the corresponding boundary. The function \(\xi : V \rightarrow \{1, 2, \ldots, 13\}\) represents the redistricting plan, assigning each element of \(V\) to one of the 13 districts.
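To make the compactness score concrete, here is a minimal Python sketch that evaluates the sum in \((1)\) for districts represented as simple polygons. The polygon representation and the function names (`polygon_area`, `polygon_perimeter`, `isoperimetric_score`) are illustrative assumptions, not the research group's actual implementation, which works with real geographic data.

```python
import math

def polygon_area(vertices):
    """Area of a simple polygon via the shoelace formula."""
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0

def polygon_perimeter(vertices):
    """Total length of the polygon's boundary."""
    n = len(vertices)
    return sum(math.dist(vertices[i], vertices[(i + 1) % n]) for i in range(n))

def isoperimetric_score(districts):
    """Sum over districts of perimeter squared divided by area (lower is more compact)."""
    return sum(polygon_perimeter(d) ** 2 / polygon_area(d) for d in districts)

# Two hypothetical districts with the same area but very different shapes.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
sliver = [(0, 0), (4, 0), (4, 0.25), (0, 0.25)]

print(isoperimetric_score([square]))  # 16.0
print(isoperimetric_score([sliver]))  # 72.25
```

Both shapes enclose one unit of area, yet the elongated sliver scores more than four times worse than the square, which is the behavior the compactness criterion is meant to penalize.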

The second criterion ensures that the state population is evenly distributed across districts, as mandated by legislation. One defines it as

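\[J_p(\xi) = \sqrt{\sum_{i=1}^{13} \left(\frac{\mathrm{pop}(D_i(\xi))}{\mathrm{pop}_{\mathrm{ideal}}} - 1\right)^2}, \tag{2} \]

where the ideal district population is \(\mathrm{pop}_{\mathrm{ideal}} = \frac{N_{\mathrm{pop}}}{13}\), with \(N_{\mathrm{pop}}\) denoting the total state population.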
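The population score in \((2)\) translates almost directly into code. The following sketch assumes the district populations are already tallied in a list; the function name `population_score` and the sample numbers are hypothetical.

```python
import math

def population_score(district_pops):
    """Root of the summed squared relative deviations from the ideal population."""
    ideal = sum(district_pops) / len(district_pops)
    return math.sqrt(sum((p / ideal - 1.0) ** 2 for p in district_pops))

# Perfectly balanced districts incur no penalty.
print(population_score([100] * 13))  # 0.0

# A 10 percent imbalance between two districts yields a positive score.
print(population_score([90, 110]))
```

The score is zero only when every district holds exactly the ideal population, and it grows as districts drift away from that target.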

The third stipulation ensures minimal splitting of counties across districts to maintain communities of interest. A single county becomes a split county if it is broken into two districts. “We want to penalize whenever you split the county,” Mattingly said. “In North Carolina, the Wake and Mecklenburg counties are split where Raleigh and Charlotte are respectively located. Both counties have too many people for one congressional district. The score penalizes whenever the county is further split, and we wanted to use the score to limit it to two splits utmost — hence the soft penalization.” The metric Mattingly thus described is called the county score function, and is given by

\[\begin{aligned} J_c(\xi) = {} & \{\#\text{ of counties split between two districts}\} \cdot W_2(\xi) \\ & + M_c \cdot \{\#\text{ of counties split between three or more districts}\} \cdot W_3(\xi), \end{aligned}\]

with the weight functions \(W_2(\xi)\) and \(W_3(\xi)\) defined as

\[W_2(\xi) = \sum \big(\text{fraction of county VTDs in second-largest intersection of district with county}\big)^{1/2}\]

\[W_3(\xi) = \sum \big(\text{fraction of county VTDs not in first- or second-largest intersection of district with county}\big)^{1/2}. \tag{3}\]

\(W_2(\xi)\) and \(W_3(\xi)\) are summed over counties split between two districts and counties split among three or more districts, respectively; VTDs are voting tabulation districts. But what does “second-largest intersection of district with county” entail? “Splitting the county into two uneven chunks of one large and one small, such as 90-10, is better than 50-50,” Mattingly said. In the case of a 90-10 split, “10” is used. When a county is split among three or more districts, the large constant \(M_c\) imposes a heavy penalty.
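The county score is easier to follow with a small example. The sketch below is one reading of equation \((3)\), assuming each county is given as a list with one district label per VTD. The function name `county_score`, the data layout, and the default value of `m_c` are all hypothetical choices made for illustration.

```python
import math
from collections import Counter

def county_score(county_assignments, m_c=100.0):
    """county_assignments maps a county name to a list of district ids, one per VTD.

    m_c is the large penalty constant; its value here is an arbitrary assumption.
    """
    n_two = n_three = 0   # counts of counties split two ways / three or more ways
    w2 = w3 = 0.0         # the weight sums W_2 and W_3
    for districts in county_assignments.values():
        counts = sorted(Counter(districts).values(), reverse=True)
        total = sum(counts)
        if len(counts) == 2:
            n_two += 1
            w2 += math.sqrt(counts[1] / total)        # second-largest share
        elif len(counts) >= 3:
            n_three += 1
            w3 += math.sqrt(sum(counts[2:]) / total)  # share outside the top two
    return n_two * w2 + m_c * n_three * w3

split_90_10 = {"A": [1] * 9 + [2] * 1}
split_50_50 = {"A": [1] * 5 + [2] * 5}
split_three = {"A": [1] * 5 + [2] * 3 + [3] * 2}

print(county_score(split_90_10) < county_score(split_50_50))  # True
print(county_score(split_three) > county_score(split_50_50))  # True
```

Under this reading, a 90-10 split is penalized less than a 50-50 split, and a three-way split is penalized far more than either, matching the behavior Mattingly describes.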

The Voting Rights Act (VRA) of 1965, which ensures that minorities elect a fair number of representatives that accurately mirrors their population, is the final criterion. African Americans make up 20 percent of North Carolina’s population. Thus, the 2016 interpretation of the VRA stipulation warrants that they elect leaders from at least two districts. The corresponding score is defined by

\[J_m(\xi) = \sqrt{H(44.48\% - m_1)} + \sqrt{H(36.2\% - m_2)}, \tag{4}\]

with \(m_1\) and \(m_2\) representing the percentage of African Americans in the districts with the highest and second-highest percentage of the community under the plan \(\xi\). The thresholds of 44.48 and 36.2 percent are the corresponding values in the 2016 North Carolina redistricting plan. \(H\) is defined as \(H(x) = 0\) for \(x \le 0\) and \(H(x) = x\) for \(x > 0\). If \(m_1\) or \(m_2\) falls below its threshold, a positive value for \(J_m(\xi)\) results, thus converting the score into a penalty.
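A short sketch shows how the score in \((4)\) behaves. It assumes the African American share of each district is available as a fraction between 0 and 1 (so the thresholds become 0.4448 and 0.362); the names `ramp` and `minority_score` are hypothetical.

```python
import math

def ramp(x):
    """The function H: zero for non-positive input, identity otherwise."""
    return x if x > 0 else 0.0

def minority_score(district_fractions, targets=(0.4448, 0.362)):
    """Penalty when the two highest minority shares fall short of the targets."""
    top_two = sorted(district_fractions, reverse=True)[:2]
    return sum(math.sqrt(ramp(t - m)) for t, m in zip(targets, top_two))

# Both thresholds met: no penalty.
print(minority_score([0.45, 0.37, 0.10]))  # 0.0

# Highest share is one percentage point short of 44.48 percent: positive penalty.
print(minority_score([0.4348, 0.40, 0.10]))
```

Because of the square root, even a small shortfall produces a noticeable penalty, while any plan that meets both thresholds scores exactly zero.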

Mattingly calls these mathematical models of conditions “soft versions of the constraints,” referring to smoothing terms, such as the county-splitting weights \(W_2\) and \(W_3\) in \((3)\) and the square root function in \((4)\), that avoid discrete jumps and instead provide a smooth (continuous) ramping of values.

The researchers combine these subscores into a single score function as a weighted sum:

\[J(\xi) = w_pJ_p(\xi) + w_IJ_I(\xi) + w_cJ_c(\xi) + w_mJ_m(\xi).\]

The weights \(w_p\), \(w_I\), \(w_c\), and \(w_m\) are all positive constants.

The score function defines a probability distribution over redistricting plans, \(P_\beta (\xi) = \frac{e^{-\beta J(\xi)}}{Z_\beta}\), where \(Z_\beta\) is the normalizing constant. The parameter \(\beta > 0\) is characterized as the “inverse temperature,” analogous to the constant in the exponential (Boltzmann) distribution of thermodynamics; this is a standard technique in Bayesian sampling. Thermodynamically speaking, low “energy,” represented by \(\beta J(\xi)\), implies higher probability \(P_\beta (\xi)\). Because exploring the entire state space of the gerrymandering model comes at a large computational cost, Mattingly uses a Metropolis-Hastings algorithm, a Markov chain Monte Carlo method, to produce a set of random samples from the distribution.
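The sampling step can be illustrated on a toy problem. The sketch below is not the group's sampler: it assumes a tiny state of eight population units and two districts, uses only a population-balance score, ignores contiguity and every other criterion, and proposes moves by relabeling one randomly chosen unit. All names (`metropolis_sample`, `toy_score`, `toy_propose`, `POPS`) and parameter values are hypothetical. Because the proposal is symmetric, the Hastings correction factor equals one and the acceptance rule reduces to the Metropolis form.

```python
import math
import random

def metropolis_sample(score, initial, propose, beta, n_steps, rng):
    """Draw n_steps correlated samples from P(x) proportional to exp(-beta * score(x))."""
    state, energy = initial, score(initial)
    samples = []
    for _ in range(n_steps):
        candidate = propose(state, rng)
        delta = score(candidate) - energy
        # Always accept improvements; accept worse plans with prob exp(-beta * delta).
        if delta <= 0 or rng.random() < math.exp(-beta * delta):
            state, energy = candidate, energy + delta
        samples.append(state)
    return samples

POPS = [3, 1, 4, 1, 5, 9, 2, 6]   # hypothetical unit populations
N_DISTRICTS = 2

def toy_score(plan):
    """Population-balance score of a plan (tuple of district ids, one per unit)."""
    totals = [0] * N_DISTRICTS
    for pop, district in zip(POPS, plan):
        totals[district] += pop
    ideal = sum(POPS) / N_DISTRICTS
    return math.sqrt(sum((t / ideal - 1.0) ** 2 for t in totals))

def toy_propose(plan, rng):
    """Symmetric proposal: give one random unit a uniformly random district label."""
    i = rng.randrange(len(plan))
    new_plan = list(plan)
    new_plan[i] = rng.randrange(N_DISTRICTS)
    return tuple(new_plan)

rng = random.Random(0)
samples = metropolis_sample(toy_score, (0,) * len(POPS), toy_propose,
                            beta=10.0, n_steps=5000, rng=rng)
late = samples[len(samples) // 2:]
print(sum(toy_score(s) for s in late) / len(late))  # well below the initial score
```

Starting from a maximally unbalanced plan, the chain quickly drifts toward balanced plans while still visiting slightly worse ones. Larger values of `beta` concentrate the samples more tightly on low-score plans.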

He and his collaborators create a sample of 24,000 possible redistricting plans. They tally the votes for each simulated district and compare the outcomes with those of the actual districts. Across the sampled plans, Democrats would secure four to nine seats with the 2012 votes and three to seven seats with the 2016 votes (see Figure 1). The results tally with those from the redistricting plan drawn by judges on a bipartisan commission as part of the “Beyond Gerrymandering” project. Figure 1 indicates that, when compared to the bipartisan plan, the maps used in the 2012 and 2016 North Carolina congressional elections show a bias towards Republicans. Results were calculated using fixed vote counts and changing district boundaries.
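The idea of holding votes fixed while changing district boundaries can be sketched in a few lines. The example below uses six hypothetical precincts and three districts; the vote counts, the two plans, and the function names `seats_won` and `seat_histogram` are invented for illustration.

```python
from collections import Counter

def seats_won(plan, dem_votes, rep_votes, n_districts):
    """Number of districts in which Democrats outpoll Republicans under a plan."""
    dem = [0] * n_districts
    rep = [0] * n_districts
    for district, d, r in zip(plan, dem_votes, rep_votes):
        dem[district] += d
        rep[district] += r
    return sum(1 for k in range(n_districts) if dem[k] > rep[k])

def seat_histogram(plans, dem_votes, rep_votes, n_districts):
    """How often each seat count occurs across a collection of plans."""
    return Counter(seats_won(p, dem_votes, rep_votes, n_districts) for p in plans)

# Fixed precinct-level votes: Democrats win 320 of 600 votes statewide.
dem_votes = [70, 70, 45, 45, 45, 45]
rep_votes = [30, 30, 55, 55, 55, 55]

packed = (0, 0, 1, 1, 2, 2)   # both strongly Democratic precincts in one district
spread = (0, 1, 0, 1, 2, 2)   # strongly Democratic precincts in separate districts

print(seats_won(packed, dem_votes, rep_votes, 3))  # 1
print(seats_won(spread, dem_votes, rep_votes, 3))  # 2
```

The same votes yield one Democratic seat under the packed plan and two under the other, which is why a histogram of seat counts over many sampled plans serves as a baseline against which an enacted map can be judged.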

Figure 1. Probability distribution of the congressional delegation’s composition for the 2012 and 2016 North Carolina congressional elections. Based on the sample of redistricting plans, Democrats could secure four to nine and three to seven seats for the 2012 and 2016 congressional elections respectively. The plan used by the judges from a bipartisan commission shows that Democrats would win six seats in 2012 and four in 2016. In comparison, the 2012 and 2016 North Carolina congressional elections (in orange and purple) show a heavy bias towards Republicans. Image courtesy of [1].

Utilizing their sample of redistricting plans, Mattingly’s group represents the Democratic vote share distribution as a marginal box plot ordered from the most Republican to the most Democratic district, as shown in Figure 2 for the voting data from the 2012 (left) and 2016 (right) elections. They compare it with the actual maps used in the 2012 and 2016 North Carolina elections, and with the map generated from the judges’ bipartisan plan. The judges’ map follows a nearly linear trend, very similar to the medians of Mattingly’s simulation set in the box-and-whisker plot. However, the maps actually used in the elections look quite different and resemble an “S” curve, with Democratic voters “packed” into a few overwhelmingly Democratic districts (see upper right of Figure 2; the orange and purple dots occur as outliers). Similarly, the third- to sixth-most Democratic districts (eighth- to tenth-most Republican districts) seem to be “cracked,” i.e., underrepresented, with the election outcomes not reflective of the Democratic vote fraction, which is equal to or more than 50 percent.
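The ordering behind such a plot can be sketched as follows: for each plan, compute the Democratic vote fraction in every district and sort the fractions; then summarize each rank across plans. This minimal example reports only the median per rank (a box plot would add quartiles and whiskers), assumes every district contains voters, and uses hypothetical data and function names.

```python
from statistics import median

def ordered_dem_fractions(plan, dem_votes, rep_votes, n_districts):
    """Democratic vote fractions by district, sorted from least to most Democratic."""
    dem = [0] * n_districts
    rep = [0] * n_districts
    for district, d, r in zip(plan, dem_votes, rep_votes):
        dem[district] += d
        rep[district] += r
    # Assumes no district is empty, so the denominators are positive.
    return sorted(dem[k] / (dem[k] + rep[k]) for k in range(n_districts))

def rank_medians(plans, dem_votes, rep_votes, n_districts):
    """Median Democratic fraction at each rank across all plans."""
    rows = [ordered_dem_fractions(p, dem_votes, rep_votes, n_districts) for p in plans]
    return [median(column) for column in zip(*rows)]

dem_votes = [70, 70, 45, 45, 45, 45]
rep_votes = [30, 30, 55, 55, 55, 55]
plans = [(0, 0, 1, 1, 2, 2), (0, 1, 0, 1, 2, 2)]

for plan in plans:
    print(ordered_dem_fractions(plan, dem_votes, rep_votes, 3))
print(rank_medians(plans, dem_votes, rep_votes, 3))
```

An enacted map whose sorted fractions sit far from the per-rank summaries of the sample, for example with a most Democratic district well above the typical range, shows the packing signature described above.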

Figure 2. Statistical result summary of districts ordered from most Republican to most Democratic for 2012 and 2016 election data, shown as box plots. These data are compared with three redistricting plans (maps): those used in the “NC2012” (orange dots) and “NC2016” (purple dots) congressional elections, as well as the judges’ map (green dots). Image courtesy of [1].

Mattingly indicates that defining a universal score function across all states is impossible; one must recognize each state’s different geopolitical properties and every election’s varied geopolitical makeup.

Nevertheless, mathematics is now at the forefront of the gerrymandering debate, with more states requiring mathematicians to perform fair evaluations of redistricted maps. Pennsylvania Gov. Tom Wolf recently enlisted mathematician Moon Duchin, who leads the Metric Geometry and Gerrymandering Group at Tufts University, to determine if the state’s maps were gerrymandered with a partisan bias. As Duchin succinctly put it, “This math is at the center of what seems to be a promising breakthrough in developing a legal framework to identify gerrymanders.”

References
[1] Herschlag, G., Kang, H.S., Luo, J., Graves, C.V., Bangia, S., Ravier, R., & Mattingly, J. (2018). Quantifying Gerrymandering in North Carolina. Preprint, arXiv:1801.03783.

Further Reading
Department of Mathematics, Duke University. Quantifying gerrymandering: A nonpartisan research group centered @ Duke Math. Retrieved from https://sites.duke.edu/quantifyinggerrymandering/.
Mattingly, J. & Vaughn, C. (2014). Redistricting and the will of the people. Cornell University Library, arXiv:1410.8796.

Lakshmi Chandrasekaran received her Ph.D. in mathematical sciences from the New Jersey Institute of Technology. She earned her master’s in science journalism from Northwestern University and is a freelance science writer whose work has appeared in several outlets. She can be reached on Twitter at @science_eye.