Confronting Inequity and Data Bias in Travel Demand Modeling

By Jillian Kunze

The interstate highway system represents a clear present-day link to inequitable transportation decisions of the past. The construction of a freeway through Detroit, for example, helped to destroy entire Black neighborhoods in the 1950s and 1960s (see Figure 1). But this is just one case of many; over 475,000 households in the U.S. were displaced to make room for the federal highway system, the majority of which were in low-income and minority urban communities. “This is evidence of how decisions that we make, if done without investigating impacts to vulnerable communities, can result in devastating consequences,” Tierra Bills, an assistant professor at the University of Southern California, Los Angeles, said.

During a technical vision talk at the recent Women in Data Science Worldwide Conference 2022, Bills spoke on the topic of confronting inequities and data biases in travel demand analysis. “In transportation engineering and planning, we have a long history of investigating equity in decision-making for large transportation infrastructure that dates back to cases in the 1950s and 1960s,” she said.

Bills introduced the urgency of working towards transportation equity by describing the extent of income inequality in the U.S. As of 2018, the richest 0.1 percent of earners in the U.S. accrued 196 times more income than the bottom 90 percent of earners. This economic inequity constrains economic growth [1] and makes the U.S. less resilient in the face of emergencies such as the COVID-19 pandemic. Together, these issues make it critical to address equity in all domains.

Figure 1. The view from the same vantage point in Detroit before and after the construction of a highway through the Black Bottom neighborhood. 1a. An aerial view along Hastings Street looking north from the Brewster-Douglass Housing Projects in 1959. 1b. An aerial view along the Chrysler Freeway looking north from the Brewster-Douglass Housing Projects in 1961. Figure courtesy of the Detroit Historical Society (1a, 1b).

One of those domains is transportation, where inequity is not a problem relegated solely to history. Data from ride-share services Uber and Lyft, for example, shed light on disparities in the quality of service across racial groups: Black and minoritized riders face significantly longer wait times for their rides due to higher cancellation rates by drivers [2, 3, 5]. In addition, although more vulnerable travelers tend to benefit more from bus services, federal transportation funding tends to favor heavy rail [4]. Furthermore, vulnerable communities are exposed to more noise and emission pollution than other communities, despite the fact that they contribute less to this pollution [7].

Motivated by these inequities, Bills provided a background understanding of how travel demand analysis fits into the transportation landscape. “Travel demand models are computational tools that are often used in mid- to long-range forecasting of transportation plans and decisions,” she explained. “They can be used to decide for or against large transportation infrastructure.” For example, policymakers may look at the results from these models to consider whether to develop certain types of transportation, expand a highway, or implement traffic mitigation measures. Since these models are frequently incorporated into decision-making processes, they can drive upwards of $200 billion of annual transportation spending in the U.S. on the federal, state, and regional levels.

Figure 2 provides an overview of the process of travel demand analysis. Based on the modeling objectives, researchers collect relevant data from surveys or sources of big data; classical household survey data are still the primary source for estimating travel demand models. Researchers then clean and expand the data, make estimations and predictions with a travel demand model, and perform analyses to inform transportation decisions. Bills’ current research focuses on the data collection phase, particularly on the biases that are associated with survey data collection. 

Figure 2. The process of travel demand analysis. Figure courtesy of Tierra Bills.

Several types of survey errors can bias the data. Non-coverage biases, for example, arise when some people are not included in the sampling frame at all. For those who do receive the survey, a unit nonresponse error can occur when an individual opts not to respond at all, and item nonresponse errors can occur when an individual does not respond to specific questions in the survey. These kinds of errors, which may be linked to low accessibility of survey modes due to literacy or digital access barriers, mean that surveys do not always reflect the needs of target groups. For instance, sampling errors are more prevalent for disadvantaged demographics such as low-income, elderly, underemployed, and transit-dependent groups [6].
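
The difference between unit and item nonresponse can be made concrete with a small tally. The Python sketch below is purely illustrative: the records, the field names `income` and `trips`, and every value are hypothetical and do not come from Bills' talk or any real survey.

```python
# Hypothetical survey outcomes for five people in the sampling frame.
# None           -> the person never responded (unit nonresponse).
# a None field   -> the person skipped that question (item nonresponse).
# People missing from the frame (non-coverage) cannot appear here at all,
# which is why that bias cannot be measured from the survey data alone.
records = [
    {"income": 42000, "trips": 3},
    {"income": None, "trips": 2},
    None,
    {"income": 38000, "trips": None},
    None,
]

respondents = [r for r in records if r is not None]

# Share of sampled people who returned nothing.
unit_nonresponse_rate = (len(records) - len(respondents)) / len(records)

# Share of respondents who skipped each question.
item_nonresponse_rate = {
    question: sum(r[question] is None for r in respondents) / len(respondents)
    for question in ("income", "trips")
}

print(unit_nonresponse_rate)
print(item_nonresponse_rate)
```

Computing these rates separately for each demographic group, rather than for the sample as a whole, is one way to see whether nonresponse is concentrated in particular groups.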

Researchers have proposed multiple possible solutions to address biased data, including larger sample sizes, sophisticated data stratification techniques, sample weighting, and data imputation. But there is still more to do, as the severity of underrepresentation in these datasets and the ramifications for policymaking are unknown. “The primary challenge for data science and estimating of these travel demand models is that we tend to assume that the data we have available actually gives us the complete picture,” Bills said. “In other words, we assume that any errors that might exist in the data are ignorable errors.”
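
As a rough illustration of one of the remedies listed above, sample weighting, the following Python sketch applies simple post-stratification weights (population share divided by sample share). The group names, population shares, and trip counts are all hypothetical. The sketch also relies on the very assumption that Bills questions: that the people who responded within a group resemble those who did not.

```python
# Hypothetical population shares for two groups (assumed known, e.g. from a census).
population_share = {"low_income": 0.30, "other": 0.70}

# Hypothetical survey responses: daily transit trips per respondent.
# The low-income group is underrepresented (2 of 10 respondents, not 3 of 10).
responses = {
    "low_income": [1.0, 1.2],
    "other": [0.2, 0.3, 0.1, 0.4, 0.2, 0.3, 0.1, 0.4],
}

n_total = sum(len(values) for values in responses.values())

# Post-stratification weight: population share / sample share.
weights = {
    group: population_share[group] / (len(values) / n_total)
    for group, values in responses.items()
}

unweighted_mean = sum(sum(values) for values in responses.values()) / n_total

weighted_total = sum(weights[g] * x for g, values in responses.items() for x in values)
weight_sum = sum(weights[g] * len(values) for g, values in responses.items())
weighted_mean = weighted_total / weight_sum

print(weights)                         # low_income is weighted up (about 1.5)
print(unweighted_mean, weighted_mean)  # weighting raises the estimate here
```

Weighting corrects only for how many people in each group responded. If the low-income respondents differ systematically from low-income nonrespondents, the weighted estimate remains biased, which is the sense in which such errors are not "ignorable."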

Bills provided some context for how errors in data affect travel demand modeling with a simple example. In Figure 3a, Vanessa and Susan represent two demographics for which researchers would like to estimate job accessibility in a downtown location, a common objective in transportation planning. Downtown has 100 jobs available to both demographic groups, and both groups live 10 miles from downtown with the same access to transportation. These similarities would lead a travel demand model to reflect the same level of attractiveness or accessibility to traveling downtown for both groups.

“However, what if there are biases unknown to model analysts or not reflected in the data, like discrimination with regard to the labor market?” Bills asked. For example, if 25 out of those 100 companies were less likely to hire Vanessa, then that should in reality result in a reduction of her accessibility to downtown employment — but the model would still predict the same level of accessibility for Vanessa and Susan because it does not account for these effects. 
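
The Vanessa and Susan example can be sketched numerically. The article does not specify the form of the accessibility measure, so the Python sketch below assumes a common gravity-type form (jobs discounted exponentially by distance). The decay parameter `beta` and the `relative_hiring_chance` value are hypothetical choices made only for illustration.

```python
import math


def accessibility(jobs, distance_miles, beta=0.1):
    """Gravity-type accessibility: jobs discounted by travel distance.

    The functional form and beta are assumptions for this sketch.
    """
    return jobs * math.exp(-beta * distance_miles)


# What the model sees: same jobs, same distance, so identical accessibility.
model_vanessa = accessibility(jobs=100, distance_miles=10)
model_susan = accessibility(jobs=100, distance_miles=10)

# Hypothetical correction: 25 of the 100 employers are half as likely
# to hire Vanessa, so those jobs count for less in her case.
relative_hiring_chance = 0.5
effective_jobs_vanessa = 75 + 25 * relative_hiring_chance
actual_vanessa = accessibility(jobs=effective_jobs_vanessa, distance_miles=10)

print(model_vanessa, model_susan)  # equal
print(actual_vanessa)              # lower than the model's value for Vanessa
```

Because the hiring disparity never appears in the model's inputs, no choice of `beta` would let the model distinguish the two groups.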

Figure 3. A simplified example of how bias can enter travel demand modeling when estimating accessibility to a downtown location. 3a. Vanessa and Susan represent two demographics who both live 10 miles from downtown; however, there are fewer employers willing to hire Vanessa. 3b. If data from Vanessa and Susan are not available, analysts may use data from the similar demographic groups represented by Pam and Lydia as a proxy. Figure courtesy of Tierra Bills.

Other issues could arise if analysts did not have access to Vanessa and Susan’s data, and instead used responses from other individuals (represented by Pam and Lydia in Figure 3b) as proxies to reflect their demographics. Pam lives closer to downtown than Vanessa, whereas Lydia lives the same distance away as Susan. Because Pam lives closer to downtown than Vanessa does, the jobs located downtown would appear more attractive, and the model would overestimate accessibility for the demographic group that Pam and Vanessa share, since it lacks the relevant data from Vanessa.
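
The proxy effect can be illustrated in the same spirit. The Python sketch below again assumes a gravity-type accessibility measure, which the article does not specify. Pam's distance of 5 miles and the decay parameter `beta` are hypothetical; only the 10-mile distance and the 100 jobs come from the example.

```python
import math


def accessibility(jobs, distance_miles, beta=0.1):
    """Gravity-type accessibility (an assumed form, not from the talk)."""
    return jobs * math.exp(-beta * distance_miles)


vanessa_distance = 10  # from the example
pam_distance = 5       # hypothetical: the article says only that Pam lives closer

# Estimate for Vanessa's demographic group using Pam as a proxy.
proxy_estimate = accessibility(jobs=100, distance_miles=pam_distance)

# Value the same measure would give with Vanessa's own data.
value_with_vanessa = accessibility(jobs=100, distance_miles=vanessa_distance)

overestimate_ratio = proxy_estimate / value_with_vanessa

print(proxy_estimate, value_with_vanessa)
print(overestimate_ratio)  # greater than 1: the proxy inflates accessibility
```

Lydia, who lives the same distance away as Susan, would produce no such distortion under this measure, so the error falls only on the group whose own responses are missing.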

This example shows how nonresponse errors that are significantly more prevalent for certain groups can lead to misleading model results, such as making it appear that a demographic group lives closer to downtown than they do in reality. “This is how our travel models might lead us to assume that working downtown is more attractive than it actually is for various demographics,” Bills said. Also, if social inequity and racial biases are affecting market outcomes but are not accounted for in travel demand models, this may lead to incorrect model predictions that drive wrong transportation decisions overall (at worst) and do not address the real needs of vulnerable communities (at best).

To conclude her talk, Bills reminded the audience that vulnerable communities are impacted by both inequitable decisions of the past and new transportation services today. Her ongoing research is seeking to advance equity through travel demand analysis. “There are a number of solutions that my team is working on, but I think primarily that we need to de-normalize the use of data as if it gives us the complete picture,” Bills said. “[We need to] work towards [implementing understandings] of the likelihood of the data or probability of responses from various groups in transportation data analysis.”

A recording of Tierra Bills’ technical vision talk at the Women in Data Science Worldwide Conference 2022 is available on YouTube.

[1] Bivens, J. (2017, December 12). Inequality is slowing U.S. economic growth. Economic Policy Institute. Retrieved from https://www.epi.org/publication/secular-stagnation/.
[2] Brown, A.E. (2018). Ridehail revolution: Ridehail travel and equity in Los Angeles [Ph.D. thesis, University of California, Los Angeles]. eScholarship: UCLA Electronic Theses and Dissertations.
[3] Ge, Y., Knittel, C.R., MacKenzie, D., & Zoepf, S. (2020). Racial discrimination in transportation network companies. J. Public Econ., 190, 104205.
[4] Golub, A., Marcantonio, R.A., & Sanchez, T.W. (2013). Race, space, and struggles for mobility: Transportation impacts on African Americans in Oakland and the East Bay. Urban Geogr., 34(5), 699-728.
[5] Hughes, R., & MacKenzie, D. (2016). Transportation network company wait times in Greater Seattle, and relationship to socioeconomic indicators. J. Transp. Geogr., 56, 36-44. 
[6] Liévanos, R.S., Lubitow, A., & McGee, J.A. (2019). Misrecognition in a sustainability capital: Race, representation, and transportation survey response rates in the Portland metropolitan area. Sustainability, 11(16), 4336.
[7] Sider, T., Hatzopoulou, M., Eluru, N., Goulet-Langlois, G., & Manaugh, K. (2015). Smog and socioeconomics: An evaluation of equity in traffic-related air pollution generation and exposure. Environ. Plann. B Plann. Des., 42(5), 870-887.

Jillian Kunze is the associate editor of SIAM News.