SIAM News Blog

New Math to Manage Online Misinformation

By Neil F. Johnson

Social media continues to amplify the spread of misinformation and other malicious material [2, 3, 4, 9]. Even before the COVID-19 pandemic, a significant amount of misinformation circulated every day on topics like vaccines [6], the U.S. elections, and the U.K. Brexit vote. Researchers have linked the rise in online hate and extremist narratives to real-world attacks, youth suicides, and mass shootings such as the 2019 mosque attacks in Christchurch, New Zealand. The ongoing pandemic added to this tumultuous online battlefield with misinformation about COVID-19 remedies and vaccines [1, 8]. Misinformation about the origin of COVID-19 has also resulted in real-world attacks against members of the Asian community. In addition, news stories frequently describe how social media misinformation negatively impacts the lives of politicians, celebrities, athletes, and members of the public [10].

Social media companies like Facebook have invested significant resources into policing their platforms; even so, they struggle with the daily deluge of new material to monitor. These companies must simultaneously address the increasingly impatient calls from policymakers and governments that want social media platforms to do “more.” But without a quantitative model of the spread of misinformation, the meaning of “more” remains unclear.

Figure 1. The online world. People form online communities around shared interests regardless of physical location. Communities then become connected through hyperlinks that members share. Each circle is a cluster whose size is given by the number of linked communities within it. These links occasionally break due to loss of interest or moderator intervention, among other causes. This outcome produces an ongoing process of cluster coalescence and fragmentation that allows misinformation and malicious material to pass from infected communities to susceptible ones within the same cluster, then spread more widely as these clusters coalesce and fragment. Figure courtesy of Neil F. Johnson.
Throughout the COVID-19 pandemic, policymakers have used mathematical models in their attempts to contain the spread of the real virus. Why can’t we simply apply these same models to describe the online spread of misinformation?

Unfortunately, the details that surround individuals’ online interactions invalidate many of the assumptions in traditional mathematical models of spreading. People use social media to create and join virtual communities around shared interests, irrespective of their physical location (e.g., a Facebook page or group of cat lovers from across the globe). These types of online communities may have more than a million members whose mutual interest provides a level of trust that can encourage them to pay more attention to shared content within the groups — including (mis)information about COVID-19 and other topics. Communities can then become temporarily interlinked through the hyperlinks that their members post (see Figure 1). This activity effectively leads to a rapidly evolving ecology in which clusters of communities coalesce with probability \(v_\textrm{coal}\) at any given time. Such links can also break through loss of interest or moderator interventions, meaning that clusters of communities may fragment with probability \(v_\textrm{frag}\) at any given time. 
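To make this coalescence-fragmentation picture concrete, the following minimal Python sketch simulates clusters of communities that merge with probability \(v_\textrm{coal}\) and shatter with probability \(v_\textrm{frag}\) at each time step. The specific update rules here (merging two randomly chosen clusters, shattering a size-weighted cluster back into isolated communities) are illustrative simplifications rather than the exact dynamics analyzed in [5].

```python
import random

def coalescence_fragmentation_step(clusters, v_coal, v_frag):
    """One illustrative time step of cluster coalescence and fragmentation.

    `clusters` is a list of lists; each inner list holds the IDs of the
    communities that are currently linked together in one cluster.
    """
    # Coalescence: with probability v_coal, two randomly chosen clusters merge.
    if len(clusters) > 1 and random.random() < v_coal:
        i, j = random.sample(range(len(clusters)), 2)
        clusters[i].extend(clusters[j])
        del clusters[j]
    # Fragmentation: with probability v_frag, one cluster (chosen in
    # proportion to its size) shatters back into isolated communities.
    if clusters and random.random() < v_frag:
        k = random.choices(range(len(clusters)),
                           weights=[len(c) for c in clusters])[0]
        clusters.extend([[m] for m in clusters.pop(k)])
    return clusters

# Example: 100 communities that all start out isolated.
clusters = [[c] for c in range(100)]
for _ in range(1000):
    clusters = coalescence_fragmentation_step(clusters, v_coal=0.9, v_frag=0.05)
print(sorted(len(c) for c in clusters)[-5:])  # five largest cluster sizes
```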

The major complication lies in the fact that these cluster dynamics occur on the same day-to-day scale as the sharing of content. This overlap means that misinformation and malicious material that may be dying out in terms of interest can get injected into new communities, where it can regrow and even mutate (see Figure 1). As a result, system-wide spreading of misinformation might occur even though there is no continuous path across the system — akin to crossing a wide river by continually repositioning a short plank between adjacent rocks. This occurrence gives rise to complex empirical snapshots that show the spread of misinformation between communities (circles) within and across social media platforms (see Figure 2a).

To mathematically address this new regime of behavior, we began with the fact that posting within an online community almost immediately exposes all members to the same material. At any given time, we can therefore assume that each community is either susceptible (S) and hence does not have a given type of misinformation posted within it; infected (I) and hence does have a given type of misinformation posted within it; or recovered (R) and hence had that misinformation posted within it but no longer does. Each susceptible community becomes infected with probability \(p\) via an infected community to which it is connected. Each infected community recovers with probability \(q\). This knowledge allows us to turn Figure 1 into an equation for the probability that a given pair of communities are linked and that one is susceptible while the other is infected. We then average this equation across all communities.
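To see how the probabilities \(p\) and \(q\) play out on top of the cluster dynamics, one can layer a susceptible/infected/recovered label onto each community, continuing the earlier sketch. The toy update below treats every pair of communities inside the same cluster as linked and lets each infected community recover independently; these are simplifying assumptions for illustration, not the pair-level equation that we actually derive and average.

```python
import random

def infection_step(state, clusters, p, q):
    """Illustrative community-level S/I/R update on the current clusters.

    `state` maps a community ID to 'S', 'I', or 'R'. Communities inside the
    same cluster are treated as mutually linked, so a susceptible community
    with an infected cluster-mate becomes infected with probability p.
    """
    newly_infected = []
    for cluster in clusters:
        if any(state[c] == 'I' for c in cluster):
            for c in cluster:
                if state[c] == 'S' and random.random() < p:
                    newly_infected.append(c)
    # Previously infected communities recover with probability q.
    for c in state:
        if state[c] == 'I' and random.random() < q:
            state[c] = 'R'
    for c in newly_infected:
        state[c] = 'I'
    return state

# Run the infection alongside the cluster dynamics; this reuses
# coalescence_fragmentation_step() from the earlier sketch.
state = {c: 'S' for c in range(100)}
state[0] = 'I'                       # seed one infected community
clusters = [[c] for c in range(100)]
for _ in range(1000):
    clusters = coalescence_fragmentation_step(clusters, v_coal=0.9, v_frag=0.05)
    state = infection_step(state, clusters, p=0.05, q=0.02)
print(sum(s != 'S' for s in state.values()), "communities were ever infected")
```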

The resulting equation has a tipping point beyond which system-wide spreading of misinformation can occur. As Figure 2b illustrates, this tipping point condition is given by \(R>1\) where \(R = v_\textrm{coal}\,p/(v_\textrm{frag}\,q)\). Though simple, this expression captures the complex interplay between the social dynamics (\(v_\textrm{coal}\) and \(v_\textrm{frag}\)) and the viral dynamics (\(p\) and \(q\)) in Figure 1. It implies that a given piece of misinformation can suddenly spread throughout the system if the probability that clusters of communities will coalesce is sufficiently large (i.e., \(v_\textrm{coal}\) is large) and/or if the probability that clusters of communities will fragment is sufficiently small (i.e., \(v_\textrm{frag}\) is small). Our \(R\) formula also contains the traditional message that system-wide spreading is more likely if the material is highly infectious (i.e., \(p\) is large) and/or the recovery rate is slow (i.e., \(q\) is small). All of these conditions increase \(R\) and make it more likely that the spreading condition \(R>1\) is met. Our \(R\) formula therefore generalizes to the online world the ‘R’ number narrative that policymakers used during the real pandemic [5].
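As a quick numerical sense check of how these four knobs interact, the formula can be evaluated directly; the parameter values below are arbitrary illustrations, not measured quantities.

```python
def spreading_R(v_coal, v_frag, p, q):
    """Online tipping-point number R = (v_coal * p) / (v_frag * q)."""
    return (v_coal * p) / (v_frag * q)

# Frequent coalescence and rare fragmentation push R above the tipping point.
print(spreading_R(v_coal=0.9, v_frag=0.3, p=0.1, q=0.2))   # 1.5   -> spreading
# Aggressive link-breaking and fast takedowns push R well below 1.
print(spreading_R(v_coal=0.3, v_frag=0.9, p=0.1, q=0.5))   # ~0.07 -> no spreading
```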

Figure 2. The new mathematics of online spreading. 2a. Empirical snapshots of the early spread of COVID-19 misinformation within and across six major social media platforms. Each circle represents an online community that formed around some common interest. 2b. Our derived mathematical expression for the condition under which misinformation will spread throughout a system, i.e., \(R>1\) where \(R = v_\textrm{coal}\,p/(v_\textrm{frag}\,q)\). Figure courtesy of Neil F. Johnson.

If we interpret the coalescence and fragmentation probabilities \(v_\textrm{coal}\) and \(v_\textrm{frag}\) as averages across platforms, this tipping point condition \(R>1\) also applies to misinformation that spreads across the entire social media multiverse. This fact is important because communities that want to promote malicious content and misinformation tend to link to other communities on other social media platforms (see Figure 2a). These cross-platform links give such communities somewhere to escape to should moderators chase them off their original platform. In some ways, the situation is like a bug infestation; if you spray one house in a neighborhood or one unit in an apartment building, it may appear that you have eradicated the bugs. However, most of the bugs simply move next door. They may come back a month later and perhaps be wiser about how to remain under the radar. The situation is even worse online, as all platforms are effectively neighbors with each other.

Our simple formula can help policymakers better understand the control knobs that are available to them and the likely impact of any intervention. Nevertheless, our analysis only scratches the surface of this complex problem. What about the fact that misinformation can evolve and mutate? Given that different types of users are attracted to different platforms, shouldn't different platforms have different transmission dynamics? What is the consequence of many types of misinformation circulating and possibly interacting at the same time? What about the possibility of “vaccinating” some of the communities? 

These questions provide an exciting opportunity for applied mathematicians to address an important societal problem while exploring new mathematical models. The need could not be more urgent; as I write this piece, online COVID-19 misinformation is being supplemented with a deluge of misinformation about climate change in the wake of the 2021 Glasgow Climate Change Conference. Furthermore, online distrust of—and even hate and threats towards—scientists is at dangerous levels [7]. Indeed, many professional entities are now calling scientific misinformation the most important problem of our time. 

The good news for younger mathematicians is twofold: this brave new world of online behavior—which will likely produce exciting new mathematical models, solutions, and careers—is also a place that they already understand much better than their predecessors!


Neil F. Johnson presented this research during a minisymposium at the 2021 SIAM Conference on Applications of Dynamical Systems, which took place virtually in May 2021.

Acknowledgments: This material is based upon work supported by the Air Force Office of Scientific Research under award numbers FA9550-20-1-0382 and FA9550-20-1-0383.

References
[1] Calleja, N., AbdAllah, A., Abad, N., Ahmed, N., Albarracin, D., Altieri, E., … Purnat, T.D. (2021). A public health research agenda for managing infodemics: Methods and results of the first WHO infodemiology conference. JMIR Infodemiology, 1(1), e30979. 
[2] House of Commons Home Affairs Committee. (2017). Hate crime: Abuse, hate and extremism online (Fourteenth report of session 2016-17). UK Parliament. Retrieved from https://publications.parliament.uk/pa/cm201617/cmselect/cmhaff/609/609.pdf. 
[3] Johnson, N.F., Leahy, R., Johnson Restrepo, N., Velásquez, N., Zheng, M., Manrique, P., ... Wuchty, S. (2019). Hidden resilience and adaptive dynamics of the global online hate ecology. Nature, 573, 261-265.
[4] Johnson, N.F., Manrique, P., Zheng, M., Cao, Z., Botero, J., Huang, S., … Restrepo, E.M. (2019). Emergent dynamics of extremes in a population driven by common information sources and new social media algorithms. Sci. Rep., 9, 11895.
[5] Johnson, N.F., Velásquez, N., Jha, O.K., Niyazi, H., Leahy, R., Johnson Restrepo, N., … Wuchty, S. (2020). Covid-19 infodemic reveals new tipping point epidemiology and a revised R formula. Preprint, arXiv:2008.08513.
[6] Johnson, N.F., Velásquez, N., Johnson Restrepo, N., Leahy, R., Gabriel, N., El Oud, S., … Lupu, Y. (2020). The online competition between pro- and anti-vaccination views. Nature, 582, 230-233.
[7] Nogrady, B. (2021, October 13). ‘I hope you die’: How the COVID pandemic unleashed attacks on scientists. Nature. Retrieved from https://www.nature.com/articles/d41586-021-02741-x. 
[8] Sear, R.F., Velásquez, N., Leahy, R., Johnson Restrepo, N., El Oud, S., Gabriel, N., ... Johnson, N.F. (2020). Quantifying COVID-19 content in the online health opinion war using machine learning. IEEE Access, 8, 91886-91893.
[9] Velásquez, N., Leahy, R., Johnson Restrepo, N., Lupu, Y., Sear, R., Gabriel, N., … Johnson, N.F. (2021). Online hate network spreads malicious COVID-19 content outside the control of individual social media platforms. Sci. Rep., 11, 11549. 
[10] Velásquez, N., Manrique, P., Sear, R., Leahy, R., Johnson Restrepo, N., Illari, L., ... Johnson, N.F. (2021). Hidden order across online extremist movements can be disrupted by nudging collective chemistry. Sci. Rep., 11, 9965. 

Neil F. Johnson is a professor at George Washington University and a visiting professor at Royal Holloway, University of London. He is a Fellow of the American Physical Society and a recipient of the 2018 Joseph A. Burton Forum Award. 
