SIAM News Blog

Challenging Statistical Independence in Co-Infections by Noninteracting Pathogens

By Lina Sorg

If pathogen species do not interact with each other, one would naturally believe the proportion of co-infected hosts to be the product of individual prevalences. For this reason, the assumption of statistical independence underlies a wide variety of methods for identifying pathogen interactions based on cross-sectional survey data. However, simple epidemiological models challenge this fundamental assumption. During a minisymposium presentation at the 2020 SIAM Conference on the Life Sciences, which took place virtually last month, Lou Gross of the University of Tennessee modeled the dynamics of non-interacting pathogens that cause chronic infections and examined their non-independence.

Gross opened his talk with a brief discussion of maize lethal necrosis (MLN) disease, which results from the simultaneous infection of maize by a combination of two viruses: the maize chlorotic mottle virus (MCMV) and the sugarcane mosaic virus (SCMV). Individually, these viruses are not problematic for maize. However, they cause significant damage when they co-infect the crop.

Gross constructed a simple model of MLN disease and assumed that the two viruses do not interact within the crop’s season; multiplicative prevalence in field data supports this postulation. The model suggested that coinfection arises because of the oft-believed assumption of statistical independence. This result prompted Gross to further investigate statistical independence; the subsequent study is published in a recent PLoS Biology paper entitled “Coinfections by noninteracting pathogens are not independent and require new tests of interaction.”

Pathogen interaction tests are based on independence. To begin, Gross posed a simple example question: Does pathogen A affect the susceptibility to pathogen B? He presented a sample scenario with two pathogens and a data set comprising 1,000 individuals. Gross calculated the net prevalence of each pathogen as the fraction of individuals in the data set that it infected, then computed the expected prevalence of co-infection under the assumption that the pathogens are independent. He employed a chi-squared test to determine whether the pathogens interact.
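The calculation described above can be sketched in a few lines of Python. The counts are hypothetical and chosen only for illustration; they are not the numbers from the talk. The p-value uses the closed form for a chi-squared distribution with one degree of freedom, so no statistics library is needed.

```python
import math

# Hypothetical cross-sectional survey of 1,000 hosts (illustrative counts only).
counts = {
    "neither": 560,
    "A_only": 190,
    "B_only": 140,
    "both": 110,
}
n = sum(counts.values())

# Net prevalence: fraction of hosts carrying each pathogen, alone or with the other.
prev_a = (counts["A_only"] + counts["both"]) / n
prev_b = (counts["B_only"] + counts["both"]) / n

# Expected counts if the two infection statuses were statistically independent.
expected = {
    "neither": n * (1 - prev_a) * (1 - prev_b),
    "A_only": n * prev_a * (1 - prev_b),
    "B_only": n * (1 - prev_a) * prev_b,
    "both": n * prev_a * prev_b,
}

# Pearson chi-squared statistic for the 2x2 table (1 degree of freedom).
chi2 = sum((counts[k] - expected[k]) ** 2 / expected[k] for k in counts)

# Survival function of chi-squared with 1 d.o.f.: P(X > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"prevalence A = {prev_a:.3f}, B = {prev_b:.3f}")
print(f"expected co-infected = {expected['both']:.1f}, observed = {counts['both']}")
print(f"chi2 = {chi2:.2f}, p = {p_value:.3g}")
```

Under the conventional reading, a small p-value here would be reported as evidence of interaction. The rest of the talk questions exactly that inference.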

The independence assumption underlies tests for pathogen associations, but Gross questioned its accuracy. These tests sometimes utilize more complex statistics, such as log-linear or other regressions and methods that account for confounding factors like gender or risk group. In addition, the vast majority of data are cross-sectional rather than longitudinal. And all methods for cross-sectional data are based on the simple null hypothesis that no interaction leads to multiplicative prevalences.

Schematic of the simple two-pathogen model tracking a pair of noninteracting pathogens.
After citing a study of human papillomavirus (HPV), which found that multiple infections occurred significantly more frequently than predicted by chance, Gross further delved into the pre-existing assumption that independence means non-interaction. To investigate, he created a simple two-pathogen model that embodies susceptible-infectious-susceptible (SIS) dynamics. The assumption that no interactions exist between pathogens leads to a set of differential equations. “It’s pretty easy in this simple example to write down a differential equation model,” Gross said. “Dynamics are a nice, simple ordinary differential equation system to solve.” His simplest model indicated that prevalences are not multiplicative and that co-infections occur more often than would be expected by chance.
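A minimal numerical sketch can illustrate this kind of result. It is a reconstruction under stated assumptions, not the equations from the talk: hosts are born uninfected and die at rate `mu`, infections last for the host's lifetime, and each pathogen is acquired at a rate equal to its own `beta` times its own net prevalence, regardless of whether the host carries the other pathogen. All names and parameter values are illustrative.

```python
def derivatives(state, beta1, beta2, mu):
    """Right-hand side for host fractions (uninfected, only 1, only 2, both)."""
    j0, j1, j2, j12 = state
    # Each force of infection depends only on that pathogen's net prevalence,
    # so neither pathogen alters transmission of the other.
    f1 = beta1 * (j1 + j12)
    f2 = beta2 * (j2 + j12)
    return (
        mu - (f1 + f2 + mu) * j0,      # births balance deaths; newborns are uninfected
        f1 * j0 - (f2 + mu) * j1,      # hosts carrying pathogen 1 only
        f2 * j0 - (f1 + mu) * j2,      # hosts carrying pathogen 2 only
        f2 * j1 + f1 * j2 - mu * j12,  # co-infected hosts
    )


def rk4_step(state, dt, *params):
    """Advance the system by one classical fourth-order Runge-Kutta step."""
    k1 = derivatives(state, *params)
    k2 = derivatives(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)), *params)
    k3 = derivatives(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)), *params)
    k4 = derivatives(tuple(s + dt * k for s, k in zip(state, k3)), *params)
    return tuple(
        s + dt * (a + 2 * b + 2 * c + d) / 6
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


beta1, beta2, mu = 2.0, 3.0, 1.0  # illustrative rates
state = (0.98, 0.01, 0.01, 0.0)   # start with rare single infections
dt = 0.01
for _ in range(20000):            # integrate to t = 200, long enough to settle
    state = rk4_step(state, dt, beta1, beta2, mu)

j0, j1, j2, j12 = state
prev1, prev2 = j1 + j12, j2 + j12
print(f"net prevalences: {prev1:.4f}, {prev2:.4f}")
print(f"co-infected fraction: {j12:.4f}")
print(f"product of prevalences: {prev1 * prev2:.4f}")
```

With these values the net prevalences settle near 0.5 and 0.67, and the co-infected fraction (about 0.42) exceeds their product (about 0.33), even though no interaction term appears anywhere in the system. In this sketch the excess comes from host turnover: older hosts have had more time to acquire both pathogens, and a co-infected host's death removes both infections at once.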

Next, Gross presented a stochastic version of the two-pathogen model. Once again, co-infection prevalence was higher than it would be under independence between pathogens. In this model, the net pathogen prevalences were correlated: infections by both pathogens decreased simultaneously whenever a co-infected host died. Calculating the covariance once again confirmed the correlation. Ultimately, Gross used the aforementioned two variations of the simple model to conclude that net prevalences are not expected to be independent, even if one pathogen has no effect upon the transmission of another.

Gross then set up a non-interacting different pathogen (NiDP) model, which is simple enough to yield a recursive solution. The model assumes a different \(R_0\) for each pathogen, all of which have SIS dynamics. However, for some data sets this model has too many parameters relative to the amount of data available, so it is not possible to fit one \(R_0\) per pathogen.

The choice between an NiDP and a non-interacting similar pathogens (NiSP) model is data-driven. Gross therefore compared the model’s fit of equilibria against a binomial/multinomial-type model, tested whether an NiSP or NiDP model is sufficient to explain the data, and further explored the assumption of statistical independence. He fit the models via maximum likelihood and compared them using the Akaike information criterion, since they are not nested. He then presented an example application of an NiSP model for HPV, which showed no evidence of interaction in the data, as well as an NiDP model for three species of malaria, which also offered no data-based evidence of interaction between the malaria parasites.
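For readers unfamiliar with this kind of comparison, the Akaike information criterion is \(2k - 2\ln L\), where \(k\) is the number of fitted parameters and \(L\) the maximized likelihood; the model with the lowest value is preferred. The sketch below only illustrates the arithmetic. The log-likelihoods and parameter counts are hypothetical placeholders, not results from the study.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L); lower is better."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical maximized log-likelihoods and parameter counts (not from the study).
candidates = {
    "independence (multinomial-type)": (-1250.0, 3),
    "NiSP (shared R0)": (-1238.5, 1),
    "NiDP (one R0 per pathogen)": (-1237.9, 3),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)

# Report models from best to worst, with the AIC difference to the best one.
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: AIC = {score:.1f} (delta = {score - scores[best]:.1f})")
```

In this made-up example the NiDP model fits slightly better than the NiSP model, but not by enough to justify its extra parameters, so the simpler model has the lower AIC.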

Ultimately, Gross concluded that non-interacting pathogens are not statistically independent, as the prevalence of co-infection is always greater than the product of the prevalences. This confirms that statistical independence is far from equivalent to the absence of biological interaction between pathogens. Such results are reasonable for chronic, long-lasting infections, though there are limitations associated with the use of a simple SIS model with just transmission and mortality. Nevertheless, Gross suggested that methods based on model-based distributions (assuming no biological interactions as a null expectation) should replace methods based on statistical independence and random distributions to detect interactions. 

Lina Sorg is the managing editor of SIAM News.