SIAM News Blog

Optimization and Data Analysis for Improved COVID-19 Detection and Measurement

By Anthony J. Kearsley

Researchers can use applied mathematics to refine and improve biometrology tools like quantitative polymerase chain reaction (qPCR), which is the gold standard for disease detection — particularly when individuals are contagious. This approach has proven especially useful during the COVID-19 pandemic. Loosely speaking, polymerase chain reaction (PCR) is a technique that amplifies specific portions of deoxyribonucleic acid (DNA) to facilitate their detection. A typical PCR experiment achieves this effect through repeated incubation cycles, each of which approximately doubles the number of target “amplicons.” One can add a fluorescent dye and a photodetector to this setup to monitor the doubling in real time, enabling so-called quantitative PCR. Successful identification of the target nucleic acid relies on mathematical analysis of the corresponding data in terms of fluorescence versus cycle number [5]. Practitioners almost always address this task by determining whether the signal reaches a subjectively prescribed threshold: an agreed-upon indication that the target nucleic acid is present in the sample.
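To make the conventional approach concrete, the following minimal Python sketch shows how a cycle-threshold call is typically rendered. The function name, the synthetic amplification curve, and the cutoff value are all hypothetical; this is purely illustrative and is not the method developed below.

```python
import numpy as np

def cycle_threshold(fluorescence, threshold):
    """Return the first cycle at which the signal crosses a prescribed
    threshold, or None if it never does (cycles are 1-indexed).

    Illustrative version of conventional qPCR thresholding only.
    """
    above = np.nonzero(np.asarray(fluorescence) >= threshold)[0]
    return int(above[0]) + 1 if above.size else None

# Hypothetical amplification curve: small linear background plus
# logistic growth, sampled over 40 cycles.
cycles = np.arange(1, 41)
signal = 0.02 * cycles + 1.0 / (1.0 + np.exp(-(cycles - 25) / 2.0))
print(cycle_threshold(signal, threshold=0.5))  # subjective cutoff; prints 21
```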

The nominal justification for such thresholding approaches relies on the observation that the shape of a typical, well-behaved qPCR amplification curve can be divided into three visually distinct regions. The first is a linear background phase that often appears between cycles one and 15 and has a characteristic slope that is small relative to the rest of the curve. The second is an amplification phase that reflects the exponential growth of the target nucleic acid; in practice, the signal crosses the detection threshold at this point, somewhere between cycles 20 and 35. Finally, the curve flattens into a plateau for the remaining cycles until the end of the experiment.

However, real-world testing does not always yield this idealized picture. For example, the plateau phase may never form if too few amplicons are present at the start of the reaction. Background effects can also be unexpectedly large and lead to false positives.

In March 2020, we began to address these issues by building a better mathematical understanding of background effects and the generic properties of qPCR measurements. The key difficulty in interpreting qPCR data is the absence of a physical model that universally describes the background effects and/or the “true” signal; such a model does not (and may never) exist. As a consequence, there may not be a single objective criterion for deciding whether a given behavior is, say, instrument noise, a spurious background effect, or a valid measurement signal. Scientists thus need to rely on the data’s more generic properties, which suggests an opportunity for applied mathematics.

Figure 1. An example in which our affine transformation succeeded in mapping to a reasonable reference curve. 1a. A collection of quantitative polymerase chain reaction (qPCR) curves that are associated with the N1 subunit of SARS-CoV-2 ribonucleic acid (RNA). 1b. Data collapse that results from a transformation of the curves in 1a onto the left-most curve. The inset shows the residuals normalized to the maximum scale of the reference curve. The characteristic errors are less than one percent and are likely due to photodetector noise. Figure courtesy of [6].

Our research led us to a two-part solution. The first part involves recognizing that even though the amount of background varies from sample to sample, its functional form appears fixed for a given instrument and amplification chemistry; even better, the background is always measured as part of a negative control. This suggested an optimization problem [6] that determines the amount of background measurement one must subtract from a test sample to ensure that the remaining baseline (cycles one to 15 or so) stays as close to white noise as possible [2]. In other words, subtracting the correct amount of systematic background effects—determined by the instrument—should leave only the photodetector instrument artifacts. We suspect that this is about as much as one can do with real-world measurements. Compared to more traditional methods that are based on polynomial extrapolation, our technique reduces background effects by up to an order of magnitude, thereby providing immediate benefits for sample classification.
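As a rough illustration of the kind of computation involved, the sketch below chooses how much of a measured negative-control signal to subtract so that the early-cycle baseline has negligible lag-one autocorrelation, one simple proxy for “as close to white noise as possible.” The function name and the choice of proxy objective are assumptions for illustration; the objective actually used in [6] differs in its details.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def subtract_background(test, negative_control, baseline_cycles=15):
    """Choose how much of the measured negative-control signal to subtract
    so that the early-cycle baseline looks like white noise.

    Simplified stand-in for the optimization in [6]: the objective
    penalizes the lag-one autocorrelation of the baseline residual, which
    vanishes (in expectation) for white noise.
    """
    y = np.asarray(test[:baseline_cycles], dtype=float)
    b = np.asarray(negative_control[:baseline_cycles], dtype=float)

    def objective(a):
        r = y - a * b
        r = r - r.mean()
        # squared lag-one autocorrelation of the baseline residual
        return (np.dot(r[:-1], r[1:]) / np.dot(r, r)) ** 2

    a_star = minimize_scalar(objective, bounds=(0.0, 10.0), method="bounded").x
    return np.asarray(test, dtype=float) - a_star * np.asarray(negative_control, dtype=float)
```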

For the second part of our solution, we recognized that qPCR’s photodetection measurement process is linear, i.e., the total fluorescence signal is a linear combination of the signals that are generated by the subsystems comprising an arbitrary partition of the sample during amplification. Utilizing this observation along with another well-known property of qPCR allowed us to derive a universal and useful result: all amplification curves for a fixed chemistry are identical up to a simple affine transformation. The simplicity and utility of this result stem from the fact that all measurements include positive controls: an idealized amplification curve as previously described. This realization suggested a second optimization problem wherein one seeks the transformation parameters that best map a test curve onto the reference curve. From a decision theory standpoint, the background subtraction also provides information about the characteristic scale of photodetector noise. One can formulate and enforce constraints which effectively state that the data collapse must be accurate to this scale for a sample to be considered positive. In both validation and real-world examples, we have found these constraints to be more robust than thresholding for identifying positive samples. Such approaches may even provide routes for detecting larger quality-control issues in testing (see Figures 1 and 2).
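The sketch below conveys the flavor of this second optimization problem: it fits a vertical scale and a horizontal cycle shift that collapse a background-subtracted test curve onto a reference curve, and it flags the sample as positive only if the residuals are commensurate with the detector noise scale. The function name, the two-parameter affine form, and the three-sigma acceptance rule are assumptions for illustration; the parameterization, constraints, and solver in [6] are more careful.

```python
import numpy as np
from scipy.interpolate import interp1d
from scipy.optimize import minimize

def collapse_onto_reference(test, reference, noise_scale):
    """Fit an affine map (vertical scale plus horizontal cycle shift) that
    collapses a background-subtracted test curve onto a reference curve of
    the same length, then ask whether the residuals are commensurate with
    the characteristic detector noise.  Illustrative only.
    """
    test = np.asarray(test, dtype=float)
    reference = np.asarray(reference, dtype=float)
    cycles = np.arange(1, len(test) + 1)
    ref = interp1d(cycles, reference, bounds_error=False,
                   fill_value=(reference[0], reference[-1]))

    def residuals(params):
        scale, shift = params
        return test - scale * ref(cycles - shift)

    fit = minimize(lambda p: np.sum(residuals(p) ** 2),
                   x0=[1.0, 0.0], method="Nelder-Mead")
    r = residuals(fit.x)
    # Declare the sample positive only if the collapse is accurate to the
    # noise scale supplied by the background subtraction.
    is_positive = np.max(np.abs(r)) < 3.0 * noise_scale
    return fit.x, is_positive
```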

Figure 2. An example in which our affine transformation failed to map to a reasonable reference curve. 2a. Quantitative polymerase chain reaction (qPCR) curves associated with the N2 subunit of SARS-CoV-2 ribonucleic acid (RNA). The inset shows the data collapse of all of the curves. 2b. Attempt to map the N2 amplification curves onto a reference curve that is associated with the N1 RNA subunit. The corresponding constrained optimization problems are infeasible. The inset shows the residuals from the algorithm’s best attempt to solve the optimization problem. These systematic residuals demonstrate that the morphologies of the N1 and N2 curves differ, thus illustrating that qPCR curves may be able to detect subtle differences in genetic sequences. Figure courtesy of [6].

A family of mathematically interesting classification problems also arises in biometrology, especially in the context of SARS-CoV-2 antibody detection. Serology tests are often blood-based [1], though saliva tests exist as well [4]. These tests are entirely different from qPCR in that they characterize the development of an immune response to a pathogen; in short, they indicate a history of infection and not necessarily a current infection. The mathematical output of a serology test can be a scalar or a vector. Loosely speaking, the data analysis first partitions the measurement space in which this output resides into disjoint positive and negative domains based on training data. It then classifies test data according to the domain into which the data fall [3].
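A toy version of this workflow might look as follows. The sketch assumes one-dimensional measurements, uses kernel-density estimates as the conditional probability models, and places the boundary where the two densities cross; the function name is made up, and reference [3] develops the general, prevalence-aware treatment for vector-valued measurements.

```python
import numpy as np
from scipy.stats import gaussian_kde

def build_classifier(negative_training, positive_training):
    """Partition a one-dimensional measurement space into negative and
    positive domains from training data and return the resulting rule.

    Minimal sketch: conditional probability models are kernel-density
    estimates, and the boundary sits where the two densities cross.
    """
    p_neg = gaussian_kde(negative_training)
    p_pos = gaussian_kde(positive_training)

    def classify(measurements):
        x = np.atleast_1d(measurements)
        return np.where(p_pos(x) > p_neg(x), "positive", "negative")

    return classify
```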

At the larger population scale, epidemiologists and public health organizations use information from multiple individuals to characterize the spread or prevalence of a disease. The tension between testing accuracy at the scale of an individual versus a larger population has led to several misconceptions within the serology community that are often associated with conditional probability. For example, one usually uses confidence intervals for negative training data—often with an explicit interpretation that is associated with 99.9 percent of samples falling in a particular class—to construct the boundary between classification domains. This approach clearly does not use the well-known Bienaymé-Chebyshev inequality and ignores information from sources like the positive population when seeking to classify negative members. It is universally assumed that one must classify samples before estimating prevalence, but we also know that prevalence impacts classification accuracy. It turns out that the first of these two statements is false, although the reasoning only becomes clear with a deeper analysis of the underlying problem. An interesting thought experiment involves considering the extreme and degenerate cases of zero and 100 percent prevalence.

Figure 3. Optimization domain that is associated with adaptive prevalence estimation and classification. The trial domain (light blue) is used to perform a counting exercise to estimate the prevalence. Given this quantity, one then constructs the optimized classification domain from conditional probability models of positive and negative training data. These domains minimize the test’s prevalence-weighted sensitivity and specificity. Figure courtesy of [3].
Our recent work aims to resolve many of these problems. In particular, we show that one can view diagnostic classification as a pair of tasks: conditional probability modeling of the target populations (such as classes of individuals that are positive or negative for SARS-CoV-2 antibodies) and optimization of corresponding, well-selected objective functions (such as average classification error rates with respect to the classification domains). Previous work has explored some of these ideas in the context of Bayesian classification, but diagnostics presents an unusual reinterpretation and extension. For example, prevalence yields the probability model for an initially unknown test sample class as a convex combination of the positive and negative conditional probability models. Prevalence might therefore play the role of a prior probability distribution in a Bayesian framework, thereby informing the probability of a measurement outcome for an individual chosen at random. In contrast, our recent work shows that the conditional probability models—along with a classification-agnostic counting exercise—suffice to construct unbiased and rapidly converging (in mean square) estimates of the prevalence.
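The convex-combination structure is what makes the counting exercise work. If D is any fixed trial domain, the expected fraction of test samples that fall in D is q Q_p(D) + (1 - q) Q_n(D), where q is the prevalence and Q_p(D), Q_n(D) are the probabilities that the positive and negative conditional models assign to D; inverting this affine relationship yields an unbiased estimate of q. The sketch below captures that idea with hypothetical argument names; the choice of D is itself optimized in [3].

```python
import numpy as np

def estimate_prevalence(samples, in_domain, q_pos, q_neg):
    """Classification-agnostic prevalence estimate via a counting exercise.

    `in_domain(x)` indicates whether a measurement falls in a fixed trial
    domain D; `q_pos` and `q_neg` are the probabilities that the positive
    and negative conditional models assign to D.  The fraction of test
    samples in D has expectation q*q_pos + (1 - q)*q_neg, where q is the
    prevalence, so inverting this affine relationship gives an unbiased
    estimate.
    """
    frac = np.mean([float(in_domain(x)) for x in samples])
    return (frac - q_neg) / (q_pos - q_neg)
```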

More generally, employing an optimization framework to construct classification domains provides an excellent route to extending traditional diagnostic analyses. For instance, prevalence-weighting the probability of measurement outcomes yields optimal (in the sense of minimum error) classification domains that change with prevalence. This result is largely unheard of within the diagnostics community. Moreover, traditional rectilinear boundaries between cutoff domains are replaced with more general and flexible structures that rely heavily on conditional probability models (see Figure 3). Even more advanced concepts and problems arise when we leverage the full power of optimization theory and methods against the realities of testing. For example, some level of uncertainty always accompanies prevalence, even with our classification-agnostic approach. One can incorporate this uncertainty into the formulation of the objective functions, where it leads to a third holdout class. Indeed, an abundance of open questions that relate to aspects of optimization exists in this area.
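In code, the prevalence-weighted decision rule amounts to comparing q P(x) against (1 - q) N(x), with an optional band around the boundary that routes ambiguous measurements to a holdout class. The sketch below assumes scalar measurements and user-supplied conditional density models; the function name and the relative-band rule are assumptions for illustration, and the published formulations in [3, 4] treat vector measurements and prevalence uncertainty more carefully.

```python
def classify_with_prevalence(x, p_pos, p_neg, prevalence, holdout_band=0.0):
    """Compare prevalence-weighted conditional densities to classify a
    measurement; minimizing the average error in this way produces
    classification domains that move as the prevalence changes.  An
    optional band around the boundary sends ambiguous measurements to a
    third, holdout class.
    """
    wp = prevalence * float(p_pos(x))          # prevalence-weighted positive density
    wn = (1.0 - prevalence) * float(p_neg(x))  # prevalence-weighted negative density
    if abs(wp - wn) <= holdout_band * (wp + wn):
        return "holdout"
    return "positive" if wp > wn else "negative"
```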


Anthony J. Kearsley presented this research during a minisymposium at the 2021 SIAM Annual Meeting, which took place virtually in July 2021.

References
[1] Liu, T., Hsiung, J., Zhao, S., Kost, J., Sreedhar, D., Hanson, C.V., ... Dai, H. (2020). Quantification of antibody avidities and accurate detection of SARS-CoV-2 antibodies in serum and saliva on plasmonic substrates. Nat. Biomed. Eng., 4, 1188-1196.
[2] Nocedal, J., & Wright, S.J. (2006). Numerical optimization. New York, NY: Springer Science+Business Media.
[3] Patrone, P.N., & Kearsley, A.J. (2021). Classification under uncertainty: Data analysis for diagnostic antibody testing. Math. Med. Biol., 38(3), 396-416.
[4] Patrone, P.N., Bedekar, P., Pisanic, N., Manabe, Y.C., Thomas, D.L., Heaney, C.D., & Kearsley, A.J. (2022). Optimal decision theory for diagnostic testing: Minimizing indeterminate classes with applications to saliva-based SARS-CoV-2 antibody assays. Preprint, arXiv:2202.00494.
[5] Patrone, P.N., Kearsley, A.J., Majikes, J.M., & Liddle, J.A. (2020). Analysis and uncertainty quantification of DNA fluorescence melt data: Applications of affine transformations. Anal. Biochem., 607, 113773.
[6] Patrone, P.N., Romsos, E.L., Cleveland, M.H., Vallone, P.M., & Kearsley, A.J. (2020). Affine analysis for quantitative PCR measurements. Anal. Bioanal. Chem., 412(28), 7977-7988.

Anthony J. Kearsley is a staff research mathematician in the Applied and Computational Mathematics Division at the National Institute of Standards and Technology.  