SIAM News Blog

Deep Learning for COVID-19 Diagnosis

By Keegan Lensink, William Parker, and Eldad Haber

Over the last several months, the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has rapidly become a global pandemic, resulting in nearly 480,000 COVID-19-related deaths as of June 25, 2020 [6]. While the disease can manifest in a variety of ways—ranging from asymptomatic infection and flu-like symptoms to acute respiratory distress syndrome—the presentation most commonly associated with morbidity and mortality is the presence of opacities and consolidation in a patient’s lungs. Upon inhalation, the virus attacks and inhibits the lungs’ alveoli, which are responsible for oxygen exchange. In response—and as part of the inflammatory repair process—the alveoli fill with fluid, causing various forms of opacification within the lungs that are visible on computed tomography (CT) scans. Because of their increased density, affected areas appear as partially opaque regions with increased attenuation, known as ground-glass opacities (GGOs). Consolidation occurs when the accumulating fluid progresses to a fully opaque region on CT scans (see Figure 1).

As COVID-19 spreads, healthcare centers around the world are becoming overwhelmed and facing shortages of essential equipment that is necessary to manage the disease’s symptoms. Severe cases often require admission to the intensive care unit (ICU) and necessitate mechanical ventilation, both of which have limited availability. Rapid screening is crucial in diagnosing COVID-19 and slowing its spread, and effective tools are essential for prognostication in order to efficiently allocate increased care to those who need it most.

Figure 1. Visualization of an axial slice of a computed tomography (CT) scan, cropped to the left lung. 1a. Pulmonary opacification present in a patient with COVID-19. 1b. The corresponding annotation generated by a radiologist. Red indicates pure ground-glass opacity (GGO), purple designates GGO with intralobular lines (crazy paving), and black signifies consolidation.

While reverse transcription polymerase chain reaction (RT-PCR) has thus far been the gold standard for COVID-19 screening in many countries, equipment shortages and strict requirements for testing environments limit the test’s availability in many settings. Furthermore, reports indicate that RT-PCR testing suffers from high false negative rates due to its relatively low sensitivity, despite its high specificity [1]. Chest CT scans—which have demonstrated effectiveness in the diagnostic process, including follow-up assessment and evaluation of disease evolution—are an important complement to RT-PCR tests [7]. Recent work indicates that trained radiologists’ analyses of chest CT scans enable highly sensitive diagnosis [1].

In addition to providing complementary diagnostic properties, CT scans have proven invaluable for the prognostication of COVID-19 patients. The percentage of well-aerated lung (WAL) has emerged as a predictive metric for determining prognosis, including admission to the ICU and death [3]. Practitioners often quantify the percentage of WAL by visually estimating the volume of opacification relative to that of healthy lung; one can approximate this automatically via attenuation values within the lung. In addition to the percentage of WAL—which does not account for the various forms of opacification—expert interpretation of CT scans provides insight into an infection’s severity by identifying numerous patterns of opacification (see Figure 2).
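As a rough sketch of the attenuation-based approximation mentioned above, one can threshold Hounsfield units within a lung mask and report the well-aerated fraction. The function name and the -700 HU cutoff here are illustrative assumptions, not the values used in the study:

```python
import numpy as np

def percent_wal(ct_hu, lung_mask, wal_threshold_hu=-700):
    """Approximate the percentage of well-aerated lung (WAL).

    ct_hu: 3D array of CT attenuation values in Hounsfield units (HU).
    lung_mask: boolean 3D array marking voxels inside the lungs.
    wal_threshold_hu: voxels below this attenuation count as well aerated;
        -700 HU is an illustrative cutoff, not the study's value.
    """
    lung_voxels = ct_hu[lung_mask]
    return 100.0 * np.mean(lung_voxels < wal_threshold_hu)

# Toy example: a 4x4x4 "scan" where half the lung is opacified.
ct = np.full((4, 4, 4), -850.0)   # well-aerated tissue
ct[2:] = -300.0                   # ground-glass-like attenuation
mask = np.ones_like(ct, dtype=bool)
print(percent_wal(ct, mask))      # 50.0
```

A threshold on attenuation captures only the percent WAL; it cannot distinguish the different opacification patterns, which is precisely why the segmentation network described below is needed.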

Figure 2. Classes annotated in the dataset, as well as the class groupings we utilized in our experiments.

The prevalence of these patterns, which correlate with the severity of infection, is associated with different stages of COVID-19. Quantification of both the WAL percentage and the opacification composition enables efficient estimation of the disease’s stage and potential patient outcome.

Radiologists typically analyze three-dimensional (3D) images. However, 3D quantitative assessment is both difficult and time consuming. Computerized techniques—particularly machine learning methods that are based on deep convolutional neural networks (CNNs)—can aid in this endeavor.

Researchers have widely applied deep learning-based methods in vision. These methods are based on a simple model:

\[{\bf Y}_{j+1} = F({\bf Y}_j, {\boldsymbol \theta}_j), \quad j=1,\ldots,n, \]

where \({\bf Y}_{j}\) specifies the hidden layers, \({\bf Y}_1\) is the original 3D image, and the function \(F\) (which depends on the parameters \({\boldsymbol \theta}_j\)) is typically composed of convolutions and a nonlinear activation function. One of the most successful architectures in recent years employs a function of the form \(F({\bf Y}_j, {\boldsymbol \theta}_j) = {\bf Y}_j + G({\bf Y}_j, {\boldsymbol \theta}_j)\). This architecture, called a residual network, is linked to the discretization of the ordinary differential equation (ODE) [4]:

\[\dot{\bf Y} =  G({\bf Y}, {\boldsymbol \theta}).\]
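To make this connection concrete, here is a minimal NumPy sketch of a residual network as a forward Euler discretization of the ODE above. A dense linear map stands in for the convolutions, and all names and the step size are illustrative assumptions:

```python
import numpy as np

def G(Y, theta):
    # Stand-in layer: a linear map followed by a tanh activation.
    # (The actual networks use convolutions; this dense map is an
    # illustrative simplification.)
    W, b = theta
    return np.tanh(W @ Y + b)

def residual_forward(Y1, thetas, h=0.1):
    # Y_{j+1} = Y_j + h * G(Y_j, theta_j): each residual layer is one
    # forward Euler step of the ODE dY/dt = G(Y, theta).
    Y = Y1
    for theta in thetas:
        Y = Y + h * G(Y, theta)
    return Y

rng = np.random.default_rng(0)
d = 8
thetas = [(0.1 * rng.standard_normal((d, d)), np.zeros(d)) for _ in range(16)]
Y_out = residual_forward(rng.standard_normal(d), thetas)
print(Y_out.shape)  # (8,)
```

With step size \(h = 0\) the network reduces to the identity map, which is exactly the stability property that makes very deep residual networks trainable.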

In recent years, scientists have used such networks in medical imaging; several groups are now utilizing them to combat COVID-19. Although researchers have proposed numerous artificial intelligence (AI) systems to assist with the diagnosis of COVID-19 in clinical practice, AI has not yet demonstrated a significant impact on clinical outcomes.

As part of a project that is spearheaded by Vancouver General Hospital, we aim to improve the clinical diagnosis—and particularly the prognosis—of COVID-19. We are combining advanced machine learning algorithms with annotated CT scans to develop a quantitative diagnostic tool that can help physicians diagnose and manage COVID-19 patients. Similar to other undertakings, the basic idea involves using annotated images and then training a deep learning network that can automatically classify areas on the 3D CT scan based on their type. Assuming that this can be done successfully, one can estimate the different labels’ volumes—in addition to the percent WAL—and correlate them to clinical outcomes. This approach thus allows practitioners to not only diagnose COVID-19 patients (which radiologists can do relatively easily), but also provide quantitative analyses that predict outcomes.

Data is one of the most important aspects of such a project. We were fortunate to obtain nearly 5,000 CT images from Iran; China; South Korea; Italy; Saudi Arabia; and Vancouver, Canada. Volunteer physicians in Vancouver then annotated this data, obtaining a large and diverse dataset for training, validating, and testing.

Although we initially imagined working with fairly standard networks and optimization routines for segmentation, we quickly encountered two main problems. The first is the variability between physicians as to the “correct” segmentation. Our images are very different from those in classical machine learning applications, such as the segmentation of objects on a street, wherein a nonexpert can easily identify the classes. In one of our first studies, 12 physicians segmented the same image; the results varied significantly (see Figure 3).

Figure 3. Variability between 12 physicians who segmented the same image slices.

This variability implies that it is misguided to guide the optimization process with the simple objective functions (like cross entropy) that are common in deep learning. It likewise calls into question well-known metrics, such as intersection over union (IoU), for assessing a segmentation’s quality. To handle the variability, we developed a noise model and included it in the optimization process. Creating this model and incorporating it into the training procedure was a main goal in our effort to ensure that the results are meaningful for clinical use.
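Our actual noise model is more involved than can be shown here, but the basic flavor of training against inconsistent annotations can be illustrated by replacing hard per-voxel labels with soft targets averaged over annotators. All names in this sketch are hypothetical:

```python
import numpy as np

def soft_targets(annotations, n_classes):
    # annotations: (n_annotators, n_voxels) integer label maps of the
    # same image. The empirical label frequency per voxel becomes a
    # soft target that encodes inter-annotator disagreement.
    n_annotators, n_voxels = annotations.shape
    counts = np.zeros((n_voxels, n_classes))
    for a in annotations:
        counts[np.arange(n_voxels), a] += 1.0
    return counts / n_annotators

def soft_cross_entropy(pred_probs, targets, eps=1e-12):
    # Cross entropy against the soft targets, rather than against a
    # single annotator's "correct" segmentation.
    return -np.mean(np.sum(targets * np.log(pred_probs + eps), axis=1))

# Three annotators agree on voxel 0 but disagree on voxel 1:
ann = np.array([[0, 1],
                [0, 1],
                [0, 2]])
t = soft_targets(ann, n_classes=3)
print(t[1])  # roughly [0., 0.667, 0.333]
```

A prediction that hedges between the disputed classes then incurs a lower loss than one that commits fully to either annotator, which is the qualitative behavior one wants when the ground truth itself is uncertain.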

The size and dimension of the problem presented a second bottleneck. Unlike the data in most image analysis problems, CT scans are typically collected in three dimensions and present 3D targets. True comprehension of a CT image’s clinical implication requires a 3D understanding of structures. Previous researchers have employed 3D CNNs, mainly for video. However, these networks—especially when deep—tend to require a large amount of memory, which makes it impossible to train a deep network in three dimensions without special hardware. In response, and inspired by hyperbolic partial differential equations, we developed hyperbolic neural networks that require a fixed amount of storage — a fraction of the storage required when training typical networks [2, 5]. These hyperbolic networks allow us to train deep networks on high-resolution 3D images. They are based on the leapfrog discretization of the second-order ODE,

\[\ddot{\bf Y} =  G({\bf Y}, {\boldsymbol \theta}),\]

and rely on the properties of hyperbolic systems that move forward and backward in time. This permits us to train deep neural networks on modest hardware.
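The memory saving comes from the fact that the leapfrog step is algebraically invertible: hidden states can be recomputed from the final two states during the backward pass instead of being stored for every layer. Here is a minimal NumPy sketch, again with a dense stand-in for the convolutions and illustrative names:

```python
import numpy as np

def G(Y, theta):
    # Stand-in layer (the real networks use convolutions).
    W, b = theta
    return np.tanh(W @ Y + b)

def leapfrog_forward(Y0, Y1, thetas, h=0.1):
    # Y_{j+1} = 2 Y_j - Y_{j-1} + h^2 G(Y_j, theta_j):
    # leapfrog discretization of the second-order ODE above.
    Yprev, Y = Y0, Y1
    for theta in thetas:
        Yprev, Y = Y, 2.0 * Y - Yprev + h * h * G(Y, theta)
    return Yprev, Y  # only the last two states need to be kept

def leapfrog_reverse(Yprev, Y, thetas, h=0.1):
    # Each step inverts exactly:
    # Y_{j-1} = 2 Y_j - Y_{j+1} + h^2 G(Y_j, theta_j),
    # so earlier states are recomputed rather than stored.
    for theta in reversed(thetas):
        Yprev, Y = 2.0 * Yprev - Y + h * h * G(Yprev, theta), Yprev
    return Yprev, Y

rng = np.random.default_rng(0)
d = 6
thetas = [(0.1 * rng.standard_normal((d, d)), np.zeros(d)) for _ in range(20)]
Y0 = rng.standard_normal(d)
Yn, Yn1 = leapfrog_forward(Y0, Y0, thetas)    # zero initial "velocity"
R0, R1 = leapfrog_reverse(Yn, Yn1, thetas)    # recover the input states
```

Because the backward sweep regenerates every hidden state from just two vectors, the memory footprint is constant in the network depth, which is what makes deep 3D training feasible on modest hardware.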

Vancouver General Hospital is currently validating the results of our research, which will soon be released as open software. Ultimately, we hope to provide radiologists around the world with better tools for the diagnosis and prognosis of COVID-19 patients.


This work is based on Eldad Haber’s minitutorial presentation as part of the 2020 SIAM Conference on Mathematics of Data Science (MDS20), which occurred virtually in May and June. Haber’s presentation is available on SIAM’s YouTube channel.

The figures in this article were generated by the authors.

References
[1] Ai, T., Yang, Z., Hou, H., Zhan, C., Chen, C., Lv, W., …, Xia, L. (2020). Correlation of chest CT and RT-PCR testing in coronavirus disease 2019 (COVID-19) in China: A report of 1014 cases. Radiol., 200642.
[2] Chang, B., Meng, L., Haber, E., Ruthotto, L., Begert, D., & Holtham, E. (2018). Reversible architectures for arbitrarily deep residual neural networks. In The thirty-second AAAI conference on artificial intelligence. New Orleans, LA: Association for the Advancement of Artificial Intelligence.
[3] Colombi, D., Bodini, F.C., Petrini, M., Maffi, G., Morelli, N., Milanese, G., …, Michieletti, E. (2020). Well-aerated lung on admitting chest CT to predict adverse outcome in COVID-19 pneumonia. Radiol.
[4] Haber, E., & Ruthotto, L. (2017). Stable architectures for deep neural networks. Inverse Prob., 34(1).
[5] Lensink, K., Haber, E., & Peters, B. (2019). Fully hyperbolic convolutional neural networks. Preprint, arXiv:1905.10484.
[6] World Health Organization (2020). Coronavirus disease (COVID-19) pandemic. Retrieved from https://www.who.int/emergencies/diseases/novel-coronavirus-2019.
[7] Zu, Z.Y., Jiang, M.D., Xu, P.P., Chen, W., Ni, Q.Q., Lu, G.M., & Zhang, L.J. (2020). Coronavirus disease 2019 (COVID-19): A perspective from China. Radiol., 200490.

Keegan Lensink is a graduate student at the University of British Columbia and a research scientist at Xtract AI. William Parker is a medical doctor and radiology resident at the University of British Columbia and founder of SapienML. Eldad Haber is a Natural Sciences and Engineering Research Council of Canada (NSERC) industrial research chair at the University of British Columbia.
