December 11, 2017

Achieving Real Time Cone Beam CT Reconstruction

You may be aware of a medical imaging modality commonly known as computed tomography (CT). You might have even had a CT scan. However, have you thought about how CT works? Hospitals and clinics currently use two types of such scanners: conventional CT and cone beam CT (CBCT). Most people are familiar with conventional CT (see Figure 1a), which emits X-ray beams in a fan shape as the scanning bed moves further in towards the machine (see Figure 1b). On the other hand, CBCT (see Figure 1c) emits cone-shaped X-rays (see Figure 1d) while the patient lies stationary.

Figure 1. 1a. Conventional CT. 1b. X-ray beams and motion of CT. 1c. Cone Beam CT. 1d. X-ray beam and motion of CBCT. 1a courtesy of Mit-tech.com. 1b and 1d courtesy of Carestream Dental Blog. 1c courtesy of Perlove Medical.

One major advantage of CBCT over conventional CT is that it exposes the patient to a lower radiation dose. CBCT is thus used more frequently than CT in many clinical procedures. However, the image quality of CBCT is lower than that of CT. CBCT is particularly sensitive to motion: even small movements during the scan blur the reconstructed images.

Figure 2. Example of motion blur in images. 2a. Stationary fidget spinner. 2b. Rotating fidget spinner.

When taking a picture of an object in motion, such as a rotating fidget spinner, you might notice blurring along the trajectory of the object’s motion; this is motion blur (see Figure 2). Motion blur is unavoidable when scanning lung cancer patients because the patient's breathing and heartbeat cause the thorax to move continuously (see Figure 3b). Phase binning can reduce motion blur. We first record a breathing signal over the time interval of the scan, then group together the projection data that fall within the same breathing phase, and finally perform standard CBCT image reconstruction on each group. Since this process introduces a fourth dimension, i.e., time, it is referred to as 4D CBCT. Compared to CBCT, 4D CBCT clearly reduces motion blur. However, streaking artifacts degrade the image quality of 4D CBCT at each phase (see Figure 3c) because each phase has too few projections for the reconstruction.
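The grouping step of phase binning can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the `phases` input (one breathing-phase value in [0, 1) per projection, assumed to have been derived from the recorded breathing signal), and the number of bins are all hypothetical and are not taken from the study.

```python
from collections import defaultdict

def bin_projections_by_phase(phases, num_bins):
    """Group projection indices by breathing phase.

    phases: one value in [0, 1) per projection (hypothetical input, assumed
            to come from the recorded breathing signal).
    num_bins: number of phase bins (an assumed parameter).
    Returns a dict mapping bin index -> list of projection indices.
    """
    bins = defaultdict(list)
    for index, phase in enumerate(phases):
        # Wrap the phase into one breathing cycle and guard against
        # floating-point rounding pushing it into a nonexistent last bin.
        bin_index = min(int((phase % 1.0) * num_bins), num_bins - 1)
        bins[bin_index].append(index)
    return dict(bins)

# Eight projections spread over two breathing cycles, sorted into four bins.
phases = [0.05, 0.30, 0.55, 0.80, 0.07, 0.33, 0.58, 0.81]
groups = bin_projections_by_phase(phases, num_bins=4)
```

Each bin receives only a fraction of the projections (two out of eight in this toy case), which is why a reconstruction from a single phase has so little data to work with.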

Figure 3. Comparison of CBCT and 4D CBCT reconstruction.

The temporal direction is discretized in 4D CBCT, but we aim to make it continuous: we want to reconstruct the volumetric image at the instant each projection is taken. This approach makes it possible to handle irregular breathing patterns and reconstruct the volumetric image in real time. In short, our algorithm consists of training (which takes time) and testing (which can be achieved in real time). Training involves the following three steps:

1. Finding the deformation vector field (DVF)
2. Applying a dimension reduction scheme to the DVF to build a 3D lung motion model
3. Training an ensemble of neural networks (NNs) to estimate the model parameters

At the testing stage, the NNs estimate the model parameters from the projection data, and these parameters are then used to reconstruct the volumetric images.

We choose a reference image from the set of CBCT images and then use the Demons Algorithm to compute DVFs from the reference image to every other image in the set [2]. The DVF is the vector field that represents the transformation of an object from a reference configuration to a target configuration, as illustrated in Figure 4.
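To make the idea of applying a DVF concrete, here is a minimal 2D sketch in Python with NumPy. It is not the Demons Algorithm and not the code used in this study: it only shows how a given displacement field moves image content. The function name, the nearest-neighbor lookup, the border clamping, and the convention that each output pixel samples the reference at its own position plus the displacement are all assumptions made for illustration. A real pipeline would work on 3D volumes and interpolate between voxels.

```python
import numpy as np

def warp_image(reference, dvf):
    """Warp a 2D image with a displacement field (nearest-neighbor lookup).

    reference: array of shape (H, W).
    dvf: array of shape (2, H, W); dvf[0] holds row displacements and
         dvf[1] holds column displacements, in pixels.
    Assumed convention:
        warped[r, c] = reference[r + dvf[0, r, c], c + dvf[1, r, c]]
    with sampling coordinates clamped to the image border.
    """
    height, width = reference.shape
    rows, cols = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    src_rows = np.clip(np.rint(rows + dvf[0]).astype(int), 0, height - 1)
    src_cols = np.clip(np.rint(cols + dvf[1]).astype(int), 0, width - 1)
    return reference[src_rows, src_cols]

reference = np.arange(16, dtype=float).reshape(4, 4)
dvf = np.zeros((2, 4, 4))
dvf[1] = 1.0  # every pixel samples from the column to its right
warped = warp_image(reference, dvf)
```

With this uniform displacement the image content shifts one column to the left, and the last column repeats because of the border clamping.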

Figure 4. Deformation vector field. Image courtesy of [2].

Due to the lungs’ periodic motion, these DVFs are highly redundant. This motivates us to apply dimension reduction. In particular, we assume that any DVF caused by lung motion is a linear combination of three principal components, which represent nearly 95 percent of lung motion. If we can compute the three weights (or coefficients), we can determine the DVF and obtain the volumetric image by applying the DVF to the reference image.
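The dimension reduction step can be sketched as a principal component analysis computed with a singular value decomposition. The sketch below is an illustration under assumptions, not the implementation used in this study: the function names are hypothetical, the DVFs are synthetic random data rather than lung motion, and subtracting the mean DVF before the decomposition is the usual PCA convention that we assume here.

```python
import numpy as np

def fit_motion_model(dvfs, num_components=3):
    """Fit a linear motion model to a stack of flattened DVFs.

    dvfs: array of shape (num_dvfs, num_values), one flattened DVF per row.
    Returns the mean DVF, the leading principal components (one per row),
    and the fraction of variance those components explain.
    """
    mean = dvfs.mean(axis=0)
    centered = dvfs - mean
    # The rows of vt are the principal directions of the centered data.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:num_components]
    variance = singular_values ** 2
    explained = variance[:num_components].sum() / variance.sum()
    return mean, components, explained

def dvf_from_weights(mean, components, weights):
    """Rebuild a flattened DVF from its principal-component weights."""
    return mean + weights @ components

# Synthetic stand-in data: ten "DVFs" that lie in a three-dimensional subspace.
rng = np.random.default_rng(0)
basis = rng.standard_normal((3, 50))
dvfs = rng.standard_normal((10, 3)) @ basis

mean, components, explained = fit_motion_model(dvfs)
weights = (dvfs[0] - mean) @ components.T  # the three coefficients of DVF 0
reconstructed = dvf_from_weights(mean, components, weights)
```

Because the synthetic data are exactly three-dimensional, three weights reproduce each DVF here; for real lung motion the three components are only an approximation.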

Our next step is to train an ensemble of neural networks to build a model that takes a single projection and its projection angle to estimate the three coefficients. An NN is an information-processing paradigm built to simulate the way neurons in the brain process information. As with a biological brain, the more information you pass through the NN, the more it “learns.” Once trained, the NN processes new information based on what it has already “learned” (see Figure 5). We use an ensemble of NNs in the sense that we decompose each projection in the training set into a set of non-overlapping 32x32 patches; after preprocessing, such as subtracting the mean and normalizing to unit norm, each patch is used to train an NN.
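The patch decomposition and preprocessing described above can be sketched as follows. The function names, the toy projection size, and the choice to drop border pixels that do not fill a complete patch are assumptions for illustration; the network itself is not shown.

```python
import numpy as np

def extract_patches(projection, patch_size=32):
    """Split a 2D projection into non-overlapping square patches.

    Rows and columns that do not fill a complete patch are dropped
    (an assumption; the source does not say how borders are handled).
    Returns an array of shape (num_patches, patch_size, patch_size).
    """
    height, width = projection.shape
    n_rows, n_cols = height // patch_size, width // patch_size
    trimmed = projection[:n_rows * patch_size, :n_cols * patch_size]
    blocks = trimmed.reshape(n_rows, patch_size, n_cols, patch_size)
    return blocks.swapaxes(1, 2).reshape(n_rows * n_cols, patch_size, patch_size)

def preprocess_patch(patch, eps=1e-12):
    """Subtract the patch mean and scale the result to unit Euclidean norm."""
    centered = patch - patch.mean()
    # eps avoids division by zero for a perfectly constant patch.
    return centered / max(np.linalg.norm(centered), eps)

# A toy 64x96 "projection" yields 2 x 3 = 6 patches of size 32x32.
projection = np.arange(64 * 96, dtype=float).reshape(64, 96)
patches = extract_patches(projection)
features = np.stack([preprocess_patch(p).ravel() for p in patches])
```

Each row of `features` is one flattened, normalized patch that could be fed to a network together with the projection angle.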

Figure 5. Obtaining estimated principal components from NNs.

Once trained, we use the NNs to estimate the model parameters. As with training, we break a single X-ray projection into a set of 32x32 patches. After the same preprocessing, each patch is passed through the NN along with the projection angle. The NN then outputs three estimated coefficients used to compute a new DVF. We apply this estimated DVF to the reference image to reconstruct a volumetric image corresponding to this projection. One result is illustrated in Figure 6.

Figure 6. Reconstruction results (front, side, top-down view).

We evaluate our algorithm via relative error between the ground truth and the estimated volumetric images. The relative error was 5.5 percent for our simulated data in the worst-case scenario, an improvement when compared to [1], which was a basis for our study. Ultimately, we obtained some promising results with the simulated data. In the future, we plan to test on real patient data.
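One common way to compute such a relative error is to divide the Euclidean norm of the difference between the two volumes by the norm of the ground truth. The sketch below assumes this definition; the norm is not specified above, and the function name and toy volumes are hypothetical.

```python
import numpy as np

def relative_error(estimate, ground_truth):
    """Norm of the voxel-wise difference divided by the norm of the truth.

    Assumes the Euclidean norm over all voxels.
    """
    difference = (estimate - ground_truth).ravel()
    return np.linalg.norm(difference) / np.linalg.norm(ground_truth.ravel())

# Toy volumes: the estimate is uniformly 5 percent too bright.
ground_truth = np.ones((4, 4, 4))
estimate = ground_truth * 1.05
error = relative_error(estimate, ground_truth)
```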

The author presented this research in a contributed presentation at the 2017 SIAM Annual Meeting, held in Pittsburgh, Pa., this past July.

Acknowledgments: This research was supported by the National Science Foundation’s Enriched Doctoral Training Program, DMS grant #1514808, "Team Training Mathematical Scientists Through Industrial Collaborations," at the University of Texas at Dallas. Special thanks to Yifei Lou and Yan Cao for advising me through this research. I am also grateful to the University of Texas Southwestern Medical Center, our external partner—especially Chenyang Shen and Xun Jia—for providing the necessary data and guidance.

References
[1] Li, R., Lewis, J.H., Jia, X., Gu, X., Folkerts, M., Men, C., & Jiang, S. B. (2011). 3D tumor localization through real-time volumetric X-ray imaging for lung cancer radiotherapy. Medical Physics, 38(5), 2783-2794.
[2] Pennec, X., Cachier, P., & Ayache, N. (1999). Understanding the “Demon’s Algorithm”: 3D Non-rigid Registration by Gradient Descent. In C. Taylor & A. Colchester (Eds.), Medical Image Computing and Computer-Assisted Intervention – MICCAI’99 (pp. 597-606) (Vol. 1679). Lecture Notes in Computer Science. Berlin, Heidelberg: Springer-Verlag.

Samiha C. Rouf is a second-year Ph.D. student in applied mathematics at the University of Texas at Dallas. During her first year, she was a research assistant for the National Science Foundation’s Enriched Doctoral Training Program.