By Lina Sorg
Convolutional neural networks (CNNs) are a specific class of deep learning networks that researchers frequently utilize to analyze visual imagery. During a minisymposium presentation at the 2019 SIAM Conference on Computational Science and Engineering, currently taking place in Spokane, Wash., Altansuren Tumurbaatar of Washington State University presented an advanced prototype disease surveillance tool that uses CNNs to differentiate between sick and healthy individuals based on a single camera image. The tool combines mobile computing with state-of-the-art deep learning approaches in computer vision and is based on Tumurbaatar’s work during a summer internship.
“The main application of the Android app is to take a picture of a human with a phone camera to determine whether the person is sick or healthy,” Tumurbaatar said. She and her team developed this prototype to enhance disease surveillance and collect data for further analysis. They hope to use their app to predict the health status of soldiers in the field, which would require that the app eventually be time-efficient and able to function offline.
Tumurbaatar’s work falls under the realm of machine learning. “Suppose we have a dataset containing human images under different environments where each image is labeled as sick or healthy,” she said. “We want to train a machine learning model that can classify a human image under an unpredictable environment as healthy or sick. The problem seems simple, but it’s actually challenging because it’s so general.” For instance, Tumurbaatar quickly realized that there is a lack of datasets of sick human faces, as scientists are not conducting much research in this field. While several related projects do exist, they focus on a specific disease or family of diseases rather than sickness as an all-encompassing diagnosis. Determining how to reduce the effect of background noise in the model was another challenge. Finally, Tumurbaatar and her collaborators decided to work with color rather than grayscale images; although grayscale photos correctly highlight key points of the face, signs of sickness often appear in the color of one’s eyes or skin.
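Such a dataset can be organized as one folder of images per label. As a minimal sketch, assuming a TensorFlow/Keras pipeline and a hypothetical data/ directory with sick/ and healthy/ subfolders (the article does not specify the team’s tooling), the labeled color images could be loaded as follows:

```python
import tensorflow as tf

# Hypothetical layout: data/sick/*.jpg and data/healthy/*.jpg.
# color_mode defaults to "rgb", matching the team's choice of color
# images over grayscale.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data",
    labels="inferred",        # labels come from the subdirectory names
    label_mode="binary",      # sick vs. healthy
    image_size=(240, 240),    # resize to the model's input size
    validation_split=0.2,
    subset="training",
    seed=42,
)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "data",
    labels="inferred",
    label_mode="binary",
    image_size=(240, 240),
    validation_split=0.2,
    subset="validation",
    seed=42,
)
```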
Next, Tumurbaatar experimented with several pre-trained deep CNNs to differentiate between sick and healthy faces before deciding on MobileNet. She retrained MobileNet on 240x240 color images. Training involves sliding a filter over each image both horizontally and vertically, computing a weighted combination of the pixels in each area the filter covers. Applying one convolution yields another two-dimensional image; applying convolutions several times with different filters and then flattening the result ultimately serializes the image into one long vector. “It’s already a trend to detect objects from images really well,” Tumurbaatar said. “The CNN is already set up nicely.”
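A minimal transfer-learning sketch of this retraining step, again assuming Keras (the article names MobileNet but not the framework): the pretrained convolutional stack is frozen as a feature extractor, and a new binary classification head is trained for sick versus healthy. The pooling layer plays the serializing role described above, collapsing the final stack of feature maps into a single vector.

```python
import tensorflow as tf

# Pretrained MobileNet feature extractor; include_top=False drops the
# original ImageNet classifier and permits the 240x240 input size.
base = tf.keras.applications.MobileNet(
    input_shape=(240, 240, 3), include_top=False, weights="imagenet")
base.trainable = False  # freeze the pretrained convolutional filters

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),  # MobileNet expects [-1, 1]
    base,                                            # stacked convolutions
    tf.keras.layers.GlobalAveragePooling2D(),        # serialize to one long vector
    tf.keras.layers.Dense(1, activation="sigmoid"),  # sick vs. healthy probability
])

model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=10)  # datasets from the sketch above
```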
The retrained model, deployed to the Android surveillance app, achieves a 98 percent validation accuracy and a 96 percent test accuracy. “The high performance of the model suggests that deep learning could be a powerful tool to detect sickness,” Tumurbaatar said. “If we have really big data—millions of sick and healthy images—we could detect sickness and health really well.” The model’s quick prediction time makes it potentially suitable for real-time detection and professional use on a mobile device.
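On-device, offline inference of this kind typically goes through TensorFlow Lite, which converts and compresses a trained model for Android. This is a plausible deployment path given the MobileNet-plus-Android combination, though the article does not confirm the exact tooling; the output file name below is illustrative.

```python
import tensorflow as tf

# Convert the retrained Keras model for on-device use; the default
# optimization applies weight quantization to shrink the model.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

# Bundle this file with the Android app for offline prediction.
with open("sick_healthy_mobilenet.tflite", "wb") as f:
    f.write(tflite_model)
```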
However, further study of relevant machine learning techniques is required to improve the tool. “Of course we need to enlarge the dataset,” Tumurbaatar said. “It needs to be much more than just thousands of images.” Although she augmented the data upon input, she also plans to apply multiple data augmentation methods to broaden that initial augmentation. Light sensitivity is another area that demands additional investigation; the model appeared to be light-sensitive when deployed on the mobile app, indicating that its classification of an image as sick or healthy might vary under different lighting conditions, such as natural or LED light. Lastly, Tumurbaatar hopes to fine-tune her model in the coming months. “MobileNet has a stack of convolution layers,” she said. “Fine-tuning is saying that you’ll only use the last five layers of the CNN, for example, not apply all the layers. You can try different layers and see which ones work better, then run testing on them and use the best layers.”
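The sketch below makes these two proposed improvements concrete under the same Keras assumptions as above: random augmentation layers that vary orientation and brightness at input time (brightness probing the light sensitivity observed on the app), and fine-tuning that unfreezes only the last five layers of the base network, mirroring the example in the quote. The layer count and augmentation parameters are illustrative, not the team’s settings.

```python
import tensorflow as tf

# Data augmentation applied at input time; these layers can be inserted
# into the model ahead of the Rescaling layer, or mapped over train_ds.
augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomBrightness(0.2),  # requires TF 2.9+
])

# Fine-tuning: unfreeze only the last five layers of the pretrained base,
# as in the quoted example; all earlier layers stay fixed.
base.trainable = True
for layer in base.layers[:-5]:
    layer.trainable = False

# Recompile with a small learning rate so fine-tuning does not overwrite
# the pretrained weights too aggressively, then continue training.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss="binary_crossentropy",
              metrics=["accuracy"])
```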
Acknowledgments: This work is funded by the Department of Defense's Defense Threat Reduction Agency (CB10190).