Deep Learning Methods Improve Microfossil Segmentation and Analysis

By Lina Sorg

Microfossils are extremely useful for age-dating, rock study, and paleoenvironmental reconstruction, the process by which scientists determine the climate and vegetation of a time and place in history to predict future changes. For this reason, professionals in the oil, mining, engineering, and environmental industries—as well as the general field of geology—actively seek more effective ways to analyze microfossils. In fact, the oil industry has been a major employer of paleontologists who specialize in these microscopic fossils since the 1920s.

Microfossils are fossils or fossil fragments that are only visible with a microscope.

In recent years, researchers have begun applying image reconstruction processes and artificial intelligence techniques—particularly the type of deep learning associated with computer vision—to the analysis of microfossils. Micropaleontological studies currently use light microscopy methods with thin sections of fossil extractions. However, this system is not without limitations. For instance, it relies on two-dimensional (2D) methods that are sometimes destructive. When a sample is destroyed or compromised, extraction becomes impossible and information is lost.

Computed tomography (CT) scans present a viable alternative with new possibilities, as the corresponding three-dimensional (3D) methods yield high-resolution images and allow for nondestructive sample observation. During a minisymposium presentation at the 2019 SIAM Conference on Computational Science and Engineering, currently taking place in Spokane, Wash., Amine Kerkeni of InstaDeep described how he used volumetric deep learning to detect, segment, and classify microfossils in 3D micro-CT scans, creating a novel neural network architecture to do so.

Kerkeni is currently training geologists and paleontologists in fossil segmentation, which is not easy. “Typically these people do not like this kind of job,” he said. “The idea was to find a way to help geologists and paleontologists do this kind of task quickly and with less training.” As with any machine learning project, he starts with the data. One must first annotate the data in three dimensions. For one month, geologists worked by hand to trace the contour of every fossil in the CT scan sample. The initial sample had 800 fossils, and annotation also involved identifying each fossil and assigning it a class. Due to the presence of noise, Kerkeni worked with graphic designers at his company to build various artificial datasets that validated the process and technique. He used 10 percent of the real fossil data for testing (and hid it from the training process), another 10 percent for validation, and the remaining 80 percent for training.
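The 80/10/10 split described above can be sketched in a few lines of Python. This is only an illustration: the function name `split_indices`, the fixed random seed, and the choice to split per fossil (rather than per scan or per cube) are assumptions, not details from the talk.

```python
import numpy as np

def split_indices(n_items, test_frac=0.10, val_frac=0.10, seed=0):
    """Shuffle item indices and split them into train/validation/test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_items)
    n_test = int(round(n_items * test_frac))
    n_val = int(round(n_items * val_frac))
    test = order[:n_test]                 # held out, never shown to training
    val = order[n_test:n_test + n_val]    # used to monitor training
    train = order[n_test + n_val:]        # everything else
    return train, val, test

# Assumption: one index per annotated fossil, 800 in total.
train_idx, val_idx, test_idx = split_indices(800)
print(len(train_idx), len(val_idx), len(test_idx))  # 640 80 80
```

Because the three index sets come from a single permutation, no fossil can appear in more than one of them, which is what keeps the test data hidden from training.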

After obtaining and splitting the data, Kerkeni defined a metric to track. A standard metric for image segmentation in computer vision is the Dice coefficient, which measures the overlap between the detected shape and the ground truth on a scale from 0 (no overlap) to 1 (a perfect match). “If you’re completely matching the training data, you’d have a Dice coefficient of 1,” he said, adding that annotation typically depends on the person. “You can take two geologists and ask them to annotate the same fossil and they won’t give you the same results.” It is thus important to have multiple people annotate the fossils.

Kerkeni built his deep learning model with Keras, an open-source neural network library, running on top of TensorFlow, an open-source machine learning framework. “We reprocessed the data with histogram equalization to enhance data quality and improve training,” he said. He then partitioned the raw input data into small cubes of 128×128×128 voxels, selected only the cubes that contained a mask, and trained the model on an Nvidia DGX-1 with Tesla V100 (Volta) GPUs.
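The two preprocessing steps, histogram equalization and cube extraction, might look roughly like the following NumPy sketch. The function names, the 256-bin histogram, and the decision to drop partial cubes at the volume edges are all assumptions made for illustration; the talk did not specify them.

```python
import numpy as np

def equalize_histogram(volume, n_bins=256):
    """Map intensities through the empirical CDF so they spread over [0, 1]."""
    flat = volume.ravel()
    hist, bin_edges = np.histogram(flat, bins=n_bins)
    cdf = hist.cumsum().astype(np.float64)
    cdf /= cdf[-1]
    centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    return np.interp(flat, centers, cdf).reshape(volume.shape)

def extract_labeled_cubes(volume, mask, size=128):
    """Tile the volume into size^3 cubes and keep those with any mask voxel.

    Assumption: partial cubes at the volume edges are simply dropped.
    """
    cubes = []
    nz, ny, nx = volume.shape
    for z in range(0, nz - size + 1, size):
        for y in range(0, ny - size + 1, size):
            for x in range(0, nx - size + 1, size):
                block = (slice(z, z + size),
                         slice(y, y + size),
                         slice(x, x + size))
                if mask[block].any():
                    cubes.append((volume[block], mask[block]))
    return cubes

# Small demonstration with 4^3 cubes instead of 128^3 to keep it light.
rng = np.random.default_rng(0)
volume = rng.random((8, 8, 8))
mask = np.zeros((8, 8, 8), dtype=bool)
mask[1, 1, 1] = True  # a single annotated voxel

equalized = equalize_histogram(volume)
cubes = extract_labeled_cubes(equalized, mask, size=4)
print(len(cubes))  # 1
```

Of the eight cubes in the toy volume, only the one containing the annotated voxel survives the filter, which mirrors the idea of training only on cubes that contain a mask.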

Kerkeni’s first approach employed a state-of-the-art instance segmentation model (Mask R-CNN) to detect and segment the microfossils in each layer, thus reconstructing the whole 3D scan from 2D results. “The idea was to process the CT scan slice by slice and use the result to reconstruct the 3D shape of the fossil,” he said. While he achieved effective segmentation on many layers of the microfossils, the results were unsatisfactory due to discontinuities in the shapes and an insufficient amount of data.
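The slice-by-slice idea, and the way it can produce discontinuities, can be illustrated with a toy example. The function `segment_slice` below is a hypothetical stand-in (a simple threshold) for a real 2D instance segmentation model such as Mask R-CNN; nothing here reproduces Kerkeni's actual pipeline.

```python
import numpy as np

def segment_slice(image_2d, threshold=0.5):
    """Hypothetical stand-in for a 2D segmenter; returns a binary mask."""
    return image_2d > threshold

def segment_volume_by_slices(volume):
    """Segment each 2D slice independently, then stack the masks into 3D."""
    return np.stack([segment_slice(s) for s in volume], axis=0)

# Toy scan: a small object spanning slices 1 to 3.
volume = np.zeros((5, 4, 4))
volume[1:4, 1:3, 1:3] = 1.0
volume[2] *= 0.4  # a dim slice that the 2D segmenter will miss

mask3d = segment_volume_by_slices(volume)
per_slice = mask3d.sum(axis=(1, 2))
print(per_slice.tolist())  # [0, 4, 0, 4, 0]
```

Because each slice is processed without knowledge of its neighbors, a single missed slice splits the reconstructed object in two. A volumetric model sees the neighboring slices at once, which is one motivation for moving to a 3D architecture.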

Amine Kerkeni used multistage training and injected background noise into fossils to ultimately yield a Dice coefficient score of 0.77.

To avoid such discontinuities and obtain better 3D shapes, Kerkeni’s second approach involved an adapted state-of-the-art 3D model. He specifically modified 3D U-Net, which has proven successful in the medical field. Key adaptations include a leaky rectified linear unit (an activation function for heightened stability), added dropout for some layers of the encoder, and a different decoder architecture altogether. Kerkeni immediately began seeing improved results with smooth 3D shapes and less discontinuity. However, segmentation was still less than ideal because the shapes were not fully reconstructed. “This was mainly due to the nature of the data we provided to the system,” he said.
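Two of the adaptations mentioned above, the leaky rectified linear unit and dropout, are simple enough to write out directly. The NumPy sketch below is for illustration only; the negative slope `alpha=0.01` and the dropout rate are assumed values, and a real model would use the equivalent layers from its deep learning framework.

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    """Pass positive values through; scale negative values by a small slope."""
    return np.where(x > 0, x, alpha * x)

def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero a fraction of units, rescale the survivors."""
    if not training or rate == 0.0:
        return x
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

x = np.array([-2.0, 0.0, 3.0])
activated = leaky_relu(x)
print(activated.tolist())  # [-0.02, 0.0, 3.0]

rng = np.random.default_rng(0)
dropped = dropout(np.ones(1000), rate=0.5, rng=rng)
```

Unlike a standard rectified linear unit, the leaky variant keeps a small nonzero gradient for negative inputs, so units are less likely to stop learning. Dropout is active only during training and leaves activations unchanged at inference time.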

In search of more data and fully annotated scans, Kerkeni turned to multistage training for his third attempt. During training, he injected background noise into fossils that were not labeled. He did this three times, with each iteration delivering an improved score: a Dice coefficient of 0.55 for the first training, 0.68 for the second, and 0.77 for the third. The training process lasted 24 hours and inference took one minute. “This is a kind of trick to overcome quality of data problems, but it’s still a temporary solution,” Kerkeni said.
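The talk did not detail how the noise injection works, so the following is only one possible reading: voxels that look like fossil but carry no annotation are overwritten with noise drawn from the background's intensity statistics, so the network is not penalized for "detecting" fossils that the annotators skipped. The function name, the Gaussian noise model, and the `candidate_mask` input (for example, predictions from the previous training stage) are all assumptions.

```python
import numpy as np

def inject_background_noise(volume, annotated_mask, candidate_mask, rng):
    """Overwrite candidate-but-unannotated voxels with background-like noise.

    Assumption: background intensities are modeled as Gaussian, with mean
    and standard deviation estimated from voxels outside both masks.
    """
    unlabeled = candidate_mask & ~annotated_mask
    background = ~candidate_mask & ~annotated_mask
    mu = volume[background].mean()
    sigma = volume[background].std()
    out = volume.copy()
    out[unlabeled] = rng.normal(mu, sigma, size=int(unlabeled.sum()))
    return out

rng = np.random.default_rng(0)
volume = rng.normal(100.0, 5.0, size=(6, 6, 6))

annotated = np.zeros((6, 6, 6), dtype=bool)
annotated[1:3, 1:3, 1:3] = True      # fossil with a hand-drawn annotation
candidate = annotated.copy()
candidate[3:5, 3:5, 3:5] = True      # second fossil, never annotated

cleaned = inject_background_noise(volume, annotated, candidate, rng)
```

In a multistage setup of this kind, the cleaned volumes would feed the next round of training, and the resulting model could supply a better `candidate_mask` for the round after that.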

He concluded his presentation with a discussion of takeaways and opportunities for further research. “The main takeaway of our experiment is that the quality of the annotation influences the quality of the training,” Kerkeni noted. This speaks to the necessity of multi-person annotation, just as the medical field demands that multiple doctors annotate the same dataset. He also learned that the dataset’s size is important, and that non-annotated data in the training set will drive analysis the wrong way.

Kerkeni continues his work on instance segmentation and hopes to train Mask R-CNN in three dimensions; right now he is beginning to have a workable model. Ultimately, the use of image segmentation on 3D fossils yields promising results for future geological work. “Distributing a volumetric deep learning model over multiple GPUs would allow wide applications,” he said. “This would unlock a new era for the oil industry.”

Lina Sorg is the associate editor of SIAM News.