SIAM News Blog

Machine Learning Techniques Advance Suicide Prevention Strategies

By Lina Sorg

Close to one million people worldwide commit suicide each year. It is the tenth leading cause of death in the United States and the second leading cause of death for individuals between the ages of 10 and 34. Over the last 20 years, the U.S. suicide rate has increased nearly 30 percent, with teenagers and veterans most at risk. Given these sobering statistics, there is an increased push to treat suicide as a public health crisis. 

Recent years have seen increased implementation of digital health record systems in hospitals to improve reproducibility within the scientific research community. Nevertheless, lack of interoperability and data integration are ongoing issues in hospitals. This means that despite high-performance computing, researchers cannot use hospital data to its fullest potential. However, the newfound availability of massive amounts of electronic health record data, including the MIMIC-III database, offers an abundant source of information for public health policy studies and data-driven precision medicine. MIMIC-III is a freely available database that contains over a decade’s worth of de-identified, comprehensive clinical health data from approximately 40,000 critical care patients admitted to the intensive care unit (ICU). The data is broken down into structured, unstructured, imaging, genomic, and demographic information. Its accessibility allows researchers to reproduce and improve upon clinical studies in ways that otherwise would not be possible.

During a minisymposium presentation at the 2019 SIAM Conference on Computational Science and Engineering, currently taking place in Spokane, Wash., Xinlian Liu of Hood College combined machine learning techniques with MIMIC-III data to suggest best-practice strategies for suicide prevention. Despite years of experience- and observation-based studies, suicide remains a largely unsolvable problem. Many factors complicate prevention efforts, including the size of the available dataset and the challenge of quantifying people’s behavior/thoughts. “People have been trying to identify causes from the genetic and environmental side of the issue,” Liu said. “But suicide involves the dynamic nature of thought processes.”

Studies show that social, medical, and/or policy intervention is the most effective means of suicide prevention. While Liu and his colleagues were waiting for approved access to veteran statistics, they began working with data from MIMIC-III. First they tested the data to confirm its accuracy; a simple distribution of the patients’ ages came back as normal. “Knowing the issue, our plan was to find the most effective, high-risk group to prioritize our resources,” Liu said, adding that prioritization produces timely help for these risk groups. An age distribution of the data revealed patients aged 40-50 to be the high-risk group.
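
As a rough illustration of the age-distribution step, the sketch below groups patient ages by decade and reports the largest group. The ages are made-up sample values, not MIMIC-III data; the function names and the decade-wide buckets (40-49 rather than 40-50) are assumptions chosen for illustration.

```python
from collections import Counter


def decade_bucket(age):
    """Return the decade range an age falls in, e.g. 47 -> '40-49'."""
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def largest_age_group(ages):
    """Count patients per decade and return (bucket, count) for the largest."""
    counts = Counter(decade_bucket(a) for a in ages)
    return counts.most_common(1)[0]


# Hypothetical ages for illustration only; not drawn from MIMIC-III.
sample_ages = [23, 41, 45, 47, 52, 44, 38, 49, 61, 43]
print(largest_age_group(sample_ages))
```

In practice the same count would be run over the ages of patients with a recorded attempt, and compared against the age profile of all admissions so that a large bucket is not simply a reflection of who is admitted to the ICU.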

Liu then presented a sample table with ICD-9 diagnostic codes of non-fatal suicide attempts. The most common means of attempted suicide is drug overdose; because overdose often involves multiple attempts, Liu hopes to use this as an opportunity for intervention. He clusters the high-risk group’s records by dividing them into unstructured and structured data. Unstructured data (social, environmental, mental, and familial observations by physicians, nurses, psychiatrists, social workers, etc.) becomes the notes, thousands of which exist for each patient. Data pulled from charts (prescriptions and medications, examinations, hourly vitals, lab results, and the like) make up the structured data. Unfortunately, code clustering reveals that it is difficult to separate susceptible and non-susceptible patients based on this information alone.
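
The code-based selection described above can be sketched as a simple filter. In ICD-9, external-cause codes E950 through E959 cover suicide and self-inflicted injury, with E950 covering poisoning by solid or liquid substances. The rows, patient identifiers, and function names below are hypothetical, and the dot-free code format is an assumption about how the diagnosis table stores codes; this is not Liu's actual pipeline.

```python
from collections import defaultdict


def normalize(code):
    """Strip dots and whitespace so 'E950.0' and 'E9500' compare equal."""
    return code.replace(".", "").strip().upper()


def is_self_harm(code):
    """True for ICD-9 codes in the E950-E959 range."""
    return normalize(code).startswith("E95")


def is_overdose(code):
    """True for E950.x, self-inflicted poisoning by solid or liquid substances."""
    return normalize(code).startswith("E950")


def self_harm_by_patient(rows):
    """Map each patient id to the list of self-harm codes on record."""
    found = defaultdict(list)
    for patient_id, code in rows:
        if is_self_harm(code):
            found[patient_id].append(normalize(code))
    return dict(found)


# Hypothetical (patient_id, icd9_code) rows for illustration only.
rows = [(1, "E9500"), (1, "4019"), (2, "E950.3"), (2, "E9500"), (3, "25000")]
attempts = self_harm_by_patient(rows)
repeat_overdose = [p for p, codes in attempts.items()
                   if sum(is_overdose(c) for c in codes) > 1]
print(attempts, repeat_overdose)
```

Patients with more than one overdose code are the ones for whom an intervention between attempts would, in principle, have been possible.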

Next, Liu encodes the structured data via the Observational Medical Outcomes Partnership and Fast Healthcare Interoperability Resources, which provide a reliable hierarchy. He employs the Unified Medical Language System to tokenize the notes. “We tried to capture the patient before he/she did something negative,” he said. “We tried to predict when a patient would have an unplanned readmission to the hospital.” Liu and his team assume that an unplanned or avoidable admission correlates strongly with a suicide attempt. Predicting when such an episode might occur would allow practitioners to prioritize medical resources, reach out to the individual in question, and examine the larger social setting. “Even if you can just find the person, call the person, and ask them if they’re all right,” he said. “Reaching out to patients is really critical for suicide prevention.”

Liu also introduced the HOSPITAL Score for Readmissions, a tool that predicts potential hospital readmissions within 30 days of discharge. A score of seven or higher indicates a heightened chance of readmission. Liu creates his own recurrent neural nets in which each state vector represents one admission during a patient stay at the ICU. The network architecture uses embedding layers that account for administered drugs, diagnoses, procedures, and other data courtesy of CHARTEVENTS, a component of MIMIC-III that contains all charted data available for a patient. Liu’s neural nets yield an adjusted HOSPITAL score that is typically higher than the original.
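
For readers unfamiliar with the unadjusted HOSPITAL score, the following is a minimal sketch of how such a point-based score is tallied. Only the high-risk cutoff of seven comes from the talk; the seven items and their point values are assumptions recalled from the published score and should be verified against it before any use, the cutoffs for prior admissions in particular. The function and parameter names are hypothetical, and the sketch does not reproduce Liu's neural-network adjustment.

```python
HIGH_RISK_CUTOFF = 7  # "seven or higher" per the talk


def hospital_score(hemoglobin_g_dl, oncology_discharge, sodium_mmol_l,
                   had_procedure, urgent_admission, admissions_prior_year,
                   length_of_stay_days):
    """Sum the item points. All weights below are assumed, not confirmed."""
    score = 0
    if hemoglobin_g_dl < 12:        # low Hemoglobin at discharge
        score += 1
    if oncology_discharge:          # discharge from an Oncology service
        score += 2
    if sodium_mmol_l < 135:         # low Sodium at discharge
        score += 1
    if had_procedure:               # any Procedure during the stay
        score += 1
    if urgent_admission:            # Index admission Type: urgent or emergent
        score += 1
    if admissions_prior_year > 5:   # Admissions in the previous year
        score += 5
    elif admissions_prior_year >= 2:
        score += 2
    if length_of_stay_days >= 5:    # Length of stay
        score += 2
    return score


def is_high_risk(score):
    """Flag scores at or above the cutoff mentioned in the talk."""
    return score >= HIGH_RISK_CUTOFF


# Hypothetical patient, for illustration only.
example = hospital_score(10.5, False, 132, True, True, 3, 6)
print(example, is_high_risk(example))
```

A fixed table like this sees only seven summary values per discharge, which is one motivation for a sequence model that reads the full record of each admission.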

For now, he is working to obtain and utilize more data via a transfer learning scheme and natural language processing techniques. Liu has even contacted 23andMe, a genomics and biotechnology company that conducts serious research and is open to the science community. “If we can stop even one or two suicides, we’ll consider ourselves successful,” he said.

Lina Sorg is the associate editor of SIAM News.