Natural Language Processing Techniques Improve Suicide Prevention in Veterans

By Lina Sorg

Each day, more than 120 people in the U.S. die by suicide. According to the Centers for Disease Control and Prevention, suicide is the tenth leading cause of death in the country; for individuals between the ages of 10 and 34, it is the second. Suicide rates have increased by 33 percent since 1999, severely impacting the lives of individuals, families, and communities and costing the U.S. over 69 billion dollars each year.

In an effort to improve healthcare for Veterans, the Department of Veterans Affairs (VA) has partnered with the Department of Energy (DOE) to combine VA healthcare and genomic data with the DOE’s extensive high-performance computing resources, artificial intelligence methodologies, and data analytics techniques. The VA’s vast wealth of information includes electronic health records (EHR) for approximately 24 million people and spans more than 20 years of patient data. The dataset also contains genomic data for more than 800,000 Veterans. The EHR comprises both structured data, such as demographics (height, weight, blood type, or blood pressure), diagnoses, treatments, and medications, and unstructured data, like physicians’ and nurses’ notes, discharge summaries, surveys, and progress reports.
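
To make this distinction concrete, here is a hypothetical illustration of the two kinds of EHR data the article describes; the field names and values are invented for exposition and do not reflect the VA’s actual schema.

```python
# Hypothetical illustration of structured vs. unstructured EHR data;
# field names and values are invented, not the VA's actual schema.
from dataclasses import dataclass, field

@dataclass
class PatientRecord:
    # Structured data: discrete, directly queryable fields.
    patient_id: int
    height_cm: float
    weight_kg: float
    blood_type: str
    diagnoses: list[str] = field(default_factory=list)
    medications: list[str] = field(default_factory=list)
    # Unstructured data: free text that requires NLP to interpret.
    clinical_notes: list[str] = field(default_factory=list)

record = PatientRecord(
    patient_id=1, height_cm=178.0, weight_kg=82.5, blood_type="O+",
    diagnoses=["hypertension"], medications=["lisinopril"],
    clinical_notes=["Veteran reports recent job loss and marital stress."],
)
print(record.diagnoses, "|", record.clinical_notes[0])
```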

Because an average of 20 Veterans die by suicide every day, the VA developed an outreach program called REACH VET (Recovery Engagement and Coordination for Health – Veterans Enhanced Treatment). REACH VET is based on structured data and utilizes predictive modeling, medical records, and variables (i.e., demographics, use of VA services, and medications) to identify Veterans who are at high risk for suicide. When the model identifies an at-risk Veteran, a clinician checks on their wellbeing and reviews treatment plans to determine whether more intensive care is necessary. However, the model does not account for unstructured data, which captures stressors like homelessness, social isolation, or marital troubles; it therefore lacks sensitivity. Natural language processing (NLP) can likely improve the sensitivity of predictive models such as REACH VET and help doctors more accurately identify triggering events. During a minisymposium presentation at the 2021 SIAM Conference on Computational Science and Engineering, which is taking place virtually this week, Silvia Crivelli of Lawrence Berkeley National Laboratory and the University of California, Davis explored the capacity of NLP techniques to complement structured data, improve REACH VET’s sensitivity, and help doctors identify at-risk patients.
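
The general shape of risk scoring from structured variables looks roughly like the following toy sketch. REACH VET’s actual model is far richer; the features, data, and labels here are invented purely for illustration.

```python
# Toy sketch of risk scoring from structured variables alone. This is NOT the
# actual REACH VET model; features, data, and labels are invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical columns: [age, VA visits in last year, psychiatric medications]
X = np.array([[25, 1, 0], [62, 12, 3], [34, 4, 1], [48, 9, 2]])
y = np.array([0, 1, 0, 1])  # hypothetical high-risk labels

model = LogisticRegression().fit(X, y)
new_patient = np.array([[29, 10, 2]])
print("estimated risk:", model.predict_proba(new_patient)[0, 1])
```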

Incorporating NLP into the REACH VET model is complicated, since the VA data contains roughly 4.6 billion notes. “Unfortunately this is not a straightforward task, as EHR was never accurately designed to capture traumatic life events,” Crivelli said. “As data scientists, we need to come up with clean and unbiased data for NLP so we can evaluate suicide risk with high accuracy.” Her team’s first attempt involved analyzing the metadata (document definitions) rather than the contents themselves. They trained an association model on patients’ document definitions prior to their first homelessness diagnosis. An association metric reduced the patient corpus to documents with a strong association with housing insecurity, and the resulting graph indicated a dramatic increase in the number of associated documents during the months immediately preceding new occurrences of homelessness. One can thus predict a potential homelessness event within a three-month window before it occurs. This knowledge would allow VA personnel to assist at-risk Veterans, who are less likely to attempt suicide when they receive homeless services.
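
To make the association idea concrete, the following minimal sketch scores how strongly each note type (metadata “document definition”) is associated with a subsequent first homelessness diagnosis. It is not the VA team’s actual pipeline: the column names and the simple lift metric are illustrative assumptions.

```python
# Minimal sketch (not the VA team's pipeline): score how strongly each note
# type is associated with a later first homelessness diagnosis, via lift.
# Column names (patient_id, doc_type, became_homeless) are hypothetical;
# became_homeless marks notes from patients who later received a first
# homelessness diagnosis (in real use, restricted to the preceding months).
import pandas as pd

def doc_type_lift(notes: pd.DataFrame) -> pd.Series:
    """Lift = P(doc_type | pre-homelessness patient) / P(doc_type overall)."""
    overall = notes["doc_type"].value_counts(normalize=True)
    pre_homeless = notes.loc[notes["became_homeless"], "doc_type"].value_counts(
        normalize=True
    )
    return (pre_homeless / overall).sort_values(ascending=False)

# Toy example: three note types, two patients, one later homeless.
toy = pd.DataFrame({
    "patient_id": [1, 1, 1, 2, 2, 2],
    "doc_type": ["social work note", "housing screen", "progress note",
                 "progress note", "lab report", "progress note"],
    "became_homeless": [True, True, True, False, False, False],
})
print(doc_type_lift(toy))
```

Note types heavily over-represented in the window before a homelessness diagnosis would rank first, mirroring the spike that Crivelli’s graph revealed.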

The aforementioned graph view suggests that Veterans likely experience homelessness earlier than its formal diagnosis, and social work notes confirm this suspicion. “We need to know that because we need to have a clean data set,” Crivelli said. “We need to know exactly when it happens so we can identify our positive and negative data sets.” She is also working to create a structure of other life events and plans to distill 180 identified events into a life event index via a six-step process:

  1. Classify 180 events into six classes: housing instability, job instability, social connection, food insecurity, justice, and access to means
  2. Expand the life event corpus with NLP methods
  3. Conduct entity extraction training using the corpus
  4. Extract life events from clinical notes
  5. Perform a pilot test and compute a normalized life event index (a sketch of one possible index follows this list)
  6. Include the findings in REACH VET
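
The article does not specify the formula behind step five’s normalized index; the sketch below assumes a simple proportion of a patient’s extracted life events that falls into each of the six classes.

```python
# Hedged sketch of a "normalized life event index": the article gives no
# formula, so this assumes per-class event fractions normalized to [0, 1].
from collections import Counter

CLASSES = ["housing instability", "job instability", "social connection",
           "food insecurity", "justice", "access to means"]

def life_event_index(extracted_events: list[str]) -> dict[str, float]:
    """Fraction of a patient's extracted life events in each class."""
    counts = Counter(e for e in extracted_events if e in CLASSES)
    total = sum(counts.values()) or 1  # avoid division by zero
    return {c: counts[c] / total for c in CLASSES}

print(life_event_index(["housing instability", "job instability",
                        "housing instability"]))
```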

Extraction of these life events requires significant data cleaning. For example, “job instability” can mean many different things and be phrased in numerous ways, including “stopped work,” “without employment,” “job loss,” “fired,” “semi-retired,” and “termination of employment,” among others. The classification system must therefore be familiar with all of these terms when examining job instability as a whole so as not to miss cases. “The unstructured data can offer much more information than just looking at the diagnosis notes in the structured data,” Crivelli said. Unfortunately, old-school extraction methods, i.e., simple NLP techniques such as CLEVER (CLinical EVEnt Recognizer), do not capture language complexity and cannot resolve some of the necessary relationships (see Figure 1).
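
As an illustration of such lexicon-style matching, in the spirit of simple extraction tools like CLEVER but not the actual CLEVER implementation, one could flag notes containing any of the job-instability phrasings above:

```python
# Illustrative lexicon-based matcher (not the actual CLEVER tool): flag notes
# that mention any known "job instability" phrasing from the article.
import re

JOB_INSTABILITY_TERMS = [
    "stopped work", "without employment", "job loss", "fired",
    "semi-retired", "termination of employment",
]
# One case-insensitive pattern with word boundaries around each phrase.
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(t) for t in JOB_INSTABILITY_TERMS) + r")\b",
    re.IGNORECASE,
)

def find_job_instability(note: str) -> list[str]:
    """Return every job-instability phrase found in a clinical note."""
    return [m.group(0) for m in PATTERN.finditer(note)]

print(find_job_instability("Patient reports he was fired last month and has "
                           "been without employment since."))
# -> ['fired', 'without employment']
```

Such matchers are fast and transparent, but as the article notes, they miss unlisted phrasings and cannot resolve relationships between terms; that limitation motivates the move to contextual models.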

Figure 1. Simple natural language processing (NLP) methods do not capture language complexity.

However, more sophisticated methods like BERT (Bidirectional Encoder Representations from Transformers) can handle such ambiguity. BERT’s internal representation of word interdependencies builds a directed graph between the extracted terms in a way that simpler NLP processes cannot, and it can also account for measures of connectivity between the graph’s nodes. Because of these promising results, Crivelli’s team is currently moving towards more sophisticated language modeling.
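
The following sketch illustrates the general technique of turning a pretrained BERT model’s attention weights into a directed graph between tokens. It uses the Hugging Face transformers and networkx libraries with an arbitrary threshold, and is an illustration of the idea rather than the VA team’s implementation.

```python
# Hedged sketch: extract attention weights from a pretrained BERT model and
# keep the strongest token-to-token edges as a directed graph. Illustrative
# only; the threshold and model choice are assumptions, not the VA's method.
import networkx as nx
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

text = "Patient lost his job and is now facing eviction."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Average attention over heads in the last layer: shape (seq_len, seq_len).
attn = outputs.attentions[-1].mean(dim=1)[0]
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])

# Directed edge i -> j when token i attends strongly to token j.
graph = nx.DiGraph()
THRESHOLD = 0.1  # arbitrary cutoff for illustration
for i, src in enumerate(tokens):
    for j, dst in enumerate(tokens):
        if i != j and attn[i, j] > THRESHOLD:
            graph.add_edge(src, dst, weight=float(attn[i, j]))

print(graph.number_of_edges(), "edges; e.g.", list(graph.edges)[:5])
```

Once the graph exists, standard connectivity measures (e.g., node degree or centrality from networkx) can quantify how strongly terms relate, in line with the connectivity measures mentioned above.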

NLP methods like BERT can correctly identify the word interdependencies that allow for concept extraction from complex text, recognize patterns between life events (i.e., factors that led to unemployment or divorce), and identify topic interdependencies. “Training language models is computationally expensive and requires a large corpus of text,” Crivelli said. “The good news is that once a model has been pretrained for language modeling, training a model for text classifications can be performed with a small set of annotated text and less time.”
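
The pretrain-then-fine-tune pattern that Crivelli describes looks roughly like the sketch below, which trains a classification head on top of an already pretrained encoder using a tiny annotated set. The labels and example notes are hypothetical.

```python
# Minimal sketch of the pretrain-then-fine-tune pattern: start from a
# pretrained language model and fine-tune it for classification on a small
# annotated set. Labels and example notes are hypothetical.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # e.g., housing-instability vs. not
)

texts = ["Veteran reports imminent eviction.", "Routine follow-up, no concerns."]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for _ in range(3):  # a few passes suffice when the encoder is pretrained
    optimizer.zero_grad()
    out = model(**batch, labels=labels)
    out.loss.backward()
    optimizer.step()
print("final loss:", out.loss.item())
```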

About the Author

Lina Sorg is the managing editor of SIAM News.