An automated process based on computer algorithms that can interpret text from medical examiners’ death certificates can significantly speed up data collection on overdose deaths, enabling a faster public health response than the current system allows, according to new UCLA research.
The study, which will be published on August 8 in the peer-reviewed journal JAMA Network Open, used artificial intelligence tools to quickly identify substances that caused overdose deaths.
“In America, the overdose crisis is the leading cause of death among young adults, but we don’t know the exact number of overdose deaths until months after they occur,” said Dr. David Goodman-Meza, the study’s lead author and an assistant professor of medicine in the Division of Infectious Diseases at the David Geffen School of Medicine at the University of California, Los Angeles. “We also don’t know how many overdoses have occurred in our communities, because rapidly released data is available only at the state level, at best. We need systems that distribute this data quickly and locally so that public health can respond. This is where machine learning and natural language processing can help.”
Overdose data collection currently entails several steps, starting with medical examiners and coroners, who ascertain a cause of death and record suspected drug overdoses on death certificates, including the drugs that caused the death. The certificates, which contain unstructured text, are then forwarded to local jurisdictions or the Centers for Disease Control and Prevention (CDC), which code them according to ICD-10, the International Statistical Classification of Diseases and Related Health Problems, Tenth Revision.
Because it must be done manually, this coding process takes time, creating a significant lag between the date of death and the reporting of those deaths. The delayed release of surveillance data in turn slows the public health response.
A further complication of this system is that drugs with very different uses and effects can be grouped under the same code. For example, buprenorphine, a partial opioid agonist used to treat opioid use disorder, and the synthetic opioid fentanyl are both listed under the same ICD-10 code.
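The problem can be illustrated with a toy lookup. In mortality coding, both substances fall under the synthetic-opioid code T40.4 (the mapping below is a simplified sketch for illustration, not an official coding table), so code-level counts cannot distinguish a treatment medication from the drug driving most overdose deaths:

```python
# Toy illustration: one ICD-10 code can cover pharmacologically very
# different drugs, so code-level surveillance loses detail.
# Simplified, unofficial mapping for illustration only.
ICD10_T_CODES = {
    "heroin": "T40.1",
    "methadone": "T40.3",
    "fentanyl": "T40.4",       # synthetic opioids other than methadone
    "buprenorphine": "T40.4",  # same code as fentanyl
}

def share_code(drug_a: str, drug_b: str) -> bool:
    """Return True if two drugs are indistinguishable at the code level."""
    return ICD10_T_CODES[drug_a] == ICD10_T_CODES[drug_b]

print(share_code("fentanyl", "buprenorphine"))  # True
print(share_code("fentanyl", "heroin"))         # False
```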
For this study, the researchers used “natural language processing” (NLP) and machine learning to analyze nearly 35,500 death records from Connecticut and from nine counties in other states:
- Cook (Illinois)
- Jefferson (Alabama)
- Johnson, Denton, Tarrant, and Parker (Texas)
- Milwaukee (Wisconsin)
- Los Angeles and San Diego (California)
They investigated how combining NLP, which uses computer algorithms to interpret text, with machine learning could automate the precise and accurate coding of large volumes of records.
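As a simplified illustration of the kind of task involved (the study trained machine-learning models; the rule-based matcher and keyword lists below are only a hypothetical stand-in), the free-text cause-of-death field can be normalized and scanned for substance mentions:

```python
import re

# Minimal rule-based sketch of substance extraction from unstructured
# cause-of-death text. The study used trained NLP/ML models; this
# keyword table is hypothetical and purely illustrative.
KEYWORDS = {
    "fentanyl": ["fentanyl"],
    "cocaine": ["cocaine"],
    "methamphetamine": ["methamphetamine"],
    "heroin": ["heroin"],
    "alcohol": ["ethanol", "alcohol"],
}

def extract_substances(text: str) -> set:
    """Return the substance classes mentioned in a death-certificate line."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return {label for label, words in KEYWORDS.items()
            if tokens & set(words)}

line = "Acute intoxication due to the combined effects of FENTANYL and ethanol"
print(sorted(extract_substances(line)))  # ['alcohol', 'fentanyl']
```

A real pipeline must also cope with misspellings, abbreviations, and negations (“no evidence of heroin”), which is why the researchers turned to machine learning rather than fixed keyword lists.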
They discovered that the most common specific substances among the 8,738 overdose deaths in the dataset were the following:
- Fentanyl (4758, 54%)
- Alcohol (2866, 33%)
- Cocaine (2247, 26%)
- Methamphetamine (1876, 21%)
- Heroin (1613, 18%)
- Prescription opioids (1197, 14%)
- Any benzodiazepine (1076, 12%)
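As a quick sanity check on the figures above, each percentage is the substance’s count divided by the 8,738 total deaths; the shares sum to well over 100% because a single overdose death can involve several substances:

```python
# Verify the reported percentages against the 8,738 total overdose
# deaths. Shares overlap because one death may involve multiple drugs.
TOTAL = 8738
counts = {
    "fentanyl": 4758,              # 54%
    "alcohol": 2866,               # 33%
    "cocaine": 2247,               # 26%
    "methamphetamine": 1876,       # 21%
    "heroin": 1613,                # 18%
    "prescription opioids": 1197,  # 14%
    "any benzodiazepine": 1076,    # 12%
}

for name, n in counts.items():
    print(f"{name}: {n / TOTAL:.0%}")
```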
Only the classification of benzodiazepines was suboptimal with this method; classification of the other substances was perfect or nearly perfect.
The CDC most recently released preliminary overdose data four months after the deaths, according to Goodman-Meza.
If these algorithms were embedded within medical examiners’ offices, the lag could be reduced to the time it takes to complete toxicology testing, which could be as little as three weeks after the death, he said.
Other substances, including amphetamines, antidepressants, antihistamines, anticonvulsants, antipsychotics, muscle relaxants, barbiturates, and hallucinogens, accounted for the remaining overdose deaths. The researchers note some limitations of the study, the most significant being that the system was not tested on less common substances, such as anticonvulsants, or on designer drugs, so it is unknown whether it would work for these. Furthermore, because the models must be trained on a large volume of data to make predictions, the system may be unable to detect emerging trends.
However, the researchers write that rapid and accurate data are needed to develop and implement interventions to curb overdoses, and that NLP tools such as these should be integrated into data surveillance workflows to speed the dissemination of data to the public, researchers, and policymakers.
In addition to Goodman-Meza, the study’s co-authors include Chelsea Shover, Dr. Amber Tang, Dr. Jesus Medina, Steven Shoptaw, and UCLA’s Alex Bui.