Researchers at Sandia National Laboratories (SNL), the University of New Mexico (UNM) and the Centers for Disease Control and Prevention (CDC) are working to improve the U.S. biosurveillance system, which warns authorities of disease outbreaks, by mimicking the human immune system.
The CDC coordinates the National Syndromic Surveillance Program, which collects anonymized data from emergency departments around the nation, analyzes public health indicators and uses statistical analyses to look for anomalies, such as a sudden increase in ER visits, that could indicate an outbreak. The program aims to save lives by detecting outbreaks more quickly, but falsely flagging non-outbreaks wastes resources.
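In highly simplified form, the kind of statistical check described above might look like a z-score test on daily visit counts. The data and threshold below are illustrative, not the CDC's actual method:

```python
from statistics import mean, stdev

def flag_anomalies(daily_visits, threshold=2.0):
    """Return indices of days whose visit count sits more than
    `threshold` standard deviations above the series mean."""
    mu = mean(daily_visits)
    sigma = stdev(daily_visits)
    return [i for i, v in enumerate(daily_visits)
            if sigma > 0 and (v - mu) / sigma > threshold]

# A stable baseline with a sudden spike on the final day:
visits = [102, 98, 105, 100, 97, 103, 99, 101, 180]
print(flag_anomalies(visits))  # -> [8]: only the spike day is flagged
```

Production systems typically compare each day against a moving baseline of recent history rather than a whole-series mean, but the idea of flagging statistically unusual counts is the same.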
“The national biosurveillance system serves essentially the same purpose as the human immune system, just on a larger scale,” Drew Levin, a computer scientist at SNL working on the project, said. “The immune system is made up of numerous T-cells that all operate independently. There’s no centralized controller and yet we do pretty well not dying.”
T-cells, a type of white blood cell, recognize and kill virus-infected cells and other foreign pathogens. They learn to do this by undergoing a negative-selection “training” process in which every T-cell that attacks normal body cells is destroyed. There is no central “brain” controlling the T-cells.
The research team began by creating algorithms that mimic these T-cells and examine multiple variables at once, such as the number of clinic visits, the day of the year and intake temperature. Levin then ran the algorithms against CDC and New Mexico Department of Health data, mimicking the negative-selection process that trains real T-cells.
He compared the various algorithms used and selected the most accurate. Initial tests on a pilot-scale biosurveillance system in 2016 found that the synthetic T-cells performed better than the traditional statistical methods.
The project also uses deep learning to convert the context of words into mathematical vectors with an algorithm called Word2vec. The team found that this algorithm outperformed standard keyword searches and other machine learning algorithms when run on anonymized chief complaint data: the patient's own description of why they came to the emergency room, recorded before a doctor has seen and diagnosed them.
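Word2vec learns its vectors with a shallow neural network trained to predict surrounding words. The sketch below illustrates the underlying distributional idea, that words appearing in similar contexts get similar vectors, using simple co-occurrence counts instead of a trained model; the chief-complaint phrases are made up for illustration:

```python
from collections import defaultdict
from math import sqrt

def cooccurrence_vectors(sentences, window=2):
    """Map each word to a vector of co-occurrence counts with every
    vocabulary word (a crude stand-in for learned Word2vec vectors)."""
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    vecs = {w: [0.0] * len(vocab) for w in vocab}
    for s in sentences:
        for i, w in enumerate(s):
            for j in range(max(0, i - window), min(len(s), i + window + 1)):
                if j != i:
                    vecs[w][index[s[j]]] += 1.0  # count each context word
    return vecs

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = sqrt(sum(x * x for x in a)), sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical, tokenized chief complaints:
complaints = [
    ["high", "fever", "and", "cough"],
    ["high", "temperature", "and", "cough"],
    ["sprained", "ankle", "from", "fall"],
]
v = cooccurrence_vectors(complaints)
# "fever" and "temperature" share contexts, so their vectors are more
# similar to each other than either is to "ankle".
print(cosine(v["fever"], v["temperature"]) > cosine(v["fever"], v["ankle"]))  # True
```

This is why a vector approach can beat keyword search on free text: "fever" and "temperature" never co-occur in the same complaint, yet they end up close because they appear in the same kinds of contexts.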
To handle the misspellings and abbreviations common in this data set, the team also used an algorithm that converts words into random, or untrained, vectors.
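The article does not detail how random vectors cope with misspellings. One standard trick, an assumption here rather than necessarily the team's method, is to build each word's vector from fixed random vectors of its character n-grams: a misspelling shares most n-grams with the correct word, so it lands close to it in vector space.

```python
import random

DIM = 256  # dimensionality of the random vectors (illustrative choice)

def ngram_vector(word, n=3, dim=DIM):
    """Sum the fixed random vector of each character n-gram in the word.
    Seeding the RNG with the n-gram itself makes the per-gram vector
    'random but fixed', so no training is needed."""
    padded = f"<{word}>"  # boundary markers
    grams = [padded[i:i + n] for i in range(len(padded) - n + 1)]
    vec = [0.0] * dim
    for g in grams:
        gram_rng = random.Random(g)  # same gram -> same random vector
        for i in range(dim):
            vec[i] += gram_rng.uniform(-1, 1)
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

# A misspelling ("abdomnal" for "abdominal") stays close in vector
# space, while an unrelated word does not.
print(cosine(ngram_vector("abdominal"), ngram_vector("abdomnal")) >
      cosine(ngram_vector("abdominal"), ngram_vector("headache")))  # True
```

Because the per-gram vectors are deterministic functions of the gram, the scheme needs no training data at all, which suits messy free-text fields where many tokens are one-off typos.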
The researchers are now working on how to mimic lymph nodes to improve the biosurveillance system. Lymph nodes act as immune system hubs and contain T-cells and B-cells, which produce antibodies that fight off infections.
Pat Finley, another SNL computer scientist on the project, says this could be especially useful for detecting outbreaks of regional diseases such as Lyme disease, plague and hantavirus. Distributing the work across lymph-node-like hubs could also help the system work within the physical and power-consumption limits that individual computers face.
Finley said he hopes to have this system ready by October to be combined with traditional statistical methods at the national scale.
“This project with Sandia has provided us with an opportunity to test the practical application of the concepts we’ve learned from our models,” Melanie Moses, a professor of computer science and biology at UNM who is working on the project, said. “Ultimately, this project will lead to a more complete understanding of the immune system, as well as a practical way to quickly identify and respond to disease outbreaks and other biological threats.”