A team of researchers at the National Institutes of Health (NIH) recently published a study in The Journal of Infectious Diseases on the impact big data, such as electronic health records and social media, could have on identifying emerging disease threats or potential outbreaks.
Traditional infectious disease surveillance involves laboratory testing and data collected by public health institutions. However, this approach often leads to time lags and lacks the local resolution needed for accurate monitoring.
Big data streams, such as internet queries, can speed up this process, but they often come with their own biases that affect how a researcher looks for information. For instance, some non-traditional data streams may lack demographic identifiers such as age and sex.
The authors said a hybrid approach that combines traditional surveillance with big data sets could offer a way forward by complementing existing methods.
“The ultimate goal is to be able to forecast the size, peak or trajectory of an outbreak weeks or months in advance in order to better respond to infectious disease threats,” Cecile Viboud, co-editor of the study and senior NIH scientist, said.
“Integrating big data in surveillance is a first step toward this long-term goal. Now that we have demonstrated proof of concept by comparing data sets in high-income countries, we can examine these models in low-resource settings where traditional surveillance is sparse,” she said.
The authors said that while the new approach shows promise, reliable surveillance information remains scarce overall, especially compared with other scientific fields such as climatology.