How Big Data And Machine Learning Can Predict, Prevent Isolated Cases Of Disease
Measles, once thought to have been eliminated in the U.S., is popping up in isolated outbreaks as a result of skipped well-child visits and parents’ fears that the measles-mumps-rubella (MMR) vaccine is linked to autism. Though some 350 measles cases occurred in 15 states in the first three months of 2019, more than half were in Brooklyn, N.Y., and nearby Rockland County, N.Y., where large religious communities have adopted anti-vaccine positions. Rockland County responded by pulling 6,000 unvaccinated children out of schools and barring them from public places.
The county’s actions were effective; in just a few months, 17,500 doses of MMR were administered to area children. Yet, wouldn’t it have been better to contain the outbreak before it got started? Predicting when and where measles cases will occur would be of great assistance to medical professionals and public health officials, who could more aggressively and precisely target public awareness campaigns and other programs to improve immunization rates for the sometimes deadly virus.
For missed well-child visits, we can leverage predictive analytics, specifically machine learning tools and AI technology. These tools take in historical healthcare information, family medical and immunization histories, and critical markers such as population health and clinical indicators to identify parents who are likely to purposely forgo or accidentally miss immunization appointments.
A great public source for this data is the World Health Organization. It has developed a risk assessment tool to help officials identify areas not meeting measles vaccination targets, and based on the findings, guide and strengthen measles elimination program activities.
Accurate predictions demand Big Data
More difficult are those situations where parents refuse to have their children immunized. Other than the New York cases, these are isolated instances spread across many states. For predictive power we need to dig deeper and leverage publicly available retrospective data, such as statistics on vaccination rates and disease outbreaks from the Centers for Disease Control and Prevention, as well as non-traditional health data. The latter includes syndromic surveillance data generated by software that mines a huge range of medical record sources. It also includes social media: by capturing geotagged posts, predictive algorithms can be enhanced to essentially provide a map of future outbreak hotspots.
Outbreaks of measles are predictable using supervised machine learning methods, but the quality of the inputs is everything. Predictive models draw on data from thousands of variables and must be constantly fine-tuned. Variables include language, addresses, income, family histories, and other demographic and socioeconomic data.
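To make "supervised learning" concrete, here is a minimal sketch in Python. It is not the model any agency or company actually uses. It assumes logistic regression as the method, two hypothetical input variables (a community's vaccination rate and its share of missed well-child visits), and synthetic data produced by a made-up rule. A real model would draw on far more variables and on real, carefully cleaned records.

```python
import math
import random


def sigmoid(z):
    """Map any real number to a probability between 0 and 1 (overflow-safe)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_risk(weights, bias, features):
    """Estimated outbreak probability for one community."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train_logistic(rows, labels, epochs=1000, lr=0.5):
    """Fit logistic regression with plain batch gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in zip(rows, labels):
            error = predict_risk(weights, bias, features) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Synthetic training data: the labeling rule below is invented for illustration
# only and does not come from any real outbreak records.
rng = random.Random(0)
rows, labels = [], []
for _ in range(200):
    vaccination_rate = rng.uniform(0.6, 1.0)    # hypothetical variable
    missed_visit_rate = rng.uniform(0.0, 0.4)   # hypothetical variable
    score = (0.9 - vaccination_rate) * 10 + missed_visit_rate * 5 - 2
    score += rng.gauss(0, 0.5)                  # noise, so the rule is not perfect
    rows.append([vaccination_rate, missed_visit_rate])
    labels.append(1 if score > 0 else 0)

weights, bias = train_logistic(rows, labels)

print("low coverage, many missed visits:", predict_risk(weights, bias, [0.65, 0.35]))
print("high coverage, few missed visits:", predict_risk(weights, bias, [0.98, 0.02]))
```

The model learns from labeled past examples and then scores communities it has not seen. Because it can only reflect the data it was trained on, the quality of the inputs matters as much as the algorithm.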
In an approach similar to predicting measles, the analytics company I work for, Blue Health Intelligence, uses modeling tools to identify health plan members who are incurring or will soon incur annual claims in the six figures. With fewer than one in 1,000 commercially insured plan members being high-cost, you need a huge dataset that is geographically widespread and represents the care experiences of millions of people. We have a database of more than 19 billion medical claims from every ZIP code. Our research shows the top conditions for these chronic cases are blood disorders, metabolic diseases and renal disease. This is enabling health plans to take a fresh look at how to care for patients with those conditions.
You won’t get these outputs with a smaller dataset because cases of rare conditions or high spending are so few that they might not show up in the data at all. Certainly, they won’t show up on a regional or local level, making predictions such as measles outbreaks far less effective.
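The arithmetic behind this point can be sketched in a few lines of Python. The only figure taken from above is the rate of one high-cost member in 1,000, treated here as an upper bound. The function names, the sample sizes, and the assumption that members are independent of one another are illustrative simplifications.

```python
def expected_cases(n_members: int, one_in: int) -> float:
    """Average number of rare cases in a group of n_members."""
    return n_members / one_in


def prob_no_cases(n_members: int, one_in: int) -> float:
    """Chance the group contains no rare case at all, assuming independence."""
    return (1.0 - 1.0 / one_in) ** n_members


def members_needed(target_cases: int, one_in: int) -> int:
    """Group size at which target_cases rare cases are expected on average."""
    return target_cases * one_in


ONE_IN = 1_000  # "one in 1,000" from the text, used as an upper bound

for size in (500, 5_000, 50_000):  # illustrative local-to-regional sizes
    print(size, expected_cases(size, ONE_IN), prob_no_cases(size, ONE_IN))

print(members_needed(1_000, ONE_IN))
```

At a local scale of 500 members, the expected count is half a case, and more often than not the group would contain no high-cost member at all. Getting 1,000 such cases to learn from takes on the order of a million members.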
The tools of artificial intelligence are becoming increasingly important as we look to reduce disease incidence and improve the safety and quality of care delivery. Keeping the public free of contagious illnesses like measles, which is safely and easily preventable with two doses of vaccine typically administered about three years apart, is well worth the investment of resources in Big Data.
Sanket Shah is senior director of Blue Health Intelligence, a data analytics company, and an assistant professor of clinical informatics at the University of Illinois at Chicago.