April 26, 2010
From DNA to Dashboards — Data Mining on the Brink of a Breakout
By Greg Goth
For The Record
Vol. 22 No. 8 P. 20
One of the most intriguing applications of data mining is a Vanderbilt University Medical Center project that incorporates EHR technology and a DNA database.
Mike Finley, MD, sees a future shaped by the spirit, as well as the letter, of healthcare data accessibility, one that portends a clear shift in how clinicians will have to regard the information in their care. As a new generation of tech-savvy patients reaches the age of needing more healthcare services, and demands more say in how their own health records are used, Finley says the industry must adapt.
“America will shift away from doctors who will not stay up with their patients,” says Finley, vice president of medical staff affairs at CHRISTUS St. Michael Health System in Texarkana, Tex. “We used to be the owners of the knowledge and keepers of the knowledge, and now we’re the organizers of the knowledge.”
As more data are entered into the collective record digitally, the term data mining is becoming more prevalent as providers attempt to organize all that information from multiple sources to better diagnose and treat patients. But, as Finley says, the providers are no longer undisputed “owners” of the knowledge; a new generation of patients has access to an array of information from widely disparate sources. The larger question for clinicians may be which varieties of data mining are clinically relevant and how to evaluate or build platforms that can be shared with a variety of users across the entire continuum of care.
For example, linking an EMR system with a deidentified DNA bank to deliver new information on factors such as gene mutations may predict effective treatments and outcomes for a given population of patients. But other applications could also be considered. For instance, a billing platform that scans payment decisions by competing insurance plans—and quickly tells a practice manager that coding mistakes resulted in patients being erroneously billed for balances by all those payers—may not be strictly considered clinical data mining or have anything to do with patient care per se. Nevertheless, a patient with a chronic and expensive condition may not agree.
At the Bedside
CHRISTUS Health is a development partner for a new clinical data-mining platform launched in September 2009 by Humedica. The platform includes a core “data factory” that collects information from disparate applications and creates a standard ontology. A retrospective comparative clinical analytic tool called MinedShare and a real-time predictive tool called MinedStream then retrieve and present the integrated data and help providers find high-cost and high-risk patients. All the applications are hosted by Humedica on a software-as-a-service model, priced based on discharge volume and/or number of full-time equivalent employees within an organization, and accessed via a browser.
The core idea behind the tools, says A. G. Breitenstein, JD, MPH, Humedica’s vice president and general manager of provider markets, is to allow caregivers to use population-based and longitudinal data to predict that “this patient will stay three days longer, cost three times as much, and the likelihood of a bad outcome is four times as great, so here are 17 things we want to make sure to get done before discharge. And let’s track those things in real time so we can reduce the chance you’re going to have any of those bad events occur.”
Finley is thus far satisfied with the Humedica platform; such tools, he says, may be increasingly important as the number of caregivers at any given patient’s bed increases.
“We can be much more efficient and accurate so we have better patient coverage for all these core measures,” he says. “Much of healthcare is going away from one doctor at the bedside. We have hospitalists, nurse practitioners, physicians’ assistants—a lot of people. The better your tool to give you good information at the bedside, the better you can be. With Humedica, I see something that can make us more efficient and provide a safer environment for our patients.”
However, Finley also cautions that the deluge of data-driven healthcare will make deciding what applications and platforms to deploy an extremely difficult proposition. This choice will be made even harder as federal stimulus money for EHRs begins to flow through the system.
“We have a perfect storm,” Finley says. “Database companies are mining data, showing us reports and pie charts and neat things. I’ve had six of them in my office in the last six months wanting to sell me stuff. And you can be inundated with it.”
Predicting Drug Reactions Likely First Step
However bright the future of mining individual and demographic data to help clinicians deliver safer and more efficient care, the current reality is that overall adoption of electronic records, especially in primary care settings, is low. Many EHR implementations are siloed from each other, even within the same institution, and payment incentives are still not aligned to make the best use of digital clinical intelligence.
“The payment system, which is the ultimate driver of incentives in this country, rewards them for duplicating tests over and over again,” Breitenstein says. “We don’t have accountable care organizations, so medical organizations are not incentivized to coordinate care. In many cases, the coordination of care is a monetary disincentive.”
While Finley believes the federal stimulus money may speed the arrival of widely functional data mining by 10 to 15 years, Breitenstein says the mere existence of the technologies may not translate into systems that deliver improved, more efficient coordinated care.
“In the short term, stimulus money is a nice incentive for people to go out on a technology spending spree, but to have all those technologies work together to reduce costs will not happen until the healthcare payment system is aligned to really pay attention to outcomes and isn’t just a fee-for-service ATM for more tests and procedures,” Breitenstein says.
As a result, the existing vanguard of large-scale clinical data mining may reside in integrated provider organizations that are already linking EHR capabilities with DNA data banks. Often supported by grants from the National Institutes of Health (NIH), these organizations cite pharmacogenetics as a likely first beneficiary of clinical data mining. For example, the NIH awarded researchers at Kaiser Permanente and the University of California, San Francisco, $24.8 million in 2009 to conduct a genomewide analysis of DNA samples from 100,000 Kaiser Permanente members. This analysis will be linked to decades of historical clinical and other health-related information on these participants culled from health surveys and the Kaiser Permanente EHR. In announcing the award, Kaiser noted that the genetic information generated by the project will include new data regarding drug metabolism and drug response, information that may help researchers discover genetic factors that explain why people react differently to medications.
Advanced pharmacogenomics is also likely to be a first payoff of combining Vanderbilt University Medical Center’s BioVU DNA data bank, which now has 78,000 separate samples, with Vanderbilt’s EHR, which contains data on 1.9 million patients.
Dan Masys, MD, chair of biomedical informatics at Vanderbilt, says clinicians can look for stories in the deidentified clinical data of people who did “very well or badly with particular medications” and then obtain corresponding DNA samples.
“We can then do a genomewide scan and find out if we could have predicted who would get bad and good effects. As a result, we expect we’ll have a small number of points in the genome which, if changed, would signal likelihood of a good or bad outcome,” Masys says. “We would like, in our vision of personalized medicine, to get that panel on everybody who walks in the door so it’s already in the record before the doctor even sees the patient. That way a decision support rule can say, ‘This person has this DNA variation to take into account.’”
Masys estimates the university spent $2 million constructing BioVU.
“In the first two years of operation, we have already paid that back many times over from research grants in the areas of genomics and pharmacogenomics,” he says, adding that the finances included both direct and indirect grants, where sponsors paid for research. “But we are much more interested in the power of the science than the trade-off in dollars.”
Doing Good by Getting Well
Socially conscious investment bankers have become fond of saying “doing good by doing well” to describe their investment style. A new generation of tech-savvy patients would like to follow suit by making their data available to researchers and clinicians in hopes that such a data bank can help research their conditions as well. For example, Masys says the combination of Vanderbilt’s EHR system and DNA bank will profoundly change the way medical trials will be conducted. Patients may elect to have their DNA data withheld from the bank but must actively opt out. (Masys says only about 5% choose to do so.)
“It has allowed us to not only grow a large data bank for research but also do it in an unbiased way,” he says. “Any health condition that walks through the door at Vanderbilt is likely to have data and tissue available, so groups that historically haven’t engaged in research will be represented. Traditionally, folks who sign up for trials are better educated and have more involvement in the healthcare setting in general. In a sense, that is almost a bias against science in that it skews results toward or away from given demographic groups. This allows us to research almost any health condition, no matter which population group might have it, and also at relatively low incremental cost. To add a sample into a bio bank is tiny compared to the traditional cohort method of populating data banks.”
Masys cites prostate cancer treatment as a potential beneficiary of such a technical combination.
“We are one of the few institutions that can ask the question, could the genomics have predicted a particular story?” he says. “Could we go into deidentified EMR data, find everybody who had early-stage prostate cancer and their treatment, then track to see what happened to that group and look to the DNA to see if we could have predicted the difference in outcomes? We are already at a place where we can begin to make those correlations, and when there’s a common disorder like prostate cancer, any institution that has a rich EMR and bio bank can do those correlations.”
Masys says achieving this systems engineering approach has two prerequisites. One is being able to harvest from the data banks to look for patterns that are predictive; the other is having the necessary computerized decision support to send the harvested guidance back to clinicians.
“That infrastructure is installed in less than 10% of medical institutions nationwide,” he notes. “Everybody else has the professional credential model—your doctors, based on what they read and remember and the data they get, can do the right thing every time, and you just know human beings are not designed that way. We need some sort of aid to help interpret the complexity and volume of data.”
But How Patient Must Patients Be?
Finley, Masys, and Breitenstein all predict that the new generation of tech-savvy patients, with a symbiotic mindset, will start demanding access to a variety of these decision-support tools as well.
Masys says that in the short term, large organizations that offer their data-driven information to patients via patient portals will likely have a leg up on those that don’t, but the federal push to adopt EHRs will close that gap, especially as providers meet meaningful use thresholds for stimulus funding. Already, he says, NIH-funded research is showing promise in the eMERGE Network run by Vanderbilt and five research partners.
“Those five institutions have found that even though they have wildly different designs for their EMRs, they’ve been able to reuse each other’s phenotype definitions with very little adaptation,” he says. “The focus isn’t really rigidly on the data model as long as there’s an ability to do data mining from both structured and unstructured sources. So that’s a very promising harbinger of interoperability.”
Ultimately, the holy grail of clinical data mining may not involve putting everything on a single presentation layer for providers and patients but rather deploying technologies that are analogous to a football playbook and game film, with the doctor serving as the coach and the patient as the fully involved athlete. The patient’s own longitudinal data and population-based data on similar patients could be offered as predictive possibilities, while a real-time dashboard such as Humedica’s would serve as a check that the game plan is being carried out.
“If we could produce a consumer-friendly digest of your data relative to risk and could infer predictively against your record, that is much more useful information than being simply an aggregator of your data to shuttle back and forth,” Breitenstein says. “Helping people at the patient level in predictive analytics will be a real breakthrough because then the patient can be the real watchdog for themselves or their loved one.”
Such a product, she says, is not science fiction.
“Our MinedStream product is about going to the patient level and saying to the provider, ‘You have 17 patients walking in the door who are at risk for a bad outcome. These are the 15 things you should do, and you are off track on five of them.’
“I envision a day in the not terribly distant future where we provide a consumer version of that. The best advocate for the patient is the patient’s family,” Breitenstein adds. “They are the best insurance policy against somebody walking in without washing their hands or hanging seven bags of fluid and not monitoring output.”
— Greg Goth is a freelance journalist from Oakville, Conn., specializing in technology and healthcare policy issues.