Fall 2022
Referential Treatment
By Susan Chapman, MA, MFA, PGYT
For The Record
Vol. 34 No. 4 P. 22
While a recent study touts the effectiveness of referential patient matching, some experts say it’s not a cure-all.
One of the major challenges to maintaining data integrity is accurate patient matching—ensuring that an individual is connected to the correct medical record—which also can directly affect care. “This problem of patient matching has been around for decades,” says Joaquim Neto, chief product officer at Verato. “HIM professionals have been struggling with it for years. The ecosystem of data is so much more complex now. That has only made the record-matching problem worse, despite organizations having sought primary EHRs from a single vendor. Because there are so many innovative technologies that need to integrate with EHRs, the patient-matching problem is bigger than it’s ever been before.”
In an effort to find better, more accurate ways to approach this longstanding issue, Regenstrief Institute recently completed a study comparing a probabilistic method of patient matching with a referential one. The study, “Evaluation of Real-World Referential and Probabilistic Patient Matching to Advance Patient Identification Strategy,” was published in the August 2022 issue of the Journal of the American Medical Informatics Association. Led by Regenstrief Institute Vice President for Data and Analytics and Indiana University School of Medicine Professor Shaun Grannis, MD, MS, the research team compared the two methods “with the goal of identifying evidence-based opportunities for improving match accuracy for record linkage.”
“In our study, we compared a probabilistic algorithm to a referential approach. Probabilistic methods are a broad class of approaches that assign algorithm-determined weights to each matching field. As an example, if the patient’s first name, last name, and phone number all agree, creating a composite score that exceeds a threshold weight, it is declared a match,” Grannis explains. “Neither the probabilistic nor the referential algorithms were tuned or adjusted in any way for the specific data under review. Significant work goes into tuning data. If you apply a generic probabilistic algorithm, it won’t be as accurate as one that is tuned.
“We evaluated a default—untuned—referential algorithm, which analyzes many data points, including information such as credit history and addresses. If a person has lived at multiple addresses, for instance, then you might want to look at all of them, the new and old addresses. If data show that a Mary Smith now lives at 426 Georgetown Place and if we also have access to her old addresses, then we can more accurately identify the two Mary Smith records as the same individual.”
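To make the contrast Grannis describes concrete, here is a minimal Python sketch. It is not the algorithm used in the Regenstrief study. The field weights, the threshold, the `Record` layout, the phone number, and the old address "12 Elm Street" are all hypothetical values chosen for illustration. The probabilistic score adds up weights for fields that agree; the referential variant additionally credits an address agreement when the two records' address histories overlap.

```python
from dataclasses import dataclass, field

# Hypothetical agreement weights; real systems estimate these from data.
WEIGHTS = {"first_name": 3.0, "last_name": 4.0, "phone": 5.0, "address": 4.0}
MATCH_THRESHOLD = 10.0  # illustrative cutoff, not taken from the study


@dataclass
class Record:
    first_name: str
    last_name: str
    phone: str = ""
    address: str = ""
    # Prior addresses from a reference data source (assumed to be available).
    past_addresses: set = field(default_factory=set)


def norm(value: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(value.lower().split())


def probabilistic_score(a: Record, b: Record) -> float:
    """Sum the weights of the non-empty fields on which both records agree."""
    score = 0.0
    for name, weight in WEIGHTS.items():
        left, right = norm(getattr(a, name)), norm(getattr(b, name))
        if left and left == right:
            score += weight
    return score


def address_history(record: Record) -> set:
    """Current plus prior addresses, normalized, with blanks removed."""
    history = {norm(record.address)} | {norm(x) for x in record.past_addresses}
    history.discard("")
    return history


def referential_score(a: Record, b: Record) -> float:
    """Probabilistic score, plus address credit when the histories overlap."""
    score = probabilistic_score(a, b)
    current_agree = norm(a.address) != "" and norm(a.address) == norm(b.address)
    if not current_agree and address_history(a) & address_history(b):
        score += WEIGHTS["address"]
    return score


# Two Mary Smith records: an older one and a newer one at 426 Georgetown Place.
old = Record("Mary", "Smith", address="12 Elm Street")
new = Record("Mary", "Smith", phone="555-0100",
             address="426 Georgetown Place",
             past_addresses={"12 Elm Street"})

print(probabilistic_score(old, new) >= MATCH_THRESHOLD)  # names alone fall short
print(referential_score(old, new) >= MATCH_THRESHOLD)    # address history closes the gap
```

With these made-up weights, the two names alone do not reach the threshold, but knowing that the newer Mary Smith once lived at the older record's address pushes the pair over it.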
Through their analysis, the Regenstrief team found that “referential patient matching, an increasingly popular method among health IT vendors, demonstrated notably greater accuracy than a more traditional probabilistic approach without the adaptation of the algorithm to the data that the traditional probabilistic approach usually requires.”
Grannis was surprised to learn that the referential matching algorithm (RMA) used in the study exhibited high accuracy without being tuned to the team’s data set. “In fact, what we discovered was that the untuned RMA performed better than some tuned probabilistic matching algorithms,” he notes.
Grannis believes the findings will improve the overall process of patient matching, explaining that the general workflow for patient matching ideally should be as automated as possible. “Assigning office staff to work on matching removes them from other tasks with greater clinical impact. Sometimes the matching work is straightforward, like definite matches and not definite matches. But there are records that fall into a gray zone that require human interaction, and we want to keep that as small as possible,” he says. “Referential matching, using broader, more comprehensive data for matching, is very effective. With broader, more complete data, matching accuracy can be improved. The goal is to have automated matching algorithms that are as accurate as possible. Because we’re in an automated space, we want an algorithm that does a very good job of finding correct matches and minimizes false matches. There is the possibility of false matching, which we don’t want to have happen. To do this we need the most complete data records possible.”
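The workflow Grannis outlines, with definite matches, definite nonmatches, and a gray zone left for staff, is often handled with two cutoffs on a match score. The sketch below assumes that setup; the cutoff values and the function names are hypothetical, and the scores could come from any matching algorithm.

```python
def triage(score: float, lower: float = 6.0, upper: float = 10.0) -> str:
    """Sort a match score into an automated decision or human review.

    The cutoffs are illustrative assumptions, not values from the study.
    """
    if lower > upper:
        raise ValueError("lower cutoff must not exceed upper cutoff")
    if score >= upper:
        return "match"        # confident enough to link automatically
    if score < lower:
        return "non-match"    # confident enough to keep records apart
    return "review"           # gray zone: route to staff


def review_share(scores: list) -> float:
    """Fraction of candidate pairs that land in the gray zone."""
    if not scores:
        return 0.0
    flagged = sum(1 for s in scores if triage(s) == "review")
    return flagged / len(scores)


candidate_scores = [2.0, 7.0, 11.0, 12.5]
print([triage(s) for s in candidate_scores])
print(review_share(candidate_scores))
```

A more accurate score pushes more pairs clearly above or below the cutoffs, which is how better matching reduces the manual workload he describes.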
According to Neto, the study’s findings were indeed welcome but not at all unexpected. “I think it’s great that the team can quantify the benefits of the specific new approaches to matching. I think it was the first time this was done. And it makes sense that the more data you have, the better decisions you can make. If you have more data, you can take advantage of that with smart algorithmic approaches and get better answers. Being able to prove it is the important aspect of the study,” he says.
However, not everyone is hailing the study’s findings as the ultimate solution for the challenges of accurate patient matching.
Rachel Podczervinski, MS, RHIA, vice president of professional services at Harris Data Integrity Solutions, notes that there are some patients who fall outside the parameters of referential matching. “There is value in a referential match but what it’s missing can be its pitfalls. For instance, some of the populations that are left out are children under 18 who have no credit history. If a hospital only uses credit history, then they will be missed. The same is true for undocumented immigrants, someone here from outside the country in general, or refugees. It misses those underserved populations, which is already an issue in health care,” she explains. “In order to have a full picture of what’s happening in the EHR and full pictures of your patients, referential matching won’t accomplish everything. What we need to look at is how to use multiple tools for patient matching.”
“Nicknames are another thing that creates a challenge that will not find a solution in a referential match,” says Jason Caturano, PMP, vice president of sales at Harris Computer. “If a person has a name that they use on the medical record, but that individual uses a nickname virtually everywhere else, then there may be a need to use other data points for accurate matching. Ultimately, though, we know that data are power, and adding more points of data to make an evaluation is a great idea. I love the idea of adding more data points to make the right match. But, unless you’re a human making the determination, you need some probabilistic determination to make sure the patient that you’re adding is correct,” he says.
Neto concurs that the more data health care organizations have for an individual record, the more data can be used to associate records for the same person. “But there are different data sets and different ways to manage that data, which is why referential data is so important,” he says. “When it comes to addressing matching and other similar data points, referential data is a better method. It provides a longitudinal data set that offers better matching at that time. If you link up more of the clinical data together, then the physicians can better treat you. It gives you a holistic view of the person rather than a siloed perspective.”
Podczervinski adds, “Certainly, for a team evaluating something like duplicates, it does help to have referential data. What we’ve seen, the more information my staff has to look at, the less likely they are to have to dive deeper to find more sources or go into the EHR. The more data we have to look at, the less time it takes.”
Moving Toward a Hybrid Approach
Grannis agrees with Podczervinski’s assessment that referential matching can have its shortcomings, acknowledging the lack of referential data for underserved populations and children as examples of those limitations.
Additionally, Grannis says that the referential algorithm the team tested can arouse privacy concerns. “Some people want to know where this data comes from, and there is sensitivity around the algorithm because typically it’s driven by voter, DMV, and other types of similar data that can be used to improve matching. That can make some people feel uncomfortable. If we wanted to go with a referential solution for the United States, then that would require amassing demographic data from many sources. Doing so can improve matching accuracy, but if that approach is taken, we will need to address the related social and ethical policy issues,” he says.
Neto also believes that privacy is paramount to the referential-method conversation. “In general, health care organizations that have taken this approach have done their due diligence, and privacy is top of mind. So, we meet the needs and take advantage of this information, which in the past has been a challenge but now allows us to treat the patients in how they want to be treated,” he says.
“Currently, there is no single best approach to patient matching. There are only solutions that work in context,” Grannis offers, adding, “there is currently a blurring of the algorithmic space. The probabilistic and referential approaches are not mutually exclusive and, if more data are available to the probabilistic approach, then it can be effective.”
Podczervinski is also of the opinion that it is important to use a combination of tools: referential, probabilistic, and deterministic, the last of which runs the data through a set of rules and is particularly important in automerging. “If a site is going to allow records to be automerged in their system without human intervention, those automerges should be based on deterministic rules to prevent merging together different people who have very similar data points, such as twins,” she explains. “Twins come back in referential matching and are sometimes thought to be the same person, because that data can be commingled in the referential data set. There needs to be evaluation of what is returned to prevent merging together two records that belong to different people. A deterministic approach would catch twins and recognize them as two individual people.”
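A deterministic guard of the kind Podczervinski describes can be sketched as a fixed rule that must pass before any automerge. The rule below (exact agreement on first name, last name, and birth date) is an assumption made for illustration, as are the `Patient` fields and the sample twins; an actual site would define its own rule set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    first_name: str
    last_name: str
    birth_date: str  # ISO format, e.g. "2010-05-01"
    address: str = ""
    phone: str = ""


# Hypothetical rule: every one of these fields must be present and identical.
REQUIRED_EXACT = ("first_name", "last_name", "birth_date")


def can_automerge(a: Patient, b: Patient) -> bool:
    """Allow an unattended merge only when every required field agrees."""
    for name in REQUIRED_EXACT:
        left = getattr(a, name).strip().lower()
        right = getattr(b, name).strip().lower()
        if not left or left != right:
            return False
    return True


# Twins share surname, birth date, address, and phone, but not first name.
twin_a = Patient("Anna", "Lee", "2010-05-01", "9 Oak Ave", "555-0101")
twin_b = Patient("Emma", "Lee", "2010-05-01", "9 Oak Ave", "555-0101")
duplicate = Patient("anna ", "LEE", "2010-05-01", "9 Oak Ave", "555-0101")

print(can_automerge(twin_a, twin_b))     # False: send to human review
print(can_automerge(twin_a, duplicate))  # True: same person entered twice
```

A rule this strict also blocks automerging a record filed under a nickname, so those pairs would fall to review rather than merge silently, which is the conservative behavior wanted when no human is checking the result.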
Neto notes that the various methods for patient matching all have value, and referential is a welcome addition to help make the process more accurate and efficient. “We’re building on a lot of work and innovation that has happened over time. It’s not discarding the traditional way but adding in a new aspect,” he says.
Solutions on the Horizon
Grannis points out that the United States is the only developed country that doesn’t have an assigned national patient identifier, something that could be effective in accurately matching patient information. “We know from Canada’s and other countries’ experiences that a national patient identifier helps to more accurately identify patients,” he says. “Several years ago, Indiana health IT leaders considered deploying a statewide unique patient identifier for each individual. Conservative estimates for a national rollout at that time exceeded $11 billion. Interpolating to Indiana’s population yielded an estimated implementation cost of approximately $250 million. Currently, patient matching is at roughly 92% accuracy, and political leaders are hesitant to commit that kind of money for, at most, an 8% increase.”
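The state figure Grannis cites can be sanity-checked by scaling the national estimate by population share. The sketch below does only that. The population figures are assumptions (rounded 2020 Census counts), not numbers from the article, and the original estimate was made "several years ago" on a basis the article does not give, so the result is a rough check rather than a reconstruction.

```python
NATIONAL_ESTIMATE_USD = 11e9     # "exceeded $11 billion," per the article
US_POPULATION = 331_400_000      # assumption: 2020 Census, rounded
INDIANA_POPULATION = 6_800_000   # assumption: 2020 Census, rounded


def scale_by_population(total: float, part_pop: int, whole_pop: int) -> float:
    """Apportion a national cost to a state in proportion to population."""
    if whole_pop <= 0:
        raise ValueError("whole population must be positive")
    return total * part_pop / whole_pop


indiana_share = scale_by_population(
    NATIONAL_ESTIMATE_USD, INDIANA_POPULATION, US_POPULATION
)
print(f"${indiana_share / 1e6:,.0f} million")
```

Because the national estimate was a floor ("exceeded"), a proportional share somewhat below the quoted $250 million is consistent with the figure in the article.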
With the unique national patient identifier conversation ongoing, the immediate question is how to improve accuracy. “Even if we did have the unique identifier, there could be someone who doesn’t have a card in the emergency department or forgets it. We then have to leverage demographic data to match back to the unique identifier,” Grannis says. “We have to be sure we don’t assign duplicates in those cases. There has to be a way, an algorithm, to look for duplicates. Different individuals can sometimes share a Social Security number, for example. So, an identifier would help, but we need to be mindful of its return on investment.”
Another option is digital identification, much like the biometrics—facial recognition and fingerprints—used with cell phones. Health care organizations are looking at ways to leverage this technology to improve patient identification. “Facial recognition is widely used with phones and other things. Hospitals are taking photos, so adding that as a data point can help eliminate duplicates,” Caturano says. “There is always newer pattern matching that will help—machine learning and artificial intelligence are two examples to assist with matching data. At the root, it comes back to referential and probabilistic matching, along with deterministic. But biometrics will take off in the years to come.”
Grannis believes the general population is becoming increasingly comfortable with sharing identifying information through social media. “While certainly not universal, there is growing understanding of the value of sharing identifying information in a meaningful way. Social media can help identify individuals in health care,” he says. “I believe it was Associate Justice Brandeis who once said, ‘We have a constitutional right to be let alone, not necessarily the right to be unknown,’ and I think that is at the crux of a lot of these discussions.”
Improving Patient Care
Neto believes health care organizations are focused not only on the delivery of care but also on treating patients like customers. “If you’re trying to understand a patient through the whole journey of your health system, it may begin before a patient ever comes to the facility. Their initial contact may be ‘meeting you’ on your website. How can I engage my patients to use the services that are available to them, to actively participate in wellness programs, to facilitate not just clinical encounters but true patient engagement? An EHR has good processes, but referential data has a big uplift on that.
“As we move forward, we know that technology has made patient matching easier,” Neto continues. “Simply knowing that there are new techniques with better results, like referential matching and hybrid methods, people are becoming aware that there are new options that make a bigger dent in the problem and help us be more efficient and accurate for better patient care.”
— Susan Chapman, MA, MFA, PGYT, is a Los Angeles–based freelance writer and editor.