November 2016
The Future of Speech Recognition
By Beth W. Orenstein
For The Record
Vol. 28 No. 11 P. 10
From virtual assistants to ambient intelligence, the next generation of speech recognition promises to be intriguing.
Speech recognition technology has come a long way since it was introduced in the 1960s; back then, systems could only recognize digits. In the health care arena, there's no doubt that today's versions of speech recognition are "light-years" ahead of those initial models, says Joseph Desiderio, president of Voicebrook, a provider of integrated speech recognition and digital dictation solutions for automated clinical documentation.
While still not 100% error-proof, speech recognition technology is far more accurate and user-friendly than ever, says Jay Vance, CMT, CHP, AHDI-F, vice president of operations at WahlScribe, which provides tailored transcription services to hospitals, clinics, and group practices. However, the technology still has more cards to play.
Experts say future versions will offer enhanced features that will make clinicians more productive and allow them to be more patient-focused. "It is a very, very exciting time for speech recognition and related technology, especially in health care information—a ton is going on," says Peter Mahoney, senior vice president and general manager of Dragon and clinical documentation for the health care division of Nuance Communications.
Many of the improvements to speech recognition engines coincide with the growing use of EHRs. The American Recovery and Reinvestment Act (ARRA) required all public and private health care providers and other eligible professionals to adopt EHRs and demonstrate meaningful use in order to maintain their existing Medicaid and Medicare reimbursement levels. EHR adoption, already clipping along at an impressive rate, is expected to continue to grow as the penalties for providers that don't adopt the technology increase substantially through 2018.
The intent of EHRs—to make providers more efficient—is spot on, Mahoney says, "but in reality, it's gotten in the way of the practice of medicine." Physicians and other providers have become slaves to their computers because of the data they have to enter into their patients' medical records before, during, and after visits, he says. "They've become data-entry clerks, and it's pulling them away from interacting with patients," Mahoney says.
That's where speech recognition technology, as it advances in capabilities, can play a big role, he says. Speech recognition can help "turn the doctor's chair around" so he or she is facing the patient, not the computer, during encounters. "It will take a while," Mahoney says, "but you'll see it more and more over the coming years as the speech recognition industry has focused on this problem and is looking at specific applications that can help."
Still, a human element will play a key role in any future in which speech recognition dominates, says Vance, an immediate past president of the Association for Healthcare Documentation Integrity. He believes it's essential to review any technology-generated documentation before it becomes a permanent part of a patient's chart, cautioning that a single error in an EMR can be replicated many times over as the record is shared with other physicians and hospitals.
"There's too much at stake for there not to be a process in place for medical documentation specialists to review reports for accuracy—regardless of what methods are used to document patient encounters," Vance says.
Improvements Coming to the Front End
Deploying speech recognition basically comes down to two options: back end and front end. In a back-end deployment, provider dictation is converted into electronic text, which is then edited by a health care documentation specialist (HDS). The HDS may seek clarification or obtain missing information from the physician, who must sign the document when it is completed.
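In rough terms, the back-end flow looks something like the sketch below. Every name in it is hypothetical; it illustrates the division of labor the deployment model implies, not any vendor's actual interface.

```python
# Minimal sketch of a back-end workflow: the engine drafts text from
# dictation, an HDS edits it (possibly querying the physician), and the
# physician signs the finished report. All names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Report:
    audio: bytes
    draft: str = ""
    final: str = ""
    queries: list = field(default_factory=list)
    signed: bool = False

def transcribe(audio: bytes) -> str:
    # Stand-in for the speech recognition engine's draft output.
    return "Patient presents with chest pain, onset two days ago."

def hds_review(draft: str):
    # The HDS corrects recognition errors and flags gaps for the physician.
    edited = draft.replace("two days ago", "2 days ago")
    queries = [] if "onset" in edited else ["Please confirm symptom onset."]
    return edited, queries

def backend_workflow(audio: bytes) -> Report:
    report = Report(audio=audio)
    report.draft = transcribe(report.audio)            # engine drafts the text
    report.final, report.queries = hds_review(report.draft)
    report.signed = not report.queries                 # physician signs a clean report
    return report

print(backend_workflow(b"...").final)
```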
Vance believes back-end speech recognition is rather mature, with not too much on the horizon other than perhaps increased accuracy. "I can't think of too many more things that could be done feature-wise to make back-end speech recognition technology more usable," he says, noting that he has had years of personal experience working with this form of speech recognition.
However, Vance says front-end speech recognition—a version in which the physician dictates, self-edits the transcript, and signs off on it—has plenty of room for improvement. "I'm convinced," he says, "that there are people who are working on it every day, looking for ways to help that computer to become more literate and fluent in ways that can better identify and accurately reproduce human speech."
Some of the features that industry experts expect to see as front-end speech technology matures and improves include the integration of more templates, virtual assistance, and artificial intelligence. For example, Voicebrook, which serves pathology departments, is developing an add-on that allows users to call up and complete structured data templates by voice.
The ability to drive workflow with voice is a tremendous advantage in the pathology lab because pathologists and their assistants need their hands free to handle specimens, Desiderio says. Also, new compliance regulations from the College of American Pathologists (CAP) require that when a positive cancer finding is noted, the pathologist must create a cancer checklist in the patient's record, with each data element saved in a CAP-specified structure. Some of the information needed for the checklist is already in the report the pathologist is dictating. Being able to merge the two with voice commands would make workflow more natural and the entire process more efficient, Desiderio says. "This additional reporting requirement can add significant time and duplication of data already documented," he notes.
Voicebrook is building new structured data templating tools to voice-optimize structured data entry (including CAP electronic cancer checklists) and reduce duplicate data entry, Desiderio says.
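As a rough illustration of what voice-driven structured entry involves, the sketch below maps a spoken command onto the fields of a simplified checklist so that data dictated once is not re-keyed. The field names and command grammar are invented for illustration; they are not the actual CAP electronic cancer checklist schema or Voicebrook's design.

```python
# Simplified stand-in for a CAP-style structured template.
CANCER_CHECKLIST = {
    "procedure": None,
    "histologic_type": None,
    "tumor_size_cm": None,
    "margins": None,
}

def apply_voice_command(template: dict, command: str) -> dict:
    """Parse commands like 'set margins to negative' into template fields."""
    words = command.lower().split()
    if words[:1] == ["set"] and "to" in words:
        split = words.index("to")
        field = "_".join(words[1:split])     # e.g., 'histologic type' -> 'histologic_type'
        value = " ".join(words[split + 1:])
        if field in template:
            template[field] = value
    return template

template = dict(CANCER_CHECKLIST)
apply_voice_command(template, "set histologic type to invasive ductal carcinoma")
apply_voice_command(template, "set tumor size cm to 2.3")
print(template)
```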
Over the decades, speech recognition technology has become not only more accurate but also better able to understand the context of what the user is dictating, Mahoney says. In the near future, he believes such natural language understanding will go even further and be able to make suggestions to users that make their jobs easier.
In fact, Nuance's Dragon Medical Advisor asks physicians to be more specific when necessary. For example, if physicians dictating a report describe a patient's diabetes but neglect to indicate whether it is type 1 or type 2, or its severity, the tool prompts them to provide the missing information. Obtaining such data improves patient care as well as the chances of the chart being coded correctly.
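The underlying check is easy to picture. The sketch below scans a dictated note for a diabetes mention and generates prompts when the type or control status is missing; the rules and wording are illustrative, not Dragon Medical Advisor's actual logic.

```python
import re

def specificity_prompts(note: str) -> list:
    """Return coding/documentation prompts for an under-specified note."""
    prompts = []
    if re.search(r"\bdiabetes\b", note, re.IGNORECASE):
        if not re.search(r"type\s*(1|2|I|II)\b", note, re.IGNORECASE):
            prompts.append("Diabetes documented without type. Specify type 1 or type 2.")
        if not re.search(r"\b(controlled|uncontrolled)\b", note, re.IGNORECASE):
            prompts.append("Indicate whether the diabetes is controlled or uncontrolled.")
    return prompts

print(specificity_prompts("Patient has long-standing diabetes on metformin."))
# -> both prompts fire; a fully specified note returns an empty list
```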
Another product, dubbed Florence, understands verbal requests such as orders for medications, labs, and diagnostic imaging procedures. Florence acts not only as a virtual assistant but also as a physician educator. For example, Florence can alert a physician to the potential effects of a certain medication being ordered for a specific patient, whether an 87-year-old woman or an otherwise healthy 21-year-old man.
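Conceptually, such an assistant pairs order recognition with patient-specific rules. The sketch below is a deliberately naive illustration: the parsing and the drug rule are invented, and nothing here reflects Nuance's implementation or constitutes clinical guidance.

```python
def parse_order(utterance: str) -> dict:
    # Naive pattern: "order <drug> <dose>" — real systems use natural
    # language understanding, not string splits.
    _, drug, dose = utterance.lower().split(maxsplit=2)
    return {"type": "medication", "drug": drug, "dose": dose}

def safety_alerts(order: dict, patient: dict) -> list:
    alerts = []
    # Hypothetical rule: flag sedating drugs for elderly patients.
    if order["drug"] in {"diazepam", "diphenhydramine"} and patient["age"] >= 65:
        alerts.append(f"{order['drug']} may pose a fall/sedation risk at age {patient['age']}.")
    return alerts

order = parse_order("order diazepam 5 mg nightly")
print(safety_alerts(order, {"age": 87}))   # alert fires
print(safety_alerts(order, {"age": 21}))   # no alert
```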
Taking advantage of speech recognition capabilities allows physicians to be more productive, Mahoney says. Without speech recognition, physicians may spend an extra 10 to 15 minutes in the EHR inputting data. "But if you can have a system that asks you follow-up questions and provides information you need on its own, it allows you to complete tasks much more rapidly," he says.
Next Up: Ambient Intelligence
The innovations and improvements won't stop there, Mahoney says. The next step, and one that Nuance and others are working toward, is ambient intelligence.
"The idea of ambient intelligence in health care is that the speech recognition technology is listening all the time to the doctors and their interactions with their patients," Mahoney explains. The hope is that the technology will be able to extract key medical facts from physician-patient conversations to help improve documentation and facilitate a care plan. Ambient intelligence software should even be able to proactively make recommendations to the physicians. "It will do so in a very natural, ambient way, meaning it's just there in the background, listening in an unobtrusive way," Mahoney says.
It's hard to say when ambient intelligence will become a reality in health care, but Mahoney believes it's in the not-too-distant future. "You'll see bits of it starting to emerge in the next couple of years," he notes. Although Nuance is actively working with a number of interested health care systems on developing ambient intelligence for its speech recognition technology, "being able to deliver on the whole vision is probably in the five-plus year timeframe," Mahoney says.
Detlef Koll, chief technology officer and cofounder of M*Modal, which offers cloud-based speech recognition solutions for all medical specialties, agrees that ambient intelligence working in the background to help with clinical decision support and guidelines "is not too far off but still evolving."
While speech recognition has already evolved into an assistive system, he says it soon will be able to make suggestions to physicians based on the current documentation. For example, speech recognition may be able to recognize whether a patient's pneumonia was acquired in the hospital or from a cold that turned bad. It may even be able to suggest the best antibiotic, Koll says, though he adds that "decision support for treatment is down the road."
Pocketbook Effects
Will these innovations increase the cost of speech recognition technology? The answer is debatable, but, as with many new products, providers can expect the price to decrease over time, Vance says, noting that the marketplace tends to settle down once vendors figure out more efficient production methods and the competition heats up.
He believes any new innovations will have little effect on pricing. "There could be some minor fluctuations one way or another, but I would be shocked to see much of an increase in price," Vance says, adding that steeper costs would only erect another barrier in a market that already faces hurdles in getting providers to adopt the technology. He predicts the cost will remain stable and perhaps even become more affordable, noting that a significant price increase "would be counterproductive in the long run."
Even if new speech recognition features make the technology more expensive, health care systems may save money in the long run, an economic equation that will make it an attractive option, Mahoney says. "Ultimately, they may be spending more of their budget on these kinds of technologies because the applications are getting broader and broader. But if we do our job right, they will be spending less money overall because the applications will make users more productive and more efficient," he says.
— Beth W. Orenstein of Northampton, Pennsylvania, is a freelance medical writer and a regular contributor to Great Valley Publishing's magazines.