November 2013
The Case for a ‘Front-End’ Alignment
By Susan Chapman
For The Record
Vol. 25 No. 15 P. 14
Do the mechanics of front-end speech recognition create a smooth-running operation or throw a wrench into documentation and patient care?
A growing number of physicians are completing their medical transcription via front-end speech recognition, a speech-to-text process that occurs in real time wherever the provider is located. Physicians speak directly into a device, with the results appearing immediately.
Despite the increased popularity of front-end speech recognition, many physicians still employ back-end transcription, in which they dictate into a recorder or phone. The audio frequently goes to the cloud, where transcriptionists access and transcribe it, creating a document that must be edited before it can be entered into the medical record.
There are two types of speech recognition systems. The first is speaker dependent (front end): the system knows the speaker, has been trained on that voice, and recognizes previous corrections, which reduces editing time and cost. A speaker-independent (back-end) process instead uses samples of thousands of recorded voices to try to determine what is being dictated. Because it is unaware of the speaker’s identity and never sees the corrections being made, it repeats the same errors.
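The distinction is easiest to see in miniature. The following Python sketch is purely illustrative (no real recognizer is involved, and the mis-heard drug name is a stand-in), but it shows why a speaker-dependent profile that remembers corrections stops repeating an error while a speaker-independent engine makes it every time.

```python
# Toy illustration only: contrasts a speaker-independent engine, which
# repeats the same error every time, with a speaker-dependent engine
# that folds the physician's past corrections into future drafts. No
# real speech recognition is performed; the "audio" is plain text.

class SpeakerIndependentEngine:
    """Generic acoustic model that knows nothing about the speaker."""

    def recognize(self, audio: str) -> str:
        # Stand-in for the engine's statistical best guess; here it
        # always mis-hears the same sound-alike drug name.
        return audio.replace("Celebrex", "Cerebyx")


class SpeakerDependentEngine(SpeakerIndependentEngine):
    """Adds a per-speaker profile of accepted corrections."""

    def __init__(self) -> None:
        self.corrections: dict[str, str] = {}

    def learn(self, wrong: str, right: str) -> None:
        # The physician fixes the draft once; the profile remembers it.
        self.corrections[wrong] = right

    def recognize(self, audio: str) -> str:
        draft = super().recognize(audio)
        for wrong, right in self.corrections.items():
            draft = draft.replace(wrong, right)
        return draft


dictation = "Start Celebrex 200 mg twice daily."

generic = SpeakerIndependentEngine()
personal = SpeakerDependentEngine()
personal.learn("Cerebyx", "Celebrex")  # one-time edit by the physician

print(generic.recognize(dictation))   # the error recurs every time
print(personal.recognize(dictation))  # the profile corrects it
```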
Benefits vs. Challenges
Gary David, PhD, an associate professor of sociology at Bentley University in Waltham, Massachusetts, notes that front-end speech recognition does not necessarily save labor, and it changes workflow. “Does e-mail make it easier to communicate with people?” he asks. “It can definitely make it easier to create something, but it can also make more work. With e-mail, there is a new mode of communication that we must address every day when, before e-mail, that wasn’t the case. Considering this example, it then becomes important to examine the impacts of new technology on workflow and not just on whether it makes any individual task more efficient.”
David is engaged in research that observes how front-end speech recognition affects physicians’ workloads. In the absence of back-end transcriptionists to edit the dictation, the onus of managing the text created by the speech recognition software falls on physicians, many of whom simply do not have enough time to see patients, dictate notes, and edit records.
“The issue of quality assurance becomes paramount,” David says. “Who is doing the proofreading? In the past, the transcriptionist would correct the errors that invariably came up when doctors dictated. Without the medical transcriptionist, you have no one in charge of quality assurance but the physicians themselves. These programs still make mistakes and, in my research, I found that physicians would sometimes let small errors go—substituting ‘he’ for ‘she,’ for instance. At times, the errors can be very difficult to identify. So it doesn’t make sense for doctors to spend their time proofreading transcriptions.”
Nick van Terheyden, MD, chief medical information officer for Nuance, believes that with a 90% accuracy rate, the company’s speaker-dependent speech recognition technology improves overall workflow. “The software is very sophisticated and can distinguish among accents and specialties, information that becomes part of the provider’s profile,” he explains. “With that data and a limited amount of training, physicians can produce very accurate documents.”
While speech recognition software is appropriate in many cases, van Terheyden, like David, sees time management as an ongoing issue for physicians. “There is a time issue,” he notes. “Before, if I were in an EMR system, I would dictate to a back-end system. The transcript would then be returned to me after review. I expect the medical editor to review the document, so my editing would be minimal.
“In a front-end circumstance, you remove that extra set of eyes. This takes more time. Even with perfect technology, if the provider is in a noisy environment and his profile wasn’t adapted to that or if he has a cold, the recognition would be less accurate and require more review. Luckily, technologies like computer-assisted physician documentation are helping to further automate this process, which streamlines documentation and ensures the integrity of the patient note starting at the point of care.”
Both David and van Terheyden point to the review process as the main potential time cost for busy physicians. This challenge can be compounded if a physician works in multiple environments, some of which employ front-end speech recognition while others use back end. “It’s key that we give physicians the choice to optimize workflow, whichever system they prefer,” van Terheyden says.
Tracy “Bud” Lawrence, MD, an emergency department (ED) physician, the director of risk management, and an IT physician champion at Henry Mayo Newhall Memorial Hospital in Valencia, California, uses front-end speech recognition software. “In our ED, we exclusively use speech recognition and are happy with it,” he says. “Prior to this, we used standard dictation to update the medical records of approximately 50,000 patients per year. This translated into about $1.5 million spent on back-end transcription each year. Now, we’re realizing a significant financial savings.”
In contrast to David and van Terheyden’s observations, Lawrence believes front-end speech recognition technology has improved the ED’s overall workflow. “Previously, we had to complete charts in one block of time,” he says. “We had to set aside time to do this. Doctors would write notes on paper and then, at the end of the day, they would dictate all the charts. Two or three hours after the end of a shift, some doctors would still be dictating. Because of this, they were missing fine details and historical items. Charts were not as accurate.”
With speech recognition, and EMRs in general, Lawrence says physicians can create staged documents, save them, and then input test results. They also can view data on-screen, make final edits, and sign off on charts. “What we noticed was that our physicians were getting out of their shifts on time,” he says. “This is a huge physician satisfier.”
Lawrence adds that when the new software first was introduced, physicians had much to learn. “In speech recognition, there are no spelling errors. Instead, the system replaces a misspelled word with something that sounds close to it,” he says. “Therefore, we had to learn to tailor our editing to those things that are commonly misrecognized. The bottom line is that the editing process has a steep learning curve.”
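The kind of targeted proofreading Lawrence describes can be partially automated. The sketch below is not a feature of any particular product; it is a minimal illustration, with a hypothetical watch list, of how a site might flag known sound-alike confusions in a draft note so that editors check the words a speech engine actually gets wrong rather than hunting for misspellings that cannot occur.

```python
# Illustrative sketch only, not a feature of any particular product: a
# reviewer aid that flags known sound-alike confusions, the class of
# error Lawrence describes. A speech engine never misspells; it
# substitutes a similar-sounding word, so editors check a watch list.
import re

# Hypothetical watch list; in practice it would be built from a site's
# own correction history.
SOUND_ALIKE_PAIRS = [
    ("hypotension", "hypertension"),
    ("dysphagia", "dysphasia"),
    ("Celebrex", "Cerebyx"),
]


def flag_for_review(note: str) -> list[str]:
    """Return a warning for every term that has a known sound-alike twin."""
    warnings = []
    for a, b in SOUND_ALIKE_PAIRS:
        for term, twin in ((a, b), (b, a)):
            if re.search(rf"\b{re.escape(term)}\b", note, re.IGNORECASE):
                warnings.append(f"'{term}' found; confirm it is not '{twin}'")
    return warnings


draft = "Patient reports dysphagia; history of hypertension."
for warning in flag_for_review(draft):
    print(warning)
```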
Van Terheyden notes that physicians using speech recognition to update an EMR at the point of care could potentially impact the patient experience. “When you’re dictating or typing, you’re taking focus away from the patient,” he says. “Dictation reduces eye contact. What I’m hearing from the field is that, typically, physicians will interact with the patient and then move to dictation.”
Ben Brown, vice president of business development and investment services at KLAS Enterprises, recently coauthored a study on how speech recognition operates in both back- and front-end environments. The researchers discovered that most physicians preferred speech recognition over pointing and clicking through a patient encounter and appreciated the software’s enhanced flexibility and ability to paint a broader picture of patients’ stories.
“Another positive is that you do get the context,” he says. “An EMR is point and click. Getting the context of patient history with details can lead to better patient care. Also, when clinicians do speech recognition on the spot, they actually complete a patient report much quicker than waiting until the end of the day or waiting for a transcriptionist to create a document that then must be reviewed, edited, and finalized. We saw radiologists who adopted speech recognition witness their productivity and competitiveness increase quite a bit.”
The researchers also observed age to be a factor in speech recognition success. “Younger physicians tend to adapt to this technology more readily than established practitioners,” Brown says. “Generally speaking, for those who have spent years documenting patient encounters by phone, paper chart, or other means, getting used to and training a speech engine is challenging. Many of those who are challenged by this unfamiliar technology are receiving assistance from their facilities, with health care systems utilizing physician champions and superusers who can help physicians overcome the adoption challenges.”
The study found that recognition rates for accents, small words, words with more than one meaning, and other nuances still are problematic. “Despite these issues, we see adoption happening pretty aggressively now,” Brown says, adding that physicians resistant to the technology can employ back-end speech recognition software. “In this case, the transcriptionist becomes an editor; [the physician] just has to review the text for accuracy. It doesn’t change physician workflow, but transcriptionists can turn the data around quicker. And as the need for traditional medical transcriptionists dwindles, those roles can be changed to editors.”
Free vs. Fee: Security and Other Issues
Steve Jones, CEO of technology company PortNexus, believes there are only two ways to improve transcript accuracy: ensure input quality and enlist an efficient translation system. “The old saying ‘garbage in, garbage out’ definitely applies to accuracy in dictation,” he says. “We have all had the very funny moments with the ‘free’ speech recognition tools of today. For business, funny costs time, which is money and, if the chosen components are not put together or selected properly, compliance with regulatory requirements such as HIPAA can become a challenge at a minimum.”
Jones adds that when users select free speech recognition tools, they must be certain data are protected and the service is HIPAA compliant. Otherwise, the data could be sitting on a marketing server. “Some free vendors—if you read the fine print in the license agreement—can upload your contacts from your phone for use in other ways,” he says. “Free is not free.”
The hardware used, the author’s ability to communicate effectively, and how the sound is manipulated at the hardware level all affect input quality, Jones says. Today’s mobile environment features components such as Bluetooth headsets, hands-free recording systems, and both wired and voice over Internet protocol (VoIP) phones, not all of which are ideal. “If physicians are dictating into a device with a Bluetooth headset that is approved for a particular speech recognition program, they’ll get a high quality of sound capture,” he says. “You cannot expect to have the same quality of sound from a $25 headset as one that costs $100. Higher-end headsets do much better because of the noise cancellation systems built in. We tell our clients that even the highest-priced headsets are a low-cost investment in improving the quality of the recording.”
Speaker-independent systems also can be a source of trouble. “When physicians use speaker-independent systems, [the drafts] require much more editing every time the speaker dictates,” Jones says. “In most cases, the converted text is sent back to the author for editing and placement, removing the potential for workflow automation to reduce costs. This type of system is also where the risk of not being HIPAA compliant is increased. Where is the data sitting? How did it get to me? Who else has access to the data and for what purpose? These are all questions that should be answered before embracing this type of speech recognition system.”
To determine a speech recognition system’s cost benefits, several factors must be weighed. “The time it takes to get an accurate record must be included in the total value calculations,” Jones says. “Who does the cleanup editing and proofing once a draft transcript—no matter how good—has been created should be considered as well. Limiting the time spent on editing reduces costs to the process, and paying attention to the engine being used to translate is just as important.”
David expresses concern over unnoticed errors in the medical record. “If there is an error in the EMR, how does that hold up in an audit, coding, or court?” he questions. “Would even the smallest error introduce reasonable doubt in the eyes of a judge or jury?”
David believes that physicians should consider “recipient design”—an awareness that there is another party (or parties) in the interaction—when creating medical records. “Physicians should be asking, ‘Who am I trying to make this record usable for? How does it impact usability for coders, patients, other physicians, clinical documentation specialists, or the court system?’ We can’t assume a pediatrician can read a note from an ophthalmologist, for example. Consequently, users should always remain conscious of an imagined other when dictating into the EMR. And, given the number of potential audiences for the medical record, creating one record to satisfy everyone can be daunting. There should be something in the workflow to help physicians with the process of making records that are more usable vs. just figuring out how to get them done more quickly.”
The Industry’s Future
Currently, there are few vendors in the front-end speech recognition market, mainly because of the high barrier to entry, according to Brown. “Although there are many competitors in the radiology technology space, Nuance capitalized on the broad health care market for a long time,” he says. “Speech recognition is complex technology, and there is a complex language in health care. That presents challenges to companies when developing this type of software.”
Besides Nuance, Dolbey also offers a front-end solution. More recently, M*Modal has developed its own proprietary applications in the cloud to compete with Nuance and Dolbey.
The technology is evolving, says Jonathon Dreyer, Nuance’s director of mobile solutions marketing, adding that the cloud is bringing big changes to clinical documentation. “Cloud-based front-end medical speech recognition is being adopted at an increasing rate, especially as clinicians turn to tablets and smartphones for EMR access and real-time documentation,” he says. “Hardware and processing capacity varies from device to device, so the technology works by streaming audio and text in real time, offloading processing requirements from mobile devices. The technology has advanced so rapidly in the past two years that the speech-to-text user experience on mobile devices is very similar to that of the desktop.”
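Dreyer’s description corresponds to a common streaming pattern: the device ships small audio chunks as they are captured and receives partial transcripts back, so the heavy acoustic processing never runs on the phone. The Python sketch below simulates that round trip in a single process, with asyncio queues standing in for the network; the chunking and the return channel are the point, and the “recognizer” is faked.

```python
# Minimal simulation of the streaming pattern described above: a mobile
# client ships small audio chunks as they are captured, and the "cloud"
# returns partial transcripts, so heavy acoustic processing stays off
# the device. Both sides are faked here in a single process.
import asyncio

CHUNK_MS = 250  # ship audio in quarter-second chunks


async def mobile_client(uplink: asyncio.Queue, downlink: asyncio.Queue) -> None:
    """Stream fake audio chunks up, then drain the partial transcripts."""
    for chunk in (b"chunk-1", b"chunk-2", b"chunk-3"):
        await uplink.put(chunk)
        await asyncio.sleep(CHUNK_MS / 1000)  # simulated capture time
    await uplink.put(None)  # end-of-dictation marker

    while (partial := await downlink.get()) is not None:
        print("partial transcript:", partial)


async def cloud_recognizer(uplink: asyncio.Queue, downlink: asyncio.Queue) -> None:
    """Consume audio chunks and emit a growing (faked) transcript."""
    partials = iter(["Patient", "Patient presents", "Patient presents today."])
    while await uplink.get() is not None:
        await downlink.put(next(partials))  # pretend recognition result
    await downlink.put(None)  # no more text coming


async def main() -> None:
    uplink: asyncio.Queue = asyncio.Queue()
    downlink: asyncio.Queue = asyncio.Queue()
    await asyncio.gather(
        mobile_client(uplink, downlink),
        cloud_recognizer(uplink, downlink),
    )


asyncio.run(main())
```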
As health care systems work to better integrate new technology into the patient experience, van Terheyden believes speech recognition eventually will become a passive player, with the software allowing physicians to face the patient and ask questions while the program records information, extracts content, and formulates patient notes.
“I also feel that the physician can then engage the patient more deeply in the process,” he says. “For instance, at the end of the consultation, the physician and patient can look at the screen and review the information. In this way, the technology becomes a partner. That, for me, is where we want to go. We hope to improve physician-patient engagement and use the technology to support and enhance that relationship.”
— Susan Chapman is a Los Angeles-based writer.