April 23, 2012
Conflicting Messages
By Selena Chavis
For The Record
Vol. 24 No. 8 P. 20
While past studies tout the cost-effectiveness of speech recognition, a new report suggests traditional transcription may produce higher-quality reports.
Advances in speech recognition software over the last decade have taken the technology from near obscurity in the healthcare field to a mainstream strategy in the greater HIT movement. Alongside the technology’s general ability to deliver a solid return on investment (ROI), federal initiatives such as the HITECH Act and meaningful use are further driving adoption as healthcare organizations try to create efficiencies and standardize patient care practices.
Consider that the KLAS research group pointed to high expectations for future growth in its Speech Recognition 2010 report, which at the time revealed that one in four hospitals was already using the technology. It’s clear that speech recognition practices are gaining acceptance and influencing documentation procedures across the healthcare landscape.
That said, the technology’s use has also raised more than one eyebrow in the transcription field as concerned professionals look to define their role while the electronic movement continues to evolve. Opinions on where the field is headed run the gamut from some industry professionals predicting its demise to others seeing an elevated role for transcriptionists who can morph into editors.
While many studies laud the efficiencies of eliminating traditional transcription practices for the benefits of speech recognition, a recent study suggests that concerns voiced over the years about the potential for errors may need to be revisited. Published in the October 2011 issue of the American Journal of Roentgenology, the research revealed that breast imaging reports generated using an automatic speech recognition system were eight times as likely to contain major errors as those generated with conventional dictation transcription.
It’s one of several studies conducted in recent years, with others revealing significant benefits to speech recognition. The question for hospitals and clinicians is how to weigh the benefits against the risks. Three studies are outlined below with professional insights into the technology’s pros and cons along with best-practice advice.
The Studies
• “Error Rates in Breast Imaging Reports: Comparison of Automatic Speech Recognition and Dictation Transcription”: This American Journal of Roentgenology study, cited above, reviewed 615 complex cases occurring during multidisciplinary team rounds at a breast imaging center in which 308 reports were generated with automatic speech recognition and 307 were constructed using conventional dictation transcription. In the case of speech recognition, a radiologist would dictate the report, and the software would immediately transcribe the information onto a computer screen.
According to coauthor Anabel Scaranelo, PhD, MD, of Toronto-based University Health Network, the study revealed “that at least one major error was found in 23% of automatic speech recognition reports compared with 4% of conventional dictation transcription reports.” She described major errors as those that impacted the understanding of the report or affected patient management. Errors could include an incorrect unit of measurement (eg, millimeter instead of centimeter) or a missing word, such as the omission of “no” from a report that should have read “no malignancy.” (The arithmetic behind the error-rate multiplier is sketched after this study summary.)
The study found that the error rate increased when breast MRI reports were looked at separately. In this case, the major error rate was 35% for speech recognition reports and 7% for conventional reports. “We think this is because MRI reports are more complex, with more description,” Scaranelo said in a press release.
In the past, speech recognition critics have pointed to the technology’s inability to understand heavy or foreign accents. In this case, Scaranelo pointed out that the native language of the dictating radiologist had no effect on the automatic speech recognition report error rate. “We thought that there may be a higher error rate for non-native English speakers because the software works with voice recognition, but that didn’t happen,” she said.
Prior to the study, the report noted, the speech recognition system was fed several hours of dictation from the radiologists, allowing it to “learn” from the input voice data.
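For readers curious how a multiplier like “eight times as likely” arises from raw percentages, here is a minimal Python sketch. The report counts are back-calculated from the article’s headline figures and are assumptions; the study’s own adjusted statistical model is not reproduced here.

```python
# Rough recalculation (not from the study itself) of the error-rate
# multiplier reported above. Report counts are back-calculated from the
# article's headline figures: 23% of 308 speech recognition reports and
# 4% of 307 conventional reports contained at least one major error.

asr_reports, asr_rate = 308, 0.23
conv_reports, conv_rate = 307, 0.04

asr_errors = round(asr_reports * asr_rate)     # ~71 reports with major errors
conv_errors = round(conv_reports * conv_rate)  # ~12 reports with major errors

# Relative risk: ratio of the two error proportions.
relative_risk = (asr_errors / asr_reports) / (conv_errors / conv_reports)

# Odds ratio: ratio of the odds of a major error in each arm; this is the
# statistic usually behind "X times as likely" claims.
odds_ratio = (asr_errors / (asr_reports - asr_errors)) / (
    conv_errors / (conv_reports - conv_errors)
)

print(f"relative risk ~ {relative_risk:.1f}")  # ~5.9
print(f"odds ratio ~ {odds_ratio:.1f}")        # ~7.4
# The published eightfold figure presumably reflects the study's full
# model; this naive recalculation lands in the same ballpark.
```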
• “The Utility and Cost Effectiveness of Voice Recognition Technology in Surgical Pathology”: Walter Henricks, MD, a pathologist at the Cleveland Clinic and coauthor of this 2002 study published in Modern Pathology, notes that while speech recognition has made major inroads in radiology, it has not been integrated into pathology workflows with as much success. The genesis for the study was the researchers’ beliefs that speech recognition had the potential to improve efficiency and reduce transcription delays and costs within pathology operations.
The team investigated the utility and cost-effectiveness of targeted speech recognition in surgical pathology, focusing specifically on the grossing process. The pathology process begins with patient tissue arriving in the lab, at which time the “gross” description is dictated for reporting purposes. The report also includes the pathologist’s more descriptive findings and diagnosis.
“A lot of the gross description is repetitive,” Henricks says, adding that it was not difficult to set up templates within the speech recognition system that provided boilerplate descriptions. “We did not go into this study expecting to have to use free-text description.”
Instead of dictating the gross description of the report each time, professionals were able to call out numbers that corresponded to the boilerplate descriptions. Henricks notes that smaller specimens—approximately two-thirds of the total fell into this less complex category—were much more amenable to the process.
Templates for speech recognition were developed for all reports, and free-text speech entry was used to enter information not covered by templates. A computer program was written to analyze the number of lines of text entered, and overall cost savings relating to transcription services were calculated based on per-line costs from an outside agency.
According to Henricks, the study was completed over an 18-month period during which gross descriptions for an average of 5,617 specimens per month were entered using the speech recognition technology. Same-day processing was achieved even for specimens received after the previous day’s processing cutoff time, which averaged 35 specimens per day.
By focusing only on the gross description template process, monthly savings of $2,625 were achieved by having the technology generate an average of 23,864 lines of text; the implied per-line arithmetic is sketched after this study summary. Henricks says the pathology facility estimated a complete ROI on the cost of the technology could be achieved in 1.9 years.
Despite the study being 10 years old, Henricks is confident the findings remain pertinent. “I think the results and outcome would be generally applicable today,” he says. “There are ways in which the technology can be more integrated into the departmental systems rather than as an interfaced third party, but I think the other considerations would generally apply.”
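As a back-of-the-envelope check on those figures, the minimal Python sketch below derives the implied per-line transcription rate and the technology cost that a 1.9-year payback would suggest. Both derived values are inferences from the article’s reported numbers, not figures quoted in the study.

```python
# Back-of-the-envelope check on the pathology study's reported savings.
# Only the monthly savings, the line count, and the 1.9-year ROI figure
# come from the article; the per-line rate and technology cost are
# derived here, not quoted.

monthly_savings = 2_625.00   # dollars saved per month (reported)
lines_per_month = 23_864     # lines generated via speech recognition (reported)

implied_rate = monthly_savings / lines_per_month
print(f"implied outside-agency rate: ${implied_rate:.3f} per line")  # ~$0.110

payback_years = 1.9          # reported time to complete ROI
implied_tech_cost = monthly_savings * 12 * payback_years
print(f"implied technology cost: ~${implied_tech_cost:,.0f}")        # ~$59,850
```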
• “Lessons Learned From Implementation of Voice Recognition for Documentation”: When Bob Hoyt, MD, a physician with the School of Allied Health and Life Sciences at the University of West Florida, realized that the Navy had purchased a large number of speech recognition licenses, he felt it was a prime opportunity to study the technology’s effectiveness. Hoyt, a physician with the Naval Hospital Pensacola at the time, says it was a massive deployment spread over several specialties.
Published in 2010 in Perspectives in Health Information Management, the research—which did not consider ROI—evaluated the implementation process involved in deploying speech recognition to document outpatient encounters in the EHR at Naval Hospital Pensacola and its 12 outlying clinics. Hoyt says 75 clinicians volunteered to adopt the speech recognition technology, 64 of whom responded to an online postimplementation survey to identify variables related to acceptance or discontinuance.
“If this [implementation] was a huge success, the process could be deployed at other military hospitals,” he says.
The study analyzed variables such as user characteristics, training experience, logistics, and utility. The results showed a dropout rate of 31%, while the 69% who continued using the technology expressed satisfaction. Specifically, the satisfied users found the software to be accurate and faster than typing, allowing patient encounters to be closed on the same day of service. They also felt the overall process improved note quality.
Hoyt says the dropout rate was related primarily to inadequate training at an outlying clinic and decreased productivity associated with speech recognition errors. “We did not do a good job of follow-on training,” he notes, pointing out that while the researchers attempted to train the doctors as best they could, there was not a process for following up after the first phase of use. “I suspect that is typical since physicians tend to be resistant to adopting new things.”
What Does It All Mean?
Henricks points out that the pathology study purposefully focused on speech recognition use for noncomplex dictation that could easily be deposited into templates, avoiding the challenges associated with using speech recognition for free text. “We have not implemented [speech recognition] for the pathologist phase because we did not see the value of that,” he notes. “If we were relying on free text [speech recognition], I don’t think it would be as effective in our environment.”
In contrast, the breast imaging study encompassed more complex cases, where radiologists dictated free text and the software automatically generated a report on the computer screen. “The results of our study emphasize the need for careful editing of reports generated with [speech recognition]. They also show a strong need for standardized templates and use of structured reports, especially for breast MRI,” Scaranelo said in the press release.
Currently, there are two approaches to using speech recognition in a healthcare setting. Front-end speech recognition refers to the process where a clinician dictates directly into a speech recognition engine and is responsible for editing the document to ensure accuracy. In back-end speech recognition, providers dictate into an electronic system that drafts a document and delivers it to an editor, who finalizes the report.
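In software terms, the two workflows differ only in who performs the editing step. The minimal Python sketch below models that distinction; every name in it is hypothetical rather than drawn from any vendor system.

```python
from dataclasses import dataclass

# Illustrative model of the two workflows described above; all names
# here are hypothetical, not from any vendor product.

@dataclass
class Report:
    text: str
    dictated_by: str
    finalized_by: str = ""

def recognize(audio: bytes, clinician: str) -> Report:
    """Stand-in for the speech recognition engine producing a raw draft."""
    return Report(text="<recognized text>", dictated_by=clinician)

def front_end(audio: bytes, clinician: str) -> Report:
    # Front end: the dictating clinician edits and signs the draft directly.
    report = recognize(audio, clinician)
    report.finalized_by = clinician
    return report

def back_end(audio: bytes, clinician: str, editor: str) -> Report:
    # Back end: the draft is routed to a transcriptionist-editor to finalize.
    report = recognize(audio, clinician)
    report.finalized_by = editor
    return report
```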
While many national initiatives, such as meaningful use, are pushing the industry to use more front-end workflow strategies with speech recognition, Henricks believes this movement may not be the most judicious approach. “Pathologists need to be doing pathology work,” he says. “The more clerical work you transfer to physicians, the less productive they are. It’s a distraction and takes away from their role as physicians.”
Reiterating the need for careful editing, Henricks also suggests that one of the dangers of speech recognition is that errors are not always easy to identify because the technology always uses a “real word.” “To the human eye, that doesn’t stand out as much as a true error,” he explains.
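The point is easy to demonstrate: because a recognition error typically substitutes or drops a valid word, a dictionary-based spell-check passes the corrupted report as readily as the correct one. The vocabulary and sentences in this minimal sketch are illustrative assumptions, not drawn from the studies.

```python
# Why recognition errors hide from spell-checkers: the substituted or
# dropped words are all "real," so only a reader who understands the
# clinical meaning catches the slip.

vocabulary = {"the", "biopsy", "shows", "no", "malignancy",
              "mass", "measures", "5", "millimeters", "centimeters"}

def spell_check(sentence: str) -> bool:
    """Return True if every word is in the vocabulary (i.e., 'looks fine')."""
    return all(word in vocabulary for word in sentence.lower().split())

dictated   = "the biopsy shows no malignancy"
recognized = "the biopsy shows malignancy"      # dropped "no"
unit_ok    = "mass measures 5 millimeters"
unit_bad   = "mass measures 5 centimeters"      # wrong unit, tenfold error

for text in (dictated, recognized, unit_ok, unit_bad):
    print(spell_check(text), "-", text)
# All four lines print True: spelling alone cannot flag the corrupted
# reports, which is why careful human editing remains necessary.
```

The corrupted sentences above are grammatically and lexically plausible; only clinical context reveals the error, which is precisely the editing burden both Henricks and the breast imaging researchers emphasize.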
Hoyt points out that a key premise behind pairing speech recognition with EHRs is to alleviate what many physicians in the study identified as an overabundance of time spent entering data, a burden that ultimately leads to lower adoption rates. Improvements to the technology’s speed and accuracy could further reduce one of the key barriers to EHR adoption, he says.
While the front-end rollout of speech recognition proved advantageous for many physicians at Naval Hospital Pensacola, Hoyt points to the need for solid ongoing monitoring and training of physicians. “It’s important that you have in mind what you are going to do each week for those struggling,” he says, adding that he believes such a strategy could have cut the number of failures in half. “Anytime you introduce something new, you will get good success with early adopters. You have to turn your attention to those who don’t like change.”
While foreign accents did not appear to be a factor in the breast imaging study, Hoyt notes that they did come into play in the Naval hospital study. “For physicians with a foreign accent, it was more of a struggle,” he says. Study results revealed that 35% of those who discontinued use of speech recognition cited “failure to recognize the user’s voice” as the reason.
Looking ahead, Hoyt and Henricks believe that transitioning transcriptionists to assume more of an editor role could become a staple of speech recognition deployment. Hoyt believes that the use of transcription services will decrease but not disappear entirely, especially as electronic systems begin to generate elements such as continuity-of-care documents.
There’s also the matter of changing physician habits. “There are always going to be some physicians who just prefer transcription, even with its shortcomings,” Hoyt says. “Whether hospitals use transcriptionists for editing remains to be seen.”
— Selena Chavis is a Florida-based freelance journalist whose writing appears regularly in various trade and consumer publications covering everything from corporate and managerial topics to healthcare and travel.