How Well Do You Measure Up? — Gauging Transcription Efforts Against the New Industry Standards
By Dale Kivi, MBA
For The Record
Vol. 20 No. 22 P. 10
In part one of a two-part series, learn about the evolution of line counts and other nuances that have perplexed transcription folks for years.
It doesn’t matter whether your management expertise was gained through an RHIA or RHIT program, Six Sigma training, Total Quality Management, the school of hard knocks, or any other life experience—everyone accepts that you can’t manage what you can’t measure. It is equally expected that while figures don’t lie, they can create inaccurate impressions.
How can lower line rates result in higher costs for the same document? Where do the quality data that demonstrate compliance with a 98% accuracy requirement come from when so many reports are clearly unacceptable?
Until recently, the medical transcription industry had few, if any, universally applied quantifiable standards, making it nearly impossible to measure the performance of one transcription approach against another. Adding to the confusion, service and technology vendors seem to do their best to prevent apple-to-apple pricing or quality program comparisons, no matter how well you craft your request for proposals.
Fortunately, with the soon-to-be-released standards for cost, quality, and turnaround time (TAT) backed by the American Health Information Management Association (AHIMA), the Medical Transcription Industry Association (MTIA), and the Association for Healthcare Documentation Integrity (AHDI), definitive scales are now available to help gauge how your transcription approach measures up. Analysis of your transcription process against these new measurement scales, along with a disciplined approach to ensure that improvements made to one aspect of the process do not adversely affect other stages, now enables HIM professionals to critically review their efforts and drive quantifiable improvements relative to agreed-upon industry standards.
The Need for Industrywide Standards
There are those who are convinced that transcription has become a commodity, a noncore, back-office business activity that can be managed well enough by any of the dozens, if not hundreds, of vendors vying for your business. Advertisements, white papers, and testimonials all promise better, faster, and cheaper solutions. However, the results are rarely so consistently black and white.
Unfortunately, the school of hard knocks continues to teach painful lessons. Industry insiders all know of bargain-basement service vendors who, although they may deliver jobs quickly and claim to be within quality standards, seem to lose clients as quickly as they are gained when physicians revolt against unacceptable quality. On the other hand, colleagues from the IT department may tout the guaranteed productivity improvements for transcriptionists gained through speech recognition without realizing that the added costs of upgraded technology, service contracts, and per-physician licensing fees have, in some circumstances, been shown to more than outweigh any labor savings.
So before chasing off after industry hype (or being pushed by your chief financial officer or another interested administrator), it is critically important to define a baseline on your cost, quality, and TAT performance with the new measurement scales. Evaluating your operation with these standard units of measure and including detailed performance requirements based on the same auditable scales in your requests for proposals empower you to compare and contrast competing approaches in an unbiased manner and drive measurable process improvements.
Cost, quality, and TAT improvements are certainly achievable in this highly competitive marketplace. Your ability to understand and apply these new standards, however, may be the difference between achieving your expected results and learning expensive lessons.
Cost Control: Volume Measure Standards
Most medical transcription managers and practitioners are familiar with the 65-character line as the expected unit of measure for production volume in the industry. This historic measurement standard is rooted in what now seems like archaic technology: the IBM Selectric typewriter. Since its inception, major technological advancements have been followed by less obvious changes in how such a line is defined, resulting in ambiguity.
As transcriptionists migrated from hourly pay to production-based wages, practitioners began counting the number of pages or the number of horizontal lines of text within a document as their production volume scales. Unfortunately, it didn’t take long for transcriptionists to realize that if they brought the margins in a bit or used a larger font, their volume counts would increase for the same amount of work.
To circumvent such manipulations, a standard line was defined based on the ubiquitous production tool of the time, the IBM Selectric. This brand of electric typewriter produces 10 characters per inch with a 10-point font. When combined with an 8.5- by 11-inch sheet of paper bordered by one-inch margins, the result was a maximum of 65 printed characters and/or spaces per horizontal line. Thus, the 65-character line was born (10 characters per inch multiplied by 6.5 inches of line width equals 65 characters per line).
Applying this common tool and agreed-upon page restrictions, practitioners could simply count the number of horizontal lines within a document and compare production volumes with reasonable accuracy. This legacy-driven scale, now referred to as a “gross” line calculation, considers each horizontal line with characters equally, regardless of the actual number of characters it contains.
As the industry graduated into the software era, technology enabled more precise calculation options. Using the 65-character line as the starting point, software was able to count the total number of characters within a document and divide it by 65. This is considered the 65-character “net” line definition, which eliminated concerns over giving full credit for horizontal lines with limited characters due to end of paragraphs, tabs, or formatting issues.
As transcriptionists and vendors realized their net line volumes were lower than their gross line volumes for the same document, they pushed for calculations that more precisely reflected their effort by counting keystrokes. After all, if they had to capitalize a letter, it took an effort of two keystrokes (the letter and the shift key). It also took more time to align content into tables or signature blocks, so those “formatted” characters, they logically argued, should also be worth a premium. This effort led to the 65-character net line “with formatting” definition (also referred to as the keystrokes method), which gives premium credit for any characters capitalized, bolded, underlined, or formatted.
Such gross line, net line, and keystroke calculations are supported by essentially every transcription-specific typing application. Unfortunately, with so many conflicting volume measurement scales that can loosely be referred to as a 65-character line, manipulation and abuse began to occur when industry participants focused solely on the cost per line and not the specific line definition being applied.
Consequently, MTIA, with the support of the AHIMA, embarked on an effort to settle on a single volume measurement standard that was easily auditable and unambiguous. The result was the visible black character (VBC) method, which gives credit for every printed character that appears on the document, with each having a value of one.
Every method other than the VBC standard can be (and has been) manipulated at one point or another. Gross lines have been inflated by cheating on the margins or increasing font sizes. Sixty-five–character net lines have been supplemented with unseen extra spaces, tabs, and carriage returns. The keystroke/net lines with formatting method, which is driven by values assigned per ASCII character, has been known to be inflated by setting the credit for certain characters to inappropriate values. After all, how would you know (or how could you catch in an audit) whether the period key always counted as five characters?
In the end, the driving force behind the MTIA and AHIMA effort was to define a single, unambiguous, industrywide volume measurement scale that circumvented manipulation. With the VBC standard, a character is either printed on the page or isn’t—there is no extra credit for formatting, unseen spaces, programming language that accompanies each document file, etc. It’s a simple, definitive volume calculation that will be the same regardless of the software used to generate or audit the document.
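To make the practical difference between these definitions concrete, the following minimal sketch counts the same passage four ways. The function names, the sample text, and the keystroke premium used (capitals counted as two keystrokes) are illustrative assumptions only; actual transcription platforms assign their own premium values and counting rules.

```python
# Illustrative comparison of the legacy volume measures and the VBC count
# for the same block of text. Names and the keystroke premium are assumptions
# for demonstration only, not values prescribed by any standard.

def gross_lines(text):
    """Gross lines: every horizontal line containing at least one character."""
    return sum(1 for line in text.splitlines() if line.strip())

def net_65_lines(text):
    """65-character net lines: total characters (including spaces) divided by 65."""
    chars = sum(len(line) for line in text.splitlines())
    return chars / 65

def keystroke_65_lines(text):
    """Net lines 'with formatting': premium credit for characters that take an
    extra keystroke, e.g., capitals (letter + shift). Premiums vary by platform."""
    keystrokes = 0
    for ch in text:
        if ch in "\r\n":
            continue
        keystrokes += 2 if ch.isupper() else 1
    return keystrokes / 65

def visible_black_characters(text):
    """VBC: every printed (non-whitespace) character counts as exactly one."""
    return sum(1 for ch in text if not ch.isspace())

sample = "CHIEF COMPLAINT:  Chest pain.\nThe patient is a 67-year-old male."
print(gross_lines(sample), round(net_65_lines(sample), 2),
      round(keystroke_65_lines(sample), 2), visible_black_characters(sample))
```

Running the sketch on any real report shows how the same document yields four different “volumes,” which is exactly the ambiguity the VBC standard is meant to remove.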
Figure 1 highlights the various “industry standard” volume definitions enabled by virtually every transcription platform or basic typing application (including Microsoft Word).
AHIMA/MTIA Transcription Volume Measurement Standard: VBC
A VBC is defined as any printed letter, number, symbol, and/or punctuation mark excluding any and all formatting (eg, bold, underline, italics, table structure, or formatting codes). All VBCs can be seen with the naked eye as a mark regardless of whether they are viewed electronically or on a printed page.
Practical Considerations: Volume Measurement/Cost
For comparative purposes, Figure 2 presents equivalent per-line/per-character pricing calculations for the various legacy counting practices and the AHIMA/MTIA-supported VBC standard. The mathematical conversions used between the competing methods are included in the far-right column for easy reference. Note that due to the inherent industry tendency to define volume from a 65-character–line perspective, even when the VBC method is employed, some buyers and vendors alike prefer to multiply the individual VBC character value by 65.
The chart range covers typical entry-level, domestic transcriptionist pay scales on the left up to reasonably competitive domestic service rates on the right. The highlighted row for the 65-character net line definition (without formatting) is used as the base reference, as it had been the predominant standard used within the industry prior to the AHIMA/MTIA-supported VBC white paper recommendation. (Example: If you have a contract for 16 cents per 65-character line including spaces, tabs, and returns, you will see the same cost per document if you establish a VBC contract at $0.00297 per character.)
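The conversion arithmetic behind Figure 2 can be sketched as follows. The visible-character ratio used here (roughly 54 visible characters per 65-character net line counted with spaces) is an assumption inferred from the example rates quoted above, not a published constant; your own ratio should come from an audit of your actual document mix.

```python
# A minimal sketch of the price-equivalence arithmetic behind Figure 2.
# VISIBLE_CHARS_PER_NET_LINE is an assumed average inferred from the article's
# example (16 cents per 65-character net line ~= $0.00297 per VBC); replace it
# with a ratio measured from your own documents.

VISIBLE_CHARS_PER_NET_LINE = 53.9  # assumed visible (non-space) characters per
                                   # 65-character line counted with spaces

def net_line_rate_to_vbc_rate(rate_per_65_char_line):
    """Convert a price per 65-character net line (spaces included)
    to an equivalent price per visible black character."""
    return rate_per_65_char_line / VISIBLE_CHARS_PER_NET_LINE

def vbc_rate_to_net_line_rate(rate_per_vbc):
    """Convert a per-VBC price back to an equivalent 65-character net line price."""
    return rate_per_vbc * VISIBLE_CHARS_PER_NET_LINE

print(round(net_line_rate_to_vbc_rate(0.16), 5))     # ~0.00297 per VBC
print(round(vbc_rate_to_net_line_rate(0.00297), 3))  # ~0.16 per net line
```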
Quality Control: Statistically Valid Assessments
It seems every outsourced transcription service contract or direct staff member employment agreement includes a requirement to maintain a 98% accuracy level for the documents produced. And although the AHDI-supported error values used to establish scores have been relatively stable and reasonably consistently applied, different interpretations of what constitutes 98% certainly exist. After all, should an acute care report (averaging 55 lines of volume) with three error point deductions be considered equivalent to a radiology report (averaging 10 lines of volume) with the same number of deductions? Moving forward, the AHDI quality best-practices document recommends these targets be expressed as a score of 98 (as opposed to 98%) to avoid such confusion.
The AHDI quality report recognizes that assessments do not produce a percentage of accuracy, as has long been the accepted industry expression. Instead, assessment scores indicate the point values subtracted from the number of lines assessed or point values subtracted per document. Such scores cannot be percentages, since error values are in no way directly proportional to per-line or per-document volume measurements (which vary constantly anyway). Consequently, individual document assessments and/or target average objectives should be expressed simply as quality scores. This clarification makes the ubiquitous requirement of maintaining a 98% accuracy level moot, although a department could express an average score of 98 as a valid performance expectation.
Quality assessment and quality management of healthcare documentation play a vital role in today’s healthcare marketplace. The quality of the transcribed document directly affects patients by impacting medical decisions and contributing to continuity of care, as well as affecting provider reimbursements. Patient safety issues and rising healthcare costs are driving HIM professionals, accrediting bodies, and healthcare compliance agencies to better ensure the accuracy and clarity of medical records.
To ensure that assessment scores for each employee, department, and vendor reflect consistent evaluation criteria, the AHDI has developed a best-practices document on healthcare documentation quality assessment and management. This effort is intended to provide the industry with repeatable quality program tools for error scoring (assessment) and sample set selection (statistical validity), along with other quality program management standards.
From a practical standpoint, such statistically valid assessments are required to drive true quality management, such as targeted training, quality bonuses, or corrective actions such as penalties within vendor contracts, to minimize, if not eliminate, the recurrence of identified deficiencies.
Quality Assurance Scoring Practices
The AHDI quality assessment and management best-practices document offers three scoring formulas based on department management preferences; a worked sketch of all three follows the list below.
1. Error value/volume: Scores are based on error values correlated to the total line counts. For example, if 1,000 lines are assessed during an evaluation period with a combined error value of 30, the resulting quality score would be 97. This is calculated as follows: 1,000 − 30 = 970; (970/1,000) × 100 = 97. This method levels the playing field for transcriptionists who specialize in work types with predominantly short (radiology) or long (psychology) documents.
2. Error value from 100 by document: Scores are based on error values subtracted from 100 for each document, regardless of document size. This method is recommended for acute care reports, which constitute the predominant volume and typically average slightly more than fifty 65-character lines per document. If the same volume analyzed above comes from a sample set of 20 reports (averaging 50 lines each), the average deduction would be 1.5 points per report (30 error value points divided among 20 reports).
To compare directly with the value from 100 method, the error point totals from the value by volume method must be doubled. With the example used, doubling the average value by volume deduction of 1.5 yields 3, which leads to the same “score” of 97.
3. Pass/fail: A pass/fail assessment is based on a maximum number of error types per document (recommended for department scoring only). Based on the number of critical, major, and minor errors, the document either passes or fails. The department standard is for 98% of reviewed documents to pass. This method is advantageous for departments that do not measure production with line counts or that use multiple line-counting methods.
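The following sketch applies the three formulas to the example numbers above. The function names are illustrative, and the pass/fail thresholds are taken from Figure 3; this is not prescribed code from the AHDI document.

```python
# A sketch of the three scoring formulas described above, exercised with the
# article's example numbers. Structure and names are illustrative only.

def score_value_per_volume(lines_assessed, total_error_value):
    """Method 1: error value correlated to total line count."""
    return (lines_assessed - total_error_value) / lines_assessed * 100

def score_value_from_100(total_error_value, documents_assessed):
    """Method 2: error value subtracted from 100 per document (averaged).
    Error values on this scale are double the value/volume values."""
    return 100 - (total_error_value / documents_assessed)

def passes(critical_errors, major_errors, minor_errors):
    """Method 3: a document fails with 1+ critical, 3+ major, or 9+ minor errors."""
    return not (critical_errors >= 1 or major_errors >= 3 or minor_errors >= 9)

# Method 1: 1,000 lines assessed with a combined error value of 30 -> 97.0
print(score_value_per_volume(1_000, 30))

# Method 2: the same errors scored across 20 reports; the error values are
# doubled for this scale (30 -> 60), giving the same score of 97.0
print(score_value_from_100(30 * 2, 20))

# Method 3: a report with 0 critical, 2 major, and 4 minor errors passes -> True
print(passes(critical_errors=0, major_errors=2, minor_errors=4))
```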
Contracts and employment agreements should specify which method will be used, sampling rates, and expected performance levels. (Some vendors, for example, include sliding scales for deductions off their base line rate if documents fall under quality targets.) Errors should also be tallied by category (critical, major, minor) and type to identify patterns and drive employee development efforts/vendor management.
Error Values
Figure 3 highlights common transcription errors and their respective quality value deductions. Additional error types to support department quality, such as habitual abuse of manually sending documents into the quality assurance (QA) pool, are detailed in the complete AHDI document, which will be available on the AHDI Web site (www.ahdionline.org).
Statistically Valid Sample Sets
The volume of reports required to obtain a statistically valid sample has long been debated within the transcription community. Many practitioners accept the “industry standard” of 3%. However, it is expected that new transcriptionists (either fresh out of school or new to a service or in-house department) will begin their assignment with 100% of their work reviewed until their performance justifies recalibration.
Similarly, many sampling levels are not justified from a statistical perspective but are instead based on the cost allotted to quality management efforts within the outsourced service vendor’s or in-house department’s business model, or on how QA sampling has traditionally been done at the facility in parallel HIM applications, such as coding.
Practical Guidelines for Sample Size Determination
The objective of any transcription QA effort is to have the number of reviewed reports be large enough to accurately reflect the collective total while not being so large that staff expend unnecessary resources reviewing documents that will not have a material influence on the score determined by a smaller sample set.
When defining your sample size, it is important to consider all the ways documents end up in the hands of the QA/editorial staff. In typical department workflow, reports can be routed into QA review from the following three potential sources. Balanced totals from all sources constitute the complete sample set:
• reports randomly pulled from a medical transcriptionist for in-process QA auditing;
• reports returned by physicians due to errors or corrective edits (vs. new input); and
• reports pulled postdistribution for employee review/development.
Classic “statistics textbook” approaches for determining sample sizes that will produce 95% confidence levels (meaning you can be 95% sure the average scores for the selected reports will be equal to the scores for the entire group of reports if you tested them all) are detailed in the AHDI document, as well as margin of error and standard deviation discussions relevant to proper sampling.
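As a rough illustration, one common textbook formula for the number of documents needed to estimate an average score within a chosen margin of error is sketched below. This is generic sampling statistics, not necessarily the exact method detailed in the AHDI document, and the standard deviation and margin of error shown are assumed values.

```python
# A minimal sketch of a standard sample-size formula for estimating a mean
# quality score at a chosen confidence level. The inputs below are assumptions
# for illustration; actual values should come from your own score history.

import math

Z_SCORES = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}  # two-sided critical values

def required_sample_size(confidence, std_dev, margin_of_error):
    """n = (z * sigma / E)^2, rounded up to the next whole document."""
    z = Z_SCORES[confidence]
    return math.ceil((z * std_dev / margin_of_error) ** 2)

# Example: if per-document scores vary with a standard deviation of 2.5 points
# and you want the average within +/-1 point at 95% confidence -> 25 documents
print(required_sample_size(0.95, std_dev=2.5, margin_of_error=1.0))
```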
AHDI QA Standard: Required Sample Set Size for Statistical Validity
The recommended production volume to be considered for a statistically valid quality assessment of a full-time transcriptionist is 1,350 lines or 30 documents per quarter, whichever is greater. The AHDI recognizes there are real-world restrictions to being able to achieve optimal sampling. When that is the case, departments need to have mutual agreements on the confidence level, margin of error, and standard deviation that their scores represent to interested contractual or administrative parties.
Practical Considerations: Quality
It is important to acknowledge that many factors that can negatively influence report quality are not transcriptionist dependent. Aspects such as dictator articulation and/or accent, audio quality, and the accuracy of the demographics delivered by the admission, discharge, and transfer (ADT) feed can hinder report quality. Other process characteristics, such as diversity of content (extensive work types, localisms such as street or business names for clients being supported in an unfamiliar location, etc) and any facility-specific conventions that may conflict with published AHDI (formerly the American Association for Medical Transcription) guidelines (Roman numerals vs. Arabic numerals, bullets vs. dashes, acceptable abbreviations, etc), can substantially impact the quality scores of even the most qualified and skilled transcriptionist.
Various tools that support quality management are available for use both before and after documents are finalized. The BenchMark KB Knowledge Base (www.interfix.biz), developed in partnership with the AHDI, incorporates real-time access to extensive transcription reference materials such as Stedman’s medical reference library, the AHDI Book of Style for Medical Transcription, a national database of more than 850,000 physicians, and other important reference materials.
Proven by thousands of practitioners to reduce QA markers by 20% to 30%, BenchMark KB is available as a standalone product accessed through a separate browser window or directly integrated into leading transcription software applications. During editing, products such as QA Navigator (www.qanavigator.com) can be used to prompt editors with listings of error types and values. Selected issues are then compiled into a database for extensive tracking and reporting to guide professional development as well as trending issues relative to scores, dictator, report type, site, specialty, editor, etc.
— Dale Kivi, MBA, is director of business development for FutureNet Technologies Corporation and has served on the Medical Transcription Industry Association Billing Methods Principles workgroup, as well as the steering committee for the Association for Healthcare Documentation Integrity Quality Assessment Special Interest Group.
Coming Next Issue: Learn how to take advantage of industry TAT trends and the new cost and quality standards from the AHIMA, MTIA, and the AHDI.
How Well Do You Measure Up? Tables
Figure 1. Commonly Used Medical Transcription Volume Measurement Scales
Figure 2. Conversion Chart for Equivalent Transcription Prices Expressed in Terms of the Different Volume Measurement Scales
Figure 3. Quality Assessment Table for Error Levels, Types, and Values
Error Level | Error Type | Value/Volume Method | Value from 100 Method | Pass/Fail Method
Critical Error | | 4 | 8 | Document fails with one critical error or more
Major Error | | 1.5 | 3 | Document fails with three major errors or more
 | | 1 | 2 |
Minor Error | | .5 | 1 | Document fails with nine minor errors or more
 | | .25 | .5 |
 | | 0 | 0 |