Nevertheless, one very large problem still confronts those mad scientists
who would make the above vision a reality rather than merely the
transcriptionists' greatest nightmare, and that problem is mathematical. It
is great enough that voice recognition technology cannot replace the
majority of medical transcriptionists for years to come, although this
technology will certainly redefine our jobs in the next decade. To
understand the reason, one must understand a little about a linguistic
concept, the phoneme, and a little about how voice recognition technology
works.
A phoneme is the smallest recognizable, discrete sound element in a
language. The English language has roughly forty to forty-five phonemes,
depending on the dialect or accent of the speaker. All words (and phrases)
are composed of combinations of these phonemes.
The computer receives the dictator's utterances via a microphone, and the
voice recognition software parses, or divides, them into their constituent
phonemes. It then translates these phonemes into computer code, a series of
zeros and ones, the only language a computer can understand. The software
then matches these patterns of ones and zeros against a predefined list of
words and phrases, called a dictionary, which is stored in the computer.
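
The matching step described above can be sketched in a few lines of Python.
This is only an illustration, not how any particular product works: the
phoneme spellings, the tiny DICTIONARY, and the use of edit distance as the
measure of "closeness" are all assumptions made for the example.

```python
# Hypothetical phoneme spellings for a toy dictionary; a real dictionary
# would hold several hundred thousand words and phrases.
DICTIONARY = {
    "the": ["DH", "AH"],
    "woman": ["W", "UH", "M", "AH", "N"],
    "within normal limits": [
        "W", "IH", "DH", "IH", "N", "N", "AO", "R", "M", "AH", "L",
        "L", "IH", "M", "AH", "T", "S",
    ],
}


def edit_distance(a, b):
    """Count the insertions, deletions, and substitutions needed to turn
    phoneme sequence a into phoneme sequence b (Levenshtein distance)."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # delete pa
                            curr[j - 1] + 1,      # insert pb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def closest_entry(heard, dictionary=DICTIONARY):
    """Return the dictionary entry whose phonemes are closest to `heard`."""
    return min(dictionary,
               key=lambda entry: edit_distance(heard, dictionary[entry]))


# One vowel misheard, yet "woman" is still the nearest entry.
print(closest_entry(["W", "UH", "M", "IH", "N"]))
```

Note that every lookup compares the utterance against every entry, so the
work grows with the size of the dictionary.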
This dictionary may contain several hundred thousand words and phrases,
including common words such as "the" and "woman," medical words such as
"cholelithiasis," and common phrases and boilerplate such as "within normal
limits." After finding the closest match, the voice recognition software
displays the corresponding English word or phrase on the computer screen and
asks the dictator whether this is the correct choice. If the dictator flags
the choice as incorrect, the program displays the next closest match. At any
point, the dictator can make a final choice from a menu of likely
alternatives by mouse, keystroke, or voice command.
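
The confirm-or-try-again loop just described might look something like the
following sketch. The function names, the `scores` mapping (lower meaning a
closer match), and the `is_correct` callback standing in for the dictator's
response are all hypothetical, chosen only to illustrate the idea.

```python
import heapq


def ranked_alternatives(scores, limit=5):
    """Return up to `limit` entries, ordered from closest match (lowest
    score) to farthest. This is the menu of likely alternatives."""
    best = heapq.nsmallest(limit, scores.items(), key=lambda kv: kv[1])
    return [entry for entry, _ in best]


def confirm(scores, is_correct, limit=5):
    """Offer candidates one at a time, closest first. Return the first one
    the dictator accepts, or None if every candidate is rejected."""
    for candidate in ranked_alternatives(scores, limit):
        if is_correct(candidate):
            return candidate
    return None


# Made-up match scores for three similar-sounding medical terms.
scores = {
    "cholelithiasis": 1,
    "cholecystitis": 3,
    "choledocholithiasis": 2,
}

# The dictator rejects the first two suggestions and accepts the third.
print(confirm(scores, lambda word: word == "cholecystitis"))
```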