And yet, there still remains one large stumbling block to the development of
the "robot transcriptionist," which makes your replacement by a computer
unlikely in the very near future. The Achilles heel is that voice recognition
software cannot determine where a word or phrase begins and ends, which rules
out "continuous speech recognition." That is why the dictator must pause
between each word or predefined phrase. In natural speech, however, we do not
pause between words, but let them flow together in a connected series of
sounds. And medical schools do not let one graduate unless one can rapidly
utter an astonishing variety of syllables in a monotone delivery such that all
forty or so English phonemes greatly resemble one another. This makes
determining where a word begins and ends difficult enough for the human; for
the computer, it is impossible now and for the foreseeable future.
To understand the magnitude of this problem, we need to simplify it.
Therefore, let us look at a fictitious language which has a limited supply of
phonemes such that there are only four possible syllables and no word longer
than five syllables, no sound-alike words (homonyms such as two, to, and too),
and dictators who precisely pronounce each word. And let us represent each
syllable with a number, either one, two, three, or four.
In our imaginary language, there could be only four one-syllable words
(remember, no homonyms). These monosyllabic words could only be the equivalent
of one, two, three, or four. However, there are sixteen possible two-syllable
words. Let us look at each two-syllable word that begins with the syllable
one. We have the series one-one, one-two, one-three, and one-four. That
exhausts the possibilities of two-syllable words beginning with one. There are
also four possible two-syllable words beginning with two, i.e., two-one,
two-two, two-three, and two-four. This process could be repeated for
two-syllable words beginning with three and four.
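
For readers who would rather let a machine do the listing, here is a minimal
sketch in Python of the enumeration just described. The names SYLLABLES and
words_of_length are invented for this illustration only; the sketch assumes,
as the imaginary language does, that every combination of syllables is a
possible word.

```python
from itertools import product

# The four syllables of the imaginary language (illustrative names).
SYLLABLES = ["one", "two", "three", "four"]


def words_of_length(n):
    """Return every possible n-syllable word as a hyphen-joined string."""
    return ["-".join(combo) for combo in product(SYLLABLES, repeat=n)]


two_syllable_words = words_of_length(2)

# Words beginning with "one" come first, then "two", and so on.
for word in two_syllable_words:
    print(word)

print(len(two_syllable_words))  # 16
```

The list starts with one-one, one-two, one-three, and one-four, then moves on
to the words beginning with two, exactly as in the tally by hand.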
When we finish the tally of all two-syllable words in this imaginary language,
we have a total of 4 to the 2nd power, or sixteen. Adding the one-syllable
words and two-syllable words together, we come up with 4 to the 1st power plus
4 to the 2nd power, or 4 + 16, or 20. And by sitting down with pencil and
paper, we could show that there would be 4 to the 3rd power or 64
three-syllable words, 4 to the 4th power or 256 four-syllable words, and 4 to
the 5th power or 1024 five-syllable words. Thus our extremely limited language
of four syllables and words no longer than five syllables would have a
potential vocabulary of 4 + 16 + 64 + 256 + 1024, or 1364 words.
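
The pencil-and-paper arithmetic can be checked with a short Python sketch.
The function name vocabulary_size and its parameters are made up for this
illustration; it simply adds up 4 to the nth power for word lengths one
through five, on the assumption that every syllable combination counts as a
word.

```python
def vocabulary_size(syllable_count, max_word_length):
    """Count the possible words of each length and their grand total.

    Assumes every sequence of syllables is a distinct, valid word.
    """
    counts = {
        length: syllable_count ** length
        for length in range(1, max_word_length + 1)
    }
    return counts, sum(counts.values())


counts, total = vocabulary_size(syllable_count=4, max_word_length=5)

for length, count in counts.items():
    print(f"{length}-syllable words: {count}")

print(f"potential vocabulary: {total}")  # 1364
```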