Copyright © 1992 by Alan Stancliff. All rights reserved.

And yet, there still remains that one large stumbling block to the development of the
"robot transcriptionist" which makes your replacement by a computer unlikely in the
very near future. The Achilles' heel is that the voice recognition software cannot
determine when a word or phrase begins and ends, thus eliminating "continuous
speech recognition." That is why the dictator must pause between each word or
predefined phrase. And in natural speech, we do not pause between words, but let
them flow together in a connected series of sounds. And medical schools do not let
one graduate unless one can rapidly utter an astonishing variety of syllables in a
monotone delivery such that all fifty or so English phonemes greatly resemble one
another. This makes determining where a word begins and ends difficult enough for
the human; for the computer, it is impossible now and for the foreseeable
future.

To understand the magnitude of this problem, we need to simplify it. Therefore, let us
look at a fictitious language which has a limited supply of phonemes such that there
are only four possible syllables and no word greater than five syllables, no sound-alike
words (homonyms such as two, to, and too), and dictators who precisely pronounce
each word. And let us represent each syllable with a number, either one, two, three, or
four.

In our imaginary language, there could only be four one-syllable words (remember, no
homonyms). These monosyllabic words could only be the equivalent of one, two,
three, or four. However, there are sixteen possible two-syllable words. Let us look at
each two-syllable word that begins with the syllable one. We have the series one-one,
one-two, one-three, and one-four. That exhausts the possibilities of two-syllable
words beginning with one. There are also four possible two-syllable words beginning
with two, i.e. two-one, two-two, two-three, and two-four. This process could be
repeated for two-syllable words beginning with three and four.
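
The enumeration above can be sketched in a few lines of Python. This is only an illustration of the counting argument; the names SYLLABLES and words_of_length are made up for the example, and the hyphenated spelling of each word simply follows the one-one, one-two convention used in the text.

```python
from itertools import product

# The four syllables of the imaginary language (names follow the text).
SYLLABLES = ("one", "two", "three", "four")


def words_of_length(n):
    """Return every possible word made of exactly n syllables.

    Hypothetical helper for this illustration: each word is written as
    its syllables joined by hyphens, e.g. "one-three".
    """
    return ["-".join(combo) for combo in product(SYLLABLES, repeat=n)]


two_syllable_words = words_of_length(2)

# The two-syllable words that begin with the syllable "one".
starting_with_one = [w for w in two_syllable_words if w.startswith("one-")]

print(starting_with_one)
print(len(two_syllable_words))
```

Repeating the filter for "two-", "three-", and "four-" gives four words each time, which is where the figure of sixteen comes from.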

When we finish the tally of all two-syllable words in this imaginary language, we have
a total of 4 to the 2nd power, or sixteen. Adding the one-syllable words and two-syllable words
together, we come up with 4 to the 1st power plus 4 to the 2nd power or 4 + 16 or
20. And by sitting down with pencil and paper, we could show that there would be 4
to the 3rd power or 64 three-syllable words, 4 to the 4th power or 256 four-syllable
words, and 4 to the 5th power or 1024 five-syllable words. Thus our extremely limited
language, with only four syllables and no word longer than five syllables, would have a
potential vocabulary of 4 + 16 + 64 + 256 + 1024, or 1364 words.
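
For readers who would rather not sit down with pencil and paper, the same tally can be checked with a short Python sketch. The constant names are invented for this example; the only assumptions are the ones stated in the text, namely four syllables and words of one to five syllables.

```python
# Assumptions taken from the text: four syllables, words of 1 to 5 syllables.
NUM_SYLLABLES = 4
MAX_WORD_LENGTH = 5

# Number of possible words for each word length: 4 to the nth power.
counts = {n: NUM_SYLLABLES ** n for n in range(1, MAX_WORD_LENGTH + 1)}

# The potential vocabulary is the sum over all word lengths.
total = sum(counts.values())

for n, count in counts.items():
    print(f"{n}-syllable words: {count}")
print(f"potential vocabulary: {total}")
```

Changing NUM_SYLLABLES or MAX_WORD_LENGTH shows how quickly the total grows as the language becomes less restricted.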