Now let us look at a dictation in this imaginary language. Suppose this
dictation is 100 lines containing 1000 words of various lengths, and the
number of syllables in the dictation is 2500, a convenient average of 2.5
syllables per word. Let us also assume that the dictionary has the full
complement of 1364 distinct words possible in this imaginary language.
If the dictator pauses between words, the computer simply takes each word
and compares it to its list to find the closest match. Because it is looking
for the closest match rather than an exact one, it has to go through the
entire list for each word. The comparison will therefore be done 1000 x 1364,
or 1,364,000, times, not too large a task for our microcomputer.
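The exhaustive scan described above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the text does not say
how "closeness" is measured, so the mismatches function below is a
hypothetical stand-in, and the sample words are invented for the example.

```python
def closest_match(heard, dictionary, distance):
    """Scan the whole dictionary and return (best_entry, comparisons_made)."""
    best, best_score, comparisons = None, None, 0
    for entry in dictionary:
        comparisons += 1  # every entry is examined; there is no early exit
        score = distance(heard, entry)
        if best_score is None or score < best_score:
            best, best_score = entry, score
    return best, comparisons


def mismatches(a, b):
    """Hypothetical distance: differing positions plus the length difference."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


# Figures taken from the text: 1000 dictated words, 1364 dictionary entries.
WORDS_DICTATED = 1000
DICTIONARY_SIZE = 1364
total_comparisons = WORDS_DICTATED * DICTIONARY_SIZE
print(total_comparisons)  # 1364000
```

Because the best score is only known once every entry has been seen, the
number of comparisons per word always equals the size of the dictionary.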
But suppose the program must decide where the words begin and end. It would
then have to look at each syllable to see if it were a word. It would then
look at each run of two consecutive syllables, each run of three consecutive
syllables, etc., up to each run of five consecutive syllables. This means
that it would have to look at all potential word combinations: 2500 possible
one-syllable words, 2499 possible two-syllable words, 2498 possible
three-syllable words, etc., for a grand total of 12,490 potential word
combinations. Instead of going through the match-up process a mere 1,364,000
times, the software would have to perform this task 12,490 x 1364, or
17,036,360, times. That is to say, such a task requires about twelve and a
half times as much computing power. However, the task becomes geometrically
more difficult when we deal with the real world of English.
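The figures for the imaginary language can be checked with a short
calculation. This is a minimal sketch using only the numbers given in the
text; the function and variable names are chosen here for illustration.

```python
SYLLABLES = 2500        # syllables in the dictation
MAX_WORD_LENGTH = 5     # longest word, in syllables
DICTIONARY_SIZE = 1364  # entries compared against each candidate
WORDS_DICTATED = 1000   # words when the speaker pauses between them


def candidate_windows(n_syllables, max_len):
    """Count every run of 1..max_len consecutive syllables."""
    # A run of k syllables can start at n_syllables - k + 1 positions.
    return sum(n_syllables - k + 1 for k in range(1, max_len + 1))


windows = candidate_windows(SYLLABLES, MAX_WORD_LENGTH)
comparisons = windows * DICTIONARY_SIZE
ratio = comparisons / (WORDS_DICTATED * DICTIONARY_SIZE)

print(windows)      # 12490
print(comparisons)  # 17036360
print(ratio)        # 12.49
```

The ratio depends only on the number of candidate runs per dictated word
(12,490 / 1000), so it would be the same for any dictionary size.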
Taking this problem to the real world, we need to look at phonemes instead
of just syllables. Each syllable has at least one phoneme and many have
three. For example, the word "sick" has three, i.e., "s," "ih," and "kh."
Moreover, many of the entries in the dictionary are phrases, and so they
contain many more than a few phonemes.
The word "choledochoduodenostomy" has around nineteen phonemes;
"salpingo-oophorectomy" has a similar number. The common phrase "pupils are
equal, round, and reactive to light and accommodation," which most voice
recognition software dictionaries contain, has about 44.