The Technology of Embedded Speech

An embedded speech device incorporates speech synthesis, small-footprint speech recognition, speaker verification, and speaker identification in a single product. Incoming audio is first pre-processed, and the properties of the voice are then digitized and mapped. Finally, language recognition software interprets the data and passes the results to the device. Many embedded speech devices use speaker-independent recognition, so more than one person can use these applications. Embedded speech software uses finite-state and dynamic grammars, with vocabularies of up to 50,000 words. The vocabulary can be extended by combining phoneme recognition with spelling and pronunciation rules, which allows the software to recognize words it would not otherwise be able to.
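As a rough illustration of the grammar idea described above, the following Python sketch models a finite-state grammar as a set of states joined by word-labeled arcs. It is a toy under stated assumptions, not how any particular embedded engine works: the class name FiniteStateGrammar, the state names, and the "call <name>" command are all hypothetical, and a real recognizer would score acoustic hypotheses against the grammar rather than match clean text.

```python
class FiniteStateGrammar:
    """Toy word-level finite-state grammar (hypothetical, for illustration)."""

    def __init__(self, start, finals):
        self.start = start
        self.finals = set(finals)
        # arcs[state][word] -> next state
        self.arcs = {}

    def add_arc(self, src, word, dst):
        # Arcs can be added at any time; this stands in for a "dynamic"
        # grammar whose vocabulary grows while the device is running.
        self.arcs.setdefault(src, {})[word.lower()] = dst

    def accepts(self, words):
        # Follow one arc per word; reject as soon as no arc matches.
        state = self.start
        for word in words:
            state = self.arcs.get(state, {}).get(word.lower())
            if state is None:
                return False
        return state in self.finals


# Assumed example command set: "call <name>".
grammar = FiniteStateGrammar(start="S", finals={"END"})
grammar.add_arc("S", "call", "NAME")
grammar.add_arc("NAME", "alice", "END")
grammar.add_arc("NAME", "bob", "END")

print(grammar.accepts(["call", "alice"]))  # True
print(grammar.accepts(["call", "carol"]))  # False: "carol" is not in the grammar yet

# Extend the vocabulary at runtime, as a dynamic grammar would.
grammar.add_arc("NAME", "carol", "END")
print(grammar.accepts(["call", "carol"]))  # True
```

Restricting recognition to the paths of a small grammar like this is one reason such systems can fit in a limited memory footprint; the sketch leaves out phoneme-level matching and the spelling and pronunciation rules entirely.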

Embedded speech devices are now multilingual. Voice Signal Technologies announced that its embedded speech applications would be able to recognize more than one language. Embedded text-to-speech (TTS) engines now share dictionaries and language models with embedded speech technology, which improves both the efficiency and the competence of the system. Improvements in embedded speech synthesis make it possible for digitized voices to sound like, and closely match, authentic voice talents. The technology behind embedded speech software is constantly developing, and further advances can be expected to bring more interesting uses of it in the future.