Talking Computers

Dr.S.S.Verma, Department of Physics, S.L.I.E.T., Longowal, Distt.-Sangrur-148 106, Punjab

2018-01-20 09:25:45



Man made machines to help him perform dull and difficult work. When machines worked wonders and solved many of his complicated tasks, he turned his energy to building more powerful ones that would bring greater prosperity without the need for labor. Thus the science of Artificial Intelligence emerged. Progress in computer development and its applications, in particular, has worked wonders in a very short span of time. Today computers do almost every kind of machine work, but they still lack human qualities such as understanding, talking and emotion. Sitting in a movie theater in the 1960s watching a space odyssey about two astronauts and a computer, the audience encountered a computer that could speak. This computer not only spoke; it was friendly and understanding, and was definitely ahead of its time. The way we interact with computers today, by typing on a keyboard to input information and receiving responses on a video screen, was just being designed. Spoken communication with a computer was a luxury that existed only in science-fiction books and movies. Talking computers will become a reality soon. Rather than merely conforming to our expectations of computer voices, future computers will speak and function like human beings.

Speech Recognition and Synthesis

Speech has important advantages as an input medium. It is our fastest mode of communication (about twice as fast as the average typist). Casual users with relatively little training could use a computer that accepts speech input. Speech input also leaves users' hands free for pointing, manipulating a display, flying an airplane, or making a repair. Currently, a lot of research is going on in this area, known as Speech Recognition and Synthesis. Speech recognition means understanding what someone says to a computer, that is, having the computer translate speech into its corresponding text. Speech synthesis is the reverse: producing spoken output from text. The issues involved in speech recognition are more complex than those in speech synthesis. Most speech-enabled back-end systems consist of three parts: a speech-to-text engine for turning oral commands into something a computer can understand; a prompt engine, or pre-recorded set of responses to guide the caller; and a text-to-speech engine, which allows a computer to speak a response or ask a question that isn't covered by the pre-recorded prompts. Whether in India or abroad, synthesizers equipped with a large and accurate vocabulary are still confined to the laboratory, even though unlimited speech synthesizers reached the market at least a decade ago. Initially, speech synthesizers were considered to be exclusively for the use of the handicapped. For example, Professor Stephen Hawking speaks through a synthesizer. Rapid advances in computer and communication technology, coupled with a growing need for information, have increased the importance of speech technology for all.
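
The three parts described above can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the function names, the prompt table and the wording of the replies are hypothetical, and the "audio" is plain text so that the example runs without a real recognizer or synthesizer.

```python
# Hypothetical pre-recorded prompts, keyed by topic (illustrative only).
PROMPTS = {
    "greeting": "Welcome. Please say the service you need.",
    "flight schedule": "Please say your flight number.",
}


def speech_to_text(audio: bytes) -> str:
    """Stand-in for a speech-to-text engine.

    Assumption: the 'audio' is UTF-8 text, so no real recognizer is needed.
    """
    return audio.decode("utf-8").strip().lower()


def prompt_engine(text: str):
    """Return a pre-recorded prompt if one covers the request, else None."""
    for topic, prompt in PROMPTS.items():
        if topic in text:
            return prompt
    return None


def text_to_speech(text: str) -> str:
    """Stand-in for a text-to-speech engine: it only tags the text."""
    return f"[synthesized] {text}"


def respond(audio: bytes) -> str:
    """Route one caller request through the three parts in order."""
    text = speech_to_text(audio)
    prompt = prompt_engine(text)
    if prompt is not None:
        # A pre-recorded response covers this request.
        return f"[pre-recorded] {prompt}"
    # Otherwise fall back to synthesizing a fresh reply.
    return text_to_speech(f"Sorry, I have no recorded answer for '{text}'.")
```

A call such as respond(b"Flight schedule please") is answered from the pre-recorded prompts, while a request the prompts do not cover, such as respond(b"share prices"), falls through to the text-to-speech stand-in.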

History of Developments

Human fascination with talking machines is not new. For centuries, people have tried to give machines the ability to speak; before the machine age, humans even hoped to create speech for inanimate objects. The first scientific attempts to construct talking machines were recorded in the eighteenth century. The first such device, built in 1779, was able to produce vowel sounds by blowing air through a reed into a variable-resonance chamber that resembled the human vocal tract, from the vocal cords through the mouth. Two decades later, another device capable of speaking whole utterances was constructed. Since then, tremendous progress has been achieved in speech synthesis. India has also made good progress in this field. Several Indian institutes are dedicated to speech synthesis research in Indian languages. They have succeeded in developing a talking dictionary-cum-spellchecker, a text-to-speech synthesizer and a continuous speech synthesizer using the formant synthesis technique. These activities picked up from the mid-1980s, following renewed interest shown by the Government in promoting Indian languages on computers and the availability of very fast computers for implementing speech recognition and synthesis systems, which need a lot of computing power.
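
The reed-and-resonance-chamber idea, and the formant synthesis technique mentioned above, can be illustrated with a minimal Python sketch. It is not the method used by any particular institute: it assumes a simple pulse train as the "reed" and a cascade of second-order digital resonators as the "chamber". The formant frequencies and bandwidths are approximate, illustrative values for an "ah"-like vowel, and all names are hypothetical.

```python
import math

# Assumed (frequency Hz, bandwidth Hz) pairs for an "ah"-like vowel.
VOWEL_A = [(730.0, 80.0), (1090.0, 90.0), (2440.0, 120.0)]


def pulse_train(f0_hz, duration_s, sample_rate):
    """Source signal: one unit impulse per pitch period (the 'reed')."""
    period = int(round(sample_rate / f0_hz))
    n = int(round(duration_s * sample_rate))
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]


def resonator(signal, freq_hz, bandwidth_hz, sample_rate):
    """Second-order resonator modelling one formant of the 'chamber'."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius
    theta = 2.0 * math.pi * freq_hz / sample_rate        # pole angle
    b1 = 2.0 * r * math.cos(theta)
    b2 = -r * r
    gain = 1.0 - b1 - b2  # normalizes the gain at 0 Hz to one
    out = []
    y1 = y2 = 0.0
    for x in signal:
        y = gain * x + b1 * y1 + b2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize_vowel(formants, f0_hz=120.0, duration_s=0.3, sample_rate=16000):
    """Pass the pulse train through each formant resonator in turn."""
    signal = pulse_train(f0_hz, duration_s, sample_rate)
    for freq_hz, bandwidth_hz in formants:
        signal = resonator(signal, freq_hz, bandwidth_hz, sample_rate)
    peak = max(abs(s) for s in signal)
    return [s / peak for s in signal]  # scale into the range [-1, 1]


samples = synthesize_vowel(VOWEL_A)
```

Changing the formant frequencies changes which vowel is heard, much as reshaping the resonance chamber did in the early mechanical devices. The resulting samples could be scaled to 16-bit integers and saved with Python's standard wave module for listening.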

Challenges Ahead

No doubt, machines that listen and talk like humans are becoming a reality. The technical kinks, high costs and application misfires that have held back the acceptance of speech recognition and voice activation are being ironed out. As a result, companies are coming out with a variety of products that will let consumers access databases using voice commands. However, the great variations in pronunciation that exist among people, arising from factors such as regional accents, high or low pitch, stress and intonation patterns, stammering, differing grammars and the absence of fixed rules for speech, are still a challenge for speech recognition and synthesis. Humans are able to perform speech recognition with amazing competence, even under extremely adverse circumstances. Even the best recognition systems of today are unable to come anywhere near this level of performance. Studying how cognitive skills develop in humans or, better still, how they develop in a child will prove useful when implementing speech recognition strategies in a computer. Moreover, background noise adds to aural confusion and to the likelihood that speech sounds will be misclassified, so it has to be controlled.


Applications of Talking Computers

  • Written language is a technology developed thousands of years ago to store and retrieve information. With today's information glut, that old technology is failing us. It has been claimed that eighty percent of the world's people are still nonliterate, and that governments lack the resources and/or political will to create literate citizens. On this view, talking computers will replace writing, reading, and written language/text by 2050. Using these, everyone will be able to store and retrieve information without reading or writing.
  • Storage and dissemination of information will be easy. Some immediate applications could be access to railway reservation status, flight schedules and the latest share prices over landline and mobile phones, reading of e-mail over the phone, etc.
  • Speech recognition and synthesis have found applications not only in robotics, but also in systems that serve people directly, such as security systems, announcement systems, etc.
  • They will offer a more reliable and private method of conducting interviews with children on sensitive topics such as sexual matters (sexually transmitted diseases, unwanted pregnancy, sexual abuse), drug abuse, terrorism, religion and regionalism.
  • The usefulness of talking computers in imparting knowledge to the handicapped is being increasingly felt. There are blind professionals and students who need computer skills just like everybody else. Light and sound from computer screens may well point them in a new direction. Special screen-reader software enables blind people to listen to, rather than see, what they are generating on the screen.
  • Chat hubs in local languages could serve the so-called illiterate masses as well.