Speech recognition (also known as automatic speech recognition or computer speech recognition) converts spoken words to machine-readable input (for example, to keypresses, using the binary code for a string of character codes). The term voice recognition may also be used to refer to speech recognition, but more precisely refers to speaker recognition, which attempts to identify the person speaking, as opposed to what is being said.
Speech recognition applications include voice dialing (e.g., "Call home"), call routing (e.g., "I would like to make a collect call"), domotic appliance control and content-based spoken audio search (e.g., find a podcast where particular words were spoken), simple data entry (e.g., entering a credit card number), preparation of structured documents (e.g., a radiology report), speech-to-text processing (e.g., word processors or emails), and in aircraft cockpits (usually termed Direct Voice Input).
Hands-free computing describes configuring a computer so that it can be used without the hands operating common human interface devices such as the mouse and keyboard. Hands-free computing matters because it is useful to both able-bodied and disabled users. Speech recognition systems can be trained to recognize specific commands, and, upon confirmation of correctness, instructions can be given to the system without the use of hands. This may be useful while driving, or to an inspector or engineer in a factory environment. Likewise, disabled persons may find hands-free computing important in their everyday lives, just as the visually impaired have found computers useful in theirs.
Hands-free input can range from using the tongue, lips, mouth, or head movement to voice-activated interfaces built on speech recognition software and a microphone. Examples of available hands-free computing devices include mouth-operated joystick types and camera-based head-tracking systems. The joystick types require no external connections and enhance the user's feeling of independence. Camera types require targets mounted on the user, usually with the help of a caregiver, which are sensed by the camera and its associated software. Camera types are sensitive to ambient lighting, the mouse pointer may drift, and inaccuracies result from head movements not intended as mouse movements. Other examples of hands-free mice are units operated by switches, which may be actuated by the feet or other parts of the body.
- To build a generic application interface based on voice/speech recognition.
- To build a context-based search over the keywords the user speaks, so that the appropriate action is taken when a matching keyword is encountered. E.g., the system must be able to start or open Media Player even if the user says “MEDIA”, “SONGS”, “MUSIC”, or “MP3”.
- The user must be able to set or change the system preferences and context-search parameters to suit his or her needs.
- Multilingual commands must also be accepted and recognized. For this, the system must allow the user to map standard actions/commands to new keywords, i.e., keywords from different languages. (Optional)
- The system must also allow the user to create new actions and map voice commands to them.
- Using the Robot API, the system must generate mouse and keyboard events so that almost all functions of a standard browser can be controlled through this system.
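The keyword-to-action objective above could be sketched as a simple mapping table, with user-defined (including multilingual) keywords added at runtime. This is a minimal illustration, not the project's actual design; the class and action names are hypothetical, and the `java.awt.Robot` usage is shown only in a comment because synthesizing events requires a non-headless graphical environment.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: maps recognized keywords to application actions,
// as in the "MEDIA"/"SONGS"/"MUSIC"/"MP3" -> Media Player example above.
public class VoiceCommandMapper {
    private final Map<String, String> keywordToAction = new HashMap<>();

    public VoiceCommandMapper() {
        // Several spoken keywords resolve to the same action.
        for (String kw : new String[] {"media", "songs", "music", "mp3"}) {
            keywordToAction.put(kw, "OPEN_MEDIA_PLAYER");
        }
        keywordToAction.put("browser", "OPEN_BROWSER");
    }

    // Lets the user map a new keyword (e.g. from another language)
    // to an existing action, per the multilingual objective.
    public void mapKeyword(String keyword, String action) {
        keywordToAction.put(keyword.toLowerCase(Locale.ROOT), action);
    }

    // Returns the action for a recognized keyword, or null if unmapped.
    public String resolveAction(String keyword) {
        return keywordToAction.get(keyword.toLowerCase(Locale.ROOT));
    }

    // Dispatching an action could then use java.awt.Robot to synthesize
    // keyboard/mouse events (needs a graphical environment), e.g.:
    //   Robot robot = new Robot();
    //   robot.keyPress(KeyEvent.VK_ENTER);
    //   robot.keyRelease(KeyEvent.VK_ENTER);
}
```

Keeping recognition (speech to keyword) separate from dispatch (keyword to Robot events) is what makes the interface generic: new applications only need new map entries.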
The context-guided information retrieval process involves semantic keyword extraction and clustering to automatically generate new, augmented queries. These augmented queries are submitted to a search host, and the results are then semantically re-ranked, again using context. It is our belief that letting context guide the search provides a better match to the user's current needs than relying solely on a fixed personal profile. Results show that using context to guide search offers even inexperienced users an effective, advanced control tool.
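The re-ranking step described above could, in its simplest form, score each returned result by how many of the extracted context keywords it contains and sort accordingly. The following is a rough sketch under that assumption (class and method names are hypothetical; a real system would use richer semantic similarity than keyword overlap):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of context-guided re-ranking: results whose text
// shares more keywords with the user's current context rank higher.
public class ContextReRanker {
    // Count how many tokens of the result appear in the context keyword set.
    static int contextScore(String result, Set<String> contextKeywords) {
        int score = 0;
        for (String word : result.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (contextKeywords.contains(word)) score++;
        }
        return score;
    }

    // Re-rank results by descending context score; the stable sort keeps
    // the host's original ordering among equally scored results.
    static List<String> reRank(List<String> results, Set<String> context) {
        List<String> ranked = new ArrayList<>(results);
        ranked.sort(Comparator.comparingInt(
                (String r) -> contextScore(r, context)).reversed());
        return ranked;
    }
}
```

Because the host's original order is preserved among ties, the context signal only overrides the host's ranking when it actually has evidence to contribute.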