The Role of Intelligent Systems in the National Information Infrastructure, page 25

3.8 Speech and Language Processing

The ultimate goal of natural language processing (NLP[5]) research is to create systems able to communicate with people in natural languages. Such communication requires an ability to understand the meaning and purpose of communicative actions, such as spoken utterances, written texts, and the gestures that accompany them, and an ability to produce such communicative actions appropriately. These abilities, in their most general form, are far beyond our current scientific understanding and computing technology.

Ambiguity is one reason general natural language processing capabilities are difficult to achieve. Human languages all use a small set of resources (such as words, structures, intonations, and gestures) to convey an exceedingly wide, rich, and varied set of meanings. Any one word, structure, or gesture will often be used in many different ways. Although people rarely notice such lexical, structural, semantic, and intonational ambiguities, identifying and resolving them challenges current speech- and language-processing systems. Another source of difficulty is the difference between what people say (or write) and what they actually mean. People rely on their audience to understand much that is not explicitly said or written, deriving this information from context and common knowledge. Furthermore, people often begin to speak or write before their ideas are well thought out, using the formulation of real utterances as a step in understanding their own partially formed ideas. Both practices result in partial and imperfect evidence for what people are really trying to communicate.

3.8.1 Relevance to the National Information Infrastructure

Because natural language is often the preferred way to communicate with people, it will undoubtedly become a popular means for communicating with computers. Natural language is also currently the most prevalent medium for knowledge representation; most of what humankind knows is stored as written text. Thus, the potential relevance of natural language processing to the NII is immense. Natural language understanding and generation could be central to the next generation of intelligent interfaces; in the short term at least, information management will largely mean text management; and natural language processing of some kind will be needed to facilitate the cooperative work environments required for efficient design and development of the NII.

3.8.2 State of the Art

Natural language processing research has resulted in several significant achievements, including the following: techniques for parsing, semantic interpretation, and discourse modeling sufficient to process realistic database queries posed in natural language; the reuse of language-understanding techniques for generation of reports customized to context, task, and user; statistical models of speech acoustics, word pronunciation, and word sequencing that are sufficiently accurate to support usable speech understanding of restricted-vocabulary utterances; speech-generation systems that can generate spoken language with intonation contours that begin to conform to and reflect intended meaning and underlying context; machine-translation systems that can improve the efficiency of human translators by providing useful first drafts; and content-based retrieval systems that can glean useful information from unstructured text documents.
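To make the idea of a statistical model of word sequencing concrete, the following is a minimal sketch of a bigram model with add-one smoothing. It is an illustration only, not a description of any system surveyed in this report: the toy command corpus, the function names (train_bigrams, log_prob), and the choice of add-one smoothing are all assumptions introduced here. The sketch shows how such a model can prefer a plausible word order over an implausible one, which is the kind of evidence a restricted-vocabulary speech recognizer can use to choose among competing hypotheses.

```python
import math
from collections import Counter

# Toy restricted-vocabulary corpus, invented purely for illustration.
TOY_COMMANDS = [
    "show flights to boston",
    "show flights to denver",
    "list flights to boston",
    "show fares to denver",
]

def train_bigrams(sentences):
    """Count bigrams and their left-hand contexts, with sentence boundary markers."""
    contexts, bigrams, vocab = Counter(), Counter(), {"</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        contexts.update(tokens[:-1])             # every token that precedes another
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent word pairs
        vocab.update(sentence.split())
    return contexts, bigrams, len(vocab)

def log_prob(sentence, model):
    """Log probability of a word sequence under the add-one-smoothed bigram model."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size))
        for prev, word in zip(tokens, tokens[1:])
    )

if __name__ == "__main__":
    model = train_bigrams(TOY_COMMANDS)
    for candidate in ("show flights to boston", "boston to flights show"):
        print(candidate, round(log_prob(candidate, model), 3))
```

Under these assumptions the well-ordered command receives a higher log probability than the scrambled one, because every word pair in it was observed in the toy corpus. A practical system would estimate such counts from far larger collections and combine them with acoustic and pronunciation models.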

A relatively recent development is the use of statistical models. Typically generated automatically, statistical models can predict with good accuracy simple grammatical features of utterances such as a word’s part of speech, as well as semantic properties such as a word’s most likely sense in a given context; they thus reduce problems caused by ambiguities in the grammatical and semantic properties of words. Improved discourse modeling is another result of recent research; idealized models of purposive communicative action have been developed based on empirical studies, planning, and collaboration, and the models have been incorporated in experimental systems.
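As a rough illustration of how an automatically generated statistical model can resolve part-of-speech ambiguity, the sketch below trains a very simple greedy bigram tagger from counts. Everything in it is an assumption made for illustration: the tiny hand-tagged corpus, the tag names, the function names (train, tag), the add-one scoring, and the fallback tag for unknown words. It is not the method of any particular system discussed here.

```python
from collections import Counter, defaultdict

START = "<s>"

# Tiny hand-tagged corpus, invented purely for illustration.
TOY_CORPUS = [
    [("the", "DET"), ("dog", "NOUN"), ("can", "AUX"), ("run", "VERB")],
    [("open", "VERB"), ("the", "DET"), ("can", "NOUN")],
    [("she", "PRON"), ("can", "AUX"), ("open", "VERB"), ("the", "DET"), ("can", "NOUN")],
    [("the", "DET"), ("run", "NOUN"), ("was", "VERB"), ("long", "ADJ")],
]

def train(tagged_sentences):
    """Collect tag-to-tag and tag-to-word counts from tagged sentences."""
    transitions, emissions = Counter(), Counter()
    tags_for_word = defaultdict(set)
    for sentence in tagged_sentences:
        prev = START
        for word, pos in sentence:
            transitions[(prev, pos)] += 1
            emissions[(pos, word)] += 1
            tags_for_word[word].add(pos)
            prev = pos
    return transitions, emissions, tags_for_word

def tag(words, model, unknown_tag="NOUN"):
    """Greedily pick, for each word, the tag that scores best given the previous tag."""
    transitions, emissions, tags_for_word = model
    prev, result = START, []
    for word in words:
        # Unknown words fall back to a single assumed tag.
        candidates = sorted(tags_for_word.get(word, {unknown_tag}))
        best = max(
            candidates,
            key=lambda t: (transitions[(prev, t)] + 1) * (emissions[(t, word)] + 1),
        )
        result.append(best)
        prev = best
    return result

if __name__ == "__main__":
    model = train(TOY_CORPUS)
    print(tag(["she", "can", "open", "the", "can"], model))
```

In this toy setting the ambiguous word "can" is tagged as an auxiliary after a pronoun and as a noun after a determiner, because the counts for the preceding tag favor different readings. The same counting approach could in principle be applied to word senses rather than parts of speech, given sense-labeled text.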