The Role of Intelligent Systems in the National Information Infrastructure

Scalability

The NII must support both a vast amount of data and a huge number of users with different interests and needs. Although a monolithic information index could combine sources such as telephone books, airline schedules, and encyclopedias, such a centralized scheme would likely suffer from rigidity (being unable to respond to rapid changes in the world) and would represent a likely point of failure, especially in high-activity situations. A similar problem results from naive attempts to scale current technology for information dissemination: indiscriminate advertising causes both cognitive and network congestion. In addition to broadcasting, the NII must support "narrow casting." The AI challenge is to identify the select set of people and agents who are likely to be interested in an announcement of a new service (or, symmetrically, who could provide a desired service).

Several factors will contribute to a solution. First, instead of relying on passive indices, NII data access could use brokers that actively scan for new relevant resources. Each broker would specialize in a different subject area and would monitor closely related brokers in addition to primary sources. The onus for advertising rests on the information provider; interested brokers do the rest. An information seeker contacts an appropriate local broker, and the request is passed along from broker to broker until it is fulfilled. A centralized index can be seen as the degenerate case of this distributed scheme, but the existence of multiple, competing brokers could provide faster response time, improved specificity, and better adaptation to change in primary sources.
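The request-forwarding behavior described above can be illustrated with a minimal sketch. It is only one possible reading of the scheme: the `Broker` class, its methods, and the provider names are hypothetical, and a real broker network would also need ranking, timeouts, and active scanning of primary sources.

```python
class Broker:
    """Hypothetical broker that indexes one subject area."""

    def __init__(self, subject):
        self.subject = subject
        self.resources = {}   # topic -> provider name
        self.neighbors = []   # closely related brokers it monitors

    def advertise(self, topic, provider):
        # The information provider bears the onus of announcing itself.
        self.resources[topic] = provider

    def query(self, topic, visited=None):
        """Return (provider, path of broker subjects), or None if unfulfilled."""
        visited = set() if visited is None else visited
        visited.add(self.subject)
        if topic in self.resources:
            return self.resources[topic], [self.subject]
        # Pass the request along to related brokers not yet consulted.
        for other in self.neighbors:
            if other.subject in visited:
                continue
            found = other.query(topic, visited)
            if found is not None:
                provider, path = found
                return provider, [self.subject] + path
        return None


# Invented example network: a local travel broker with two related brokers.
travel = Broker("travel")
airlines = Broker("airlines")
reference = Broker("reference")
travel.neighbors = [reference, airlines]
airlines.neighbors = [travel]
reference.neighbors = [travel]
airlines.advertise("flight schedules", "ExampleAir timetable")
reference.advertise("encyclopedia", "Example Encyclopedia")

result = travel.query("flight schedules")
missing = travel.query("weather")
```

Here the seeker contacts only the local `travel` broker; the `visited` set keeps a request from circulating indefinitely among brokers that monitor one another. A centralized index corresponds to the degenerate case of a single broker with no neighbors.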

Second, the indices could use semantic information rather than mere syntactic criteria. Brokers could reason about index terms, their relations, and their relevance. Simple hypertext versions of extant reference books would serve poorly, because they require a human for navigation. Codified semantic information, however, can be processed by automated agents (as well as by a person browsing manually). Knowledge representation techniques (Subsection 3.1) could enable inheritance along multiple taxonomic dimensions during index formation and allow a broker to determine which other brokers might be relevant during query processing. For this approach to be feasible, however, substantial effort must be invested to leverage existing classification schemes (such as the Dewey decimal system and Chemical Abstracts) and to develop comprehensive new ontologies and encodings of commonsense knowledge (Subsection 3.7). Only then will search engines be able to deduce connections between relevant information sources.
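As a rough illustration of inheritance along multiple taxonomic dimensions, the sketch below lets an index term have parents in more than one hierarchy and uses the inherited terms to decide which brokers might be relevant to a query. The taxonomy, the broker names, and both functions are invented for this example; they stand in for the far richer ontologies the text calls for.

```python
# Hypothetical index terms. Each term may have parents along several
# taxonomic dimensions (here, subject matter and document form).
PARENTS = {
    "airline timetable": ["transport schedule", "reference work"],
    "transport schedule": ["travel information"],
    "encyclopedia": ["reference work"],
    "travel information": [],
    "reference work": [],
}


def ancestors(term):
    """Return the term together with every term it inherits from."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen


def relevant_brokers(query_term, broker_topics):
    """Pick brokers whose advertised topic subsumes the query term."""
    covered = ancestors(query_term)
    return sorted(name for name, topic in broker_topics.items() if topic in covered)


# Invented brokers, each advertising the topic it specializes in.
BROKER_TOPICS = {
    "travel broker": "travel information",
    "library broker": "reference work",
    "chemistry broker": "chemical abstract",
}

matches = relevant_brokers("airline timetable", BROKER_TOPICS)
```

A purely syntactic match on the string "airline timetable" would find neither broker; the connection emerges only because the term inherits from both "travel information" and "reference work".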

2.2.2 Integration and Translation Services

We already commented in an earlier subsection on the problems posed by heterogeneous data in the NII. However, even data that are similar in content can vary greatly in form and in the operations that can be performed on them. For example, even within a relatively constrained domain such as health care, the languages of doctors, nurses, patients, and insurance agents differ dramatically. Although much data-format conversion and some level of application-system interoperability are achievable through the development of standards (e.g., the RTF data format, the CORBA and OLE interapplication wrappers), standards enforce the lowest common denominator. By the very nature of the consensus that creates them, standards will always lag behind the needs of NII users.

Rather than mere data translation, the NII infrastructure should support semantic translation. Instead of simply scaling between currencies (for example, Japanese yen to U.S. dollars), semantic translation would use world knowledge to convert between derived quantities (for example, from raw cost to total cost including applicable import duties, taxes, and fees). Because determining which duties must be paid requires reasoning about the type of merchandise, the scope of services provided, and the relevance of tax codes, AI techniques can provide significant assistance in meeting the challenges of integration and translation. Knowledge representation systems (especially those using modal and higher-order logics for reasoning about representations, Subsection 3.1) form a substrate into which relational and other database systems can be embedded and over which standardized ontologies (Subsection 3.7) can be defined. When the information to be converted is relatively standardized and well structured, representation transformation techniques from automated software development are appropriate (Subsection 3.1). When the information is less structured, machine-translation technology (Subsection 3.8) can be used. In many cases, applications will use negotiation techniques (Subsection 3.6) to select a common language and ontology in order to facilitate coordination and translation.
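The contrast between data translation and semantic translation in the currency example can be sketched as follows. Every figure here (the exchange rate, the duty rates, and the handling fee) is invented for illustration, and the lookup table is a deliberately crude stand-in for the reasoning about merchandise type and tax codes that the text envisions.

```python
from decimal import Decimal

# All figures below are invented for illustration; they are not real
# exchange rates or tariff schedules.
USD_PER_JPY = Decimal("0.01")
DUTY_RATE_BY_CATEGORY = {
    "electronics": Decimal("0.05"),
    "books": Decimal("0.00"),
}
FLAT_HANDLING_FEE_USD = Decimal("10.00")


def data_translation(price_jpy):
    """Pure unit conversion: scale yen to dollars."""
    return (price_jpy * USD_PER_JPY).quantize(Decimal("0.01"))


def semantic_translation(price_jpy, category):
    """Convert raw cost to total cost using knowledge about the goods."""
    if category not in DUTY_RATE_BY_CATEGORY:
        raise ValueError(f"no duty rule known for category {category!r}")
    base = price_jpy * USD_PER_JPY
    duty = base * DUTY_RATE_BY_CATEGORY[category]
    return (base + duty + FLAT_HANDLING_FEE_USD).quantize(Decimal("0.01"))


raw = data_translation(Decimal("20000"))
total = semantic_translation(Decimal("20000"), "electronics")
```

Under these assumed figures, the same yen price yields different dollar amounts depending on what is being purchased, which a fixed data-format standard cannot express by itself.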