
The knowledge engineer

The knowledge engineer is that second party just discussed. She works intensively with an expert to acquire domain-specific knowledge and organize it for use by a program. Simultaneously she is matching the tools of the AI workbench to the task at hand: program organizations, methods of symbolic inference, techniques for the structuring of symbolic information, and the like. If the tool fits, or nearly fits, she uses it. If not, necessity mothers AI invention, and a new tool gets created. She builds the early versions of the intelligent agent, guided always by her intent that the program eventually achieve expert levels of performance in the task. She refines or reconceptualizes the system as the increasing amount of acquired knowledge causes the AI tool to "break" or slow down intolerably. She also refines the human interface to the intelligent agent with several aims: to make the system appear "comfortable" to the human user in his linguistic transactions with it; to make the system's inference processes understandable to the user; and to make the assistance controllable by the user when, in the context of a real problem, he has an insight that previously was not elicited and therefore not incorporated.

In the next section, I wish to explore (in summary form) some case studies of the knowledge engineer's art.



I will draw material for this section from the work of my group at Stanford. Much exciting work in knowledge engineering is going on elsewhere. Since my intent is not to survey literature but to illustrate themes, at the risk of appearing parochial I have used as case studies the work I know best.

My collaborators (Professors Lederberg and Buchanan) and I began a series of projects, initially the development of the DENDRAL program, in 1965. We had dual motives: first, to study scientific problem solving and discovery, particularly the processes scientists do use or should use in inferring hypotheses and theories from empirical evidence; and second, to conduct this study in such a way that our experimental programs would one day be of use to working scientists, providing intelligent assistance on important and difficult problems. By 1970, we and our co-workers had gained enough experience that we felt comfortable in laying out a program of research encompassing work on theory formation, knowledge utilization, knowledge acquisition, explanation, and knowledge engineering techniques. Although there were some surprises along the way, the general lines of the research are proceeding as envisioned.


Generation-and-test: Omnipresent in our experiments is the "classical" generation-and-test framework that has been the hallmark of AI programs for two decades. This is not a consequence of a doctrinaire attitude on our part about heuristic search, but rather of the usefulness and sufficiency of the concept.

Situation ⇒ Action Rules: We have chosen to represent the knowledge of experts in this form. Making no doctrinaire claims for the universal applicability of this representation, we nonetheless point to the demonstrated utility of the rule-based representation. From this representation flow rather directly many of the characteristics of our programs: for example, ease of modification of the knowledge, ease of explanation. The essence of our approach is that a rule must capture a "chunk" of domain knowledge that is meaningful, in and of itself, to the domain specialist. Thus our rules bear only a historical relationship to the production rules used by Newell and Simon (1972), which we view as "machine-language programming" of a recognize ⇒ act machine.

The Domain-Specific Knowledge: It plays a critical role in organizing and constraining search. The theme is that in the knowledge is the power. The interesting action arises from the knowledge base, not the inference engine. We use knowledge in rule form (discussed above), in the form of inferentially-rich models based on theory, and in the form of tableaus of symbolic data and relationships (i.e., frame-like structures). System processes are made to conform to natural and convenient representations of the domain-specific knowledge.

Flexibility to modify the knowledge base: If the so-called "grain size" of the knowledge representation is chosen properly (i.e., small enough to be comprehensible but large enough to be meaningful to the domain specialist), then the rule-based approach allows great flexibility for adding, removing, or changing knowledge in the system.

Line-of-reasoning: A central organizing principle in the design of knowledge-based intelligent agents is the maintenance of a line-of-reasoning that is comprehensible to the domain specialist. This principle is, of course, not a logical necessity, but seems to us to be an engineering principle of major importance.

Multiple Sources of Knowledge: The formation and maintenance (support) of the line-of-reasoning usually require the integration of many disparate sources of knowledge. The representational and inferential problems in achieving a smooth and effective integration are formidable engineering problems.

Explanation: The ability to explain the line-of-reasoning in a language convenient to the user is necessary for application and for system development (e.g., for debugging and for extending the knowledge base). Once again, this is an engineering principle, but very important. What con

As a road map to these case studies, it is useful to keep in mind certain major themes:

Rules of this form are natural and expressive to mass spectrometrists.

Sketch of method

DENDRAL's inference procedure is a heuristic search that takes place in three stages, without feedback: plan-generate-test.

"Generate" (a program called CONGEN) is a generation process for plausible structures. Its foundation is a combinatorial algorithm (with mathematically proven properties of completeness and non-redundant generation) that can produce all the topologically legal candidate structures. Constraints supplied by the user or by the "Plan" process prune and steer the generation to produce the plausible set (i.e., those satisfying the constraints) and not the enormous legal set.

"Test" refines the evaluation of plausibility, discarding less worthy candidates and rank-ordering the remainder for examination by the user. "Test" first produces a "predicted" set of instrument data for each plausible candidate, using the rules described. It then evaluates the worth of each candidate by comparing its predicted data with the actual input data. The evaluation is based on heuristic criteria of goodness-of-fit. Thus, "Test" selects the "best" explanations of the data.

"Plan" produces direct (i.e., not chained) inference about likely substructure in the molecule from patterns in the data that are indicative of the presence of the substructure. (Patterns in the data trigger the left-hand sides of substructure rules.) Though composed of many atoms whose interconnections are given, the substructure can be manipulated as atom-like by "Generate." Aggregating many units entering into a combinatorial process into fewer higher-level units reduces the size of the combinatorial search space. "Plan" sets up the search space so as to be relevant to the input data. "Generate" is the inference tactician; "Plan" is the inference strategist. There is a separate "Plan" package for each type of instrument data, but each package passes substructures (subgraphs) to "Generate." Thus, there is a uniform interface between "Plan" and "Generate." User-supplied constraints enter this interface, directly or from user-assist packages, in the form of substructures.

Sources of knowledge

The various sources of knowledge used by the DENDRAL system are:

stitutes "an explanation" is not a simple concept, and considerable thought needs to be given, in each case, to the structuring of explanations.


In this section I will try to illustrate these themes with various case studies.

DENDRAL: Inferring chemical structures

Historical note

Begun in 1965, this collaborative project with the Stanford Mass Spectrometry Laboratory has become one of the longest-lived continuous efforts in the history of AI (a fact that in no small way has contributed to its success). The basic framework of generation-and-test and rule-based representation has proved rugged and extendable. For us the DENDRAL system has been a fountain of ideas, many of which have found their way, highly metamorphosed, into our other projects. For example, our long-standing commitment to rule-based representations arose out of our (successful) attempt to head off the imminent ossification of DENDRAL caused by the rapid accumulation of new knowledge in the system around 1967.



To enumerate plausible structures (atom-bond graphs) for organic molecules, given two kinds of information: analytic instrument data from a mass spectrometer and a nuclear magnetic resonance spectrometer, and user-supplied constraints on the answers, derived from any other source of knowledge (instrumental or contextual) available to the user.


Chemical structures are represented as node-link graphs of atoms (nodes) and bonds (links). Constraints on search are represented as subgraphs (atomic configurations) to be denied or preferred. The empirical theory of mass spectrometry is represented by a set of rules of the general form:

Situation: Particular atomic configuration
Probability, P, of occurring
Action: Fragmentation of the particular configuration (breaking links)
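To make the rule form concrete, here is a minimal Python sketch of a situation-probability-action rule applied to a node-link graph. It is an illustration only, not DENDRAL's actual data structures: the names Molecule, FragmentationRule, matches, act, and fragments are hypothetical, the situation is simplified to a single bond between two element types, and the probability value is invented.

```python
from dataclasses import dataclass

@dataclass
class Molecule:
    atoms: dict   # atom id -> element symbol (the nodes)
    bonds: set    # one frozenset({id_a, id_b}) per bond (the links)

def fragments(atom_ids, bonds):
    """Connected components of the graph, each as a set of atom ids."""
    unvisited, parts = set(atom_ids), []
    while unvisited:
        start = unvisited.pop()
        part, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for bond in bonds:
                if node in bond:
                    for other in bond - {node}:
                        if other in unvisited:
                            unvisited.remove(other)
                            part.add(other)
                            stack.append(other)
        parts.append(part)
    return parts

@dataclass
class FragmentationRule:
    situation: tuple     # assumed simplification: a one-bond subgraph, as two element symbols
    probability: float   # P, the chance that the fragmentation occurs

    def matches(self, mol):
        """Bonds in the molecule whose end atoms fit the situation."""
        wanted = sorted(self.situation)
        return [b for b in mol.bonds
                if sorted(mol.atoms[i] for i in b) == wanted]

    def act(self, mol, bond):
        """Action: break the link and return the resulting fragments."""
        return fragments(mol.atoms, mol.bonds - {bond})

# A three-atom chain C-C-N; the rule and its P value are made up for illustration.
mol = Molecule({1: "C", 2: "C", 3: "N"}, {frozenset({1, 2}), frozenset({2, 3})})
rule = FragmentationRule(("C", "N"), 0.6)
sites = rule.matches(mol)
pieces = rule.act(mol, sites[0])
```

Keeping both the situation and the result of the action in the same graph vocabulary is what lets such rules be read and edited by a domain specialist.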

Valences (legal connections of atoms); stable and unstable configurations of atoms; rules for mass spectrometry fragmentations; rules for NMR shifts; experts' rules for planning and evaluation; user-supplied constraints (contextual).


DENDRAL's structure elucidation abilities are, paradoxically, both very general and very narrow. In general, DENDRAL handles all molecules, cyclic and tree-like. In pure structure elucidation under constraints (without instrument data), CONGEN is unrivaled by human performance. In structure elucidation with instrument data, DENDRAL's performance rivals expert human performance only for a small number of molecular families for which the program has been given specialist's knowledge, namely the families of interest to our chemist collaborators. I will spare this computer science audience the list of names of these families. Within these areas of knowledge-intensive specialization, DENDRAL's performance is usually not only much faster but also more accurate than expert human performance.

The statement just made summarizes thousands of runs of DENDRAL on problems of interest to our experts, their colleagues, and their students. The results obtained, along with the knowledge that had to be given to DENDRAL to obtain them, are published in major journals of chemistry. To date, 25 papers have been published there, under a series title "Applications of Artificial Intelligence for Chemical Inference: (specific subject)" (see, for example, the Buchanan, Smith, et al., 1976, reference).

The DENDRAL system is in everyday use by Stanford chemists, their collaborators at other universities, and collaborating or otherwise interested chemists in industry. Users outside Stanford access the system over a commercial computer communications network. The problems they are solving are often difficult and novel. The British government is currently supporting work at Edinburgh aimed at transferring DENDRAL to industrial user communities in the UK.


Representation and extensibility. The representation chosen for the molecules, constraints, and rules of instrument data interpretation is sufficiently close to that used by chemists in thinking about structure elucidation that the knowledge base has been extended smoothly and easily, mostly by chemists themselves in recent years. Only one major reprogramming effort took place in the last 9 years, when a new generator was created to deal with cyclic structures.

Representation and the integration of multiple sources of knowledge. The generally difficult problem of integrating various sources of knowledge has been made easy in DENDRAL by careful engineering of the representations of objects, constraints, and rules. We insisted on a common language of compatibility of the representations with each other and with the inference processes: the language of molecular structure expressed as graphs. This leads to a straightforward procedure for adding a new source of knowledge, say, for example, the knowledge associated with a new type of instrument data. The procedure is this: write rules that describe the effect of the physical processes of the instrument on molecules using the situation ⇒ action form with molecular graphs on both sides; any special inference process using these rules must pass its results to the generator only (!) in the common graph language.

DENDRAL work for two reasons: first, a decision that with DENDRAL we had a sufficiently firm foundation on which to pursue our long-standing interest in processes of scientific theory formation; second, a recognition that the acquisition of domain knowledge was the bottleneck problem in the building of applications-oriented intelligent agents.

It is today widely believed in AI that the use of many diverse sources of knowledge in problem solving and data interpretation has a strong effect on quality of performance. How strong is, of course, domain-dependent, but the impact of bringing just one additional source of knowledge to bear on a problem can be startling. In one difficult (but not unusually difficult) mass spectrum analysis problem, the program using its mass spectrometry knowledge alone would have generated an impossibly large set of plausible candidates (over 1.25 million!). Our engineering response to this was to add another source of data and knowledge, proton NMR. The addition of a simple interpretive theory of this NMR data, from which the program could infer a few additional constraints, reduced the set of plausible candidates to one, the right structure! This was not an isolated result but showed up dozens of times in subsequent analyses.

DENDRAL and data. DENDRAL's robust models (topological, chemical, instrumental) permit a strategy of finding solutions by generating hypothetical "correct answers" and choosing among these with critical tests. This strategy is opposite to that of piecing together the implications of each data point to form a hypothesis. We call DENDRAL's strategy largely model-driven, and the other data-driven. The consequence of having enough knowledge to do model-driven analysis is a large reduction in the amount of data that must be examined, since data is being used mostly for verification of possible answers. In a typical DENDRAL mass spectrum analysis, usually no more than about 35 data points out of a typical total of 250 points are processed. This important point about data reduction and focus-of-attention has been discussed before by Gregory (1968) and by the vision and speech research groups, but is not widely understood.
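As a rough illustration of the model-driven strategy, the following Python sketch runs a toy plan-generate-test pass. Everything in it is an assumption made for illustration: molecules are reduced to linear chains of element symbols, the "instrument data" are just sets of bonded atom pairs, and the names plan, generate, predict, and rank_by_fit are hypothetical. It is not CONGEN or any part of the DENDRAL code; it only shows candidates being generated under planned constraints, with the data then used to verify them.

```python
from itertools import permutations

def plan(peaks, substructure_rules):
    """Direct (unchained) inference: a data pattern that is present implies a substructure."""
    return [sub for pattern, sub in substructure_rules if pattern <= peaks]

def generate(atoms, superatoms):
    """Yield each distinct linear chain once, keeping every planned superatom intact."""
    units = list(atoms)
    for sub in superatoms:
        for atom in sub:
            units.remove(atom)      # these atoms now travel inside the superatom
        units.append(sub)
    seen = set()
    for perm in permutations(units):
        chain = "".join(perm)
        canonical = min(chain, chain[::-1])   # a chain read backwards is the same chain
        if canonical not in seen:
            seen.add(canonical)
            yield canonical

def predict(chain):
    """Toy instrument model (an assumption): the set of bonded atom pairs in the chain."""
    return {"".join(sorted(pair)) for pair in zip(chain, chain[1:])}

def rank_by_fit(candidates, observed):
    """Order candidates by goodness of fit between predicted and observed data."""
    def fit(chain):
        predicted = predict(chain)
        return len(predicted & observed) - len(predicted ^ observed)
    return sorted(candidates, key=fit, reverse=True)

# Hypothetical planning rule: the pattern {"peak_a"} indicates a C-O substructure.
rules = [(frozenset({"peak_a"}), "CO")]
superatoms = plan({"peak_a", "peak_b"}, rules)

unconstrained = list(generate("CCCNO", []))
constrained = list(generate("CCCNO", superatoms))

observed = predict("CCOCN")          # stand-in for real instrument data
ranked = rank_by_fit(constrained, observed)
print(len(unconstrained), len(constrained), ranked[0])
```

Treating the planned substructure as a single unit shrinks the set that the generator has to enumerate, and the data are consulted only to score the candidates that survive.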

Conclusion. DENDRAL was an early herald of AI's shift to the knowledge-based paradigm. It demonstrated the point of the primacy of domain-specific knowledge in achieving expert levels of performance. Its development brought to the surface important problems of knowledge representation, acquisition, and use. It showed that, by and large, the AI tools of the first decade were sufficient to cope with the demands of a complex scientific problem-solving task, or were readily extended to handle unforeseen difficulties. It demonstrated that AI's conceptual and programming tools were capable of producing programs of applications interest, albeit in narrow specialties. Such a demonstration of competence and sufficiency was important for the credibility of the AI field at a critical juncture in its history.

META-DENDRAL: inferring rules of mass spectrometry

Historical note

The META-DENDRAL program is a case study in automatic acquisition of domain knowledge. It arose out of our

*The analysis of acyclic amines with formula C20H45N.


META-DENDRAL's job is to infer rules of fragmentation of molecules in a mass spectrometer for possible later use by the DENDRAL performance program. The inference is to be made from actual spectra recorded from known molecular structures. The output of the system is the set of fragmentation rules discovered, a summary of the evidence supporting each rule, and a summary of contra-indicating evidence. User-supplied constraints can also be input to force the form of rules along desired lines.


The rules are, of course, of the same form as used by DENDRAL that was described earlier.

Sketch of method

META-DENDRAL, like DENDRAL, uses the generation-and-test framework. The process is organized in three stages: reinterpret the data and summarize evidence (INTSUM); generate plausible candidates for rules (RULEGEN); test and refine the set of plausible rules (RULEMOD).

INTSUM: gives every data point in every spectrum an interpretation as a possible (highly specific) fragmentation. It then summarizes statistically the "weight of evidence" for fragmentations and for atomic configurations that cause these fragmentations. Thus, the job of INTSUM is to translate data to DENDRAL subgraphs and bond-breaks, and to summarize the evidence accordingly.

RULEGEN: conducts a heuristic search of the space of all rules that are legal under the DENDRAL rule syntax and the user-supplied constraints. It searches for plausible rules, i.e., those for which positive evidence exists. A search path is pruned when there is no evidence for rules of the class just generated. The search tree begins with the (single) most general rule (loosely put, "anything" fragments from "anything") and proceeds level-by-level toward more detailed specifications of the "anything." The heuristic stopping criterion measures whether a rule being generated has become too specific, in particular whether it is applicable to too few molecules of the input set. Similarly there is a criterion for deciding whether an emerging rule is too general. Thus, the output of RULEGEN is a set of candidate rules for which there is positive evidence.

RULEMOD: tests the candidate rule set using more complex criteria, including the presence of negative evidence. It removes redundancies in the candidate rule set; merges rules that are supported by the same evidence; tries further specialization of candidates to remove negative evidence; and tries further generalization that preserves positive evidence.
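The general-to-specific search attributed to RULEGEN above can be sketched in a few lines of Python. This is a toy under stated assumptions, not the META-DENDRAL code: each bond-break event is a flat dictionary of attributes (standing in for INTSUM's summarized evidence), a rule is a set of attribute constraints, and the two thresholds min_molecules and min_conditions are hypothetical stand-ins for the "too specific" and "too general" criteria. The RULEMOD refinements that use negative evidence are not modeled.

```python
def covers(rule, event):
    """A rule covers an event when every constraint in the rule holds for it."""
    return all(event[attr] == value for attr, value in rule)

def specializations(rule, events):
    """One-step refinements of a rule, proposed only from events the rule already covers."""
    used = {attr for attr, _ in rule}
    children = set()
    for event in events:
        if covers(rule, event):
            for attr, value in event.items():
                if attr != "mol" and attr not in used:
                    children.add(rule | {(attr, value)})
    return children

def rulegen(events, min_molecules=2, min_conditions=1):
    """Level-by-level search from the most general rule toward more specific ones."""
    frontier = {frozenset()}          # the empty rule: "anything" fragments from "anything"
    found = set()
    while frontier:
        next_frontier = set()
        for rule in frontier:
            for child in specializations(rule, events):
                molecules = {e["mol"] for e in events if covers(child, e)}
                if len(molecules) < min_molecules:
                    continue          # too specific: prune this search path
                next_frontier.add(child)
                if len(child) >= min_conditions:
                    found.add(child)  # not too general: keep as a candidate rule
        frontier = next_frontier
    return found

# Invented evidence: each event is one observed bond break in one molecule.
events = [
    {"mol": "m1", "left": "C", "right": "N"},
    {"mol": "m2", "left": "C", "right": "N"},
    {"mol": "m3", "left": "C", "right": "O"},
]
candidates = rulegen(events, min_conditions=2)
```

Because children are proposed only from events a rule already covers, every rule the search visits has some positive evidence, which mirrors the pruning described in the text.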


META-DENDRAL produces rule sets that rival in quality those produced by our collaborating experts. In some tests, META-DENDRAL re-created rule sets that we had previously acquired from our experts during the DENDRAL project. In a more stringent test involving members of a family of complex ringed molecules for which the mass spectral theory had not been completely worked out by chemists, META-DENDRAL discovered rule sets for each subfamily. The rules were judged by experts to be excellent and a paper describing them was recently published in a major chemical journal (Buchanan, Smith, et al., 1976).

In a test of the generality of the approach, a version of the META-DENDRAL program is currently being applied to the discovery of rules for the analysis of nuclear magnetic resonance data.

MYCIN and TEIRESIAS: Medical diagnosis

Historical note

MYCIN originated in the Ph.D. thesis of E. Shortliffe (now Shortliffe, M.D. as well), in collaboration with the Infectious Disease group at the Stanford Medical School (Shortliffe, 1976). TEIRESIAS, the Ph.D. thesis work of R. Davis, arose from issues and problems indicated by the MYCIN project but generalized by Davis beyond the bounds of medical diagnosis applications (Davis, 1976). Other MYCIN-related theses are in progress.


The MYCIN performance task is diagnosis of blood infections and meningitis infections and the recommendation of drug treatment. MYCIN conducts a consultation (in English) with a physician-user about a patient case, constructing lines-of-reasoning leading to the diagnosis and treatment plan.

The TEIRESIAS knowledge acquisition task can be described as follows:

In the context of a particular consultation, confront the expert with a diagnosis with which he does not agree. Lead him systematically back through the line-of-reasoning that produced the diagnosis to the point at which he indicates the analysis went awry. Interact with the expert to modify offending rules or to acquire new rules. Rerun the consultation to test the solution and gain the expert's concurrence.


IF:   1) the patient is a compromised host, and
      2) there are rules which mention in their premise pseudomonas
      3) there are rules which mention in their premise klebsiellas
THEN: There is suggestive evidence (.4) that the former should be done before the latter.

My therapy recommendations will be designed to treat for organisms that are either very likely or, although less likely, would have a significant effect on therapy selection if they were present. It is important to cover for the following probable infection(s) and associated organism(s):

Sketch of method

MYCIN employs a generation-and-test procedure of a familiar sort. The generation of steps in the line-of-reasoning is accomplished by backward chaining of the rules. An IF-side clause is either immediately true or false (as determined by patient or test data entered by the physician in the consultation); or is to be decided by subgoaling. Thus, "test" is interleaved with generation and serves to prune out incorrect lines-of-reasoning.

Each rule supplied by an expert has associated with it a "degree of certainty" representing the expert's confidence in the validity of the rule (a number from 1 to 10). MYCIN uses a particular ad-hoc but simple model of inexact reasoning to cumulate the degrees of certainty of the rules used in an inference chain (Shortliffe and Buchanan, 1975).
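Backward chaining with cumulated certainty can be illustrated with a short Python sketch. It is a sketch under assumptions, not MYCIN's implementation: certainties are scaled to the range 0 to 1 rather than 1 to 10, a rule's contribution is taken to be its certainty times that of its weakest premise, and independent contributions are combined as x + y(1 - x). That combination is one commonly described certainty-factor formulation and is assumed here; the text above says only that the model is ad hoc and simple. The rule contents and the name certainty are invented.

```python
def certainty(goal, rules, facts, _active=frozenset()):
    """Backward-chain on `goal` and return a certainty between 0 and 1."""
    if goal in facts:              # immediately decided by data entered in the consultation
        return facts[goal]
    if goal in _active:            # guard against circular rule chains
        return 0.0
    total = 0.0
    for premises, conclusion, rule_cf in rules:
        if conclusion != goal:
            continue
        # Each premise becomes a subgoal; the weakest premise limits the rule.
        support = min(certainty(p, rules, facts, _active | {goal})
                      for p in premises)
        if support > 0:
            # Assumed combining function for evidence from separate rules.
            total = total + rule_cf * support * (1 - total)
    return total

# Invented rules: (premises, conclusion, degree of certainty).
rules = [
    (("finding_a", "finding_b"), "intermediate", 0.8),
    (("intermediate",), "hypothesis", 0.5),
    (("finding_c",), "hypothesis", 0.2),
]
facts = {"finding_a": 1.0, "finding_b": 1.0, "finding_c": 1.0}

result = certainty("hypothesis", rules, facts)
```

A premise that is neither a known fact nor the conclusion of any rule ends up with certainty 0, so a rule depending on it contributes nothing; this is how the interleaved "test" prunes a line-of-reasoning in the sketch.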

It follows that there may be a number of "somewhat true" lines-of-reasoning, some indicating one diagnosis, some in

INFECTION-1 is CYSTITIS
  (Item 1) PSEUDOMONAS-CEPACIA [ORGANISM-5]
  (Item 2) CITROBACTER-DIVERSUS [ORGANISM-4]
  (Item 3) E.COLI [ORGANISM-3]
INFECTION-2 is UPPER-RESPIRATORY-INFECTION
  (Item 4) LISTERIA [ORGANISM-2]
INFECTION-3 is BACTEREMIA
  (Item 5) ENTEROCOCCUS [ORGANISM-1]

(REC-1) My preferred therapy recommendation is as follows:
  In order to cover for Items (1 2 3 4 5):
  Give the following in combination:
  1) KANAMYCIN
     Dose: 750 mg (7.5 mg/kg) q12h IM (or IV) for 28 days
     Comments: Modify dose in renal failure
  2) PENICILLIN
     Dose: 2,500,000 units (25000 units/kg) q4h IV for 28 days
