
Appendix E

Edward A. Feigenbaum. “The art of artificial intelligence – Themes and case studies of knowledge engineering,” pp 227-240, in the Proceedings of the National Computer Conference 1978, copyrighted 1978. Reproduced by permission of AFIPS Press.

The art of artificial intelligence – Themes and case studies of knowledge engineering

by EDWARD A. FEIGENBAUM Stanford University Stanford, California


This paper will examine emerging themes of knowledge engineering, illustrate them with case studies drawn from the work of the Stanford Heuristic Programming Project, and discuss general issues of knowledge engineering art and practice.

Let me begin with an example new to our workbench: a system called PUFF, the early fruit of a collaboration between our project and a group at the Pacific Medical Center (PMC) in San Francisco.*

A physician refers a patient to PMC's pulmonary function testing lab for diagnosis of possible pulmonary function disorder. For one of the tests, the patient inhales and exhales a few times in a tube connected to an instrument/computer combination. The instrument acquires data on flow rates and volumes, the so-called flow-volume loop of the patient's lungs and airways. The computer measures certain parameters of the curve and presents them to the diagnostician (physician or PUFF) for interpretation. The diagnosis is made along these lines: normal or diseased; restricted lung disease or obstructive airways disease or a combination of both; the severity; the likely disease type(s) (e.g., emphysema, bronchitis, etc.); and other factors important for diagnosis.

PUFF is given not only the measured data but also certain items of information from the patient record, e.g., sex, age, number of pack-years of cigarette smoking. The task of the PUFF system is to infer a diagnosis and print it out in English in the normal medical summary form of the interpretation expected by the referring physician.

Everything PUFF knows about pulmonary function diagnosis is contained in (currently) 55 rules of the IF... THEN... form. No textbook of medicine currently records these rules. They constitute the partly-public, partly-private knowledge of an expert pulmonary physiologist at PMC, and were extracted and polished by project engineers working intensively with the expert over a period of time. Here is an example of a PUFF rule (the unexplained acronyms refer to various data measurements):

RULE 31
IF:
1) The severity of obstructive airways disease of the patient is greater than or equal to mild, and
2) The degree of diffusion defect of the patient is greater than or equal to mild, and
3) The tlc (body box) observed/predicted of the patient is greater than or equal to 110, and
4) The observed-predicted difference in rv/tlc of the patient is greater than or equal to 10
THEN:
1) There is strongly suggestive evidence (.9) that the subtype of obstructive airways disease is emphysema, and
2) It is definite (1.0) that "OAD, Diffusion Defect, elevated TLC, and elevated RV together indicate emphysema." is one of the findings.
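One way to picture a rule of this form is as data: a list of premise tests plus a list of conclusions, each with a certainty value. The Python sketch below encodes Rule 31 that way. The field names, the ordinal severity scale, the `apply_rule` helper, and the patient values are all illustrative assumptions, not PUFF's actual representation, and the sketch ignores any uncertainty attached to the premises themselves.

```python
# Hypothetical encoding of Rule 31; names and the severity scale are assumptions.
SEVERITY = ["none", "mild", "moderate", "moderately-severe", "severe"]


def at_least(level, threshold):
    """True if an ordinal severity level is at or above the threshold level."""
    return SEVERITY.index(level) >= SEVERITY.index(threshold)


RULE_31 = {
    "premises": [
        lambda p: at_least(p["oad_severity"], "mild"),
        lambda p: at_least(p["diffusion_defect"], "mild"),
        lambda p: p["tlc_body_box_pct_predicted"] >= 110,
        lambda p: p["rv_tlc_observed_minus_predicted"] >= 10,
    ],
    "conclusions": [
        ("the subtype of obstructive airways disease is emphysema", 0.9),
        ("OAD, Diffusion Defect, elevated TLC, and elevated RV "
         "together indicate emphysema.", 1.0),
    ],
}


def apply_rule(rule, patient):
    """Return the rule's conclusions if every premise holds, else an empty list."""
    if all(test(patient) for test in rule["premises"]):
        return rule["conclusions"]
    return []


# Made-up patient record for illustration.
patient = {
    "oad_severity": "moderately-severe",
    "diffusion_defect": "moderate",
    "tlc_body_box_pct_predicted": 127,
    "rv_tlc_observed_minus_predicted": 21,
}

for finding, certainty in apply_rule(RULE_31, patient):
    print(certainty, finding)
```

Keeping rules as data rather than as program logic is what lets them be added, inspected, and revised one at a time as the expert's knowledge emerges.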

One hundred cases, carefully chosen to span the variety of disease states with sufficient exemplary information for each, were used to extract the 55 rules. As the knowledge emerged, it was represented in rule form, added to the system and tested by running additional cases. The expert was sometimes surprised, sometimes frustrated, by the occasional gaps and inconsistencies in the knowledge, and the incorrect diagnoses that were logical consequences of the existing rule set. The interplay between knowledge engineer and expert gradually expanded the set of rules to remove most of these problems.

* Dr. J. Osborn, Dr. R. Fallat, John Kunz, Diane McClung.

As cumulation of techniques in the art demands and allows, a new tool was not invented when an old one would do. The knowledge engineers pulled out of their toolkit a version of the MYCIN system (to be discussed later), with the rules about infectious diseases removed, and used it as the inference engine for the PUFF diagnoses. Thus PUFF, like MYCIN, is a relatively simple backward-chaining inference system. It seeks a valid line-of-reasoning based on its rules and rooted in the instrument and patient data. With a little more work at fitting some existing tools together, PUFF will be able to explain this line-of-reasoning, just as MYCIN does.
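Backward chaining starts from a goal (say, a disease subtype) and works back through the rules that conclude it, establishing each premise either directly from the data or by chaining through further rules. The Python sketch below is a toy illustration of that idea only: the rules and fact names are made up, certainty factors are left out, the rule set is assumed to be free of cycles, and none of it is MYCIN's or PUFF's code.

```python
# Toy backward chainer; the rules and fact names are hypothetical.
RULES = [
    # (conclusion, [premises])
    ("emphysema", ["oad", "diffusion_defect", "elevated_tlc"]),
    ("oad", ["reduced_fev1_fvc"]),
    ("diffusion_defect", ["low_dlco"]),
]


def prove(goal, facts, trace, depth=0):
    """Try to establish goal from facts, recording the line of reasoning in trace."""
    if goal in facts:
        trace.append("  " * depth + goal + " (given)")
        return True
    for conclusion, premises in RULES:
        if conclusion != goal:
            continue
        mark = len(trace)
        trace.append("  " * depth + goal + " (by rule)")
        if all(prove(p, facts, trace, depth + 1) for p in premises):
            return True
        del trace[mark:]  # discard the record of a failed attempt
    return False


facts = {"reduced_fev1_fvc", "low_dlco", "elevated_tlc"}
trace = []
if prove("emphysema", facts, trace):
    print("\n".join(trace))
```

The recorded trace is the line of reasoning: because the search is goal-directed, every step in it is there for a reason that can be shown to the user.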

As it is, PUFF only prints out the final interpretation, of which the following is an example:

PATIENT DATA:
The degree of dyspnea: MODERATELY-SEVERE
The severity of coughing: MILD
Sputum production: MODERATELY-SEVERE
The number of pack-years of smoking: 48
referral diagnosis: BRONCHITIS
IVC/IVC-predicted: 80
RV/RV-predicted: 191
FVC/FVC-predicted: 37
TLC(body box)observed/predicted: 127
Predicted FEV1/FVC: 83
TLC(DLCO)observed/predicted: 83
FEV1/FVC ratio: 50
RV/TLC Observed-Predicted: 21
MMF/MMF-predicted: 19
the DLCO/DLCO-predicted: 48
The slope (F50_obs-F25_obs)/FVC_obs: 19

DEGREE OF OBSTRUCTIVE AIRWAYS DISEASE:
OAD degree by SLOPE: (MODERATELY-SEVERE 700)
OAD degree by MMF: (SEVERE 900)
OAD degree by FEV1: (MODERATELY-SEVERE 700)
FINAL OAD DEGREE: (MODERATELY-SEVERE 910) (SEVERE 900)
No conflict. Final degree: (MODERATELY-SEVERE 910)

INTERPRETATION:
Obstruction is indicated by curvature of the flow-volume loop.
Forced Vital Capacity is normal and peak flow rates are reduced, suggesting airway obstruction.
Flow rate from 25-75 of expired volume is reduced, indicating severe airway obstruction.
OAD, Diffusion Defect, elevated TLC, and elevated RV together indicate emphysema.
OAD, Diffusion Defect, and elevated RV indicate emphysema.
Change in expired flow rates following bronchodilation shows that there is reversibility of airway obstruction.
The presence of a productive cough is an indication that the OAD is of the bronchitic type.
Elevated lung volumes indicate overinflation.
Air trapping is indicated by the elevated difference between observed and predicted RV/TLC ratios.
Improvement in airway resistance indicates some reversibility of airway
Airway obstruction is consistent with the patient's smoking history.
The airway obstruction accounts for the patient's dyspnea.
Although bronchodilators were not useful in this one case, prolonged use may prove to be beneficial to the patient.
The reduced diffusion capacity indicates airway obstruction of the mixed bronchitic and emphysematous types.
Low diffusing capacity indicates loss of alveolar capillary surface.
Obstructive Airways Disease of mixed types
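The FINAL OAD DEGREE figures in the printout are consistent with the usual MYCIN-style rule for combining two positive certainty factors, CF = a + b(1 - a), applied on a 0-1000 scale: the two 700s for MODERATELY-SEVERE combine to 910, while the single 900 for SEVERE is unchanged. The paper does not state the formula, so the Python sketch below should be read as an assumption that happens to reproduce the printed numbers; the function names are invented for illustration.

```python
def combine(a, b):
    """Combine two positive certainty factors in [0, 1] (assumed MYCIN-style rule)."""
    return a + b * (1 - a)


def final_degree(evidence):
    """Accumulate (degree, cf) pairs, cf on a 0-1000 scale, into {degree: combined cf}."""
    totals = {}
    for degree, cf in evidence:
        totals[degree] = combine(totals.get(degree, 0.0), cf / 1000)
    return {degree: round(value * 1000) for degree, value in totals.items()}


evidence = [
    ("MODERATELY-SEVERE", 700),  # OAD degree by SLOPE
    ("SEVERE", 900),             # OAD degree by MMF
    ("MODERATELY-SEVERE", 700),  # OAD degree by FEV1
]

result = final_degree(evidence)
print(result)                           # combined certainty per degree
print(max(result, key=result.get))      # degree with the highest certainty
```

Under this rule, two moderately strong pieces of evidence for the same degree outweigh a single stronger piece of evidence for another, which matches the printed final degree.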

150 cases not studied during the knowledge acquisition process were used for a test and validation of the rule set. PUFF inferred a diagnosis for each. PUFF-produced and expert-produced interpretations were coded for statistical analysis to discover the degree of agreement. Over various types of disease states, and for two conditions of match between human and computer diagnoses ("same degree of severity" and "within one degree of severity"), agreement ranged between approximately 90 percent and 100 percent.
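The two match conditions can be made concrete with a short calculation. The Python sketch below scores paired severity ratings under both conditions; the ordinal scale and the sample ratings are made up for illustration and are not the study's data or coding scheme.

```python
# Assumed ordinal severity scale; the study's actual coding is not given in the text.
SCALE = ["normal", "mild", "moderate", "moderately-severe", "severe"]


def agreement(expert, program, tolerance=0):
    """Fraction of cases whose two ratings differ by at most `tolerance` scale steps."""
    if not expert or len(expert) != len(program):
        raise ValueError("need two non-empty rating lists of equal length")
    hits = sum(
        abs(SCALE.index(e) - SCALE.index(p)) <= tolerance
        for e, p in zip(expert, program)
    )
    return hits / len(expert)


# Made-up ratings for four cases.
expert = ["mild", "severe", "moderate", "normal"]
program = ["mild", "moderately-severe", "moderate", "moderate"]

print(agreement(expert, program))               # "same degree of severity"
print(agreement(expert, program, tolerance=1))  # "within one degree of severity"
```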

The PUFF story is just beginning and will be told perhaps at a later NCC. The surprising punchline to my synopsis is that the current state of the PUFF system as described above was achieved in less than 50 hours of interaction with the expert and less than 10 man-weeks of effort by the knowledge engineers. We have learned much in the past decade of the art of engineering knowledge-based intelligent agents!

In the remainder of this essay, I would like to discuss the route that one research group, the Stanford Heuristic Programming Project, has taken, illustrating progress with case studies, and discussing themes of the work.



The dichotomy that was used to classify the collected papers in the volume Computers and Thought still characterizes well the motivations and research efforts of the AI community. First, there are some who work toward the construction of intelligent artifacts, or seek to uncover principles, methods, and techniques useful in such construction. Second, there are those who view artificial intelligence as (to use Newell's phrase) "theoretical psychology," seeking explicit and valid information processing models of human thought.

For purposes of this essay, I wish to focus on the motivations of the first group, these days by far the larger of the two. I label these motivations "the intelligent agent viewpoint" and here is my understanding of that viewpoint:

"The potential uses of computers by people to accomplish tasks can be 'one-dimensionalized' into a spectrum representing the nature of instruction that must be given the computer to do its job. Call it the WHAT-TO-HOW spectrum. At one extreme of the spectrum, the user supplies his intelligence to instruct the machine with precision exactly HOW to do his job, step-by-step. Progress in Computer Science can be seen as steps away from the extreme HOW point on the spectrum: the familiar panoply of assembly languages, subroutine libraries, compilers, extensible languages, etc. At the other extreme of the spectrum is the user with his real problem (WHAT he wishes the computer, as his instrument, to do for him). He aspires to communicate WHAT he wants done in a language that is comfortable to him (perhaps English); via communication modes that are convenient for him (including perhaps, speech or pictures); with some generality, some vagueness, imprecision, even error; without having to lay out in detail all necessary subgoals for adequate performance, with reasonable assurance that he is addressing an intelligent agent that is using knowledge of his world to understand his intent, to fill in his vagueness, to make specific his abstractions, to correct his errors, to discover appropriate subgoals, and ultimately to translate WHAT he really wants done into processing steps that define HOW it shall be done by a real computer. The research activity aimed at creating computer programs that act as "intelligent agents" near the WHAT end of the WHAT-TO-HOW spectrum can be viewed as the long-range goal of AI research." (Feigenbaum, 1974)

Our young science is still more art than science. Art: "the principles or methods governing any craft or branch of learning." Art: "skilled workmanship, execution, or agency." These the dictionary teaches us. Knuth tells us that the endeavor of computer programming is an art, in just these ways. The art of constructing intelligent agents is both part of and an extension of the programming art. It is the art of building complex computer programs that represent and reason with knowledge of the world. Our art therefore lives in symbiosis with the other worldly arts, whose practitioners, experts of their art, hold the knowledge we need to construct intelligent agents. In most "crafts or branches of learning" what we call "expertise" is the essence of the art. And for the domains of knowledge that we touch with our art, it is the "rules of expertise" or the rules of "good judgment" of the expert practitioners of that domain that we seek to transfer to our programs.

Lessons of the past

Two insights from previous work are pertinent to this essay.

The first concerns the quest for generality and power of the inference engine used in the performance of intelligent acts (what Minsky and Papert (see Goldstein and Papert, 1977) have labeled "the power strategy"). We must hypothesize from our experience to date that the problem solving power exhibited in an intelligent agent's performance is primarily a consequence of the specialist's knowledge employed by the agent, and only very secondarily related to the generality and power of the inference method employed. Our agents must be knowledge-rich, even if they are methods-poor. In 1970, reporting the first major summary-of-results of the DENDRAL program (to be discussed later), we addressed this issue as follows:

"... general problem-solvers are too weak to be used as the basis for building high-performance systems. The behavior of the best general problem-solvers we know, human problem-solvers, is observed to be weak and shallow, except in the areas in which the human problem-solver is a specialist. And it is observed that the transfer of expertise between specialty areas is slight. A chess master is unlikely to be an expert algebraist or an expert mass spectrum analyst, etc. In this view, the expert is the specialist, with a specialist's knowledge of his area and a specialist's methods and heuristics." (Feigenbaum, Buchanan and Lederberg, 1971, p. 187)

Subsequent evidence from our laboratory and all others has only confirmed this belief.

AI researchers have dramatically shifted their view on generality and power in the past decade. In 1967, the canonical question about the DENDRAL program was: "It sounds like good chemistry, but what does it have to do with AI?" In 1977, Goldstein and Papert write of a paradigm shift in AI:

"Today there has been a shift in paradigm. The fundamental problem of understanding intelligence is not the identification of a few powerful techniques, but rather the question of how to represent large amounts of knowledge in a fashion that permits their effective use and interaction." (Goldstein and Papert, 1977).

The second insight from past work concerns the nature of the knowledge that an expert brings to the performance of a task. Experience has shown us that this knowledge is largely heuristic knowledge, experiential, uncertain, mostly "good guesses" and "good practice," in lieu of facts and rigor. Experience has also taught us that much of this knowledge is private to the expert, not because he is unwilling to share publicly how he performs, but because he is unable. He knows more than he is aware of knowing. (Why else is the Ph.D. or the Internship a guild-like apprenticeship to a presumed "master of the craft?" What the masters really know is not written in the textbooks of the masters.) But we have learned also that this private knowledge can be uncovered by the careful, painstaking analysis of a second party, or sometimes by the expert himself, operating in the context of a large number of highly specific performance problems. Finally, we have learned that expertise is multifaceted, that the expert brings to bear many and varied sources of knowledge in performance. The approach to capturing his expertise must proceed on many fronts simultaneously.
