Edward A. Feigenbaum. “The art of artificial intelligence – Themes and case studies of knowledge engineering,” pp 227-240, in the Proceedings of the National Computer Conference - 1978, copyrighted 1978. Reproduced by permission of AFIPS Press.
The art of artificial intelligence—Themes and case studies of knowledge engineering
This paper will examine emerging themes of knowledge engineering, illustrate them with case studies drawn from the work of the Stanford Heuristic Programming Project, and discuss general issues of knowledge engineering art and practice.
Let me begin with an example new to our workbench: a system called PUFF, the early fruit of a collaboration between our project and a group at the Pacific Medical Center (PMC) in San Francisco.
A physician refers a patient to PMC's pulmonary function testing lab for diagnosis of possible pulmonary function disorder. For one of the tests, the patient inhales and exhales a few times in a tube connected to an instrument/computer combination. The instrument acquires data on flow rates and volumes, the so-called flow-volume loop of the patient's lungs and airways. The computer measures certain parameters of the curve and presents them to the diagnostician (physician or PUFF) for interpretation. The diagnosis is made along these lines: normal or diseased; restricted lung disease or obstructive airways disease or a combination of both; the severity; the likely disease type(s) (e.g., emphysema, bronchitis, etc.); and other factors important for diagnosis.
PUFF is given not only the measured data but also certain items of information from the patient record, e.g., sex, age, number of pack-years of cigarette smoking. The task of the PUFF system is to infer a diagnosis and print it out in English in the normal medical summary form of the interpretation expected by the referring physician.
Everything PUFF knows about pulmonary function diagnosis is contained in (currently) 55 rules of the IF... THEN... form. No textbook of medicine currently records these rules. They constitute the partly-public, partly-private knowledge of an expert pulmonary physiologist at PMC, and were extracted and polished by project engineers working intensively with the expert over a period of time. Here is an example of a PUFF rule (the unexplained acronyms refer to various data measurements):
IF: 1) The severity of obstructive airways disease of the patient is greater than or equal to mild, and 2) The degree of diffusion defect of the patient is greater than or equal to mild, and 3) The tlc (body box) observed/predicted of the patient is greater than or equal to 110, and 4) The observed-predicted difference in rv/tlc of the patient is greater than or equal to 10
THEN: 1) There is strongly suggestive evidence (.9) that the subtype of obstructive airways disease is emphysema, and 2) It is definite (1.0) that "OAD, Diffusion Defect, elevated TLC, and elevated RV together indicate emphysema." is one of the findings.
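To make the IF... THEN... form concrete, here is a minimal sketch of how such a rule might be held as data and tested against one patient. It is an illustration only, not PUFF's actual representation: the `Rule` class, the field names in the `patient` dictionary, the ordering of the severity scale, and the patient values are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Ordered severity scale (an assumption made for this illustration).
SEVERITY = ["none", "mild", "moderate", "moderately-severe", "severe"]


def at_least(level: str, threshold: str) -> bool:
    """True if `level` is at or above `threshold` on the severity scale."""
    return SEVERITY.index(level) >= SEVERITY.index(threshold)


@dataclass
class Rule:
    premises: List[Callable[[dict], bool]]      # every premise must hold
    conclusions: List[Tuple[str, str, float]]   # (attribute, value, certainty)

    def fire(self, patient: dict) -> List[Tuple[str, str, float]]:
        """Return the conclusions if all premises hold, else an empty list."""
        if all(premise(patient) for premise in self.premises):
            return self.conclusions
        return []


# The example rule from the text; the dictionary keys are hypothetical names.
emphysema_rule = Rule(
    premises=[
        lambda p: at_least(p["oad_severity"], "mild"),
        lambda p: at_least(p["diffusion_defect"], "mild"),
        lambda p: p["tlc_observed_over_predicted"] >= 110,
        lambda p: p["rv_tlc_observed_minus_predicted"] >= 10,
    ],
    conclusions=[
        ("oad_subtype", "emphysema", 0.9),
        ("findings",
         "OAD, Diffusion Defect, elevated TLC, and elevated RV "
         "together indicate emphysema.", 1.0),
    ],
)

# A made-up patient record, used only to exercise the rule.
patient = {
    "oad_severity": "moderate",
    "diffusion_defect": "mild",
    "tlc_observed_over_predicted": 125,
    "rv_tlc_observed_minus_predicted": 14,
}
conclusions = emphysema_rule.fire(patient)
```

Keeping the premises and conclusions as plain data, separate from the code that evaluates them, is what lets rules be added one at a time as the expert supplies them.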
One hundred cases, carefully chosen to span the variety of disease states with sufficient exemplary information for each, were used to extract the 55 rules. As the knowledge emerged, it was represented in rule form, added to the system and tested by running additional cases. The expert was sometimes surprised, sometimes frustrated, by the occasional gaps and inconsistencies in the knowledge, and the incorrect diagnoses that were logical consequences of the existing rule set. The interplay between knowledge engineer and expert gradually expanded the set of rules to remove most of these problems.
Dr. J. Osborn, Dr. R. Fallat, John Kunz, Diane McClung.
As cumulation of techniques in the art demands and allows, a new tool was not invented when an old one would do. The knowledge engineers pulled out of their toolkit a version of the MYCIN system (to be discussed later), with the rules about infectious diseases removed, and used it as the inference engine for the PUFF diagnoses. Thus PUFF, like MYCIN, is a relatively simple backward-chaining inference system. It seeks a valid line-of-reasoning based on its rules and rooted in the instrument and patient data. With a little more work at fitting some existing tools together, PUFF will be able to explain this line-of-reasoning, just as MYCIN does.
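The idea of backward chaining can be sketched in a few lines. This is a toy illustration under stated assumptions, not the MYCIN or PUFF engine: the `rules` table, the fact names, and the `backward_chain` function are hypothetical, certainty factors are left out, and the rule set is assumed to contain no cycles.

```python
def backward_chain(goal, rules, facts, trace=None):
    """Try to establish `goal`, working backward from it to the given data.

    rules: maps a conclusion to a list of alternative premise lists.
    facts: set of items known directly from instrument or patient data.
    trace: optional list that records each subgoal as it is established.
    """
    if trace is None:
        trace = []
    if goal in facts:
        trace.append(f"{goal}: given")
        return True
    for premises in rules.get(goal, []):
        # A rule succeeds only if every one of its premises can be established.
        if all(backward_chain(p, rules, facts, trace) for p in premises):
            trace.append(f"{goal}: concluded from {', '.join(premises)}")
            return True
    return False


# Hypothetical rule table and data, loosely modeled on the example rule.
rules = {
    "emphysema": [["oad", "diffusion_defect", "elevated_tlc", "elevated_rv"]],
    "oad": [["reduced_flow_rates"]],
}
facts = {"reduced_flow_rates", "diffusion_defect", "elevated_tlc", "elevated_rv"}

trace = []
found = backward_chain("emphysema", rules, facts, trace)
```

The `trace` list is the raw material for the kind of explanation the text mentions: read from the end, it gives the conclusion and the steps that support it, each rooted in given data.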
As it is, PUFF only prints out the final interpretation, of which the following is an example:
difference between observed and predicted
150 cases not studied during the knowledge acquisition process were used for a test and validation of the rule set. PUFF inferred a diagnosis for each. PUFF-produced and expert-produced interpretations were coded for statistical analysis to discover the degree of agreement. Over various types of disease states, and for two conditions of match between human and computer diagnoses ("same degree of severity" and "within one degree of severity"), agreement ranged between approximately 90 percent and 100 percent.
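The two match conditions can be computed along the following lines. This is a sketch only: the severity scale, the `agreement` function, and the four case pairs are invented for illustration and are not the study's coding scheme or data.

```python
# Ordered severity scale (an assumption made for this illustration).
SEVERITY = ["normal", "mild", "moderate", "moderately-severe", "severe"]


def agreement(pairs, tolerance=0):
    """Fraction of (expert, system) pairs whose severities differ by at most
    `tolerance` steps on the scale. tolerance=0 means "same degree"."""
    index = {name: i for i, name in enumerate(SEVERITY)}
    hits = sum(
        1 for expert, system in pairs
        if abs(index[expert] - index[system]) <= tolerance
    )
    return hits / len(pairs)


# Made-up (expert, system) diagnoses, not data from the paper.
pairs = [
    ("mild", "mild"),
    ("severe", "moderately-severe"),
    ("moderate", "moderate"),
    ("normal", "moderate"),
]

same_degree = agreement(pairs)                   # exact match
within_one = agreement(pairs, tolerance=1)       # off by at most one degree
```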
The PUFF story is just beginning and will be told perhaps at a later NCC. The surprising punchline to my synopsis is that the current state of the PUFF system as described above was achieved in less than 50 hours of interaction with the expert and less than 10 man-weeks of effort by the knowledge engineers. We have learned much in the past decade of the art of engineering knowledge-based intelligent agents!
In the remainder of this essay, I would like to discuss the route that one research group, the Stanford Heuristic Programming Project, has taken, illustrating progress with case studies, and discussing themes of the work.
ARTIFICIAL INTELLIGENCE & KNOWLEDGE
OAD degree by SLOPE: (MODERATELY-SEVERE 700)
OAD degree by MMF: (SEVERE 900)
OAD degree by FEV1: (MODERATELY-SEVERE 700)
FINAL OAD DEGREE: (MODERATELY-SEVERE 910) (SEVERE 900)
No conflict. Final degree: (MODERATELY-SEVERE 910)
INTERPRETATION: Obstruction is indicated by curvature of the flow-volume loop. Forced Vital Capacity is normal and peak flow rates are reduced, suggesting airway obstruction. Flow rate from 25-75 of expired volume is reduced, indicating severe airway obstruction. OAD, Diffusion Defect, elevated TLC, and elevated RV together indicate emphysema. OAD, Diffusion Defect, and elevated RV indicate emphysema. Change in expired flow rates following bronchodilation shows that there is reversibility of airway obstruction. The presence of a productive cough is an indication that the OAD is of the bronchitic type. Elevated lung volumes indicate overinflation. Air trapping is indicated by the elevated
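The numbers in the sample output are consistent with the MYCIN-style rule for pooling two positive certainty factors, a + b(1 - a): two independent indications of MODERATELY-SEVERE at 700 (on a 0 to 1000 scale) give 700 + 700 x 0.3 = 910. The text does not state the formula, so the sketch below is an inference from the printed figures, and the `combine` and `pool` functions are hypothetical names.

```python
SCALE = 1000  # certainty factors as printed in the output, 0..1000


def combine(a: int, b: int) -> int:
    """Pool two positive certainty factors: a + b * (1 - a), in integer
    arithmetic on the 0..1000 scale."""
    return a + b * (SCALE - a) // SCALE


def pool(evidence):
    """Accumulate a certainty factor per value from (value, cf) pairs."""
    totals = {}
    for value, cf in evidence:
        totals[value] = combine(totals.get(value, 0), cf)
    return totals


# The three OAD-degree estimates from the sample output (SLOPE, MMF, FEV1).
evidence = [
    ("MODERATELY-SEVERE", 700),
    ("SEVERE", 900),
    ("MODERATELY-SEVERE", 700),
]
totals = pool(evidence)
final_degree = max(totals, key=totals.get)
```

Under this reading, `totals` reproduces the FINAL OAD DEGREE line, and the value with the larger pooled certainty is the one reported as the final degree.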
The dichotomy that was used to classify the collected papers in the volume Computers and Thought still characterizes well the motivations and research efforts of the AI community. First, there are some who work toward the construction of intelligent artifacts, or seek to uncover principles, methods, and techniques useful in such construction. Second, there are those who view artificial intelligence as (to use Newell's phrase) "theoretical psychology," seeking explicit and valid information processing models of human thought.
For purposes of this essay, I wish to focus on the motivations of the first group, these days by far the larger of the two. I label these motivations "the intelligent agent viewpoint" and here is my understanding of that viewpoint:
"The potential uses of computers by people to accomplish tasks can be 'one-dimensionalized' into a spectrum representing the nature of instruction that must be given the computer to do its job. Call it the WHAT-TO-HOW spectrum. At one extreme of the spectrum, the user supplies his intelligence to instruct the machine with precision exactly HOW to do his job, step-by-step. Progress in Computer Science can be seen as steps away from the extreme 'HOW' point on the spectrum: the familiar panoply of assembly languages, subroutine libraries, compilers, extensible languages, etc. At the other extreme of the spectrum is the user with his real problem (WHAT he wishes the computer, as his instrument, to do for him). He aspires to communicate WHAT he wants done in a language that is comfortable to him (perhaps English); via communication modes that are convenient for him (including perhaps, speech or pictures); with some generality, some vagueness, imprecision, even error; without having to lay out in detail all necessary subgoals for adequate performance, with reasonable assurance that he is addressing an intelligent agent that is using knowledge of his world to understand his intent, to fill in his vagueness, to make specific his abstractions, to correct his errors, to discover appropriate subgoals, and ultimately to translate WHAT he really wants done into processing steps that define HOW it shall be done by a real computer. The research activity aimed at creating computer programs that act as "intelligent agents" near the WHAT end of the WHAT-TO-HOW spectrum can be viewed as the long-range goal of AI research." (Feigenbaum, 1974)
Our young science is still more art than science. Art: "the principles or methods governing any craft or branch of learning." Art: "skilled workmanship, execution, or agency." These the dictionary teaches us. Knuth tells us that the endeavor of computer programming is an art, in just these ways. The art of constructing intelligent agents is both part of and an extension of the programming art. It is the art of building complex computer programs that represent and reason with knowledge of the world. Our art therefore lives in symbiosis with the other worldly arts, whose practitioners, experts of their art, hold the knowledge we need to construct intelligent agents. In most "crafts or branches of learning" what we call "expertise" is the essence of the art. And for the domains of knowledge that we touch with our art, it is the "rules of expertise" or the rules of "good judgment" of the expert practitioners of that domain that we seek to transfer to our programs.
Lessons of the past
Two insights from previous work are pertinent to this essay.
The first concerns the quest for generality and power of the inference engine used in the performance of intelligent acts (what Minsky and Papert (see Goldstein and Papert, 1977) have labeled "the power strategy"). We must hypothesize from our experience to date that the problem solving power exhibited in an intelligent agent's performance is primarily a consequence of the specialist's knowledge employed by the agent, and only very secondarily related to the generality and power of the inference method employed. Our agents must be knowledge-rich, even if they are methods-poor. In 1970, reporting the first major summary-of-results of the DENDRAL program (to be discussed later), we addressed this issue as follows:
"... general problem-solvers are too weak to be used as the basis for building high-performance systems. The behavior of the best general problem-solvers we know, human problem-solvers, is observed to be weak and shallow, except in the areas in which the human problem-solver is a specialist. And it is observed that the transfer of expertise between specialty areas is slight. A chess master is unlikely to be an expert algebraist or an expert mass spectrum analyst, etc. In this view, the expert is the specialist, with a specialist's knowledge of his area and a specialist's methods and heuristics." (Feigenbaum, Buchanan and Lederberg, 1971, p. 187)
Subsequent evidence from our laboratory and all others has only confirmed this belief.
AI researchers have dramatically shifted their view on generality and power in the past decade. In 1967, the canonical question about the DENDRAL program was: "It sounds like good chemistry, but what does it have to do with AI?" In 1977, Goldstein and Papert write of a paradigm shift in AI:
"Today there has been a shift in paradigm. The fundamental problem of understanding intelligence is not the identification of a few powerful techniques, but rather the question of how to represent large amounts of knowledge in a fashion that permits their effective use and interaction." (Goldstein and Papert, 1977).
The second insight from past work concerns the nature of the knowledge that an expert brings to the performance of a task. Experience has shown us that this knowledge is largely heuristic knowledge, experiential, uncertain, mostly "good guesses" and "good practice," in lieu of facts and rigor. Experience has also taught us that much of this knowledge is private to the expert, not because he is unwilling to share publicly how he performs, but because he is unable. He knows more than he is aware of knowing. (Why else is the Ph.D. or the Internship a guild-like apprenticeship to a presumed master of the craft? What the masters really know is not written in the textbooks of the masters.) But we have learned also that this private knowledge can be uncovered by the careful, painstaking analysis of a second party, or sometimes by the expert himself, operating in the context of a large number of highly specific performance problems. Finally, we have learned that expertise is multifaceted, that the expert brings to bear many and varied sources of knowledge in performance. The approach to capturing his expertise must proceed on many fronts simultaneously.