Two Ilypotheses: On Generality and Ill-Structured Problems 373 lationship is important enough to the argument of this essay that we indicate it symbolically in Fig. 10.3. The abcissa represents increasing information in the problem statement, that is, decreasing generality. The ordinate represents increasing power. For each degree of generality an upper bound exists on the possible power of a method, though there are clearly numerous methods which do not fully exploit the information in the problem statement. 2. TWO HYPOTHESES: ON GENERALITY AND ON ILL-STRUCTURED PROBLEMS With this view of method and problem solver we can move back toward the nature of ill-structured problems. However, we need to address one intermediate issue: the nature of a general problem solver. The first heuristic programs that were built laid claims to power, not to generality. A chess or checker program was an example of artificial intelligence because it solved a problem difficult by human standards; there was never a pretense of its being general. Today's chess programs cannot even play checkers, and vice versa. Now this narrowness is completely consistent with our general experience with computer programs as highly special methods for restricted tasks. Consider a typical subroutine library, with its specific routines for inverting matrices, computing the sine, carrying out the simplex method, and so on. The only general "programs" are the higher-level programming languages, and these are not problem solvers in the usual sense, but only provide means to express particular methods.2 Thus the view has arisen that, although it may be possible to construct an artificial intelligence for any highly specific task domain, it will not prove possible to provide a general intelligence. In other words, it is the ability to be a general problem solver that marks the dividing line between human intelligence and machine intelligence. The formulation of method and problem solver given earlier leads rather directly to a simple hypothesis about this question: Generality Hypothesis. A general problem solver is one that has a collection of successively weaker methods that demand successively less of the environment in order to be applied. Thus a good general problem solver is simply one that has the best of the weak methods. This hypothesis, although itself general, is not without content. (To put it the way that philosophers of science prefer, it is falsifiable.) It says that there are no special mechanisms of generality-nothing beyond the willingness to carry around specific methods that make very weak demands on the environment for information. By the relationship expressed in Fig. 10.3 magic is unlikely, so that these methods of weak demands will also be methods of low power. Having a few of them down at the very low tip in the figure gives the problem solver the ability to tackle almost any kind of problem, even if only with marginal success. There are some ways in which this generality hypothesis is almost surely incorrect or at least incomplete, and we will come to these later; but let us remain with the main argument. There is at least one close association between generality and ill-structured problems: it is man that can cope with both. It is also true that ill-structured problems, whatever else may be the case, do not lend themselves to sharp solutions. Indeed, their lack of specificity would seem to be instrumental in prohibiting the use of precisely defined methods. Since every problem does present some array of available information-something that could meet the conditions of a problem statement of some method-the suspicion arises that lack of structure in a problem is simply another indication that there are not methods of high power for the particular array of information available. Clearly this situation does not prevail absolutely, but only with respect to a given problem solver and his collection of methods (or, equally, a population of highly similar problem solvers). We can phrase this suspicion in sharper form: The relationship of programming languages to problem solvers, especially as the languages become more problem-oriented, is unexplored territory. Although relevant to the main question of this essay, it cannot be investigated further here. Ill-structured Problem Hypothesis. A problem solver finds a problem ill structured if the power of his methods that are applicable to the problem lies below a certain threshold. The lack of any uniform measure of power, with the consequent lack of precision about a threshold on this power, is not of real concern: the notion of ill-structuredness is similarly vague. The hypothesis says that the problem of locating a new warehouse will look well structured to a firm that has, either by experience, analysis, or purchase, acquired a programmed procedure for locating warehouses, providing it has decided that the probability of obtaining an answer of suitable quality is high enough simply to evoke the program in the face of the new location problem. The problem will look ill structured to a firm that has only its general problem-solving abilities to fall back on. It can only have the most general faith that these procedures will discover appropriate information and use it in appropriate ways in making the decision. My intent is not to argue either of these two hypotheses directly, but rather to examine some of their implications. First, the weak methods must be describable in more concrete terms. This we will do in some detail, since it has been the gradual evolution of such methods in artificial intelligence that suggested the hypotheses in the first place. Second, the picture of Fig. 10.3 suggests not only that there are weak methods and strong ones, but that there is continuity between them in some sense. Phrased another way, at some level the methods of artificial intelligence and those of operations research should look like members of the same family. We will also look at this implication, although somewhat more sketchily, since little work has been done in this direction. Third, we can revisit human decision makers in ill-structured situations. This we do in an even more sketchy manner, since the main thrust of this essay stems from the more formal concerns. Finally, after these (essentially positive) explications of the hypotheses, we will turn to discussion of some difficulties. 3. THE METHODS OF HEURISTIC PROGRAMMING There has been continuous work in artificial intelligence ever since the article quoted at the beginning of this chapter [21] took note of the initial efforts. The field has had two main branches. We will concentrate on the one called heuristic prograinming. It is most closely identified with the programmed digital computer and with problem solving. Also, almost all the artificial intelligence efforts that touch management science are included within it. The other branch, identified with pattern recognition, self-organizing systems, and learning systems, although not exempt from the observations to be made here, is sufficiently different to preclude its treatment. A rather substantial number of heuristic programs have been constructed or designed and have gone far enough to get into the literature. They cover a wide range of tasks: game playing, mostly chess, checkers, and bridge; theorem proving, mostly logic, synthetic geometry, and various elementary algebraic systems; all kinds of puzzles; a range of management science tasks, including line balancing, production scheduling, and warehouse location; question-answering systems that accept quasi-English of various degrees of sophistication; and induction problems of the kind that appear on intelligence tests. The main line of progress has constituted a meandering tour through new task areas which seemed to demand new analyses. For example, there is considerable current work on coordinated effector-receptor activity (e.g., hand-eye) in the real world—a domain of problems requiring intelligence that has not been touched until this time. Examination of this collection of programs reveals that only a few ideas seem to be involved, despite the diversity of tasks. These ideas, if properly expressed, can become a collection of methods in the sense used earlier. Examination of these methods shows them to be extraordinarily weak compared with the methods, say, of linear programming. In compensation, they have a generality that lets them be applied to tasks such as discovering proofs of theorems, where strong methods are unknown.3 It thus appears that the work in heuristic programming may provide a first formalization of the kind of weak methods called for by our two hypotheses. (To be sure, as already noted, psychological invention runs the other way: the discovery that there seems to be a small set of methods underlying the diversity of heuristic programs suggested the two hypotheses.) It might be claimed that the small set of methods shows, not parsimony, but the primitive state of development of the field and that investigators read each other's papers. Although there is clearly some force to this argument, to an important degree each new task attempted in heuristic programming represents an encounter with an unknown set of 'Strong methods of proof discovery do get developed in the course of mathematical progress, but their effect is to reduce whole arcas to a calculus. The development of the operational calculus-later the Laplace and Fourier transforms-is a case in point. But the present theorem-proving programs and the methods that lie behind them do not involve mathematical advances; rather they appear to capture methods available for proof discovery within the existing state of ignorance. demands that have to be met on their own terms. Certainly the people who have created heuristic programs have often felt this way. In fact, the complaint is more often the opposite to the above caveat-that artificial intelligence is a field full of isolated cases with no underlying coherency. In fact, the view expressed in this chapter is not widely held. There is some agreement that all heuristic theorem provers and game players make use of a single scheme, called heuristic search. But there is little acknowledgment that the remainder of the methods listed below constitute some kind of basic set. With this prelude, let us describe briefly some methods. An adequate job cannot be done in a single chapter; it is more an undertaking for a textbook. Hopefully, however, some feeling for the essential characteristics of generality and power can be obtained from what is given. The first three, generate-and-test, match, and hill climbing, rarely occur as complete methods in themselves (although they can), but are rather the building blocks out of which more complex methods are composed. 3.1. Generate-and-Test This is the weak method par excellence. All that must be given is a way to generate possible candidates for solution plus a way to test whether they are indeed solutions. Figure 10.4 provides a picture of generate-and-test that permits us to view it as a method with a problem statement and a procedure. The flow diagram in the figure adopts some conventions that will be used throughout. They allow expresson of the central idea of a method without unnecessary detail. The lines in the diagram show the flow of data, rather than the flow of control-more in the style of an analog computer flow diagram than a digital computer flow diagram. Thus the nodes represent processes that receive inputs and deliver outputs. If a node is an item of data, as in the predicate P or the set {x} (braces are used to indicate sets), it is a memory process that makes the data item available. A process exccutes (or fires) when it receives an input; if there are several inputs, it waits until all appropriate ones have arrived before firing. A generator is a process that takes inforination specifying a set and produces elements of that set one by one. It should be viewed as autonomously "pushing" elements through the system. Hence there is a flow of elements from generate to the process called test. Test is a process that determines whether some condition or predicate is true of its input and behaves differentially as a result. Two different outputs are possible: satisfied (+) and unsatisfied (-). The exact output behavior depends |