Miller, Galanter and Pribram35 discuss possible analogies between human problem-solving and some heuristic planning schemes. It seems certain that, for at least a few years, there will be a close association between theories of human behavior and attempts to increase the intellectual capacities of machines. But, in the long run, we must be prepared to discover profitable lines of heuristic programming which do not deliberately imitate human characteristics.36
V. Induction and Models
A. Intelligence

In all of this discussion we have not come to grips with anything we can isolate as “intelligence.” We have discussed only heuristics, shortcuts, and classification techniques. Is there something missing? I am confident that sooner or later we will be able to assemble programs of great problem-solving ability from complex combinations of heuristic devices—multiple optimizers, pattern-recognition tricks, planning algebras, recursive administration procedures, and the like. In no one of these will we find the
are stored in “pushdown lists” and both the program and the data are stored in the form of "list structures." Gelernter (1959) extended FORTRAN to manage some of this. McCarthy has extended these notions in LISP (1960) to permit explicit recursive definitions of programs in a language based on recursive functions of symbolic expressions; here the management of program state variables is fully automatic. See also Orchard-Hays (1960).
35 See chaps. 12 and 13 of Miller, Galanter, and Pribram (1960).
36 Limitations of space preclude detailed discussion here of theories of self-organizing neural nets, and other models based on brain analogies. [Several of these are described or cited in Proceedings of a Symposium on Mechanisation of Thought Processes, London: H. M. Stationery Office, 1959, and Self-Organizing Systems, M. T. Yovitts and S. Cameron (eds.), New York: Pergamon Press, 1960.] This omission is not too serious, I feel, in connection with the subject of heuristic programming, because the motivation and methods of the two areas seem so different. Up to the present time, at least, research on neural-net models has been concerned mainly with the attempt to show that certain rather simple heuristic processes, e.g., reinforcement learning, or property-list pattern recognition, can be realized or evolved by collections of simple elements without very highly organized interconnections. Work on heuristic programming is characterized quite differently by the search for new, more powerful heuristics for solving very complex problems, and by very little concern for what hardware (neuronal or otherwise) would minimally suffice for its realization. In short, the work on “nets” is concerned with how far one can get with a small initial endowment; the work on “artificial intelligence” is concerned in using all we know to build the most powerful systems that we can. It is my expectation that, in problem-solving power, the (allegedly brainlike) minimal-structure systems will never threaten to compete with their more deliberately designed contemporaries; nevertheless, their study should prove profitable in the development of component elements and subsystems to be used in the construction of the more systematically conceived machines.
seat of intelligence. Should we ask what intelligence “really is”? My own view is that this is more of an aesthetic question, or one of sense of dignity, than a technical matter! To me “intelligence” seems to denote little more than the complex of performances which we happen to respect, but do not understand. So it is, usually, with the question of “depth” in mathematics. Once the proof of a theorem is really understood its content seems to become trivial. (Still, there may remain a sense of wonder about how the proof was discovered.)

Programmers, too, know that there is never any “heart” in a program. There are high-level routines in each program, but all they do is dictate that “if such and such, then transfer to such and such a subroutine.” And when we look at the low-level subroutines, which “actually do the work,” we find senseless loops and sequences of trivial operations, merely carrying out the dictates of their superiors. The intelligence in such a system seems to be as intangible as becomes the meaning of a single common word when it is thoughtfully pronounced over and over again.

But we should not let our inability to discern a locus of intelligence lead us to conclude that programmed computers therefore cannot think. For it may be so with man, as with machine, that, when we understand finally the structure and program, the feeling of mystery (and self-approbation) will weaken.37 We find similar views concerning “creativity” in Newell, Shaw, and Simon (1958c). The view expressed by Rosenbloom (1951) that minds (or brains) can transcend machines is based, apparently, on an erroneous interpretation of the meaning of the “unsolvability theorems” of Gödel.38
37 See Minsky (1956, 1959).
38 On problems of volition we are in general agreement with McCulloch (1954) that our freedom of will “presumably means no more than that we can distinguish between what we intend (i.e., our plan), and some intervention in our action.” See also MacKay (1959) and the references; we are, however, unconvinced by his eulogization of “analog” devices. Concerning the “mind-brain” problem, one should consider the arguments of Craik (1952), Hayek (1952), and Pask (1959). Among the active leaders in modern heuristic programming, perhaps only Samuel (1960b) has taken a strong position against the idea of machines thinking. His argument, based on the fact that reliable computers do only that which they are instructed to do, has a basic flaw; it does not follow that the programmer therefore has full knowledge of (and therefore full responsibility and credit for) what will ensue. For certainly the programmer may set up an evolutionary system whose limitations are for him unclear and possibly incomprehensible. No better does the mathematician know all the consequences of a proposed set of axioms. Surely a machine has to be programmed in order to perform. But we cannot assign all the credit to its programmer if the operation of a system comes to reveal structures not recognizable or anticipated by the programmer. While we have not yet seen much in the way of intelligent activity in machines, Samuel's arguments (circular in that they are based on the presumption that machines do not have minds) do not assure us against this. Turing (1956) gives a very knowledgeable discussion of such matters.
B. Inductive Inference
Let us pose now, for our machines, a variety of problems more challenging than any ordinary game or mathematical puzzle. Suppose that we want a machine which, when embedded for a time in a complex environment or “universe,” will essay to produce a description of that world—to discover its regularities or laws of nature. We might ask it to predict what will happen next. We might ask it to predict what would be the likely consequences of a certain action or experiment. Or we might ask it to formulate the laws governing some class of events. In any case, our task is to equip our machine with inductive ability—with methods which it can use to construct general statements about events beyond its recorded experience. Now, there can be no system for inductive inference that will work well in all possible universes. But given a universe, or an ensemble of universes, and a criterion of success, this (epistemological) problem for machines becomes technical rather than philosophical. There is quite a literature concerning this subject, but we shall discuss only one approach which currently seems to us the most promising; this is what we might call the “grammatical induction” schemes of Solomonoff (1957, 1958, 1959a), based partly on work of Chomsky and Miller (1957b, 1958).
We will take language to mean the set of expressions formed from some given set of primitive symbols or expressions, by the repeated application of some given set of rules; the primitive expressions plus the rules is the grammar of the language. Most induction problems can be framed as problems in the discovery of grammars. Suppose, for instance, that a machine's prior experience is summarized by a large collection of statements, some labelled “good” and some “bad” by some critical device. How could we generate selectively more good statements? The trick is to find some relatively simple (formal) language in which the good statements are grammatical, and in which the bad ones are not. Given such a language, we can use it to generate more statements, and presumably these will tend to be more like the good ones. The heuristic argument is that if we can find a relatively simple way to separate the two sets, the discovered rule is likely to be useful beyond the immediate experience. If the extension fails to be consistent with new data, one might be able to make small changes in the rules and, generally, one may be able to use many ordinary problem-solving methods for this task.
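The separation heuristic above can be illustrated with a small sketch, which is not from the original text: given statements labelled good and bad, search candidate descriptions in order of increasing length for the simplest one that admits every good statement and rejects every bad one. Purely for illustration, the "grammars" here are restricted to simple regular expressions over a two-letter alphabet; the function name and data are hypothetical.

```python
import itertools
import re

def induce_pattern(good, bad, alphabet="ab", max_len=6):
    """Search simple regular expressions, shortest first, for one that
    matches every 'good' statement and none of the 'bad' ones."""
    # Candidate pieces: each letter, and each letter repeated any number of times.
    pieces = list(alphabet) + [c + "*" for c in alphabet]
    for n in range(1, max_len + 1):
        for combo in itertools.product(pieces, repeat=n):
            pattern = "".join(combo)
            rx = re.compile(pattern + "$")  # '$' forces a full-string match
            if all(rx.match(s) for s in good) and not any(rx.match(s) for s in bad):
                return pattern  # the shortest separating "grammar" found
    return None

good = ["ab", "aab", "aaab"]   # statements labelled "good"
bad = ["ba", "bb", "abb"]      # statements labelled "bad"
print(induce_pattern(good, bad))  # prints "a*b"
```

Having found `a*b`, the machine could generate further "good" statements (`aaaab`, …) beyond its recorded experience, exactly in the spirit of the heuristic argument; if new data contradicted the rule, the search could resume with small changes.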
The problem of finding an efficient grammar is much the same as that of finding efficient encodings, or programs, for machines; in each case, one needs to discover the important regularities in the data, and exploit the regularities by making shrewd abbreviations. The possible importance of Solomonoff's work (1960) is that, despite some obvious defects, it may point the way toward systematic mathematical ways to explore this discovery problem. He considers the class of all programs (for a given general-purpose computer) which will produce a certain given output (the body of data in question). Most such programs, if allowed to continue, will add to that body of data. By properly weighting these programs, perhaps by length, we can obtain corresponding weights for the different possible continuations, and thus a basis for prediction. If this prediction is to be of any interest, it will be necessary to show some independence of the given computer; it is not yet clear precisely what form such a result will take.
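Solomonoff's construction ranges over all programs for a general-purpose computer, which is uncomputable in general. As a toy approximation only, one can restrict the "programs" to repeating patterns, weight each program by two to the minus its length, and let the programs consistent with the observed data vote, by weight, on the next symbol. Every name here is an illustrative assumption, not Solomonoff's actual scheme.

```python
from collections import defaultdict
from itertools import product

def predict_next(data, alphabet="01", max_prog_len=4):
    """Toy length-weighted prediction.  A 'program' is a pattern p of
    length L that outputs p repeated forever and carries weight 2**-L.
    Sum the weights of all programs whose output begins with `data`,
    grouped by the symbol each would emit next, then normalize."""
    votes = defaultdict(float)
    for L in range(1, max_prog_len + 1):
        for prog in product(alphabet, repeat=L):
            # Unroll the pattern far enough to cover the data plus one symbol.
            output = "".join(prog) * (len(data) // L + 2)
            if output.startswith(data):
                votes[output[len(data)]] += 2.0 ** -L
    total = sum(votes.values())
    return {sym: w / total for sym, w in votes.items()}

print(predict_next("01"))  # '0' gets probability 2/3, '1' gets 1/3
```

Note that shorter (simpler) consistent patterns dominate the prediction, which is the point of weighting "perhaps by length"; the result's dependence on the chosen program class mirrors the open question of independence from the given computer.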
C. Models of Oneself

If a creature can answer a question about a hypothetical experiment, without actually performing that experiment, then the answer must have been obtained from some submachine inside the creature. The output of that submachine (representing a correct answer) as well as the input (representing the question) must be coded descriptions of the corresponding external events or event classes. Seen through this pair of encoding and decoding channels, the internal submachine acts like the environment, and so it has the character of a “model.” The inductive inference problem may then be regarded as the problem of constructing such a model.
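The encode/simulate/decode arrangement just described can be sketched as follows; the world, the actions, and all names (`GridWorld`, `Creature`, `what_if`) are hypothetical illustrations, not anything from the original text.

```python
class GridWorld:
    """External environment: a position that an action moves left or right."""
    def __init__(self, pos=0):
        self.pos = pos
    def step(self, action):  # action is -1 (left) or +1 (right)
        self.pos += action
        return self.pos

class Creature:
    def __init__(self, world):
        self.world = world
        # The internal submachine: a model of the environment's dynamics,
        # distinct from the environment itself.
        self.model = GridWorld(world.pos)
    def what_if(self, actions):
        """Answer a hypothetical question without acting on the world."""
        self.model.pos = self.world.pos  # encode the current situation
        for a in actions:                # run the internal submachine
            self.model.step(a)
        return self.model.pos            # decode the predicted outcome

world = GridWorld()
creature = Creature(world)
print(creature.what_if([+1, +1, -1]))  # prints 1, while world.pos is still 0
```

The creature answers the question from its submachine alone: the real world's state is untouched, which is exactly what distinguishes a model-based answer from an actual experiment.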
To the extent that the creature's actions affect the environment, this internal model of the world will need to include some representation of the creature itself. If one asks the creature “why did you decide to do such and such” (or if it asks this of itself), any answer must come from the internal model. Thus the evidence of introspection itself is liable to be based ultimately on the processes used in constructing one's image of one's self. Speculation on the form of such a model leads to the amusing prediction that intelligent machines may be reluctant to believe that they are just machines. The argument is this: our own self-models have a substantially “dual” character; there is a part concerned with the physical or mechanical environment—with the behavior of inanimate objects—and there is a part concerned with social and psychological matters. It is precisely because we have not yet developed a satisfactory mechanical theory of mental activity that we have to keep these areas apart. We could not give up this division even if we wished to—until we find a unified model to replace it. Now, when we ask such a creature what sort of being it is, it cannot simply answer “directly”; it must inspect its model(s). And it must answer by saying that it seems to be a dual thing—which appears to have two parts—a “mind” and a “body.” Thus, even the robot, unless equipped with a satisfactory theory of artificial intelligence, would have to maintain a dualistic opinion on this matter.39
39 There is a certain problem of infinite regression in the notion of a machine having a good model of itself: of course, the nested models must lose detail and finally vanish. But the argument, e.g., of Hayek (see 8.69 and 8.79, 1952) that we cannot “fully comprehend the unitary order” (of our own minds) ignores the power of recursive description as well as Turing's demonstration that (with sufficient external writing space) a “general-purpose” machine can answer any question about a description of itself that any larger machine could answer.

In attempting to combine a survey of work on “artificial intelligence” with a summary of our own views, we could not mention every relevant project and publication. Some important omissions are in the area of “brain models”: the early work of Farley and Clark (1954) [also Farley's paper in Yovitts and Cameron (1960)], often unknowingly duplicated, and the work of Rochester (1956) and Milner (1960). The work of Lettvin et al. (1959) is related to the theories in Selfridge (1959). We did not touch at all on the problems of logic and language, and of information retrieval, which must be faced when action is to be based on the contents of large memories; see, e.g., McCarthy (1959). We have not discussed the basic results in mathematical logic which bear on the question of what can be done by machines. There are entire literatures we have hardly even sampled—the bold pioneering work of Rashevsky (c. 1929) and his later co-workers (Rashevsky, 1960); Theories of Learning, e.g., Gorn (1959); Theory of Games, e.g., Shubik (1960); and Psychology, e.g., Bruner et al. (1956). And everyone should know the work of Polya (1945, 1954) on how to solve problems. We can hope only to have transmitted the flavor of some of the more ambitious projects directly concerned with getting machines to take over a larger portion of problem-solving tasks.

One last remark: we have discussed here only work concerned with more or less self-contained problem-solving programs. But as this is written, we are at last beginning to see vigorous activity in the direction of constructing usable time-sharing or multiprogramming computing systems. With these systems, it will at last become economical to match human beings in real time with really large machines. This means that we can work toward programming what will be, in effect, “thinking aids.” In the years to come, we expect that these man-machine systems will share, and perhaps for a time be dominant, in our advance toward the development of “artificial intelligence.”