Machine Intelligence and Robotics

portant methods are missing; for example, an almost universally used paradigm for pattern recognition.

Two characteristics of the set of methods stand out. First, explicit references to processes occur in the problem statement, whereas this is not true of mathematical methods, such as the simplex method. Thus generate-and-test specifies that you must have a generator and you must have a test; then the procedure tells how to organize these. This feature seems to be related to the strength of the method. Methods with stronger assumptions make use of known processes whose existence is implied by the assumptions. In the simplex method generation on a set of variables is done over the index, and the tests used are equality and inequality on real numbers. Hence there is no need to posit directly, say, the generator of the set.

The second characteristic is the strong similarity of the methods to each other. They give the impression of ringing the various changes on a small set of structural features. Thus there appear to be only two differences between heuristic search and hill climbing. First, it is necessary to compare for the greater element in hill climbing; heuristic search needs only a test for solution (although it can use the stronger comparison, as in means-ends analysis). Second, hill climbing keeps only the best element found so far, that is, it searches the problem space from where it is. Heuristic search, on the other hand, keeps around a set of obtained elements and selects from it where next to continue the search. In consequence, it permits a more global view of the space than hill climbing-and pays for it, not only by extra memory and processing, but also by the threat of exponential expansion.

Similarly, the difference between match and heuristic search is primarily one of memory for past actions and positions. Our diagram for match does not reveal this clearly, since it shows only the case of a single modification operation, substitution; but with a set of modification operations (corresponding to the set of operators in heuristic search) the match looks very much like a means-ends analysis that never has to back up.

Finally, the more complex processes, such as LT and the warehouse program, seem to make use of the more elementary ones in recognizable combinations. Such combination does not always take the form of distinct units tied output to input (i.e., of closed subroutines), but a flavor still exists of structures composed of building blocks.

In reviewing these methods instruction in the details of artificial intelligence has not been intended. Hopefully, however, enough information has been given to convince the reader of two main points: (1) there is in

heuristic programming a set of methods, as this term was used in the beginning of the paper; and (2) these methods make much weaker demands for information on the task environment than do methods such as the simplex, and hence they provide entries toward the lower, more general end of the graph in Fig. 10.3.

4. THE CONTINUNITY OF METHODS

If the two hypotheses that we have stated are correct, we should certainly expect there to be methods all along the range exhibited in Fig. 10.3. In particular, the mathematical methods of management science should not be a species apart from the methods of artificial intelligence, but should be distinguished more by having additional constraints in the problem statement. Specific mathematical content should arise as statements strong enough to permit reasonable mathematical analysis are introduced.

Evidence for this continuity comes from several sources. One is the variety of optimization techniques, ranging from hill climbing to the calculation methods of the differential calculus, each with increasing amounts of specification. Another is the existence of several methods, such as so-called branch and bound techniques, that seem equally akin to mathematical and heuristic programming. Again, dynamic programming, when applied to tasks with little mathematical structure, leads to procedures which seem not to differ from some of the methods in heuristic programming, for example, the minimax search techniques for playing games such as chess and checkers.

What we should like most of all is that each (or at least a satisfactory number) of the mathematical methods of management science would lie along a line of methods that extends back to some very weak but general ancestors. Then, hopefully, the effect on the procedure of the increasing information in the problem statement would be visible and we could see the continuity directly.

As a simple example, consider inverting a matrix. The normal algorithms for this are highly polished procedures. Yet one can look at inversion as a problem-as it is to anyone who does not know the available theory and the algorithms based on it. Parts of the problem space are clear: the elements are matrices (say of order n), hence include both the given matrix, A, the identity matrix, I, and the desired inverse, X. The problem statement is to find X such that AX = I. Simply generating and testing is not a promising way to proceed, nor is expressing X as a form, multiplying out, and getting n2 equations to solve. Not only are these

poor approaches, but also they clearly are not the remote ancestors of the existing algorithms.

If the inverse is seen as a transformation on A, carrying it into I, a more interesting specification develops. The initial object is A, the desired object is I, and the operators consist of premultiplication (say) by some basic set of matrices. Then, if operators E1, E2,..., E, transform A into I, we have Ex E2E1A = I, hence Ex··· E2E1I = Ä-1. If the basic operators are the elementary row operations (permute two rows, add one row to another, multiply a row by a constant), we have the basic ingredients of several of the existing direct algorithms (those that use elimination rather than successive approximation). These algorithms prescribe the exact transformations to be applied at each stage, but if we view this knowledge as being degraded we can envision a problem solver doing a heuristic search (or perhaps hill climbing if the approach were monotone). Better information about the nature of the space should lead to better selection of the transformation until existing algorithms are approached.

4.1. An Example: the Simplex Method

The simplex method clearly involves both optimization and search, hence should eventually show kinship with the methods that we have been describing. We should be able to construct a sequence of methods, each with somewhat less stringent conditions and therefore with more general applicablilty but less power. Power can be measured here by applying the method to the original linear programming (LP) problem, where the true nature of the problem is known.

Figure 10.12 reformulates the LP problem and the simplex algorithm in the present notation. Indices have been suppressed as much as possible, partly by using the scalar product, in the interests of keeping the figures uncluttered. The problem statement for the simplex method, which we call SM from now on, is a special case of the LP problem, having equalities for constraints rather than inequalitities, but involving n + m variables rather than just n. The transformation between problems is straightforward (although the original selection of this specialization is not necessarily so).

The elements of the problem space (called bases) are all the subsets of m out of the n+m variables. An element consists of much more information than just the subset of variables, of course, namely, of the entire tableau shown in Fig. 10.2. The operators theoretically could be any rules that replace one subset of m variables with another. In fact, they involve adding just a single variable, hence removing one. This

Problem statement for LP problem

Given: a set of n variables, {x}, where each x > 0: let ≈ be the n-tuple (x1, x2, . . ., Xn);

Find:

a set of m constraints, {g = b — ax}:
let the feasible region be {lg> 0};
an objective function, z = CI.

in the feasible region such that z is maximum. Problem statement for SM, the simplex method

Given: a set of n + m variables, {x}, where each x > 0: let be the (n + m)-tuple (x,, X2, . . ., Xn+m);

Find:

a set of m constraints, {g = b — ax}:

let the feasible region be {g = 0};

an objective function, z = Cx.

in the feasible region such that z is maximum.

Note: any LP problem can be solved if this one can. It is a separate problem to provide the translation between them (define xn+i = gi and determine c and the & accordingly).

Problem space for SM

Elements: {B (bases), the ("+m)

subsets of m variables from {x}};

with B is associated T(B) (the tableau) containing:

a feasible x such that x E B implies x > 0; otherwise x
the current value of z for ;

the exchange rate (e) for each x [-(z - c) in tableau];
auxiliary data to permit application of any operator.

Initial element: Bo, a feasible basis (not obtained by SM).
Operators: {x not in B}.

[merged small][merged small][merged small][merged small][merged small][merged small][merged small][merged small][ocr errors][merged small][merged small][merged small][merged small]

would still leave (n − m)m operators (any of n-m in, any of m out), except that no choice exists on the one to be removed. Hence, there are just n―m operators, specified by the nm variables not in the current basis. Applying these operators to the current element consists of almost the entire calculation of the simplex procedure specified in Fig. 10.2 (actually steps 2, 3, and 4), which amounts to roughly m (n + m) multiplications (to use a time-honored unit of computational effort).

The procedure in Fig. 10.12 looks like a hill climber with the comparison missing: as each operator is selected, the new (B,T) is immediately calculated (i.e., the tableau updated) and a new operator selected. No comparison is needed because the selection produces only a single operator, and this is known to advance the current position. The procedure for selecting the operator reveals that the process generates over all potential operators-over all x not in the current basis-and selects the best one with respect to a quantity called the exchange rate (e). Thus the select is a simple optimizer, with one exception (not indicated in the figure): it is given an initial bias of zero, so that only operators with e> 0 have a chance.

The exchange rate is the rate of change of the objective function (z) with a change in x, given movement along a boundary of the constraints, where the only variables changing are those already in the basis. Given this kind of exploration in the larger space of the n+m variables, e measures how fast z will increase or decrease. Thus, e> 0 guarantees that the compare routine is not needed in the main procedure.

The selection of an x with the maximum exchange rate does not guarantee either the maximum increase from the current position or the minimum number of steps to the optimum. The former could be achieved by inserting a compare routine and trying all operators from the current position; but this would require many times as much effort for (probably) small additional gain. However, since the space is unimodal in the feasible region, the procedure does guarantee that eventually an optimum will be reached.

We need a method at the general end of the scale against which the SM can be compared. Figure 10.13 gives the obvious one, which we call M1. It retains the shape of the problem but without any specific content. The problem is still to optimize a function, f, of n positive variables, subject to m inequality constraints, {g}. But the only thing known about f and the g is that f is unimodal in the feasible set (which accounts for its descriptive title). This justifies using hill climbing as the procedure. The operators must be any increments to the current position, either positive or negative. Many will produce a location outside the feasible

« Previous Continue »

Books

Machine Intelligence and Robotics: Report of the NASA Study Group : Final Report