
heuristic programming a set of methods, as this term was used in the beginning of the paper; and (2) these methods make much weaker demands for information on the task environment than do methods such as the simplex, and hence they provide entries toward the lower, more general end of the graph in Fig. 10.3.

4. THE CONTINUITY OF METHODS

If the two hypotheses that we have stated are correct, we should certainly expect there to be methods all along the range exhibited in Fig. 10.3. In particular, the mathematical methods of management science should not be a species apart from the methods of artificial intelligence, but should be distinguished more by having additional constraints in the problem statement. Specific mathematical content should arise as statements strong enough to permit reasonable mathematical analysis are introduced.

Evidence for this continuity comes from several sources. One is the variety of optimization techniques, ranging from hill climbing to the calculation methods of the differential calculus, each with increasing amounts of specification. Another is the existence of several methods, such as so-called branch and bound techniques, that seem equally akin to mathematical and heuristic programming. Again, dynamic programming, when applied to tasks with little mathematical structure, leads to procedures which seem not to differ from some of the methods in heuristic programming, for example, the minimax search techniques for playing games such as chess and checkers.

What we should like most of all is that each (or at least a satisfactory number) of the mathematical methods of management science would lie along a line of methods that extends back to some very weak but general ancestors. Then, hopefully, the effect on the procedure of the increasing information in the problem statement would be visible and we could see the continuity directly.

As a simple example, consider inverting a matrix. The normal algorithms for this are highly polished procedures. Yet one can look at inversion as a problem, as it is to anyone who does not know the available theory and the algorithms based on it. Parts of the problem space are clear: the elements are matrices (say of order n), hence include the given matrix, A, the identity matrix, I, and the desired inverse, X. The problem statement is to find X such that AX = I. Simply generating and testing is not a promising way to proceed, nor is expressing X as a form, multiplying out, and getting n² equations to solve. Not only are these poor approaches, but also they clearly are not the remote ancestors of the existing algorithms.

If the inverse is seen as a transformation on A, carrying it into I, a more interesting specification develops. The initial object is A, the desired object is I, and the operators consist of premultiplication (say) by some basic set of matrices. Then, if operators E1, E2, ..., Ek transform A into I, we have Ek ⋯ E2E1A = I, hence Ek ⋯ E2E1I = A⁻¹. If the basic operators are the elementary row operations (permute two rows, add one row to another, multiply a row by a constant), we have the basic ingredients of several of the existing direct algorithms (those that use elimination rather than successive approximation). These algorithms prescribe the exact transformations to be applied at each stage, but if we view this knowledge as being degraded we can envision a problem solver doing a heuristic search (or perhaps hill climbing if the approach were monotone). Better information about the nature of the space should lead to better selection of the transformations until the existing algorithms are approached.
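
As an illustration of this view, the sketch below (in Python; the function name, pivoting rule, and tolerance are our own choices and are not part of the original discussion) inverts a matrix by applying elementary row operations to A while accumulating the same operations on I, in the spirit of Ek ⋯ E2E1I = A⁻¹. The fixed pivoting rule stands in for a fully specified selection of operators; a weaker problem solver could replace it with heuristic search over the same operators.

import numpy as np

def invert_by_row_operations(a, tol=1e-12):
    """Invert a square matrix by premultiplying with elementary row
    operations E1, E2, ..., Ek until A becomes I; the same operations
    applied to I then yield the inverse (Gauss-Jordan elimination)."""
    a = np.array(a, dtype=float)
    n = a.shape[0]
    inv = np.eye(n)                      # accumulates Ek ... E2 E1 I
    for col in range(n):
        # Operator selection: a fixed rule (partial pivoting) rather
        # than a heuristic search over all row operations.
        pivot = col + int(np.argmax(np.abs(a[col:, col])))
        if abs(a[pivot, col]) < tol:
            raise ValueError("matrix is singular")
        # E: permute two rows.
        a[[col, pivot]] = a[[pivot, col]]
        inv[[col, pivot]] = inv[[pivot, col]]
        # E: multiply a row by a constant.
        scale = a[col, col]
        a[col] /= scale
        inv[col] /= scale
        # E: add a multiple of one row to another.
        for r in range(n):
            if r != col:
                factor = a[r, col]
                a[r] -= factor * a[col]
                inv[r] -= factor * inv[col]
    return inv

Degrading the selection rule (for example, choosing row operations by hill climbing on a measure of how far the working matrix is from I) recovers the heuristic-search picture described above.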

4.1. An Example: the Simplex Method

The simplex method clearly involves both optimization and search, hence should eventually show kinship with the methods that we have been describing. We should be able to construct a sequence of methods, each with somewhat less stringent conditions and therefore with more general applicability but less power. Power can be measured here by applying the method to the original linear programming (LP) problem, where the true nature of the problem is known.

Figure 10.12 reformulates the LP problem and the simplex algorithm in the present notation. Indices have been suppressed as much as possible, partly by using the scalar product, in the interests of keeping the figures uncluttered. The problem statement for the simplex method, which we call SM from now on, is a special case of the LP problem, having equalities for constraints rather than inequalities, but involving n + m variables rather than just n. The transformation between problems is straightforward (although the original selection of this specialization is not necessarily so).
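
To make the translation concrete (the numbers here are our own illustration, not taken from the figure), consider a problem with n = 2 and m = 1: maximize z = 3x1 + 2x2 subject to x1 + x2 ≤ 4, with x1, x2 ≥ 0. Writing the constraint as g1 = 4 − x1 − x2 ≥ 0 and introducing x3 = g1 as a third variable gives the SM form: maximize z = 3x1 + 2x2 + 0·x3 subject to x1 + x2 + x3 = 4, with x1, x2, x3 ≥ 0. The feasible region is unchanged; the inequality has simply become an equality in one more variable.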

The elements of the problem space (called bases) are all the subsets of m out of the n + m variables. An element consists of much more information than just the subset of variables, of course, namely, the entire tableau shown in Fig. 10.2. The operators theoretically could be any rules that replace one subset of m variables with another. In fact, they involve adding just a single variable, hence removing one.

Problem statement for LP problem

Given:   a set of n variables, {xj}, where each xj ≥ 0;
             let x be the n-tuple (x1, x2, . . ., xn);
         a set of m constraints, {gi = bi − ai·x};
             let the feasible region be {x | all gi ≥ 0};
         an objective function, z = c·x.
Find:    x in the feasible region such that z is maximum.

Problem statement for SM, the simplex method

Given:   a set of n + m variables, {xj}, where each xj ≥ 0;
             let x be the (n + m)-tuple (x1, x2, . . ., xn+m);
         a set of m constraints, {gi = bi − āi·x};
             let the feasible region be {x | all gi = 0};
         an objective function, z = c̄·x.
Find:    x in the feasible region such that z is maximum.
Note:    any LP problem can be solved if this one can. It is a separate problem to
         provide the translation between them (define xn+i = gi and determine the c̄
         and the āi accordingly).

Problem space for SM

Elements:        {B (bases), the (n + m choose m) subsets of m variables from {x}};
                 with B is associated T(B) (the tableau), containing:
                     a feasible x such that xj in B implies xj ≥ 0, otherwise xj = 0;
                     the current value of z for x;
                     the exchange rate ej for each xj [−(zj − cj) in the tableau];
                     auxiliary data to permit application of any operator.
Initial element: B0, a feasible basis (not obtained by SM).
Operators:       {xj not in B}.
Procedure:       generate the operators (the xj not in B); select the one with the
                 largest exchange rate ej, provided ej > 0; apply it, producing a new
                 B and T(B); repeat until no ej > 0.

This would still leave n·m operators (any of n in, any of m out), except that no choice exists on the one to be removed. Hence, there are just n operators, specified by the n variables not in the current basis. Applying these operators to the current element consists of almost the entire calculation of the simplex procedure specified in Fig. 10.2 (actually steps 2, 3, and 4), which amounts to roughly m(n + m) multiplications (to use a time-honored unit of computational effort).
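
For a feel of the magnitudes (the numbers are our own illustration): with n = 100 original variables and m = 50 constraints, each element of the space is one of the (150 choose 50), or on the order of 10^40, possible bases; there are just 100 applicable operators at any element; and applying one of them costs roughly 50 × 150 = 7,500 multiplications.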

The procedure in Fig. 10.12 looks like a hill climber with the comparison missing: as each operator is selected, the new (B, T) is immediately calculated (i.e., the tableau updated) and a new operator selected. No comparison is needed because the selection produces only a single operator, and this is known to advance the current position. The procedure for selecting the operator reveals that the process generates all the potential operators (all the xj not in the current basis) and selects the best one with respect to a quantity called the exchange rate (e). Thus the select is a simple optimizer, with one exception (not indicated in the figure): it is given an initial bias of zero, so that only operators with e > 0 have a chance.
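
A minimal sketch of this loop is given below (Python with numpy; the function name, argument conventions, and tolerance are our own, and a feasible starting basis is assumed to be supplied, as in Fig. 10.12). It follows the structure just described: compute the exchange rates, select the single operator with the largest e > 0, apply it by updating the whole tableau, and stop when no such operator remains.

import numpy as np

def simplex_sm(A, b, c, basis, tol=1e-9):
    """Maximize z = c.x subject to A x = b and x >= 0 (the SM form),
    starting from a supplied feasible basis of m column indices."""
    A = np.array(A, dtype=float)
    b = np.array(b, dtype=float)
    c = np.array(c, dtype=float)
    m, N = A.shape
    basis = list(basis)

    # Put the tableau T(B0) into canonical form for the starting basis.
    B = A[:, basis]
    A = np.linalg.solve(B, A)
    b = np.linalg.solve(B, b)

    while True:
        # Exchange rates e_j = c_j - z_j for every variable.
        z_row = c[basis] @ A
        e = c - z_row
        e[basis] = 0.0
        j = int(np.argmax(e))            # select: the single best operator
        if e[j] <= tol:                  # no operator with e > 0: optimum
            x = np.zeros(N)
            x[basis] = b
            return x, float(c @ x)
        # The variable to leave the basis is forced by the ratio test.
        col = A[:, j].copy()
        ratios = np.full(m, np.inf)
        ratios[col > tol] = b[col > tol] / col[col > tol]
        i = int(np.argmin(ratios))
        if not np.isfinite(ratios[i]):
            raise ValueError("objective is unbounded")
        # Apply the operator: pivot, i.e. update the entire tableau.
        piv = A[i, j]
        A[i] /= piv
        b[i] /= piv
        for r in range(m):
            if r != i:
                f = A[r, j]
                A[r] -= f * A[i]
                b[r] -= f * b[i]
        basis[i] = j

Run on the small example above (A = [[1, 1, 1]], b = [4], c = [3, 2, 0], starting basis [2], the zero-based index of the slack variable x3), it returns x = (4, 0, 0) with z = 12, as one can check by hand.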

The exchange rate is the rate of change of the objective function (z) with a change in x, given movement along a boundary of the constraints, where the only variables changing are those already in the basis. Given this kind of exploration in the larger space of the n+m variables, e measures how fast z will increase or decrease. Thus, e>0 guarantees that the compare routine is not needed in the main procedure.

The selection of an x with the maximum exchange rate does not guarantee either the maximum increase from the current position or the minimum number of steps to the optimum. The former could be achieved by inserting a compare routine and trying all operators from the current position; but this would require many times as much effort for (probably) small additional gain. However, since the space is unimodal in the feasible region, the procedure does guarantee that eventually an optimum will be reached.
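
To see what inserting a compare routine would involve, the fragment below (again our own sketch, reusing the array and basis conventions of simplex_sm above) replaces the max-e selection with a rule that tries every candidate operator, computes the actual increase it would yield (its exchange rate times the step length from the ratio test), and keeps the best. Each selection now costs a ratio test per nonbasic variable instead of a single pass over the exchange rates.

def select_by_actual_increase(A, b, basis, e, tol=1e-9):
    """Compare-based selection: return the entering column whose pivot
    would raise z the most, or None when no e_j > 0 remains."""
    best_j, best_gain = None, 0.0
    for j in range(A.shape[1]):
        if j in basis or e[j] <= tol:
            continue
        col = A[:, j]
        ratios = b[col > tol] / col[col > tol]
        if ratios.size == 0:
            continue                      # unbounded in this direction; skip
        gain = e[j] * ratios.min()        # the actual increase in z
        if gain > best_gain:
            best_j, best_gain = j, gain
    return best_j

Whether the greater effort per step pays off depends on the problem; as noted above, the additional gain is probably small.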

We need a method at the general end of the scale against which the SM can be compared. Figure 10.13 gives the obvious one, which we call M1. It retains the shape of the problem but without any specific content. The problem is still to optimize a function, f, of n positive variables, subject to m inequality constraints, {g}. But the only thing known about f and the g is that f is unimodal in the feasible set (which accounts for the method's descriptive title). This justifies using hill climbing as the procedure. The operators must be any increments to the current position, either positive or negative.

Problem statement for M1, the unimodal objective method

Given:   a set of n variables, {xj}, where each xj ≥ 0;
             let x be the n-tuple (x1, x2, . . ., xn);
         a set of m constraints, {gi(x) ≥ 0};
             let the feasible region be {x | all gi ≥ 0};
         an objective function, z = f(x);
             f is unimodal in the feasible region.
Find:    x in the feasible region such that z is maximum.

Problem space PS1, for hill climbing

Elements:        {x}.
Initial element: x0, a feasible solution (not obtained by M1).
Operators:       {Δx, where each Δxj is any real number}.
Procedure:       generate an increment Δx; test that x + Δx is feasible; compare z at
                 the new and current positions, and move if it has improved; repeat.

Many operators will produce a location outside the feasible region, but these can be rejected by testing against the g. The procedure in the figure does not provide any information on how to generate the operators.
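
The sketch below shows what such a procedure might look like when M1 is applied blindly to an LP-shaped problem (Python; the random generator of increments, the step size, and the stopping rule are all our own choices, since, as just noted, the figure specifies none of them).

import random

def hill_climb_m1(f, constraints, x0, step=0.1, tries=200, iters=10000):
    """Maximize f(x) over {x : every g(x) >= 0 and every x_j >= 0} by
    generating random increments, rejecting infeasible points, and
    keeping any improvement (generate - test - compare)."""
    def feasible(x):
        return all(v >= 0 for v in x) and all(g(x) >= 0 for g in constraints)

    x = list(x0)
    best = f(x)
    for _ in range(iters):
        improved = False
        for _ in range(tries):
            # generate: an arbitrary increment to the current position
            cand = [v + random.uniform(-step, step) for v in x]
            # test: reject locations outside the feasible region
            if not feasible(cand):
                continue
            # compare: keep the new position only if it is better
            val = f(cand)
            if val > best:
                x, best, improved = cand, val, True
                break
        if not improved:
            break                      # no improving feasible increment found
    return x, best

On the two-variable example used earlier, with f = lambda x: 3*x[0] + 2*x[1] and constraints = [lambda x: 4 - x[0] - x[1]], starting from x0 = (0, 0), the climber creeps toward the corner (4, 0) that simplex_sm reaches in a single pivot.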

If M1 is compared to SM, several differences are apparent. First, and most striking, the problem space for M1 is n-dimensional Euclidean space, whereas for SM it is a finite set of points in (n + m)-dimensional space. Thus the search space has been drastically reduced, independently of what techniques are used to search it. Second, and almost as striking, M1 has all of {Δx} as operators (i.e., all of n-dimensional space again), whereas SM has only n operators. Third, in SM a unique operator is selected for application on the basis of partial information; it always both applies and improves the position. In M1 there is no reason to expect an operator either to produce a feasible solution or, if it does, to obtain an improvement; thus, extensive testing and comparing must be done. Finally, we observe that the cost per step in M1 (when applied to the same LP problem as SM) is mk, where k is the number of variables
