Handbook of Learning and Approximate Dynamic Programming, Jennie Si
From inside the book
Page 211
... limited use. Still, there are some situations in which simple behaviors might be wholly or partially specified, and algorithms have been designed to take advantage of this. Less drastically, the internal policy of a behavior could ...
Page 255
... limited to establish any kind of solid conclusion. However, in such cases, the optimality result for LSTD of Konda (see Section 9.1), and the comparability of the behavior of LSTD and λ-LSPE, suggest a substantial superiority ...
Page 633
... limited dynamics; Angle Stability: Transient model with limited dynamics; Probability-based Congestion: static congestion; Reliability constraints: Expected Unserved Energy (EUE) and Loss of Load Probability (LOLP); Model is either ...
Contents
Foreword | 1
Reinforcement Learning and Its Relationship to Supervised Learning | 47
Model-Based Adaptive Critic Designs | 65
Copyright
20 other sections not shown
Common terms and phrases
action network, actor, adaptive critic designs, agent, algorithm, analysis, angle, applications, approach, approximate dynamic programming, approximate LP, backpropagation, behavior, Bellman equation, BPTT, chapter, computational, constraints, control law, control problems, convergence, cost, critic network, curse of dimensionality, defined, derivatives, DHP neurocontroller, direct NDP, equation, error, estimate, example, Figure, formulation, function approximation, fuzzy, goal, gradient, helicopter, Heuristic, hierarchical, IEEE Trans, implemented, improve, initial, input, iteration, learning algorithms, learning rate, linear programming, load, Lyapunov function, Machine Learning, Markov decision processes, methods, micro-alternator, minimize, module, neural network, node, nonlinear, operating, optimal control, optimal policy, optimization problem, output, parameters, Pareto optimal, performance, PI controller, power system, Proc, Q-learning, reinforcement learning, reward, robot, Section, simulation, solve, space, stability, stochastic, structure, supervised learning, task, techniques, Theorem, trajectory, transition, update, Utility function, value function, variables, vector, voltage, weights, Werbos