Handbook of Learning and Approximate Dynamic Programming
Jennie Si
From inside the book
Page 155
... error induced by an approximate dynamic programming algorithm is usually characterized in relative terms, comparing it against the best error that can be achieved given the selection of basis functions. Section 6.5 presents an error ...
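The snippet above describes bounding an ADP algorithm's error relative to the best achievable within the span of the chosen basis functions. A minimal synthetic sketch of that comparison (all names and data here are illustrative assumptions, not from the book):

```python
import numpy as np

# Illustration (synthetic data): the "best achievable" error for a basis Phi
# is the least-squares projection residual of the true cost-to-go J*; an ADP
# method's error is then reported relative to that benchmark.
rng = np.random.default_rng(0)

n_states, n_basis = 50, 5
Phi = rng.normal(size=(n_states, n_basis))   # columns are basis functions
J_star = rng.normal(size=n_states)           # synthetic stand-in for the optimal cost-to-go

# Best approximation within span(Phi): least-squares projection of J*.
r_best, *_ = np.linalg.lstsq(Phi, J_star, rcond=None)
best_err = np.linalg.norm(J_star - Phi @ r_best)

# Suppose some ADP algorithm returned weights r_adp (here: a perturbation of r_best).
r_adp = r_best + 0.1 * rng.normal(size=n_basis)
adp_err = np.linalg.norm(J_star - Phi @ r_adp)

relative = adp_err / best_err   # >= 1 by optimality of the projection
print(relative)
```

An error bound of the kind the chapter describes would cap this ratio (or an analogous weighted-norm version of it) independently of the particular problem instance.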
Page 173
... error in the approximation of the optimal cost-to-go function: approximate LP produces approximations that are comparable to the best that could have been achieved with the given selection of basis functions. The performance and ...
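The approximate LP referred to above replaces the cost-to-go J with a linear combination of basis functions, Φr, in the exact LP formulation of dynamic programming: maximize cᵀΦr subject to Φr ≤ T(Φr), where T is the Bellman operator. A minimal sketch on a synthetic three-state MDP (all numbers are made-up assumptions, not from the book):

```python
import numpy as np
from scipy.optimize import linprog

alpha = 0.9                       # discount factor
n_states, n_actions = 3, 2
g = np.array([[1.0, 2.0],         # stage costs g(x, a)
              [0.5, 1.5],
              [2.0, 0.5]])
# P[a] is the transition matrix under action a (rows sum to 1).
P = np.array([[[0.8, 0.2, 0.0],
               [0.1, 0.8, 0.1],
               [0.0, 0.2, 0.8]],
              [[0.5, 0.5, 0.0],
               [0.0, 0.5, 0.5],
               [0.5, 0.0, 0.5]]])
Phi = np.array([[1.0, 0.0],       # two basis functions: constant and "state index"
                [1.0, 1.0],
                [1.0, 2.0]])
c = np.ones(n_states) / n_states  # state-relevance weights

# Bellman inequality (Phi r)(x) <= g(x,a) + alpha * E[(Phi r)(x')] becomes,
# for each action a, the linear constraint (Phi - alpha * P[a] @ Phi) r <= g[:, a].
A_ub = np.vstack([Phi - alpha * P[a] @ Phi for a in range(n_actions)])
b_ub = np.concatenate([g[:, a] for a in range(n_actions)])

# linprog minimizes, so negate the objective to maximize c^T Phi r.
res = linprog(-(c @ Phi), A_ub=A_ub, b_ub=b_ub,
              bounds=[(None, None)] * Phi.shape[1])
J_approx = Phi @ res.x            # the ALP approximation of the cost-to-go
print(res.success, J_approx)
```

The number of variables here is the number of basis functions, not the number of states, which is what lets this formulation sidestep the curse of dimensionality (constraint sampling handles the remaining per-state constraints in large problems).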
Page 524
... error is 0.93. The bottom graph shows the PI output control signal. ... how performance depends on these parameter values. Performance was measured by the average RMS error over the final 30 trials and also by the average RMS error over ...
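The performance measure quoted above (average RMS error over the final 30 trials) can be sketched as follows; the trial layout and data here are assumptions for illustration only:

```python
import numpy as np

# Synthetic per-step tracking errors: one row per trial, one column per time step.
rng = np.random.default_rng(1)
n_trials, steps = 100, 200
errors = rng.normal(scale=1.0, size=(n_trials, steps))

# RMS error of each trial, then the mean of the last 30 trials of the run.
rms_per_trial = np.sqrt(np.mean(errors**2, axis=1))
avg_rms_final_30 = rms_per_trial[-30:].mean()
print(avg_rms_final_30)
```

Averaging over only the final trials, as described, measures converged performance rather than the transient of the learning process.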
Contents
Foreword | 1 |
Reinforcement Learning and Its Relationship to Supervised Learning | 47 |
Model-Based Adaptive Critic Designs | 65 |
Copyright | |
20 other sections not shown
Common terms and phrases
action network actor adaptive critic designs agent algorithm analysis angle applications approach approximate dynamic programming approximate LP backpropagation behavior Bellman equation BPTT chapter computational constraints control law control problems convergence cost critic network curse of dimensionality defined derivatives DHP neurocontroller direct NDP equation error estimate example Figure formulation function approximation fuzzy goal gradient helicopter Heuristic hierarchical IEEE Trans implemented improve initial input iteration learning algorithms learning rate linear programming load Lyapunov function Machine Learning Markov decision processes methods micro-alternator minimize module neural network node nonlinear operating optimal control optimal policy optimization problem output parameters Pareto optimal performance PI controller power system Proc Q-learning reinforcement learning reward robot Section simulation solve space stability stochastic structure supervised learning task techniques Theorem trajectory transition update Utility function value function variables vector voltage weights Werbos