Distributed Reinforcement Learning with ADMM-RL

Research output: Contribution to conference › Paper › peer-review


Abstract

This paper presents a new algorithm for distributed Reinforcement Learning (RL). RL is an artificial intelligence (AI) control strategy in which controllers for highly nonlinear systems over multi-step time horizons are learned from experience rather than computed on the fly by optimization. Here we introduce ADMM-RL, a combination of the Alternating Direction Method of Multipliers (ADMM) and reinforcement learning that allows learned controllers to be integrated as subsystems in generally convergent distributed control applications. ADMM has become the workhorse algorithm for distributed control, combining the advantages of dual decomposition (namely, enabling decoupled, parallel, distributed solution) with the advantages of the method of multipliers (namely, convexification/stability). Our ADMM-RL algorithm replaces one or more of the subproblems in ADMM with several steps of RL. When the nested iterations converge, we are left with a pretrained subsolver that can potentially increase the efficiency of the deployed distributed controller by orders of magnitude. We illustrate ADMM-RL in both distributed wind farm yaw control and distributed grid-aware demand aggregation for water heaters.
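To make the nested iteration concrete, the following is a minimal sketch, not the paper's implementation: a two-block consensus ADMM loop in Python in which the exact x-minimization is replaced by a few gradient steps standing in for an RL subsolver trained against the augmented-Lagrangian objective. The quadratic objectives, the penalty parameter rho, and the helper rl_style_x_update are illustrative assumptions, not names from the paper.

    import numpy as np

    # Consensus ADMM sketch for: minimize f(x) + g(z)  subject to  x = z.
    # f and g are simple quadratics so the example runs end to end; in
    # ADMM-RL the exact x-update would instead be several RL training
    # steps on the x-block of the augmented Lagrangian (assumption).
    rho = 1.0          # augmented-Lagrangian penalty parameter
    A, b = 2.0, 1.0    # f(x) = 0.5 * (A*x - b)^2
    C, d = 1.0, 3.0    # g(z) = 0.5 * (C*z - d)^2

    x, z, u = 0.0, 0.0, 0.0   # primal variables and scaled dual variable

    def rl_style_x_update(x, z, u, steps=5, lr=0.1):
        """Stand-in for the learned subsolver: a few gradient steps on
        the x-block of the augmented Lagrangian instead of an exact
        argmin (illustrative placeholder for the RL inner loop)."""
        for _ in range(steps):
            grad = A * (A * x - b) + rho * (x - z + u)
            x -= lr * grad
        return x

    for k in range(50):
        # x-update: approximate (learned) subsolver
        x = rl_style_x_update(x, z, u)
        # z-update: exact closed-form minimization of the z-block
        z = (C * d + rho * (x + u)) / (C**2 + rho)
        # scaled dual update enforces the consensus constraint x = z
        u += x - z

    print(f"x = {x:.4f}, z = {z:.4f}, residual = {abs(x - z):.2e}")

Provided the approximate x-update is accurate enough each outer iteration, the loop converges to the same consensus point as exact ADMM (x = z = 1 for these quadratics); the point of ADMM-RL is that the pretrained subsolver then replaces repeated online optimization at deployment.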

Original language: American English
Pages: 4159-4166
Number of pages: 8
DOIs
State: Published - Jul 2019
Event: 2019 American Control Conference, ACC 2019 - Philadelphia, United States
Duration: 10 Jul 2019 – 12 Jul 2019

Conference

Conference: 2019 American Control Conference, ACC 2019
Country/Territory: United States
City: Philadelphia
Period: 10/07/19 – 12/07/19

Bibliographical note

Publisher Copyright:
© 2019 American Automatic Control Council.

NREL Publication Number

  • NREL/CP-2C00-72690

Keywords

  • ADMM
  • Alternating Direction Method of Multipliers
  • artificial intelligence
  • distributed reinforcement learning
  • RL
