Grid-Interactive Multi-Zone Building Control Using Reinforcement Learning with Global-Local Policy Search

Research output: Contribution to conference, Paper, peer-review

6 Scopus Citations


In this paper, we develop a grid-interactive multi-zone building controller based on a deep reinforcement learning (RL) approach. The controller is designed to facilitate building operation during normal conditions and demand response events, while ensuring occupant comfort and energy efficiency. We leverage a continuous-action-space RL formulation and devise a two-stage global-local RL training framework. In the first stage, a fast global policy search is performed using a gradient-free RL algorithm. In the second stage, local fine-tuning is conducted using a policy gradient method. In contrast to the state-of-the-art model predictive control (MPC) approach, the proposed RL controller does not require complex computation during real-time operation and can adapt to nonlinear building models. We illustrate the controller's performance numerically using a five-zone commercial building.
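The two-stage idea described above can be illustrated with a minimal sketch. The abstract does not name the specific algorithms or the building model, so everything here is an assumption made for illustration: the cross-entropy method stands in for the gradient-free global search, a REINFORCE-style update stands in for the policy gradient fine-tuning, and a toy one-zone thermal model with made-up coefficients replaces the five-zone building. The function names (`rollout`, `global_search`, `local_finetune`) are hypothetical.

```python
import numpy as np


def rollout(theta, rng=None, sigma=0.0, horizon=24):
    """Simulate a toy one-zone thermal model under a linear policy.

    All dynamics and reward weights are illustrative assumptions.
    Returns (total_reward, features, exploration_noises).
    """
    temp, setpoint, outdoor = 26.0, 22.0, 30.0
    total, feats, noises = 0.0, [], []
    for _ in range(horizon):
        x = np.array([temp - setpoint, 1.0])            # policy features
        eps = sigma * rng.standard_normal() if sigma > 0 else 0.0
        u = float(np.clip(theta @ x + eps, -1.0, 1.0))  # continuous action
        temp += 0.1 * (outdoor - temp) + 2.0 * u        # toy zone dynamics
        total += -(temp - setpoint) ** 2 - 0.5 * abs(u)  # comfort + energy
        feats.append(x)
        noises.append(eps)
    return total, np.array(feats), np.array(noises)


def global_search(rng, iters=20, pop=32, elite=8):
    """Stage 1 (assumed): gradient-free search via the cross-entropy method."""
    mean, std = np.zeros(2), np.ones(2)
    best_theta, best_ret = mean.copy(), rollout(mean)[0]
    for _ in range(iters):
        cands = mean + std * rng.standard_normal((pop, 2))
        rets = np.array([rollout(c)[0] for c in cands])
        top = cands[np.argsort(rets)[-elite:]]          # keep elite samples
        mean, std = top.mean(axis=0), top.std(axis=0) + 1e-3
        if rets.max() > best_ret:
            best_theta, best_ret = cands[rets.argmax()].copy(), rets.max()
    return best_theta, best_ret


def local_finetune(theta, rng, iters=50, batch=16, sigma=0.05, lr=1e-3):
    """Stage 2 (assumed): REINFORCE-style fine-tuning from the stage-1 result.

    The best deterministic iterate is kept, so the result is never worse
    than the starting point.
    """
    best_theta, best_ret = theta.copy(), rollout(theta)[0]
    for _ in range(iters):
        grads, rets = [], []
        for _ in range(batch):
            ret, feats, noises = rollout(theta, rng, sigma)
            # Score function of a Gaussian policy: sum_t eps_t * x_t / sigma^2
            grads.append((noises[:, None] * feats).sum(axis=0) / sigma**2)
            rets.append(ret)
        rets = np.array(rets)
        adv = (rets - rets.mean()) / (rets.std() + 1e-8)  # baseline + scaling
        theta = theta + lr * (adv[:, None] * np.array(grads)).mean(axis=0)
        ret = rollout(theta)[0]
        if ret > best_ret:
            best_theta, best_ret = theta.copy(), ret
    return best_theta, best_ret


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta_g, ret_g = global_search(rng)
    theta_l, ret_l = local_finetune(theta_g, rng)
    print("global stage return:", ret_g)
    print("after local fine-tuning:", ret_l)
```

The split mirrors the motivation given in the abstract: a population-based search explores the parameter space broadly without gradients, and the gradient-based stage then refines the result locally. Once trained, evaluating the policy is a single function evaluation per time step, in contrast to solving an optimization problem online as MPC does.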

Original language: American English
Number of pages: 8
State: Published - 25 May 2021
Event: 2021 American Control Conference, ACC 2021 - Virtual, New Orleans, United States
Duration: 25 May 2021 - 28 May 2021


Conference: 2021 American Control Conference, ACC 2021
Country/Territory: United States
City: Virtual, New Orleans

Bibliographical note

See NREL/CP-2C00-78000 for preprint

NREL Publication Number

  • NREL/CP-2C00-80740


Keywords

  • building control
  • demand response
  • reinforcement learning

