Abstract
This paper develops an online algorithm to solve a time-varying optimization problem whose objective comprises a known time-varying cost and an unknown function. This problem structure arises in a number of engineering and cyber–physical systems where the known function captures time-varying engineering costs and the unknown function models user satisfaction; in this context, the objective is to strike a balance between given performance metrics and user satisfaction. The key challenges are (1) the time variability of the problem, and (2) the fact that the user's utility function is learned concurrently with the execution of the online algorithm. This paper leverages Gaussian processes (GPs) to learn the unknown cost function from noisy functional evaluations and to build pertinent upper confidence bounds. Using the GP formalism, the paper then uses time-varying optimization tools to design an online algorithm that tracks the oracle-based optimal trajectory within an error ball, while learning the user's satisfaction function with no regret. The algorithmic steps are inexact, to account for possible limited computational budgets or real-time implementation considerations. Numerical examples are presented for a problem related to vehicle control.
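The general idea of combining a GP-based upper confidence bound with a single online gradient step per time instant can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the algorithm from the paper: the squared-exponential kernel, the weight `beta`, the step size, the known cost `(x - sin(0.1 t))^2`, the hidden function `user_satisfaction`, and the finite-difference gradient are all hypothetical choices made for the example.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b (assumed kernel)."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise_std=0.1):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    k_tt = rbf_kernel(x_train, x_train) + noise_std**2 * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    mean = k_qt @ np.linalg.solve(k_tt, y_train)
    v = np.linalg.solve(k_tt, k_qt.T)
    var = rbf_kernel(x_query, x_query).diagonal() - np.sum(k_qt * v.T, axis=1)
    return mean, np.sqrt(np.maximum(var, 0.0))

def ucb(x_train, y_train, x_query, beta=2.0, noise_std=0.1):
    """Upper confidence bound: posterior mean plus beta times posterior std."""
    mean, std = gp_posterior(x_train, y_train, x_query, noise_std)
    return mean + beta * std

def online_step(x, t, x_train, y_train, step=0.2, eps=1e-4, lo=-2.0, hi=2.0):
    """One projected gradient step on (known cost at time t) minus UCB."""
    # Gradient of the hypothetical known cost (x - sin(0.1 t))^2.
    grad_cost = 2.0 * (x - np.sin(0.1 * t))
    # Central finite-difference approximation of the UCB gradient.
    u = ucb(x_train, y_train, np.array([x + eps, x - eps]))
    grad_ucb = (u[0] - u[1]) / (2.0 * eps)
    # Project back onto the feasible interval [lo, hi].
    return float(np.clip(x - step * (grad_cost - grad_ucb), lo, hi))

def user_satisfaction(x):
    """Hypothetical satisfaction function; the algorithm only sees noisy samples."""
    return -(x - 0.5) ** 2

rng = np.random.default_rng(0)
x = 0.0
xs = np.array([x])
ys = np.array([user_satisfaction(x) + 0.1 * rng.standard_normal()])
for t in range(1, 50):
    # Take one step, then collect noisy feedback at the new decision.
    x = online_step(x, t, xs, ys)
    y = user_satisfaction(x) + 0.1 * rng.standard_normal()
    xs, ys = np.append(xs, x), np.append(ys, y)
```

Taking a single step per time instant, rather than solving each time-indexed problem to optimality, is one simple way to respect a limited computational budget; the learning and the optimization then run concurrently, as the abstract describes.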
Original language | American English |
---|---|
Article number | 109767 |
Number of pages | 15 |
Journal | Automatica |
Volume | 131 |
State | Published - 2021 |
Bibliographical note
Publisher Copyright: © 2021 Elsevier Ltd
NREL Publication Number
- NREL/JA-5D00-80460
Keywords
- Cyber–physical systems
- Gaussian processes
- Machine learning
- Online optimization
- Upper-confidence bounds