Abstract
High Performance Computing (HPC) has developed at an unprecedented pace in recent decades. This growth has demanded corresponding development in HPC Operational Data Analytics (ODA), which encompasses a wide range of data analysis techniques, ML/AI efforts, tools, and visualizations. Published ODA studies offer a variety of practical ways to inform HPC users, administrators, procurement managers, and other stakeholders. Uncertainty analysis, however, is rare in this literature. For instance, of 14 existing studies focused on job queue time prediction, we identify only one that investigates the uncertainty of its proposed predictions. We recognize how important uncertainty quantification can be in such predictive analytics solutions, since it affects how users interpret the information they receive, and we attempt to bridge this gap. With the goal of improving access to such insights, we develop a process for determining upper and lower bounds on the queue times predicted by a regression model at a specified confidence level. Our current research focuses on the uncertainty in predicting job queue times, yet our approach may be employed in predicting other metrics.
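The abstract does not say how the bounds are computed. As an illustration only, the sketch below shows one generic way to attach upper and lower bounds to a regression prediction at a chosen confidence level, using a split-conformal-style quantile of held-out absolute residuals. This is a swapped-in technique and not necessarily the authors' process. The function names, the single feature (requested node count), and the synthetic data are all hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def residual_quantile(abs_residuals, confidence):
    """Finite-sample conformal quantile of absolute calibration residuals."""
    n = len(abs_residuals)
    rank = math.ceil((n + 1) * confidence)  # 1-based rank in the sorted list
    if rank > n:
        return float("inf")  # too few calibration points for this level
    return sorted(abs_residuals)[rank - 1]


def prediction_interval(x, model, half_width):
    """Return (lower, upper) around the point prediction; lower is clipped at 0
    because a queue time cannot be negative."""
    a, b = model
    point = a + b * x
    return max(0.0, point - half_width), point + half_width


if __name__ == "__main__":
    # Synthetic data (assumption): queue time in minutes grows with node count.
    rng = random.Random(0)
    nodes = [rng.randint(1, 64) for _ in range(400)]
    queue = [max(0.0, 5 + 2 * n + rng.gauss(0, 10)) for n in nodes]

    # Fit on one half, calibrate the interval width on the other half.
    model = fit_line(nodes[:200], queue[:200])
    a, b = model
    calib_resid = [abs(q - (a + b * n)) for n, q in zip(nodes[200:], queue[200:])]
    half_width = residual_quantile(calib_resid, confidence=0.90)

    lower, upper = prediction_interval(16, model, half_width)
    print(f"16 nodes: predicted queue time between {lower:.1f} and {upper:.1f} min")
```

Calibrating on data that the model was not fitted to keeps the interval width from being understated by overfitting. The symmetric width is a simplification, and skewed queue time distributions may call for separate upper and lower quantiles.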
Original language | American English |
---|---|
Pages | 1-3 |
Number of pages | 3 |
State | Published - 2024 |
Event | PEARC24 - Providence, RI Duration: 22 Jul 2024 → 25 Jul 2024 |
Conference
Conference | PEARC24 |
---|---|
City | Providence, RI |
Period | 22 Jul 2024 → 25 Jul 2024 |
NREL Publication Number
- NREL/CP-2C00-90232
Keywords
- HPC
- job queue times
- predictions
- uncertainty