Quantifying Uncertainty in HPC Job Queue Time Predictions: Article No. 100

Kevin Menear, Connor Scully-Allison, Dmitry Duplyakin

Research output: Contribution to conference (Paper)

Abstract

High Performance Computing (HPC) has developed at an unprecedented pace in recent decades. This growth has demanded corresponding development in HPC Operational Data Analytics (ODA), which encompasses a wide range of data analysis techniques, ML/AI efforts, tools, and visualizations. Published ODA studies offer a variety of practical ways to inform HPC users, administrators, procurement managers, and other stakeholders. Uncertainty analysis, however, is rare in the related literature: of 14 existing studies on job queue time prediction, we identify only 1 that investigates the uncertainty of its proposed predictions. Uncertainty quantification matters in such predictive analytics because it affects how users interpret the information they receive, and we attempt to bridge this gap. To improve access to such insights, we develop a process for determining upper and lower bounds on the queue times predicted by a regression model at a specified confidence level. Our current research focuses on uncertainty in predicting job queue times, yet the approach may also be applied to predicting other metrics.
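The abstract does not say how the bounds are computed. As a rough illustration only, the sketch below uses one common technique that is not confirmed to be the authors' method: a split-conformal-style interval built from absolute residuals on a held-out calibration set. The function name `residual_interval`, its parameters, and the clipping of the lower bound at zero (queue times cannot be negative) are all assumptions made for this example.

```python
import math

import numpy as np


def residual_interval(cal_true, cal_pred, new_pred, confidence=0.9):
    """Hypothetical helper: symmetric bounds around regression predictions.

    cal_true / cal_pred: observed and predicted queue times on a held-out
    calibration set. new_pred: predictions that need bounds.
    """
    cal_true = np.asarray(cal_true, dtype=float)
    cal_pred = np.asarray(cal_pred, dtype=float)
    new_pred = np.asarray(new_pred, dtype=float)

    # Absolute calibration errors, sorted ascending.
    abs_resid = np.sort(np.abs(cal_true - cal_pred))
    n = abs_resid.size

    # Rank of the residual used as half-width, with the usual (n + 1)
    # finite-sample correction; capped at n (assumption: fall back to the
    # largest residual when the calibration set is too small).
    k = min(n, math.ceil((n + 1) * confidence))
    half_width = abs_resid[k - 1]

    # Assumption: queue times are non-negative, so clip the lower bound.
    lower = np.maximum(new_pred - half_width, 0.0)
    upper = new_pred + half_width
    return lower, upper


# Usage with made-up numbers (minutes); not data from the paper.
lower, upper = residual_interval(
    cal_true=[10, 20, 30, 40],
    cal_pred=[12, 18, 33, 36],
    new_pred=[1, 50],
    confidence=0.5,
)
```

A symmetric, constant-width interval is the simplest choice. Queue times are typically skewed, so an approach with separate lower and upper quantiles may fit better in practice.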
Original language: American English
Pages: 1-3
Number of pages: 3
State: Published - 2024
Event: PEARC24 - Providence, RI
Duration: 22 Jul 2024 - 25 Jul 2024

Conference

Conference: PEARC24
City: Providence, RI
Period: 22/07/24 - 25/07/24

NREL Publication Number

  • NREL/CP-2C00-90232

Keywords

  • HPC
  • job queue times
  • predictions
  • uncertainty
