Is Knowledge about Running Applications Helping Improve Runtime Prediction of HPC Jobs?

Kevin Menear, Dmitry Duplyakin

Research output: Contribution to conference (Paper)

Abstract

High-performance computing systems rely upon scheduling algorithms to achieve high utilization. These schedulers rely upon user estimates of job resource requirements, such as runtime, to determine optimal scheduling of incoming jobs. These user estimates, however, are prone to error. To mitigate this error, significant research has been directed at providing better estimates of job runtime, usually employing machine learning techniques. These techniques are dependent upon the input features selected. Among the possible features is the primary application used by the job. In a survey of more than 20 papers directed at improving runtime prediction, only four included primary application as an input feature. We focus this investigation specifically on the value of adding primary application as an input feature, and find that it does improve model performance, especially for jobs with longer runtimes, though this improvement varies based on the application used. We recommend further research to determine the cause of this variability as well as an optimal strategy for employing a mixture of models both including and not including primary application as a feature.
Original language: American English
Pages: 463-465
Number of pages: 3
DOIs
State: Published - 2023
Event: PEARC '23: Practice and Experience in Advanced Research Computing - Portland, Oregon
Duration: 23 Jul 2023 to 27 Jul 2023

Conference

Conference: PEARC '23: Practice and Experience in Advanced Research Computing
City: Portland, Oregon
Period: 23/07/23 to 27/07/23

NREL Publication Number

  • NREL/CP-2C00-88316

Keywords

  • operations
  • runtime prediction
  • SHAP values
