Artificial Intelligence for Data Center Operations (AIOps): Cooperative Research and Development Final Report, CRADA Number CRD-19-00804

Research output: NREL Technical Report

Abstract

High-performance computing (HPC) data centers will increasingly need to rely on automation to keep pace with exascale growth in compute capability and to manage and optimize the data center environment and facility resources. Artificial intelligence and machine learning approaches provide the means to improve HPC data center operational efficiency by learning historical trends and training models to operate on real-time data collected from both IT and facilities sources. NREL has developed methods of real-time collection, aggregation, and streaming of these data in the ESIF HPC Data Center and has collected a significant dataset of relevant metrics across computer systems, racks, environmental, building, and utility sources for research into various predictive analytics problems. HPE's Advanced Technology Group (ATG) is doing comprehensive research into exascale monitoring and management for HPC systems (hereinafter HPE's Data Monitoring/Management Technology). NREL and HPE will collaborate to add Artificial Intelligence (AI) to NREL's real-time data collection/aggregation/streaming system and HPE's Data Monitoring/Management System, with the goal of improving the operational efficiency of NREL's Energy Systems Integration Facility (ESIF) HPC Data Center through data analytics on both historical and real-time data from IT systems and facilities operations. This collaboration will consist of efforts in Data Management, Data Analytics, and AI/ML Optimization for both manual and autonomous intervention in data center operations. This will be a multi-year, multi-staged effort with the goal of building capabilities for an Advanced Smart Facility and demonstrating these techniques in the NREL ESIF HPC Data Center.
Original language: American English
Number of pages: 19
State: Published - 2025

NREL Publication Number

  • NREL/TP-2C00-95486

Keywords

  • CRADA
  • data center
  • smart facility
