An Empirical Deep Dive into Deep Learning's Driving Dynamics: arXiv:2207.12547 [cs.LG]

Research output: Contribution to journal › Article

Abstract

We present an empirical dataset surveying the deep learning phenomenon on fully-connected networks, encompassing the training and test performance of numerous network topologies, sweeping across multiple learning tasks, depths, numbers of free parameters, learning rates, batch sizes, and regularization penalties. The dataset probes 178 thousand hyperparameter settings with an average of 20 repetitions each, totaling 3.5 million training runs and 20 performance metrics for each of the 13.1 billion training epochs observed. Accumulating this 671 GB dataset utilized 5,448 CPU core-years, 17.8 GPU-years, and 111.2 node-years. Additionally, we provide a preliminary analysis revealing patterns which persist across learning tasks and topologies. We aim to inspire work empirically studying modern machine learning techniques as a catalyst for the theoretical discoveries needed to progress the field beyond energy-intensive and heuristic practices.
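To make the scale of the sweep concrete, the sketch below enumerates a small hypothetical hyperparameter grid of the kind the abstract describes, crossing tasks, depths, parameter budgets, learning rates, batch sizes, and regularization penalties, with repeated runs per setting. All dimension names and values here are illustrative placeholders, not the paper's actual configuration; only the 20-repetition figure comes from the abstract.

```python
# Illustrative sketch (not the authors' code): enumerating a hyperparameter
# grid of the kind the abstract describes. Dimension names and values are
# hypothetical; the real sweep covers ~178k settings.
from itertools import product

grid = {
    "task":       ["mnist", "fashion_mnist", "cifar10"],  # learning tasks (hypothetical)
    "depth":      [2, 4, 8],                              # number of hidden layers
    "n_params":   [10_000, 100_000, 1_000_000],           # free-parameter budgets
    "lr":         [1e-3, 1e-2, 1e-1],                     # learning rates
    "batch_size": [32, 256],                              # batch sizes
    "l2_penalty": [0.0, 1e-4],                            # regularization penalties
}
REPETITIONS = 20  # the abstract reports an average of 20 repetitions per setting

# One dict per hyperparameter setting, crossed with a repetition seed per run.
settings = [dict(zip(grid, values)) for values in product(*grid.values())]
runs = [(setting, seed) for setting in settings for seed in range(REPETITIONS)]

print(f"{len(settings)} settings x {REPETITIONS} repetitions = {len(runs)} runs")
```

Each run in the dataset would then contribute per-epoch performance metrics (20 per epoch, per the abstract), which is how a few hundred thousand settings compound into billions of observed training epochs.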
Original language: American English
Number of pages: 37
Journal: ArXiv.org
State: Published - 2022

NREL Publication Number

  • NREL/JA-2C00-83002

Keywords

  • batch size
  • dataset
  • depth
  • empirical study
  • fully-connected networks
  • generalization
  • label noise
  • learning rate
  • neural architecture search
  • optimization
  • regularization
  • shape
  • topology
