Abstract
Addressing the "Red AI" trend of rising energy consumption by large-scale neural networks, this study investigates the measured energy consumption of training various fully connected neural network architectures. We introduce the BUTTER-E dataset, an augmentation of the BUTTER Empirical Deep Learning dataset, containing energy consumption and performance data from 41,129 individual experimental runs spanning 30,582 distinct configurations: 13 datasets, 20 sizes (numbers of trainable parameters), 8 network "shapes", and 14 depths, run on both CPUs and GPUs and measured with node-level watt-meters. This dataset reveals the complex relationship between dataset size, network structure, and energy use. Our analysis uncovers a surprising, hardware-mediated non-linear relationship between energy efficiency and network design, challenging the assumption that reducing the number of parameters or FLOPs is the best way to achieve greater energy efficiency. We propose a straightforward and effective energy model that accounts for network size, computing, and memory hierarchy. Highlighting the need for cache-considerate algorithm development, we suggest a co-design approach to energy-efficient network, algorithm, and hardware design. This work contributes to the fields of sustainable computing and Green AI, offering practical guidance for creating more energy-efficient neural networks and promoting sustainable AI.
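As a rough illustration of the kind of cache-aware energy accounting the abstract alludes to (not the paper's actual model), the sketch below estimates the energy of a dense layer as a compute term plus a memory-traffic term that grows once the layer's weights no longer fit in cache. All coefficients, sizes, and function names here are hypothetical placeholders.

```python
# Illustrative sketch only: a generic cache-aware energy estimate for a
# fully connected layer, NOT the BUTTER-E paper's actual model.
# All coefficients below are hypothetical placeholders.

E_PER_FLOP_J = 2e-12        # assumed energy per floating-point op (joules)
E_PER_DRAM_BYTE_J = 2e-10   # assumed energy per byte moved from DRAM (joules)
CACHE_BYTES = 32 * 1024**2  # assumed last-level cache size (32 MiB)
BYTES_PER_PARAM = 4         # float32 weights

def layer_energy(n_in: int, n_out: int, batch: int) -> float:
    """Estimate energy (J) for one forward pass of a dense layer."""
    flops = 2 * n_in * n_out * batch                  # multiply-accumulate count
    weight_bytes = n_in * n_out * BYTES_PER_PARAM
    # If the weights fit in cache, assume one DRAM read of the weights;
    # otherwise assume they are re-streamed from DRAM for every sample.
    dram_bytes = weight_bytes if weight_bytes <= CACHE_BYTES else weight_bytes * batch
    return flops * E_PER_FLOP_J + dram_bytes * E_PER_DRAM_BYTE_J

# Example: similar FLOP counts can cost very different energy depending on
# whether the working set spills out of cache.
print(layer_energy(n_in=1024, n_out=1024, batch=256))  # weights fit in cache
print(layer_energy(n_in=8192, n_out=8192, batch=256))  # weights spill to DRAM
```

Under these assumed constants, the second call incurs a much larger memory-traffic term even though its per-sample FLOP count scales only quadratically, echoing the abstract's point that parameter or FLOP reduction alone does not determine energy efficiency.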
| Original language | American English |
|---|---|
| Number of pages | 33 |
| DOIs | |
| State | Published - 2026 |
NREL Publication Number
- NREL/TP-2C00-91181
Keywords
- artificial intelligence
- deep learning
- empirical
- energy efficiency
- energy model
- machine learning
- neural architecture