Compelling ReLU Networks to Exhibit Exponentially Many Linear Regions at Initialization and During Training

  • Max Milkert
  • David Hyde
  • Forrest Laine

Research output: Contribution to conference › Paper

Abstract

In a neural network with ReLU activations, the number of piecewise linear regions in the output can grow exponentially with depth. However, this is highly unlikely to happen when the initial parameters are sampled randomly, which often leads to the use of networks that are unnecessarily large. To address this problem, we introduce a novel parameterization of the network that restricts its weights so that a depth-d network produces exactly 2^d linear regions at initialization and maintains those regions throughout training under the parameterization. This approach allows us to learn approximations of convex, one-dimensional functions that are several orders of magnitude more accurate than their randomly initialized counterparts. We further demonstrate a preliminary extension of our construction to multidimensional and non-convex functions, allowing the technique to replace traditional dense layers in various architectures.
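The exponential growth in linear regions that the abstract refers to can be illustrated with the classic tent-map construction (this sketch is not the paper's parameterization, only a standard example of the phenomenon): a single ReLU layer can realize the tent map on [0, 1], and composing it d times yields a sawtooth with exactly 2^d linear pieces. The function and region-counting helper below are hypothetical illustrations, not code from the paper.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def tent(x):
    # One ReLU layer: 2*relu(x) - 4*relu(x - 0.5) equals the tent map on [0, 1]
    # (2x on [0, 0.5], 2 - 2x on [0.5, 1]).
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5)

def deep_tent(x, depth):
    # Compose the same layer `depth` times; the result is piecewise linear
    # with 2**depth pieces on [0, 1].
    for _ in range(depth):
        x = tent(x)
    return x

def count_linear_regions(depth, m=12):
    # Count pieces by detecting slope changes on a dyadic grid of spacing
    # 2**-m, so every breakpoint k / 2**depth lands exactly on a grid point
    # and arithmetic stays exact in binary floating point (for depth <= m).
    xs = np.linspace(0.0, 1.0, 2**m + 1)
    ys = deep_tent(xs, depth)
    slopes = np.diff(ys) / np.diff(xs)
    return int(np.sum(np.diff(slopes) != 0.0)) + 1
```

With randomly sampled weights such perfectly aligned foldings essentially never occur, which is why a random depth-d network typically realizes far fewer than 2^d regions — the gap the paper's restricted parameterization is designed to close.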
Original language: American English
Number of pages: 24
State: Published - 2025
Event: ICML 2025 - Forty-Second International Conference on Machine Learning, Vancouver, Canada
Duration: 13 Jul 2025 - 19 Jul 2025

Conference

Conference: ICML 2025 - Forty-Second International Conference on Machine Learning
City: Vancouver, Canada
Period: 13/07/25 - 19/07/25

NREL Publication Number

  • NREL/CP-2C00-95312

Keywords

  • activation regions
  • linear regions
  • network initialization
  • ReLU network
