DPG Verhandlungen

Regensburg 2025 – Scientific Programme

DY: Dynamics and Statistical Physics Division

DY 33: Machine Learning in Dynamics and Statistical Physics I

DY 33.4: Talk

Thursday, 20 March 2025, 10:15–10:30, H47

Explaining Near-Zero Hessian Eigenvalues Through Approximate Symmetries in Neural Networks — •Marcel Kühn and Bernd Rosenow — Institute for Theoretical Physics, University of Leipzig, 04103 Leipzig, Germany

The Hessian matrix, representing the second derivative of the loss function, offers crucial insights into the loss landscape of neural networks and significantly influences optimization algorithms, model design, and generalization in deep learning. A common characteristic of the Hessian eigenspectrum is the presence of a few large eigenvalues alongside a bulk of near-zero eigenvalues. We propose that this bulk structure arises from approximate symmetries inherent in network architectures -- an often overlooked aspect. First, we demonstrate that in deep, fully connected linear networks, exact continuous symmetries that leave the loss invariant lead to zero eigenvalues in the Hessian. These zero eigenvalues and their corresponding eigenvectors can be attributed to symmetries such as rotations between weight layers. Extending this concept, we suggest that in networks with nonlinear activation functions, approximate symmetries introduce a large number of small but finite eigenvalues, viewed as perturbations of the linear case. We illustrate this phenomenon in a two-layer ReLU student-teacher setup and in a multi-layer network trained on CIFAR-10, showing that eigenvectors with small eigenvalues predominantly align with symmetry directions. Finally, we apply our symmetry-based analysis to convolutional networks, demonstrating the generality of our approach in understanding the Hessian eigenspectrum across different architectures.
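
The link between exact symmetries and zero Hessian eigenvalues in linear networks can be checked numerically on a toy case. The sketch below is not the authors' code; it assumes a two-layer linear network with hand-picked weights, a simplified loss 0.5 * ||W2 W1 - T||_F^2 (corresponding to whitened inputs), and a teacher map T chosen so that the student sits exactly at a global minimum. The loss is invariant under W1 -> G W1, W2 -> W2 G^{-1} for any invertible G, and at a critical point each generator direction (A W1, -W2 A) should lie in the null space of the Hessian. All names (`symmetry_direction`, `numerical_hessian`, the layer widths) are illustrative.

```python
import numpy as np

d_in, h, d_out = 3, 2, 2  # illustrative layer widths (assumed, not from the abstract)

# Hand-picked full-rank student weights; the teacher map T is defined as
# W2 @ W1, so the student is at a zero-loss global minimum (zero gradient).
W1 = np.array([[1.0, 0.0, 2.0],
               [0.0, 1.0, -1.0]])   # shape (h, d_in)
W2 = np.array([[1.0, 1.0],
               [0.0, 2.0]])         # shape (d_out, h)
T = W2 @ W1

def unpack(theta):
    """Split a flat parameter vector into the two weight matrices."""
    a = theta[: h * d_in].reshape(h, d_in)
    b = theta[h * d_in:].reshape(d_out, h)
    return a, b

def loss(theta):
    """Simplified squared loss of the end-to-end linear map (assumed form)."""
    a, b = unpack(theta)
    return 0.5 * np.sum((b @ a - T) ** 2)

def numerical_hessian(f, theta, eps=1e-3):
    """Hessian via central finite differences of the scalar function f."""
    n = theta.size
    H = np.zeros((n, n))
    E = eps * np.eye(n)
    for i in range(n):
        for j in range(i, n):
            H[i, j] = H[j, i] = (
                f(theta + E[i] + E[j]) - f(theta + E[i] - E[j])
                - f(theta - E[i] + E[j]) + f(theta - E[i] - E[j])
            ) / (4 * eps ** 2)
    return H

def symmetry_direction(A):
    """Tangent vector of W1 -> (I + tA) W1, W2 -> W2 (I - tA) at t = 0."""
    return np.concatenate([(A @ W1).ravel(), (-W2 @ A).ravel()])

theta0 = np.concatenate([W1.ravel(), W2.ravel()])
H = numerical_hessian(loss, theta0)
eigvals = np.linalg.eigvalsh(H)

# One generator per entry of the h x h matrix A, i.e. h*h symmetry directions.
generators = []
for i in range(h):
    for j in range(h):
        A = np.zeros((h, h))
        A[i, j] = 1.0
        generators.append(symmetry_direction(A))
S = np.stack(generators)

# H is symmetric, so row k of S @ H equals (H v_k)^T.
residuals = np.linalg.norm(S @ H, axis=1)
n_zero = int(np.sum(np.abs(eigvals) < 1e-8))

print("near-zero eigenvalues:", n_zero, "of", eigvals.size)
print("max |H v| over symmetry directions:", residuals.max())
```

For full-rank weights, the number of near-zero eigenvalues should match the h*h independent symmetry generators. The null-space property relies on the gradient vanishing: away from a critical point, H v picks up a gradient-dependent term and need not be zero. Replacing the linear map with a nonlinear activation in this sketch would be one way to probe the approximate-symmetry picture described in the abstract, though that step is not shown here.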

Keywords: Neural Networks; Hessian Matrix; Symmetry; Eigenspectrum; Loss Landscape
