Regensburg 2025 – Scientific Programme
DY: Dynamics and Statistical Physics Division
DY 42: Stochastic Thermodynamics
DY 42.6: Talk
Friday, March 21, 2025, 10:45–11:00, H43
Is learning in Neural Networks just very high-dimensional parameter fitting? — •Ibrahim Talha Ersoy — Universität Potsdam, Institut für Astronomie und Physik, Potsdam, Germany
Neural networks (NNs) are known for their highly non-convex loss landscapes, shaped by the data and the error function. Unlike in convex optimization, the model must navigate regions of changing curvature to find the global minimum. We anticipate qualitative changes in the model at points where new error basins are explored. In information bottleneck settings, Tishby et al. (2015) suggested that transitions between distinct loss regions are associated with phase transitions, a concept proven in L2 setups by Ziyin et al. (2023), who examined the onset of learning as the strength of the L2 regularizer is varied. We extend the findings of Ziyin et al. (2023), interpret them from an information-geometric perspective, and demonstrate further phase transitions at points where the model changes. By distinguishing between the loss and error landscapes, we provide a rigorous argument that extends the scope of our results beyond the L2 setup. This approach clarifies the limitations of the free-energy interpretation of the L2 loss function and provides a more accurate depiction. Finally, our results suggest a clear distinction between learning, characterised by phase transitions at points of model change, and fitting, where the model remains qualitatively fixed and no phase transitions occur.
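As a minimal illustration of the kind of onset-of-learning transition referred to above (a sketch under an assumed toy error term, not the setup of the talk; the symbols a and b are illustrative parameters), consider an L2-regularized scalar loss with a double-well error landscape:

\mathcal{L}_\lambda(\theta) \;=\; \underbrace{-\tfrac{a}{2}\,\theta^2 + \tfrac{b}{4}\,\theta^4}_{E(\theta)} \;+\; \lambda\,\theta^2, \qquad a, b > 0,

\theta^*(\lambda) \;=\;
\begin{cases}
\pm\sqrt{(a - 2\lambda)/b}, & \lambda < a/2,\\
0, & \lambda \ge a/2.
\end{cases}

The global minimizer is thus a non-analytic function of the regularizer strength: above the critical value \lambda_c = a/2 only the trivial (non-learning) solution survives, while below it nontrivial weights appear, in analogy with a continuous phase transition in which \lambda plays a temperature-like role in the free-energy reading \mathcal{L} = E + \lambda\,\|\theta\|^2.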
Keywords: Neural Networks; Information Theory; Phase Transitions; Information Geometry; Deep Learning