Berlin 2024 – Scientific Programme
DY: Dynamics and Statistical Physics Division
DY 19: Machine Learning in Dynamics and Statistical Physics II (joint session DY/SOE)
DY 19.9: Talk
Tuesday, 19 March 2024, 11:45–12:00, BH-N 243
Fluctuating weight dynamics and loss landscape in deep linear networks — •Markus Gross — DLR, Institute for AI Safety and Security, Germany
Understanding how weights fluctuate during training of neural networks and how this impacts the loss landscape is key to optimizing training processes and performance. We investigate these dynamics in (deep) linear networks within the continuum limit of stochastic gradient descent. For a two-layer network, we highlight the role of the inter-layer coupling and analytically derive from first principles a recently discovered key relation between weight fluctuations and loss landscape. The uncovered behaviors are rooted in general statistical properties of the network architecture and training data.
Reference: https://arxiv.org/abs/2311.14120
Keywords: neural networks; stochastic gradient descent; stochastic processes; loss landscape
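
Illustrative sketch (not part of the abstract): the setting described above, a two-layer linear network trained with minibatch stochastic gradient descent, can be explored numerically with a few lines of NumPy. Everything in the block is an assumption made for illustration: the synthetic Gaussian data, the layer sizes, the learning rate, the batch size, and the hypothetical function name `train_two_layer_linear`. It is not the author's method and does not reproduce the analytical relation mentioned in the abstract; it only records weight snapshots late in training so that their fluctuations can be inspected. A small learning rate is used so that the discrete updates stay reasonably close to a continuum description.

```python
import numpy as np


def train_two_layer_linear(n_steps=5000, batch_size=8, lr=0.02, seed=0):
    """Train y = x @ W1 @ W2 with minibatch SGD on synthetic data.

    All sizes and hyperparameters are illustrative assumptions.
    Returns the full-batch loss per step and the flattened weights
    recorded during the second half of training.
    """
    rng = np.random.default_rng(seed)
    d_in, d_hidden, d_out, n_samples = 4, 3, 2, 256

    # Synthetic linear teacher with additive label noise (assumption).
    X = rng.normal(size=(n_samples, d_in))
    W_true = rng.normal(size=(d_in, d_out))
    Y = X @ W_true + 0.1 * rng.normal(size=(n_samples, d_out))

    # Two weight matrices; the network output is their product applied to x.
    W1 = 0.5 * rng.normal(size=(d_in, d_hidden))
    W2 = 0.5 * rng.normal(size=(d_hidden, d_out))

    losses, snapshots = [], []
    for step in range(n_steps):
        # Draw a minibatch without replacement.
        idx = rng.choice(n_samples, size=batch_size, replace=False)
        xb, yb = X[idx], Y[idx]

        # Gradients of the minibatch loss (1 / 2B) * sum ||x W1 W2 - y||^2.
        err = xb @ W1 @ W2 - yb
        grad_W2 = (xb @ W1).T @ err / batch_size
        grad_W1 = xb.T @ (err @ W2.T) / batch_size

        W1 -= lr * grad_W1
        W2 -= lr * grad_W2

        # Track the loss on the full data set.
        full_err = X @ W1 @ W2 - Y
        losses.append(0.5 * np.mean(np.sum(full_err**2, axis=1)))

        # Record weights only in the second half of training.
        if step >= n_steps // 2:
            snapshots.append(np.concatenate([W1.ravel(), W2.ravel()]))

    return np.array(losses), np.array(snapshots)


losses, snapshots = train_two_layer_linear()

# Empirical covariance of the recorded weights (one row per snapshot).
weight_cov = np.cov(snapshots, rowvar=False)

# Variance along each principal direction of the weight fluctuations.
fluctuation_variances = np.linalg.eigvalsh(weight_cov)
```

The eigenvalues in `fluctuation_variances` give the spread of the weights along each principal direction and could be compared against the curvature of the loss along the same directions. One caveat of this sketch: because the product `W1 @ W2` is unchanged under compensating rescalings of the two layers, the loss has flat directions, so the measured covariance may mix fluctuations around the minimum with slow drift along those directions.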