DPG Verhandlungen

Berlin 2018 – Scientific Programme


DY: Dynamics and Statistical Physics Division

DY 28: Evolutionary Game Theory (joint session SOE/DY/BP)

DY 28.3: Talk

Tuesday, March 13, 2018, 13:00–13:15, MA 001

Analytical approximation of temporal difference multi-agent reinforcement learning — •Wolfram Barfuss^1,2, Jonathan F. Donges^1,3, and Jürgen Kurths^1,2,4 — ^1Potsdam Institute for Climate Impact Research, GER — ^2Humboldt University, Berlin, GER — ^3Stockholm Resilience Centre, Stockholm University, SWE — ^4University of Aberdeen, UK

Reinforcement learning in multi-agent systems has been studied in economic game theory, artificial intelligence, and physics. The economic and physics perspectives in particular have led to analytical treatments of learning dynamics in multi-agent systems. However, these studies focus on simple iterated normal-form games, such as the iterated Prisoner's Dilemma. Environmental dynamics, i.e. changes in the state of the agents' environment that affect the payoffs the agents receive, are mostly lacking. In this work we combine the analytical approach from physics with temporal difference learning from the field of artificial intelligence. This form of learning explicitly uses the discounted value of future environmental states to adapt an agent's behavior. We develop a uniform notation for multi-agent environment systems that generalizes to any environment and an arbitrary number of agents. We find that four reinforcement learning variants emerge and compare their dynamics. This work advances the understanding of interlinked social and environmental dilemmas, such as climate change, pollution, and biosphere degradation.
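To illustrate the kind of learning the abstract refers to — temporal difference updates that bootstrap on the discounted value of future environmental states — here is a minimal sketch, not the authors' model: two independent tabular Q-learners in a hypothetical two-state game where joint defection tends to degrade the environment. All payoffs, transition probabilities, and parameter values are illustrative assumptions.

```python
import numpy as np

# Toy multi-agent environment system (illustrative, not from the talk):
# state 0 = "healthy" environment, state 1 = "degraded".
N_STATES, N_ACTIONS = 2, 2          # actions: 0 = cooperate, 1 = defect
GAMMA, ALPHA, EPS = 0.9, 0.1, 0.1   # discount, learning rate, exploration

def payoff(state, a1, a2):
    # Hypothetical payoffs: defecting pays more now, but the degraded
    # state yields lower base payoffs for everyone.
    base = 1.0 if state == 0 else 0.2
    return base + 0.5 * a1, base + 0.5 * a2

def step(state, a1, a2, rng):
    # Environmental dynamics: joint defection pushes the state towards
    # degradation; the environment recovers only slowly.
    p_degrade = 0.8 if (a1 + a2) == 2 else 0.1
    if state == 0:
        return 1 if rng.random() < p_degrade else 0
    return 0 if rng.random() < 0.05 else 1

def choose(Q, s, rng):
    # Epsilon-greedy action selection.
    if rng.random() < EPS:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(Q[s]))

rng = np.random.default_rng(0)
Q1 = np.zeros((N_STATES, N_ACTIONS))
Q2 = np.zeros((N_STATES, N_ACTIONS))
s = 0
for _ in range(20000):
    a1, a2 = choose(Q1, s, rng), choose(Q2, s, rng)
    r1, r2 = payoff(s, a1, a2)
    s_next = step(s, a1, a2, rng)
    # Temporal-difference update: each agent bootstraps on the
    # discounted value of the *next* environmental state.
    Q1[s, a1] += ALPHA * (r1 + GAMMA * Q1[s_next].max() - Q1[s, a1])
    Q2[s, a2] += ALPHA * (r2 + GAMMA * Q2[s_next].max() - Q2[s, a2])
    s = s_next
```

The TD update line is what distinguishes this from learning in a repeated normal-form game: the `GAMMA * Q[s_next].max()` term couples today's action values to the future state of the shared environment.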
