Berlin 2018 – scientific programme
BP: Fachverband Biologische Physik
BP 13: Evolutionary Game Theory (joint session SOE/DY/BP)
BP 13.3: Talk
Tuesday, March 13, 2018, 13:00–13:15, MA 001
Analytical approximation of temporal difference multi-agent reinforcement learning — •Wolfram Barfuss1,2, Jonathan F. Donges1,3, and Jürgen Kurths1,2,4 — 1Potsdam Institute for Climate Impact Research, GER — 2Humboldt University, Berlin, GER — 3Stockholm Resilience Centre, Stockholm University, SWE — 4University of Aberdeen, UK
Reinforcement learning in multi-agent systems has been studied in the fields of economic game theory, artificial intelligence, and physics. In particular, the economic and physics perspectives have led to analytical approaches to learning dynamics in multi-agent systems. However, these studies focus on simple iterated normal-form games, such as the iterated Prisoner's Dilemma. Environmental dynamics, i.e., changes in the state of the agents' environment that affect the payoffs the agents receive, are mostly lacking. In this work we combine the analytical approach from physics with temporal difference learning from the field of artificial intelligence. This form of learning explicitly uses the discounted value of future environmental states to adapt the agents' behavior. We develop a uniform notation for multi-agent environment systems, generalizable to any environment and an arbitrary number of agents. We find that four reinforcement learning variants emerge and compare their dynamics. This work is important for advancing the understanding of interlinked social and environmental dilemmas, such as climate change, pollution, and biosphere degradation.
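The core mechanism of temporal difference learning mentioned above — adapting behavior using the discounted value of future environmental states — can be sketched with standard tabular Q-learning on a toy two-state environment. This is an illustrative sketch only, not the authors' analytical approximation; the environment, rewards, and all parameter values here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions = 2, 2
alpha, gamma, eps = 0.1, 0.9, 0.1  # learning rate, discount factor, exploration rate

def step(state, action):
    """Hypothetical environment: action a moves to state a with prob. 0.9;
    being in state 1 yields reward 1.0, state 0 yields 0.0."""
    next_state = action if rng.random() < 0.9 else 1 - action
    reward = float(next_state == 1)
    return next_state, reward

Q = np.zeros((n_states, n_actions))
s = 0
for _ in range(5000):
    # epsilon-greedy action selection
    a = int(rng.integers(n_actions)) if rng.random() < eps else int(np.argmax(Q[s]))
    s_next, r = step(s, a)
    # temporal difference update: bootstrap on the discounted
    # value of the next environmental state
    Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
    s = s_next

print(np.argmax(Q, axis=1))  # greedy policy per state
```

Because the discount factor gamma weights future states, the learned policy prefers the action leading toward the rewarding state in both states; the analytical approach described in the talk studies the expected dynamics of such updates rather than single stochastic runs.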