Berlin 2018 – scientific programme
BP: Fachverband Biologische Physik
BP 13: Evolutionary Game Theory (joint session SOE/DY/BP)
BP 13.3: Talk
Tuesday, March 13, 2018, 13:00–13:15, MA 001
Analytical approximation of temporal difference multi-agent reinforcement learning — •Wolfram Barfuss1,2, Jonathan F. Donges1,3, and Jürgen Kurths1,2,4 — 1Potsdam Institute for Climate Impact Research, GER — 2Humboldt University, Berlin, GER — 3Stockholm Resilience Centre, Stockholm University, SWE — 4University of Aberdeen, UK
Reinforcement learning in multi-agent systems has been studied in the fields of economic game theory, artificial intelligence, and physics. In particular, the economic and physics perspectives have led to analytical approaches to learning dynamics in multi-agent systems. However, these studies focus on simple iterated normal-form games, such as the iterated Prisoner's Dilemma. Environmental dynamics, i.e., changes in the state of the agents' environment that affect the payoffs the agents receive, are mostly lacking. In this work we combine the analytical approach from physics with temporal difference learning from the field of artificial intelligence. This form of learning explicitly uses the discounted value of future environmental states to adapt the agents' behavior. We develop a uniform notation for multi-agent environment systems, generalizable to any environment and an arbitrary number of agents. We find that four reinforcement learning variants emerge and compare their dynamics. This work is important for advancing the understanding of interlinked social and environmental dilemmas, such as climate change, pollution, and biosphere degradation.
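The core mechanism of temporal difference learning mentioned above — adapting behavior using the discounted value of future environmental states — can be sketched with standard tabular Q-learning on a toy two-state environment. This is an illustrative sketch only, not the authors' analytical approximation; the environment, rewards, and all parameter values here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions = 2, 2
alpha, gamma, eps = 0.1, 0.9, 0.1  # learning rate, discount factor, exploration rate

def step(state, action):
    """Hypothetical environment: action a moves to state a with prob. 0.9;
    being in state 1 yields reward 1.0, state 0 yields 0.0."""
    next_state = action if rng.random() < 0.9 else 1 - action
    reward = float(next_state == 1)
    return next_state, reward

Q = np.zeros((n_states, n_actions))
s = 0
for _ in range(5000):
    # epsilon-greedy action selection
    a = int(rng.integers(n_actions)) if rng.random() < eps else int(np.argmax(Q[s]))
    s_next, r = step(s, a)
    # temporal difference update: bootstrap on the discounted
    # value of the next environmental state
    Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
    s = s_next

print(np.argmax(Q, axis=1))  # greedy policy per state
```

Because the discount factor gamma weights future states, the learned policy prefers the action leading toward the rewarding state in both states; the analytical approach described in the talk studies the expected dynamics of such updates rather than single stochastic runs.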