Regensburg 2025 – Scientific Programme
SOE: Physics of Socio-economic Systems Division
SOE 10: Focus Session: Large Language Models, Social Dynamics, and Assessment of Complex Systems
SOE 10.3: Talk
Thursday, 20 March 2025, 15:45–16:00, H45
Collective Turing Tests on LLMs — •Azza Bouleimen1, 2, Giordano de Marzo3, Taehee Kim3, Silvia Giordano2, and David Garcia3 — 1University of Zurich, Zurich, Switzerland — 2SUPSI, Lugano, Switzerland — 3University of Konstanz, Konstanz, Germany
In this project, we investigate whether social media conversations generated by independent LLMs are indistinguishable from those of humans, i.e., whether LLMs used to generate social media content can pass the Turing test. Specifically, we conduct an experiment in which we prepare a series of English Reddit submissions, each paired with two conversations. One of the conversations is an authentic human conversation, while the other is generated artificially using an LLM. We generate conversations using GPT-4o and Llama 3 70B, varying the temperature of the models and the length of the conversations. We recruit participants from Prolific and ask them to select the conversation they believe is generated by AI. Our preliminary results suggest that, overall, participants are fooled by the LLM-generated conversation 40% of the time. Conversations generated by Llama 3 70B appear to fool users more often than those generated by GPT-4o. Through this study, we investigate to what extent, and with which configurations, LLMs can best be used to simulate user conversations on social media. To the best of our knowledge, this is the first attempt to evaluate the performance of LLMs in mimicking a social media conversation among a group of individuals.
Keywords: LLMs; Turing Test; Social Media
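The abstract does not describe the generation pipeline in detail. The following is a minimal Python sketch, under stated assumptions, of how an artificial Reddit-style comment thread could be produced with one of the models named above (GPT-4o via the OpenAI API) at a chosen temperature; the prompt wording, turn-by-turn scheme, and parameter values are illustrative assumptions, not the authors' actual setup.

# Illustrative sketch only: prompts, turn structure, and parameters are assumptions,
# not the authors' pipeline. Requires the openai package and OPENAI_API_KEY.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_conversation(submission: str, n_comments: int, temperature: float) -> list[str]:
    """Generate an artificial Reddit-style comment thread for a given submission."""
    comments: list[str] = []
    for _ in range(n_comments):
        thread = "\n".join(f"Comment {j + 1}: {c}" for j, c in enumerate(comments))
        prompt = (
            "You are a Reddit user replying in a comment thread.\n"
            f"Submission: {submission}\n"
            f"Thread so far:\n{thread if thread else '(no comments yet)'}\n"
            "Write the next comment in a natural, informal tone."
        )
        response = client.chat.completions.create(
            model="gpt-4o",          # the study also used Llama 3 70B
            temperature=temperature,  # varied across experimental conditions
            messages=[{"role": "user", "content": prompt}],
        )
        comments.append(response.choices[0].message.content.strip())
    return comments

# Example: a 5-comment thread at temperature 1.0
# thread = generate_conversation("What book changed your life?", n_comments=5, temperature=1.0)

Such a generated thread would then be shown next to an authentic human thread for the same submission, with participants asked to pick the one they believe is AI-generated; the fraction of trials in which they pick the human thread gives the 40% "fooled" rate reported above.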