Regensburg 2025 – scientific programme

SOE 10.3: Talk

Thursday, March 20, 2025, 15:45–16:00, H45

Collective Turing Tests on LLMs — •Azza Bouleimen¹˒², Giordano de Marzo³, Taehee Kim³, Silvia Giordano², and David Garcia³ — ¹University of Zurich, Zurich, Switzerland — ²SUPSI, Lugano, Switzerland — ³University of Konstanz, Konstanz, Germany

In this project, we investigate whether social media conversations generated by independent LLMs are indistinguishable from those of humans, i.e., whether LLMs used to generate social media content can pass the Turing test. Concretely, we prepare a series of English-language Reddit submissions and attach two conversations to each. One of the conversations is an authentic human conversation, while the other is generated artificially using an LLM. We generate conversations using GPT-4o and Llama 3 70B, varying the temperature of the models and the length of the conversations. Participants, recruited from Prolific, are asked to select the conversation they believe was generated by AI. Our preliminary results suggest that, overall, participants are fooled by the LLM-generated conversation 40% of the time. Llama 3 70B conversations appear to fool participants more often than GPT-4o ones. Through this study, we investigate to what extent and with which configurations LLMs can best be used to simulate user conversations on social media. To the best of our knowledge, this is the first attempt to evaluate the performance of LLMs in mimicking a social media conversation among a group of individuals.
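
As an illustration of how responses in such a two-choice design could be scored, the following Python sketch computes a per-model "fooled" rate and an exact two-sided binomial p-value against the 50% chance level. It is not the authors' analysis code: the record layout, the function names `fooled_rate` and `binom_two_sided_p`, the rule that a participant counts as fooled when they pick the human conversation as the AI one, and the toy records are all assumptions made for illustration only.

```python
from collections import defaultdict
from math import comb


def fooled_rate(trials):
    """Return the fraction of fooled responses per model.

    trials: iterable of (model_name, picked_ai) pairs, where picked_ai is
    True if the participant correctly selected the LLM conversation.
    A response counts as "fooled" when picked_ai is False (assumption).
    """
    counts = defaultdict(lambda: [0, 0])  # model -> [fooled, total]
    for model, picked_ai in trials:
        counts[model][1] += 1
        if not picked_ai:
            counts[model][0] += 1
    return {model: fooled / total for model, (fooled, total) in counts.items()}


def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value for k successes out of n trials.

    Sums the probabilities of all outcomes that are no more likely than
    the observed one under the null success probability p.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for float ties
    return min(1.0, sum(q for q in pmf if q <= threshold))


# Hypothetical toy records, NOT data from the study.
toy_trials = [
    ("gpt-4o", True), ("gpt-4o", True), ("gpt-4o", True), ("gpt-4o", False),
    ("llama-3-70b", True), ("llama-3-70b", False),
    ("llama-3-70b", True), ("llama-3-70b", False),
]

rates = fooled_rate(toy_trials)
print(rates)
```

In this reading, a fooled rate near 50% would mean participants cannot tell the two conversations apart, while a rate well below 50% would mean the generated conversation is usually detected.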

Keywords: LLMs; Turing Test; Social Media


SOE: Physics of Socio-economic Systems Division (Fachverband Physik sozio-ökonomischer Systeme)

SOE 10: Focus Session: Large Language Models, Social Dynamics, and Assessment of Complex Systems