DPG Verhandlungen

Göttingen 2025 – scientific programme

T: Particle Physics Division

T 33: Data, AI, Computing, Electronics III (ML in Jet Tagging, Misc.)

T 33.2: Talk

Tuesday, April 1, 2025, 16:30–16:45, VG 2.101

Aspen Open Jets: Unlocking LHC Data for Foundation Models in Particle Physics

Oz Amram¹, Luca Anzalone², •Joschka Birk³, Darius A. Faroughy⁴, Anna Hallin³, Gregor Kasieczka³, Michael Krämer⁵, Ian Pang⁴, Humberto Reyes-Gonzalez⁵, and David Shih⁴

¹Fermi National Accelerator Laboratory; ²University of Bologna; ³University of Hamburg; ⁴Rutgers University; ⁵RWTH Aachen University

A foundation model is a deep learning model pre-trained on a large dataset so that it can serve as a versatile base for fine-tuning on various downstream tasks or other datasets. This study demonstrates the utility of data collected by the CMS experiment at the Large Hadron Collider for pre-training foundation models for High Energy Physics (HEP). We present the AspenOpenJets dataset, which comprises approximately 180 million high-pT jets extracted from the CMS 2016 Open Data. We report new studies with the OmniJet-α foundation model showing that pre-training on AspenOpenJets improves performance on generative tasks that involve significant domain shifts, such as generating boosted top and QCD jets from the simulated JetClass dataset. Beyond demonstrating the effectiveness of pre-training a jet-based foundation model on real proton-proton collision data, we also make the ML-ready AspenOpenJets dataset publicly available for further research.

Keywords: Machine Learning; Foundation Models; Jet Physics; Open Data
