The DPG Spring Meeting in Dresden had to be cancelled.
CPP: Chemical and Polymer Physics Division (Fachverband Chemische Physik und Polymerphysik)
CPP 68: Topical Session: Data Driven Materials Science - Descriptors (joint session MM/CPP)
CPP 68.3: Talk
Wednesday, 18 March 2020, 12:15–12:30, BAR 205
Size-Extensive Molecular Machine Learning with Global Descriptors
•Johannes Margraf¹, Hyunwook Jung², Sina Stocker¹, Christian Kunkel¹, and Karsten Reuter¹
¹Technical University Munich, Germany; ²Yonsei University, South Korea
Machine learning (ML) models are increasingly used to predict molecular properties in high-throughput settings, at a much lower computational cost than conventional electronic structure calculations. Such ML models require descriptors that encode the molecular structure as a vector. These descriptors are generally designed to respect the symmetries and invariances of the target property. However, size-extensivity is usually not guaranteed for so-called global descriptors. In this contribution, we show how extensivity can be built into ML models with global descriptors such as the Many-Body Tensor Representation. Properties of extensive and non-extensive models for the atomization energy are systematically explored by training on small molecules and testing on small, medium, and large molecules. Our results show that the non-extensive model is only useful in the size range of its training set, whereas the extensive models provide reasonable predictions across large size differences. Remaining sources of error for the extensive models are discussed.
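As a rough illustration of what size-extensivity means for a global descriptor, the toy sketch below compares a linear model on raw element counts with the same model on counts normalized by the number of atoms. It is not the method of the talk: the element-count descriptor is a deliberately simple stand-in for a global descriptor such as the Many-Body Tensor Representation, and the function names and weights are made up for the example.

```python
from collections import Counter

# Hypothetical element ordering for the toy descriptor.
ELEMENTS = ("H", "C", "O")

def count_descriptor(atoms):
    """Unnormalized global descriptor: number of atoms of each element."""
    counts = Counter(atoms)
    return [float(counts[e]) for e in ELEMENTS]

def normalized_descriptor(atoms):
    """Same descriptor divided by the atom count, which removes size information."""
    n = len(atoms)
    return [c / n for c in count_descriptor(atoms)]

def linear_predict(weights, descriptor):
    """Linear model without intercept: prediction scales with the descriptor."""
    return sum(w * x for w, x in zip(weights, descriptor))

# Made-up weights, not fitted to any data.
WEIGHTS = [-1.0, -2.0, -1.5]

water = ["O", "H", "H"]
two_waters = water + water  # two non-interacting copies of the same molecule

# Extensive model: the prediction for two copies is twice that for one.
e_one = linear_predict(WEIGHTS, count_descriptor(water))
e_two = linear_predict(WEIGHTS, count_descriptor(two_waters))

# Non-extensive model: the normalized descriptor cannot tell one copy from two.
i_one = linear_predict(WEIGHTS, normalized_descriptor(water))
i_two = linear_predict(WEIGHTS, normalized_descriptor(two_waters))

print("extensive:    ", e_one, e_two)
print("non-extensive:", i_one, i_two)
```

In this toy setting the count-based model doubles its prediction when the system is doubled, while the normalized model returns the same value for both systems. That is the kind of behavior that would limit a non-extensive model to the size range of its training set.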