Regensburg 2025 – scientific programme
HL: Semiconductor Physics Division (Fachverband Halbleiterphysik)
HL 3: Focus Session: Machine Learning of semiconductor properties and spectra
HL 3.1: Invited Talk
Monday, March 17, 2025, 09:30–10:00, H17
Alexandria Database - Improving machine-learning models in materials science through large datasets
•Jonathan Schmidt (1), Tiago Cerqueira (3), Aldo Romero (4), Silvana Botti (2), and Miguel Marques (2)
(1) Department of Materials, ETH Zürich, Zürich, CH-8093, Switzerland
(2) Research Center Future Energy Materials and Systems of the University Alliance Ruhr and ICAMS, Ruhr University Bochum, D-44801, Bochum, Germany
(3) CFisUC, Department of Physics, University of Coimbra, Larga, 3004-, Coimbra, Portugal
(4) Department of Physics, West Virginia University, Morgantown, WV, 26506, USA
Accurate machine-learning models hinge on large, high-quality datasets for both training and validation, a need that is particularly acute in materials science, where robust datasets are scarce. To address this gap, we present Alexandria, an open-access database generated through high-throughput studies that use crystal-graph-attention networks to identify novel, stable crystal structures. The database includes over five million density-functional theory calculations for periodic compounds spanning one, two, and three dimensions.
Leveraging data generated with higher-fidelity exchange-correlation functionals, we also explore the effectiveness of transfer-learning strategies in detail. Finally, we assess to what extent incorporating experimental data can further improve predictions of DFT band gaps.
Schmidt, Jonathan, et al. Materials Today Physics 48 (2024): 101560.
Keywords: Machine Learning; Big Data; Database; Density Functional Theory; Materials Discovery