Regensburg 2019 – scientific programme
DY: Dynamics and Statistical Physics Division (Fachverband Dynamik und Statistische Physik)
DY 36: Condensed-matter simulations augmented by advanced statistical methodologies (joint session DY/CPP)
DY 36.7: Talk
Wednesday, April 3, 2019, 16:45–17:00, H20
Unsupervised machine learning of chemical compound space for hierarchical screening and coarse-graining applications — •Kiran Kanekal, Kurt Kremer, and Tristan Bereau — Max Planck Institute for Polymer Research, Mainz, Germany
Recently, the number and variety of chemical fingerprints used for supervised learning of molecular properties have increased significantly, as it has become clear that properly encoding the 3D structure of a molecule greatly increases prediction accuracy. Some of the most successful of these fingerprints encode a great deal of physical information, including coefficients of the potential energy functions commonly used in classical atomistic molecular dynamics simulations [1]. In this work, we incorporate these fingerprints into an unsupervised machine learning (clustering) scheme to define subspaces of chemical compound space, for which we use the Generated DataBase (GDB) [2] as a proxy. We show that different molecular fingerprints lead to significantly different clusterings, as each fingerprint highlights specific molecular properties. Unsupervised learning coupled with these fingerprints therefore naturally enables hierarchical screening approaches for materials design. Furthermore, strong correlations between clusters identified using different fingerprints imply that lowering the resolution (i.e., coarse-graining) is viable in that region of chemical compound space.
[1] B. Huang and O. A. von Lilienfeld, J. Chem. Phys. 2016, 145, 161102.
[2] T. Fink and J.-L. Reymond, J. Chem. Inf. Model. 2007, 47, 342.
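The idea of clustering the same molecules under different fingerprints and then comparing the resulting partitions can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three "fingerprints" are made-up toy vectors rather than real molecular descriptors, plain k-means stands in for whichever clustering scheme the authors used, and the adjusted Rand index is just one possible way to quantify agreement between two clusterings.

```python
from collections import Counter
from math import comb, dist


def kmeans(points, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        # Add the point that is farthest from its nearest existing center.
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def adjusted_rand_index(a, b):
    """Chance-corrected agreement between two labelings of the same items."""
    n = len(a)
    sum_ij = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings put everything in one cluster
    return (sum_ij - expected) / (max_index - expected)


# Hypothetical toy fingerprints for the same six molecules (NOT real descriptors).
fp_a = [(0.0, 0.1), (0.1, 0.0), (0.2, 0.1), (5.0, 5.1), (5.1, 4.9), (4.9, 5.0)]
fp_b = [(1.0, 0.0), (1.1, 0.1), (0.9, 0.0), (9.0, 1.0), (9.1, 0.9), (8.9, 1.1)]
fp_c = [(0.0,), (4.0,), (0.1,), (4.1,), (0.2,), (4.2,)]

labels_a = kmeans(fp_a, k=2)
labels_b = kmeans(fp_b, k=2)
labels_c = kmeans(fp_c, k=2)

print("A vs B:", adjusted_rand_index(labels_a, labels_b))
print("A vs C:", adjusted_rand_index(labels_a, labels_c))
```

In this toy setting, fingerprints A and B group the molecules identically, which is the kind of strong correlation the abstract associates with regions where a lower-resolution description may suffice, whereas fingerprint C partitions the same molecules differently and so highlights a different property.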