Cheminformatics and topological descriptor based screening for organic electronic materials

H. Shaun Kwak, Steven L. Dixon, Woody Sherman, Mathew D. Halls
Schrodinger, Inc.


Abstract

There is a pressing need for low-cost, informatics-based modeling tools for organic semiconductors that can rapidly explore the vast chemical space and guide experimental efforts. We present a fast and user-friendly cheminformatics-based framework for the discovery and design of novel organic electronic compounds. For a representative list of organic light-emitting diode (OLED) materials, topology-based quantitative structure-property relationship (QSPR) regression models were constructed and validated for a set of optoelectronic properties including redox potentials, reorganization energies, and triplet excited-state energies. Using Schrodinger’s high-throughput virtual screening technology, we confirmed that the kernel-based partial least squares (KPLS) regression approach with binary fingerprints reproduces quantum chemical predictions with good accuracy. Quick post-regression analyses, including scaffold decomposition, R-group analysis, and interactive model visualization, help identify the key structural motifs that determine the predicted target property as well as the prediction accuracy itself. The work illustrates the importance of providing an intuitive and easy-to-access interface to non-linear regression methods and chemically aware analysis tools, which enables an effective in silico design scheme for a wide selection of organic electronic materials.
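To give a concrete feel for kernel-based regression on binary fingerprints, the following minimal sketch may help. It is not the KPLS implementation used in this work: it substitutes plain kernel ridge regression for KPLS, assumes a Tanimoto similarity kernel (the abstract does not name the kernel), and uses made-up fingerprints and property values. All function names are hypothetical.

```python
# Illustrative sketch only: kernel ridge regression with an assumed Tanimoto
# kernel on toy binary fingerprints. A simpler stand-in for KPLS, not the
# authors' method; all data below are invented for demonstration.

def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit(fps, y, lam=1e-3):
    """Return dual coefficients alpha solving (K + lam*I) alpha = y."""
    n = len(fps)
    K = [[tanimoto(fps[i], fps[j]) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, y)

def predict(train_fps, alpha, fp):
    """Predict a property as a kernel-weighted sum over training compounds."""
    return sum(a * tanimoto(t, fp) for a, t in zip(alpha, train_fps))

# Hypothetical toy data: two structural families with different property values.
train_fps = [{1, 2, 3}, {2, 3, 4}, {7, 8, 9}, {8, 9, 10}]
train_y = [1.0, 1.2, 3.0, 3.1]
alpha = fit(train_fps, train_y)

# A new fingerprint sharing bits only with the first family.
query_prediction = predict(train_fps, alpha, {1, 2, 3, 4})
```

In this toy setting the query shares bits only with the first two training fingerprints, so its prediction is driven by that family alone. This is the same intuition behind using structural similarity to relate motifs to predicted properties.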