Computer-Based Reaction Prediction

Technology

Solving most chemical problems in industry requires combining several simulation methods. In our platform, this is done by linking different tasks in a workflow. A prominent example is the theoretical prediction and analysis of chemical reactions, which uses both quantum chemical methods and machine learning (ML).

Detailed knowledge of chemical reactions is one of the most important building blocks of the chemical and pharmaceutical industry. Without it, efficient synthesis of new products, such as active pharmaceutical ingredients or materials for OLEDs, is unthinkable. To develop new products, improve existing ones, or reduce manufacturing costs, a wide range of reactions must currently be designed and tested in the laboratory in small, repetitive steps. These laboratory processes are expensive and time-consuming, and they generate large amounts of chemical waste, with high environmental impact and disposal costs. In addition, experiments can cover only a small fraction of the combinatorially vast space of chemical variations.

In silico methods can address these challenges. The problem is first decomposed into subproblems, whose results are then combined into an overall picture. The subproblems call for different technologies: quantum chemical methods cannot easily predict a product from the reactants, while ML and AI methods cannot simulate and elucidate the reaction mechanism.

In a retrosynthesis, i.e., when a synthesis route is sought for a given molecule, we start from the desired product and first use ML to find possible reaction cascades. Such a cascade shows possible ways to make the desired product from commercially available reactants. For an already known synthesis, on the other hand, it is essential to know whether unwanted byproducts could form along the route and, if so, which ones; this matters, for example, in the synthesis of active pharmaceutical ingredients. Here too, ML can be used to predict potential side reactions.
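The search for a reaction cascade can be pictured as a tree search that stops once every branch ends in a purchasable reactant. The following is a minimal sketch under stated assumptions, not the platform's implementation: a hand-written lookup table (TOY_DISCONNECTIONS) stands in for the ML model that proposes precursors, and all molecule names are hypothetical placeholders.

```python
def plan_route(target, propose, purchasable, max_depth=5):
    """Depth-first search for a route from purchasable reactants to target.

    Returns a list of (product, precursors) steps in synthesis order,
    or None if no route is found within max_depth.
    """
    if target in purchasable:
        return []  # nothing to synthesize
    if max_depth == 0:
        return None
    for precursors in propose(target):
        steps = []
        for precursor in precursors:
            sub_route = plan_route(precursor, propose, purchasable, max_depth - 1)
            if sub_route is None:
                break  # this disconnection fails, try the next proposal
            steps.extend(sub_route)
        else:
            return steps + [(target, precursors)]
    return None


# Hypothetical stand-in for an ML model: product -> candidate precursor sets.
TOY_DISCONNECTIONS = {
    "product": [("intermediate_x", "reagent_c"), ("intermediate", "reagent_a")],
    "intermediate": [("reagent_b", "reagent_c")],
}
PURCHASABLE = {"reagent_a", "reagent_b", "reagent_c"}


def propose(molecule):
    """Return candidate precursor sets for a molecule (toy lookup)."""
    return TOY_DISCONNECTIONS.get(molecule, [])


route = plan_route("product", propose, PURCHASABLE)
for made, used in route:
    print(f"{' + '.join(used)} -> {made}")
```

In this toy data the first proposal for "product" is a dead end, because "intermediate_x" is neither purchasable nor further decomposable, so the search falls back to the second proposal. A real system would rank proposals by model score rather than try them in table order.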

Building on this knowledge, individual reactions can now be studied in more detail. To do this, ML is first used to interpret the grammar of the reaction and to map the atoms of the reactants onto those of the products. A classical molecular dynamics simulation then generates and optimizes three-dimensional structures for the reactants and products, which form the basis for all further investigations.

To allow a quantum chemical analysis of the reaction, the reactants and products are next studied separately. Their molecular geometries are optimized with density functional theory (DFT), and the resulting ground-state structures are then validated by normal mode analysis: a true minimum has no imaginary vibrational frequencies. As a by-product, an infrared (IR) spectrum of the reactants and products is obtained, which can be compared with experimental measurements.
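The validation step can be illustrated on a toy model. The sketch below assumes a two-dimensional model energy surface instead of a DFT calculation, and it skips the mass weighting and the removal of translations and rotations that a real normal mode analysis performs. It builds a finite-difference Hessian and counts negative eigenvalues, which correspond to imaginary frequencies. The surface and all function names are illustrative assumptions.

```python
import math


def energy(x, y):
    """Toy surface (assumed): minima at (-1, 0) and (1, 0), saddle at (0, 0.5)."""
    return (x * x - 1.0) ** 2 + 5.0 * (y - 0.5 * (1.0 - x * x)) ** 2


def hessian(f, x, y, h=1e-4):
    """Second derivatives of f at (x, y) by central finite differences."""
    fxx = (f(x + h, y) - 2.0 * f(x, y) + f(x - h, y)) / (h * h)
    fyy = (f(x, y + h) - 2.0 * f(x, y) + f(x, y - h)) / (h * h)
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4.0 * h * h)
    return fxx, fxy, fyy


def eigenvalues(fxx, fxy, fyy):
    """Eigenvalues of the symmetric 2x2 matrix [[fxx, fxy], [fxy, fyy]]."""
    mean = 0.5 * (fxx + fyy)
    radius = math.hypot(0.5 * (fxx - fyy), fxy)
    return mean - radius, mean + radius


def count_imaginary_modes(f, x, y, tol=1e-6):
    """Number of negative Hessian eigenvalues (imaginary frequencies)."""
    return sum(1 for value in eigenvalues(*hessian(f, x, y)) if value < -tol)


print("minimum:", count_imaginary_modes(energy, 1.0, 0.0))
print("saddle: ", count_imaginary_modes(energy, 0.0, 0.5))
```

A count of zero marks a minimum, while exactly one negative eigenvalue marks a first-order saddle point, i.e., a transition state candidate.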

Finally, to gain detailed insight into the reaction kinetics and to understand and optimize processes, the course of the chemical reaction is predicted quantum-chemically by searching for the minimum energy path on the high-dimensional potential energy surface between two local minima (reactants and products). One possible approach to this challenging task is the nudged elastic band (NEB) method (1). In the simplest case, a linear reaction path is first interpolated, the resulting "images" are connected by spring forces, and the structures along the path are then optimized; the highest-energy image approximates the transition state. The reaction mechanism can then be extracted from the course of the reaction and interpreted. Other reaction parameters, such as the activation energy, can also be determined.
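The NEB idea can be sketched on a toy problem. The example below makes several assumptions: a two-dimensional model surface replaces the quantum chemical energy, the tangent is estimated from the two neighboring images, and the images are relaxed by plain steepest descent. It is a plain NEB without the climbing image of (1), and all names and parameters are illustrative.

```python
import math

A, K = 0.5, 5.0  # shape parameters of the toy surface (assumed)


def energy(p):
    """Toy surface: minima at (-1, 0) and (1, 0), saddle at (0, A) with E = 1."""
    x, y = p
    return (x * x - 1.0) ** 2 + K * (y - A * (1.0 - x * x)) ** 2


def gradient(p):
    """Analytic gradient of the toy surface."""
    x, y = p
    d = y - A * (1.0 - x * x)
    return 4.0 * x * (x * x - 1.0) + 4.0 * K * A * x * d, 2.0 * K * d


def interpolate(start, end, n_images):
    """Linear initial path, including both end points."""
    return [[s + (e - s) * i / (n_images - 1) for s, e in zip(start, end)]
            for i in range(n_images)]


def neb(images, spring=1.0, step=0.01, n_steps=2000):
    """Relax the interior images; the end points stay fixed."""
    path = [list(p) for p in images]
    for _ in range(n_steps):
        new_path = [list(p) for p in path]
        for i in range(1, len(path) - 1):
            prev, cur, nxt = path[i - 1], path[i], path[i + 1]
            # Unit tangent estimated from the neighboring images.
            tx, ty = nxt[0] - prev[0], nxt[1] - prev[1]
            norm = math.hypot(tx, ty)
            tx, ty = tx / norm, ty / norm
            # Keep only the gradient component perpendicular to the path.
            gx, gy = gradient(cur)
            g_par = gx * tx + gy * ty
            # The spring force acts only along the path and evens out spacing.
            d_next = math.hypot(nxt[0] - cur[0], nxt[1] - cur[1])
            d_prev = math.hypot(cur[0] - prev[0], cur[1] - prev[1])
            f_spring = spring * (d_next - d_prev)
            new_path[i][0] = cur[0] + step * (-(gx - g_par * tx) + f_spring * tx)
            new_path[i][1] = cur[1] + step * (-(gy - g_par * ty) + f_spring * ty)
        path = new_path
    return path


initial = interpolate((-1.0, 0.0), (1.0, 0.0), 7)
relaxed = neb(initial)
barrier_initial = max(energy(p) for p in initial) - energy(initial[0])
barrier_relaxed = max(energy(p) for p in relaxed) - energy(relaxed[0])
print(f"barrier on linear path:  {barrier_initial:.3f}")
print(f"barrier on relaxed path: {barrier_relaxed:.3f}")
```

On this surface the straight line between the minima crosses a ridge of height 2.25, whereas the true saddle point lies at height 1.0 (in arbitrary units). The relaxed band should therefore give a markedly lower barrier estimate than the initial guess, which is the quantity read off as the activation energy.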

For reaction prediction, methods from different theoretical fields were combined and linked in a complex workflow. This combination benefits from the high microscopic prediction accuracy of quantum chemical methods and from the speed and macroscopic prediction accuracy of AI models. Because the workflows are highly flexible, even complex use cases can be mapped and simulated.

Literature

(1) G. Henkelman, B. P. Uberuaga, and H. Jónsson, "A climbing image nudged elastic band method for finding saddle points and minimum energy paths," J. Chem. Phys. 113, 9901 (2000).