Chemistry  |  Biochemistry  |  Medicine

 

Khaled Ramdoun, 2003 | Zürich, ZH

 

The high-impact materials of today come from exploring only a fraction of the known chemical space. Determining the properties of molecules traditionally requires considerable labour and energy. A method to compute such properties is therefore very valuable: it has the potential to accelerate research and drastically reduce its cost in the pharmaceutical domain. Traditional methods to determine molecular properties include chromatography, spectroscopy, and electrophoresis. Computational methods can be applied to calculate molecular properties that cannot be easily measured with these traditional methods, such as dipole moment, polarisability, and internal energy. The focus of this contribution was therefore the prediction of quantum molecular properties using a deep neural network. The main goal was to explore how this method works and to apply it to a practical problem. The neural network used is a Deep Tensor Neural Network (DTNN), which was developed in a previous paper.

Introduction

The goals of this contribution are (I) to train a neural network for the prediction of the molecular properties dipole moment, polarisability, and internal energy, replacing computationally expensive quantum mechanical calculations, and (II) to determine whether the number of tasks a model is trained on affects its accuracy. In addition to these two goals, the optimisation of the model and the analysis of its convergence behaviour were an indispensable part of this paper.

Methods

The DTNN model was trained on the QM9 dataset with twelve different tasks. Three separate models and one general model were trained for the prediction of dipole moment, polarisability, and internal energy. In a further step, the individual hyperparameters of the model were isolated and tested for their effect on the mean absolute error of the model. To save computational time and energy during the optimisation process, the models were trained on only 1000 molecules, in the hope that the information gained from these models would carry over to the models trained on the whole dataset.
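The one-at-a-time procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the project: the function names `mean_absolute_error` and `sweep_one_at_a_time` and the `evaluate` callback are hypothetical. In practice, `evaluate` would train a DTNN on the 1000-molecule subset with the given configuration and return its mean absolute error.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between targets and predictions."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def sweep_one_at_a_time(
    baseline: Dict[str, float],
    grid: Dict[str, List[float]],
    evaluate: Callable[[Dict[str, float]], float],
) -> Dict[str, List[Tuple[float, float]]]:
    """Vary one hyperparameter at a time, holding all others at baseline.

    `evaluate` is a hypothetical callback that trains a model with the
    given configuration and returns its error (for example the MAE).
    Returns, for each hyperparameter, a list of (value, error) pairs.
    """
    results: Dict[str, List[Tuple[float, float]]] = {}
    for name, values in grid.items():
        results[name] = []
        for value in values:
            config = dict(baseline)  # copy so the baseline stays untouched
            config[name] = value
            results[name].append((value, evaluate(config)))
    return results
```

Because only one parameter changes per run, any change in the error can be attributed to that parameter alone. The trade-off is that interactions between hyperparameters remain invisible.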

Results

The analysis of the models shows surprisingly high mean absolute error values. Consequently, the initial models are considered underfit. Neither the general model nor the specific models showed a significant advantage over the other. In a further step, the hyperparameters were optimised by isolating the individual parameters, varying them, and checking their effect on the mean absolute error of the model. The optimal learning rate for the model trained on 1000 molecules was 0.0001, while the optimal learning rate for the model trained on the whole dataset was 0.001, a difference of a factor of ten. The size of the model showed no significant effect on the mean absolute error. This optimisation resulted in the convergence of the mean absolute error on the training dataset, while the mean absolute error on the test dataset stayed relatively stable and even started increasing.
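The pattern described above, a falling training error alongside a stagnating or rising test error, can be checked programmatically from the per-epoch error curves. The following is a minimal sketch under stated assumptions: the helper names `best_epoch` and `is_overfitting` and the `patience` threshold are hypothetical and were not part of the project.

```python
from typing import Sequence


def best_epoch(test_mae: Sequence[float]) -> int:
    """Index of the epoch with the lowest test MAE (first one on ties)."""
    if not test_mae:
        raise ValueError("test_mae must not be empty")
    return min(range(len(test_mae)), key=lambda i: test_mae[i])


def is_overfitting(
    train_mae: Sequence[float],
    test_mae: Sequence[float],
    patience: int = 3,  # hypothetical threshold, chosen for illustration
) -> bool:
    """Heuristic overfitting check on per-epoch error curves.

    Returns True if the test MAE has not improved for at least
    `patience` epochs while the training MAE has kept decreasing.
    """
    if len(train_mae) != len(test_mae):
        raise ValueError("curves must have the same number of epochs")
    best = best_epoch(test_mae)
    stalled = (len(test_mae) - 1 - best) >= patience
    train_still_improving = train_mae[-1] < train_mae[best]
    return stalled and train_still_improving
```

The epoch returned by `best_epoch` is also a natural early-stopping point: training beyond it lowers the training error without improving predictions on unseen molecules.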

Discussion

The goal was only partially fulfilled because the model still performed poorly on the test dataset after the optimisation. The final model is considered overfit. This can be attributed to a low dropout rate during training. The optimisation process itself was a success, as it led to the convergence of the mean absolute error of the model. The methods used for the optimisation were effective at isolating the individual hyperparameters and examining their effect on the mean absolute error.

Conclusions

As part of this contribution, several models based on a Deep Tensor Neural Network (DTNN) were analysed for the prediction of molecular properties. The number of molecular properties a model was trained to predict did not show an effect on the mean absolute error of the model's predictions. The initially trained DTNN models were underfit. Through the optimisation of the hyperparameters, specifically the learning rate and the model size, the model was able to learn the properties of the training dataset but failed to generalise them to the whole dataset, which means it is considered overfit. As a next step, it would make sense to train a model and analyse what effect the dropout has on its fitting behaviour. It would also be interesting to train the model for longer than 500 epochs and to vary more than one hyperparameter at a time. Due to restrictions on the Euler supercomputer, this was not possible. Finally, it would be interesting to construct a model for the prediction of molecular properties from the very basic mathematical concepts up to their application in chemical research.

 

 

Expert's Appraisal

Marc Lehner

In his work, Khaled Ramdoun engaged with topics at the frontier of quantum mechanics and artificial intelligence. He tackled highly theoretical and complex topics in cheminformatics in order to apply them to a concrete problem: the prediction of measurable quantities such as the dipole moment. Even when these tasks turned out not to be so simple, he tried to solve them systematically, in an exemplary scientific manner.

Rating:

very good

 

 

 

Rämibühl-MNG, Zürich
Teacher: René Oetterli