Application of Capsule Networks in Cheminformatics | Department of Chemistry

The first application of Capsule Networks in Cheminformatics.

R. Dutt, N. Sukumar, Drug Activity Prediction using Capsule Networks and Dynamic Routing Algorithm, 15th German Conference on Cheminformatics, Nov. 3-5, 2019, Mainz, Germany.

Convolutional Neural Networks (CNNs) have dominated the field of computer vision for decades, with different variants of this Deep Neural Network (DNN) being used for vision tasks such as image processing, image recognition, image segmentation, and feature extraction. However, CNNs do not take into account the spatial hierarchies between simple and complex objects. Capsule Networks circumvent this issue by modeling hierarchical relationships using the Dynamic Routing algorithm and have achieved state-of-the-art results with this approach [1].

This ability of Capsule Networks has been employed for the task of drug activity prediction on several highly imbalanced datasets. We have modified the architecture of the network by replacing all the 2D convolutional layers with 1D convolutions so that it accepts a vector of features as input. Dropout and regularization (L2) layers have been added to the architecture, and the number of nodes has been reduced to prevent overfitting. Different combinations of descriptors (RECON [2] and MOE [3]) have been used for three datasets: Checkpoint kinase 1 (CHK1) [7] inhibitors, Cyclin-dependent kinase 2 (CDK2) inhibitors [8], and Urokinase inhibitors [9].

Several approaches, such as oversampling, undersampling, and initialization of class weights, have been used to counter the high imbalance in the datasets, and the results have been compared. A further improvement in results is obtained by combining two feature selection methods (ranking by feature importance and threshold-based feature removal). Different values of the threshold have been tried to obtain the best performance on a validation set of molecules not used in training. A Support Vector Classifier (SVC) has been used as a baseline model for comparison. Capsule Networks have demonstrated significantly better performance than the SVC on all the datasets across several metrics.
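The Dynamic Routing algorithm named in the abstract can be illustrated with a minimal sketch of routing-by-agreement as introduced by Sabour, Frosst, and Hinton (2017). This is not the authors' implementation: the function names (`squash`, `softmax`, `dynamic_routing`), the toy prediction vectors, and the choice of three routing iterations are assumptions made for illustration, and the code uses plain Python lists rather than a deep learning framework.

```python
import math

def squash(s):
    """Squash non-linearity: keeps the direction of s, maps its norm into [0, 1)."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0 for _ in s]
    scale = (norm_sq / (1.0 + norm_sq)) / math.sqrt(norm_sq)
    return [scale * x for x in s]

def softmax(logits):
    """Numerically stable softmax over a list of routing logits."""
    m = max(logits)
    exps = [math.exp(b - m) for b in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dynamic_routing(u_hat, num_iterations=3):
    """Route lower-level capsules to upper-level capsules by agreement.

    u_hat[i][j] is the prediction vector that lower capsule i makes for
    upper capsule j. Returns the upper capsule outputs v and the coupling
    coefficients c that were used to compute them.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits start at zero
    for _ in range(num_iterations):
        # Each lower capsule distributes its output over the upper capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            # Weighted sum of the predictions for upper capsule j, then squash.
            s_j = [sum(c[i][j] * u_hat[i][j][k] for i in range(n_in))
                   for k in range(dim)]
            v.append(squash(s_j))
        # Raise the logit wherever a prediction agrees with the output.
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(p * q for p, q in zip(u_hat[i][j], v[j]))
    return v, c

# Toy input (hypothetical): 3 lower capsules, 2 upper capsules, 2-D vectors.
# The predictions for upper capsule 0 are large and agree with each other.
u_hat = [
    [[1.0, 0.0], [0.1, 0.0]],
    [[0.9, 0.1], [0.0, 0.1]],
    [[1.0, 0.1], [0.1, 0.1]],
]
v, c = dynamic_routing(u_hat)
```

In this toy case the lower capsules agree strongly on upper capsule 0, so their coupling coefficients shift toward it over the iterations. The length of each output vector stays below 1 and can be read as the probability that the corresponding entity is present.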

Conclusions

  • Capsule Networks perform better than Support Vector Machines on all three datasets.
  • Capsule Networks perform better than other classifiers such as Random Forest on 2 out of 3 datasets (CHK1, CDK2) and are marginally behind on the Urokinase dataset.
  • Capsule Networks show promise in Drug Activity Prediction and Virtual High Throughput Screening.
  • The interpretation of the Capsules in terms of chemical features is presently under investigation.

Faculty

Professor

Students

Raman Dutt
B.Tech. in Computer Science and Engineering
Class of: 2020