Informatics & Molecular Modelling | Department of Chemistry

This course and the associated computer lab cover Molecular Modelling and Cheminformatics as applied to the search for new drugs or materials with specific properties or physiological effects (in silico drug discovery). Students will learn the general principles of structure-activity relationship modelling, docking and scoring, homology modelling, statistical learning methods and advanced data analysis. They will gain familiarity with software for structure-based and ligand-based drug discovery. Some coding and scripting will be required.


  1. Introduction:
    • Drug Discovery in the Information-rich age
    • Introduction to Pattern recognition and Machine Learning
    • Supervised and unsupervised learning paradigms and examples
    • Potential applications of Machine Learning in Cheminformatics & Bioinformatics
    • Introduction to Classification and Regression methods
  2. Representation of Chemical Structure and Similarity:
    • Sequence Descriptors
    • Text mining
    • Representations of 2D Molecular Structures: SMILES
    • Chemical File Formats, 3D Structure
    • Descriptors and Molecular Fingerprints
    • Graph Theory and Topological Indices
    • Progressive incorporation of chemically relevant information into molecular graphs
    • Substructural Descriptors
    • Physicochemical Descriptors
    • Descriptors from Biological Assays
    • Representation and characterization of 3D Molecular Structures
    • Pharmacophores
    • Molecular Interaction Field Based Models
    • Local Molecular Surface Property Descriptors
    • Quantum Chemical Descriptors
    • Shape Descriptors
    • Protein Shape Comparisons, Motif Models
    • Molecular Similarity Measures
    • Chemical Space and Network graphs
    • Semantic technologies and Linked Data
  3. Mapping Structure to Response: Predictive Modelling:
    • Linear Free Energy Relationships
    • Quantitative Structure-Activity/Property Relationships (QSAR/QSPR) Modelling
    • Ligand-Based and Structure-Based Virtual High Throughput Screening
    • 3D Methods: Pharmacophore Modelling and Alignment
    • ADMET Models
    • Activity Cliffs
    • Structure Based Methods, docking and scoring
    • Model Domain of Applicability
  4. Data Mining and Statistical Methods:
    • Linear and Non-Linear Models
    • Data preprocessing and performance measures in Classification & Regression
    • Feature selection
    • Principal Component analysis
    • Partial Least-Squares Regression
    • kNN, Classification trees and Random forests
    • Cluster and Diversity analysis
    • Introduction to kernel methods
    • Support vector machine classification and regression
    • Introduction to Neural Nets
    • Self-Organizing Maps
    • Deep Neural Networks
    • Introduction to evolutionary computing
    • Genetic Algorithms
    • Data Fusion
    • Model Validation
    • Best Practices in Predictive Cheminformatics
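
As a flavour of the coding and scripting involved, the topics "Descriptors and Molecular Fingerprints" and "Molecular Similarity Measures" can be illustrated with a minimal sketch of the Tanimoto coefficient. It assumes each fingerprint is given as a set of "on" bit indices; the function name `tanimoto`, the example bit sets, and the convention of returning 0.0 for two empty fingerprints are illustrative assumptions, not part of the course material.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) coefficient between two fingerprints.

    Each fingerprint is a collection of the indices of its "on" bits.
    """
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        # Assumed convention: two empty fingerprints have similarity 0.0.
        return 0.0
    # Shared on-bits divided by all on-bits present in either fingerprint.
    return len(a & b) / union


# Hypothetical fingerprints: they share bits 1 and 4 out of 5 distinct bits.
fp_a = {1, 4, 7, 9}
fp_b = {1, 4, 8}
similarity = tanimoto(fp_a, fp_b)
```

In practice fingerprints are usually generated by a cheminformatics toolkit rather than written by hand; the set-based form here only shows the arithmetic behind the measure.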


  1. Johann Gasteiger, Thomas Engel, Chemoinformatics: A Textbook (Wiley-VCH, 2003)
  2. Jürgen Bajorath (Editor), Chemoinformatics and Computational Chemical Biology (Methods in Molecular Biology) (Humana Press, 2004)
  3. Leach & Gillet, An Introduction to Chemoinformatics

Prerequisites: Basic Organic chemistry/Biochemistry, Basic Statistics, Computer Programming.
