Welcome to this space dedicated to the M2D2 Talks co-organized by Valence Discovery and Mila - Quebec AI Institute.
From applied research papers to open source projects, we're hoping to use these talks to help demystify AI for drug discovery and make the field more accessible for newcomers. M2D2 will bring our vibrant AI & drug discovery communities together and spark new perspectives, provoke discussions, and offer a safe space to share new ideas.
A wide range of drug discovery topics will be covered, reflecting the diversity of tools and methodologies in the community:
- Applications of ML in computational molecular design
- Representation learning for small- and macromolecules
- Prediction of molecular properties and bioactivities
- ML for quantum chemistry and molecular dynamics
- Generative models for de novo molecular design
- Multiparameter optimization and compound selection
- Interpretable and explainable activity models
- Reaction prediction, retrosynthesis, and synthesis planning
Latest Recorded Talks
Whenever possible, slides and videos will be available after each talk.
Beyond Atoms and Bonds: Contextual Explainability via Molecular Graphical Depictions
The field of explainable AI applied to molecular property prediction models has often been reduced to deriving atomic contributions. This has impaired the interpretability of such models, as chemists tend to think in terms of larger, chemically meaningful structures, which often do not simply reduce to the sum of their atomic constituents. In this talk I will present an explanatory framework yielding both local and more complex structural attributions. The key idea is to derive such contextual explanations in pixel space, exploiting the property that a molecule is not merely encoded through a collection of atoms and bonds, as is the case for string- or graph-based approaches. I’ll provide evidence that the proposed explanation method satisfies desirable properties, namely sparsity and invariance with respect to the molecule’s symmetries.
Bayesian modelling of synergistic drug combination effects in cancer using Gaussian Processes
High-throughput drug sensitivity experiments in cancer enable rapid in-vitro testing of various compounds on cancer cell lines, or patient-derived material, in order to determine the efficacy of a certain treatment. Accurate prediction of dose-response functions from a limited set of pre-clinical experiments is key to exploring the large space of possible treatment options and to prioritizing which experiments to perform. This is particularly important when predicting the effect of drug combinations, where it is infeasible to test all possible combinations. Drug sensitivity experiments are noisy by nature, due in part to the natural biological variability of cell growth but also to technical error sources in the assays. As a result, the experimental observations of dose-response at different concentrations of a drug vary in estimation certainty both within and between experiments, a variability that has often been ignored in the literature.
In the bayesynergy R package, we implement a probabilistic approach for the description of the drug combination experiment, where the observed dose-response curve is modelled as a sum of the expected response under a zero-interaction model and an additional interaction effect (synergistic or antagonistic). The interaction is modelled in a flexible manner, using a latent Gaussian process formulation. Since the proposed approach is based on a statistical model, it allows the natural inclusion of replicates, handles missing data and uneven concentration grids, and provides uncertainty quantification around the results.
We further extend the model from single-experiment to multi-experiment modelling and propose PIICM: a probabilistic framework for dose-response prediction in high-throughput drug combination datasets. PIICM utilizes a permutation invariant version of the intrinsic co-regionalization model for multi-output Gaussian Process regression to predict dose-response surfaces in untested drug combination experiments. The permutation invariance accounts for natural symmetries in the dose-response surfaces for drug combinations, which, when not accounted for, can have detrimental effects on prediction performance. Coupled with an observation model that incorporates experimental uncertainty, PIICM is able to learn from noisily observed cell-viability measurements in settings where the underlying dose-response experiments are of varying quality and the training dataset is sparsely observed. We show that the model can accurately predict dose-response in held-out experiments, and the resulting function captures relevant features indicating synergistic interaction between drugs.
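To give newcomers a feel for what permutation invariance means here, the following is a minimal sketch, not the PIICM construction itself. It assumes a plain squared-exponential (RBF) kernel over a pair of concentrations and makes it invariant to swapping the two drugs by averaging over both orderings. The function names `rbf`, `swap`, and `symmetric_kernel` are hypothetical and chosen only for illustration.

```python
import math


def rbf(x, y, lengthscale=1.0):
    """Squared-exponential kernel between two points given as tuples."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * sq_dist / lengthscale ** 2)


def swap(x):
    """Swap the roles of the two drugs: (conc_a, conc_b) -> (conc_b, conc_a)."""
    return (x[1], x[0])


def symmetric_kernel(x, y, base=rbf):
    """Average the base kernel over both drug orderings of each argument.

    Illustrative assumption: this is generic group averaging over the
    two-element swap group, not the kernel used in PIICM.
    """
    return 0.25 * (
        base(x, y)
        + base(swap(x), y)
        + base(x, swap(y))
        + base(swap(x), swap(y))
    )


# Two hypothetical experiments, each a pair of drug concentrations.
x = (0.1, 2.0)
y = (1.0, 0.5)

plain_original = rbf(x, y)
plain_swapped = rbf(swap(x), y)
sym_original = symmetric_kernel(x, y)
sym_swapped = symmetric_kernel(swap(x), y)

print("plain kernel:    ", plain_original, plain_swapped)
print("symmetric kernel:", sym_original, sym_swapped)
```

The plain kernel gives different values depending on which drug is listed first, while the averaged kernel does not. A Gaussian process built on such a kernel would therefore predict the same response for (drug A, drug B) as for (drug B, drug A) at the correspondingly swapped concentrations.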
Three open-source initiatives to get you started with AI in drug discovery
This talk features three open-source projects that are lowering the barriers to entry in AI for drug discovery and improving the speed at which researchers can develop new computational methods for the field.
Datamol is a Python library built on top of RDKit that aims to be as light as possible. Its main features are a simple Pythonic API, easy manipulation of molecular objects with good default options, built-in efficient parallelization, out-of-the-box support for modern IO operations using fsspec, and easy pre-processing of molecular datasets for ML pipelines.
TDC (Therapeutics Data Commons) is an open-science initiative with AI/ML-ready datasets and AI/ML tasks for therapeutics, spanning the discovery and development of safe and effective medicines. TDC provides an ecosystem of tools, libraries, leaderboards, and community resources, including data functions, strategies for systematic model evaluation, meaningful data splits, data processors, and molecule generation oracles. All resources are integrated via an open Python library.
TorchDrug is designed to cover graph machine learning in drug discovery. It includes methods from graph neural networks, geometric deep learning, knowledge graphs, deep generative models, reinforcement learning and more. It provides a comprehensive and flexible interface to support rapid prototyping of drug discovery models in PyTorch.
Hadrien Mary, Chence Shi and Zuobai Zhang, Kexin Huang