Multimodal out-of-distribution individual uncertainty quantification enhances binding affinity prediction for polypharmacology.

Overview

abstract

Polypharmacology, a single drug that targets multiple proteins, holds promise for addressing unmet medical needs. Achieving accurate, reliable and scalable predictions of protein-ligand binding affinity across multiple proteins is crucial to realizing the potential of polypharmacology. Machine learning offers a powerful tool for multitarget binding affinity prediction. However, three major challenges remain: generalizing predictions to out-of-distribution compounds that are structurally different from those in the training data; quantifying the uncertainty of predictions in out-of-distribution scenarios where the assumption underlying existing methods does not hold; and scaling to billions of compounds, which remains unattainable for current structure-based methods. Here, to overcome these challenges, we propose a model-agnostic anomaly detection-based individual uncertainty quantification method: embedding Mahalanobis Outlier Scoring and Anomaly Identification via Clustering (eMOSAIC). eMOSAIC features the divergence between the multimodal representations of known cases and unseen instances and quantifies individual prediction uncertainty on a compound-by-compound basis. We integrate eMOSAIC with a multimodal deep neural network for multitarget ligand binding affinity predictions, leveraging a structure-informed large protein language model. Comprehensive validation in out-of-distribution settings demonstrates that eMOSAIC significantly outperforms state-of-the-art sequence-based and structure-based methods as well as existing uncertainty quantification approaches. These findings underscore eMOSAIC's potential to advance real-world polypharmacology and other applications that require robust predictions and scalable solutions.

authors

Badkul, Amitesh

Xie, Li

Zhang, Shuo

Xie, Lei

publication date

December 18, 2025

published in

Nature machine intelligence Journal

Identity

PubMed Central ID

PMC12798713

Scopus Document Identifier

105025357622

Digital Object Identifier (DOI)

10.1038/s42256-025-01151-2

PubMed ID

41537122

Additional Document Info

volume

7

issue

12

VIVO Weill Cornell Medical College

Multimodal out-of-distribution individual uncertainty quantification enhances binding affinity prediction for polypharmacology. Academic Article

Overview

abstract

authors

publication date

published in

Identity

PubMed Central ID

Scopus Document Identifier

Digital Object Identifier (DOI)

PubMed ID

Additional Document Info

volume

issue