Improving the generalizability of protein-ligand binding predictions with AI-Bind

Ayan Chatterjee, Robin Walters, Zohair Shafi, Omair Shafi Ahmed, Michael Sebek, Deisy Gysi, Rose Yu, Tina Eliassi-Rad, Albert László Barabási, Giulia Menichetti

Research output: Contribution to journal › Article › peer-review

Abstract

Identifying novel drug-target interactions is a critical and rate-limiting step in drug discovery. While deep learning models have been proposed to accelerate the identification process, here we show that state-of-the-art models fail to generalize to novel (i.e., never-before-seen) structures. We unveil the mechanisms responsible for this shortcoming, demonstrating how models rely on shortcuts that leverage the topology of the protein-ligand bipartite network, rather than learning the node features. Here we introduce AI-Bind, a pipeline that combines network-based sampling strategies with unsupervised pre-training to improve binding predictions for novel proteins and ligands. We validate AI-Bind predictions via docking simulations and comparison with recent experimental evidence, and step up the process of interpreting machine learning prediction of protein-ligand binding by identifying potential active binding sites on the amino acid sequence. AI-Bind is a high-throughput approach to identify drug-target combinations with the potential of becoming a powerful tool in drug discovery.
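One way to picture the "shortcut" described in the abstract is a predictor that ignores molecular features entirely and scores a pair only from how often each node appears with a positive label in the training network. The sketch below is a toy illustration under that assumption, not the AI-Bind pipeline or the authors' code; the function names (`degree_ratios`, `shortcut_score`), the fallback `prior`, and the example data are all hypothetical.

```python
from collections import defaultdict


def degree_ratios(pairs):
    """Fraction of positive annotations per node.

    pairs: iterable of (protein, ligand, label), label 1 = binds, 0 = does not.
    Nodes are keyed by ("p", name) or ("l", name) to keep the two sides
    of the bipartite network apart.
    """
    positives = defaultdict(int)
    totals = defaultdict(int)
    for protein, ligand, label in pairs:
        for node in (("p", protein), ("l", ligand)):
            totals[node] += 1
            positives[node] += label
    return {node: positives[node] / totals[node] for node in totals}


def shortcut_score(ratios, protein, ligand, prior=0.5):
    """Score a pair from annotation counts alone (no node features).

    Nodes absent from the training network fall back to `prior`,
    an assumed uninformative default.
    """
    protein_ratio = ratios.get(("p", protein), prior)
    ligand_ratio = ratios.get(("l", ligand), prior)
    return (protein_ratio + ligand_ratio) / 2


# Hypothetical toy training network.
train = [
    ("P1", "L1", 1), ("P1", "L2", 1), ("P1", "L3", 1),
    ("P2", "L1", 0), ("P2", "L2", 0), ("P2", "L3", 1),
]

ratios = degree_ratios(train)
seen = shortcut_score(ratios, "P1", "L1")         # both nodes are in the training network
novel = shortcut_score(ratios, "P_new", "L_new")  # neither node has been seen
print(seen, novel)
```

For pairs of previously seen nodes the score tracks the training labels, but for a never-before-seen protein and ligand it collapses to the prior, which is the kind of failure on novel structures the abstract refers to.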

Original language: English
Article number: 1989
Pages (from-to): 1989
Journal: Nature Communications
Volume: 14
Issue number: 1
State: Published - 8 Apr 2023

Keywords

  • Amino Acid Sequence
  • Binding Sites
  • Ligands
  • Protein Binding
  • Proteins/metabolism
