ProFSA: Self-supervised Pocket Pretraining via Protein Fragment-Surroundings Alignment
ICLR 2024 (2023)
Abstract
Pocket representations play a vital role in various biomedical applications,
such as druggability estimation, ligand affinity prediction, and de novo drug
design. While existing geometric features and pretrained representations have
demonstrated promising results, they usually treat pockets independently of
ligands, neglecting the fundamental interactions between them. However, the
limited number of pocket-ligand complex structures available in the PDB (fewer
than 100 thousand non-redundant pairs) hampers large-scale pretraining
endeavors for interaction modeling. To address this constraint, we propose a
novel pocket pretraining approach that leverages knowledge from high-resolution
atomic protein structures, assisted by highly effective pretrained small
molecule representations. By segmenting protein structures into drug-like
fragments and their corresponding pockets, we obtain a reasonable simulation of
ligand-receptor interactions, resulting in the generation of over 5 million
complexes. Subsequently, the pocket encoder is trained in a contrastive manner
to align with the representations of pseudo-ligands furnished by pretrained
small-molecule encoders. Our method, named ProFSA, achieves state-of-the-art
performance across various tasks, including pocket druggability prediction,
pocket matching, and ligand binding affinity prediction. Notably, ProFSA
surpasses other pretraining methods by a substantial margin. Moreover, our work
opens up a new avenue for mitigating the scarcity of protein-ligand complex
data through the utilization of high-quality and diverse protein structure
databases.
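
The contrastive alignment step described above can be illustrated with a minimal sketch. The abstract states only that the pocket encoder is trained "in a contrastive manner" against pseudo-ligand representations; the symmetric InfoNCE form, the temperature value, and all function names below are assumptions made for illustration, not the authors' implementation. The pseudo-ligand embeddings are treated as fixed inputs, standing in for the output of a pretrained small-molecule encoder.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over a list of scores."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def contrastive_alignment_loss(pocket_emb, ligand_emb, temperature=0.07):
    """Symmetric InfoNCE loss (assumed form); row i of each input is a matched pair.

    pocket_emb: embeddings from the trainable pocket encoder.
    ligand_emb: embeddings of the pseudo-ligand fragments (treated as fixed here).
    temperature: hypothetical value, not taken from the source.
    """
    p = [l2_normalize(v) for v in pocket_emb]
    g = [l2_normalize(v) for v in ligand_emb]
    n = len(p)
    # sim[i][j]: cosine similarity of pocket i and pseudo-ligand j, temperature-scaled.
    sim = [
        [sum(a * b for a, b in zip(p[i], g[j])) / temperature for j in range(n)]
        for i in range(n)
    ]
    # Pocket -> ligand direction: each pocket should pick out its own fragment.
    p2l = -sum(log_softmax(sim[i])[i] for i in range(n)) / n
    # Ligand -> pocket direction: same objective on the transposed matrix.
    sim_t = [list(col) for col in zip(*sim)]
    l2p = -sum(log_softmax(sim_t[j])[j] for j in range(n)) / n
    return 0.5 * (p2l + l2p)


# Toy batch of three pseudo-ligand embeddings in two dimensions.
ligand = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
# Pocket embeddings that match their fragments exactly.
aligned = contrastive_alignment_loss(ligand, ligand)
# Pocket embeddings paired with the wrong fragments.
shuffled = contrastive_alignment_loss([ligand[1], ligand[2], ligand[0]], ligand)
```

In this toy batch the loss is lower when each pocket embedding coincides with its own fragment embedding than when the pairs are shuffled, which is the behavior a contrastive objective of this kind is meant to reward.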
Keywords
Drug Discovery, Pretraining