Decision tree-based identification of important molecular fragments for protein-ligand binding

Baiyi Li, Yunsong Wang, Zuode Yin, Lei Xu, Liangxu Xie, Xiaojun Xu

CHEMICAL BIOLOGY & DRUG DESIGN (2024)

Abstract
Fragment-based drug design is an emerging technology in pharmaceutical research and development. One of the key aspects of this technology is the identification and quantitative characterization of molecular fragments. This study presents a strategy for identifying important molecular fragments based on molecular fingerprints and decision tree algorithms and verifies its feasibility in predicting protein-ligand binding affinity. Specifically, the three-dimensional (3D) structures of protein-ligand complexes are encoded using extended-connectivity fingerprints (ECFP), and three decision tree-based models, namely Random Forest, XGBoost, and LightGBM, are used to quantify feature importance, thereby extracting important molecular fragments with high reliability. Few-shot learning reveals that the extracted molecular fragments contribute significantly and consistently to the binding affinity even with a small sample size. Although ECFP carries no location or distance information for molecular fragments, 3D visualization, in combination with the reverse ECFP process, shows that the majority of the extracted fragments are located at the binding interface of the protein and the ligand. This agreement with the distance constraints critical for binding affinity further supports the reliability of the strategy. In summary, important molecular fragments are identified by using ECFP fingerprints for fragment encoding and decision tree algorithms for feature importance assessment.
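The idea of ranking fingerprint bits by how much they help a tree explain binding affinity can be illustrated with a minimal sketch. It is not the authors' pipeline: it uses no ECFP encoder and no Random Forest, XGBoost, or LightGBM model. Instead it scores each bit by the variance reduction of a single split on that bit, which is the criterion regression trees typically use to choose splits. The function names `split_gain` and `rank_bits`, the four-bit fingerprints, and the affinity values are all hypothetical and chosen only for illustration.

```python
from statistics import pvariance


def split_gain(bits, y):
    """Variance reduction obtained by splitting the samples on one binary bit."""
    on = [v for b, v in zip(bits, y) if b]
    off = [v for b, v in zip(bits, y) if not b]
    if not on or not off:
        return 0.0  # a constant bit cannot split the data
    n = len(y)
    weighted = (len(on) * pvariance(on) + len(off) * pvariance(off)) / n
    return pvariance(y) - weighted


def rank_bits(fingerprints, y):
    """Return (bit_index, gain) pairs sorted by decreasing gain."""
    n_bits = len(fingerprints[0])
    gains = [
        (j, split_gain([fp[j] for fp in fingerprints], y))
        for j in range(n_bits)
    ]
    return sorted(gains, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: 6 complexes, 4 fingerprint bits, made-up affinities.
fingerprints = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
]
affinity = [8.0, 8.2, 7.9, 5.1, 5.0, 4.9]

ranking = rank_bits(fingerprints, affinity)
top_bit = ranking[0][0]  # index of the bit that best separates the affinities
for index, gain in ranking:
    print(f"bit {index}: gain {gain:.3f}")
```

In this toy set, bit 0 is set only for the high-affinity complexes, so it receives the largest gain, while bit 3 is never set and scores zero. Ensemble models aggregate scores of this kind over many splits and many trees, and a highly ranked bit would then be mapped back to the substructure that sets it.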
Keywords
binding affinity, decision tree algorithms, important molecular fragments, molecular fingerprints, protein-ligand complex