MolTC: Towards Molecular Relational Modeling In Language Models
CoRR (2024)
Abstract
Molecular Relational Learning (MRL), aiming to understand interactions
between molecular pairs, plays a pivotal role in advancing biochemical
research. Recently, the adoption of large language models (LLMs), known for
their vast knowledge repositories and advanced logical inference capabilities,
has emerged as a promising route to efficient and effective MRL. Despite their
potential, these methods predominantly rely on textual data and thus do not fully
harness the wealth of structural information inherent in molecular graphs.
Moreover, the absence of a unified framework exacerbates the issue of
information underutilization, as it hinders the sharing of interaction
mechanisms learned across diverse datasets. To address these challenges, this
work proposes a novel LLM-based multi-modal framework for Molecular inTeraction
prediction following Chain-of-Thought (CoT) theory, termed MolTC, which
effectively integrates the graph information of the two molecules in a pair. To
achieve unified MRL, MolTC develops a dynamic
parameter-sharing strategy for cross-dataset information sharing. Moreover, to
train MolTC efficiently, we introduce a Multi-hierarchical CoT concept to
refine its training paradigm, and construct a comprehensive Molecular Interactive
Instructions dataset for the development of biochemical LLMs involving MRL. Our
experiments, conducted across various datasets involving over 4,000,000
molecular pairs, demonstrate the superiority of our method over current GNN-based and
LLM-based baselines. Code is available at https://github.com/MangoKiller/MolTC.