Lite-SeqCNN: A Light-Weight Deep CNN Architecture for Protein Function Prediction
IEEE/ACM Transactions on Computational Biology and Bioinformatics (2023)
Abstract
Short- and long-range interactions among amino acids in a protein sequence are primarily responsible for the function the protein performs. Recently, convolutional neural networks (CNNs) have produced promising results on sequential data, including NLP tasks and protein sequences. However, the strength of CNNs lies primarily in capturing short-range interactions; they are less effective at long-range interactions. Dilated CNNs, on the other hand, capture both short- and long-range interactions because their receptive fields vary from short to long. Further, CNNs are quite light-weight in terms of trainable parameters, whereas most existing deep learning solutions for protein function prediction (PFP) are multi-modal, rather complex, and heavily parametrized. In this paper, we propose Lite-SeqCNN, a simple, light-weight, sequence-only PFP framework based on sub-sequences and dilated CNNs. By varying dilation rates, Lite-SeqCNN efficiently captures both short- and long-range interactions and has (0.50–0.75 times) fewer trainable parameters than contemporary deep learning models. Further, Lite-SeqCNN$^+$ is an ensemble of three Lite-SeqCNNs developed with different segment sizes that produces even better results than the individual models. The proposed architecture produced improvements of up to 5% over the state-of-the-art approaches Global-ProtEnc Plus, DeepGOPlus, and GOLabeler on three prominent datasets curated from the UniProt database.
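
As a rough illustration of why varying the dilation rate lets a convolution cover both short and long ranges without adding parameters, the following sketch computes the receptive field of a single 1D convolution and applies a naive dilated convolution to a toy sequence. It is not the authors' implementation: the kernel size of 3, the dilation rates (1, 2, 4, 8), and the function names `receptive_field` and `dilated_conv1d` are assumptions chosen for illustration only.

```python
def receptive_field(kernel_size, dilation):
    """Span of input positions seen by one output of a 1D dilated convolution."""
    return dilation * (kernel_size - 1) + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Naive 'valid' 1D dilated convolution (cross-correlation, no padding)."""
    span = receptive_field(len(kernel), dilation)
    outputs = []
    for start in range(len(signal) - span + 1):
        # Kernel taps are spaced `dilation` positions apart in the input.
        total = sum(w * signal[start + j * dilation] for j, w in enumerate(kernel))
        outputs.append(total)
    return outputs


# Hypothetical dilation rates; the kernel always has 3 weights,
# so the parameter count stays fixed while the span grows.
fields = {d: receptive_field(3, d) for d in (1, 2, 4, 8)}

toy_sequence = [1, 2, 3, 4, 5]   # stand-in for an encoded residue sequence
short_range = dilated_conv1d(toy_sequence, [1, 1, 1], dilation=1)
long_range = dilated_conv1d(toy_sequence, [1, 1, 1], dilation=2)
```

With a fixed 3-weight kernel, a dilation of 1 spans 3 adjacent positions, while a dilation of 8 spans 17 positions; the number of trainable weights per kernel is the same in both cases.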
Keywords
Dilated-Convolutional neural network, protein function prediction, protein sub-sequence