Convolutional Neural Networks Grouped by Transcription Factors for Predicting Protein-DNA Binding Site

Siquan Hu, Mengjin Fu, Haitao Jia

Proceedings of the 2019 International Conference on Artificial Intelligence and Computer Science (2019)

Abstract
Understanding the specific interactions between transcription factors (TFs) and DNA is essential for comprehending regulatory processes in biological systems. Recently, deep learning algorithms have outperformed conventional, time-consuming, and expensive methods such as ChIP-seq in predicting the sequence specificities of DNA-protein binding. However, because TF binding is cell-specific, most current deep learning methods build one model for each TF-cell line combination. This leads to problems such as the complexity of maintaining numerous models and the poor prediction performance of some models for cell lines without enough ChIP-seq data. It is therefore useful to build models with both higher accuracy and a wider range of application. We propose a method for building a series of Convolutional Neural Network (CNN) based models grouped by TF, which we call TF models. Trained on the same database of 554 ChIP-seq datasets, the proposed TF models outperform DeepBind in the motif discovery task. On the one hand, the number of models is reduced from 554 to 72, which extends the application scope of each model. On the other hand, the TF models achieve a higher AUC than DeepBind on 94.2% of TF-cell line combinations. Moreover, we demonstrate that the TF models achieve an average AUC of 0.909 when predicting the binding of TFs in cell lines that lack ChIP-seq data.
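The grouping idea in the abstract can be illustrated with a minimal sketch: datasets keyed by (TF, cell line) are pooled by TF, so that one model per TF replaces one model per combination. The TF and cell line names below are hypothetical placeholders rather than the paper's data, and `group_by_tf` is an illustrative helper, not the authors' implementation.

```python
from collections import defaultdict

# Hypothetical (TF, cell line) dataset identifiers, used only for illustration;
# they are not taken from the paper's 554 ChIP-seq datasets.
datasets = [
    ("CTCF", "K562"),
    ("CTCF", "HepG2"),
    ("CTCF", "GM12878"),
    ("MAX", "K562"),
    ("MAX", "HeLa-S3"),
]


def group_by_tf(pairs):
    """Map each TF to the cell lines whose data would be pooled for its single model."""
    groups = defaultdict(list)
    for tf, cell_line in pairs:
        groups[tf].append(cell_line)
    return dict(groups)


tf_groups = group_by_tf(datasets)

# One model per combination versus one model per TF.
print(len(datasets), "per-combination models ->", len(tf_groups), "TF models")
```

Under this scheme a TF model sees data from every cell line available for its TF, which is one plausible reading of why such a model could be applied to a cell line that has no ChIP-seq data of its own.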
Keywords
ChIP-seq, Convolutional Neural Network, DNA-Protein Binding, Transcription Factor