High-Performance Virus Detection System by using Deep Learning.

CEC (2020)

Abstract
Metagenomic shotgun sequencing enables us to explore diverse DNA sequences from viruses, bacteria, and eukaryotic microbes in complex samples. As continuous advances in sequencing technology generate massive amounts of sequencing data, the computational cost of analyzing these data has become a major challenge for traditional database sequence comparison methods. Deep learning-oriented methods have been widely adopted to solve many classification problems, including those in bioinformatics, and have demonstrated accuracy and efficiency in analyzing large-scale datasets. This study investigates how deep learning (an LSTM model) can be used to learn sequential genome patterns for virus detection from metagenomic data. It makes three major contributions. First, we provide the background and steps for the task of DNA sequence classification, covering data collection, preprocessing, and normalization. Second, we analyze the effect of sequence length on LSTM classification accuracy and split the raw sequencing data into subsequences of suitable length to improve virus detection. Third, to enhance both classification accuracy and processing speed, we introduce a discrimination function that combines the prediction results for multiple subsequences, and we accelerate these processes through GPU parallel computing. Two case studies, on HCV and influenza detection, were conducted to evaluate the accuracy and computational efficiency of the proposed approach. Our test results showed that the proposed LSTM model achieved pathogen detection accuracy similar to that of the conventional BLAST method while running about 36 times faster.
Keywords
Deep Learning, LSTM, GPU Acceleration, Parallel Computing, Metagenomic Shotgun Sequencing, DNA Sequence Classification
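
The abstract describes splitting each raw read into subsequences and then combining the per-subsequence predictions with a discrimination function. The paper's actual function is not given here, so the following is only a minimal sketch under stated assumptions: the names `split_read`, `discriminate`, and `classify_read` are hypothetical, the rule is assumed to be a simple vote over subsequence scores, and `score_fn` stands in for a trained LSTM that returns a viral probability for one subsequence.

```python
from typing import Callable, List, Sequence


def split_read(read: str, window: int, stride: int) -> List[str]:
    """Cut a read into fixed-length subsequences.

    A trailing remainder shorter than `window` is dropped.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    return [read[i:i + window] for i in range(0, len(read) - window + 1, stride)]


def discriminate(scores: Sequence[float],
                 score_cutoff: float = 0.5,
                 vote_cutoff: float = 0.5) -> bool:
    """Assumed voting rule, not the paper's definition.

    The read is called viral when the fraction of subsequences whose
    score is at least `score_cutoff` is strictly greater than `vote_cutoff`.
    """
    if not scores:
        return False
    votes = sum(1 for s in scores if s >= score_cutoff)
    return votes / len(scores) > vote_cutoff


def classify_read(read: str,
                  score_fn: Callable[[str], float],
                  window: int,
                  stride: int) -> bool:
    """Score every subsequence with `score_fn` and combine the results."""
    scores = [score_fn(sub) for sub in split_read(read, window, stride)]
    return discriminate(scores)
```

In a real pipeline `score_fn` would wrap batched model inference on the GPU, which is where the parallel speedup mentioned in the abstract would come from; the per-subsequence scores are independent, so they can be computed in any order before the vote.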