MaskDNA-PGD: An innovative deep learning model for detecting DNA methylation by integrating mask sequences and adversarial PGD training as a data augmentation method

Zhiwei Zheng, Nguyen Quoc Khanh Le, Matthew Chin Heng Chua

CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS(2023)

Abstract
DNA methylation is involved in various mammalian diseases, such as cancer and myocardial pain. Researchers have long tried to use machine learning and deep learning to learn the characteristics of DNA sequences with high precision for methylation classification. However, these studies primarily innovated in encoding and seldom employed deep neural networks for prediction. Hence, this research proposes a framework that applies random masking and adversarial sample generation as a preceding data augmentation stage. The proposed classification model is composed of a convolutional neural network (CNN), bidirectional long short-term memory (Bi-LSTM), and an attention mechanism as predictors. The benchmark illustrates the automation and advancement of the proposed framework, which can perform accurate binary classification of diverse DNA methylation types. Ablation experiments show that random masking and adversarial sample generation are both effective. In detail, our model achieved best accuracies of 85.07%, 94.97%, and 92.17% in predicting multi-species N4-methylcytosine, 5-methylcytosine, and N6-methyladenine sites, respectively. Moreover, when compared with two other methods on the same datasets and metrics, the proposed model (namely MaskDNA-PGD) outperforms both. Finally, MaskDNA-PGD can be freely accessed via https://github.com/willyzzz/MaskDNA-PGD.
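The two augmentation steps named in the abstract, random masking and projected gradient descent (PGD) adversarial sample generation, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the mask rate, the `N` mask token, the one-hot encoding, the values of `eps`, `alpha`, and `steps`, and every function name below are assumptions chosen for illustration, and a plain logistic model stands in for the CNN, Bi-LSTM, and attention predictor.

```python
import numpy as np

BASES = "ACGT"

def random_mask(seq, mask_rate=0.15, mask_token="N", rng=None):
    """Replace a random subset of positions with mask_token (rate is an assumption)."""
    rng = rng or np.random.default_rng()
    chars = list(seq)
    n_mask = int(round(mask_rate * len(chars)))
    for i in rng.choice(len(chars), size=n_mask, replace=False):
        chars[i] = mask_token
    return "".join(chars)

def one_hot(seq):
    """Encode A/C/G/T as one-hot rows; masked positions become all-zero rows."""
    x = np.zeros((len(seq), 4))
    for i, ch in enumerate(seq):
        if ch in BASES:
            x[i, BASES.index(ch)] = 1.0
    return x

def loss_and_grad(x, y, w, b):
    """Binary cross-entropy of a logistic stand-in model and its gradient w.r.t. x."""
    z = np.sum(x * w) + b
    p = 1.0 / (1.0 + np.exp(-z))
    loss = -(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    return loss, (p - y) * w

def pgd_perturb(x, y, w, b, eps=0.1, alpha=0.03, steps=5):
    """PGD: signed-gradient ascent on the loss, projected back onto the
    L-infinity ball of radius eps around the clean input x."""
    x_adv = x.copy()
    for _ in range(steps):
        _, g = loss_and_grad(x_adv, y, w, b)
        x_adv = x_adv + alpha * np.sign(g)          # ascent step
        x_adv = x + np.clip(x_adv - x, -eps, eps)   # projection step
    return x_adv
```

In an adversarial training loop, the perturbed input `x_adv` would be fed to the model alongside the clean input so that both contribute to the training loss; how the released code combines the two is not stated in the abstract.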
Keywords
DNA methylation,Convolutional neural network,Bidirectional long short term memory,Data augmentation,Adversarial network,Sequence encoding