Stack-DHUpred: Advancing the accuracy of dihydrouridine modification sites detection via stacking approach

Md. Harun-Or-Roshid, Kazuhiro Maeda, Le Thi Phan, Balachandran Manavalan, Hiroyuki Kurata

Computers in Biology and Medicine (2024)

Abstract
Dihydrouridine (DHU, D) is one of the most abundant post-transcriptional uridine modifications found in tRNA, mRNA, and snoRNA, and it is closely associated with disease pathogenesis and various biological processes in eukaryotes. Identifying D sites is important for understanding the modification mechanisms and/or epigenetic regulation. However, biological experiments for detecting D sites are time-consuming and expensive. Given these challenges, computational methods have been developed to identify D sites in genome-wide datasets. Existing methods nevertheless have limitations, and their prediction performance needs to be improved. In this work, we have developed a new computational predictor for accurately identifying D sites, called Stack-DHUpred. Briefly, we trained 66 baseline (single-feature) models by combining six machine learning classifiers with eleven different feature encoding methods, and stacked different baseline models to build stacked ensemble learning models. Subsequently, the optimal combination of baseline models was identified for the construction of the final stacked model. Stack-DHUpred outperformed the existing predictors on our new independent dataset, indicating that the stacking approach significantly improved the prediction performance. We have made Stack-DHUpred available to the public through a web server (http://kurata35.bio.kyutech.ac.jp/Stack-DHUpred) and a standalone program (https://github.com/kuratahiroyuki/Stack-DHUpred). We believe that Stack-DHUpred will be a valuable tool for accelerating the discovery of D modifications and understanding their role in post-transcriptional regulation.
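The model-building step in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract gives only the counts (six classifiers, eleven encodings), so the names below are hypothetical placeholders, the one-hot encoder is a generic example that may or may not be among the eleven encodings used, and `meta_features` simply shows how predicted probabilities from a chosen subset of baseline models could form the input to a stacked meta-learner.

```python
from itertools import product

# Hypothetical placeholder names: the abstract states only how many
# classifiers and encodings were used, not which ones.
CLASSIFIERS = [f"classifier_{i}" for i in range(1, 7)]   # six classifiers
ENCODINGS = [f"encoding_{j}" for j in range(1, 12)]      # eleven encodings


def baseline_grid(classifiers, encodings):
    """Pair every classifier with every encoding: one baseline model per pair."""
    return list(product(classifiers, encodings))


def one_hot(seq):
    """Generic illustrative encoding: four binary values per nucleotide (A, C, G, U)."""
    alphabet = "ACGU"
    return [1 if base == letter else 0 for base in seq for letter in alphabet]


def meta_features(baseline_probs, selected):
    """Collect the selected baseline models' predicted probabilities for one
    sample into the feature vector a stacked meta-learner would consume."""
    return [baseline_probs[key] for key in selected]


grid = baseline_grid(CLASSIFIERS, ENCODINGS)
print(len(grid))  # 66 baseline models (6 x 11)
```

In a full pipeline the probabilities passed to `meta_features` would normally come from cross-validated (out-of-fold) predictions, so that the meta-learner is not trained on outputs the baseline models produced for their own training samples.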
Keywords
RNA modification sites, Dihydrouridine, Stacking framework, Sequence analysis, Machine learning, Bioinformatics