Comparison of software packages for detecting unannotated translated small open reading frames by Ribo-seq.

Gregory Tong, Nasun Hah, Thomas F Martinez

bioRxiv: the preprint server for biology (2023)

Abstract
Accurate and comprehensive annotation of microprotein-coding small open reading frames (smORFs) is critical to our understanding of normal physiology and disease. Empirical identification of translated smORFs is carried out primarily using ribosome profiling (Ribo-seq). While effective, published Ribo-seq datasets can vary drastically in quality, and different analysis tools are frequently employed. Here, we examine the impact of these factors on identifying translated smORFs. We compared five commonly used software tools that assess ORF translation from Ribo-seq (RibORF v0.1, RibORF v1.0, RiboCode, ORFquant, and Ribo-TISH) and found surprisingly low agreement across all tools. Only ~2% of smORFs were called translated by all five tools, and ~15% by three or more tools, when assessing the same high-resolution Ribo-seq dataset. For larger annotated genes, the same analysis showed ~72% agreement across all five tools. We also found that some tools are strongly biased against low-resolution Ribo-seq data, while others are more tolerant. Analyzing Ribo-seq coverage as a proxy for translation levels revealed that highly translated smORFs are more likely to be detected by more than one tool. Together, these results support employing multiple tools to identify the most confident microprotein-coding smORFs, and choosing the tools based on the quality of the dataset and planned downstream characterization experiments of predicted smORFs.
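The cross-tool agreement figures quoted above (the share of smORFs called by all five tools, and by three or more) could be tallied along the following lines. This is a minimal sketch, not the authors' pipeline: the function name `agreement_summary`, the toy tool labels, and the ORF identifiers are all hypothetical, and it assumes the denominator is the union of smORFs called by at least one tool, which the abstract does not specify.

```python
from collections import Counter


def agreement_summary(calls_by_tool):
    """Summarize how many tools agree on each ORF call.

    calls_by_tool maps a tool name to the set of ORF identifiers that
    the tool called translated. The denominator is the union of all
    calls (an assumption of this sketch, not stated in the abstract).
    """
    n_tools = len(calls_by_tool)

    # Count, for every ORF, how many tools called it translated.
    tools_per_orf = Counter()
    for orf_ids in calls_by_tool.values():
        tools_per_orf.update(set(orf_ids))

    total = len(tools_per_orf)
    if total == 0:
        return {"total": 0, "all_tools": 0.0, "three_or_more": 0.0}

    called_by_all = sum(1 for n in tools_per_orf.values() if n == n_tools)
    called_by_three_plus = sum(1 for n in tools_per_orf.values() if n >= 3)

    return {
        "total": total,
        "all_tools": called_by_all / total,
        "three_or_more": called_by_three_plus / total,
    }


# Toy input with made-up tool labels and ORF IDs, for illustration only.
toy_calls = {
    "tool_a": {"orf1", "orf2", "orf3"},
    "tool_b": {"orf1", "orf2"},
    "tool_c": {"orf1", "orf2", "orf4"},
    "tool_d": {"orf1"},
    "tool_e": {"orf1", "orf5"},
}
toy_summary = agreement_summary(toy_calls)
```

In practice, each tool reports ORFs with its own coordinate and naming conventions, so the identifiers would need to be harmonized (for example by transcript, start, and stop position) before a set-based comparison like this is meaningful.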