SAKit: an all-in-one analysis pipeline for identifying novel proteins caused by variant events at the genomic and transcriptomic levels

Social Science Research Network (2023)

Abstract
Summary: Genetic modifications that inactivate or abnormally activate pivotal proteins can alter, or even disrupt, cell signaling pathways, resulting in cancer and other diseases. This dysfunction can in turn produce “novel proteins” that do not exist in the canonical human proteome. Identifying novel proteins helps reveal promising drug targets and develop new therapies. In recent years, with the widespread application of nucleotide sequencing technology, several tools have been developed for identifying DNA or RNA variants. However, these tools mainly focus on point mutations and perform poorly at identifying large-scale variants and at integrating mutations. Here we developed SAKit, a hybrid Sequencing Analysis bioinformatic pipeline that integrates all relevant detection Kits. By comprehensively analyzing long and short reads, the pipeline integrates all variants at the genomic and transcriptomic levels that may lead to the production of novel proteins, defined as proteins whose sequences are novel compared with all reference sequences. The analysis results of SAKit demonstrate that large-scale mutations contribute more to the production of novel proteins than point mutations do, and that long-read sequencing has an advantage in detecting large-scale mutations.

Availability and implementation: SAKit is freely available as a Docker image () and is mainly implemented in Python within a Snakemake framework.

Competing Interest Statement: The authors have declared no competing interest.
Keywords
Proteins