A System for Compression of Sequencing Data

semanticscholar(2020)

Abstract
Progress in genomic sequencing technologies is closely tied to the storage, analysis, and visualization of huge amounts of sequencing data. In both quantity and quality, these data pose a major challenge for modern computer science and bioinformatics, particularly for the compression and decompression of very large data sets. The present study is devoted to the development and implementation of models for compressing sequencing data. The main aim of the work is to develop a web-based system for sequencing data compression that offers faster and more accurate compression and decompression, together with noise protection and error correction. For the purposes of this study, we developed compression and decompression models based on literal coding methods. The resulting Optimized algorithm is related to the well-known Huffman and Shannon-Fano methods. We also developed a web-based module with a user interface, a library of noise protection algorithms for the compression and decompression of omics data, and a server component that makes the library accessible over the Internet. The web module's user interface provides easy access to the server methods for compressing and decompressing sequences. The data formats used allow the module to be integrated with open-access systems such as NCBI, UniProt, Ensembl, and other well-known external resources.
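For readers unfamiliar with the Huffman method that the abstract names as a relative of the Optimized algorithm, the following is a minimal sketch of textbook Huffman coding applied to a short nucleotide string. It is not the authors' Optimized algorithm, which the abstract does not specify; the function names `build_huffman_codes`, `encode`, and `decode` and the sample sequence are hypothetical and chosen only for illustration.

```python
import heapq
from collections import Counter


def build_huffman_codes(sequence):
    """Return a dict mapping each symbol to its Huffman bit string.

    Illustrative textbook Huffman coding, not the paper's algorithm.
    """
    freq = Counter(sequence)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: partial code}).
    # The unique tie-breaker guarantees the dicts are never compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees prepends one bit to every code below them.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(sequence, codes):
    """Concatenate the code of every symbol in the sequence."""
    return "".join(codes[s] for s in sequence)


def decode(bits, codes):
    """Invert encode(); relies on Huffman codes being prefix-free."""
    inverse = {c: s for s, c in codes.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete code")
    return "".join(out)


if __name__ == "__main__":
    seq = "ACGTAAAACCGA"  # hypothetical sample, not data from the study
    codes = build_huffman_codes(seq)
    bits = encode(seq, codes)
    print(codes, len(bits), decode(bits, codes) == seq)
```

Because frequent symbols receive shorter codes, a skewed nucleotide distribution encodes in fewer bits than a fixed two-bits-per-base scheme; for a uniform distribution the two coincide.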