
Base-calling algorithm with vocabulary (BCV) method for analyzing population sequencing chromatograms.

Yuri S Fantin, Alexey D Neverov, Alexander V Favorov, Maria V Alvarez-Figueroa, Svetlana I Braslavskaya, Maria A Gordukova, Inga V Karandashova, Konstantin V Kuleshov, Anna I Myznikova, Maya S Polishchuk, Denis A Reshetov, Yana A Voiciehovskaya, Andrei A Mironov, Vladimir P Chulanov

PLoS One (2013)

Abstract
Sanger sequencing is a common method of reading DNA sequences. It is less expensive than high-throughput methods, and it is appropriate for numerous applications, including molecular diagnostics. However, sequencing mixtures of similar pathogen DNA with this method is challenging. This is important because most clinical samples contain such mixtures rather than pure single strains. The traditional solution is to sequence selected clones of PCR products, a complicated, time-consuming, and expensive procedure. Here, we propose the base-calling with vocabulary (BCV) method, which computationally deciphers Sanger chromatograms obtained from mixed DNA samples. The inputs to the BCV algorithm are a chromatogram and a dictionary of sequences that are similar to those we expect to obtain. We apply the base-calling function to a test dataset of chromatograms without ambiguous positions, as well as to one with 3-14% sequence degeneracy. Furthermore, we use BCV to assemble a consensus sequence for an HIV genome fragment in a sample containing a mixture of viral DNA variants and to determine the positions of the indels. Finally, we detect drug-resistant Mycobacterium tuberculosis strains carrying frameshift mutations in the pncA gene mixed with wild-type bacteria, and roughly characterize bacterial communities in clinical samples by direct 16S rRNA sequencing.