Bridging expressed sequence alignments through targeted cDNA sequencing.

Genomics (2004)

Abstract
One of the major challenges in genome research is the identification of the complete set of genes in a genome. Alignments of expressed sequences (RNA and EST) with genomic sequences have been used to characterize genes. However, the number of alignments far exceeds the likely number of genes in a genome, suggesting that, for many genes, two or more alignments can be joined through overlapping sequences to yield accurate gene structures. Because high-throughput EST sequencing is nonselective, it becomes less efficient at closing those alignment gaps. We sought to bridge these alignments through a novel approach: targeted cDNA sequencing. Human expressed sequences from GenBank version 124 were aligned with the genomic sequence from NCBI build 24 using LEADS, Compugen's EST and RNA clustering and assembly software system. Nine hundred forty-eight pairs of alignments were selected based on EST clone information and/or their homology to the same known proteins. Reverse transcriptase PCR and sequencing yielded sequences for 363 of those pairs. These sequences helped characterize over 60 genes that were novel or otherwise incomplete in the recent UniGene build 153, which included over 1 million additional ESTs. These results indicate that this integrated and targeted strategy, combining computational prediction and experimental cDNA sequencing, can efficiently generate the overlapping sequences and enable the full characterization of genomes. Additional information about the contig pairs, the resultant overlapping sequences, tissue sources, and tissue profiles is available in a supplemental file. The majority of sequences were deposited with GenBank, and their accession numbers are within the range BU101504 to BU102193.
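The selection step described in the abstract (pairing alignments that share EST clone information and/or homology to the same known protein) can be illustrated with a minimal Python sketch. This is not the LEADS implementation or the authors' actual procedure. The function name `candidate_pairs`, the input layout (lists of `(key, contig)` tuples), and all identifiers in the sample data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(clone_hits, protein_hits):
    """Return contig pairs linked by a shared EST clone or a shared protein hit.

    clone_hits:   iterable of (clone_id, contig_id) tuples (assumed layout)
    protein_hits: iterable of (protein_id, contig_id) tuples (assumed layout)
    The result maps each sorted contig pair to the set of evidence types.
    """
    evidence = defaultdict(set)
    for label, hits in (("clone", clone_hits), ("protein", protein_hits)):
        # Group contigs by the clone or protein they are associated with.
        by_key = defaultdict(set)
        for key, contig in hits:
            by_key[key].add(contig)
        # Any two distinct contigs under the same key form a candidate pair.
        for contigs in by_key.values():
            for a, b in combinations(sorted(contigs), 2):
                evidence[(a, b)].add(label)
    return dict(evidence)


# Made-up identifiers, for illustration only.
clone_hits = [("cloneA", "contig1"), ("cloneA", "contig2"), ("cloneB", "contig3")]
protein_hits = [
    ("protX", "contig2"),
    ("protX", "contig1"),
    ("protY", "contig3"),
    ("protY", "contig4"),
]
pairs = candidate_pairs(clone_hits, protein_hits)
```

In this toy input, `contig1` and `contig2` are linked by both kinds of evidence, while `contig3` and `contig4` are linked only by a shared protein hit. Pairs selected this way would then be candidates for reverse transcriptase PCR across the gap.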
Key words
gene structure, reverse transcriptase, sequence alignment, high throughput, genome sequence, software systems