A Statistical Similarity/Dissimilarity Analysis of Protein Sequences Based on a Novel Group Representative Vector.

BioMed Research International (2019)

Abstract
Similarity/dissimilarity analysis is a key way of understanding the biology of an organism, because it helps identify the origin of new genes and sequences. Sequence data are grouped according to biological relationships, and the number of sequences belonging to any group can grow every day. Existing alignment-free methods demonstrate their utility by producing a similarity/dissimilarity matrix. Although this matrix is clear, it measures the degree of similarity only between individual sequences. In our work, a representative is introduced for each of three groups of protein sequences. Based on the group representative, a similarity/dissimilarity vector is evaluated instead of the ordinary similarity/dissimilarity matrix. The approach is applied to three selected groups of protein sequences: beta globin, NADH dehydrogenase subunit 5 (ND5), and spike protein sequences. A cross-group comparison is performed to confirm the distinctness of each group. A qualitative comparison of our approach with previous articles and with the phylogenetic tree of these protein sequences demonstrates its utility.
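The abstract does not specify how the group representative or the similarity/dissimilarity vector is computed. As a rough illustration of the general idea only, the sketch below assumes three things that are not taken from the paper: each sequence is described by its amino acid composition, the group representative is the mean of those composition vectors, and dissimilarity is the Euclidean distance to the representative. The function names and the toy sequences are hypothetical.

```python
from collections import Counter
from math import sqrt

# The 20 standard amino acids, in a fixed order for the feature vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Relative frequency of each standard amino acid in seq.

    ASSUMPTION: composition is only a stand-in descriptor; the paper's
    actual sequence descriptor is not given in the abstract.
    """
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def group_representative(seqs):
    """ASSUMPTION: the representative is the element-wise mean vector."""
    vectors = [composition(s) for s in seqs]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def dissimilarity_vector(seqs, representative):
    """One Euclidean distance per sequence, measured to the representative."""
    return [
        sqrt(sum((x - r) ** 2 for x, r in zip(composition(s), representative)))
        for s in seqs
    ]


# Toy, made-up sequences (not real beta globin, ND5, or spike data).
group = ["AAAC", "AACC", "AACA"]
outsider = ["WWWW"]

rep = group_representative(group)
within = dissimilarity_vector(group, rep)    # members vs. their own group
across = dissimilarity_vector(outsider, rep)  # cross-group comparison
```

Under these assumptions, adding a new sequence to a group of n sequences needs one distance to the representative, whereas extending a pairwise matrix needs n new distances.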
Key words
Protein sequences, sequence alignment, sequence variation, periodicity analysis, phylogenetic analysis