The Recipe for Protein Sequence-Based Function Prediction and Its Implementation in the ANNOTATOR Software Environment.

Data Mining Techniques for the Life Sciences (2016)

Abstract
As biomolecular sequencing becomes the main technique in the life sciences, functional interpretation of sequences in terms of biomolecular mechanisms with in silico approaches is growing in importance. Function prediction tools are most powerful for protein-coding sequences; yet the concepts and technologies used for this purpose are not well reflected in bioinformatics textbooks. Notably, protein sequences typically consist of globular domains and non-globular segments, and the two types of regions require fundamentally different approaches to function prediction. Globular domains are classic targets for homology-inspired function transfer based on remnant, yet statistically significant, sequence similarity to other, characterized sequences. Non-globular segments are characterized by compositional bias or simple, repetitive patterns and require lexical analysis and/or empirical sequence pattern-function correlations. The recipe for function prediction recommends first finding all types of non-globular segments and then subjecting the remaining query sequence to sequence similarity searches. We provide an updated description of the ANNOTATOR software environment as an advanced example of a software platform that facilitates protein sequence-based function prediction.
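
The two-step recipe in the abstract can be illustrated with a minimal sketch. It is not ANNOTATOR's implementation, which the abstract does not detail. It assumes a deliberately simple stand-in for non-globular segment detection: a sliding-window Shannon entropy filter that flags compositionally biased stretches. The function names, window size, entropy threshold, and minimum segment length are all hypothetical choices made for illustration. The segments left unflagged are what one would then pass on to sequence similarity searches.

```python
import math
from collections import Counter


def window_entropy(window: str) -> float:
    """Shannon entropy (in bits) of the residue composition of a window."""
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def flag_biased_positions(seq: str, window: int = 12, threshold: float = 2.2):
    """Mark every residue covered by at least one low-entropy window.

    Assumption: low entropy is used here as a crude proxy for
    compositional bias; window and threshold are illustrative values.
    """
    flagged = [False] * len(seq)
    for start in range(0, len(seq) - window + 1):
        if window_entropy(seq[start:start + window]) < threshold:
            for i in range(start, start + window):
                flagged[i] = True
    return flagged


def split_segments(seq: str, flagged, min_len: int = 30):
    """Split a sequence into flagged and unflagged runs.

    Returns (non_globular, remaining), each a list of
    (start, end, subsequence) tuples with half-open coordinates.
    Unflagged runs shorter than min_len are dropped, on the assumption
    that they are too short to be useful similarity-search queries.
    """
    non_globular, remaining = [], []
    start = 0
    for i in range(1, len(seq) + 1):
        if i == len(seq) or flagged[i] != flagged[start]:
            segment = (start, i, seq[start:i])
            if flagged[start]:
                non_globular.append(segment)
            elif i - start >= min_len:
                remaining.append(segment)
            start = i
    return non_globular, remaining


if __name__ == "__main__":
    # Toy query: two compositionally diverse stretches around a poly-Q run.
    diverse = "ACDEFGHIKLMNPRSTVWY" * 2
    query = diverse + "Q" * 20 + diverse
    biased, candidates = split_segments(query, flag_biased_positions(query))
    for start, end, _ in biased:
        print(f"non-globular candidate: {start}-{end}")
    for start, end, _ in candidates:
        print(f"similarity-search query: {start}-{end}")
```

A real pipeline would combine many dedicated predictors for the different types of non-globular segments rather than a single entropy filter; the sketch only shows the order of operations, with masking first and similarity searching on the remainder.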
Key words
ANNOTATOR, Genome annotation, Globular domain, Non-globular segment, Protein function prediction, Protein sequence analysis