Domainator, a flexible software suite for domain-based annotation and neighborhood analysis, identifies proteins involved in antiviral systems

crossref(2024)

Abstract
The availability of large databases of biological sequences presents an opportunity for in-depth exploration of gene diversity and function. Bacterial defense systems are a rich source of diverse but difficult-to-annotate genes with biotechnological applications. In this work, we present Domainator, a flexible and modular software suite for domain-based gene neighborhood and protein search, extraction, and clustering. We demonstrate the utility of Domainator through three examples related to bacterial defense systems. First, we cluster CRISPR-associated Rossmann fold (CARF) containing proteins with difficult-to-annotate effector domains, classifying most of them as likely transcriptional regulators and a subset as likely RNases. Second, we extract and cluster P4-like phage satellite defense hotspots and identify an abundant system related to Lamassu phage defense systems. Third, we integrate a protein language model into Domainator and use it to identify restriction enzymes with low homology to known reference sequences, validating the activity of one example in vitro. Domainator is made available as an open-source package with detailed documentation and usage examples.