Genome-wide predictions of genetic redundancy in Arabidopsis thaliana

bioRxiv (Cold Spring Harbor Laboratory) (2020)

Abstract
Genetic redundancy refers to a situation where an individual with a loss-of-function mutation in one gene (single mutant) does not show an apparent phenotype until one or more paralogs are also knocked out (double/higher-order mutant). Previous studies have identified some characteristics common among redundant gene pairs, but a predictive model of genetic redundancy incorporating a wide variety of features has not yet been established. In addition, the relative importance of these characteristics for genetic redundancy remains unclear. Here, we establish machine learning models for predicting whether a gene pair is likely redundant or not in the model plant Arabidopsis thaliana. Benchmark gene pairs were classified based on six feature categories: functional annotations, evolutionary conservation including duplication patterns and mechanisms, epigenetic marks, protein properties including post-translational modifications, gene expression, and gene network properties. The definition of redundancy, data transformations, feature subsets, and machine learning algorithms used affected model performance significantly. Among the most important features in predicting gene pairs as redundant were having one or more paralogs from recent duplication events, annotation as a transcription factor, downregulation during stress conditions, and having similar expression patterns under stress conditions. Predictions were then tested using phenotype data withheld from model building and validated using well-characterized redundant and nonredundant gene pairs. This genetic redundancy model sheds light on characteristics that may contribute to long-term maintenance of paralogs that are seemingly functionally redundant, and will ultimately allow for more targeted generation of functionally informative double mutants, advancing functional genomic studies.
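To make the pair-level setup concrete, here is a minimal sketch of how per-gene data could be turned into gene-pair features and passed to a binary classifier. It is an illustration only, not the authors' pipeline: the `Gene` record, its field names, the three features, the toy genes and their labels are all hypothetical, and plain logistic regression stands in for whichever algorithms the study actually used.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Gene:
    """Hypothetical per-gene record; field names are illustrative assumptions."""
    name: str
    is_transcription_factor: bool
    stress_expression: List[float]  # e.g. log fold change per stress condition


def pearson(xs, ys):
    """Pearson correlation of two equal-length profiles; 0.0 if either is constant."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("profiles must have equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def pair_features(a, b):
    """Combine two per-gene records into one pair-level feature vector."""
    ea, eb = a.stress_expression, b.stress_expression
    both_tf = 1.0 if (a.is_transcription_factor and b.is_transcription_factor) else 0.0
    # Fraction of stress conditions in which both genes are downregulated.
    both_down = sum(1 for x, y in zip(ea, eb) if x < 0 and y < 0) / len(ea)
    return [both_tf, pearson(ea, eb), both_down]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent for logistic regression; returns (weights, bias)."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(w, b, x):
    """Predicted probability that a pair is labeled redundant."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Synthetic toy genes and labels, invented purely for illustration.
g1 = Gene("geneA1", True, [-1.0, -0.5, -2.0, 0.3])
g2 = Gene("geneA2", True, [-0.9, -0.6, -1.8, 0.2])
g3 = Gene("geneB1", False, [0.5, -0.2, 1.0, 0.1])
g4 = Gene("geneB2", False, [-0.4, 0.8, -0.3, 1.2])
g5 = Gene("geneC1", True, [-0.2, -1.0, -0.5, -0.1])
g6 = Gene("geneC2", True, [-0.3, -1.2, -0.4, -0.2])
g7 = Gene("geneD1", False, [1.0, 0.2, -0.5, 0.4])
g8 = Gene("geneD2", True, [-0.8, 0.9, 0.6, -0.3])

pairs = [(g1, g2), (g5, g6), (g3, g4), (g7, g8)]
labels = [1, 1, 0, 0]  # 1 = redundant, 0 = nonredundant (made-up labels)

X = [pair_features(a, b) for a, b in pairs]
weights, bias = train_logistic(X, labels)
```

Calling `predict_proba(weights, bias, pair_features(g1, g2))` then scores a pair. With only four synthetic pairs this says nothing about real biology; it only shows the shape of the problem, where each training example is a pair of genes rather than a single gene.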
Keywords
genetic redundancy, Arabidopsis, genome-wide