NetAllergen, a random forest model integrating MHC-II presentation propensity for improved allergenicity prediction

Yuchen Li, Peter Wad Sackett, Morten Nielsen, Carolina Barra

bioRxiv (2023)

Abstract
Allergy is a pathological immune reaction towards innocuous protein antigens. Although only a narrow fraction of plant or animal proteins induce allergy, atopic disorders affect millions of children and adults and cost healthcare systems worldwide billions. In silico predictors can aid in the development of more innocuous food sources. Previous allergenicity predictors used sequence similarity, common structural domains, and amino acid physicochemical features. However, these predictors rely strongly on sequence similarity to known allergens and fail to predict protein allergenicity accurately when that similarity diminishes. In addition, ‘allergen’ is a broad term that may cover different compounds, which hinders the classification task. To overcome these limitations, we collected allergens from AllergenOnline, a curated database of IgE-inducing allergens, carefully removed allergen redundancy with a novel protein partitioning pipeline, and developed a new allergen prediction method that introduces MHC presentation propensity as a novel feature. NetAllergen outperformed both a sequence similarity-based BLAST baseline and the earlier allergenicity predictor AlgPred 2 when similarity to known allergens is limited. NetAllergen is available as a web service (services.healthtech.dtu.dk/service.php?NetAllergen-1.0) and can predict allergenicity from a protein sequence.

Competing Interest Statement: The authors have declared no competing interest.
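The abstract mentions removing redundancy through a protein partitioning pipeline but gives no details of it. The sketch below is not the authors' pipeline. It only illustrates the general idea of similarity-aware partitioning, in which similar proteins are kept in the same partition so that a model is not evaluated on near-duplicates of its training data. Everything in it is an assumption made for illustration: the function names, the toy sequences, the k-mer Jaccard similarity (a crude stand-in for alignment-based similarity), and the 0.5 threshold.

```python
from __future__ import annotations

from itertools import combinations


def kmers(seq: str, k: int = 3) -> set[str]:
    """Return the set of overlapping k-mers in a protein sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two k-mer sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_by_similarity(
    seqs: dict[str, str], threshold: float = 0.5, k: int = 3
) -> list[list[str]]:
    """Single-linkage clustering with union-find.

    Two sequences are linked when their k-mer Jaccard similarity is at
    least `threshold` (a hypothetical cutoff, not taken from the paper).
    """
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    profiles = {name: kmers(seq, k) for name, seq in seqs.items()}
    for a, b in combinations(seqs, 2):
        if jaccard(profiles[a], profiles[b]) >= threshold:
            parent[find(a)] = find(b)

    clusters: dict[str, list[str]] = {}
    for name in seqs:
        clusters.setdefault(find(name), []).append(name)
    return sorted(clusters.values(), key=len, reverse=True)


def assign_partitions(
    clusters: list[list[str]], n_partitions: int = 5
) -> list[list[str]]:
    """Place each whole cluster, largest first, into the smallest partition."""
    partitions: list[list[str]] = [[] for _ in range(n_partitions)]
    for cluster in sorted(clusters, key=len, reverse=True):
        min(partitions, key=len).extend(cluster)
    return partitions


# Toy input: "a" and "b" differ in one residue, "c" is unrelated.
toy = {"a": "MKTAYIAKQR", "b": "MKTAYIAKQL", "c": "GGSWPLLNEV"}
toy_clusters = cluster_by_similarity(toy)
toy_partitions = assign_partitions(toy_clusters, n_partitions=2)
```

Because whole clusters are assigned together, the near-identical toy sequences "a" and "b" never end up on opposite sides of a train/test split. A real pipeline would likely rely on alignment-based similarity rather than k-mer overlap.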