Extreme Gradient Boosting algorithm classification for predicting lifespan-extending chemical compounds

Mariia Yarmolenko, Brendan Howlin

Research Square (2022)

Abstract
Human ageing has a major impact on the global economy and on society’s health, and it is a risk factor for many chronic diseases. The discovery of pharmaceutical interventions with the potential to promote longevity and delay the onset of age-associated diseases is one of the most challenging tasks in anti-ageing research today. The aim of this study was to build a new machine learning model, based on data from the DrugAge database, to predict whether a chemical compound will extend the lifespan of the worm species Caenorhabditis elegans. The predictive models were built using the optimized Extreme Gradient Boosting algorithm with molecular fingerprints and molecular descriptors as features. The models’ features were ranked with the built-in Extreme Gradient Boosting feature importance function and interpreted with confidence using Shapley values. The top 15 most important features included 2D molecular descriptors related to subdivided surface areas, atom and bond counts, and electrostatic properties. The best-performing model was applied to predict the class of compounds in an external database, DrugBank, consisting of approved small molecules. The chemical compounds of the external database with a predictive probability of for increasing the lifespan of Caenorhabditis elegans were broadly separated into (i) flavonoids and isoflavonoids, (ii) fatty acids and conjugates, and (iii) other classes of compounds.
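The abstract mentions interpreting feature contributions with Shapley values. As a rough illustration of what such a value measures, the sketch below computes exact Shapley values for a single sample by enumerating feature subsets. It is not the authors' pipeline: `toy_score`, its three feature names, and the all-zero baseline are hypothetical stand-ins for a trained classifier and its descriptors, and exact enumeration is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample, by enumerating feature subsets.

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size (excluding feature i)
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition
                values[i] += weight * (predict(with_i) - predict(without_i))
    return values


def toy_score(features):
    """Hypothetical scorer standing in for a trained model (assumption)."""
    surface_area, bond_count, charge = features
    return 0.5 * surface_area + 0.2 * bond_count + 0.1 * surface_area * charge


x = [2.0, 3.0, 1.0]          # hypothetical descriptor values for one compound
baseline = [0.0, 0.0, 0.0]   # hypothetical reference compound
phi = shapley_values(toy_score, x, baseline)
```

The per-feature values in `phi` sum to the difference between the score for `x` and the score for `baseline`, which is the property that makes Shapley values usable as an additive explanation of a single prediction.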
Keywords
extreme gradient boosting, algorithm classification, lifespan-extending