Classification and prediction of pulmonary tuberculosis infection in HIV/AIDS patients based on XGBoost algorithm

Qing Yuan, Abudouresuli Tuersun, Yi You, Chencui Huang, Saimaitikari Abudoubari, Wenbing Chen, Xiaoguang Zou, Mayidili Nijiati

crossref(2024)

Abstract
Background To describe the current epidemic situation of HIV/AIDS complicated by pulmonary tuberculosis (PTB) infection in the Kashgar region, to analyze the factors influencing PTB co-infection, and to use an XGBoost model to classify and predict the risk of PTB infection in HIV/AIDS patients, so as to improve the protection of residents.
Methods Data on HIV/AIDS patients in the Kashgar area were collected, and patients were divided into an HIV/AIDS-only group and a dual-infection group according to whether they had PTB. The prevalence of dual infection was described and its influencing factors were analyzed. All study subjects were divided into a training set and a test set at a ratio of 8:2. Linear models penalized with the L1 norm were used for feature selection. Prediction models for the risk of PTB infection in HIV/AIDS patients were then constructed with the XGBoost and logistic regression algorithms. ROC curves, the DeLong test, decision curves, and calibration curves were used to evaluate model performance.
Results The PTB infection rate among HIV/AIDS patients was 33.6%. Residing in Kashgar, an initial CD4 lymphocyte count < 200 cells/mm³, and a white blood cell count of 4 to 10 × 10^9/L or > 10 × 10^9/L were associated with an increased risk of PTB in HIV/AIDS patients, whereas being a worker or farmer and having WHO clinical stage II disease were protective factors. The logistic regression model achieved an AUC of 0.6962 on the training set and 0.6681 on the test set, while the XGBoost model achieved an AUC of 0.9027 on the training set and 0.8026 on the test set. The DeLong test P-values for the training set and test set were < 0.001 and 0.009, respectively, indicating significantly better discrimination by the XGBoost model.
Conclusions The rate of PTB co-infection among HIV/AIDS patients in Kashgar is high, and timely intervention should be provided for HIV/AIDS patients living in Kashgar who have an initial CD4 lymphocyte count < 200 cells/mm³ or a white blood cell count of 4 to 10 × 10^9/L or > 10 × 10^9/L. Moreover, the XGBoost model has better predictive efficacy than the logistic regression model, indicating that it can be used to classify and predict PTB infection in HIV/AIDS patients.
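
The 8:2 split and the AUC values reported above can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code: the function names `stratified_split` and `roc_auc` are hypothetical, the labels are synthetic (336 positives out of 1000, chosen only to mirror the reported 33.6% prevalence), and stratifying the split by class is an assumption, since the abstract states only the 8:2 ratio. The AUC is computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, which is the quantity an ROC analysis summarizes.

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), keeping the class ratio in both parts.

    Stratification is an assumption of this sketch; the abstract only
    states that subjects were split 8:2.
    """
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Synthetic labels only: 1 = dual infection, 0 = HIV/AIDS only.
labels = [1] * 336 + [0] * 664
train_idx, test_idx = stratified_split(labels, test_fraction=0.2, seed=0)
print(len(train_idx), len(test_idx))  # 800 200

# Tiny hand-checkable example: 3 of the 4 positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

In practice the scores passed to `roc_auc` would be the predicted probabilities from each fitted model on the held-out test indices, so the two models can be compared on the same patients.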