Count data models and Bayesian shrinkage priors with real-world data applications

Research Square (2022)

Abstract

Background: Studies in public health often involve count outcomes, such as the number of hospital visits or the number of laboratory tests per person. Such outcomes arise in genomics, electronic health records, and epidemic modeling, among many other areas. Their distributions are highly skewed and require count data models for inference, which makes count data modeling of prime importance in public health and the medical sciences. Sparse outcomes, as in next-generation sequencing data, additionally require accounting for zero inflation.

Methods: We present a unified Bayesian hierarchical framework that implements and compares shrinkage priors in negative-binomial and zero-inflated negative-binomial regression models. We first represent the likelihood through a Polya-Gamma data augmentation, which makes it amenable to a hierarchical model employing a wide class of shrinkage priors. Shrinkage priors are especially relevant for high-dimensional regression. We focus specifically on the Horseshoe, Dirichlet Laplace, and Double Pareto priors. Extensive simulation studies assess the models' efficiency, and mean squared errors are reported. The models are further applied to several data sets, including COVID-19 vaccine adverse events, the number of Ph.D. publications, and the US National Medical Expenditure Survey.

Results: The models consistently performed well in variable selection, as measured by accuracy, sensitivity, and specificity, and in prediction, as measured by mean squared error (MSE). We obtained mean squared errors as low as 0.003 in p > n cases in the simulation studies. In the real case studies, the variable selection results strongly confirmed current biological insights and pointed to potential new findings. For example, the number of days between COVID-19 vaccination and the onset of adverse events depended on age, sex, whether the event was life-threatening, whether there was an emergency room visit, the number of extended stays, other medications, laboratory data, disease during vaccination, prior vaccination status, and allergy status, among other factors. A marked reduction in the MSE of the fitted values attested to the predictive performance of the model.

Conclusions: Bayesian generalized linear models with shrinkage priors are robust enough to extract relevant predictors in high-dimensional regressions. They can be applied to a broad range of high-dimensional problems in biometrics and public health. The R package "ShrinkageBayesGlm" is available for hands-on experience at https://github.com/arinjita9/ShrinkageBayesGlm
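The Polya-Gamma augmentation mentioned in the Methods can be written out as a short worked derivation. This is a sketch under assumptions that the abstract does not state: the standard identity of Polson, Scott and Windle (2013), a negative-binomial likelihood with dispersion $r$ and linear predictor $\psi_i = x_i^{\top}\beta$ on the log-odds scale, and a zero-mean Gaussian scale-mixture prior $\beta \sim \mathcal{N}(0, D)$ (for the Horseshoe, $D = \tau^{2}\,\mathrm{diag}(\lambda_1^{2},\dots,\lambda_p^{2})$). The paper's own parametrization may differ.

```latex
% Polya-Gamma identity, valid for b > 0 and any real a:
\[
\frac{(e^{\psi})^{a}}{(1+e^{\psi})^{b}}
  = 2^{-b}\, e^{\kappa\psi} \int_{0}^{\infty} e^{-\omega\psi^{2}/2}\, p(\omega)\, d\omega,
\qquad \kappa = a - \tfrac{b}{2}, \quad \omega \sim \mathrm{PG}(b, 0).
\]
% Assumed negative-binomial likelihood contribution of observation i,
% which matches the identity with a = y_i and b = y_i + r:
\[
p(y_i \mid \beta) \;\propto\; \frac{(e^{\psi_i})^{y_i}}{(1+e^{\psi_i})^{y_i + r}},
\qquad \psi_i = x_i^{\top}\beta .
\]
% Resulting conditionally conjugate updates, with Omega = diag(omega_1, ..., omega_n):
\[
\omega_i \mid \beta \sim \mathrm{PG}(y_i + r,\; x_i^{\top}\beta),
\qquad
\beta \mid \omega, y \sim \mathcal{N}(m, V),
\]
\[
V = \left(X^{\top}\Omega X + D^{-1}\right)^{-1},
\qquad
m = V X^{\top}\kappa,
\qquad
\kappa_i = \frac{y_i - r}{2}.
\]
```

Because $\beta$ is conditionally Gaussian given $\omega$, any prior that is a scale mixture of normals enters only through $D$, which is why a wide class of shrinkage priors fits into the same sampler.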
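To make the zero-inflation component concrete, here is a minimal Python sketch of a zero-inflated negative-binomial probability mass function. It is illustrative only and is not taken from the ShrinkageBayesGlm package: the function names `nb_log_pmf` and `zinb_pmf` and the mean/dispersion parametrization (`mu`, `r`, with zero-inflation probability `pi`) are assumptions chosen for this example.

```python
import math


def nb_log_pmf(y, mu, r):
    """Log pmf of a negative binomial with mean mu and dispersion r.

    Assumed parametrization: variance = mu + mu**2 / r.
    """
    return (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )


def zinb_pmf(y, mu, r, pi):
    """Pmf of a zero-inflated negative binomial (hypothetical helper).

    With probability pi the outcome is a structural zero; otherwise it
    is drawn from the negative binomial above.
    """
    nb = math.exp(nb_log_pmf(y, mu, r))
    if y == 0:
        # Zeros come from both the point mass and the count component.
        return pi + (1.0 - pi) * nb
    return (1.0 - pi) * nb


if __name__ == "__main__":
    # Compare the probability of a zero with and without zero inflation.
    mu, r, pi = 3.0, 1.5, 0.3
    print("NB   P(Y=0):", math.exp(nb_log_pmf(0, mu, r)))
    print("ZINB P(Y=0):", zinb_pmf(0, mu, r, pi))
```

Under this parametrization the mean of the zero-inflated model is `(1 - pi) * mu`, so ignoring the excess zeros would bias an ordinary negative-binomial fit toward a smaller mean.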
Keywords: Bayesian shrinkage priors, models, data, real-world