Low-rank matrix denoising for count data using unbiased Kullback-Leibler risk estimation

COMPUTATIONAL STATISTICS & DATA ANALYSIS (2022)

Abstract
Many statistical studies are concerned with the analysis of observations organized as a matrix whose entries are counts. When these observations are assumed to follow a Poisson or a multinomial distribution, it is of interest to estimate either the intensity matrix (Poisson case) or the compositional matrix (multinomial case) under the assumption that it has a low-rank structure. In this setting, an estimator is proposed that minimizes the negative log-likelihood regularized by a nuclear norm penalty. Such an approach easily yields a low-rank matrix-valued estimator with positive entries, which belongs to the set of row-stochastic matrices in the multinomial case. Then, as a main contribution, a data-driven procedure is constructed to select the regularization parameter of such estimators by minimizing (approximately) unbiased estimates of the Kullback-Leibler (KL) risk in such models, which generalize Stein's unbiased risk estimation originally proposed for Gaussian data. The evaluation of these quantities is a delicate problem, and novel methods are introduced to obtain accurate numerical approximations of such unbiased estimates. Simulated data are used to validate this way of selecting the regularization parameter for low-rank matrix estimation from count data. For data following a multinomial distribution, the performance of this approach is also compared with K-fold cross-validation. Examples from a survey study and metagenomics also illustrate the benefits of this methodology for real data analysis. (C) 2022 Elsevier B.V. All rights reserved.
Keywords
Low-rank matrix denoising, Count data, Poisson distribution, Multinomial distribution, Nuclear norm penalization, Kullback-Leibler risk, Generalized Stein's unbiased risk estimate, Optimal shrinkage rule, Survey study, Metagenomics data
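To make the Poisson case of the estimator described in the abstract more concrete, the following is a minimal sketch of nuclear-norm-penalized Poisson likelihood estimation. It is not the authors' implementation. It assumes a log-link parametrization (intensity = exp(Z), with the penalty applied to Z), which is one common way to obtain positive entries, and it uses a plain proximal gradient loop with backtracking. The function names `svt`, `objective`, and `fit_poisson_nuclear` are hypothetical, and the regularization parameter `lam` is supplied by hand here; the data-driven selection through unbiased KL risk estimates is not reproduced.

```python
import numpy as np


def svt(Z, tau):
    """Singular value soft-thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt


def poisson_nll(Z, Y):
    """Poisson negative log-likelihood in Z = log(intensity), up to a constant in Y."""
    return np.sum(np.exp(Z) - Y * Z)


def objective(Z, Y, lam):
    """Penalized criterion: negative log-likelihood plus lam * nuclear norm of Z."""
    return poisson_nll(Z, Y) + lam * np.linalg.norm(Z, "nuc")


def fit_poisson_nuclear(Y, lam, n_iter=100):
    """Proximal gradient with backtracking (assumed solver, for illustration only).

    Returns the estimated intensity matrix exp(Z), which is entrywise positive.
    """
    Y = np.asarray(Y, dtype=float)
    # Start from a constant log-intensity equal to the log of the mean count.
    Z = np.full(Y.shape, np.log(max(Y.mean(), 1e-8)))
    for _ in range(n_iter):
        grad = np.exp(Z) - Y          # gradient of the smooth likelihood term
        f_old = poisson_nll(Z, Y)
        t = 1.0
        while True:
            Z_new = svt(Z - t * grad, t * lam)
            D = Z_new - Z
            # Sufficient-decrease test on the quadratic upper bound.
            bound = f_old + np.sum(grad * D) + np.sum(D ** 2) / (2.0 * t)
            if poisson_nll(Z_new, Y) <= bound:
                break
            t *= 0.5                  # shrink the step until the bound holds
        Z = Z_new
    return np.exp(Z)
```

In this sketch, a larger `lam` shrinks more singular values of the log-intensity matrix toward zero. Choosing `lam` is exactly the problem the paper addresses with its KL risk estimates, so any fixed value used with this code is only a placeholder.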