Cross-validation for the estimation of effect size generalizability in mass-univariate brain-wide association studies

bioRxiv (Cold Spring Harbor Laboratory) (2023)

Abstract
Introduction
Statistical effect sizes are systematically overestimated in small samples, leading to poor generalizability and replicability of findings in all areas of research. Because of the large number of variables tested, this is particularly problematic in neuroimaging research. While cross-validation is frequently used in multivariate machine learning approaches to assess model generalizability and replicability, its benefits for mass-univariate brain analysis remain unclear. We investigated the impact of cross-validation on effect size estimation in univariate voxel-based brain-wide associations, using body mass index (BMI) as an exemplary predictor.

Methods
A total of n=3401 adults were pooled from three independent cohorts. Brain-wide associations between BMI and gray matter structure were tested using a standard linear mass-univariate voxel-based approach. First, a traditional non-cross-validated analysis was conducted to identify brain-wide effect sizes in the total sample (as an estimate of a realistic reference effect size). The impact of sample size (bootstrapped samples ranging from n=25 to n=3401) and of cross-validation on effect size estimates was then investigated across selected voxels with differing underlying effect sizes (including the brain-wide lowest effect size). Linear effects were estimated within training sets and then applied to unseen test set data, using 5-fold cross-validation. The resulting effect sizes (explained variance) were investigated.

Results
Analysis in the total sample (n=3401) without cross-validation yielded mainly negative correlations between BMI and gray matter density, with a maximum effect size of R²ₚ=.036 (peak voxel in the cerebellum). Effects were overestimated exponentially with decreasing sample size, with effect sizes up to R²ₚ=.535 in samples of n=25 for the voxel with the brain-wide largest effect and up to R²ₚ=.429 for the voxel with the brain-wide smallest effect.
When cross-validation was applied, linear effects estimated in small samples did not generalize to an independent test set. For the largest brain-wide effect, a minimum sample size of n=100 was required for the effect to start generalizing (explained variance >0 in unseen data), while n=400 was needed for smaller effects (R²ₚ=.005) to generalize. For a voxel with an underlying null effect, linear effects found in non-cross-validated samples did not generalize to test sets even with the maximum sample size of n=3401. Effect size estimates obtained with and without cross-validation approached convergence in large samples.

Discussion
Cross-validation is a useful method to counteract the overestimation of effect sizes, particularly in small samples, and to assess the generalizability of effects. Training and test set effect sizes converge in large samples, which likely reflects good generalizability of models in such samples. While linear effects start generalizing to unseen data in samples of n>100 for large effect sizes, the generalization of smaller effects requires larger samples (n>400). Cross-validation should be applied in voxel-based mass-univariate analysis to foster accurate effect size estimation and improve the replicability of neuroimaging findings. We provide open-source Python code for this purpose ().

### Competing Interest Statement
The MACS and MNC studies are funded by the German Research Foundation (DFG, grant FOR2107 DA1151/5-1 and DA1151/5-2 to UD; SFB-TRR58, Projects C09 and Z02 to UD), the Interdisciplinary Center for Clinical Research (IZKF) of the medical faculty of Munster (grant Dan3/012/17 to UD), IMF Munster RE111604 to RR and RE111722 to RR, IMF Munster RE 22 17 07 to Jonathan Repple and the Deanery of the Medical Faculty of the University of Munster. TH was supported by the German Research Foundation (DFG grants HA7070/2-2, HA7070/3, HA7070/4).
MP was supported by an ERC Consolidator grant (ERC-COG 101001062) and an NWO VIDI grant of the Dutch Research Council (Netherlands Organisation for Scientific Research Grant VIDI-452-16-015). The BiDirect study is funded by the German Federal Ministry of Education and Research (Grant Nos. 01ER0816, 01ER1506, and 01ER1205). Biomedical financial interests or potential conflicts of interest: TK received unrestricted educational grants from Servier, Janssen, Recordati, Aristo, Otsuka, and neuraxpharm. This cooperation has no relevance to the work covered in the manuscript.
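The cross-validation scheme summarized in the Methods (linear effects estimated in training sets, then applied to unseen test data across 5 folds) can be illustrated with a minimal sketch. This is not the authors' released code. It assumes simulated data for a single voxel, a single standardized predictor with no covariates, and plain R² rather than the effect size measure reported in the abstract. All function names (`fit_line`, `r_squared`, `cross_validated_r2`, `simulate_voxel`) and the simulated slope are hypothetical choices made for illustration.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def r_squared(x, y, intercept, slope):
    """Explained variance of a fixed line on (x, y).

    The value is negative when the line predicts worse than the mean of y,
    which is how non-generalizing effects show up in held-out data.
    """
    my = statistics.fmean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def cross_validated_r2(x, y, k=5, seed=0):
    """Mean training and test R^2 over k folds; the line is fitted on training data only."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    train_r2, test_r2 = [], []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        x_tr, y_tr = [x[i] for i in train], [y[i] for i in train]
        x_te, y_te = [x[i] for i in fold], [y[i] for i in fold]
        b0, b1 = fit_line(x_tr, y_tr)
        train_r2.append(r_squared(x_tr, y_tr, b0, b1))
        test_r2.append(r_squared(x_te, y_te, b0, b1))
    return statistics.fmean(train_r2), statistics.fmean(test_r2)


def simulate_voxel(n, slope, seed=0):
    """Hypothetical data: standardized BMI and one voxel value with Gaussian noise."""
    rng = random.Random(seed)
    bmi = [rng.gauss(0.0, 1.0) for _ in range(n)]
    voxel = [slope * b + rng.gauss(0.0, 1.0) for b in bmi]
    return bmi, voxel


if __name__ == "__main__":
    # Compare in-sample and out-of-sample explained variance across sample sizes.
    for n in (25, 100, 400, 3200):
        bmi, voxel = simulate_voxel(n, slope=-0.2, seed=n)
        r2_train, r2_test = cross_validated_r2(bmi, voxel)
        print(f"n={n:5d}  train R^2={r2_train:.3f}  test R^2={r2_test:.3f}")
```

In a sketch like this, the training R² can never fall below zero, whereas the test R² can, so a positive test R² is the signal that an effect generalizes to unseen data. A mass-univariate analysis would repeat the same procedure independently for every voxel.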
Keywords
effect size generalizability,effect size,association,cross-validation,mass-univariate,brain-wide