Benchmarking Algorithms for Gene Set Scoring of Single-cell ATAC-seq Data

Xi Wang, Qiwei Lian, Haoyu Dong, Shuo Xu, Yaru Su, Xiaohui Wu

bioRxiv (2023)

Abstract
Gene set scoring (GSS) is routinely used in gene expression analysis of bulk or single-cell RNA-seq data, where it helps to decipher single-cell heterogeneity and cell-type-specific variability by incorporating prior knowledge from functional gene sets. Single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) is a powerful technique for interrogating chromatin-based gene regulation in single cells, and genes or gene sets with dynamic regulatory potential can be regarded as cell-type-specific markers, as in scRNA-seq. However, few GSS tools are specifically designed for scATAC-seq, and the applicability and performance of RNA-seq GSS tools on scATAC-seq data remain to be investigated. We systematically benchmarked ten GSS tools, including four bulk RNA-seq tools, five single-cell RNA-seq (scRNA-seq) tools, and one scATAC-seq method. First, using matched scATAC-seq and scRNA-seq datasets, we found that the performance of GSS tools on scATAC-seq data is comparable to that on scRNA-seq data, suggesting that they are applicable to scATAC-seq. We then extensively evaluated the performance of the different GSS tools using up to ten scATAC-seq datasets. Moreover, we evaluated the impact of gene activity conversion, dropout imputation, and gene set collections on GSS results. The results show that dropout imputation can significantly improve the performance of almost all GSS tools, whereas the impact of gene activity conversion methods and gene set collections depends more on the specific GSS tool or dataset. Finally, we provide practical guidelines for choosing appropriate pre-processing methods and GSS tools in different scenarios.

Competing Interest Statement
The authors have declared no competing interest.
Keywords
gene set scoring, single-cell, ATAC-seq
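
Illustrative sketch
To make concrete what a gene set score is, the following minimal Python sketch computes one score per cell as the mean z-scored gene activity over the genes in a set. It is a generic illustration only and is not one of the ten benchmarked tools. The function name, gene names, and toy matrix are all hypothetical, and the sketch assumes that scATAC-seq peaks have already been converted to a cell-by-gene activity matrix.

```python
import numpy as np

def mean_zscore_gene_set_score(activity, gene_names, gene_set):
    """Score each cell by the mean z-scored activity of the genes in gene_set.

    activity: (n_cells, n_genes) array of gene activity values (assumed input).
    gene_names: list of n_genes names matching the columns of activity.
    gene_set: iterable of gene names forming the set to score.
    """
    activity = np.asarray(activity, dtype=float)
    # Standardize each gene across cells; constant genes get a z-score of 0.
    mean = activity.mean(axis=0)
    std = activity.std(axis=0)
    std[std == 0] = 1.0
    z = (activity - mean) / std
    # Keep only gene-set members that are present in the matrix.
    index = {g: i for i, g in enumerate(gene_names)}
    cols = [index[g] for g in gene_set if g in index]
    if not cols:
        raise ValueError("no gene from the gene set is present in the matrix")
    # One score per cell: average z-score over the gene set.
    return z[:, cols].mean(axis=1)

# Toy data (made up): 4 cells x 3 genes.
activity = np.array([[5.0, 4.0, 0.0],
                     [6.0, 5.0, 1.0],
                     [0.0, 1.0, 1.0],
                     [1.0, 0.0, 0.0]])
gene_names = ["GENE_A", "GENE_B", "GENE_C"]
scores = mean_zscore_gene_set_score(activity, gene_names, ["GENE_A", "GENE_B"])
print(scores)
```

In this toy example the first two cells have high activity for both genes in the set, so they receive higher scores than the last two cells. Real GSS tools differ in how they rank, normalize, and aggregate gene-level values.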