
The GenoPred Pipeline: A Comprehensive and Scalable Pipeline for Polygenic Scoring.

medRxiv (2024)

Abstract
Motivation: Polygenic scoring is a commonly used approach for estimating an individual's likelihood of a given outcome. Polygenic scores are typically calculated using genetic effects derived from genome-wide association study (GWAS) summary statistics and individual-level genotype data for the target sample. Using a reference-standardised framework ensures the polygenic score can be reliably interpreted. Going from genotype to interpretable polygenic scores involves many steps, and many methods are available, which limits the accessibility of polygenic scores for research and clinical application. Additional challenges exist for studies in ancestrally diverse populations. We have implemented the leading polygenic scoring methodologies within an easy-to-use pipeline called GenoPred.
Results: Here we present the GenoPred pipeline, an easy-to-use, high-performance, reference-standardised and reproducible workflow for polygenic scoring. The pipeline requires just a few readily available inputs to get started, with configuration options available to cater for a range of use cases. GenoPred implements a comprehensive set of analyses, including genotype and GWAS quality control, target sample ancestry inference, polygenic score file generation using a range of leading methods, and target sample scoring. GenoPred standardises the polygenic scoring process using reference genetic data, providing interpretable polygenic scores and improving the transferability of results to external datasets. The pipeline is applicable to GWAS and target data from any population within the reference, facilitating studies of diverse ancestry. GenoPred is a Snakemake pipeline with associated Conda software environments, ensuring reproducibility. We apply the pipeline to UK Biobank data, demonstrating the pipeline's simplicity, efficiency, and performance. GenoPred is open-source software that will continue to be developed as polygenic scoring methodology develops.
Conclusions: The GenoPred pipeline provides a novel resource for polygenic scoring, integrating a range of complex processes within an easy-to-use framework. GenoPred widens access to the leading polygenic scoring methodologies and their application to studies of diverse ancestry.
### Competing Interest Statement
OP provides consultancy services for UCB Pharma. AAC reports consultancies or advisory boards for Amylyx, Apellis, Biogen, Brainstorm, Cytokinetics, GenieUs, GSK, Lilly, Mitsubishi Tanabe Pharma, Novartis, OrionPharma, Quralis, and Wave Pharmaceuticals. CML sits on the Myriad Neuroscience Scientific Advisory Board and is a Key Opinion Leader for UCB Pharma.
### Funding Statement
OP is supported by a Sir Henry Wellcome Postdoctoral Fellowship [222811/Z/21/Z]. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. AAC is an NIHR Senior Investigator (NIHR202421). This is in part an EU Joint Programme - Neurodegenerative Disease Research (JPND) project. The project is supported through the following funding organisations under the aegis of JPND - www.jpnd.eu (United Kingdom, Medical Research Council (MR/L501529/1; MR/R024804/1)). This study represents independent research part funded by the National Institute for Health Research (NIHR) Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King's College London.
### Author Declarations
I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained. Yes.
The details of the IRB/oversight body that provided approval or exemption for the research described are given below: UK Biobank data can be accessed via application; the data for this publication were accessed as part of project 82087.
I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals. Yes.
I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance). Yes.
I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable. Yes.
GWAS summary statistics used in this study were publicly available (see Supplementary Table 1). The UK Biobank data were accessed via project 82087. For access, go to: https://www.ukbiobank.ac.uk/enable-your-research/apply-for-access.
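The abstract describes polygenic scores as genetic effects from GWAS summary statistics applied to individual-level genotypes, then interpreted against a reference sample. A minimal Python sketch of that general idea follows. It is not GenoPred's code: the function names, effect sizes, and genotypes are all hypothetical, and the reference is assumed to be a small sample of matching ancestry.

```python
from statistics import mean, pstdev


def raw_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1 or 2) for one individual."""
    return sum(d * w for d, w in zip(dosages, weights))


def reference_standardise(target_scores, reference_scores):
    """Express target scores in units of the reference sample's mean and SD."""
    ref_mean = mean(reference_scores)
    ref_sd = pstdev(reference_scores)
    return [(s - ref_mean) / ref_sd for s in target_scores]


# Hypothetical per-variant effect sizes (illustrative values, not real GWAS output).
weights = [0.5, -0.25, 1.0]

# Hypothetical genotype dosages: one row per individual, one column per variant.
reference_genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1]]
target_genotypes = [[2, 0, 2], [0, 2, 0]]

reference_scores = [raw_score(g, weights) for g in reference_genotypes]
target_scores = [raw_score(g, weights) for g in target_genotypes]

# Standardised scores are comparable across target datasets that share the reference.
z_scores = reference_standardise(target_scores, reference_scores)
print(z_scores)
```

Scaling against a fixed reference, rather than against the target sample itself, is what lets a single individual's score be read on its own. The ancestry matching, quality control, and choice of scoring method that the pipeline handles are outside this sketch.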