Whole genome sequencing analysis identifies rare, large-effect non-coding variants and regions associated with circulating protein levels

bioRxiv (Cold Spring Harbor Laboratory) (2023)

Abstract
The role of non-coding rare variation in common phenotypes is largely unknown, due to a lack of whole-genome sequence data, and the difficulty of categorising non-coding variants into biologically meaningful regulatory units. To begin addressing these challenges, we performed a cis association analysis using whole-genome sequence data, consisting of 391 million variants and 1,450 circulating protein levels in ∼20,000 UK Biobank participants. We identified 777 independent rare non-coding single variants associated with circulating protein levels (P < 1×10⁻⁹), after conditioning on protein-coding and common associated variants. Rare non-coding aggregate testing identified 108 conditionally independent regulatory regions. Unlike protein-coding variation, rare non-coding genetic variation was almost as likely to increase as decrease protein levels. The regions we identified overlapped predicted tissue-specific enhancers more than promoters, suggesting they represent tissue-specific regulatory regions. Our results have important implications for the identification, and role, of rare non-coding variation associated with common human phenotypes.
Competing Interest Statement
The authors have declared no competing interest.
Key words
protein levels, whole genome, large-effect, non-coding