Dataset of bulged G-quadruplex forming sequences in the human genome

Data in Brief (2023)

Abstract
When several runs of consecutive guanines lie close together in a nucleic acid sequence, a secondary structure called a G-quadruplex (G4) can form. Such structures in the genome could serve as structural and functional regulators in gene expression, DNA-protein binding, epigenetic modification, and genotoxic stress. Several types of G4-forming DNA sequences exist, including bulged G4-forming sequences (G4-BS). Such bulges arise when non-guanine bases interrupt the G-runs of a G4-forming sequence. At present, search algorithms do not identify stable G4-BS conformations, making genome-wide studies of G4-like structures difficult. The data provided here relate to the article "Stable bulged G-quadruplexes in the human genome: Identification, experimental validation and functionalization" published in Nucleic Acids Research [doi.org/10.1093/nar/gkad252]. Based on our in vitro studies and our analysis of G4-seq and G4 CUT&Tag data, we specified and validated three pG4-BS models. This article presents a large 'raw' (unfiltered) dataset that includes three subfamilies of pG4-BS. For each pG4-BS, we provide strand-specific genomic boundaries. Data on
Key words
G-quadruplex dataset, G4-bulge structures, DNA, Coordinates, Computational modelling and search, algorithm