The usage of transcriptomics datasets as sources of Real-World Data for clinical trialling

bioRxiv (2024)

Abstract
Introduction: Randomized clinical trials (RCTs) are limited in their ability to reflect outcomes observed outside controlled settings, which makes further lengthy observational studies necessary. Real-world data (RWD) have recently been considered a viable alternative to overcome these limitations and complement certain clinical conclusions. Transcriptomics and other high-throughput data contain a molecular description of medical conditions and disease states. When linked to RWD, including demographic information, transcriptomics data can elucidate nuances in disease pathways in specific patient populations. This work focuses on the construction of a patient repository database with clinical information, built by integrating publicly available transcriptomics datasets. Results: Patient samples were integrated into the repository using a new post-processing technique that allows samples originating from different Gene Expression Omnibus (GEO) datasets to be used together. RWD were mined from the metadata of GEO samples, and a clinical and demographic characterization of the database was obtained. Our post-processing technique, which we call MACAROON, aims to harmonize and integrate transcriptomics data while accounting for batch effects and possible processing-derived artefacts. This process reproduced downstream biological conclusions with a 10% improvement over other available methods. RWD mining relied on a manually curated synonym dictionary, which allowed the correct assignment of medical conditions (95.33% median accuracy). Conclusion: Our strategy produced an RWD repository that includes molecular information together with clinical and demographic RWD. Exploring these data helps shed light on clinical outcomes and pathways specific to predetermined patient populations by integrating multiple public datasets.
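As an illustration of the synonym-dictionary idea mentioned in the Results, the following is a minimal Python sketch, not the authors' implementation. The dictionary entries, the function names (`build_patterns`, `assign_conditions`, `accuracy`), and the example metadata strings are all hypothetical; the abstract does not describe the actual dictionary or matching rules.

```python
import re

# Hypothetical synonym dictionary: canonical condition -> free-text variants.
# The entries below are illustrative only and do not come from the paper.
SYNONYMS = {
    "type 2 diabetes": ["type 2 diabetes", "type ii diabetes", "t2d", "t2dm"],
    "rheumatoid arthritis": ["rheumatoid arthritis", "ra"],
    "healthy": ["healthy", "control", "normal"],
}


def build_patterns(synonyms):
    """Compile one case-insensitive, whole-word regex per canonical condition."""
    patterns = {}
    for condition, variants in synonyms.items():
        # Longer variants first so multi-word synonyms are preferred.
        ordered = sorted(variants, key=len, reverse=True)
        alternation = "|".join(re.escape(v) for v in ordered)
        patterns[condition] = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
    return patterns


def assign_conditions(metadata_text, patterns):
    """Return the sorted canonical conditions whose synonyms occur in the text."""
    return sorted(c for c, p in patterns.items() if p.search(metadata_text))


def accuracy(samples, patterns):
    """Fraction of (text, expected_conditions) pairs assigned exactly right."""
    correct = sum(
        assign_conditions(text, patterns) == sorted(expected)
        for text, expected in samples
    )
    return correct / len(samples)


PATTERNS = build_patterns(SYNONYMS)

# Hypothetical free-text metadata, in the style of sample characteristics fields.
EXAMPLES = [
    ("disease state: T2DM; gender: male", ["type 2 diabetes"]),
    ("tissue: synovium; diagnosis: RA", ["rheumatoid arthritis"]),
    ("tissue: brain; status: healthy control", ["healthy"]),
]

if __name__ == "__main__":
    for text, _ in EXAMPLES:
        print(text, "->", assign_conditions(text, PATTERNS))
    print("accuracy:", accuracy(EXAMPLES, PATTERNS))
```

Whole-word matching is used here so that short synonyms such as "ra" do not fire inside unrelated words like "brain". A curated dictionary would also have to handle negations and ambiguous abbreviations, which this sketch does not attempt.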
### Competing Interest Statement

The authors have declared no competing interest.
Keywords
transcriptomics datasets,clinical,real-world