Application of Web-Scraping Techniques for Autonomous Massive Retrieval of Hematologic Patients' Information During SARS-CoV2 Pandemic

CLINICAL LYMPHOMA MYELOMA & LEUKEMIA (2020)

Abstract
Context: Healthcare professionals usually regard data collection involving a large number of patients as a tedious and time-consuming task. The current patient load makes collecting clinical data almost impossible, even though that information is needed more than ever.
Objective: We wanted to deploy a system that automatically and autonomously retrieves clinical data on patients with SARS-CoV-2 at hospital admission, so that the information can be collected for further analysis.
Design: We designed a daemon in the PHP programming language, connected to a MySQL/MariaDB database, that continuously searches for new patients presenting at the hospital. We collected medical history, disease records, regular medication, physical examination findings, vital signs, blood chemistry and blood count, and microbiology testing for SARS-CoV-2 (both PCR and ELISA antibody testing). As we do not have access to any API service (an out-of-the-box connection to the data mainframe), we took advantage of web-scraping (brute-force data extraction from webpages using the HTTP protocol) applied to our hospital web interface.
Setting: Monitoring took place between 1 March 2020 and 15 April 2020 (during the worst phase of the coronavirus outbreak in the country), using only one computer connected to the hospital network. The number of patients identified was 259, each with 344 clinical and testing variables.
Results: Using this technique, we collected data on 259 hematologic patients without human intervention, and more than 300 variables have been analyzed. Currently, certain aspects of the database (e.g., comorbidities) still need manual review, and some data must be entered manually because of the lack of proper coding. In the future, with the development of semantic-matching technologies, fully autonomous building of such databases will be possible. In the meantime, our technique can capture an enormous amount of clinical information with minimal manual effort. With that information, observational studies, and even a prognosis score using machine learning, have been developed in our center.
Conclusions: Data collection for further analysis is usually a vital, but time-consuming, task in answering clinical questions. We developed a technique that helped our center retrieve patients' clinical information autonomously during the SARS-CoV-2 pandemic.
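The Design step above (scrape the hospital web interface, then store the extracted variables in a relational database) can be illustrated with a minimal sketch. This is not the authors' PHP daemon: it is written in Python, uses SQLite from the standard library as a stand-in for MySQL/MariaDB, and assumes a hypothetical page layout in which each table row holds a variable name and its value. The names `SAMPLE_PAGE`, `RowParser`, `extract_variables`, `open_store`, and `store_patient` are all illustrative assumptions.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical page layout (an assumption, not the real hospital interface):
# one <tr> per variable, name in the first <td>, value in the second.
SAMPLE_PAGE = """
<table id="labs">
  <tr><td>Hemoglobin (g/dL)</td><td>11.2</td></tr>
  <tr><td>SARS-CoV-2 PCR</td><td>positive</td></tr>
</table>
"""


class RowParser(HTMLParser):
    """Collect the text of every <td>, grouped by enclosing <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []          # finished rows, each a list of cell strings
        self._row = None        # cells of the row currently being parsed
        self._in_cell = False   # True while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell and self._row:
            self._row[-1] += data


def extract_variables(html):
    """Return {variable name: value} for every two-cell row in the page."""
    parser = RowParser()
    parser.feed(html)
    return {r[0].strip(): r[1].strip() for r in parser.rows if len(r) == 2}


def open_store(path=":memory:"):
    """Open the database; SQLite stands in for MySQL/MariaDB here."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS observation ("
        "patient_id TEXT, variable TEXT, value TEXT, "
        "PRIMARY KEY (patient_id, variable))"
    )
    return conn


def store_patient(conn, patient_id, html):
    """Parse one patient page and upsert its variables; return how many."""
    variables = extract_variables(html)
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO observation VALUES (?, ?, ?)",
            [(patient_id, name, value) for name, value in variables.items()],
        )
    return len(variables)
```

In a daemon of the kind the abstract describes, a loop would periodically fetch each new patient's page over HTTP and pass the HTML to `store_patient`. The primary key on `(patient_id, variable)` means re-scraping the same patient updates existing rows rather than duplicating them, which suits a process that polls continuously.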
Keywords
web-scraping, big data, information technology, infectious diseases, SARS-CoV-2, autonomous, CT, cellular therapy