Clustering polycystic ovary syndrome laboratory results extracted from a large internet forum with machine learning

Intelligence-Based Medicine (2024)

Abstract
Background: Polycystic Ovary Syndrome (PCOS) is reported to affect between 4% and 21% of reproductive-aged people with ovaries. It is a heterogeneous condition that lacks established phenotypes covering the range of reproductive and metabolic features present in PCOS. Because of these features, patients may undergo a variety of relevant laboratory tests. Previous work gathered laboratory test results from a PCOS-specific forum hosted on the website Reddit.

Objectives: In this paper, laboratory results and body mass index (BMI) values posted on the PCOS Reddit forum were clustered to show the usefulness of the forum for PCOS research, and to validate existing PCOS phenotypes or discover other appropriate phenotypes.

Methods and results: Over 1500 sets of PCOS-related Reddit laboratory test results and BMIs were clustered using nearest neighbour imputation and K-means clustering. However, only non-imputed data were included in the final clusters. Kernel Density Estimation plots were used to display the distinct clusters. The clustered test results suggested the existence of distinct metabolic and reproductive phenotypes, as well as a group displaying mild features of both types of dysregulation and a group skewed towards normal results. It was also possible to separate the groups further: distinct hypothyroid groups emerged within the mixed dysregulation group, and insulin-resistant and diabetes-like groups emerged within the metabolic group.

Conclusions: This research further validates the usefulness of exploring alternate data sources in the age of the internet and machine learning. The Reddit clusters reinforced the existing notion that people with PCOS can be separated into a primarily metabolic pathology group, a primarily reproductive pathology group, and an in-between group with pathology in both domains.
Keywords
PCOS, Clustering, Machine learning, Internet research
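
The pipeline named in the abstract (nearest neighbour imputation, K-means clustering, and reporting only non-imputed values per cluster) can be illustrated with a minimal NumPy sketch. This is not the authors' code. The data are synthetic, the three columns (BMI, fasting insulin, total testosterone) are hypothetical stand-ins, and the helper names `knn_impute`, `kmeans` and `cluster_summary` are invented for illustration. The sketch also assumes one reading of "only non-imputed data": imputed values help assign cluster labels, but cluster summaries are computed from originally observed values only.

```python
import numpy as np


def knn_impute(X, k=3):
    """Fill NaNs with the mean of the k nearest rows that observed the feature.

    Distances are Euclidean over the features both rows observed.
    """
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    for i in range(len(X)):
        missing = np.isnan(X[i])
        if not missing.any():
            continue
        diff = X - X[i]
        shared = ~np.isnan(diff)          # features observed in both rows
        n_shared = shared.sum(axis=1)
        sq = np.where(shared, diff ** 2, 0.0).sum(axis=1)
        dist = np.sqrt(sq / np.maximum(n_shared, 1))
        dist[n_shared == 0] = np.inf      # no overlap: not comparable
        dist[i] = np.inf                  # a row is not its own neighbour
        order = np.argsort(dist)
        for j in np.where(missing)[0]:
            donors = [r for r in order
                      if np.isfinite(dist[r]) and not np.isnan(X[r, j])][:k]
            # Fall back to the column mean if no comparable donor exists.
            filled[i, j] = X[donors, j].mean() if donors else np.nanmean(X[:, j])
    return filled


def kmeans(X, n_clusters, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with random initial centres."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), n_clusters, replace=False)]
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = np.array([
            X[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


def cluster_summary(X_raw, labels, n_clusters):
    """Per-cluster feature means using observed (non-imputed) values only."""
    return np.array([np.nanmean(X_raw[labels == c], axis=0)
                     for c in range(n_clusters)])


# Synthetic data only: two made-up groups, columns are hypothetical
# stand-ins for BMI, fasting insulin and total testosterone.
rng = np.random.default_rng(42)
group_a = rng.normal([24.0, 6.0, 40.0], [1.0, 1.0, 5.0], size=(30, 3))
group_b = rng.normal([36.0, 20.0, 70.0], [1.0, 1.0, 5.0], size=(30, 3))
X_raw = np.vstack([group_a, group_b])
X_raw[::4, 1] = np.nan    # simulate posts that omit the second test
X_raw[1::5, 2] = np.nan   # simulate posts that omit the third test

# Standardise with NaN-aware statistics so no feature dominates distances.
Z = (X_raw - np.nanmean(X_raw, axis=0)) / np.nanstd(X_raw, axis=0)
Z_filled = knn_impute(Z, k=3)
labels, _ = kmeans(Z_filled, n_clusters=2)
summary = cluster_summary(X_raw, labels, n_clusters=2)
print(summary)
```

For real data, scikit-learn's `KNNImputer` and `KMeans` provide tested versions of the two core steps, and the per-cluster distributions of observed values could then be drawn as kernel density plots, as the abstract describes. The number of clusters here is fixed at two purely for the synthetic demo.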