Assessing privacy risks in population health publications using a checklist-based approach.

Journal of the American Medical Informatics Association (2018)

Abstract
Recent growth in the number of population health researchers accessing detailed datasets, either on their own computers or through virtual data centers, has the potential to increase privacy risks. In response, a checklist for identifying and reducing privacy risks in population health analysis outputs has been proposed for use by researchers themselves. In this study we explore the usability and reliability of such an approach by investigating whether different users identify the same privacy risks on applying the checklist to a sample of publications. The checklist was applied to a sample of 100 academic population health publications distributed among 5 readers. Cohen's kappa was used to measure interrater agreement. Of the 566 instances of statistical output types found in the 100 publications, the most frequently occurring were counts, summary statistics, plots, and model outputs. Application of the checklist identified 128 outputs (22.6%) with potential privacy concerns. Most of these were associated with the reporting of small counts. Among these identified outputs, the readers found no substantial actual privacy concerns when context was taken into account. Interrater agreement for identifying potential privacy concerns was generally good. This study has demonstrated that a checklist can be a reliable tool to assist researchers with anonymizing analysis outputs in population health research. This further suggests that such an approach may have the potential to be developed into a broadly applicable standard providing consistent confidentiality protection across multiple analyses of the same data.
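The interrater agreement statistic named above, Cohen's kappa, can be illustrated with a minimal sketch. The abstract does not say how agreement was computed across the 5 readers, so the two-rater setup, the function name `cohen_kappa`, and the example labels below are all hypothetical illustrations, not the study's data or procedure. Each label marks whether a reader flagged an output as a potential privacy concern (1) or not (0).

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items (hypothetical helper)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: for each label, the product of the two raters'
    # marginal proportions, summed over labels.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is undefined,
        # and this sketch treats it as perfect agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up labels for ten outputs: 1 = flagged as a potential privacy concern.
reader_1 = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
reader_2 = [1, 0, 0, 0, 1, 0, 0, 1, 1, 0]
kappa = cohen_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.3f}")
```

In this made-up example the readers agree on 8 of 10 outputs, but because agreement of 0.52 would be expected by chance given their flagging rates, kappa is lower than the raw agreement rate.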
Keywords
data anonymization, confidentiality, privacy, biomedical research, health services research