A multivariate frequency-severity framework for healthcare data breaches

BY HONG SUN, XU MAOCHAO, PENG ZHAO

ANNALS OF APPLIED STATISTICS (2022)

Abstract
Data breaches in healthcare have become a substantial concern in recent years, causing millions of dollars in financial losses each year. It is essential for government regulators, insurance companies, and stakeholders to understand the breach frequency and the number of affected individuals in each state, as these are directly related to the federal Health Insurance Portability and Accountability Act (HIPAA) and state data breach laws. However, an obstacle to studying data breaches in healthcare is the lack of suitable statistical approaches. We develop a novel multivariate frequency-severity framework to analyze breach frequency and the number of affected individuals at the state level. A mixed effects model is developed for the square-root-transformed frequency, and the log-gamma distribution is proposed to capture the skewness and heavy tail exhibited by the distribution of the numbers of affected individuals. We further discover a positive nonlinear dependence between the transformed frequency and the log-transformed numbers of affected individuals (i.e., severity). In particular, we propose to use a D-vine copula to capture the multivariate dependence among conditional severities given frequencies, due to its inherent temporal structure and rich bivariate copula families. A rejection sampling technique is developed to simulate the predictive distributions. Both the in-sample and out-of-sample studies show that the proposed multivariate frequency-severity model, which accommodates nonlinear dependence, has satisfactory fitting and prediction performance.
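
To make the two marginal ingredients named in the abstract concrete, the following is a minimal simulation sketch, not the authors' implementation. It assumes a Gaussian model on the square-root scale for frequency (standing in for the mixed effects model) and a log-gamma severity, meaning the log of the number of affected individuals follows a gamma distribution. The function name `simulate_breaches` and all parameter values are hypothetical, and the frequency-severity dependence and the D-vine copula are deliberately left out.

```python
import numpy as np

def simulate_breaches(rng, mu_sqrt=3.0, sigma_sqrt=0.8, shape=4.0, scale=1.5):
    """Simulate one state-period: a breach count and per-breach severities.

    All parameter values are illustrative assumptions, not fitted estimates.
    Frequency and severity are drawn independently here, which ignores the
    dependence that the paper models with a copula.
    """
    # Frequency: draw on the square-root scale, then back-transform to a count.
    z = rng.normal(mu_sqrt, sigma_sqrt)
    n = int(round(max(z, 0.0) ** 2))

    # Severity: log-gamma, i.e. log(severity) ~ Gamma(shape, scale).
    # Exponentiating a gamma draw gives a right-skewed, heavy-tailed variable.
    log_severity = rng.gamma(shape, scale, size=n)
    severities = np.exp(log_severity)
    return n, severities

rng = np.random.default_rng(0)
n, severities = simulate_breaches(rng)
print(n, severities.sum())
```

Because the gamma draw is positive, every simulated severity exceeds 1, and the exponential back-transform is what produces the heavy right tail the abstract refers to.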
Keywords
Copula, data breach, heavy tail, multivariate dependence, score