Standing on FURM ground – A framework for evaluating Fair, Useful, and Reliable AI Models in healthcare systems
arxiv(2024)
Abstract
The impact of using artificial intelligence (AI) to guide patient care or
operational processes is an interplay of the AI model's output, the
decision-making protocol based on that output, and the capacity of the
stakeholders involved to take the necessary subsequent action. Estimating the
effects of this interplay before deployment, and studying it in real time
afterwards, are essential to bridge the chasm between AI model development and
achievable benefit. To accomplish this, the Data Science team at Stanford
Health Care has developed a Testing and Evaluation (T&E) mechanism to identify
fair, useful and reliable AI models (FURM) by conducting an ethical review to
identify potential value mismatches, simulations to estimate usefulness,
financial projections to assess sustainability, as well as analyses to
determine IT feasibility, design a deployment strategy, and recommend a
prospective monitoring and evaluation plan. We report on FURM assessments done
to evaluate six AI-guided solutions for potential adoption, spanning clinical
and operational settings, each with the potential to impact from several dozen
to tens of thousands of patients each year. We describe the assessment process,
summarize the six assessments, and share our framework to enable others to
conduct similar assessments. Of the six solutions we assessed, two have moved
into a planning and implementation phase. Our novel contributions - usefulness
estimates by simulation, financial projections to quantify sustainability, and
a process to do ethical assessments - as well as their underlying methods and
open source tools, are available for other healthcare systems to conduct
actionable evaluations of candidate AI solutions.