AI Alignment in the Design of Interactive AI: Specification Alignment, Process Alignment, and Evaluation Support.
CoRR (2023)
Abstract
AI alignment considers the overall problem of ensuring an AI produces desired
outcomes, without undesirable side effects. While often considered from the
perspectives of safety and human values, AI alignment can also be considered in
the context of designing and evaluating interfaces for interactive AI systems.
This paper maps concepts from AI alignment onto a basic, three-step interaction
cycle, yielding a corresponding set of alignment objectives: 1) specification
alignment: ensuring the user can efficiently and reliably communicate
objectives to the AI, 2) process alignment: providing the ability to verify and
optionally control the AI's execution process, and 3) evaluation support:
ensuring the user can verify and understand the AI's output. We also introduce
the concept of a surrogate process, defined as a simplified, separately
derived, but controllable representation of the AI's actual process; and the
notion of a Process Gulf, which highlights how differences between human and AI
processes can lead to challenges in AI control. To illustrate the value of this
framework, we describe commercial and research systems along each of the three
alignment dimensions, and show how interfaces that provide interactive
alignment mechanisms can lead to qualitatively different and improved user
experiences.
Keywords
specification alignment, process alignment, interactive AI