SAGE: Task-Environment Platform for Autonomy and Generality Evaluation

Semantic Scholar (2020)

Abstract
While several tools exist for training and evaluating narrow machine learning (ML) algorithms, their design generally does not follow a particular or explicit evaluation methodology or theory. The inverse holds for more general learners: many evaluation methodologies and frameworks have been proposed, but few if any concrete tools exist. In this paper we introduce a new methodology for evaluating the autonomy and generality of artificial intelligence (AI) and ML architectures, and a new tool built on this methodology. The platform is called SAGE (Simulator for Autonomy & Generality Evaluation); it supports training and evaluation of both kinds of systems, as well as detailed comparison between narrow and general ML and AI. It provides a variety of task construction and tuning options, allowing single parameters of different complexity dimensions to be isolated. SAGE is aimed at helping AI researchers map out and compare the strengths and weaknesses of divergent AI and ML approaches. Our hope is that it can deepen understanding of the various tasks we want AI systems to perform, including the relationship between their composition, complexity, and difficulty for various AI systems, and contribute to a clearer road map for the field. We discuss why we think both narrow and general AI systems are in equal need of better tools and evaluation methodologies, describe the requirements that led to the platform's creation, and give examples of its use.
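
To make "isolation of single parameters of different complexity dimensions" concrete, here is a minimal, hypothetical sketch. The class and field names below are not SAGE's actual API; they only illustrate the idea of fixing a baseline task configuration and varying one complexity dimension at a time.

# Hypothetical illustration only; names are assumptions, not SAGE's real interface.
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class TaskConfig:
    """One point in a task-environment's complexity space."""
    num_objects: int = 3        # observational complexity
    noise_level: float = 0.0    # stochasticity of observations
    time_pressure: float = 1.0  # how quickly the task must be solved
    hidden_state: bool = False  # partial observability on/off

def sweep(base: TaskConfig, dimension: str, values):
    """Yield configs that differ from `base` in exactly one dimension."""
    for v in values:
        yield replace(base, **{dimension: v})

if __name__ == "__main__":
    base = TaskConfig()
    # Isolate the effect of observation noise; every other parameter stays fixed.
    for cfg in sweep(base, "noise_level", [0.0, 0.1, 0.2, 0.4]):
        print(cfg)

Sweeping one dimension while holding the rest constant is what lets a researcher attribute a change in agent performance to that single parameter rather than to a confound.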