Representation Engineering: A Top-Down Approach to AI Transparency
Andy Zou, Long Phan, Sarah Chen, James Campbell, Phillip Guo, Richard Ren, Alexander Pan, Xuwang Yin, Mantas Mazeika, Ann-Kathrin Dombrowski, Shashwat Goel, Nathaniel Li, Michael J. Byun, Zifan Wang, Alex Mallen, Steven Basart, Sanmi Koyejo, Dawn Song, Matt Fredrikson, J. Zico Kolter, Dan Hendrycks. CoRR (2023)
