JBShield: Defending Large Language Models from Jailbreak Attacks Through Activated Concept Analysis and Manipulation

Shenyi Zhang, Yuchen Zhai, Keyan Guo, Hongxin Hu, Shengnan Guo, Zheng Fang, Lingchen Zhao, Chao Shen, Cong Wang, Qian Wang

CoRR (2025)
