Language Models Learn to Mislead Humans Via RLHF
Jiaxin Wen, Ruiqi Zhong, Akbir Khan, Ethan Perez, Jacob Steinhardt, Minlie Huang, Sam Bowman, He He, Shi Feng
ICLR 2025