AuthentiGPT: Detecting Machine-Generated Text via Black-Box Language Models Denoising.
CoRR(2023)
Abstract
Large language models (LLMs) have opened up enormous opportunities while
simultaneously posing ethical dilemmas. One of the major concerns is their
ability to create text that closely mimics human writing, which can lead to
potential misuse, such as academic misconduct, disinformation, and fraud. To
address this problem, we present AuthentiGPT, an efficient classifier that
distinguishes between machine-generated and human-written texts. Under the
assumption that human-written text resides outside the distribution of
machine-generated text, AuthentiGPT leverages a black-box LLM to denoise input
text with artificially added noise, and then semantically compares the denoised
text with the original to determine if the content is machine-generated. With
only one trainable parameter, AuthentiGPT eliminates the need for a large
training dataset, watermarking the LLM's output, or computing the
log-likelihood. Importantly, the detection capability of AuthentiGPT can be
easily adapted to any generative language model. With a 0.918 AUROC score on a
domain-specific dataset, AuthentiGPT demonstrates its effectiveness over other
commercial algorithms, highlighting its potential for detecting
machine-generated text in academic settings.
MoreTranslated text
AI Read Science
Must-Reading Tree
Example
![](https://originalfileserver.aminer.cn/sys/aminer/pubs/mrt_preview.jpeg)
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined