AILS-NTUA at SemEval-2024 Task 6: Efficient model tuning for hallucination detection and analysis
arXiv (2024)
Abstract
In this paper, we present our team's submissions for SemEval-2024 Task-6 -
SHROOM, a Shared-task on Hallucinations and Related Observable Overgeneration
Mistakes. The participants were asked to perform binary classification to
identify cases of fluent overgeneration hallucinations. Our experimentation
included fine-tuning a pre-trained model on hallucination detection and a
Natural Language Inference (NLI) model. The most successful strategy involved
creating an ensemble of these models, resulting in accuracy rates of 77.8% and
79.9% on the model-agnostic and model-aware datasets respectively, outperforming
the organizers' baseline and achieving notable results when contrasted with the
top-performing results in the competition, which reported accuracies of 84.7%
and 81.3%, respectively.
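
The abstract does not say how the ensemble combines its member models, so the following is only a minimal sketch of one common option: soft voting, which averages each model's hallucination probability and thresholds the result. The function names, the example probabilities, the equal default weights, and the 0.5 threshold are all illustrative assumptions, not the authors' implementation.

```python
from typing import List, Optional, Sequence


def ensemble_hallucination_labels(
    model_probs: Sequence[Sequence[float]],
    weights: Optional[Sequence[float]] = None,
    threshold: float = 0.5,
) -> List[int]:
    """Soft-voting ensemble for binary hallucination detection (illustrative).

    model_probs[m][i] is model m's probability that example i is a
    hallucination. Returns 1 (hallucination) or 0 (not) per example.
    """
    if not model_probs:
        raise ValueError("need at least one model")
    n_examples = len(model_probs[0])
    if any(len(probs) != n_examples for probs in model_probs):
        raise ValueError("all models must score the same number of examples")
    if weights is None:
        # Assumption: equal weight per model unless told otherwise.
        weights = [1.0] * len(model_probs)
    if len(weights) != len(model_probs):
        raise ValueError("need exactly one weight per model")

    total_weight = sum(weights)
    labels = []
    for i in range(n_examples):
        # Weighted mean of the per-model probabilities for example i.
        avg = sum(w * probs[i] for w, probs in zip(weights, model_probs)) / total_weight
        labels.append(1 if avg >= threshold else 0)
    return labels


def accuracy(predicted: Sequence[int], gold: Sequence[int]) -> float:
    """Fraction of examples whose predicted label matches the gold label."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Hypothetical scores from a fine-tuned detector and an NLI model.
detector_probs = [0.9, 0.2, 0.6, 0.4]
nli_probs = [0.7, 0.1, 0.3, 0.8]
predictions = ensemble_hallucination_labels([detector_probs, nli_probs])
```

Averaging probabilities rather than taking a hard majority vote avoids ties when only two models are combined, which is one reason it is a plausible choice here; a weighted variant can be obtained by passing `weights`.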