
ChatGPT v Bard v Bing v Claude 2 v Aria v human-expert. How good are AI chatbots at scientific writing?

arXiv (Cornell University), 2023

Abstract
Historical emphasis on writing mastery has shifted with advances in generative AI, especially in scientific writing. This study analysed six AI chatbots for scholarly writing in the humanities and archaeology. Using methods that assessed factual correctness and scientific contribution, ChatGPT-4 showed the highest quantitative accuracy, closely followed by ChatGPT-3.5, Bing, and Bard. However, Claude 2 and Aria scored considerably lower. Qualitatively, all AIs exhibited proficiency in merging existing knowledge, but none produced original scientific content. Interestingly, our findings suggest ChatGPT-4 might represent a plateau in large language model size. This research emphasizes the unique, intricate nature of human research, suggesting that AI's emulation of human originality in scientific writing is challenging. As of 2023, while AI has transformed content generation, it struggles to make original contributions in the humanities. This may change as AI chatbots continue to evolve into LLM-powered software.
Keywords
chatbots, AI