AI and Veterinary Medicine: Performance of Large Language Models on the North American Licensing Examination.

Mirana Angel, Anuj Patel, Haiyi Xing, Dylan Balsz, Cody L. Arbuckle, David Bruyette, Pierre Baldi

2023 Tenth International Conference on Social Networks Analysis, Management and Security (SNAMS) (2023)

Abstract
This study aimed to assess the performance of Large Language Models on the North American Veterinary Licensing Examination (NAVLE) and to analyze the impact of artificial intelligence in the domain of animal healthcare. A 200-question NAVLE self-assessment sourced from the ICVA's website was used to evaluate three language models: GPT-3, GPT-4, and Bard. Questions involving images were omitted, leaving a 164-question text-only sample exam. Results were analyzed by comparing generated responses to the answer key, and scores were assigned to evaluate the models' veterinary medical reasoning capabilities. Our results showed that GPT-4 outperformed GPT-3 and Bard, passing the exam by answering 89% of the text-only questions correctly. GPT-3 and Bard achieved accuracies of only 63.4% and 61%, respectively, on the same set of questions. Language models hold promise for enhancing veterinary practice through expanded educational opportunities in the veterinary curriculum and improved diagnostic accuracy, treatment times, and efficiency. However, potential negatives include challenges in changing the current educational paradigm, reduced demand for professionals or paraprofessionals, and concerns surrounding machine-generated decisions. Responsible and ethical integration of language models is crucial in veterinary medicine.
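The scoring procedure described in the abstract (dropping image-based questions, then comparing each generated response to the answer key) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the field names `has_image` and `answer`, the helper names, and the toy data are all hypothetical, and exact-match comparison of answer letters is an assumption about how responses were graded.

```python
def text_only(questions):
    """Keep only questions that do not rely on an image (hypothetical 'has_image' flag)."""
    return [q for q in questions if not q["has_image"]]


def accuracy(responses, questions):
    """Fraction of responses that match the answer key, compared case-insensitively."""
    if len(responses) != len(questions):
        raise ValueError("one response is required per question")
    if not questions:
        return 0.0
    correct = sum(
        r.strip().upper() == q["answer"].strip().upper()
        for r, q in zip(responses, questions)
    )
    return correct / len(questions)


# Toy data only; these are not NAVLE items or real model outputs.
toy_exam = [
    {"id": 1, "has_image": False, "answer": "A"},
    {"id": 2, "has_image": True, "answer": "C"},
    {"id": 3, "has_image": False, "answer": "D"},
    {"id": 4, "has_image": False, "answer": "B"},
    {"id": 5, "has_image": False, "answer": "E"},
]
toy_subset = text_only(toy_exam)            # the image-based question 2 is dropped
toy_responses = ["a", "D", "C", "E"]        # one hypothetical response per remaining question
toy_score = accuracy(toy_responses, toy_subset)
```

Applied to the study's setup, the same filter would reduce the 200-question self-assessment to the 164 text-only questions, and the accuracy would be computed once per model over that shared subset.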
Keywords
Artificial Intelligence, LLM, ChatGPT, Bard, Veterinary Medicine, Medical Education, Societal Impact