Chrome Extension
WeChat Mini Program
Use on ChatGLM

Evaluating Large Language Models for Public Health Classification and Extraction Tasks

Joshua Harris, Timothy Laurence, Leo Loman, Fan Grayson, Toby Nonnenmacher, Harry Long, Loes WalsGriffith,Amy Douglas,Holly Fountain, Stelios Georgiou,Jo Hardstaff, Kathryn Hopkins, Y-Ling Chi, Galena Kuyumdzhieva,Lesley Larkin,Samuel Collins,Hamish Mohammed,Thomas Finnie, Luke Hounsome,Steven Riley

CoRR(2024)

Cited 0|Views15
No score
Abstract
Advances in Large Language Models (LLMs) have led to significant interest in their potential to support human experts across a range of domains, including public health. In this work we present automated evaluations of LLMs for public health tasks involving the classification and extraction of free text. We combine six externally annotated datasets with seven new internally annotated datasets to evaluate LLMs for processing text related to: health burden, epidemiological risk factors, and public health interventions. We initially evaluate five open-weight LLMs (7-70 billion parameters) across all tasks using zero-shot in-context learning. We find that Llama-3-70B-Instruct is the highest performing model, achieving the best results on 15/17 tasks (using micro-F1 scores). We see significant variation across tasks with all open-weight LLMs scoring below 60 Classification, while all LLMs achieve greater than 80 such as GI Illness Classification. For a subset of 12 tasks, we also evaluate GPT-4 and find comparable results to Llama-3-70B-Instruct, which scores equally or outperforms GPT-4 on 6 of the 12 tasks. Overall, based on these initial results we find promising signs that LLMs may be useful tools for public health experts to extract information from a wide variety of free text sources, and support public health surveillance, research, and interventions.
More
Translated text
AI Read Science
Must-Reading Tree
Example
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined