Successful Development of a Natural Language Processing Algorithm for Pancreatic Neoplasms and Associated Histologic Features

PANCREAS(2023)

Abstract
Objectives: Natural language processing (NLP) algorithms can interpret unstructured text for commonly used terms and phrases. Pancreatic pathologies are diverse and include benign and malignant entities with associated histologic features. Creating a pancreas NLP algorithm can aid in electronic health record coding as well as large database creation and curation.

Methods: Text-based pancreatic anatomic and cytopathologic reports for pancreatic cancer, pancreatic ductal adenocarcinoma, neuroendocrine tumor, intraductal papillary neoplasm, tumor dysplasia, and suspicious findings were collected. This dataset was split 80/20 for model training and development. A separate set was held out for testing purposes. We trained a convolutional neural network to predict each heading.

Results: Over 14,000 reports were obtained from the Mass General Brigham Healthcare System electronic record. Of these, 1252 reports were used for algorithm development. Final accuracy and F1 scores relative to the test set ranged from 95% to 98% for each queried pathology. To understand the dependence of our results on training set size, we also generated learning curves. Scoring metrics improved as more reports were submitted for training; however, some queries had high index performance.

Conclusions: Natural language processing algorithms can be used for pancreatic pathologies. Increased training volume, nonoverlapping terminology, and conserved text structure improve NLP algorithm performance.
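As a rough illustration of the 80/20 training/development split and the accuracy and F1 metrics mentioned in the abstract, the following is a minimal Python sketch. It is not the authors' pipeline: the function names `split_80_20` and `accuracy_and_f1`, the fixed random seed, and the binary 0/1 label encoding for a single queried pathology are all assumptions made for illustration.

```python
import random


def split_80_20(items, seed=0):
    """Shuffle a copy of `items` and split it 80/20 (hypothetical helper)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.8)
    return shuffled[:cut], shuffled[cut:]


def accuracy_and_f1(y_true, y_pred):
    """Return (accuracy, F1) for binary labels, where 1 means the pathology is present."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when there are no positives at all
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


if __name__ == "__main__":
    # Placeholder report IDs; real inputs would be report texts with labels.
    train_ids, dev_ids = split_80_20(range(10))
    print(len(train_ids), len(dev_ids))
    print(accuracy_and_f1([1, 1, 0, 0], [1, 0, 0, 1]))
```

Repeating the evaluation while training on progressively larger subsets of the training split would give the points of a learning curve of the kind the abstract describes.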
Key words
machine learning, natural language processing, pancreatic cancer, pancreatic ductal adenocarcinoma, learning curves