Chrome Extension
WeChat Mini Program
Use on ChatGLM

A Scalable Natural Language Processing for Inferring BT-RADS Categorization from Unstructured Brain Magnetic Resonance Reports

JOURNAL OF DIGITAL IMAGING(2020)

Cited 17|Views364
No score
Abstract
The aim of this study is to develop an automated classification method for Brain Tumor Reporting and Data System (BT-RADS) categories from unstructured and structured brain magnetic resonance imaging (MR) reports. This retrospective study included 1410 BT-RADS structured reports dated from January 2014 to December 2017 and a test set of 109 unstructured brain MR reports dated from January 2010 to December 2014. Text vector representations and semantic word embeddings were generated from individual report sections (i.e., “History,” “Findings,” etc.) using Tf-idf statistics and a fine-tuned word2vec model, respectively. Section-wise ensemble models were trained using gradient boosting (XGBoost), elastic net regularization, and random forests, and classification accuracy was evaluated on an independent test set of unstructured brain MR reports and a validation set of BT-RADS structured reports. Section-wise ensemble models using XGBoost and word2vec semantic word embeddings were more accurate than those using Tf-idf statistics when classifying unstructured reports, with an f1 score of 0.72. In contrast, models using traditional Tf-idf statistics outperformed the word2vec semantic approach for categorization from structured reports, with an f1 score of 0.98. Proposed natural language processing pipeline is capable of inferring BT-RADS report scores from unstructured reports after training on structured report data. Our study provides a detailed experimentation process and may provide guidance for the development of RADS-focused information extraction (IE) applications from structured and unstructured radiology reports.
More
Translated text
Key words
BT-RADS,Deep learning,NLP,Distributional semantics,Text mining
AI Read Science
Must-Reading Tree
Example
Generate MRT to find the research sequence of this paper
Chat Paper
Summary is being generated by the instructions you defined