Radiology Text Analysis System (RadText): Architecture and Evaluation.

Overview

abstract

Analyzing radiology reports is a time-consuming and error-prone task, which raises the need for an efficient automated radiology report analysis system to alleviate the workloads of radiologists and encourage precise diagnosis. In this work, we present RadText, a high-performance open-source Python radiology text analysis system. RadText offers an easy-to-use text analysis pipeline, including de-identification, section segmentation, sentence splitting and word tokenization, named entity recognition, parsing, and negation detection. Compared with existing widely used toolkits, RadText features a hybrid text processing schema and supports both raw text processing and local processing, which enable higher accuracy, better usability, and improved data privacy. RadText adopts BioC as the unified interface and standardizes the output into a structured representation that is compatible with the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM), which allows for a more systematic approach to observational research across multiple, disparate data sources. We evaluated RadText on the MIMIC-CXR dataset, with five new disease labels that we annotated for this work. RadText demonstrates highly accurate classification performance, with an average precision of 0.91, an average recall of 0.94, and an average F1 score of 0.92. We also annotated a test set for the five new disease labels to facilitate future research and applications. We have made our code, documentation, examples, and the test set available at https://github.com/bionlplab/radtext.
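To make the pipeline stages named in the abstract concrete, the following is a minimal, rule-based sketch of section segmentation, sentence splitting, dictionary-based entity recognition, and negation detection. It does not use RadText and is not its API: the function names, the tiny vocabulary in `FINDINGS`, the cue list in `NEGATION_CUES`, and the "cue appears earlier in the same sentence" negation rule are all assumptions made for illustration.

```python
import re

# Hypothetical mini-vocabulary and negation cues, for illustration only.
FINDINGS = ("pneumothorax", "pleural effusion", "atelectasis")
NEGATION_CUES = ("no", "without", "negative for")


def split_sections(report):
    """Split a report into {title: body} on upper-case 'TITLE:' headers."""
    parts = re.split(r"^([A-Z][A-Z ]+):", report, flags=re.MULTILINE)
    # parts = [preamble, title1, body1, title2, body2, ...]
    return {
        parts[i].strip(): parts[i + 1].strip()
        for i in range(1, len(parts) - 1, 2)
    }


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text) if s]


def detect_findings(sentence):
    """Return (finding, negated) pairs for known findings in one sentence."""
    lowered = sentence.lower()
    results = []
    for finding in FINDINGS:
        match = re.search(r"\b" + re.escape(finding) + r"\b", lowered)
        if match is None:
            continue
        # Assumed rule: a finding is negated if a cue precedes it in the sentence.
        before = lowered[: match.start()]
        negated = any(
            re.search(r"\b" + re.escape(cue) + r"\b", before)
            for cue in NEGATION_CUES
        )
        results.append((finding, negated))
    return results


def analyze(report):
    """Run the toy pipeline and return one record per detected finding."""
    rows = []
    for title, body in split_sections(report).items():
        for sentence in split_sentences(body):
            for finding, negated in detect_findings(sentence):
                rows.append(
                    {"section": title, "finding": finding, "negated": negated}
                )
    return rows


if __name__ == "__main__":
    example = (
        "FINDINGS: No pneumothorax. Small left pleural effusion.\n"
        "IMPRESSION: Pleural effusion without atelectasis."
    )
    for row in analyze(example):
        print(row)
```

A production system would replace each rule with a trained or curated component and would add the de-identification, parsing, and BioC/OMOP output steps that this sketch omits.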

authors

Wang, Song

Lin, Mingquan

Ding, Ying

Shih, George

Lu, Zhiyong

Peng, Yifan

publication date

September 8, 2022

published in

Proceedings. IEEE International Conference on Healthcare Informatics Journal

Identity

PubMed Central ID

PMC9484781

Scopus Document Identifier

85134315755

Digital Object Identifier (DOI)

10.1109/ichi54592.2022.00050

PubMed ID

36128510

Additional Document Info

has global citation frequency

11

volume

2022

type

Academic Article
