A4409 - Validation of a Natural Language Processing Algorithm to Extract Nodule Characteristics from Dictated Radiology Transcripts
Author Block: M. K. Gould, A. L. Liu, C. Zheng, J. S. Lee, D. E. Altman, B. Huang, B. Creekmur; Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States.
Introduction We previously developed and validated a natural language processing (NLP) algorithm that scans the free text of dictated radiology reports to identify patients with incidental pulmonary nodules with high sensitivity and specificity. We extended the same method to extract information about the edge characteristics, attenuation and calcification of identified lung nodules.
Method We used data from the Radiology Information System and our existing NLP algorithm to identify 9114 patients with a lung nodule on a dictated radiology transcript dated from November 2015 to February 2016 at Kaiser Permanente Southern California. We purposely sampled 150 subjects' transcripts and tested the algorithm's ability to classify lung nodule edge characteristics, attenuation and calcification. To provide a reference standard, an experienced pulmonologist reviewed the same transcripts.
Results Seven transcripts (4.7%) were identified as false positives for the presence of a nodule after review by the pulmonologist. For the remaining 143 true positive lung nodule transcripts, the mean nodule size was 9.6 mm (±6.9 mm). Laterality was specified as right lung in 89, left in 50, both in 3 and neither in 1. There were 67 nodules located in the upper lobe, 13 in the middle lobe, 2 in the lingula, 52 in the lower lobe, 2 in other locations and 7 with unspecified location. Edge characteristics were classified into five categories: not specified, smooth, lobulated, irregular and spiculated. The kappa coefficient for agreement between the NLP algorithm and the pulmonologist for edge characteristics was 0.732 (95% CI, 0.608 to 0.857). Nodule attenuation was classified into four categories: not specified, solid, part-solid and non-solid; kappa for attenuation was 0.710 (95% CI, 0.600 to 0.821). Lastly, calcification was classified into four categories: not specified, non-calcified, calcified-non-specific and calcified-benign; kappa was 0.606 (95% CI, 0.481 to 0.731).
Conclusions Excellent agreement between NLP and manual review of transcripts demonstrates that NLP is a useful research tool for extracting information about lung nodule edge characteristics, attenuation and calcification, although agreement on calcification was less robust and could benefit from additional fine-tuning of the NLP algorithm.
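For readers unfamiliar with the agreement statistic reported in the Results, the following is a minimal sketch of how a kappa coefficient between two raters can be computed. It assumes the unweighted Cohen's kappa; the abstract does not state which variant was used or how the confidence intervals were derived. The function name `cohen_kappa` and the example labels are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty label sequences")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa undefined: both raters used one identical label")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical edge-characteristic labels (illustration only, NOT study data).
nlp = ["smooth", "smooth", "spiculated", "not specified", "lobulated", "smooth"]
reviewer = ["smooth", "irregular", "spiculated", "not specified", "lobulated", "smooth"]
kappa = cohen_kappa(nlp, reviewer)
print(round(kappa, 3))
```

Kappa corrects the raw proportion of matching labels for the agreement expected by chance, so it is lower than simple percent agreement whenever one category dominates either rater's labels.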