Chest X-rays (CXRs), being highly sensitive, serve as a screening tool in TB diagnosis. Although no classical CXR features are diagnostic of TB, a few patterns can be used as supportive evidence. In resource-limited settings, deep learning algorithms for CXR-based TB screening could reduce diagnostic delay. Our algorithm screens for 8 abnormal patterns (TB tags): pleural effusion, blunted costophrenic (CP) angle, atelectasis, fibrosis, opacity, nodules, calcification and cavity. It reports ‘No Abnormality Detected’ if none of these patterns is present on the CXR.
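The reporting rule described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the tag identifiers, the function names `positive_tags` and `report`, and the single 0.5 threshold are all assumptions, and the abstract does not state how per-tag probabilities are turned into positive findings.

```python
# Hypothetical identifiers for the 8 TB tags listed in the abstract.
TB_TAGS = [
    "pleural_effusion", "blunted_cp", "atelectasis", "fibrosis",
    "opacity", "nodule", "calcification", "cavity",
]


def positive_tags(tag_probs, threshold=0.5):
    """Return the tags whose predicted probability reaches the threshold.

    The shared 0.5 threshold is an assumption; a real system would
    likely tune one operating point per tag.
    """
    return [t for t in TB_TAGS if tag_probs.get(t, 0.0) >= threshold]


def report(tag_probs, threshold=0.5):
    """Build a one-line screening report from per-tag probabilities."""
    found = positive_tags(tag_probs, threshold)
    if not found:
        return "No Abnormality Detected"
    return "Abnormal: " + ", ".join(found)
```

For example, `report({"cavity": 0.9, "opacity": 0.2})` flags only the cavity tag, while a CXR with all probabilities below the threshold yields ‘No Abnormality Detected’.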
An anonymized dataset of 423,218 CXRs with matched radiologist reports, spanning 22 models from 9 manufacturers across 166 centres in India, was used to generate training data for the deep learning models. Natural language processing techniques were used to extract TB tags from these reports. Deep learning systems were trained to predict the probability that each TB tag is present, along with heat-maps that highlight abnormal regions in the CXR for each positive result.
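To illustrate how free-text reports can be turned into per-tag training labels, here is a rough rule-based sketch. It is an assumption made for illustration only: the abstract does not describe the NLP method used, and the patterns, the negation cues and the function name `extract_tags` are hypothetical. Only four of the eight tags are covered to keep the example short.

```python
import re

# Hypothetical keyword patterns for a subset of the TB tags.
TAG_PATTERNS = {
    "pleural_effusion": r"pleural effusion",
    "cavity": r"cavit(?:y|ies|ation)",
    "fibrosis": r"fibro(?:sis|tic)",
    "calcification": r"calcifi(?:cation|ed)",
}

# Crude negation cues; any sentence containing one is treated as negative.
NEGATION = re.compile(r"\b(?:no|without|absent)\b")


def extract_tags(report_text):
    """Map a radiologist report to binary labels, one per tag."""
    labels = {tag: 0 for tag in TAG_PATTERNS}
    # Split into rough sentences so a negation only affects its own sentence.
    for sentence in re.split(r"[.;\n]", report_text.lower()):
        if NEGATION.search(sentence):
            continue
        for tag, pattern in TAG_PATTERNS.items():
            if re.search(pattern, sentence):
                labels[tag] = 1
    return labels
```

Sentence-level negation is deliberately simple here; it would mislabel a sentence such as "no effusion but a cavity is seen", which is one reason production systems tend to use more careful scope handling.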
We validated the screening algorithm on 3 datasets external to our training set: two public datasets maintained by the NIH (from Montgomery and Shenzhen) and a third from NIRT, India. The area under the receiver operating characteristic curve (AUC-ROC) for TB prediction was 0.91, 0.87 and 0.83, respectively.
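For readers unfamiliar with the metric, AUC-ROC equals the probability that a randomly chosen TB-positive case receives a higher score than a randomly chosen TB-negative case, with ties counted as one half. The sketch below computes it directly from that definition. The function name `auc_roc` is ours, and this is not a description of the evaluation code used in the study.

```python
def auc_roc(labels, scores):
    """AUC-ROC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 ground-truth values (1 = TB positive).
    scores: iterable of predicted probabilities, same length as labels.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Each positive/negative pair scores 1 if ranked correctly, 0.5 if tied.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

This pairwise form is quadratic in the number of cases, which is fine for a quick check; for large validation sets a rank-based implementation from a standard library is the more practical choice.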
Training on a diverse dataset enabled good performance on samples from completely different demographics. After further validation of its robustness against variation, the system can be deployed at scale to significantly improve current TB screening systems.