      Deep learning in interstitial lung disease: classification and prognostic insights

      Published
      review-article

            Abstract

            Interstitial lung disease (ILD) comprises diverse parenchymal lung disorders and is an important cause of morbidity and mortality among lung diseases. Disagreement is frequently observed among radiologic reads, pathologic interpretations, and multidisciplinary discussion consensus. Therefore, establishing a definitive diagnosis of ILD by using current techniques and criteria poses a considerable challenge. High-resolution computed tomography (HRCT) plays a crucial role in characterizing imaging patterns and predicting ILD prognosis. However, the substantial overlap in radiographic findings hinders accurate diagnosis of ILD on HRCT, even by experienced radiologists. Recently, deep learning (DL), a strategy that can automatically learn important characteristic features and patterns within CT images, has shown great potential in classifying ILD and predicting its prognosis. This review summarizes current DL applications in ILD classification and prognosis evaluation; discusses challenges in clinical implementation; and presents insights for advancing this field. In conclusion, advanced DL can enhance diagnostic accuracy and enable more personalized treatment, thus providing new perspectives for managing ILD in the future.

            Main article text

            1. INTRODUCTION

            Interstitial lung disease (ILD) comprises diverse conditions characterized by inflammation and fibrosis of the interstitium [1]. Because these disorders exhibit overlapping clinical, radiological, and pathological features, differential diagnosis is challenging, even for experienced physicians [2]. Recent data have indicated rising morbidity and mortality from ILD. Between 2000 and 2017, the mortality rate due to idiopathic pulmonary fibrosis (IPF) in the United States increased by 9.85% (from 18.81 per 100,000 to 20.66 per 100,000) [3]. Diagnosing ILD in early stages is critical for determining appropriate treatment strategies. In contrast, missed diagnosis can give rise to potentially life-threatening complications.

            In recent decades, high-resolution computed tomography (HRCT) has emerged as the primary modality used for diagnosing ILD, particularly fibrotic lung diseases. Trained radiologists rely on visual evaluation of medical images to detect, characterize, and diagnose diseases. Nevertheless, this assessment is somewhat subjective and differs depending on physicians’ education and experience. For instance, even experienced chest radiologists may achieve only basic agreement in detecting honeycombing [4, 5], an essential component of usual interstitial pneumonia (UIP).

            Since the 1990s, image analysis using artificial intelligence (AI) has rapidly advanced, driven by the introduction of deep learning (DL) algorithms based on neural networks and increased computer processing power. DL techniques have been applied in lesion detection, segmentation, and classification of ILD on HRCT. Instead of using qualitative reasoning, DL algorithms recognize intricate patterns in imaging data and can automatically offer quantitative evaluations. Convolutional neural networks (CNNs), the prominent DL algorithms used for image analysis, are particularly adept at HRCT image analysis in ILD. Increasing evidence indicates that CNNs exhibit performance comparable or even superior to that of experts in diagnosing and managing ILD. By integrating AI into clinical processes as a tool to assist physicians, more precise and reproducible radiological evaluations can be achieved. Consequently, to achieve objective and timely diagnosis of ILD, AI assistance is highly desirable.

            The objective of this review is to summarize recent advancements in DL applications for ILD classification and prognosis evaluation. Moreover, we explore barriers to translating these findings into clinical practice, and provide insights and recommendations.

            2. ARTIFICIAL INTELLIGENCE OVERVIEW

            AI has revolutionized medical image analysis and has shown promising performance in various computer vision tasks. Machine learning (ML) and DL are subcategories of AI that have become the state of the art in the field of image analysis. Unlike traditional ML methods, DL algorithms can automatically learn the relevant image features without a need for manual definition by human experts. Such algorithms usually use multiple layers of processing that enhance feature extraction and characterization [6]. Driven by advances in neural networks and increased computer processing power, the performance of DL is continually improving and is currently considered comparable to or better than that of humans in image classification tasks (e.g., pneumonia recognition) [7]. Table 1 summarizes five frequently used DL algorithms.

            Table 1 |

            Summary of five frequently used DL algorithms.

            Algorithm | Typical applications | Advantages | Disadvantages
            Convolutional neural networks | Image classification, object detection, image segmentation, face recognition | • Localized processing and translation invariance improve model learning of image features. • Features like parameter sharing and robustness to spatial variations also improve model efficiency and generalization. | • Training requires substantial time and sample sizes, because of its multilayer structure. • Generalization to unseen variations in data distribution is limited.
            Recurrent neural networks | Language modeling, text generation, speech recognition | • Historical information is considered for more effective processing of sequential data. • Parameters are shared across time steps. | • This method is prone to vanishing gradients during training. • Slow computation can make training difficult. • Accessing information from long ago may be difficult. • Future inputs from current state cannot be considered.
            Long short-term memory | Machine translation, natural language processing, complex time series analysis | • Long-range dependencies in sequential data, such as language and time series, can be captured. • Gradient issues are alleviated, thus enhancing temporal modeling. • This method is ideal for real-time applications, maintaining context over long sequences. | • Training requires extensive labeled data, particularly for tasks with intricate temporal patterns. • The inability to use parallel computation increases computational load. • This method is not conducive to transfer learning.
            Generative adversarial networks | Image generation, style transfer, data augmentation | • This method can produce high quality, realistic images and is effective in image generation tasks. • Unsupervised learning methods offer advantages with large amounts of unlabeled data. • Diverse outputs are generated. | • The training process is complex and unstable. • This method is prone to mode collapse phenomena. • Interpretability is unsatisfactory.
            Deep belief networks | Feature extraction, image recognition, video recognition, motion capture | • This method is especially useful with limited training data. • This method is specifically robust in classification (size, position, color, and viewing angle rotation). | • A large dataset is required to optimize performance. • Hardware demands are high. • Classifiers are needed to understand the output.

            CNNs, the primary deep neural networks used for image analysis, were proposed by LeCun et al. [8] in the 1990s, and their structure is loosely modeled on biological neural systems. CNNs have shown remarkable capabilities in image analysis, surpassing all other image classification algorithms in the ImageNet Large Scale Visual Recognition Challenge [9]. A basic CNN comprises three essential components: convolutional layers, pooling layers, and fully connected layers. CNNs consist of large numbers of computational units (neurons) that communicate with each other through weighted connections (analogous to axons). During training, the algorithm is run multiple times on a training dataset, and the weight assigned to each connection is adjusted to minimize errors in the algorithm’s outputs. After completion of the training process, the algorithm is tested on an independent dataset to gauge its performance [10]. Numerous AI systems using CNN algorithms have been developed to detect and diagnose various conditions, including lung nodules [11, 12]. AI-aided interpretation has been found to achieve 6.4% greater detection accuracy than that of nine radiologists in pulmonary nodule detection on chest radiographs [13].
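
            The three components named above can be illustrated with a minimal sketch. It uses only the Python standard library and shows a single forward pass (convolution, pooling, then a fully connected layer with softmax); the function names, kernel, and weights are illustrative assumptions rather than any published ILD model, and training (weight adjustment) is not shown.

```python
import math

def conv2d(image, kernel):
    """Valid 2D convolution (cross-correlation, as in most DL libraries)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def relu(fmap):
    """Element-wise rectified linear activation."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Non-overlapping max pooling; incomplete border windows are dropped."""
    return [[max(fmap[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]

def dense_softmax(features, weights, biases):
    """Fully connected layer followed by a numerically stable softmax."""
    logits = [sum(f * w for f, w in zip(features, ws)) + b
              for ws, b in zip(weights, biases)]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def forward(image, kernel, weights, biases):
    """Convolution -> ReLU -> pooling -> flatten -> dense + softmax."""
    pooled = max_pool(relu(conv2d(image, kernel)))
    features = [v for row in pooled for v in row]
    return dense_softmax(features, weights, biases)

# Toy 5x5 "image" with a vertical edge and a hand-picked edge kernel (hypothetical values).
toy_image = [[0, 0, 1, 1, 1] for _ in range(5)]
edge_kernel = [[-1, 1], [-1, 1]]
toy_weights = [[0.5, 0.0, 0.5, 0.0], [0.0, 0.0, 0.0, 0.0]]  # 2 classes x 4 features
toy_biases = [0.0, 0.0]
class_probabilities = forward(toy_image, edge_kernel, toy_weights, toy_biases)
```

            In a real CNN, the kernel and dense-layer weights are learned from the training dataset rather than chosen by hand, and many kernels and layers are stacked.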

            ILD images usually contain repeating textural patterns, and CNNs are well suited to exploiting such repetitive patterns [14]. Therefore, CNNs are widely used for the analysis of HRCT images in ILD. CNNs divide the intricate task of interpreting ILD images into several simplified tasks, including measuring organs or lesions (segmentation), identifying abnormal regions throughout the image (detection), diagnosing detected lesions (classification), and predicting pathology or prognosis on the basis of images.

            3. CLASSIFICATION

            3.1 Classifications of ILD patterns

            DL is becoming the approach of choice for classifying ILD patterns in radiological data [15]. Anthimopoulos et al. [16] were the first to design and assess a CNN model specifically for differentiating between healthy tissue and typical ILD patterns, including ground glass opacity, micronodules, consolidation, reticulation, honeycombing, and combinations of ground glass opacity and reticulation. This model exhibited a classification performance as high as 85.5%, thus demonstrating the potential of CNNs in ILD classification. The same research team subsequently introduced a CNN architecture that captures textural variations inherent to ILD patterns. Compared with the previous results, this model, by leveraging transfer learning from several non-medical source databases, achieved a 2% improvement in performance [17]. Although the gain was modest, the results indicate that the training approach of a network is as important as its structural design. Transfer learning has been shown to effectively address data scarcity issues, although the ensemble and model compression steps used in this method are relatively intricate. One customized CNN architecture proposed by Huang et al. [18] has recently shown favorable benchmark performance in the classification of ILD patterns, exceeding that of most state-of-the-art models. Additionally, the researchers have further enhanced performance by using a novel two-stage transfer learning strategy that effectively transfers knowledge acquired from both the source and intermediate domains.

            Extensive research has attempted to increase the accuracy of algorithms in distinguishing ILD patterns, particularly similar patterns. For instance, Kim et al. [19] increased a CNN’s classification accuracy from 81.27% to 95.12% by increasing the number of convolutional layers. Notably, the incidence of misclassification decreased substantially in instances of pathological ambiguity, such as differentiation between normal and emphysema cases. Therefore, deeper and more sophisticated DL algorithms may improve diagnostic capabilities for ILD.

            Most previous studies have used patch-based image representations, which are generally effective for ILD classification [16, 19, 20]. Gao et al. [21] have presented a novel method that uses whole lung images as a holistic input to classify ILD patterns, and can capture visual details and spatial context that might be ignored in image patch-based characterization. That study used three attenuation ranges to detect abnormal lung patterns, thereby achieving enhanced visibility or visual separation among all six ILD disease categories.
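
            The difference between patch-based and holistic inputs can be sketched in a few lines of standard-library Python. The sketch assumes a CT slice stored as a nested list of Hounsfield unit (HU) values; the three attenuation ranges are placeholder values chosen for illustration and are not claimed to be those used in [21].

```python
def window(slice_hu, low, high):
    """Clip HU values to [low, high] and rescale them to [0, 1]."""
    span = float(high - low)
    return [[(min(max(v, low), high) - low) / span for v in row]
            for row in slice_hu]

# Hypothetical attenuation ranges in HU (illustrative only, not taken from [21]).
EXAMPLE_WINDOWS = [(-1000, -900), (-1000, 0), (-200, 200)]

def three_channel(slice_hu, windows=EXAMPLE_WINDOWS):
    """Holistic input: the whole slice rendered once per attenuation range."""
    return [window(slice_hu, low, high) for low, high in windows]

def extract_patches(image, size, stride):
    """Patch-based input: square patches taken on a regular grid."""
    patches = []
    for i in range(0, len(image) - size + 1, stride):
        for j in range(0, len(image[0]) - size + 1, stride):
            patches.append([row[j:j + size] for row in image[i:i + size]])
    return patches
```

            A patch-based classifier would label each element returned by `extract_patches` separately, whereas a holistic classifier would receive the full output of `three_channel` as one multi-channel image.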

            Radiologic assessment of ILD requires experience and expertise, and inter-observer variability is high, even among experienced radiologists. Recently, Choe et al. [22] have conducted a pioneering investigation applying content-based image retrieval (CBIR) to ILD diagnosis. In that study, the top three reference CT images with diagnostic significance were extracted from the database through comparison of the extent and distribution of disease patterns in different regions quantified by the DL algorithm. After implementation of CBIR, the results demonstrated enhanced diagnostic accuracy among radiologists with varying levels of experience, as well as improved inter-reader agreement. This method increased confidence in the final diagnosis of ILD through reliance on not only the radiologist’s perceptions and experience, but also support from AI algorithms. Additionally, rather than relying solely on radiological data as individual inputs, modern models incorporating clinical and laboratory information are being developed [23]. Mei et al. have built a model based on initial CT images and associated clinical data, thus yielding a more comprehensive algorithm for accurately classifying five types of ILD [24]. Five models were devised, and the joint CNN model achieved the highest proficiency in the classification of ILD subtypes. This model precisely predicted five ILD subtypes and demonstrated superior diagnostic performance to that of a senior thoracic radiologist and a senior pulmonologist in identifying UIP in the test set. Therefore, when clinical information and relevant CT scans are accessible, this DL system can aid clinicians in diagnosis and classification of patients with ILD.
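
            The retrieval step of a CBIR system can be sketched as a nearest-neighbor search. The following is a minimal illustration under stated assumptions: each case is summarized by a hypothetical feature vector (for example, the proportion of lung occupied by each disease pattern), and similarity is measured by Euclidean distance. The similarity measure actually used in [22] is not described here, so both choices are assumptions.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def retrieve(query, database, k=3):
    """Return the ids of the k database cases closest to the query.

    database: list of (case_id, feature_vector) pairs.
    """
    ranked = sorted(database, key=lambda item: distance(query, item[1]))
    return [case_id for case_id, _ in ranked[:k]]

# Hypothetical pattern-extent vectors (fractions of lung volume per pattern).
reference_cases = [
    ("case_A", [0.5, 0.3, 0.2]),
    ("case_B", [0.4, 0.3, 0.3]),
    ("case_C", [0.0, 0.0, 1.0]),
    ("case_D", [0.2, 0.3, 0.5]),
    ("case_E", [1.0, 0.0, 0.0]),
]
top_three = retrieve([0.5, 0.3, 0.2], reference_cases)
```

            The retrieved cases, together with their confirmed diagnoses, would then be shown to the reader as references.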

            3.2 Classification of pulmonary fibrosis

            Precise diagnosis of IPF, a chronic and progressive ILD, is crucial to facilitate timely commencement of antifibrotic therapy and, when applicable, enrollment in clinical trials. A confident diagnosis of IPF may be made in the correct clinical context when the CT shows a pattern of definite or probable UIP [25, 26]. However, radiological evaluation of IPF remains challenging, primarily because of significant inter-observer variability, even among experienced radiologists [4, 27].

            DL algorithms can apply specialized expertise to particular issues. Walsh et al. [28] were the first to conduct a case-cohort study to develop and evaluate a DL method for IPF classification based on criteria specified by two international idiopathic pulmonary fibrosis guideline statements. The algorithm achieved higher accuracy (73.3%) than most chest radiologists (70.7%) in classifying cases according to the 2011 ATS/ERS/JRS/ALAT IPF guidelines. Moreover, the algorithm was retrained on the basis of the 2018 Fleischner Society criteria for UIP and achieved performance comparable to that of thoracic radiologists. Christe et al. [29] have introduced a machine learning-assisted computer-aided detection algorithm capable of classifying IPF with accuracy similar to that of radiologists, in accordance with the 2018 Fleischner Society criteria. Shaish et al. [30] were the first to design a DL model for the classification of ILD by using histopathology as a reference standard instead of relying on radiologists’ interpretations. The researchers used virtual lung wedge resection as input to a CNN, and observed that this method achieved moderate accuracy in predicting histopathologic UIP pattern, with performance comparable to that of humans. Likewise, in a retrospective study in 198 patients with biopsy-confirmed ILD conducted by Bratt et al. [31], a DL model was used to enhance the noninvasive evaluation of atypically presenting IPF through predicting UIP histopathology from CT images. This DL model achieved superior diagnostic performance to that of visual assessment (AUC, 0.87 vs 0.80, P = 0.03) and exhibited higher reproducibility.
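
            The metrics quoted in these studies (AUC, sensitivity, specificity) can be computed from model scores and reference labels. The sketch below is a generic illustration using only the Python standard library; the function names are assumptions, and it does not reproduce any study's statistical analysis.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case (label 1)
    scores higher than a randomly chosen negative case (label 0);
    ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sensitivity_specificity(scores, labels, threshold=0.5):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)
```

            AUC summarizes performance across all thresholds, whereas sensitivity and specificity depend on the chosen threshold, which is one reason the two kinds of figures are not directly interchangeable when comparing studies.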

            Most recently, intensive studies have focused on increasing the potential of the DL model in differentiating between IPF and non-IPF on HRCT images. Yu et al. [32] have built a two-stage model integrating a multi-scale, domain knowledge-guided attention model to ensure explainability and a random forest model to increase accuracy in making the final decision. In another study, Refaee et al. [33] developed three models involving handcrafted radiomics, DL, and ensemble models for the classification of IPF and non-IPF on HRCTs. The ensemble models achieved better performance than the radiologists. Hence, the combination of DL and handcrafted radiomics models may be a promising approach for supporting radiologists in diagnosing IPF.

            The gold standard for diagnosing ILD is a dynamic and comprehensive approach involving multidisciplinary discussion (MDD). This approach emphasizes close collaboration among clinicians, radiologists, and pathologists to ensure accurate diagnosis. However, few DL studies have integrated CT images and clinical information to diagnose IPF. Furukawa et al. [34] were the first to develop a multimodal AI for differentiating IPF from other ILDs. This algorithm used CT findings and clinical data to increase the accuracy of IPF diagnosis, mirroring the way in which MDD teams arrive at a diagnosis by integrating these data. The model showed higher diagnostic agreement in IPF diagnosis (κ = 0.67) than international MDD teams (κ = 0.53) and respiratory physicians (κ = 0.41). Future multi-center research is warranted to develop a more robust algorithm. This algorithm may serve as a promising tool for IPF diagnosis by furnishing reproducible, nearly instantaneous reports with human-level accuracy.
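
            The κ statistic measures agreement beyond what chance alone would produce. The sketch below computes unweighted Cohen's kappa for two sets of categorical diagnoses; this variant is an assumption made for illustration, because the exact agreement statistic used in [34] is not specified here.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of labels.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each rater's
    label frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(rater_a)
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)

# Hypothetical diagnoses for four cases (illustrative labels only).
model_dx = ["IPF", "IPF", "non-IPF", "non-IPF"]
reference_dx = ["IPF", "non-IPF", "non-IPF", "non-IPF"]
kappa = cohen_kappa(model_dx, reference_dx)
```

            In this toy example the raters agree on three of four cases (p_o = 0.75), but half of that agreement is expected by chance (p_e = 0.5), which gives κ = 0.5.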

            4. PREDICTING ILD PROGNOSIS

            Given the unpredictability of progression and the short median survival (2–5 years), identifying patients with ILD who exhibit rapid disease progression is crucial. Nevertheless, predicting outcomes for individual patients poses a formidable challenge.

            Most current guidelines suggest that pulmonary function tests and chest HRCT are essential for ILD patient follow-up. Early studies proposed several multidimensional indexes for the initial stratification of patients with IPF according to possible prognosis. The Composite Physiologic Index has a straightforward structure composed primarily of spirometric volumes and the diffusing capacity of the lung for carbon monoxide [35]. Ley et al. [36] have introduced a gender, age, physiology (GAP) model to predict mortality in people with IPF, which incorporates four common variables: sex, age, and two lung physiology variables (forced vital capacity and diffusing capacity of the lung for carbon monoxide). Over the past decade, visual scoring has been the most frequently used approach for evaluating either overall disease status or the extent of specific CT patterns. Nevertheless, the primary constraint of visual scoring lies in the considerable interobserver variability [37]. Jacob et al. have reported the first evidence that ML is superior to radiologists in predicting the mortality of patients with IPF [38]. Kim et al. [39] have used quantitative texture-based scores to assess early changes in HRCT scans and thereby predict IPF progression. Using a threshold derived from visual confirmation, they found that structural alterations of 4% or more in paired HRCT images from patients with IPF can potentially predict a decline in lung function within 1–2 years.

            In the past several years, DL algorithms have been widely used as an important technique in evaluating ILD prognosis. For instance, Walsh et al. [40] have evaluated the prognostic performance of the DL algorithm Systematic Objective Fibrotic Imaging Analysis Algorithm (SOFIA). This algorithm has better prognostic predictive ability for individuals with progressive fibrotic lung disease than either assessments performed by expert radiologists or outcomes derived from guideline-based histologic patterns. The success of SOFIA in this context underscores the potential of DL algorithms to process complex medical data and extract valuable insights that can aid in clinical decision-making. However, although DL algorithms can provide valuable assistance, they should always be used in conjunction with medical expertise and human judgment, to ensure accurate and safe diagnoses and prognoses. In a recent study by Chassagnon et al. [41], lung shrinkage was identified through elastic registration of CT images combined with a DL classifier. This combined method was used to evaluate deterioration due to ILD in individuals with systemic sclerosis, and achieved an accuracy of 80% and 83% in depicting morphologic and functional worsening, respectively. This study fills gaps in previous longitudinal follow-up studies, which focused predominantly on the appraisal of ILD extent while disregarding the potential effect of lung shrinkage. Similarly, Si-Mohamed et al. [42] have found that the median annual lung volume loss on CT is greater in individuals with IPF than in those without (155.7 mL vs 50.7 mL, P < 0.0001), and that a relative annual CT volume loss higher than 9.4% is associated with a reduced mean survival time (2.0 years vs. 2.8 years) in patients with IPF. Nam et al. [43] have applied commercial DL software to quantify the extent of pulmonary fibrosis in chest CT scans from patients with IPF. Additionally, their assessment of the prognostic significance of the CT volumetric parameters has revealed that normal and fibrotic lung volume proportions can serve as independent predictors of overall survival after adjustment for clinical and physiological factors. Most recently, Mei et al. [24] have used two distinct time-series models, formulated by using retrospectively gathered clinical data and quantitative CT scans, to comprehensively analyze 3-year survival rates. The researchers incorporated medication use and additional therapeutic details into the patients’ clinical histories to further enhance the prediction of 3-year survival rate. A Transformer model was trained on patient data collected within 1 year, 2 years, and 3 years. The model’s sensitivity increased from 54.55% at the end of the first year to 72.73% by the end of the third year. By providing 3-year survival predictions, the model can dynamically furnish personalized insights regarding current and prospective patient treatment outcomes.
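
            A volume-based marker such as relative annual CT volume loss can be computed from two segmented lung volumes and the time between scans. The sketch below assumes simple linear annualization relative to the baseline volume; this definition is an assumption for illustration and may differ from the one used in [42]. Only the 9.4% cut-off is taken from the text above.

```python
def relative_annual_volume_loss(baseline_ml, followup_ml, years):
    """Percentage of baseline lung volume lost per year (linear annualization)."""
    if baseline_ml <= 0 or years <= 0:
        raise ValueError("baseline volume and interval must be positive")
    return (baseline_ml - followup_ml) / baseline_ml / years * 100.0

def exceeds_cutoff(loss_percent, cutoff_percent=9.4):
    """True when the annual loss is above the cut-off reported in [42]."""
    return loss_percent > cutoff_percent

# Hypothetical volumes in mL from two CT scans taken one year apart.
loss = relative_annual_volume_loss(4000.0, 3600.0, 1.0)
high_risk = exceeds_cutoff(loss)
```

            In practice the lung volumes would come from an automated segmentation of each CT scan, so the reliability of the marker depends on the consistency of that segmentation and of the inspiration level between scans.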

            5. DISCUSSION AND PERSPECTIVES

            ILD comprises a diverse spectrum of conditions that are major causes of morbidity and mortality. Tables 2 and 3 summarize the literature applying CNN techniques in ILD classification and prognosis evaluation on HRCT. Numerous studies applying DL to ILD have reported superior performance to conventional techniques or better performance than radiologists in diagnosing IPF or predicting ILD prognosis. However, several challenges persist in this field. First, developing DL models for ILD with high accuracy requires a substantial image sample size [46]. Normal lungs and various ILD patterns may exhibit similar appearance, and the same ILD pattern can present significant variations across different participants. The volume of training data plays a crucial role in enhancing the precision and reliability of DL algorithms. Moreover, the development of public databases could expand the use of CBIR, to enable applications providing ILD diagnostic assistance [22]. Nonetheless, acquiring a sufficient quantity of medical images to train these algorithms poses several challenges. Unlike the everyday images in mainstream image analysis datasets such as ImageNet, on which architectures such as AlexNet, GoogLeNet, and VGGNet were developed, high-quality radiological images are both difficult and costly to obtain, thus posing a substantial bottleneck in the field of medical image analysis.

            Table 2 |

            Applications of DL in classifying ILD patterns.

            Application | Authors | Year | Data source | Model and method | Key findings
            Classification of ILD patterns and subtypes | Anthimopoulos et al. [16] | 2016 | HUG database + proprietary | A CNN consists of five convolutional layers with 2×2 kernels and LeakyReLU activations, followed by just one average pooling, with the size equal to that of the final feature maps and three dense layers. | Pattern-sensitivity ranged from 69% (honeycombing) to 99% (consolidation).
            Classification of ILD patterns and subtypes | Christodoulidis et al. [17] | 2017 | HUG database + proprietary | Multi-source transfer learning is used. | The method improved performance by an absolute 2% over the previous performance (0.8557) of the same CNN in [16].
            Classification of ILD patterns and subtypes | Kim et al. [19] | 2018 | Proprietary | A CNN with six learnable layers consists of four convolution layers and two fully connected layers. | As the number of convolution layers increased, the classification accuracy of the CNN improved from 81.27% to 95.12%.
            Classification of ILD patterns and subtypes | Wang et al. [20] | 2018 | HUG database | A multi-scale and rotation-invariant convolutional neural network is used. | All tissue categories achieved >85% classification rates.
            Classification of ILD patterns and subtypes | Gao et al. [21] | 2018 | HUG database | One algorithm uses the entire image as a holistic input. | In the holistic image classification, the overall accuracy was 68.6%.
            Classification of ILD patterns and subtypes | Huang et al. [18] | 2020 | HUG database | One new CNN architecture with a novel two-stage transfer learning strategy is used. | The performance was improved by the proposed two-stage transfer learning method.
            Classification of ILD patterns and subtypes | Choe et al. [22] | 2022 | Proprietary | This CBIR system for chest CT images uses DL. | Diagnostic accuracy improved in all readers after application of CBIR (before vs after CBIR, 46.1% vs 60.9%, respectively).
            Classification of ILD patterns and subtypes | Mei et al. [24] | 2023 | Proprietary | A CNN model is built via transfer learning by using pre-trained weights from a RadImageNet CNN Inception-ResNet-V2 (IRV2). | The joint model outperformed a senior thoracic radiologist and a senior pulmonologist in diagnosing UIP. It also performed as well as all human readers in sensitivity in diagnosing CHP, sarcoidosis, NSIP, and other ILD.
            Classification of pulmonary fibrosis | Walsh et al. [28] | 2018 | Proprietary | One DL algorithm developed using TensorFlow on a 3XS DL G10 computer is used. | The model achieved greater accuracy (73.3%) than most thoracic radiologists (70.7%).
            Classification of pulmonary fibrosis | Christe et al. [29] | 2019 | Proprietary | The INTACT system was designed by biomedical engineers and trained by chest radiologists and pulmonologists. | Reader 1, reader 2, and INTACT achieved similar accuracy for classifying pulmonary fibrosis into the original four categories: 0.6, 0.54, and 0.56, respectively (P > 0.45).
            Classification of pulmonary fibrosis | Shaish et al. [30] | 2021 | Proprietary | Virtual lung wedge resection in patients with ILD can be used as input to a CNN. | The model achieved a sensitivity of 74% and specificity of 58% in the testing cohort.
            Classification of pulmonary fibrosis | Yu et al. [44] | 2021 | Proprietary | This time/memory-efficient IPF diagnosis model uses axial chest CT and DK. | Incorporating DK in the training of DL models increased the overall accuracy from 0.89 to 0.91 for the baseline CNN model.
            Classification of pulmonary fibrosis | Refaee et al. [33] | 2022 | Proprietary | An HCR model, a DL model, and an ensemble of HCR and DL model are used. | The ensemble models (85.3%) performed better than two radiologists and one pulmonologist (66.7%) on the external test set.
            Classification of pulmonary fibrosis | Bratt et al. [31] | 2022 | Proprietary | This DL model was trained on a heterogeneous data set of scans from multiple institutions. | The model performance was superior to that of radiologists in predicting histopathologic diagnosis (AUC, 0.87 vs 0.80, respectively).
            Classification of pulmonary fibrosis | Yu et al. [32] | 2023 | Proprietary | This two-stage model combines explainability achieved by a DL approach, and accuracy achieved by a machine learning technique. | When both high- and moderate-resolution attention were included, under certain hyperparameter settings, the model achieved the highest AUC among all experiments (AUC ± SD = 0.99 ± 0.01).

            Abbreviations: DL, deep learning; CNN, convolutional neural network; ILD, interstitial lung disease; IPF, idiopathic pulmonary fibrosis; UIP, usual interstitial pneumonia; CHP, chronic hypersensitivity pneumonitis; NSIP, nonspecific interstitial pneumonia; HUG, Hospitals of Geneva; CBIR, content-based image retrieval; HCR, handcrafted radiomics; DK, domain knowledge; AUC, area under curve; SD, standard deviation.

            Table 3 |

            Applications of DL in predicting ILD prognosis.

            Authors | Year | Objective | No. of patients | Model | Key findings
            Jacob et al. [38] | 2017 | Prediction of IPF mortality | 283 | CALIPER computer algorithm | CALIPER-derived parameters (pulmonary vessel volume and honeycombing) had greater prognostic accuracy than traditional visual CT scores.
            Walsh et al. [40] | 2022 | Prediction of progressive fibrotic lung disease | 504 | SOFIA, a deep CNN loosely based on the InceptionResNet-v2 architecture | SOFIA achieved better outcome prediction than expert radiologist evaluation or guideline-based histologic patterns.
            Chassagnon et al. [41] | 2020 | Diagnosis of lung shrinkage and functional worsening of ILD in patients with systemic sclerosis | 212 | Combination of elastic registration of CT scans with a DL classifier | The DL classifier depicted morphologic and functional worsening with an accuracy of 80% and 83%, respectively.
            Si-Mohamed et al. [42] | 2022 | Exploration of prognostic value of annual CT volume loss in IPF | 560 | Commercially available software, a U-net-based DL algorithm | A relative annual CT volume loss above 9.4% was associated with a significantly diminished mean survival time at 2.0 years versus 2.8 years in IPF.
            Nam et al. [43] | 2022 | Prediction of IPF overall survival | 161 | Fully automatic, commercial DL software | CT-Norm% and CT-Fib% were independent prognostic factors for overall survival in IPF.
            Handa et al. [45] | 2021 | Prediction of IPF prognosis | 465 | AI-based quantitative CT image analysis software | Bronchial volumes and normal lung volumes were independently associated with survival after adjustment for sex, age, and lung physiology stage of IPF.
            Mei et al. [24] | 2023 | Evaluation of 3-year survival rate | 234 | Transformer model | The model became more sensitive when more follow up information was available, increasing in sensitivity from 54.55% to 72.73% at the end of year 1 and the end of year 3, respectively.

            Abbreviations: AI, artificial intelligence; DL, deep learning; CNN, convolutional neural network; ILD, interstitial lung disease; IPF, idiopathic pulmonary fibrosis; SOFIA, systematic objective fibrotic imaging analysis algorithm; CT-Norm%, normal lung volume proportion; CT-Fib%, fibrotic lung volume proportion.

            Although several large-scale radiological imaging databases are currently available, most are specific to other conditions or body regions. Examples include MR-Net, which focuses on magnetic resonance images of the knee; MURA, designed for musculoskeletal radiographs; and Chest X-ray, which specializes in chest radiographs [47]. The HUG database, one of the few public ILD databases, contains only 128 CT studies [48]. Therefore, international efforts will be critical to construct large-scale, balanced databases specifically designed for ILD. These databases should include comprehensive information, encompassing images and clinical data, and should adhere to uniform imaging standards, such as HRCT, to ensure effective training of DL models for ILD.

            Several technical approaches can partly compensate for data scarcity, including transfer learning, data augmentation, and generative adversarial networks (GANs). For instance, transfer learning leverages knowledge already learned from extensive datasets, thereby reducing the need for large-scale data specific to the given task. Fine-tuning, a common transfer learning technique, adapts a generalized model to specific requirements and decreases the time required to develop and train new DL models. Notably, similarity between the source and target datasets is an essential prerequisite for fine-tuning [49]. Another promising technique, expected to be used increasingly in the future, relies on GANs to create synthetic medical images, potentially supplying unlimited numbers of images derived from one or more image databases [50]. Ensuring that the quality of the generated data matches that of real data is critical.
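As a minimal illustration of data augmentation, one of the techniques named above, the following Python sketch produces randomly flipped and rotated copies of a small 2D patch. The function name `augment_patch`, the toy 3 × 3 patch, and the choice of flips and 90-degree rotations are assumptions made for this example and are not taken from the cited studies. Whether such geometric transforms are appropriate depends on the task; they may suit local texture patches better than whole-lung images.

```python
import random


def augment_patch(patch, rng):
    """Return a randomly flipped and rotated copy of a square 2D patch.

    `patch` is a list of rows (hypothetical stand-in for a CT texture patch).
    The input is never modified.
    """
    out = [row[:] for row in patch]
    if rng.random() < 0.5:
        # Horizontal flip: reverse every row.
        out = [row[::-1] for row in out]
    if rng.random() < 0.5:
        # Vertical flip: reverse the order of the rows.
        out = out[::-1]
    # Rotate clockwise by 0, 90, 180, or 270 degrees.
    for _ in range(rng.randrange(4)):
        out = [list(col) for col in zip(*out[::-1])]
    return out


# Toy example: generate four augmented variants of one made-up patch.
rng = random.Random(0)  # fixed seed so the run is repeatable
patch = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
variants = [augment_patch(patch, rng) for _ in range(4)]
for variant in variants:
    print(variant)
```

Each variant contains exactly the same intensity values as the original patch, only rearranged, so the class label of the patch can reasonably be reused for the augmented copies.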

            DL algorithms are becoming progressively more efficient, more complex, and capable of executing more tasks. However, standardized methods for validation are lacking. It is increasingly acknowledged that DL algorithms should be tested on publicly available datasets before clinical deployment. Because many DL models developed for ILD diagnosis and prognosis are trained on small nonpublic datasets, a public ILD dataset would enable diverse research models to be validated on common data, thereby facilitating identification of the most effective models. Another major concern for DL algorithms is generalizability [51]. Algorithms may perform robustly on the datasets on which they were initially trained and tested, yet deteriorate markedly on new data from other sources. Consequently, developing a large and diverse dataset could enhance the training process and improve the algorithms’ generalization ability, thereby making them more reliable and effective in real-world clinical applications. Medical experts should establish multifaceted evaluation criteria to assess the clinical utility of these algorithms, because accuracy does not necessarily indicate clinical efficacy. Finally, deploying DL models at scale requires consideration of costs and energy consumption. Lightweight models may address these challenges by reducing the number of model parameters and the model complexity.
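The point that accuracy alone does not indicate clinical efficacy can be illustrated with a short Python sketch. The function name `clinical_metrics` and the cohort below are hypothetical: the labels are synthetic values chosen only to show how an imbalanced test set can hide poor sensitivity, and they do not come from any study cited in this review.

```python
import math


def clinical_metrics(y_true, y_pred):
    """Compute several complementary metrics for binary labels (1 = disease)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else math.nan

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Synthetic cohort (made up for illustration): 10 diseased and 90 healthy cases.
# The hypothetical model detects only 2 of the 10 diseased cases.
y_true = [1] * 10 + [0] * 90
y_pred = [1] * 2 + [0] * 98
metrics = clinical_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In this made-up cohort the accuracy is 0.92 although the sensitivity is only 0.20, which is why reporting several metrics side by side gives a more complete picture than accuracy alone.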

            In future work, ensemble learning may offer a promising approach for more accurate and efficient ILD management. This method combines multiple models to leverage their strengths and mitigate individual weaknesses, thus improving detection accuracy and decreasing training time [52]. It can also effectively handle imbalanced datasets, thereby increasing sensitivity to rare ILD patterns. Additionally, DL is occasionally considered a “black box” because of challenges associated with interpretability. Decoding the image features that deep neural networks use for prediction is critical for biomarker development in patients with established pulmonary fibrosis [40].
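The ensemble idea described above can be sketched with soft voting, one of the simplest ways to combine models. This is not the deep forest method of reference [52]; it is a generic illustration in which the function name `soft_vote`, the three class labels, and the probability values are all assumptions invented for the example.

```python
def soft_vote(model_probs, weights=None):
    """Average class probabilities from several models for one case.

    `model_probs` holds one probability vector per model. Returns the
    (optionally weighted) averaged vector and the index of the top class.
    """
    if not model_probs:
        raise ValueError("need at least one model output")
    n_classes = len(model_probs[0])
    if any(len(p) != n_classes for p in model_probs):
        raise ValueError("all models must score the same classes")
    if weights is None:
        weights = [1.0] * len(model_probs)
    if len(weights) != len(model_probs) or sum(weights) <= 0:
        raise ValueError("weights must match the models and sum to a positive value")
    total = sum(weights)
    averaged = [
        sum(w * p[c] for w, p in zip(weights, model_probs)) / total
        for c in range(n_classes)
    ]
    best = max(range(n_classes), key=averaged.__getitem__)
    return averaged, best


# Hypothetical outputs of three models for a single CT study.
classes = ["UIP", "NSIP", "other"]  # illustrative labels only
outputs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
]
averaged, best = soft_vote(outputs)
print(classes[best], [round(p, 3) for p in averaged])
```

Passing `weights` lets a better-validated model count for more in the average; how such weights should be chosen is itself a validation question and is left open here.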

            In conclusion, ILD is a clinically significant and difficult-to-manage problem, and DL offers distinctive advantages in diagnosing ILD and predicting its prognosis. Further research should focus on developing high-performance DL architectures that can be deployed on any computer workstation and made available to non-academic centers. Prospective studies validating the clinical relevance of these tools are warranted before their use in routine clinical practice.

            CONFLICT OF INTEREST

            The authors declare no conflicts of interest.

            References

            1. Raghu G, Weycker D, Edelsberg J, Bradford WZ, Oster G. Incidence and prevalence of idiopathic pulmonary fibrosis. Am J Respir Crit Care Med. 2006. Vol. 174:810–16. PMID: 16809633. doi: 10.1164/rccm.200602-163OC

            2. Watadani T, Sakai F, Johkoh T, Noma S, Akira M, et al. Interobserver variability in the CT assessment of honeycombing in the lungs. Radiology. 2013. Vol. 266:936–44. PMID: 23220902. doi: 10.1148/radiol.12112516

            3. Dove EP, Olson AL, Glassberg MK. Trends in idiopathic pulmonary fibrosis–related mortality in the United States: 2000–2017. Am J Respir Crit Care Med. 2019. Vol. 200:929–31. PMID: 31225965. doi: 10.1164/rccm.201905-0958LE

            4. Walsh SLF, Calandriello L, Sverzellati N, Wells AU, Hansell DM. Interobserver agreement for the ATS/ERS/JRS/ALAT criteria for a UIP pattern on CT. Thorax. 2016. Vol. 71:45–51. PMID: 26585524. doi: 10.1136/thoraxjnl-2015-207252

            5. Wuyts LL, Camerlinck M, De Surgeloose D, Vermeiren L, Ceulemans D, et al. Comparison between the ATS/ERS/JRS/ALAT criteria of 2011 and 2018 for usual interstitial pneumonia on HRCT: a cross-sectional study. Br J Radiol. 2021. Vol. 94:20201159. PMID: 33539231. doi: 10.1259/bjr.20201159

            6. Hosny A, Parmar C, Quackenbush J, Schwartz LH, Aerts HJWL. Artificial intelligence in radiology. Nat Rev Cancer. 2018. Vol. 18:500–10. PMID: 29777175. doi: 10.1038/s41568-018-0016-5

            7. Dodge S, Karam L. A study and comparison of human and deep learning recognition performance under visual distortions. Proceedings of the 2017 26th International Conference on Computer Communication and Networks (ICCCN); New York: IEEE. 2017. https://www.webofscience.com/wos/alldb/full-record/WOS:000463806000105. Last accessed on 27 Aug 2023. doi: 10.1109/ICCCN.2017.8038465

            8. Lecun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proc IEEE. 1998. Vol. 86:2278–324. doi: 10.1109/5.726791

            9. Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Commun ACM. 2017. Vol. 60:84–90. doi: 10.1145/3065386

            10. Chan J, Auffermann WF. Artificial intelligence in the imaging of diffuse lung disease. Radiol Clin North Am. 2022. Vol. 60:1033–40. PMID: 36202474. doi: 10.1016/j.rcl.2022.06.014

            11. Yoo H, Kim KH, Singh R, Digumarthy SR, Kalra MK. Validation of a deep learning algorithm for the detection of malignant pulmonary nodules in chest radiographs. JAMA Netw Open. 2020. Vol. 3:e2017135. PMID: 32970157. doi: 10.1001/jamanetworkopen.2020.17135

            12. Nam JG, Park S, Hwang EJ, Lee JH, Jin K-N, et al. Development and validation of deep learning-based automatic detection algorithm for malignant pulmonary nodules on chest radiographs. Radiology. 2019. Vol. 290:218–28. PMID: 30251934. doi: 10.1148/radiol.2018180237

            13. Homayounieh F, Digumarthy S, Ebrahimian S, Rueckel J, Hoppe BF, et al. An artificial intelligence-based chest X-ray model on human nodule detection accuracy from a multicenter study. JAMA Netw Open. 2021. Vol. 4:e2141096. PMID: 34964851. doi: 10.1001/jamanetworkopen.2021.41096

            14. Soffer S, Ben-Cohen A, Shimon O, Amitai MM, Greenspan H, et al. Convolutional neural networks for radiologic images: a radiologist’s guide. Radiology. 2019. Vol. 290:590–606. PMID: 30694159. doi: 10.1148/radiol.2018180547

            15. Mazurowski MA, Buda M, Saha A, Bashir MR. Deep learning in radiology: an overview of the concepts and a survey of the state of the art with focus on MRI. J Magn Reson Imaging. 2019. Vol. 49:939–54. PMID: 30575178. doi: 10.1002/jmri.26534

            16. Anthimopoulos M, Christodoulidis S, Ebner L, Christe A, Mougiakakou S. Lung pattern classification for interstitial lung diseases using a deep convolutional neural network. IEEE Trans Med Imaging. 2016. Vol. 35:1207–16. PMID: 26955021. doi: 10.1109/TMI.2016.2535865

            17. Christodoulidis S, Anthimopoulos M, Ebner L, Christe A, Mougiakakou S. Multisource transfer learning with convolutional neural networks for lung pattern analysis. IEEE J Biomed Health Inform. 2017. Vol. 21:76–84. PMID: 28114048. doi: 10.1109/JBHI.2016.2636929

            18. Huang S, Lee F, Miao R, Si Q, Lu C, et al. A deep convolutional neural network architecture for interstitial lung disease pattern classification. Med Biol Eng Comput. 2020. Vol. 58:725–37. PMID: 31965407. doi: 10.1007/s11517-019-02111-w

            19. Kim GB, Jung K-H, Lee Y, Kim HJ, Kim N, et al. Comparison of shallow and deep learning methods on classifying the regional pattern of diffuse lung disease. J Digit Imaging. 2018. Vol. 31:415–24. PMID: 29043528. doi: 10.1007/s10278-017-0028-9

            20. Wang Q, Zheng Y, Yang G, Jin W, Chen X, Yin Y. Multiscale rotation-invariant convolutional neural networks for lung texture classification. IEEE J Biomed Health Inform. 2018. Vol. 22:184–95. doi: 10.1109/JBHI.2017.2685586

            21. Gao M, Bagci U, Lu L, Wu A, Buty M, et al. Holistic classification of CT attenuation patterns for interstitial lung diseases via deep convolutional neural networks. Comput Methods Biomech Biomed Eng Imaging Vis. 2018. Vol. 6:1–6. PMID: 29623248. doi: 10.1080/21681163.2015.1124249

            22. Choe J, Hwang HJ, Seo JB, Lee SM, Yun J, et al. Content-based image retrieval by using deep learning for interstitial lung disease diagnosis with chest CT. Radiology. 2022. Vol. 302:187–97. PMID: 34636634. doi: 10.1148/radiol.2021204164

            23. Koo CW, Williams JM, Liu G, Panda A, Patel PP, et al. Quantitative CT and machine learning classification of fibrotic interstitial lung diseases. Eur Radiol. 2022. Vol. 32:8152–61. PMID: 35678861. doi: 10.1007/s00330-022-08875-4

            24. Mei X, Liu Z, Singh A, Lange M, Boddu P, et al. Interstitial lung disease diagnosis and prognosis using an AI system integrating longitudinal data. Nat Commun. 2023. Vol. 14:2272. PMID: 37080956. doi: 10.1038/s41467-023-37720-5

            25. Lynch DA, Sverzellati N, Travis WD, Brown KK, Colby TV, et al. Diagnostic criteria for idiopathic pulmonary fibrosis: a Fleischner Society white paper. Lancet Respir Med. 2018. Vol. 6:138–53. PMID: 29154106. doi: 10.1016/S2213-2600(17)30433-2

            26. Gruden JF. CT in idiopathic pulmonary fibrosis: diagnosis and beyond. Am J Roentgenol. 2016. Vol. 206:495–507. PMID: 26901005. doi: 10.2214/AJR.15.15674

            27. Walsh SLF, Wells AU, Desai SR, Poletti V, Piciucchi S, et al. Multicentre evaluation of multidisciplinary team meeting agreement on diagnosis in diffuse parenchymal lung disease: a case-cohort study. Lancet Respir Med. 2016. Vol. 4:557–65. PMID: 27180021. doi: 10.1016/S2213-2600(16)30033-9

            28. Walsh SLF, Calandriello L, Silva M, Sverzellati N. Deep learning for classifying fibrotic lung disease on high-resolution computed tomography: a case-cohort study. Lancet Respir Med. 2018. Vol. 6:837–45. PMID: 30232049. doi: 10.1016/S2213-2600(18)30286-8

            29. Christe A, Peters AA, Drakopoulos D, et al. Computer-aided diagnosis of pulmonary fibrosis using deep learning and CT images. Invest Radiol. 2019. Vol. 54:627–32. PMID: 31483764. doi: 10.1097/RLI.0000000000000574

            30. Shaish H, Ahmed FS, Lederer D, D’Souza B, Armenta P, et al. Deep learning of computed tomography virtual wedge resection for prediction of histologic usual interstitial pneumonitis. Ann Am Thorac Soc. 2021. Vol. 18:51–9. PMID: 32857594. doi: 10.1513/AnnalsATS.202001-068OC

            31. Bratt A, Williams JM, Liu G, Panda A, Patel PP, et al. Predicting usual interstitial pneumonia histopathology from chest CT imaging with deep learning. Chest. 2022. Vol. 162:815–23. PMID: 35405110. doi: 10.1016/j.chest.2021.03.044

            32. Yu W, Zhou H, Choi Y, Goldin JG, Teng P, et al. Multi-scale, domain knowledge-guided attention plus random forest: a two-stage deep learning-based multi-scale guided attention models to diagnose idiopathic pulmonary fibrosis from computed tomography images. Med Phys. 2023. Vol. 50:894–905. PMID: 36254789. doi: 10.1002/mp.16053

            33. Refaee T, Salahuddin Z, Frix A-N, Yan C, Wu G, et al. Diagnosis of idiopathic pulmonary fibrosis in high-resolution computed tomography scans using a combination of handcrafted radiomics and deep learning. Front Med (Lausanne). 2022. Vol. 9:915243. PMID: 35814761. doi: 10.3389/fmed.2022.915243

            34. Furukawa T, Oyama S, Yokota H, Kondoh Y, Kataoka K, et al. A comprehensible machine learning tool to differentially diagnose idiopathic pulmonary fibrosis from other chronic interstitial lung diseases. Respirology. 2022. Vol. 27:739–46. PMID: 35697345. doi: 10.1111/resp.14310

            35. Wells AU, Desai SR, Rubens MB, Goh NSL, Cramer D, et al. Idiopathic pulmonary fibrosis – a composite physiologic index derived from disease extent observed by computed tomography. Am J Respir Crit Care Med. 2003. Vol. 167:962–69. PMID: 12663338. doi: 10.1164/rccm.2111053

            36. Ley B, Ryerson CJ, Vittinghoff E, Ryu JH, Tomassetti S, et al. A multidimensional index and staging system for idiopathic pulmonary fibrosis. Ann Intern Med. 2012. Vol. 156:684–91. PMID: 22586007. doi: 10.7326/0003-4819-156-10-201205150-00004

            37. Sverzellati N, Brillet P-Y. When Deep Blue first defeated Kasparov: is a machine stronger than a radiologist at predicting prognosis in idiopathic pulmonary fibrosis? Eur Respir J. 2017. Vol. 49:1602144. PMID: 28122857. doi: 10.1183/13993003.02144-2016

            38. Jacob J, Bartholmai BJ, Rajagopalan S, Kokosi M, Nair A, et al. Mortality prediction in idiopathic pulmonary fibrosis: evaluation of computer-based CT analysis with conventional severity measures. Eur Respir J. 2017. Vol. 49:1601011. PMID: 27811068. doi: 10.1183/13993003.01011-2016

            39. Kim GHJ, Weigt SS, Belperio JA, Brown MS, Shi Y, et al. Prediction of idiopathic pulmonary fibrosis progression using early quantitative changes on CT imaging for a short term of clinical 18-24-month follow-ups. Eur Radiol. 2020. Vol. 30:726–34. PMID: 31451973. doi: 10.1007/s00330-019-06402-6

            40. Walsh SLF, Mackintosh JA, Calandriello L, Silva M, Sverzellati N, et al. Deep learning-based outcome prediction in progressive fibrotic lung disease using high-resolution computed tomography. Am J Respir Crit Care Med. 2022. Vol. 206:883–91. PMID: 35696341. doi: 10.1164/rccm.202112-2684OC

            41. Chassagnon G, Vakalopoulou M, Regent A, Sahasrabudhe M, Marini R, et al. Elastic registration-driven deep learning for longitudinal assessment of systemic sclerosis interstitial lung disease at CT. Radiology. 2021. Vol. 298:189–98. PMID: 33078999. doi: 10.1148/radiol.2020200319

            42. Si-Mohamed SA, Nasser M, Colevray M, Nempont O, Lartaud PJ, et al. Automatic quantitative computed tomography measurement of longitudinal lung volume loss in interstitial lung diseases. Eur Radiol. 2022. Vol. 32:4292–303. PMID: 35029730. doi: 10.1007/s00330-021-08482-9

            43. Nam JG, Choi Y, Lee S-M, Yoon SH, Goo JM, et al. Prognostic value of deep learning-based fibrosis quantification on chest CT in idiopathic pulmonary fibrosis. Eur Radiol. 2023. Vol. 33:3144–155. PMID: 36928568. doi: 10.1007/s00330-023-09534-y

            44. Yu W, Zhou H, Goldin JG, Wong WK, Kim GHJ. End-to-end domain knowledge-assisted automatic diagnosis of idiopathic pulmonary fibrosis (IPF) using computed tomography (CT). Med Phys. 2021. Vol. 48:2458–67. PMID: 33547645. doi: 10.1002/mp.14754

            45. Handa T, Tanizawa K, Oguma T, Uozumi R, Watanabe K, et al. Novel artificial intelligence-based technology for chest computed tomography analysis of idiopathic pulmonary fibrosis. Ann Am Thorac Soc. 2022. Vol. 19:399–406. PMID: 34410886. doi: 10.1513/AnnalsATS.202101-044OC

            46. Trusculescu AA, Manolescu D, Tudorache E, Oancea C. Deep learning in interstitial lung disease-how long until daily practice. Eur Radiol. 2020. Vol. 30:6285–92. PMID: 32537728. doi: 10.1007/s00330-020-06986-4

            47. Moawad AW, Fuentes DT, ElBanan MG, Shalaby AS, Guccione J, et al. Artificial intelligence in diagnostic radiology: where do we stand, challenges, and opportunities. J Comput Assist Tomogr. 2022. Vol. 46:78–90. PMID: 35027520. doi: 10.1097/RCT.0000000000001247

            48. Depeursinge A, Vargas A, Platon A, Geissbuhler A, Poletti P-A, et al. Building a reference multimedia database for interstitial lung diseases. Comput Med Imaging Graph. 2012. Vol. 36:227–38. PMID: 21803548. doi: 10.1016/j.compmedimag.2011.07.003

            49. Wang G, Li W, Zuluaga MA, Prat R, Patel PA, et al. Interactive medical image segmentation using deep learning with image-specific fine tuning. IEEE Trans Med Imaging. 2018. Vol. 37:1562–73. PMID: 29969407. doi: 10.1109/TMI.2018.2791721

            50. Saboury B, Morris M, Siegel E. Future directions in artificial intelligence. Radiol Clin North Am. 2021. Vol. 59:1085–95. PMID: 34689876. doi: 10.1016/j.rcl.2021.07.008

            51. Lee G, Fujita H. Deep learning in medical image analysis: challenges and applications. Cham: Springer International Publishing. 2020. p. 1–181. doi: 10.1007/978-3-030-33128-3

            52. Sun L, Mo Z, Yan F, Xia L, Shan F, et al. Adaptive feature selection guided deep forest for COVID-19 classification with chest CT. IEEE J Biomed Health Inform. 2020. Vol. 24:2798–805. PMID: 32845849. doi: 10.1109/JBHI.2020.3019505

            Author and article information

            Journal
            Radiology Science (radsci)
            Publisher: Compuscript (Ireland)
            ISSN: 2811-5635
            Published: 03 July 2024
            Volume: 3
            Issue: 1
            Pages: 41-49
            Affiliations
            [a] Department of Radiology, The Second Xiangya Hospital, Central South University, Changsha 410011, China
            [b] Department of Radiology, The Second Affiliated Hospital of Xinjiang Medical University, Urumqi 830011, China
            [c] School of Computer Science and Engineering, Central South University, Changsha 410083, China
            [d] Paul C. Lauterbur Research Center for Biomedical Imaging, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
            [e] Clinical Research Center for Medical Imaging in Hunan Province, Changsha 410011, China
            Author notes
            *Correspondence: wei.zhao@csu.edu.cn (W. Zhao); junliu123@csu.edu.cn (J. Liu)

            WZ and JL were responsible for the review design and provided critical revision of the manuscript for important intellectual content.

            Article
            10.15212/RADSCI-2023-0011
            Copyright © 2024 The Authors.

            This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0).

            History
            Received: 28 August 2023
            Revised: 04 March 2024
            Accepted: 31 May 2024
            Page count
            Tables: 3, References: 52, Pages: 9
            Funding
            Funded by: National Natural Science Foundation of China
            Award ID: 82102157
            Funded by: Hunan Provincial Natural Science Foundation for Excellent Young Scholars
            Award ID: 2022JJ20089
            Funded by: Clinical Research Center for Medical Imaging in Hunan Province
            Award ID: 2020SK4001
            Funded by: Science and Technology Innovation Program of Hunan Province
            Award ID: 2021RC4016
            Funded by: National Natural Science Foundation of China
            Award ID: 61971451
            Funded by: National Natural Science Foundation of China
            Award ID: U22A20303
            Funded by: Innovative Province Special Construction Foundation of Hunan Province
            Award ID: 2019SK2131
            This work was supported by the National Natural Science Foundation of China (82102157, 61971451, U22A20303), Hunan Provincial Natural Science Foundation for Excellent Young Scholars (2022JJ20089), Clinical Research Center for Medical Imaging in Hunan Province (2020SK4001), Science and Technology Innovation Program of Hunan Province (2021RC4016), and Innovative Province Special Construction Foundation of Hunan Province (2019SK2131).
            Categories
            Review

            Medicine, Radiology & Imaging
            Interstitial lung disease, Deep learning, Convolution neural network, Idiopathic pulmonary fibrosis
