Sharifi Y, Danay Ashgzari M, Naseri Z, Amiri Tehranizadeh A. TIRADS-Based Artificial Intelligence Systems for Ultrasound Imaging of Thyroid Nodules: A Systematic Review

Research Article

Austin J Radiol. 2025; 12(2): 1254.

TIRADS-Based Artificial Intelligence Systems for Ultrasound Imaging of Thyroid Nodules: A Systematic Review

Sharifi Y¹*, Danay Ashgzari M², Naseri Z³ and Amiri Tehranizadeh A³

¹Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran

²Department of Computer, Faculty of Engineering, Islamic Azad University of Mashhad, Mashhad, Iran

³Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran

*Corresponding author: Sharifi Y, Department of Medical Informatics, Faculty of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran Tel: 00989153523360; Email: sharifiy@mums.ac.ir

Received: March 15, 2025 Accepted: April 02, 2025 Published: April 04, 2025

Abstract

Objective: The Thyroid Imaging Reporting and Data System (TI-RADS) is a standard terminology that classifies thyroid nodules according to their potential risk of cancer to reduce unnecessary biopsies, minimize variations in interpreting thyroid nodule images, and improve diagnostic accuracy. This study aims to comprehensively review articles that utilize AI techniques to develop decision support systems for analyzing ultrasound images of thyroid nodules, following different TIRADS guidelines.

Materials and Methods: We followed a five-step process that included identifying the key research questions, outlining the literature search strategies, establishing criteria for including and excluding studies, assessing the quality of the studies, and extracting the relevant data. We created a comprehensive search string to gather all relevant English-language studies up to January 2024 from the PubMed, Scopus, and Web of Science databases, and we also followed the PRISMA flow diagram.

Results: In this review, forty-four papers were included, and the most important properties of these papers, including dataset characteristics, AI technical specifications, results and outcome metrics, limitations, and contributions, were extracted.

Conclusion: We evaluated the technical characteristics and various aspects used in the development of artificial intelligence CAD systems based on various TI-RADS. This review demonstrates that AI advancements, especially deep learning methods, have significantly enhanced CAD systems for evaluating thyroid nodules. However, comprehensive datasets, multimodal images, and standard evaluation metrics are needed to further enhance machine learning models. Our study aims to provide researchers and physicians with a summary of the current advancements in this field to guide future investigations.

Keywords: Thyroid nodules; Artificial Intelligence; TIRADS; Ultrasonography; Computer-assisted diagnosis

Introduction

The thyroid gland, a small yet crucial endocrine organ situated in the anterior aspect of the neck, plays a significant role in the regulation of metabolism and various bodily functions [1]. Thyroid nodules are frequently encountered in clinical practice, with the majority of cases being benign. However, accurately differentiating between benign and malignant nodules to guide appropriate management strategies is paramount. The evaluation of thyroid nodules often involves a combination of clinical assessment, imaging studies such as ultrasound, and fine needle aspiration biopsy for cytological examination [2].

Fine needle aspiration (FNA) is an invasive procedure used to evaluate thyroid nodules for the presence of cancerous cells. However, it is common practice for many nodules to undergo a biopsy to identify a small percentage of cases that may be malignant. It is important to consider the potential burden that FNA procedures can place on healthcare systems, as they can result in significant costs and create stress and anxiety for patients. Therefore, it is crucial for healthcare providers to carefully evaluate the necessity of such procedures and consider alternative approaches when possible [3,4]. Thyroid ultrasound imaging plays a crucial role in the identification of thyroid nodules because of its accessibility, noninvasive nature, and cost effectiveness. This procedure allows clinicians to visualize the thyroid gland and any abnormalities present within it [5]. Furthermore, it is a safe and convenient diagnostic tool that can be easily performed in outpatient settings, making it a valuable resource for monitoring thyroid health and guiding treatment decisions [5].

The Thyroid Imaging, Reporting, and Data System (TI-RADS) was established to provide a standardized framework for categorizing thyroid nodules according to their specific characteristics associated with risk. This system aims to mitigate issues surrounding the variability and low reproducibility that often arise in the detection and interpretation of nodule features among different physicians [6]. By implementing TI-RADS, healthcare providers can ensure a more consistent and reliable approach to evaluating thyroid nodules, ultimately leading to more accurate diagnoses and treatment decisions for patients [7]. There are several variations of TIRADS, each with its own specific criteria and scoring system. These variations, such as the American College of Radiology (ACR) TIRADS [8], the Korean Society of Thyroid Radiology (KTIRADS) [9], ACE [10], ATA [11], Kwak-TIRADS [12], and the European Thyroid Imaging and Reporting System (EU-TIRADS) [13], aim to standardize the interpretation and management of thyroid nodules. AI-based approaches, such as machine learning and deep learning algorithms, have demonstrated significant potential in enhancing the accuracy and efficacy of thyroid nodule evaluation. These advancements not only help reduce variability among observers but also contribute to improving diagnostic outcomes by identifying patterns and trends that may not be easily identifiable by clinicians alone [14-18].

With advancements in medical technology, computer-aided detection (CAD) systems have been developed to assist radiologists in analyzing ultrasound images of thyroid nodules. These CAD systems can help in the early detection of suspicious nodules, leading to timely intervention and improved patient outcomes. By combining the expertise of radiologists with the efficiency and accuracy of CAD systems, healthcare professionals can reduce the subjectivity of traditional diagnostic methods and provide more precise and reliable diagnoses and treatment plans for patients with thyroid nodules [18-20].

The development of AI-driven TIRADS models, which combine computerized analysis of ultrasound images with established risk stratification systems, represents a progressive step in the field of thyroid imaging [14,21,22].

The classification of thyroid nodules via various TIRADS systems has been the subject of several studies, highlighting the importance of evaluating these systems in depth. The primary objective of this study is to explore the utilization of artificial intelligence CAD systems in the ultrasound image classification of thyroid nodules via various TIRADS systems. It is crucial to consider factors such as dataset characteristics, technical specifications of the network, evaluation metrics, results, advantages, obstacles, and limitations.

By analyzing the literature, this research aims to offer a comprehensive understanding of the role of AI techniques in the development of TIRADS-based decision support systems and to highlight the challenges and prospects that lie ahead in integrating these technologies into clinical settings. To the best of our knowledge, no systematic review has explicitly focused on this field. The insights gained from this study could serve as a valuable resource for researchers and developers looking to create more effective and efficient systems. Ultimately, the implementation of these systems could help reduce unnecessary thyroid nodule biopsies, address issues of over care, enhance the reproducibility and reliability of ultrasound diagnostics, and provide educational support for less experienced physicians.

To carry out these tasks, the following research questions are proposed to direct this systematic literature review:

• What is the best artificial intelligence technique for implementing a thyroid nodule classification system based on TI-RADS?

• What is the size of the appropriate dataset for the successful implementation of these systems?

• What are the most common neural network architectures used in these systems?

• What is the most common TI-RADS used in these systems?

• What are the most common evaluation metrics in these systems?

The remainder of this paper is structured as follows: Section 2 outlines the methodology of the systematic review. Section 3 details the findings on the various uses of TIRADS-based AI systems for ultrasound images of thyroid nodules. Finally, a discussion is presented, followed by conclusions and future work.

Materials and Methods

This systematic review involves five main steps: literature search, study selection, study quality assessment, data extraction, and analysis. Further details of each step are presented in the subsequent subsections. Importantly, the protocol for this systematic review was registered in the PROSPERO database in August 2024 (registration number: CRD42024551311) [23].

Literature Search

This study conducted a systematic review to retrieve all relevant English-language articles up to January 2024 via the PubMed, Scopus, and Web of Science databases. The search terms included "Thyroid Imaging Reporting and Data System", "Artificial Intelligence", "ultrasonography" and their related terms (Table 1). In addition, the Medical Subject Headings (MeSH) vocabulary and synonym keywords were utilized.

Study Selection

This study adheres to the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) guidelines [24]. To select relevant articles, we defined the inclusion and exclusion criteria.

The inclusion criteria were as follows:

1. Articles that have implemented an AI system based on TIRADS.

2. Articles on ultrasound imaging of the thyroid.

The exclusion criteria were as follows:

1. Nonoriginal articles such as review articles, comments, and editorials.

2. Conference abstracts and unpublished articles.

3. Articles that do not use TIRADS.

4. Articles that do not include ultrasound images.

5. Articles that evaluate existing AI systems on the basis of TIRADS.

Study Quality Assessment

The included studies were subjected to a quality assessment process to evaluate the credibility and strength of the articles. We used a modified quality assessment with 13 questions and three options ("Yes" = 1, "Partly" = 0.5, and "No" = 0), as suggested by Sharifi et al. [18] (Table 1S).

Two independent researchers with backgrounds in systematic review, machine learning, deep learning, and medical informatics evaluated the quality of the included studies and resolved any discrepancies in their findings by consulting a third researcher to reach a unanimous conclusion.

Data Extraction

Two reviewers independently evaluated and extracted data from the included articles, using a predesigned table in Microsoft Excel to ensure accuracy. A pilot test was subsequently conducted on twenty random studies to confirm the reliability of their data extraction. The calculated kappa statistic [25] indicated strong agreement in data interpretation (kappa statistic = 0.85). The following major aspects of the included studies were extracted: paper information, patient information, dataset characteristics, technical specifications, results, outcome metrics, limitations, and contributions.
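The agreement check described above can be illustrated with a minimal sketch of Cohen's kappa for two reviewers. The function name `cohens_kappa` and the example labels are hypothetical and are not taken from this review's extraction sheet; the resulting value is illustrative and is not the reported kappa of 0.85.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items with identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement expected from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical extraction decisions for ten items (illustrative only).
reviewer_1 = ["local", "local", "public", "local", "public",
              "local", "local", "public", "local", "local"]
reviewer_2 = ["local", "local", "public", "local", "local",
              "local", "local", "public", "local", "local"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
```

In this toy example the reviewers disagree on one item out of ten, so the observed agreement is 0.9, while the chance agreement is 0.62, which yields a kappa of roughly 0.74.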

Data Analysis

In this section, the major aspects of the articles that implemented a TIRADS-based AI system are analyzed.

Results

Literature Search and Study Selection

The identification of articles potentially related to TIRADS-based artificial intelligence systems for US images of thyroid nodules adhered to the PRISMA flow diagram and guidelines [24]. Figure 1 displays the PRISMA diagram for this study, which comprises four primary phases. The initial phase involved identifying relevant English-language articles via the PubMed, Scopus, and Web of Science databases until January 2024, on the basis of the search strategy outlined in Section 2-1. Initially, 618 papers were found, and after removing duplicates, 521 papers remained. In the next stage, after screening the titles and abstracts, 443 unsuitable articles were removed, leaving 88 articles for further consideration. In the third phase, we evaluated the suitability of the articles by reading the full texts and applying the inclusion and exclusion criteria outlined in Section 2-2. As a result, 44 articles were eliminated from the study. In the fourth phase, 44 articles were chosen for additional qualitative analysis.

Literature Sources

The analysis involved reviewing 44 selected articles to investigate TIRADS-based artificial intelligence research on ultrasound images of thyroid nodules.

These articles were published from 2017 to 2024. They were categorized as follows: Q1 (66%), Q2 (27%), and Q3 (7%). Summary details regarding these articles can be found in Table 2S of the Supplementary data.

Study Quality Assessment

To evaluate the quality of the selected articles, two independent researchers answered the 13 quality questions [18] for each article that implemented a TIRADS-based AI system, as previously stated. If there were any discrepancies in their evaluations, they sought advice from a third researcher. The final score of each article was then calculated by summing the scores of its answers, giving a total ranging from 0 to 13.

Furthermore, the articles were divided into three groups on the basis of their scores, namely, "low-score," "mid-score," and "high-score," by splitting the score range into three equally sized intervals: [0, 4.33), [4.33, 8.66), and [8.66, 13], respectively.

The details and results of the quality questionnaire are shown in Table 3S in the supplementary data. According to the computed scores, the articles are distributed as follows: 2% low-score, 17% mid-score, and 81% high-score categories.
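The scoring and grouping procedure described in this subsection can be sketched as follows. The function name `score_group` and the example answers are hypothetical; the sketch also assumes the interval boundaries are exactly one third and two thirds of the maximum score, which the text rounds to 4.33 and 8.66.

```python
def score_group(total_score, max_score=13.0):
    """Map a total quality score to the low-, mid-, or high-score group."""
    if not 0 <= total_score <= max_score:
        raise ValueError("score outside the valid range")
    width = max_score / 3  # three equally sized intervals
    if total_score < width:
        return "low-score"
    if total_score < 2 * width:
        return "mid-score"
    return "high-score"


# Each of the 13 answers scores 1 (Yes), 0.5 (Partly), or 0 (No).
# These answers describe a hypothetical article, not one from the review.
answers = [1, 1, 0.5, 1, 0, 1, 1, 0.5, 1, 1, 1, 0.5, 1]
total = sum(answers)
group = score_group(total)
```

Here the hypothetical article receives a total of 10.5 and therefore falls into the high-score group.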

Data Analysis

In this review, forty-four papers used TIRADS in the implementation of an AI system for analyzing ultrasound images of thyroid nodules.

Figure 1S in the supplementary data illustrates the number of articles by year until January 2024, and the most important properties of these papers, including dataset characteristics, AI technical specifications, results and outcome metrics, limitations, and contributions, are presented in Table 2, Table 3, and Table 4.

Figure 1: PRISMA flow diagram of this study.

Discussion

The primary objective of this study was to conduct a systematic review of articles related to TIRADS-based CAD systems for analyzing thyroid ultrasound images.

Among the initial 618 publications, 44 articles published up to January 2024 were selected for this study. As depicted in Figure 1S, articles in this field have been published from 2017 to 2024, with the highest number of publications in 2023 (n=14, 30.4%). All the articles that have utilized TIRADS guidelines to implement AI systems for analyzing thyroid ultrasound images have focused on classification tasks. The expected growth in this field is due to CAD systems being designed to detect suspicious nodules and differentiate between benign and malignant nodules in thyroid ultrasound images.

Comparison of Outcome Metrics

The use of various evaluation metrics in these studies makes it difficult to assess and compare the performance of the CAD systems presented. As depicted in Figure 2S in the supplementary data, the most popular metrics are accuracy (n=35, 21%), sensitivity (n=34, 20%), specificity (n=34, 20%), area under the curve (AUC) (n=23, 13%), PPV (n=18, 10%), NPV (n=18, 11%), and F score (n=8, 5%).
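For readers comparing results across studies, the threshold-based metrics listed above can all be derived from a single binary confusion matrix, as in the following minimal sketch. The function name `binary_metrics` and the counts are hypothetical and do not come from any included study; the F score is taken here to be the F1 score, and AUC is omitted because it requires continuous model scores rather than counts.

```python
def binary_metrics(tp, fp, tn, fn):
    """Common diagnostic metrics from binary confusion-matrix counts."""

    def ratio(numerator, denominator):
        # Return None when a metric is undefined (empty denominator).
        return numerator / denominator if denominator else None

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": ratio(tp, tp + fn),   # recall for malignant nodules
        "specificity": ratio(tn, tn + fp),   # recall for benign nodules
        "ppv": ratio(tp, tp + fp),           # positive predictive value
        "npv": ratio(tn, tn + fn),           # negative predictive value
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Hypothetical counts for 100 nodules (malignant = positive class).
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```

Reporting the underlying counts alongside the derived values would let readers recompute whichever metric a given study omitted.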

Figure 2: Dataset type and number of TIRADS-based AI systems for thyroid US images.

Dataset Comparison

The performance of the systems reported in these articles has been validated on datasets of different sizes and types, including local and public datasets.

Table 4S and Figure 2 present statistical information about the size and type of the datasets included in the studies. Dataset sizes ranged from a minimum of 134 images from a local source to a maximum of 31,888 images drawn from both a local dataset and a public dataset. As Figure 2 shows, only a small fraction of the studies (n=3, 4.4%) used public datasets, which makes it difficult to compare their methods.

The public ultrasound thyroid datasets used in these papers are the Thyroid Digital Image Database (TDID), provided by Pedraza et al. [74], and an open-source dataset from the scientific community [75]. Among the studies that used local datasets, 31 (68.9%) utilized one dataset (one center), and 9 (20%) utilized two to four datasets (multiple centers).

Image Preprocessing and Augmentation

Image preprocessing is a crucial step in medical image analysis. It sets the foundation for accurate image interpretation and insight extraction. This phase often includes detailed operations such as cropping and resizing the region of interest (ROI), which are essential for focusing the analysis on the most relevant aspects of the image [29,32,36-38,44].

Fundamental to image preprocessing are processes such as binarization, which effectively separates objects from their backgrounds, and normalization, which is crucial for ensuring that intensity values remain consistent across a dataset, facilitating more reliable comparisons and evaluations [30,31,53].
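As a rough illustration of the two operations named above, the following sketch applies min-max normalization and fixed-threshold binarization to a small array with NumPy. The function names, the toy pixel values, and the threshold of 0.5 are assumptions made for this example; the included studies may use other schemes, such as dataset-wide standardization or adaptive thresholding.

```python
import numpy as np


def min_max_normalize(image):
    """Rescale pixel intensities linearly to the range [0, 1]."""
    image = image.astype(np.float64)
    low, high = image.min(), image.max()
    if high == low:
        return np.zeros_like(image)  # constant image: no contrast to rescale
    return (image - low) / (high - low)


def binarize(image, threshold=0.5):
    """Boolean mask that is True where intensity exceeds the threshold."""
    return image > threshold


# Toy 3x3 grayscale patch standing in for an ultrasound image (assumption).
patch = np.array([[10, 50, 90],
                  [130, 170, 210],
                  [250, 30, 70]], dtype=np.uint8)
normalized = min_max_normalize(patch)
mask = binarize(normalized, threshold=0.5)
```

Normalizing before thresholding means the same threshold can be reused across images acquired with different gain settings, which is the consistency argument made in the paragraph above.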

In addition to these core techniques, various image filtering methodologies, such as median filters [34] and bilateral filtering [65], are used to reduce noise and enhance important features in images, thus improving the clarity and usefulness of visual data. Specialized preprocessing techniques, such as removing patient identification details and any misleading markers (artifacts) from nodules [45,64,66,68], improve image clarity, enabling more accurate analysis and diagnosis. This is often complemented by additional image enhancement techniques and advanced denoising strategies. All of these techniques aim to improve the overall quality of the images. Such comprehensive preprocessing efforts are critical, as they significantly increase the reliability and accuracy of image-based assessments across a multitude of applications. This informs decision-making processes and enhances the efficacy of subsequent analyses [68].
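To make the median filtering step mentioned above concrete, the following is a minimal NumPy sketch of a square median filter. The function name `median_filter`, the 3x3 window, and the edge-replication padding are assumptions for illustration; the cited studies do not specify these details here, and production code would more likely rely on an image-processing library.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_filter(image, size=3):
    """Replace each pixel with the median of its size x size neighbourhood."""
    if size % 2 == 0:
        raise ValueError("window size must be odd")
    pad = size // 2
    # Replicate border pixels so the output keeps the input shape.
    padded = np.pad(image, pad, mode="edge")
    windows = sliding_window_view(padded, (size, size))
    return np.median(windows, axis=(-2, -1))


# Uniform toy image with one bright outlier pixel (hypothetical data).
noisy = np.full((5, 5), 100, dtype=np.uint8)
noisy[2, 2] = 255
filtered = median_filter(noisy, size=3)
```

Because the median ignores isolated extreme values, the single bright pixel is removed while the uniform background is left unchanged.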

In addition to the initial processing steps, detection of the region of interest (ROI) is conducted to identify and isolate the relevant areas within the images. The images are subsequently resized to standardized dimensions to ensure uniformity. A manual cropping process is employed to format the images into a square shape, facilitating consistency across the dataset and enhancing the effectiveness of subsequent analysis [44,45]. The detection of regions of interest (ROIs) enhances the accuracy of diagnostic assessments by prioritizing specific areas within an image that warrant further examination. Many studies have employed manual methodologies [27,30,33,36,37,44,45,52-54,58,59,61]. These manual techniques, while traditional, often require considerable time and are subject to human error. This has prompted a shift toward more automated approaches. In contrast, a few research endeavors have embraced automated methods for detecting regions of interest (ROIs) [31,48,66]. With the continuous advancement of technology, there is a growing opportunity to improve manual techniques by integrating automated solutions. This approach has the potential to increase the efficiency and consistency of nodule detection.
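The cropping and resizing steps described above can be sketched with plain NumPy indexing. The bounding box, the function names `crop_square_roi` and `resize_nearest`, and the nearest-neighbour interpolation are assumptions made for this example; the sketch also assumes the square crop fits inside the image, and studies typically use library routines with smoother interpolation.

```python
import numpy as np


def crop_square_roi(image, top, left, height, width):
    """Crop a square region centred on a bounding box (box assumed to fit)."""
    side = max(height, width)
    center_row = top + height // 2
    center_col = left + width // 2
    # Shift the square back inside the image if it would cross a border.
    row0 = max(0, min(center_row - side // 2, image.shape[0] - side))
    col0 = max(0, min(center_col - side // 2, image.shape[1] - side))
    return image[row0:row0 + side, col0:col0 + side]


def resize_nearest(image, size):
    """Resize a 2-D image to (size, size) by nearest-neighbour sampling."""
    rows = np.arange(size) * image.shape[0] // size
    cols = np.arange(size) * image.shape[1] // size
    return image[np.ix_(rows, cols)]


# Toy 10x10 image and a hypothetical nodule bounding box of 4x6 pixels.
image = np.arange(100).reshape(10, 10)
roi = crop_square_roi(image, top=2, left=3, height=4, width=6)
resized = resize_nearest(roi, size=3)
```

Expanding the box to a square before resizing avoids distorting the aspect ratio of the nodule, which matters for shape-related TIRADS features.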

Among modern techniques, deep learning models, such as RetinaNet [32] and Faster R-CNN [42], which utilize cutting-edge frameworks, have gained significant attention because of their ability to enhance detection capabilities. In terms of segmentation, various methodologies have emerged, including StableSeg [28], U-Net++ [47], U-Net [49], and deep learning-based segmentation approaches such as SkaNet [67] and EfficientNet B6 [68].

Moreover, various tools and applications for detecting regions of interest (ROIs) and facilitating their extraction and analysis are discussed in the literature. One notable tool is ePADlite, which is a semiautomated segmentation tool integrated within the Electronic Physician Annotation Device [35]. Manual ROI tools such as ITK-SNAP [38,50], MATLAB [51], and ImageJ software (version 1.48, National Institutes of Health, USA) serve as alternative options for researchers. Additionally, LabelMe software [65] is recognized for its ability to facilitate precise annotation and segmentation tasks within this dynamic field of study.

Medical image augmentation plays a vital role in overcoming the challenges associated with limited medical image datasets. Artificially expanding the volume and diversity of training data through augmentation techniques such as rotation, flipping, zooming, mirroring, shifting, scaling and adjusting brightness or contrast [27,30,32,33,54,60] or adding Gaussian noise [30,33,45] can significantly enhance the performance and robustness of machine learning models.
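A few of the augmentation operations listed above can be sketched with NumPy as follows. The function name `augment`, the brightness factor of 1.2, and the noise standard deviation of 0.02 are illustrative assumptions rather than settings reported by the included studies, and the input is assumed to be a 2-D image already scaled to [0, 1].

```python
import numpy as np


def augment(image, rng, noise_std=0.02):
    """Return simple augmented copies of a 2-D image with values in [0, 1]."""
    noise = rng.normal(0.0, noise_std, image.shape)
    return {
        "flipped": np.fliplr(image),                     # horizontal mirror
        "rotated": np.rot90(image),                      # 90 degrees counterclockwise
        "brighter": np.clip(image * 1.2, 0.0, 1.0),      # brightness adjustment
        "noisy": np.clip(image + noise, 0.0, 1.0),       # additive Gaussian noise
    }


# Seeded generator so the example is reproducible; toy 3x4 image (assumption).
rng = np.random.default_rng(seed=0)
image = np.linspace(0.0, 1.0, 12).reshape(3, 4)
augmented = augment(image, rng)
```

Each call produces four additional training samples from one image; in practice the transformations are usually sampled randomly per training batch rather than precomputed.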

Transfer learning has become a highly effective strategy for addressing the challenges of insufficient medical imaging data and improving generalizability across various applications. Several studies have shown that using pretrained models significantly enhances the performance of machine learning frameworks, providing a strong starting point for tasks where data scarcity is a concern. Many researchers in deep learning have specifically utilized various architectures pretrained on the ImageNet dataset, demonstrating the adaptability of these models to medical imaging tasks [27-33,35-37,44,54,55,60,65,66,68]. This not only increases the prediction accuracy but also speeds up the training process, ultimately leading to better clinical outcomes.

Compared with traditional machine learning methods, contemporary deep learning methods tend to use image augmentation, nodule detection, and segmentation techniques. The ratio of these approaches used is illustrated in Figure 3.
