Peer-reviewed veterinary case report
Beyond genomics: artificial intelligence-powered diagnostics for indeterminate thyroid nodules-a systematic review and meta-analysis.
- Year:
- 2025
- Authors:
- Jassal K et al.
- Affiliation:
- Monash University Endocrine Surgery Unit · Australia
Abstract
<h4>Introduction</h4>In recent years, artificial intelligence (AI) tools have become widely studied for thyroid ultrasonography (USG) classification. The real-world applicability of these developed tools as pre-operative diagnostic aids is limited due to model overfitting, clinician trust, and a lack of gold standard surgical histology as ground truth class label. The ongoing dilemma within clinical thyroidology is surgical decision making for indeterminate thyroid nodules (ITN). Genomic sequencing classifiers (GSC) have been utilised for this purpose; however, costs and availability preclude universal adoption creating an inequity gap. We conducted this review to analyse the current evidence of AI in ITN diagnosis without the use of GSC.<h4>Methods</h4>English language articles evaluating the diagnostic accuracy of AI for ITNs were identified. A systematic search of PubMed, Google Scholar, and Scopus from inception to 18 February 2025 was performed using comprehensive search strategies incorporating MeSH headings and keywords relating to AI, indeterminate thyroid nodules, and pre-operative diagnosis. This systematic review and meta-analysis was conducted in accordance with methods recommended by the Cochrane Collaboration (PROSPERO ID CRD42023438011).<h4>Results</h4>The search strategy yielded 134 records after the removal of duplicates. A total of 20 models were presented in the seven studies included, five of which were radiological driven, one utilised natural language processing, and one focused on cytology. The pooled meta-analysis incorporated 16 area under the curve (AUC) results derived from 15 models across three studies yielding a combined estimate of 0.82 (95% CI: 0.81-0.84) indicating moderate-to-good classification performance across machine learning (ML) and deep learning (DL) architectures. However, substantial heterogeneity was observed, particularly among DL models (I² = 99.7%, pooled AUC = 0.85, 95% CI: 0.85-0.86). Minimal heterogeneity was observed among ML models (I² = 0.7%), with a pooled AUC of 0.75 (95% CI: 0.70-0.81). Meta-regression analysis performed suggests potential publication bias or systematic differences in model architectures, dataset composition, and validation methodologies.<h4>Conclusion</h4>This review demonstrated the burgeoning potential of AI to be of clinical value in surgical decision making for ITNs; however, study-developed models were unsuitable for clinical implementation based on performance alone at their current states or lacked robust independent external validation. There is substantial capacity for further development in this field.<h4>Systematic review registration</h4>https://www.crd.york.ac.uk/PROSPERO/, identifier CRD42023438011.
Find similar cases for your pet
PetCaseFinder finds other peer-reviewed reports of pets with the same symptoms, plus a plain-English summary of what was tried across them.
Search related cases →Original publication: https://europepmc.org/article/MED/40391010