The detection and resection of adenomas and sessile serrated lesions are key recommendations in colorectal cancer (CRC) prevention guidelines.1,2 Colonoscopy remains the most effective method for this purpose.3 The adenoma detection rate (ADR) is widely recognized as the main quality indicator for colonoscopy, and it should be ≥ 35% in individuals aged ≥ 45 years.4 However, there is a significant rate of missed adenomas, mainly due to failures in lesion recognition and inadequate exposure of mucosal folds, particularly as a result of suboptimal withdrawal technique.5
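As a minimal sketch of how the quality indicator above is computed, ADR can be taken as the proportion of colonoscopies in which at least one adenoma is found and then compared with the 35% benchmark. The function names and the ten-exam log below are hypothetical and serve only as illustration.

```python
def adenoma_detection_rate(adenomas_per_exam):
    """Fraction of colonoscopies in which at least one adenoma was found."""
    if not adenomas_per_exam:
        raise ValueError("need at least one colonoscopy")
    exams_with_adenoma = sum(1 for n in adenomas_per_exam if n >= 1)
    return exams_with_adenoma / len(adenomas_per_exam)


def meets_adr_benchmark(adr, threshold=0.35):
    """True when the ADR reaches the benchmark (35% by default)."""
    return adr >= threshold


# Hypothetical log: number of adenomas found in each of 10 exams (not real data).
example_counts = [0, 2, 0, 1, 0, 0, 3, 0, 1, 0]
adr = adenoma_detection_rate(example_counts)
print(f"ADR = {adr:.1%}, meets benchmark: {meets_adr_benchmark(adr)}")
```

Note that an exam with three adenomas counts once, exactly like an exam with one; this is why adenomas per colonoscopy is reported as a separate measure.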
Artificial intelligence (AI) is a technology based on artificial neural networks that mimics human cognitive functions. It has the potential to improve colonoscopy performance by enabling less experienced endoscopists to achieve results comparable to those of experts, while also mitigating performance decline in fatigued endoscopists. Current AI applications extend to the evaluation of bowel preparation, assessment of withdrawal technique, lesion size estimation, and monitoring of ulcerative colitis, although the evidence mainly supports its use in adenoma detection and lesion characterization.
Bowel preparation quality is a well-established determinant of lesion detection. In grading preparation quality, AI has demonstrated an accuracy of 93.3%, superior to that of endoscopists regardless of their level of expertise.6
AI has also proven valuable in assessing withdrawal technique by quantifying mucosal exposure during examination of colonic folds. Improvements have been most pronounced among endoscopists with low ADRs, suggesting a role for AI in supporting endoscopists with limited training.7
Lesion size estimation is clinically relevant, as any adenoma ≥ 10 mm in size is considered an advanced adenoma, requiring shorter surveillance intervals. AI has outperformed methods such as visual estimation and open biopsy forceps in determining lesion size.8
Takenaka et al.9 reported AI accuracy of 90.1%, sensitivity of 93.3%, and specificity of 87.8% for evaluating endoscopic remission in ulcerative colitis, with a kappa coefficient of 0.798 between AI and endoscopic score. In predicting histologic remission, AI results were 92.9%, 92.4%, and 93.5%, respectively, with a kappa coefficient of 0.859 between AI and biopsy.
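The kappa coefficients quoted above measure agreement beyond what chance alone would produce. A minimal sketch of Cohen's kappa for a binary reading (remission versus no remission) follows. It assumes a simple 2x2 agreement table, which may differ from the calculation used in the cited study, and the counts are hypothetical.

```python
def cohens_kappa(both_pos, ai_only_pos, ref_only_pos, both_neg):
    """Cohen's kappa for two raters (AI and a reference) on a binary outcome."""
    n = both_pos + ai_only_pos + ref_only_pos + both_neg
    if n == 0:
        raise ValueError("empty agreement table")
    observed = (both_pos + both_neg) / n          # raw agreement
    ai_pos = both_pos + ai_only_pos               # cases the AI called positive
    ref_pos = both_pos + ref_only_pos             # cases the reference called positive
    # Agreement expected by chance, from each rater's marginal frequencies.
    expected = (ai_pos * ref_pos + (n - ai_pos) * (n - ref_pos)) / n**2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical counts for 100 cases (not taken from any cited study).
kappa = cohens_kappa(both_pos=45, ai_only_pos=5, ref_only_pos=10, both_neg=40)
print(f"kappa = {kappa:.2f}")
```

In this example the raw agreement is 85%, but half of that would be expected by chance, so kappa is 0.70.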
The AI applications with the strongest evidence are computer-aided detection (CADe) and characterization (CADx). Lou et al.10 analyzed 33 studies including 27,404 patients and demonstrated a 24.2% increase in ADR and a 39.0% increase in adenomas per colonoscopy in AI-assisted groups, along with a 50.5% reduction in adenoma miss rate (AMR). Repici et al.11 observed a high ADR in controls (40.4%), yet an even higher ADR (54.8%) with CADe. In our own unpublished series of 711 patients, with all procedures performed by a high-ADR endoscopist, ADR was 50.8% in the AI group and 45.9% in the control group (p=0.20).
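Because this paragraph quotes both percentage increases and the raw ADR in each arm, it can help to convert arm-level ADRs into an absolute difference (in percentage points) and a relative change. The sketch below does so for the two pairs of ADRs reported above; the function names are illustrative, and nothing here reproduces the statistical tests of the cited studies.

```python
def absolute_difference_pp(control_pct, ai_pct):
    """Absolute difference between two rates, in percentage points."""
    return ai_pct - control_pct


def relative_change_pct(control_pct, ai_pct):
    """Change relative to the control rate, as a percentage of that rate."""
    if control_pct == 0:
        raise ValueError("control rate must be non-zero")
    return (ai_pct - control_pct) / control_pct * 100


# (control ADR %, AI-assisted ADR %) as quoted in the text above.
reported = {
    "Repici et al.": (40.4, 54.8),
    "own series": (45.9, 50.8),
}
for label, (control, ai) in reported.items():
    print(
        f"{label}: +{absolute_difference_pp(control, ai):.1f} points, "
        f"+{relative_change_pct(control, ai):.1f}% relative"
    )
```

For the Repici et al. figures this gives a gain of 14.4 percentage points, or about 35.6% relative to the control ADR; for the series of 711 patients the gain is 4.9 points, or about 10.7% relative.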
Makar et al.,12 analyzing 23,861 participants from 28 randomized trials, found a 20% increase in ADR and a 55% reduction in AMR with CADe compared with unassisted colonoscopy, with similar findings even in expert-only subgroup analyses (p<0.001). Maida et al.13 confirmed fewer missed adenomas (p<0.001) and sessile serrated lesions (p=0.007) with AI.
Comparative analyses of AI, single-observer, and dual-observer approaches indicate that both AI and dual observers achieve higher ADRs than single observers (p<0.001), with no difference between AI and dual observers.14 These findings highlight the potential of AI to serve as a “second observer,” reducing AMR. In full-day endoscopy sessions, ADR has been shown to decline significantly in afternoon procedures performed without AI (relative risk 1.18).15 However, the downward trend in ADR throughout the day seen in non-AI groups (p=0.015) has not been observed in AI-assisted groups (p=0.65),16 suggesting that AI may improve the performance of fatigued endoscopists.
Accurate lesion characterization is critical in the choice of treatment strategy once a lesion has been detected. AI can distinguish neoplastic from non-neoplastic lesions and assess potential submucosal invasion. Yoshida et al.17 showed superior accuracy for AI compared with trainees (87.8% vs. 79.0%, p=0.04), while experts maintained higher accuracy (92.0%), with sensitivity of 93.3% and specificity of 90.9%. In our preliminary series of 110 lesions, expert evaluation achieved 93.6% accuracy, 92.5% sensitivity, 96.7% specificity, 98.7% positive predictive value (PPV), and 82.9% negative predictive value (NPV), while AI achieved 81.8%, 76.3%, 96.7%, 98.5%, and 60.4%, respectively, with significant differences favoring the expert in accuracy, sensitivity, and NPV (p<0.01). Agreement between expert and AI was substantial (κ=0.75).18 In another unpublished study of ours, including 782 lesions, updated AI systems achieved 92.1% accuracy, 94.7% sensitivity, 81.8% specificity, 95.3% PPV, and 79.8% NPV.
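The five measures reported throughout this paragraph all derive from a single 2x2 table of the optical diagnosis against histology. A minimal sketch of that arithmetic follows. The counts are hypothetical: the underlying tables are not reported here, and these values were chosen only because they round to the percentages quoted above for expert evaluation in the series of 110 lesions.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return accuracy, sensitivity, specificity, PPV and NPV as fractions."""
    if min(tp + fn, tn + fp, tp + fp, tn + fn) == 0:
        raise ValueError("every row and column of the 2x2 table must be non-empty")
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # neoplastic lesions correctly called
        "specificity": tn / (tn + fp),   # non-neoplastic lesions correctly called
        "ppv": tp / (tp + fp),           # "neoplastic" calls that were right
        "npv": tn / (tn + fn),           # "non-neoplastic" calls that were right
    }


# Hypothetical counts (assumption, not study data): 80 neoplastic and
# 30 non-neoplastic lesions, 110 in total.
counts = dict(tp=74, fp=1, fn=6, tn=29)
for name, value in diagnostic_metrics(**counts).items():
    print(f"{name}: {value:.1%}")
```

With these counts, six missed neoplastic lesions lower the NPV to 82.9% even though the specificity is 96.7%, because non-neoplastic lesions are the minority in the sample.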
Estimation of CRC invasion depth is central to treatment planning, since superficial submucosal invasion (T1a < 1000 µm) is considered a curative criterion after resection. Luo et al.19 reported AI accuracy of 91.1%, sensitivity of 91.2%, specificity of 91.0%, PPV of 87.6%, and NPV of 93.7%, with results comparable to endoscopists experienced in image-enhanced endoscopy (92.6%, 88.4%, 95.5%, 93.2%, and 92.2%, respectively) and superior to endoscopic ultrasound by experts (79.3%, 79.8%, 79.0%, 67.0%, and 88.0%, respectively).
Despite its promise, AI remains operator-dependent. Changes in colonoscope positioning, made in response to the endoscopist's reading of an AI-predicted diagnosis, can alter the CADx output, which underscores the importance of endoscopist judgment. Less experienced endoscopists may passively rely on AI predictions in their diagnostic and therapeutic decisions, which could result in misjudgments. Overreliance on AI may also foster a new generation of endoscopists increasingly dependent on technology, potentially diminishing vigilance and the drive to refine lesion recognition and characterization skills. Ultimately, responsibility for diagnostic accuracy lies with the endoscopist; therefore, only well-trained professionals are adequately prepared to accept or reject AI-generated interpretations.














