Scalable Clinical Oversight of Large Language Models Via Uncertainty Triangulation

NOT YET RECRUITING

NCT07414966

Phase: NA

Sponsor: China National Center for Cardiovascular Diseases

Summary

This prospective, multi-reader, randomized crossover trial evaluates SCOUT (Scalable Clinical Oversight via Uncertainty Triangulation), a model-agnostic meta-verification framework that selectively defers unreliable large language model (LLM) predictions to clinicians by triangulating three orthogonal uncertainty signals: model heterogeneity, stochastic inconsistency, and reasoning critique. The trial assesses whether SCOUT-assisted review can reduce physician review time compared with standard manual review of AI-generated diagnoses while maintaining non-inferior diagnostic accuracy in coronary heart disease (CHD) subtyping.

Official title: Prospective Evaluation of a Model-Agnostic Meta-Verification Framework (SCOUT) for Scalable Clinical Oversight of Large Language Model Outputs in Coronary Heart Disease Diagnosis: A Multi-Reader, Randomized, Crossover Trial

Key Details

Gender

All

Age Range

18 Years - Any

Study Type

INTERVENTIONAL

Enrollment

7

Start Date

2026-02-19

Completion Date

2026-02-28

Last Updated

2026-02-17

Healthy Volunteers

No

Conditions

Coronary Heart Disease (CHD)

Interventions

DIAGNOSTIC_TEST

SCOUT-Assisted Review Workflow

SCOUT-Assisted Review (Intervention Arm): 56 cases are processed through the SCOUT framework before reaching the physician. Cases classified as low-uncertainty (D(x)=0) have the AI prediction auto-accepted without physician review. Cases classified as high-uncertainty (D(x)=1) are deferred to the physician, who reviews them with access to the main model's chain-of-thought reasoning and the meta-verification audit results. The main model is DeepSeek-V3.1 with chain-of-thought prompting.
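
The routing step in this arm can be illustrated with a minimal Python sketch. The record does not state how the three uncertainty signals are combined into D(x), so the rule used here (defer when any one signal is raised) is an assumption made purely for illustration, and the names UncertaintySignals, deferral_decision, and route_cases are hypothetical rather than part of SCOUT.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class UncertaintySignals:
    """Hypothetical per-case flags, one per signal named in the summary."""
    model_heterogeneity: bool       # assumed: other models disagree with the main model
    stochastic_inconsistency: bool  # assumed: repeated samples of the main model disagree
    reasoning_critique: bool        # assumed: a critique of the reasoning raised a concern


def deferral_decision(signals: UncertaintySignals) -> int:
    """Return D(x): 1 = defer to a physician, 0 = auto-accept the AI prediction.

    Assumption: a single raised flag is enough to defer (logical OR).
    The actual SCOUT combination rule is not given in this record.
    """
    flagged = (
        signals.model_heterogeneity
        or signals.stochastic_inconsistency
        or signals.reasoning_critique
    )
    return 1 if flagged else 0


def route_cases(cases: Dict[str, UncertaintySignals]) -> Tuple[List[str], List[str]]:
    """Split case IDs into (auto_accepted, deferred) lists, preserving input order."""
    auto_accepted: List[str] = []
    deferred: List[str] = []
    for case_id, signals in cases.items():
        if deferral_decision(signals) == 1:
            deferred.append(case_id)
        else:
            auto_accepted.append(case_id)
    return auto_accepted, deferred


if __name__ == "__main__":
    # Made-up example cases, not trial data.
    example = {
        "case-001": UncertaintySignals(False, False, False),
        "case-002": UncertaintySignals(False, True, False),
    }
    accepted, deferred = route_cases(example)
    print("auto-accepted:", accepted)
    print("deferred to physician:", deferred)
```

Under this assumed rule, only the deferred list would reach the physician, which is where any reduction in review time would come from.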

DIAGNOSTIC_TEST

Standard Manual Review Workflow

Physicians perform a full manual review of 54 cases using raw medical records with access to the AI model's predictions and reasoning, but without SCOUT uncertainty stratification or selective deferral.
