Tundra Space

Clinical Research Directory

Browse clinical research sites, groups, and studies.

Status: Not yet recruiting
ClinicalTrials.gov ID: NCT07449429

A Privacy-Preserving OCR-LLM System for Coronary Syndrome Subtyping From Admission HPI: Multicenter Validation in China and the US

Sponsor: China National Center for Cardiovascular Diseases

Summary

This study develops and validates a privacy-preserving OCR-LLM pipeline that converts admission history of present illness (HPI) records into one of four structured coronary syndrome subtypes: STEMI, NSTEMI, unstable angina, or chronic coronary syndrome. The system first extracts text from de-identified HPI images using locally deployed optical character recognition (OCR), then applies large language models (LLMs) with a fixed diagnostic prompt to generate a subtype classification with supporting evidence. Performance is evaluated in an internal validation cohort and in multiple external datasets covering heterogeneous electronic health record (EHR) templates, emergency department cases, and an English-language dataset from MIMIC-IV. A clinician usability study assesses changes in diagnostic accuracy and time with and without tool assistance.

Official title: Development and Multicenter Validation of a Privacy-Preserving OCR-LLM Pipeline for Four-Subtype Coronary Syndrome Classification Using Admission HPI Across Heterogeneous EHR Systems

Key Details

Gender: All

Age Range: 18 years and older

Study Type: Observational

Enrollment: 10

Start Date: 2026-02-28

Completion Date: 2026-03-08

Last Updated: 2026-03-04

Healthy Volunteers: Not specified

Interventions

Device: OCR-Prompt-LLM Information Extraction and Classification Workflow (OCR-Prompt-LLM)

An automated clinical data management workflow integrating Optical Character Recognition (OCR), optimized prompt engineering, and large language models (LLMs). The system processes unstructured inpatient/ED records (primarily admission history of present illness and related narrative text) to extract prespecified key clinical indicators (e.g., left ventricular ejection fraction, coronary syndrome subtype, medications) and to classify cases into prespecified coronary artery disease categories (e.g., unstable angina, STEMI, NSTEMI, chronic coronary syndrome). The workflow outputs structured fields and a classification result with supporting evidence excerpts.
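The workflow described above can be pictured as three stages: OCR, a fixed prompt sent to an LLM, and validation of the structured reply. The Python sketch below is illustrative only and is not the study's implementation. The prompt wording, the JSON reply shape, and the names `classify_hpi`, `parse_llm_reply`, `ocr`, and `llm` are all assumptions; the OCR engine and the model are passed in as plain callables so that no specific product or API is implied.

```python
import json
from dataclasses import dataclass
from typing import Callable, List

# The four categories named in the registry entry.
SUBTYPES = ("STEMI", "NSTEMI", "unstable angina", "chronic coronary syndrome")

# Hypothetical fixed prompt; the study's actual prompt is not published here.
FIXED_PROMPT = (
    "Classify the admission HPI below into exactly one of: "
    + ", ".join(SUBTYPES)
    + '. Reply with JSON of the form {"subtype": "...", "evidence": ["..."]}.'
    + "\n\nHPI:\n"
)


@dataclass
class Classification:
    subtype: str
    evidence: List[str]


def parse_llm_reply(reply: str, hpi_text: str) -> Classification:
    """Validate the model's JSON reply against the allowed subtypes."""
    data = json.loads(reply)
    subtype = data["subtype"]
    if subtype not in SUBTYPES:
        raise ValueError(f"unexpected subtype: {subtype!r}")
    # Keep only excerpts that literally occur in the OCR text, so every
    # piece of "supporting evidence" can be traced back to the record.
    evidence = [e for e in data.get("evidence", []) if e in hpi_text]
    return Classification(subtype=subtype, evidence=evidence)


def classify_hpi(
    image: bytes,
    ocr: Callable[[bytes], str],   # assumed: locally deployed OCR engine
    llm: Callable[[str], str],     # assumed: any text-in, text-out model
) -> Classification:
    hpi_text = ocr(image)
    reply = llm(FIXED_PROMPT + hpi_text)
    return parse_llm_reply(reply, hpi_text)


# Stand-in OCR and LLM functions with made-up text, for demonstration only.
def fake_ocr(_image: bytes) -> str:
    return "Chest pain for 2 hours; ECG shows ST elevation in V1-V4."


def fake_llm(_prompt: str) -> str:
    return json.dumps(
        {"subtype": "STEMI",
         "evidence": ["ST elevation in V1-V4", "troponin elevated"]}
    )


result = classify_hpi(b"", fake_ocr, fake_llm)
print(result.subtype, result.evidence)
```

In this sketch the second evidence string is dropped because it does not appear in the OCR text. Filtering evidence this way is one possible safeguard against fabricated excerpts, not something the registry entry specifies.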

Device: Manual Clinical Data Review

Standard manual process in which experienced clinicians review patient medical records and extract the same prespecified clinical indicators and coronary artery disease categories using routine clinical judgment and documentation review. This manual abstraction serves as the human benchmark for comparing diagnostic accuracy, completeness, and operational efficiency against the automated OCR-Prompt-LLM workflow.
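A comparison against the manual benchmark could be scored as in the minimal Python sketch below. It assumes the clinician-abstracted subtype is treated as the reference label for each record, and it reports overall accuracy plus per-subtype recall. The function names and the example labels are invented for illustration and are not study data; the entry does not state which metrics the investigators will report.

```python
from collections import Counter
from typing import Dict, Sequence


def accuracy(reference: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of records where the workflow matches the manual label."""
    if not reference or len(reference) != len(predicted):
        raise ValueError("label lists must be non-empty and equal in length")
    matches = sum(r == p for r, p in zip(reference, predicted))
    return matches / len(reference)


def per_class_recall(
    reference: Sequence[str], predicted: Sequence[str]
) -> Dict[str, float]:
    """For each reference subtype, the share the workflow labeled correctly."""
    totals = Counter(reference)
    hits = Counter(r for r, p in zip(reference, predicted) if r == p)
    return {label: hits[label] / totals[label] for label in totals}


# Invented example labels (not study data).
manual = ["STEMI", "NSTEMI", "unstable angina", "STEMI"]
automated = ["STEMI", "NSTEMI", "NSTEMI", "STEMI"]

print(accuracy(manual, automated))          # 0.75
print(per_class_recall(manual, automated))
```

Per-subtype recall is shown alongside accuracy because a single overall figure can hide poor performance on a less common subtype, such as the missed unstable angina case in the invented example.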