AI Scribes Enhance Feedback Quality in Medical Education
Publication Title: Ambient AI Scribes to Create Educational Feedback Notes for Medical Students: A Randomized Trial.
Summary
- Question
This study investigated the use of ambient artificial intelligence (AI) scribes to generate written feedback notes during a medical interviewing workshop for first-year medical students. The researchers aimed to evaluate whether AI-assisted workflows improve the quality of feedback narratives compared with traditional human-only documentation, and to assess the usability and cognitive load associated with the technology.
- Why it Matters
Providing high-quality written feedback is crucial for medical students to reflect on their performance and improve their clinical skills. However, educators often struggle to translate verbal observations into detailed written feedback because of time constraints and administrative burdens. This study explores how AI scribes, which capture verbal exchanges and summarize them into structured notes, could alleviate these challenges and enhance feedback quality in medical education. The findings have potential implications for improving educational practices and reducing instructors' administrative workload.
- Methods
Thirteen instructors were randomly assigned to either a control group (human-only feedback) or an intervention group (AI-assisted feedback). In the intervention group, an ambient AI scribe transcribed verbal feedback during student-instructor encounters, and a large language model summarized the transcript into draft feedback notes; instructors edited these AI-generated drafts before submitting them. Feedback quality was evaluated with the Evaluation of Feedback Captured Tool (EFeCT), and usability and task load were measured with the System Usability Scale and the NASA Task Load Index.
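The paper does not publish its pipeline, but the minimal Python sketch below mocks up the three steps described above (ambient capture, LLM summarization, instructor editing). Every function name, prompt, and sample string here is a hypothetical stand-in, not the study's actual tooling.

```python
# Illustrative sketch of the intervention workflow:
# ambient capture -> LLM summarization -> instructor editing.
# All names and strings are hypothetical placeholders.

def transcribe_encounter(audio_path: str) -> str:
    """Stand-in for the ambient scribe's speech-to-text step."""
    # A real system would send the recording to a speech-to-text service.
    return ("Instructor: You maintained good eye contact. "
            "Next time, try opening with a broader question.")

def summarize_feedback(transcript: str) -> str:
    """Stand-in for the large-language-model summarization step."""
    # A real system would prompt an LLM, e.g.:
    #   "Summarize the instructor's verbal feedback into a structured
    #    note with 'Strengths' and 'Areas for growth' sections."
    return ("Strengths: maintained eye contact.\n"
            "Areas for growth: open the interview with broader questions.")

def instructor_edit(draft: str, corrections: dict[str, str]) -> str:
    """Human-in-the-loop step: instructors revise the draft before
    submission, which is where inaccuracies get corrected."""
    for wrong, right in corrections.items():
        draft = draft.replace(wrong, right)
    return draft

if __name__ == "__main__":
    transcript = transcribe_encounter("encounter_01.wav")
    draft = summarize_feedback(transcript)
    final_note = instructor_edit(
        draft, {"broader questions": "open-ended questions"}
    )
    print(final_note)
```

Keeping the instructor's edit as the final step mirrors the study's design, in which humans review every AI-generated draft before submission.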
- Key Findings
AI-assisted feedback narratives scored higher on the EFeCT, with median scores of 3.00 (out of 5.00) for both human-edited AI narratives and unedited AI summaries, compared with 2.00 for human-only narratives. AI-generated feedback was longer and more detailed, though occasional inaccuracies (a 6.8% mischaracterization rate and a 1.7% hallucination rate) were corrected during instructor editing. The study found no significant differences in task load or usability between the AI-assisted and human-only workflows; both were rated as cognitively demanding and marginally usable.
- Implications
The use of AI scribes has the potential to improve the quality of written feedback in medical education without increasing instructor effort. By capturing and summarizing verbal feedback, AI-assisted workflows can help bridge the gap between verbal and written feedback, supporting student reflection and learning. However, human oversight remains essential to ensure accuracy and reliability, especially in sensitive educational contexts.
- Next Steps
The researchers recommend further studies to refine AI-assisted workflows, integrate the technology into streamlined educational systems, and evaluate its impact in other settings, such as bedside teaching and clinical precepting. Additionally, engaging students and educators to assess the utility and actionable value of AI-generated feedback could guide future developments.
- Funding Information
This research was not supported by any external funding; internal funding and support were provided by Yale University.
Authors
Jaideep S. Talwalkar, MD
First Author; Professor of Internal Medicine (General Internal Medicine)
Donald Wright, MD, MHS
Last Author; Instructor of Emergency Medicine