A Framework for Artificial Intelligence in Medical Education: Could I, Would I, Should I?

Online Publication Date: 15 Aug 2025
Page Range: 430 – 433
DOI: 10.4300/JGME-D-25-00009.1

Medical Education and Artificial Intelligence

Artificial intelligence (AI) is increasingly influencing medical education, with rapid developments in tools that promise to enhance how physicians teach, learn, and assess competency. While the broader literature on AI in medical education is growing, most studies focus on specific AI functions, like rapidly training novices in bronchoscopy,1 and not on generative AI (GAI)—text-based models like ChatGPT.

The Association of American Medical Colleges has offered ChatGPT-generated, then human-edited, principles for the responsible use of AI in medical education.2 Initial work in undergraduate medical education has led to proposed AI competencies.3 However, graduate medical education (GME) lags behind in pragmatic frameworks and strategies for their application. Among the existing GME-relevant literature, much of the work is theoretical and descriptive, only outlining capabilities or raising concerns. A 2024 review in Frontiers in Medicine, for example, summarizes the use of GAI in GME settings and discusses opportunities, such as prompt engineering, and risks, such as automation bias, with minimal discussion of how GME educators can practically implement GAI tools in real-world settings.4 When practical tips are provided, they are general in nature and lack actionable steps to guide day-to-day teaching.5

This Perspectives article offers a practical, theory-informed, decision-making tool—the “Could I, Would I, Should I?” framework—to help health professions education leaders and educators fill this gap by engaging proactively with GAI tools like ChatGPT.

Could I Use It?

GAI has rapidly attracted many users because it is powerful, accessible, and easy to use. Through natural language processing, GAI has learned to communicate in human languages and can readily translate between them, assisting educators in engaging more diverse learners. Unfortunately, because GAI works by predicting a likely response from statistical patterns in the data it has previously encountered, the content it produces can be vague, generic, and inaccurate. Therefore, at least for the time being, GAI may be most useful in the beginning stages of brainstorming and the middle stages of refining educational work, rather than in producing a final product. For example, GAI can initially serve as a standardized “patient” for practicing communication skills by offering generic, typical responses. In later editing stages, GAI can rapidly analyze a practice transcript to provide coaching and feedback. GAI is designed for dialogue, so iterative back-and-forth exchanges tend to elicit more satisfying results than a single request. If unsure how to write an effective prompt, you can simply ask the GAI to guide you (and, for reasons that remain unclear, you may get better results if you ask nicely). For effective prompting, we often use the user-friendly ICIO framework: Instruction, Context, Input Data, and Output Structure. In the Figure, we provide details about the framework and display an example.

Figure: Best Practices for Providing Instructions
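To make the ICIO structure concrete, we offer the minimal sketch below, which assembles such a prompt programmatically. It assumes the OpenAI Python SDK; the model name and clinical scenario are illustrative placeholders, not recommendations, and the same 4-part prompt can be pasted directly into a chat interface without any code.

```python
# A minimal sketch of an ICIO-structured prompt, assuming the OpenAI Python SDK
# (pip install openai) and an API key in the OPENAI_API_KEY environment variable.
# The model name and clinical details are illustrative placeholders.
from openai import OpenAI

client = OpenAI()

icio_prompt = """
Instruction: Role-play a standardized patient so a resident can practice
breaking bad news. Stay in character; do not volunteer feedback until asked.

Context: You are a 58-year-old patient returning to clinic for biopsy
results that show early-stage colon cancer. You are anxious and have
limited medical knowledge.

Input Data: The resident will open the conversation. Respond only to what
the resident actually says, one turn at a time.

Output Structure: Reply in plain conversational English, 2 to 4 sentences
per turn, with no medical jargon unless the resident introduces it first.
"""

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; substitute the model available to you
    messages=[
        {"role": "system", "content": icio_prompt},
        {"role": "user", "content": "Hi, thanks for coming in today."},
    ],
)
print(response.choices[0].message.content)
```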

Would I Use It?

One of the challenges facing learners in medicine is how to organize and integrate facts and ideas into the frameworks that form the building blocks of expertise. Cognitive psychologists have studied this learning process for many years, and 5 core learning strategies have been identified as critical for mastery of a topic.6 These strategies are (1) spaced retrieval, (2) elaboration, (3) reflection, (4) interleaving, and (5) generation, which form the mnemonic RE-RIG (see Table). Each of these strategies is a highly active process that requires the learner to engage with the material. Additionally, to advance from expertise to mastery of a clinical topic, clinicians require repeated exposure to multiple patient cases with feedback on their performance, so that pathophysiological knowledge becomes embedded within the mind as “illness scripts.”7 These scripts are effortlessly retrieved, allowing for highly accurate diagnoses within minutes of working with patients.

Unfortunately, GAI can remove much of the mental effort of learning by taking a learner’s input and producing a pleasing summary of complex ideas for memorization. This bypasses the struggle these learning strategies require and may lead to a regression to the mean, both elevating struggling learners by providing them with average answers to clinical problems and lowering top performers by limiting the creative insights produced by mastery of a topic.

To counter this trend, as educators, we will need to teach our learners how to use GAI in a manner that allows these cognitive strategies to flourish so that expertise can grow. Educators may benefit the most from GAI if they understand it as a “copilot” that provides feedback so that they can challenge and refine their ideas.8 In terms of “Would I use it?” we believe that GAI in medical education is most effective when combined with evidence-based learning strategies, as illustrated in the sketch following the Table.

Table: How a Learner Can Use ChatGPT as a Copilot and Leverage Its Power Synergistically With Cognitive Learning Strategies
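As one illustration of the “copilot” approach, the sketch below (again assuming the OpenAI Python SDK; the prompt wording is our own example, not drawn from the Table, and the model name is a placeholder) instructs GAI to enforce the RE-RIG strategies by quizzing the learner and withholding answers until the learner has generated one.

```python
# Illustrative sketch: using GAI to reinforce retrieval, generation, and the
# other RE-RIG strategies rather than to shortcut them. Assumes the OpenAI
# Python SDK; the prompt text and model name are placeholders, not a
# prescribed curriculum.
from openai import OpenAI

client = OpenAI()

copilot_prompt = """
You are a study copilot for an internal medicine resident learning heart
failure. Follow these rules strictly:
1. Ask me one open-ended question at a time (retrieval practice).
2. Never reveal the answer until I have committed to my own (generation).
3. After my answer, ask how it connects to a patient I have seen
   (elaboration and reflection), then give corrective feedback.
4. Mix in questions from earlier sessions on renal and pulmonary disease
   (interleaving and spacing).
Begin with your first question.
"""

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{"role": "user", "content": copilot_prompt}],
)
print(response.choices[0].message.content)
```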

Should I Use It?

Numerous ethical concerns accompany the development, implementation, and use of AI. Well-established bioethical principles, such as nonmaleficence, beneficence, justice, and fidelity,9 provide a familiar structure, conceptual clarity, and proven applicability for organizing these concerns.

Do No Harm—Nonmaleficence

Because GAI is designed more to sound correct than to be correct, AI can convincingly spread misinformation. Educators should avoid instructing learners to “find the answer” using GAI and should instead teach learners to critically appraise AI-generated content.

Beneficence

AI has the potential to improve GME’s efficiency to the detriment of its quality, by providing an alluring but inferior substitute for thoughtful teaching and active learning. In each use case, educators should ask themselves: Will AI take this lesson further, or shortcut to a less desirable destination?

Justice

GAI tends to “learn” the biases embedded in its (opaque) training data and then perpetuate those biases in its outputs. Educators should actively monitor GAI outputs for bias and encourage learners to consider how bias could affect GAI’s reliability and social impact.

Fidelity

AI systems risk compromising the confidentiality of patients’ health information, because AI companies control the data fed into their systems. Educators should warn trainees against entering protected health information into unsecured AI systems; a minimal illustrative safeguard follows.
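One way to operationalize this warning is a simple screen run over a draft prompt before it is sent anywhere. The sketch below is our own crude illustration, not a validated de-identification tool, and must supplement, never replace, institutional policy and approved software.

```python
import re

# Illustrative only: a crude screen for a few obvious identifier patterns.
# Names are NOT detected; dates, phone numbers, emails, and MRN-like digit
# runs are only approximated. Not a validated de-identification tool.
PHI_PATTERNS = {
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "phone number": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "long digit run (MRN/SSN?)": re.compile(r"\b\d{6,}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def screen_prompt(text: str) -> list[str]:
    """Return labels of identifier-like patterns found in a draft prompt."""
    return [label for label, pattern in PHI_PATTERNS.items() if pattern.search(text)]

draft = "72F, MRN 00482913, seen 3/14/2024 for chest pain; discuss teaching points."
hits = screen_prompt(draft)
if hits:
    print("Do not send; possible identifiers detected:", ", ".join(hits))
else:
    print("No obvious identifiers found; still verify against policy.")
```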

In summary, clinical educators should use a bioethics framework to thoughtfully consider any use of GAI, weighing its potential to help and harm trainees, patients, and society, so as to optimize benefits while mitigating risks.

Conclusion

AI is here to stay and will substantially impact medical education. While tools like ChatGPT can help medical educators generate large amounts of structured content quickly when prompted effectively, their predictive nature can also bypass essential learning processes and raise ethical concerns, making the role of educators more critical than ever. We propose the “Could I, Would I, Should I?” framework and associated strategies to help GME leaders make thoughtful, context-specific decisions about when and how to use GAI. To move toward deliberate integration, we recommend that GME curricula support AI literacy. These curricula should include structured prompting strategies such as ICIO; practical use that reinforces critical thinking, reflection, and deep learning; and ethical considerations. Educators should link emerging GAI competencies to established GME core competencies and offer adaptable, program-specific frameworks. Additionally, we encourage GME stakeholders to consult the American Association of Directors of Psychiatric Residency Training AI in Psychiatric Education Taskforce Report,10 which outlines potential AI applications to support informed, context-driven decisions. By deliberately shaping AI’s use, GME programs can ensure that GAI strengthens—rather than undermines—the mission to train skillful, ethical, and reflective physicians.

Copyright: 2025
Author Notes

Corresponding author: Rick Peter Fritz Wolthusen, MD, MPP, McLean Hospital, Belmont, Massachusetts, USA, rwolthusen@mgh.harvard.edu