Program Evaluation for Graduate Medical Education: Practical Approaches From the Reimagining Residency Evaluation Community of Practice

Online Publication Date: 15 May 2025
Page Range: 53 – 56
DOI: 10.4300/JGME-D-24-00492.1

Introduction

Program evaluation is a crucial component of graduate medical education (GME), as it systematically assesses educational interventions through standardized methods, feeding data back to programs for evidence-based improvements to advance training in high-quality, patient-centered care. Without this rigorous analytic approach, we cannot know how, when, and why interventions affect learning, care quality, and patient health outcomes. Program evaluation requires specialized skills and training1 that educators and administrators may not possess, and for which institutional resources may be lacking. This gap hinders evidence-driven change in GME.

Educational innovations require iterative evaluation cycles for continuous quality improvement. The perspectives of those affected should also be considered, informed by the evidence provided, to increase applied utility.2 Outcomes that are products of a deliberately iterative improvement process are an important engine for transforming the medical education system.3 Program evaluation provides evidence for an intervention’s feasibility, applicability, and effectiveness for learning at the micro (individual learner), meso (institution), and macro (system) levels, including whether the innovation contributes to the medical profession’s social contract. This can be determined by connecting patient outcomes–focused research to medical training as evidence for competence, accountability, and the public good.4

In this Perspective, American Medical Association (AMA) Reimagining Residency Initiative (RR) program evaluators provide practical considerations for GME educators as they plan evaluation in the complex educational environment of residency and fellowship. We relied on program evaluation resources5 applied in GME settings and built unique approaches to support our innovations. The information provided aims to help other educators evaluate their programs.

Reimagining Residency Program Evaluation Community of Practice

In June 2019, the AMA launched RR with a vision to “transform residency training to best address the workplace needs of our current and future health care system.”6 The initiative funded 11 teams to develop, pilot, and evaluate innovations to improve workforce training in the AMA’s GME priority areas: competency-based medical education, improving the transition from undergraduate medical education to GME, enhancing the clinical learning environment, advancing health systems science competencies (eg, systems thinking, health care delivery, health system improvement), and developing new models of culturally competent care to address social drivers of health and health equity (eg, access to nutritious food, quality housing, clean water, transportation). Each team identified and included an evaluation expert who assessed progress relative to stated goals and outcomes.

To foster collaboration and deliberate learning across these 11 teams, the AMA’s research and evaluation leadership scheduled monthly meetings to promote the sharing of expertise, provide feedback on research and evaluation approaches, work through challenges, carve out space for professional development (leading medical education experts joined some meetings), promote scholarship, and determine the extent to which results realized the desired change(s). In medical education, only some institutions have this type of expertise on staff, underscoring the need to build community. The AMA organized sessions to add value and further develop members’ expertise. This collective commitment to learning has been previously described as a community of practice (CoP).7 Over time, the group focused on advanced thinking and practices to benefit the field of medical education and patient-centered care. The group also gained perspective from discussions of program evaluation methods across institutions while managing tailored approaches for specific projects. Each of the 11 teams benefited greatly from this collective platform, which supported unique designs and growth. Fostering this type of peer community for deliberate learning and advancement is highly recommended for multi-institutional efforts.

Program evaluation requires rigor but also adaptability in cycles of data collection, harmonization or calibration, analysis, and incorporation of change. In the online supplementary data, we include the title of each grant, website address for a description of the project and progress, exemplar evaluation approaches, rationale, and resources for interested GME educators. We recommend program directors create a similar evaluation culture that strategically weaves assessments across the program, rather than creating assessments later and risking misalignment with implementation.

Key Considerations for Program Evaluation in GME Innovations

As the RR CoP developed and revised evaluation plans for innovations, key questions shaped approaches. Reviewing these considerations, along with traditional program evaluation standards such as accuracy, utility, integrity, and feasibility,8 is among the first steps in designing a program evaluation effort. We include questions here to assist other GME educators. When answers are not immediately evident, educators should develop their evaluation approach further until sufficient information is available.

Types of Change

Innovations can be improvement-oriented, with little expectation that organizational cultures will shift. This type of innovation focuses on incremental change9 within the current education system and framework. Innovations can also be transformative, shifting individual beliefs and frameworks and leading to cultural change in education systems. A local innovation may include both types of change in different facets of the program. It is important to identify the desired change, the theory behind it, and what is needed for that change to occur. To determine the type of change, questions include: Why and how will change occur in a particular context (theory of change)? What are the observable behaviors of residents and faculty today, and what behaviors are expected after the innovation? How different are these behaviors based on measures and metrics? What policies and resources are in place to facilitate learning and adopting the innovation versus staying with the status quo?

Systems Perspective

Understanding a system’s beliefs and behaviors will determine how GME program leaders design and evaluate effective innovations, frame outcomes, and promote change. It is also important to consider other influential systems (eg, health care) within which residents practice, as well as accreditation and professional organizations’ influence on graduate education and clinical practice. National policies also influence the environment within which the innovation occurs. To achieve an effective systems perspective, questions include: What practices and processes of your educational system are working and not working? What are the shared mental models of your organizational culture? What individual mental models exist that could be barriers or facilitators to your innovation? What other systems influence the prospect of change, and how do they do this?

Program Evaluation Designs

Program evaluation should aim to assess both the adoption of the initiative (process measures) and its effectiveness at creating and sustaining change (outcome measures). To provide framing for a tailored program evaluation approach, questions include: What is the explicit theoretical framework (eg, developmental, behavioral, social constructionist) guiding the innovation, and how does it align with all program components? Has a culture of evaluation been created so that all invested parties are committed to what will be assessed and changed as a result? Is there a process in place for rapid cycles of evaluation3 aimed at the desired change, with results fed back into the system in an iterative fashion? Are there processes in place for collaboration in which all invested parties have an opportunity for meaningful involvement?

Social Return on Investment

Social return on investment (SROI) is a conceptual approach for calculating the social, health, environmental, and economic value of program interventions for invested parties.10 This calculation is useful when communicating the return realized from education innovations in terms more familiar to those interested in financial and cost-benefit outcomes. Steps include: (1) establishing scope and identifying key program beneficiaries11; (2) mapping outcomes to beneficiaries using an identified theory of change or outcomes framework; (3) identifying evidence for outcomes and assigning value; (4) establishing what would have occurred without the intervention and what happened due to the intervention; (5) calculating SROI with partners by assigning financial values to nonfinancial changes; and (6) completing the cost-benefit calculation, including sensitivity analysis.
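
To make steps 5 and 6 concrete, the following minimal sketch in Python illustrates how an SROI ratio might be computed. All outcome names, financial proxy values, deadweight estimates, the discount rate, and the investment figure are hypothetical assumptions for illustration only; they are not drawn from the RR projects, and local evaluators would substitute their own evidence-based values.

# Minimal, hypothetical SROI sketch: every outcome, proxy value, and rate
# below is an illustrative assumption, not data from the RR projects.

# Steps 2-3: outcomes mapped to beneficiaries, each with a financial proxy
# per unit, the number of units observed, and an estimated "deadweight"
# (the share of change that would have occurred anyway; step 4).
outcomes = [
    # (description, financial proxy per unit in $, units, deadweight fraction)
    ("Reduced resident attrition (cost of replacing one resident)", 90_000, 2, 0.25),
    ("Avoidable readmissions prevented (cost per readmission)", 15_000, 10, 0.40),
    ("Faculty time saved by streamlined assessment (per hour)", 120, 500, 0.20),
]

investment = 250_000        # total program cost (assumed)
discount_rate = 0.035       # annual discount rate (assumed)
years = 3                   # evaluation horizon in years (assumed)


def annual_value(items):
    """Value created per year after removing the counterfactual (deadweight)."""
    return sum(proxy * units * (1 - deadweight)
               for _, proxy, units, deadweight in items)


def sroi_ratio(items, cost, rate, horizon):
    """Steps 5-6: discount the yearly value over the horizon and divide by cost."""
    yearly = annual_value(items)
    present_value = sum(yearly / (1 + rate) ** t for t in range(1, horizon + 1))
    return present_value / cost


base = sroi_ratio(outcomes, investment, discount_rate, years)
print(f"Base-case SROI: ${base:.2f} of social value per $1 invested")

# Simple sensitivity analysis: vary every proxy value by +/-25% to show how
# strongly the ratio depends on the assigned financial values.
for label, factor in [("low", 0.75), ("high", 1.25)]:
    varied = [(d, proxy * factor, units, dw) for d, proxy, units, dw in outcomes]
    print(f"{label}-proxy scenario: {sroi_ratio(varied, investment, discount_rate, years):.2f}")

The sensitivity analysis matters because SROI results hinge on the financial proxies chosen; reporting a range alongside the base case keeps the communication honest with partners interested in cost-benefit outcomes.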

Summary Considerations

The diverse evaluation methods used by the AMA RR Evaluation CoP (online supplementary data) reflect the importance of carefully selecting evaluation methods specific to the goals and objectives of the innovation.12 Employing a variety of methods helps ensure that real-life intricacies, unexpected changes, and local contexts are considered in real time. By using evidence-informed flexibility in program evaluation, it is possible to adapt to changing environments and circumstances, which increases the potential for others to use effective innovations in different contexts.

The CoP environment provided a longitudinal opportunity to build relationships and expertise focused on the dynamic nature of GME program implementation. This powerful community experience both enhances and extends beyond the technical methods shared in the online supplementary data. We strongly recommend that program directors, GME educators, and evaluators look for ways to create, facilitate, and maintain a collaborative, iterative, continuous learning process when designing their own residency or fellowship programs.

Copyright: 2025

Author Notes

Corresponding author: Judee Richardson, PhD, MA, American Medical Association, Chicago, Illinois, USA, judee.richardson@ama-assn.org