Use of the Quality Improvement (QI) Knowledge Application Tool in Assessing Pediatric Resident QI Education

Eric W. Glissmeyer, MD,
Sonja I. Ziniel, PhD, and
James Moses, MD, MPH
Online Publication Date: 01 Jun 2014
Page Range: 284 – 291
DOI: 10.4300/JGME-D-13-00221.1

Abstract

Background

Assessing the effectiveness of quality improvement curricula is important to improving this area of resident education.

Objective

To assess the ability of the Quality Improvement Knowledge Application Tool (QIKAT) to differentiate between residents who were provided instruction in QI and those who were not, when scored by individuals not involved in designing the QIKAT, its scoring rubric, or QI curriculum instruction.

Methods

The QIKAT and a 9-item self-assessment of QI proficiency were administered to an intervention and a control group. The intervention was a longitudinal curriculum consisting of 8 hours of didactic QI training and 6 workshops providing just-in-time training for resident QI projects. Two uninvolved faculty scored the QIKAT.

Results

A total of 33 residents in the intervention group and 27 in the control group completed the baseline and postcurriculum QIKAT and self-assessment. QIKAT mean intervention group scores were significantly higher than mean control group scores postcurriculum (P < .001). Absolute QIKAT differences were small (of 15 possible points, the intervention group improved from a mean score of 12.8 to 13.2). Interrater agreement as measured by kappa test was low (0.09). Baseline self-assessment showed no differences, and after instruction, the intervention group felt more proficient in QI knowledge than controls in 4 of 9 domains tested.

Conclusions

The QIKAT detected a statistically significant improvement postintervention, but the absolute differences were small. Self-reported gain in QI knowledge and proficiency agreed with the results of the QIKAT. However, QIKAT limitations include poor interrater agreement and a scoring rubric that lacks specificity. Programs considering using QIKAT to assess curricula should understand these limitations.

Editor's Note: The online version of this article contains Quality Improvement Knowledge Application scenarios adapted to pediatrics and a self-assessment tool to measure quality improvement proficiency.

Introduction

Training in quality improvement (QI) and patient safety is a critical skill for physicians,1 yet the optimal approach for educating residents in QI and patient safety has not been fully identified.2 Wide variation in instructional methods exists among programs and across medical disciplines.2 In pediatric training specifically, a recent survey of program directors found a high degree of variation in curricular elements and minimal formal evaluation of trainee QI knowledge, skills, and attitudes.3

Best practices for resident QI education include combining didactic and experiential learning into a longitudinal curriculum with leadership of local experts successful in QI within their clinical practice.4–7 However, few instruments exist that objectively assess gains in QI skills, knowledge, and behavior postinstruction.8–10 A tool with evidence of preliminary validity was published during this study but has not been evaluated further.11

The Quality Improvement Knowledge Application Tool (QIKAT) has been used to evaluate QI knowledge acquisition after curricular instruction.12–15 The QIKAT assesses differences in resident QI knowledge after curricular instruction, with acceptable to good interrater agreement when scored by the creators of the tool,13 QI instructors,14 and creators of the scoring rubric.15 Although the QIKAT has evidence of validity in content only, residency programs may be using the QIKAT to assess QI curricular effectiveness.15 Important to this use, it is not known how the QIKAT performs when scored by individuals not involved in the design of the tool, curricular instruction, or the scoring rubric.

The objective of our study was to assess the ability of the QIKAT to differentiate between residents who were provided with instruction in QI and those who were not, when scored by individuals not involved in the QI curriculum delivery, the development of the QIKAT, or the scoring rubric.

Methods

Our study used a pre-post quasi-experimental test design. The intervention was a longitudinal QI curriculum delivered to second-year pediatrics residents in the 2010–2011 academic year. Third-year pediatrics residents not exposed to the longitudinal QI curriculum formed the concurrent control group. Baseline measurements included the QIKAT and a QI self-assessment questionnaire of QI proficiency. After intervention group curricular instruction, both groups completed 3 QIKAT scenarios and the self-assessment.

Assessment Tools

The QIKAT presents the learner with clinical scenarios that fail in at least 1 Institute of Medicine dimension of health care performance.16 We modified the clinical content of the QIKAT scenarios to include disease states familiar to pediatrics residents (provided as online supplemental material). After reading the scenario, the learner responds to 3 free-text questions asking about an aim for improvement, measures for assessment, and a change proposal that could be tested.

The 6 pediatric QIKAT scenarios underwent beta testing for content clarity by third-year pediatrics residents not otherwise involved. The structures of the 3 baseline and 3 postintervention QIKAT scenarios were similar, but their clinical context differed to avoid direct recall of responses. We measured resident self-perception of QI skills with a QI knowledge self-assessment tool17 (provided as online supplemental material), which has been used in other studies to demonstrate self-perception of QI knowledge gained after curricular instruction.13,15 Residents rated their proficiency in 9 QI skills using a Likert scale from 1 (not proficient) to 5 (very proficient). The QIKAT scenarios adapted to pediatrics and the self-assessment tool are available as online supplemental material.

QI Curriculum Description

We designed our QI curriculum using methodology suggested by experts.4,18 It incorporated guiding principles for teaching others to lead change by combining didactic and experiential learning with leadership from clinicians demonstrating continuous improvement in their own work.6 The intervention group received the QI curriculum longitudinally through the 2010–2011 academic year. Faculty trained in QI methodology and health services research taught 8 noon conference didactic sessions. Six 3-hour workshops provided just-in-time training and focused on QI skills matched to the needs of residents' projects.

QI projects were designed and carried out by residents working in teams and mentored by faculty. Projects included (1) reducing time to lumbar puncture in febrile infants in the emergency department; (2) enhancing parental understanding of discharge instructions by standardizing discharge information; (3) increasing primary care follow-up and controller medication use in patients discharged after asthma exacerbation; (4) increasing the availability of working mobile computer workstations during intensive care unit rounds; (5) increasing referral of primary care patients with obesity to a multidisciplinary team; and (6) incorporation of postpartum depression screening at 2-, 4-, and 6-month well-child visits. Residents in the intervention group were required to propose a QI project, present interim data, conduct at least 2 plan-do-study-act cycles (many undertook several), and present final data and outcomes.

Scoring of QIKAT Responses

Two raters scored responses to the open-ended questions of the QIKAT scenarios. Both of the raters had graduated fellowships in health care delivery research and held faculty positions in QI in their respective institutions. We used a scoring rubric, described previously,15 which awards a maximum of 5 points per scenario, 3 scenarios per test, for a maximum of 15 points. Maximum points per subsection were 2 points for the aim, 1 point for measures, 1 point for change proposal, and 1 point for answers that were related to one another. Raters were blinded to the study objectives and had no role in the QI curriculum. After scoring a random sample of 20 QIKAT responses, raters met to discuss scoring decisions. Differences of more than 1 point per 15-point QIKAT were resolved; thereafter, the raters scored responses independently.
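The point structure of the rubric can be sketched in a few lines of Python. This is only an illustration of the arithmetic described above, not the rubric itself: the names `ScenarioScore`, `qikat_total`, and `needs_resolution` are hypothetical, and the example scores are invented. The sketch assumes each subsection score lies between 0 and its stated maximum.

```python
from dataclasses import dataclass

# Maximum points per subsection, as stated in the rubric description.
MAX_POINTS = {"aim": 2, "measures": 1, "change": 1, "relatedness": 1}


@dataclass(frozen=True)
class ScenarioScore:
    """One rater's score for one QIKAT scenario (hypothetical structure)."""
    aim: float
    measures: float
    change: float
    relatedness: float

    def __post_init__(self):
        # Assumption: every subsection score lies between 0 and its maximum.
        for name, cap in MAX_POINTS.items():
            value = getattr(self, name)
            if not 0 <= value <= cap:
                raise ValueError(f"{name} must be between 0 and {cap}, got {value}")

    @property
    def total(self):
        """Scenario total, at most 5 points."""
        return self.aim + self.measures + self.change + self.relatedness


def qikat_total(scenarios):
    """Sum the scenario totals for one 3-scenario QIKAT (maximum 15 points)."""
    if len(scenarios) != 3:
        raise ValueError("a QIKAT administration consists of 3 scenarios")
    return sum(s.total for s in scenarios)


def needs_resolution(total_rater1, total_rater2, threshold=1):
    """Flag a response whose two totals differ by more than `threshold` points."""
    return abs(total_rater1 - total_rater2) > threshold


# Invented example: two raters scoring the same resident's three responses.
rater1 = [ScenarioScore(2, 1, 1, 1), ScenarioScore(2, 1, 0, 1), ScenarioScore(1, 1, 1, 0)]
rater2 = [ScenarioScore(2, 1, 1, 1)] * 3
flagged = needs_resolution(qikat_total(rater1), qikat_total(rater2))
```

With so few points available per subsection, a single disagreement on one subsection already moves a scenario total by 20% or more of its 5-point maximum.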

The Institutional Review Board of Boston Children's Hospital approved this study and granted a waiver of informed consent.

Analyses

All analyses were performed using Stata version 12.1 software (Stata Corp LP, College Station, TX). Analyses consisted of 2 parts: the first part focused on the influence of the QI curriculum over time and differences between the comparison and intervention groups, and the second focused on the scores of the individual raters and their agreement. The significance of mean score comparisons between intervention and control groups was assessed using the 2-sample Wilcoxon rank sum (Mann-Whitney U) test due to the nonnormal distribution of scores. Wilcoxon matched pairs signed rank test was used to test mean score differences between baseline and postcurriculum measurements. We also used linear regression analyses to assess the unique influence of having completed the QI curriculum on postcurriculum scores by controlling for baseline scores for each of the QIKAT subsections. Predictor variables were cubed to normalize the regression residuals to meet regression analysis assumptions. Scoring differences between the 2 raters were evaluated through Wilcoxon rank sum tests and correlations (tetrachoric or Spearman rank correlation coefficients). Interrater agreement was measured through Cohen's kappa test. For all analyses, a P value of < .05 was considered statistically significant.
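For readers less familiar with rank-based tests, the statistic behind the 2-sample Wilcoxon rank sum (Mann-Whitney U) comparison can be sketched as follows. The study's analyses were run in Stata; this Python sketch is only illustrative, uses invented scores, and computes the U statistic without a P value. The function names `midranks` and `mann_whitney_u` are hypothetical.

```python
def midranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples, using midranks for ties."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = midranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Invented QIKAT totals (out of 15) for illustration only.
intervention = [13, 14, 15, 13, 12]
control = [12, 11, 13, 12, 10]
u_intervention, u_control = mann_whitney_u(intervention, control)
```

Because the test works on ranks rather than raw scores, it does not require normally distributed data, which is why it suits skewed scores clustered near the top of a 15-point scale.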

Results

The intervention group included 36 residents and the control group 27 residents. Three residents in the intervention group did not complete the postcurriculum self-assessment or QIKAT and were excluded from analysis. None of the residents in the intervention or control group reported having received prior formal training in QI methods.

Overall—QIKAT

Data for QIKAT scores by intervention and control groups are shown in table 1. Mean intervention group scores were significantly higher than mean control group scores postcurriculum (P < .001). Baseline mean QIKAT scores for the intervention and control groups were similar (P = .06). Within the intervention group, postcurriculum mean QIKAT scores were not significantly higher than baseline QIKAT scores (P = .44). However, multivariate regression analysis indicated a significant increase in postcurriculum scores for the intervention group compared to the control group, accounting for baseline scores (P = .007; detailed results not shown).

TABLE 1 Comparison of Baseline and Postcurriculum QIKAT Scores for Intervention and Control Groups

Subsections—QIKAT

For all subsections, distributions of baseline and postcurriculum scores were left-skewed. There were no statistically significant baseline differences between control and intervention groups (table 1). Postcurriculum, scores for 3 of the 4 subsections (aim, change proposal, and relatedness) were significantly higher for the intervention group (P = .009; P = .04; P = .002, respectively). Changes in mean scores between baseline and postcurriculum assessments within each group and subsection were not significant in bivariate analyses. Multivariate regression analysis confirmed this finding for the aim, measures, and change proposal subsections but showed a significant increase in scores for the relatedness subsection for the intervention group (P < .001; detailed results not shown).

QIKAT Scores by Rater

Analyses of QIKAT scores by rater are shown in table 2. Both of the raters identified a postcurriculum difference between intervention and control groups in overall scores and the relatedness subsection. Neither rater identified a difference in measures subsection scores between control and intervention groups postcurriculum. Rater 2 identified postcurriculum differences between the groups in the aim and change proposal subsections, and rater 1 identified a score difference at baseline between the intervention and control groups and for the change proposal and relatedness subsections.

TABLE 2 Comparison by Rater of Baseline and Postcurriculum QIKAT Scores for Intervention and Control Group

Table 3 shows interrater agreement between raters 1 and 2. As raters were blinded to when each scenario was administered, the analyses combine the ratings of all 6 scenarios. Although the overall scores and scores in the measures subsection did not differ significantly between raters, the scores for all other subsections did. Correlation coefficients for overall QIKAT and the subsections between the 2 raters were fairly high and statistically significant, but the value of kappa was generally low.
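High correlation alongside low kappa can seem contradictory, so a small sketch may help. Kappa compares the observed proportion of exact agreement with the agreement expected by chance from each rater's score distribution, whereas correlation only asks whether the two raters order responses similarly. The example below uses invented scores in which one rater is always 1 point higher; it shows one way the two measures can diverge and is not a claim about what occurred in this study. Pearson correlation is used for brevity (the study reported tetrachoric or Spearman coefficients), and the function names are hypothetical.

```python
import math
from collections import Counter


def cohens_kappa(rater1, rater2):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("raters must score the same non-empty set of items")
    n = len(rater1)
    # Observed proportion of items given exactly the same score.
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Chance agreement from each rater's marginal score distribution.
    counts1, counts2 = Counter(rater1), Counter(rater2)
    expected = sum(counts1[c] * counts2[c] for c in counts1) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def pearson_r(x, y):
    """Pearson correlation coefficient (simplification; not Spearman)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Invented totals: the second rater scores every response 1 point higher.
scores_rater1 = [10, 11, 12, 13, 14, 12, 13, 11]
scores_rater2 = [s + 1 for s in scores_rater1]

kappa = cohens_kappa(scores_rater1, scores_rater2)  # no exact matches, so kappa is below 0
r = pearson_r(scores_rater1, scores_rater2)         # identical ordering, so r is 1
```

In this constructed case the raters rank every response identically yet never assign the same score, so correlation is perfect while kappa falls below zero.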

TABLE 3 Overall and Subsection QIKAT Scores by Rater

Self-Assessment Results

No baseline differences between the intervention and control groups were identified using the self-assessment tool (table 4). Postcurriculum, the intervention group's self-assessment ratings were generally higher than the ratings of the control group, reaching significance in 4 of 9 QI skills. Within the intervention group, self-assessment scores significantly increased after QI instruction for all 9 QI skills. The control group self-assessment scores significantly increased only for ability to “identify a quality problem related to patient care.”

TABLE 4 Results From Self-Assessment of QI Proficiency

Discussion

Our evaluation of the QIKAT in a quasi-experimental setting with a concurrent control group found that the QIKAT successfully distinguished intervention from control group after instruction in QI. However, we identified challenges to the performance of the QIKAT. Baseline scores were high, and absolute score differences for the intervention group were small, with a mean increase in QIKAT score of 0.34 of 15 total points (12.8 to 13.2). Also, interrater agreement as measured by kappa test was poor (0.09). Combined baseline and postcurriculum QIKAT scores from raters 1 and 2 were significantly different in 3 of 4 QIKAT subsections.

Our results highlight important limitations in the QIKAT as an assessment tool. The QIKAT has limited evidence of validity as a tool to assess QI knowledge gained after QI instruction. Similar to previous studies, we found the QIKAT distinguished intervention from control groups after QI instruction. Major differences in our study were lower absolute differences in QIKAT scores and lower kappa results. In previous studies, the QIKAT was used either to describe the curriculum13,14 or to assess curricular effectiveness15 and did not include baseline and postcurriculum control groups.13–15 The absence of a concurrent control group may not permit causal inferences of the ability of the QIKAT to assess knowledge gained from curricular instruction.19 Table 5 compares the design of this study to previous uses of the QIKAT. Finally, the QIKAT does not test knowledge of other elements for successful system improvement such as skills for functioning in multidisciplinary teams and project prioritization and use of tools for outcome measurement.5,6,12

The scoring rubric used in this study appeared to lack specificity. Raters were asked to make qualitative judgments (ie, good, excellent). Few points were available, and most scores had a binary value. We believe this led to “grade inflation.” Improvements in the QIKAT could include more difficult questions, testing other QI knowledge areas, and an enhanced scoring rubric. Many of these proposed improvements to the QIKAT are features of another QI knowledge assessment tool, the Systems Quality Improvement Training and Assessment Tool, published after completion of this study.11

TABLE 5 Comparison of Studies Using the QIKAT

Limitations of our study include the poor interrater agreement as measured by kappa test. However, although absolute agreement between raters may be poor, correlation coefficients between raters were statistically significant. QIKAT responses were scored using only 1 scoring rubric, and we do not know how results would have been different if another rubric had been used.

Conclusion

The QIKAT distinguishes intervention from control group after curricular instruction in QI. In this study, overall scores were high, and interrater agreement was poor. The QIKAT in its current form lacks specificity and is not generalizable due to limitations in scoring. We caution educators about these limitations when considering use of the QIKAT in curricular assessment.

Copyright: 2014

Author Notes

Eric W. Glissmeyer, MD, is Fellow, Pediatric Emergency Medicine, University of Utah, and Fellow, Intermountain Healthcare Institute for Health Care Delivery Research; Sonja I. Ziniel, PhD, is Senior Survey Methodologist, Program for Patient Safety and Quality, Faculty Member, Division of Adolescent and Young Adult Medicine, Department of Medicine, Boston Children's Hospital, and Instructor in Pediatrics, Department of Pediatrics, Harvard Medical School; and James Moses, MD, MPH, is Pediatric Director of Quality and Patient Safety, Department of Pediatrics, Boston Medical Center, Associate Program Director, Boston Combined Residency Program in Pediatrics, and Assistant Professor, Department of Pediatrics, Boston University School of Medicine.

Corresponding author: Eric W. Glissmeyer, MD, University of Utah Department of Pediatrics, Division of Pediatric Emergency Medicine, 295 Chipeta Way, Salt Lake City, UT 84108, eric.glissmeyer@hsc.utah.edu

Funding: Resident quality improvement efforts were supported by the Fred Lovejoy Research and Education Fund of the Boston Combined Residency Program and a grant from the Program for Patient Safety and Quality, Children's Hospital Boston.

Conflict of interest: The authors declare they have no competing interests.

The authors would like to thank the residents and faculty of the Boston Combined Residency Program in Pediatrics; and Bob Vinci, MD, and Ted Sectish, MD, for their support of the quality improvement curriculum and resident projects.

Received: 13 Jun 2013
Accepted: 12 Jan 2014