Sometimes Means Some of the Time: Residents' Overlapping Responses to Vague Quantifiers on the ACGME-I Resident Survey

Online Publication Date: 01 Dec 2017
Page Range: 735 – 740
DOI: 10.4300/JGME-D-17-00187.1

ABSTRACT

Background 

Vague quantifiers used in the Accreditation Council for Graduate Medical Education–International (ACGME-I) resident survey are open to interpretation, raising concerns about the validity of survey scores. Residency programs may be unduly cited if survey responses are affected by differing judgments of vague quantifiers.

Objective 

Through investigating frequency judgment overlap, we assessed the validity of vague quantifiers by quantifying variation in residents' frequency judgment of the following response options: never, rarely, sometimes, very often, and extremely often.

Methods 

We conducted a cross-sectional survey of residents in 2 ACGME-I accredited institutions in Singapore. Participants assigned a frequency judgment to response options in 8 questions in the ACGME-I Resident Survey. Overlap in frequency judgment was computed using the minimum and maximum frequency judgment for each response option. Overlap was deemed to have occurred when the maximum frequency of a preceding category exceeded the minimum frequency of a downstream category. The percentage of participants whose frequency judgment overlapped was computed.

Results 

Of 652 residents, 289 (44%) responded; after exclusions of incomplete and careless responses, 119 responses (18%) were included in the study. Frequency judgment overlap was more frequent for vague quantifiers that are adjacent, ranging from 11% to 50% for questions in faculty, evaluation, and resources domains. The percentage of frequency judgment overlap was greatest for duty hour questions, with an overlap between 21% and 47% for adjacent categories.

Conclusions 

Residents demonstrated wide variation in frequency judgment of vague quantifiers, especially on the duty hour questions in the ACGME-I resident survey.

Introduction

The Accreditation Council for Graduate Medical Education–International (ACGME-I) Resident Survey is an important monitoring tool for evaluating residency programs and making accreditation-related decisions. An annual survey that gathers perceptions of clinical education and the learning environment has been conducted in the United States since 2004.1 Since 2010, the survey has been administered in English in Singapore and 4 other countries that have adopted the ACGME-I accreditation framework.2,3 Studies looking at the reliability and validity of scores on the resident survey have yielded mixed results.4–7

For questions related to frequency and occurrences in the survey, residents respond by selecting 1 of the 5 following options: never, rarely, sometimes, very often, and extremely often. These response options have been termed vague quantifiers, as they denote quantification but lack concrete numerical quantities. Vague quantifiers have been found to be subject to a wide range of frequency judgments with considerable overlap, especially among those that are semantically adjacent.4,5 Internal medicine program directors reported that resident survey terms are “vague/ambiguous/misinterpreted by residents,”5(p3) and indicated that the response option of sometimes can be problematic.5

The resident survey is used by ACGME-I as a screening tool to assess compliance; the importance of this screening is increasing with Singapore residency programs' move toward a new accreditation system with annual data screening and less frequent site visits. Concerned about the effect that residents' varied frequency judgment may have on survey results, we aimed to quantify the variation in residents' frequency judgment for vague quantifier response options.

Methods

Study Setting and Data Collection

We conducted an anonymous, cross-sectional survey with residents enrolled full time in ACGME-I accredited residency programs as of March 1, 2014, at 2 sponsoring institutions in Singapore. All residents are proficient in English, and English is the lingua franca and medium of instruction within educational institutions in Singapore. E-mail invitations with an anonymous electronic link to the survey platform Qualtrics (Qualtrics LLC, Provo, UT) were sent to residents via their respective program coordinators between March and May 2014. When residents clicked on the link, they were directed to a participant information sheet explaining study aims and details, with a link to the survey. Consent to participate was implied if they proceeded with the survey.

Eight questions from the domains of educational content, faculty, duty hours, and resources were taken from the ACGME-I Resident Survey (table 1). Participants were instructed that never refers to 0% of the time, and to provide their frequency judgment of rarely, sometimes, very often, and extremely often by moving a slider between numerical values of 0 to 100 on the survey interface.

Table 1. Comparison of Frequency Judgment Across Vague Quantifiers

The National University of Singapore Institutional Review Board reviewed this study and determined it to be exempt.

Data Analysis

Previous studies sought to understand validity through expert validation,8–10 interviews, and focus group discussions.4,5,7 By getting participants to provide numerical frequency judgments corresponding to the various vague quantifiers, this study presents another way of looking at the issue of validity through eliciting comprehension of the vague quantifiers. This is akin to a cognitive interview without additional probing.10–12

Data were first screened for logical consistency of the responses. Given the phrasing of the study questions and option labels, the frequency judgment for rarely must be the smallest and the frequency judgment for extremely often must be the greatest. Participants who, for example, gave frequency judgments of rarely that were greater than those for sometimes, very often, and extremely often would be providing logically inconsistent responses. Logically inconsistent responses could result from inattention or insufficient effort, and can be exacerbated by the anonymous nature of web-based surveys.13
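The ordering check described above can be illustrated with a short sketch. This is not the authors' code (the study was analyzed in R); it is a minimal Python illustration. The names QUANTIFIERS and is_logically_consistent and the sample values are hypothetical, and the treatment of tied judgments is an assumption, since the text does not state how ties were handled.

```python
# Illustrative sketch only; not the study's analysis code.
# Quantifiers in their expected order of increasing frequency.
QUANTIFIERS = ["rarely", "sometimes", "very often", "extremely often"]


def is_logically_consistent(judgment, allow_ties=False):
    """Return True if the judgments increase from rarely to extremely often.

    judgment: dict mapping each quantifier to a value between 0 and 100.
    allow_ties: assumption; the source does not say whether equal values
    for adjacent quantifiers counted as consistent.
    """
    values = [judgment[q] for q in QUANTIFIERS]
    adjacent_pairs = list(zip(values, values[1:]))
    if allow_ties:
        return all(a <= b for a, b in adjacent_pairs)
    return all(a < b for a, b in adjacent_pairs)


# Hypothetical responses to a single survey question.
responses = [
    {"rarely": 10, "sometimes": 40, "very often": 70, "extremely often": 90},
    {"rarely": 80, "sometimes": 40, "very often": 30, "extremely often": 10},
    {"rarely": 20, "sometimes": 20, "very often": 60, "extremely often": 95},
]

# Keep only responses that pass the strict ordering check.
kept = [r for r in responses if is_logically_consistent(r)]
```

Under the strict rule, only the first hypothetical response is retained; the third would also pass if ties were allowed.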

In this study, we assumed response patterns13 that did not show an incremental increase in the frequency judgment of adjacent vague quantifiers (consistent with the measurement of event occurrence from the lowest to the highest intensity) to be careless responses, and removed them from further analyses. Next, we tabulated minimum and maximum frequency for each of the vague quantifier response options (table 1). Frequency judgment overlap was ascertained to have occurred when the maximum frequency of the preceding vague quantifier exceeded the minimum frequency of the downstream vague quantifiers. For example, frequency judgment overlap occurs when the maximum frequency for sometimes is 70 but the minimum frequency for very often is 50. The overlap values between 50 and 70 could either mean sometimes or very often. The percentage of participants whose frequency judgment falls within these overlapped regions would constitute the percentage of overlap between 2 vague quantifiers. To calculate percentage of overlap, the number of participants in the overlapped regions for 2 vague quantifiers was divided by the total number of participants rating both quantifiers.
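The overlap rule described above can be sketched as follows. This is a minimal Python illustration, not the authors' R code. It assumes one reading of the text: a participant is counted as being in the overlapped region if either of their 2 judgments falls between the minimum of the higher quantifier and the maximum of the lower quantifier, with both boundaries included. The function name overlap_percentage and the sample values are hypothetical.

```python
# Illustrative sketch only; not the study's analysis code.
def overlap_percentage(lower, higher):
    """Percentage of participants whose judgment falls in the overlapped region.

    lower, higher: per-participant judgments (0 to 100) for the lower and the
    higher quantifier, in the same participant order.
    """
    if len(lower) != len(higher) or not lower:
        raise ValueError("both lists must be non-empty and of equal length")

    region_start = min(higher)  # minimum frequency of the downstream quantifier
    region_end = max(lower)     # maximum frequency of the preceding quantifier

    # Overlap exists only when the preceding maximum exceeds the downstream minimum.
    if region_end <= region_start:
        return 0.0

    def in_region(value):
        # Assumption: the boundaries are inclusive.
        return region_start <= value <= region_end

    count = sum(1 for lo, hi in zip(lower, higher) if in_region(lo) or in_region(hi))
    return 100.0 * count / len(lower)


# Hypothetical data mirroring the example in the text:
# maximum for sometimes is 70, minimum for very often is 50.
sometimes = [30, 40, 45, 70]
very_often = [50, 75, 80, 90]
pct_overlap = overlap_percentage(sometimes, very_often)
```

With these hypothetical values, 2 of the 4 participants have a judgment between 50 and 70, so pct_overlap is 50.0.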

Intraclass correlation was calculated to determine whether participants were consistent in their frequency judgment of vague quantifiers across all questions, for instance, whether they consistently equated a frequency judgment of 15 with rarely across questions. Descriptive statistics and a figure illustrating frequency judgment overlap without outliers were also produced to understand how outliers affect frequency judgment overlap. Outliers were identified as the observations furthest from the mean and were replaced by the sample mean.
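The text does not specify which form of the intraclass correlation was used. As an illustration only, the sketch below computes the one-way random-effects form, ICC(1,1) = (MSB - MSW) / (MSB + (k - 1) MSW), where MSB is the between-participant mean square, MSW is the within-participant mean square, and k is the number of questions. The choice of this form, the function name icc_one_way, and the sample values are assumptions, not the authors' method.

```python
# Illustrative sketch only; the ICC form used in the study is not stated.
def icc_one_way(ratings):
    """One-way random-effects ICC(1,1).

    ratings: one row per participant, one column per question, all holding
    the judgments given to a single vague quantifier (for example, rarely).
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need at least 2 participants and 2 questions, no gaps")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]

    # Between-participant mean square.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    # Within-participant mean square.
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))

    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ratings have no variance")
    return (ms_between - ms_within) / denominator


# Hypothetical judgments of rarely by 3 participants across 3 questions.
rarely_ratings = [
    [10, 12, 11],
    [30, 29, 31],
    [50, 52, 51],
]
icc_rarely = icc_one_way(rarely_ratings)
```

In this hypothetical example each participant gives nearly the same value on every question while participants differ from one another, so the coefficient is close to 1.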

Figure. Illustration of Frequency Judgment Overlap Across Vague Quantifiers

Citation: Journal of Graduate Medical Education 9, 6; 10.4300/JGME-D-17-00187.1

Data were analyzed using RStudio 0.98.162 (RStudio, Boston, MA).

Results

A total of 289 of 652 eligible participants (44%) responded to the study, and 186 (64%) completed all study questions. We excluded 67 (23%) participants due to careless responses. We included 119 surveys (18%) in the final analysis.
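The percentages above use 2 different denominators, which the following short Python check makes explicit. It assumes that the 67 careless responses were excluded from the 186 complete responses, which is consistent with the reported counts (186 - 67 = 119) but is not stated outright. The variable names are illustrative.

```python
# Illustrative arithmetic check of the reported sample flow.
eligible = 652    # residents invited
responded = 289   # residents who responded
completed = 186   # respondents who completed all study questions
careless = 67     # respondents excluded for careless responses

# Assumption: careless responses were removed from the complete responses.
included = completed - careless


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


response_rate = pct(responded, eligible)    # share of eligible residents
completion_rate = pct(completed, responded) # share of respondents
careless_rate = pct(careless, responded)    # share of respondents
inclusion_rate = pct(included, eligible)    # share of eligible residents
```

The response and inclusion rates are expressed relative to the 652 eligible residents, while the completion and careless-response rates are expressed relative to the 289 respondents.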

Of these 119, 66 (55%) were from medical residency programs, 28 (24%) were from surgical residency programs, and 13 (11%) were from all other residency programs. Twelve (10%) did not indicate their residency program. Sixty participants (50%) were in postgraduate year 1 (PGY-1) to PGY-3; 49 (41%) were in PGY-4 to PGY-6; and 10 (8%) were in PGY-7 and above. The final sample is comparable to the total population (652 residents) and the participants (289 residents) in terms of representation from the 2 sponsoring institutions as well as types of residency programs. In addition, the PGY breakdown was similar between the sample that responded to the survey and the final sample.

Table 1 summarizes the frequency judgment of the vague quantifiers for the 8 survey questions. There was a steady increase in the mean of frequency judgment from rarely to extremely often. The standard deviation tended to be smallest for rarely and larger for the other vague quantifiers. When compared across domains, the standard deviation for very often and extremely often was greater in the domains of resources and duty hours.

In general, frequency judgment overlap occurs at a higher percentage for vague quantifiers that are adjacent, with overlap between 38% and 82% for questions in the faculty, educational content, and resources domains (see figure; also provided as online supplemental material). Percentage of frequency judgment overlap was also higher for duty hours questions, with overlap between 58% and 95% for adjacent vague quantifiers. In contrast, the percentage of frequency judgment overlap was considerably lower for nonadjacent vague quantifiers. For instance, the overlap was between 1% and 49% for rarely and very often, between 1% and 16% for rarely and extremely often, and between 5% and 29% for sometimes and extremely often for questions in the faculty, educational content, and resources domains. The percentage of frequency judgment overlap for nonadjacent vague quantifiers was likewise higher for duty hours questions: between 18% and 45% for rarely and very often, between 3% and 18% for rarely and extremely often, and between 53% and 56% for sometimes and extremely often.

After removing the outliers, frequency judgment overlaps between adjacent vague quantifiers, although reduced, were still substantial, with overlaps ranging from 21% to 72% for questions in the faculty, educational content, and resources domains (descriptive statistics and illustration are provided as online supplemental material). Good intraclass correlation within participants was found (table 2).

Table 2. Intraclass Correlation of Participants' Frequency Judgment

Discussion

Our results suggest that there was considerable frequency judgment overlap of adjacent vague quantifiers in the ACGME-I resident survey, attributable to participants perceiving adjacent vague quantifiers to be similar in meaning. Participants were consistent in their frequency judgment of the vague quantifiers, as evidenced by the good intraclass correlation coefficient. Participants who gave a frequency judgment of 15 to rarely for question 1 were likely to give a similar frequency judgment for rarely in the other questions.

Disconcertingly, standard deviations and frequency judgment overlaps were greater for questions about duty hours. It is unclear why this is the case. One possibility is the confusing and difficult phrasing of questions in the duty hours domain. Questions in this domain require participants to recall various frequencies and perform various calculations mentally, in contrast with questions from other domains where they are asked to recall instances. For instance, 1 question requires residents to think of instances when they break the duty hours rule, which presupposes that they are guilty of breaking the rule. Residents then have to recall these instances over a 4-week period. The effort it takes to process this question is likely to be a burden on the residents' working memory, with subsequent greater variability in their recall.14 This increase in cognitive effort, coupled with a wide range of interpretations for vague quantifiers, may have implications for interpreting the results of the questions in the ACGME-I Resident Survey, in particular with regard to duty hours violations. In our study, we found that the percentage of frequency judgment overlap was greater for rarely and sometimes (39.9% to 46.6%), rarely and very often (15.6% to 41.6%), and rarely and extremely often (1.7% to 15.9%) for questions in the duty hours domain. Residency programs would be flagged for noncompliance if a substantial number of residents answered sometimes, very often, or extremely often to these questions. The high percentage of frequency judgment overlap may cause the reported incidence of duty hours violations to exceed the actual incidence, which could lead to residency programs being unduly flagged for noncompliance.

Our findings are similar to those of other studies on frequency judgment. While the other studies set out to understand the average frequency for each vague quantifier, we went a step further to understand the percentage of frequency judgment overlap, which allowed us to quantify variation in residents' frequency judgment for vague quantifier response options in the ACGME-I Resident Survey.

Our study has limitations, including the low response rate and small final sample due to nonresponse and exclusion of careless responses, which reduce the generalizability and validity of the results. While the final sample is comparable to the population, we cannot rule out systematic differences between those who were included in the final sample and those who were not. The substantial proportion of careless responses (23% of respondents) may suggest a larger problem with study fidelity. Including careless responses in the analysis would increase the percentage of frequency judgment overlap, as the majority of the careless respondents gave frequency judgments that were greatest for rarely and smallest for extremely often.

A larger follow-up study is needed to ascertain whether the phrasing of survey questions in the ACGME-I Resident Survey or the vague quantifiers themselves lead to variation in frequency judgment. Future studies could replace vague quantifiers with response options that are more specific, for example, less than once a week for rarely. This way, a reference period and actual numerical benchmark of event occurrence could be established.15

Conclusion

In this study, residents were asked to give their frequency judgment of the vague quantifier response options used in the ACGME-I Resident Survey. Considerable variation in residents' frequency judgment was found, which could affect the validity of survey results.

References

  • 1. Holt KD, Miller RS. The ACGME Resident Survey aggregate reports: an analysis and assessment of overall program compliance. J Grad Med Educ. 2009;1(2):327–333.
  • 2. Huggan PJ, Samarasekara DD, Archuleta S, et al. The successful, rapid transition to a new model of graduate medical education in Singapore. Acad Med. 2012;87(9):1268–1273.
  • 3. ACGME International. Where we are. http://www.acgme-i.org/about-us/where-we-are. Accessed September 20, 2017.
  • 4. Sticca RP, MacGregor JM, Szlabick RE. Is the Accreditation Council for Graduate Medical Education (ACGME) Resident/Fellow Survey a valid tool to assess general surgery residency programs compliance with work hours regulations? J Surg Educ. 2010;67(6):406–411.
  • 5. Adams M, Willett LL, Wahi-Gururaj S, et al. Usefulness of the ACGME Resident Survey: a view from internal medicine program directors. Am J Med. 2014;127(4):351–355.
  • 6. Holt KD, Miller RS, Philibert I, et al. Residents' perspectives on the learning environment: data from the Accreditation Council for Graduate Medical Education Resident Survey. Acad Med. 2010;85(3):512–518.
  • 7. Ibrahim H, Lindeman B, Matarelli SA, et al. International residency program evaluation: assessing the reliability and initial validity of the ACGME-I Resident Survey in Abu Dhabi, United Arab Emirates. J Grad Med Educ. 2014;6(3):517–520.
  • 8. Schwarz N. What respondents learn from questionnaires: the survey interview and the logic of conversation. Int Stat Rev. 1995;63:153–168.
  • 9. Artino AR Jr, La Rochelle JS, Dezee KJ, et al. Developing questionnaires for educational research: AMEE Guide No. 87. Med Teach. 2014;36(6):463–474.
  • 10. Bocklisch F, Bocklisch SF, Krems JF. Sometimes, often, and always: exploring the vague meanings of frequency expressions. Behav Res Methods. 2012;44(1):144–157.
  • 11. Curran PG. Methods for the detection of carelessly invalid responses in survey data. J Exp Soc Psychol. 2016;66:4–19.
  • 12. Sudman S. Mail surveys of reluctant professionals. Eval Rev. 1985;9(3):349–360.
  • 13. Johnson JA. Ascertaining the validity of individual protocols from web-based personality inventories. J Res Personal. 2005;39(1):103–129.
  • 14. Tourangeau R, Rips LJ, Rasinski K. The Psychology of Survey Response. Cambridge, UK: Cambridge University Press; 2000.
  • 15. Lietz P. Research into questionnaire design. Intl J Market Res. 2010;52(2):249–272.
Copyright: Accreditation Council for Graduate Medical Education 2017

Author Notes

Funding: The authors report no external funding source for this study.

Conflict of interest: Dr Archuleta is a member of the International Review Committee, Accreditation Council for Graduate Medical Education–International.

Results of this survey were presented as a poster at the Association for Medical Education in Europe Conference, Glasgow, Scotland, United Kingdom, September 5–9, 2015.

The authors would like to thank Loo May Eng and Kenneth Lim for their invaluable support in reading the manuscript.

Corresponding author: Yvonne Yock, MA, National Healthcare Group, Tan Tock Seng Hospital, 7 Jalan Tan Tock Seng, Singapore 308440, 6594374909, fax 94374909, yvonne.yock@gmail.com

Editor's Note: The online version of this article contains a table of postgraduate year breakdown between the sample that responded to the survey and the final sample, as well as descriptive statistics and illustration of frequency judgment overlaps across vague quantifiers.

Received: 11 Mar 2017
Accepted: 07 Aug 2017