Imagine the sequence of events: You have designed a 6-station simulation objective structured clinical examination (OSCE) to assess the resuscitation skills of your trainees. At each station, trainees are assessed on professionalism, communication, leadership, and technical skills relevant to the scenario. These individual scores are averaged to create a single score for each station. Cognizant that a trainee's gender may influence how raters assess performance, you wish to examine your OSCE for reliability and sources of variance to ensure that gender effects are not driving competency decisions. How can you gather validity evidence to support



