Instructor, Emergency Medicine
OBJECTIVE: With the rising number of female physicians, more children than ever will be born during residency, and the current system is inadequate to handle this increase in new resident parents. Residency is stressful and rigorous on its own, let alone while pregnant or caring for a new child. Policies that ease these stressful transitions are generally either insufficient or nonexistent. We therefore created a comprehensive Return to Work Policy for resident parents and piloted its implementation. Our policy aims to: 1) establish a clear, shared understanding of the regulatory and training requirements as they pertain to parental leave, 2) facilitate a smooth transition for new parents returning to work, and 3) summarize the local and institutional resources available to both men and women during residency training. METHOD: In Fall 2017, a task force was convened to draft a Return to Work Policy for New Resident Parents. The task force included 9 key stakeholders (i.e., residents, faculty, and administration) at our institution: 3 Graduate Medical Education (GME) Program Directors, a Vice Chair of Education, a Designated Institutional Official (DIO), a Chief Resident, and 3 members of our academic department's Faculty Affairs Committee. Members were selected for their individual expertise in gender equity issues, mentorship of resident parents, GME, and departmental administration. RESULTS: After development, the policy was piloted from November 2017 to June 2018. The pilot implementation period included 7 new resident parents. All of these residents received schedules that met the return-to-work scheduling terms of our Return to Work Policy, including no overnight shifts, no sick call, and no more than 3 shifts in a row.
Of equal importance, throughout our pilot, the emergency department schedules at all of our clinical sites remained fully staffed and our sick call pool was unaffected. CONCLUSION: Our Return to Work Policy for New Resident Parents provides a comprehensive guide to training requirements and family leave policies, an overview of available resources, and a scheduling framework that makes for a smooth transition back to clinical duties.
View details for PubMedID 30636353
Assessment and evaluation of trainees' clinical performance are needed to ensure safe, high-quality patient care. These measures also aid in the development of reflective, high-performing clinicians and hold graduate medical education (GME) accountable to the public. While clinical performance measures hold great potential, the challenges of defining, extracting, and measuring clinical performance hinder their use for educational and quality improvement purposes. This article provides a way forward by identifying and articulating how clinical performance measures can be used to enhance GME by linking educational objectives with relevant clinical outcomes. The authors explore four key challenges: defining clinical performance measures, measuring them, using electronic health record and clinical registry data to capture clinical performance, and bridging the silos of medical education and health care quality improvement. The authors also propose solutions to showcase the value of clinical performance measures and conclude with a research and implementation agenda. They argue for developing a common taxonomy of uniform specialty-specific clinical performance measures, linking these measures to large-scale GME databases, and applying both quantitative and qualitative methods to build a rich understanding of how GME affects quality of care and patient outcomes. The focus of this article is primarily GME, yet similar challenges and solutions will be applicable to other areas of medical and health professions education as well.
View details for DOI 10.1097/ACM.0000000000002620
View details for PubMedID 30720528
There exists an assumption that improving medical education will improve patient care. While seemingly logical, this premise has rarely been investigated. In this Invited Commentary, the authors propose the use of big data to test this assumption. The authors present several example research studies linking education and patient care outcomes and argue that using big data may more easily facilitate the investigations needed to test this assumption. The authors also propose that collaboration is needed to link educational and health care data. They then introduce a grassroots initiative, including universities in one Canadian province and national licensing organizations, that is working to collect, organize, link, and analyze big data to study the relationship between pedagogical approaches to medical training and patient care outcomes. While the authors acknowledge the possible challenges and issues associated with harnessing big data, they believe that the benefits outweigh them. There is a need for medical education research to go beyond the outcomes of training to study practice and clinical outcomes as well. Without a coordinated effort to harness big data, policy makers, regulators, medical educators, and researchers are left with sometimes costly guesses and assumptions about what works and what does not. As the social, time, and financial investments in medical education continue to increase, it is imperative to understand the relationship between education and health outcomes.
View details for DOI 10.1097/ACM.0000000000002217
View details for Web of Science ID 000435369500022
View details for PubMedID 29538109
Our ability to assess independent trainee performance is a key element of competency-based medical education (CBME). In workplace-based clinical settings, however, the performance of a trainee can be deeply entangled with that of others on the team. This presents a fundamental challenge, given the need to assess and entrust trainees based on the evolution of their independent clinical performance. The purpose of this study, therefore, was to understand what faculty members and senior postgraduate trainees believe constitutes independent performance in a variety of clinical specialty contexts. Following constructivist grounded theory, and using both purposive and theoretical sampling, we conducted individual interviews with 11 clinical teaching faculty members and 10 senior trainees (postgraduate year 4/5) across 12 postgraduate specialties. Constant comparative inductive analysis was conducted. Findings were also returned to participants through one-to-one sessions with key informants and public presentations. Although some independent performances were described, participants spoke mostly about the exceptions to and disclaimers about these, elaborating their sense of the interdependence of trainee performances. Our analysis of these interdependence patterns identified multiple configurations of coupling, the dominant being coupling of trainee and supervisor performance. We consider how the concept of coupling could advance workplace-based assessment efforts by supporting models that account for the collective dimensions of clinical performance. These findings call into question the assumption of independent performance and offer an important step toward measuring coupled performance. An understanding of coupling can help both to better distinguish independent and interdependent performances and to inform revisions of workplace-based assessment approaches for CBME.
View details for DOI 10.1111/medu.13588
View details for PubMedID 29676054
Construct: We investigated the quality of emergency medicine (EM) blogs as educational resources. Online medical education resources such as blogs are increasingly used by EM trainees and clinicians; however, quality evaluations of these resources using gestalt are unreliable. We investigated the reliability of two previously derived quality evaluation instruments for blogs. Sixty English-language EM websites that published clinically oriented blog posts between January 1 and February 24, 2016, were identified. A random number generator selected 10 websites, and the 2 most recent clinically oriented blog posts from each site were evaluated using gestalt, the Academic Life in Emergency Medicine (ALiEM) Approved Instructional Resources (AIR) score, and the Medical Education Translational Resources: Impact and Quality (METRIQ-8) score by a sample of medical students, EM residents, and EM attendings. Each rater evaluated all 20 blog posts with gestalt and 15 of the 20 blog posts with the ALiEM AIR and METRIQ-8 scores. Pearson's correlations were calculated between the average scores for each metric. Single-measure intraclass correlation coefficients (ICCs) evaluated the reliability of each instrument. Our study included 121 medical students, 88 EM residents, and 100 EM attendings who completed ratings. The average gestalt rating of each blog post correlated strongly with the average scores for ALiEM AIR (r = .94) and METRIQ-8 (r = .91). Single-measure ICCs were fair for gestalt (0.37, IQR 0.25-0.56), ALiEM AIR (0.41, IQR 0.29-0.60), and METRIQ-8 (0.40, IQR 0.28-0.59). The average instrument scores for each blog post correlated strongly with gestalt ratings; however, neither ALiEM AIR nor METRIQ-8 showed higher reliability than gestalt. Improved reliability may be possible through rater training and instrument refinement.
View details for DOI 10.1080/10401334.2017.1414609
View details for Web of Science ID 000435016500007
View details for PubMedID 29381099
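The Pearson correlations reported above (e.g., r = .94 between average gestalt and ALiEM AIR scores) compare per-post mean ratings across instruments. A minimal sketch of that comparison, using hypothetical per-post averages (the numbers below are illustrative, not the study's data):

```python
import statistics

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical per-post average scores from two instruments
gestalt = [4.2, 5.1, 3.8, 6.0, 4.9]   # mean 7-point gestalt rating per post
metriq8 = [21.0, 25.5, 19.0, 29.5, 24.0]  # mean METRIQ-8 total per post
r = pearson_r(gestalt, metriq8)
```

Because these correlations are computed on post-level means, high agreement here is compatible with the low single-rater ICCs the study reports: averaging over many raters removes much of the individual rater noise.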
With the implementation of competency-based medical education (CBME) in emergency medicine, residency programs will amass substantial amounts of qualitative and quantitative data about trainees' performances. This increased volume of data will challenge traditional processes for assessing trainees and remediating training deficiencies. At the intersection of trainee performance data and statistical modeling lies the field of medical learning analytics. At a local training program level, learning analytics has the potential to assist program directors and competency committees with interpreting assessment data to inform decision making. On a broader level, learning analytics can be used to explore system questions and identify problems that may impact our educational programs. Scholars outside of health professions education have been exploring the use of learning analytics for years and their theories and applications have the potential to inform our implementation of CBME. The purpose of this review is to characterize the methodologies of learning analytics and explore their potential to guide new forms of assessment within medical education.
View details for DOI 10.1002/aet2.10087
View details for PubMedID 30051086
View details for PubMedCentralID PMC6001721
View details for DOI 10.1017/cem.2018.336
The shift toward broader, programmatic assessment has revolutionized the approaches that many take in assessing medical competence. To understand the association between quantitative and qualitative evaluations, the authors explored the relationships that exist among assessors' checklist scores, task ratings, global ratings, and written comments. The authors collected and analyzed, using regression analyses, data from the McMaster Modular Assessment Program. The data were from emergency medicine residents in their first or second year of postgraduate training from 2012 through 2014. Additionally, using content analysis, the authors analyzed narrative comments corresponding to the "done" and "done, but needs attention" checklist score options. The regression analyses revealed that the task ratings, provided by faculty assessors, are associated with the use of the "done, but needs attention" checklist score option. Analyses also identified that the "done, but needs attention" option is associated with a narrative comment that is balanced, providing both strengths and areas for improvement. Analysis of qualitative comments revealed differences in the type of comments provided to higher- and lower-performing residents. This study highlights some of the relationships that exist among checklist scores, rating scales, and written comments. The findings highlight that task ratings are associated with checklist options while global ratings are not. Furthermore, analysis of written comments supports the notion of a "hidden code" used to communicate assessors' evaluation of medical competence, especially when communicating areas for improvement or concern. This study has implications for how individuals should interpret information obtained from qualitative assessments.
View details for DOI 10.1097/ACM.0000000000001743
View details for Web of Science ID 000419151600038
View details for PubMedID 28562452
Open educational resources such as blogs are increasingly used for medical education. Gestalt is generally the evaluation method used for these resources; however, little information has been published on it. We aim to evaluate the reliability of gestalt in the assessment of emergency medicine blogs. We identified 60 English-language emergency medicine Web sites that posted clinically oriented blogs between January 1, 2016, and February 24, 2016. Ten Web sites were selected with a random-number generator. Medical students, emergency medicine residents, and emergency medicine attending physicians evaluated the 2 most recent clinical blog posts from each site for quality, using a 7-point Likert scale. The mean gestalt scores of each blog post were compared between groups with Pearson's correlations. Single- and average-measure intraclass correlation coefficients were calculated within groups. A generalizability study evaluated variance within gestalt, and a decision study calculated the number of raters required to estimate quality reliably (>0.8). One hundred twenty-one medical students, 88 residents, and 100 attending physicians (93.6% of enrolled participants) evaluated all 20 blog posts. Single-measure intraclass correlation coefficients within groups were fair to poor (0.36 to 0.40). Average-measure intraclass correlation coefficients were more reliable (0.811 to 0.840). Mean gestalt ratings by attending physicians correlated strongly with those by medical students (r=0.92) and residents (r=0.99). The generalizability coefficient was 0.91 for the complete data set. The decision study found that 42 gestalt ratings were required to evaluate quality reliably (>0.8). The mean gestalt quality ratings of blog posts by medical students, residents, and attending physicians correlate strongly, but individual ratings are unreliable. With sufficient raters, mean gestalt ratings provide a community standard for assessment.
View details for DOI 10.1016/j.annemergmed.2016.12.025
View details for Web of Science ID 000410255300022
View details for PubMedID 28262317
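The single- and average-measure reliabilities above are two-way intraclass correlations, and the decision-study question ("how many raters for >0.8?") can be roughly approximated with the Spearman-Brown prophecy formula. The sketch below makes simplifying assumptions: a fully crossed posts-by-raters design with no missing data, and a one-facet model. The study's full generalizability analysis partitions more variance sources, which is why it arrives at 42 raters rather than the naive projection here.

```python
def icc2_1(matrix):
    """Single-measure ICC(2,1): two-way random effects, absolute agreement.
    matrix: list of rows (targets), each a list of ratings (one per rater)."""
    n, k = len(matrix), len(matrix[0])
    grand = sum(sum(row) for row in matrix) / (n * k)
    row_means = [sum(row) / k for row in matrix]
    col_means = [sum(matrix[i][j] for i in range(n)) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in matrix for x in row)
    mse = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

def raters_needed(r_single, target=0.8):
    """Spearman-Brown: raters needed for the mean rating to reach `target` reliability."""
    return target * (1 - r_single) / (r_single * (1 - target))

# Toy ratings: 4 blog posts (rows) rated by 3 raters (columns)
ratings = [
    [5, 6, 6],
    [3, 4, 3],
    [6, 7, 6],
    [2, 3, 2],
]
single = icc2_1(ratings)
# Naive projection from the study's single-measure gestalt ICC of 0.37
k = raters_needed(0.37, target=0.8)  # ~6.8 raters under Spearman-Brown alone
```

The gap between the Spearman-Brown projection and the study's 42-rater result illustrates why a proper decision study, which separates rater, occasion, and interaction variance, can demand far more raters than a single-facet formula suggests.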
OSCEs are commonly conducted in multiple cycles (different circuits, times, and locations), yet students' allocation to different OSCE cycles is rarely considered as a source of variance, perhaps in part because conventional psychometrics provide limited insight. We used Many-Facet Rasch Modeling (MFRM) to estimate the influence of "examiner cohorts" (the combined influence of the examiners in the cycle to which each student was allocated) on students' scores within a fully nested multi-cycle OSCE. Observed average scores across examiner cohorts varied by 8.6%, but model-adjusted estimates showed a smaller range of 4.4%. Most students' scores were only slightly altered by the model; the greatest score increase was 5.3% and the greatest score decrease was -3.6%, with 2 students passing who would have failed. Despite using 16 examiners per cycle, examiner variability did not completely counterbalance, resulting in an influence of OSCE cycles on students' scores. Assumptions were required for the MFRM analysis; innovative procedures to overcome these limitations and strengthen OSCEs are discussed. OSCE cycle allocation has the potential to exert a small but unfair influence on students' OSCE scores; these little-considered influences should challenge our assumptions and the design of OSCEs.
View details for DOI 10.1080/0142159X.2017.1248916
View details for Web of Science ID 000393885800015
View details for PubMedID 27897083
Simulation stands to serve an important role in modern competency-based programs of assessment in postgraduate medical education. Our objective was to compare the performance of individual emergency medicine (EM) residents in a simulation-based resuscitation objective structured clinical examination (OSCE) using the Queen's Simulation Assessment Tool (QSAT) with portfolio assessment of clinical encounters using a modified in-training evaluation report (ITER), to understand in greater detail the inferences that may be drawn from a simulation-based OSCE assessment. A prospective observational study was employed to explore the use of a multicenter simulation-based OSCE for evaluation of resuscitation competence. EM residents from five Canadian academic sites participated in the OSCE. Video-recorded performances were scored by blinded raters using scenario-specific QSATs with domain-specific anchored scores (primary assessment, diagnostic actions, therapeutic actions, communication) and a global assessment score (GAS). Residents' portfolios were evaluated using a modified ITER subdivided by CanMEDS roles (medical expert, communicator, collaborator, leader, health advocate, scholar, and professional) and a GAS. Correlational and regression analyses were performed comparing components of each of the assessment methods. Portfolio review and ITER scoring were performed for 79 residents participating in the simulation-based OSCE. There was a significant positive correlation between total OSCE and ITER scores (r = 0.341). The strongest correlations were found between the ITER medical expert score and each of the OSCE GAS (r = 0.420), communication (r = 0.443), and therapeutic action (r = 0.484) domains. The ITER medical expert score was a significant predictor of OSCE total (p = 0.002).
The OSCE therapeutic action score was a significant predictor of ITER total (p = 0.02). Simulation-based resuscitation OSCEs and portfolio assessment captured by ITERs appear to measure differing aspects of competence, with weak to moderate correlation between measures of conceptually similar constructs. In a program of competency-based assessment of EM residents, a simulation-based OSCE using the QSAT shows promise as a tool for assessing the medical expert and communicator roles.
View details for DOI 10.1002/aet2.10055
View details for PubMedID 30051047
View details for PubMedCentralID PMC6001706
Raters represent a significant source of unexplained, and often undesired, variance in performance-based assessments. To better understand rater variance, this study investigated how various raters, observing the same performance, perceived relationships amongst different noncognitive attributes measured in performance assessments. Medical admissions data from a Multiple Mini-Interview (MMI) used at one Canadian medical school were collected and subsequently analyzed using the Many Facet Rasch Model (MFRM) and hierarchical clustering. This particular MMI consisted of eight stations. At each station a faculty member and an upper-year medical student rated applicants on various noncognitive attributes including communication, critical thinking, effectiveness, empathy, integrity, maturity, professionalism, and resolution. The Rasch analyses revealed differences between faculty and student raters across the eight different MMI stations. These analyses also identified that, at times, raters were unable to distinguish between the various noncognitive attributes. Hierarchical clustering highlighted differences in how faculty and student raters observed the various noncognitive attributes. Differences in how individual raters associated the various attributes within a station were also observed. The MFRM and hierarchical clustering helped to explain some of the variability associated with raters in a way that other measurement models are unable to capture. These findings highlight that differences in ratings may result from raters possessing different interpretations of an observed performance. This study has implications for developing more purposeful rater selection and rater profiling in performance-based assessments.
View details for DOI 10.1097/ACM.0000000000000902
View details for Web of Science ID 000375840200008
View details for PubMedID 26505102
Examiner effects and content specificity are two well-known sources of construct-irrelevant variance that present great challenges in performance-based assessments. National medical organizations responsible for large-scale performance-based assessments face an additional challenge, as they must administer qualification examinations to physician candidates at several locations and institutions. This study explores the impact of site location as a source of score variation in a large-scale national assessment used to measure the readiness of internationally educated physician candidates for residency programs. Data from the Medical Council of Canada's National Assessment Collaboration were analyzed using hierarchical linear modeling and Rasch analyses. Consistent with previous research, problematic variance due to examiner effects and content specificity was found. Additionally, site location was identified as a potential source of construct-irrelevant variance in examination scores.
View details for DOI 10.1007/s10459-014-9547-z
View details for Web of Science ID 000357644900002
View details for PubMedID 25164266
Pressure within school can be a critical component in understanding how the school experience influences young people's intellectual development, physical and mental health, and future educational decisions. Data from five survey rounds (1993/1994, 1997/1998, 2001/2002, 2005/2006 and 2009/2010) were used to examine time-, age- and gender-related trends in the amounts of reported school pressure among 11-, 13- and 15-year-olds in five different regions (North America, Great Britain, Eastern Europe, Nordic and Germanic countries). Across the regions, reported perceptions of school pressure did not change between 1994 and 2010, despite a temporary increase in 2002 and 2006. With the exception of children at 11 years of age, girls reported higher levels of school pressure than boys (Cohen's d from 0.12 to 0.58), and school pressure was higher in older age groups. These findings were consistent across countries. Regionally, children in North America reported the highest levels of school pressure, and students in the Germanic countries the lowest. Factors associated with child development and differences in societal expectations and structures, along with the possible, albeit differential, impact of the Programme for International Student Assessment (PISA), may partially explain the differences and trends found in school pressure. School pressure increases alongside the onset of adolescence and the shift from elementary school to the more demanding expectations of secondary education. Time-related increases in school pressure occurred in the years following the release of the PISA results and were larger in those regions in which results were less positive.
View details for DOI 10.1093/eurpub/ckv027
View details for Web of Science ID 000362971500013
View details for PubMedID 25805788
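The gender differences above are reported as Cohen's d (0.12 to 0.58), i.e., the difference in group means scaled by the pooled standard deviation. A small illustrative sketch (the scores below are invented, not HBSC survey data):

```python
import math

def cohens_d(group_a, group_b):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    ma = sum(group_a) / na
    mb = sum(group_b) / nb
    va = sum((x - ma) ** 2 for x in group_a) / (na - 1)  # sample variance
    vb = sum((x - mb) ** 2 for x in group_b) / (nb - 1)
    pooled_sd = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (ma - mb) / pooled_sd

# Hypothetical school-pressure scores (higher = more pressure)
girls = [3.1, 3.4, 2.9, 3.6, 3.2]
boys = [3.0, 3.3, 2.8, 3.5, 3.1]
d = cohens_d(girls, boys)
```

With these invented scores d lands near 0.37, inside the 0.12-0.58 range the study reports; by the usual rule of thumb, values around 0.2 are small effects and around 0.5 medium.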
Patient safety (PS) receives limited attention in health professional curricula. We developed and pilot tested four Objective Structured Clinical Examination (OSCE) stations intended to reflect socio-cultural dimensions in the Canadian Patient Safety Institute's Safety Competency Framework. Participants were 18 third-year undergraduate medical and nursing students at a Canadian university. OSCE cases were developed by faculty with clinical and PS expertise, with assistance from expert facilitators from the Medical Council of Canada. Stations reflect domains in the Safety Competency Framework (i.e., managing safety risks, culture of safety, communication). Stations were assessed by two clinical faculty members. Inter-rater reliability was examined using weighted κ values. Additional aspects of reliability and OSCE performance are reported. Assessors exhibited excellent agreement (weighted κ scores ranged from 0.74 to 0.82 for the four OSCE stations). Learners' scores varied across the four stations. Nursing students scored significantly lower (p<0.05) than medical students on three stations (nursing student mean scores=1.9, 1.9 and 2.7; medical student mean scores=2.8, 2.9 and 3.5 for stations 1, 2 and 3, respectively, where 1=borderline unsatisfactory, 2=borderline satisfactory and 3=competence demonstrated). 7/18 students (39%) scored below 'borderline satisfactory' on one or more stations. Results show (1) four OSCE stations evaluating socio-cultural dimensions of PS achieved variation in scores and (2) performance on this OSCE can be evaluated with high reliability, suggesting a single assessor per station would be sufficient. Differences between nursing and medical student performance are interesting; however, it is unclear what factors explain these differences.
View details for DOI 10.1136/bmjqs-2014-003277
View details for Web of Science ID 000349721000005
View details for PubMedID 25398630
View details for PubMedCentralID PMC4345888
The multiple mini-interview (MMI) has become an increasingly popular admissions method for selecting prospective students into professional programs (e.g., medical school). The MMI uses a series of short, labour-intensive simulation stations and scenario interviews to more effectively assess applicants' non-cognitive qualities such as empathy, critical thinking, integrity, and communication. MMI data from 455 medical school applicants were analyzed using (1) Generalizability Theory, to estimate the generalizability of the MMI and identify sources of error, and (2) the Many-Facet Rasch Model, to identify misfitting examinees, items, and raters. Consistent with previous research, our results support the reliability of the MMI process. However, it appears that the non-cognitive qualities are not being measured as unique constructs across stations.
View details for DOI 10.1007/s10459-013-9463-7
View details for Web of Science ID 000331630200007
View details for PubMedID 23709188
This survey is part of a multi-year research study on informal and formal mental health support in northern Canada, involving qualitative and quantitative data collection and analysis methods in an effort to better understand mental health in a northern context. The main objective of the 3-year study was to document the situation of formal and informal helpers in providing mental health support in isolated northern communities in northern British Columbia, northern Alberta, Yukon, Northwest Territories and Nunavut. The intent of developing a survey was to include more participants in the research and to reach those working in small communities who might have concerns regarding confidentiality and anonymity due to their high profile within smaller populations. Based on the in-depth interviews from the qualitative phase of the project, the research team developed a survey that reflected the main themes found in the initial qualitative analysis. The online survey consisted of 26 questions, covering basic demographic information and presenting lists of possible challenges, supports and client mental health issues for participants to prioritise. Thirty-two participants identified various challenges, supports and client issues relevant to their mental health support work. A vast majority of the respondents felt prepared for northern practice and had some level of formal education. Supports for longevity included team collaboration; knowledgeable supervisors, managers and leaders; and more opportunities for formal education, specific training and continuity of care to support clients. For northern-based research in small communities, the development of a survey allowed more participants to join the larger study in a way that protected their identity and confidentiality. The results from the survey emphasise the need for team collaboration, interdisciplinary practice and working with community strengths as a way to sustain mental health support workers in the North.
View details for DOI 10.3402/ijch.v72i0.20962
View details for Web of Science ID 000325721900021
View details for PubMedID 23984276
View details for PubMedCentralID PMC3753122