Application of “Earl's Assessment as, Assessment for, and Assessment of Learning Model” with Orthopaedic Assessment Clinical Competence
Context
To study the efficacy of assessment methods, Earl's model of assessment was introduced as a theoretical framework.
Objective
To (1) introduce the predictive learning assessment model (PLAM) as an application of Earl's model of learning; (2) test Earl's model of learning through the use of the Standardized Orthopedic Assessment Tool (SOAT); and (3) establish construct validity of the SOAT.
Design
Quasi-experimental.
Setting
Three Canadian universities.
Patients or Other Participants
A convenience sample of 57 third-year undergraduate athletic therapy students from three universities was randomly assigned to three experimental groups.
Intervention(s)
In treatment group 1, the instructor had access to the SOAT but could not explicitly share it. In treatment group 2, both the instructor and students had access to the SOAT throughout the semester to use formatively. Group 3 served as the comparison.
Main Outcome Measure(s)
All students were tested with the SOAT by expert raters at the end of the semester. An analysis of variance (ANOVA) (P < .05) was used to determine whether the groups differed in their final examination grades.
Results
The ANOVA demonstrated a significant difference between groups (F(2,56) = 28.6, P < .01). The effect size, calculated using η2, was 0.51. Post hoc analysis revealed a significant difference between treatment group 2 and both the other treatment group and the comparison group.
Conclusions
The small sample size and quasi-experimental design prevent definitive conclusions, but the SOAT was able to discriminate between groups, supporting our construct validity objective. The SOAT was introduced as a predictive tool that may assist orthopaedic assessment skill development. The group in which students were exposed to the SOAT demonstrated that formative assessment using the SOAT was an effective means of teaching relative to no exposure or instructor-only exposure.
INTRODUCTION
The concept of using assessment to facilitate learning is not new, but Dr. Lorna Earl was the first to differentiate between assessment of, assessment for, and assessment as learning.1,2 Earl2 expanded on earlier conceptions of the “assessment for learning” model first introduced by Martinez and Lipson,3 followed by Black and Wiliam.4 Earl divided “assessment for learning” into both “assessment for learning” and “assessment as learning.”2 According to Earl, instructor-student interactions are central to assessment for learning.2 The instructor gathers information from the assessment for diagnostic purposes. The “diagnosis” provides the instructor insight into the student's current understanding of the learning objective(s), preparing the instructor to personalize teaching based on the gap between what the student knows and what he or she needs to know.2 Assessment as learning shifts the responsibility away from the instructor and puts more of the onus of learning onto the student. The student takes responsibility and participates actively in the assessment process through critical thinking and metacognition.2 Both assessment for learning and assessment as learning are formative learning models whereby the assessment is used to facilitate learning.
In Earl's model, assessment of learning is primarily summative in nature, typically represented by final examinations in academic programs or licensing examinations.2,5 Earl does state that a balance must be struck among all components of assessment but prioritizes assessment as learning as the pinnacle of assessment.2 Assessment as learning is an attempt to make students more autonomous, which coincides with the tenets of those who advocate for constructivism.5
Summative and Formative Assessment of Clinical Competence in Health Care Professions
The focus in medical education and allied health care professions has traditionally been summative in nature due to the concerns around clinical competence and patient safety.6 The use of valid and reliable assessment tools for summative purposes is critical to ensure the public is protected.7,8 The objective structured clinical examination (OSCE) has become ubiquitous in allied health care professions as the primary vehicle to measure clinical competence.9–12 In brief, an OSCE is a practical, performance-based examination that breaks apart components of clinical competence into various stations where students interact with standardized patients (SPs) who are taught to act as real patients. There can be stations that do not involve patients but merely ask the student to evaluate tests, make judgments about a condition, or perform written tests about clinical cases. Student performance is graded with standardized checklists and/or global rating scales by expert evaluators, clinicians, and/or the SPs. The primary purpose of OSCEs is to provide a summative assessment of clinical competence. However, some researchers have suggested using the OSCE or its derivatives as formative learning tools.13–18
The Standardized Orthopedic Assessment Tool (SOAT) is an evaluation tool that has been developed for both summative and formative assessment of orthopaedic assessment clinical competence and is used in OSCE-type examinations. There are slight variations of the testing procedure using the SOAT compared with OSCEs, but both are practical, performance-based examinations.19,20 The SOAT has undergone initial content validation and reliability/internal consistency testing (α = .82 for the shoulder and .83 for the knee).20–22 It has also undergone interrater reliability evaluation and demonstrated strong reliability (intraclass correlation coefficient [ICC] = 0.75 for the shoulder and 0.82 for the knee scenarios).20 Our study is another in a series of studies to establish the validity and reliability of the SOAT, because it takes a number of studies to establish the overall validity of an assessment tool.23,24
Purpose
The purpose of our paper is to introduce a theoretical framework, the predictive learning assessment model (PLAM) (Figure). The PLAM is hypothesized to act as a predictive validity model of student performance in summative examinations using the SOAT. The theoretical framework that underpins the PLAM is Earl's model of learning: assessment as, for, and of learning.2 Specifically, the thesis of the predictive model is that explicit and purposeful exposure of students to the SOAT in a learning environment (ie, to the assessment tool) will lead to greater clinical competence in orthopaedic assessment skills as measured by a summative examination using the same tool (the SOAT). Furthermore, implicit exposure of students to the SOAT (ie, explicit exposure of the instructor only) will result in final examination results as poor as those of students and instructors who had no exposure to the SOAT. Parceling out the levels of exposure to the SOAT essentially applies Earl's model: assessment as, for, and of learning.2 Embedded in the testing of this predictive model is construct validation of the SOAT. If the SOAT is able to discriminate among various quasi-experimental groups, it is thought to possess construct validity.23–25



Citation: Athletic Training Education Journal 8, 4; 10.4085/0804109
METHODS
Participants
Four types of participants were needed for our study: educational institutions, instructors, students, and examiners. There were two types of examiners: raters and SPs. The selection rationale for each type of participant is outlined separately. We selected a convenience sample of three educational institutions due to the similarity of their undergraduate athletic therapy program curricular designs: Concordia University, Mount Royal University, and the University of Winnipeg. Three instructors of an introductory orthopaedic assessment class in athletic therapy curricula were solicited to participate in the study. The instructors for the orthopaedic assessment classes solicited students after the final grades for the introductory class had been finalized. All three instructors had been teaching the course for at least 3 years at their home institution. All instructors were Certified Athletic Therapists in Canada. Two instructors held a PhD and one held a Master of Science academic credential.
There were 57 third-year students who volunteered for our study across the three athletic therapy programs: Concordia University (n = 24); Mount Royal University (n = 24); University of Winnipeg (n = 9). The Human Research Ethics Board at all three institutions approved our study.
There were two types of examiner participants in this study: SPs and raters. The primary investigator (M.L. from Mount Royal University) acted as the SP for all examinations (n = 57) to ensure there was consistency in the acting for each scenario across multiple institutions. Raters were solicited through an e-mail distribution and call for volunteers 3 weeks before testing. Raters were required to have practiced athletic therapy for at least 5 years and have had past experience testing students at the university undergraduate level and at the national examination level. The final raters who participated in our study were chosen based on availability for the testing dates once all the baseline requirements were met. There were five raters from Mount Royal University, five raters from University of Winnipeg, and two raters from Concordia University.
Procedure/Intervention
All institutions offered an introductory orthopaedic assessment class with a structured lab component in the Fall 2006 semester, followed by a clinical internship course in the Winter 2007 semester. All institutions were using the same textbook for their introductory course26 and all programs followed the same basic competencies outlined by the professional governing body.27 Testing took place for all students at their home university in the Spring 2007 semester. Each educational institution was randomly assigned into one of three levels of exposure to the SOAT (Table 1).

All instructors agreed to participate knowing they may have to change their curriculum delivery based on being assigned into one of three groups outlined herein. Those instructors who were part of groups 2 and 3 (Table 1) were oriented to the SOAT so they were familiar with the content, its functionality, and its use in the final examination of students in the Spring 2007. The instructor for group 3 was permitted to copy and distribute the SOAT to students at will throughout the course of the Fall 2006 semester. The group 3 instructor was permitted to use the SOAT in a final, summative examination at the end of the class in December 2006. The instructor for group 2 was not permitted to copy or distribute the SOAT in any way but was permitted to apply the principles embedded in it throughout the course of the semester. The group 2 instructor was not permitted to use the SOAT in the final, summative examination. The instructors were asked not to solicit students into the study until the testing phase of the study in the Spring 2007 semester to ensure there was no bias or coercion of the students participating in the study.
A training session at each institution took place on the evening before the first day of testing based on methods described in greater detail in a previous publication.21 Briefly, a 3-hour training session was used where raters were exposed to the SOAT, its rating scales, the process students were expected to follow, the scenario, and correct responses and/or performance they should expect to grade.21
Outcome Measures and Statistical Analyses
Demographic and psychographic data were collected for all students including sex, age, self-reported grade point average, and the total estimated number of courses completed to date in their undergraduate degree. All SOAT test scores for both knee and shoulder scenarios from all three groups were combined to calculate a Cronbach α reliability coefficient as a measure of internal consistency. We completed a one-way analysis of variance (ANOVA) to determine whether a difference existed between the comparison group and the two quasi-experimental groups. A post hoc analysis was conducted once the ANOVA was complete. The statistical analysis was calculated using SPSS 17.0 (SPSS Inc, Chicago, IL).
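The analysis itself was run in SPSS, but the one-way ANOVA it describes can be illustrated from first principles. The sketch below uses small hypothetical score lists (not the study data) and also computes η2, the effect size reported later; the Tukey post hoc step is omitted here because SPSS performed it.

```python
# Illustrative one-way ANOVA from first principles. The group scores below
# are hypothetical, not the study data; SPSS's One-Way ANOVA procedure
# computes the same F statistic.

def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for a list of groups."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)

    # Between-groups sum of squares: squared deviations of group means
    # from the grand mean, weighted by group size
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: deviations from each group's own mean
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)  # effect size
    return f, df_between, df_within, eta_squared

# Three hypothetical groups standing in for the two treatment groups
# and the comparison group
f, df1, df2, eta = one_way_anova([[80, 70], [60, 50], [55, 65]])
```

A large F relative to its degrees of freedom indicates that between-group variability exceeds within-group variability, which the post hoc test then localizes to specific group pairs.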
RESULTS
Demographic and psychographic data were collected to provide a profile of the students in each group and to help frame the ANOVA results accordingly. Those demographic and psychographic data are listed in Table 2. The Cronbach α reliability coefficient was 0.90 for the shoulder scenario and 0.93 for the knee scenario for all three groups combined.
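Cronbach α is computed from the variances of individual item scores relative to the variance of the total score. A minimal sketch of that formula follows; the item scores are hypothetical, not SOAT data.

```python
# Cronbach alpha: alpha = k/(k-1) * (1 - sum(item variances) / variance(totals))
# The scores below are hypothetical, not SOAT data.

def cronbach_alpha(item_scores):
    """item_scores: one list per item, each listing that item's score per student."""
    k = len(item_scores)
    n_students = len(item_scores[0])

    def sample_var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    # Total score for each student across all items
    totals = [sum(item[i] for item in item_scores) for i in range(n_students)]
    item_var_sum = sum(sample_var(item) for item in item_scores)
    return (k / (k - 1)) * (1 - item_var_sum / sample_var(totals))

# Two hypothetical items scored for three students
alpha = cronbach_alpha([[1, 2, 3], [2, 2, 3]])
```

When items vary together (high inter-item consistency), the total-score variance dominates the summed item variances and α approaches 1.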

Descriptive statistics of student scores using the SOAT in a practical, performance-based examination for the comparison group and two treatment groups are listed in Table 3. We conducted an ANOVA to explore the impact of exposure to the SOAT (ie, explicit or implicit) on learning orthopaedic assessment skills. There was a statistically significant difference in SOAT scores among the three groups (F(2,56) = 28.6, P < .01). The effect size, calculated using η2, was 0.51. Post hoc comparisons using the Tukey honestly significant difference test indicated that the mean score for quasi-experimental group 1 (mean [M] = 77.36, SD = 7.9, SEM = 3.8, 95% confidence interval [CI] = 68.56, 86.16) was significantly different from that of treatment group 2 (M = 55.85, SD = 12.37, SEM = 2.5, 95% CI = 50.65, 61.05) and the comparison group (M = 57.61, SD = 11.44, SEM = 1.6, 95% CI = 54.31, 60.91). There was no difference between treatment group 2 and the comparison group.
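As a consistency check on the reported statistics, η2 can be recovered directly from the F value and its degrees of freedom, since η2 = SSbetween/SStotal can be re-expressed as df1·F / (df1·F + df2):

```python
def eta_squared_from_f(f, df_between, df_within):
    # eta^2 = SS_between / SS_total, rewritten in terms of the F ratio:
    # eta^2 = (df_between * F) / (df_between * F + df_within)
    return (f * df_between) / (f * df_between + df_within)

# Reported values from the ANOVA above
eta = eta_squared_from_f(28.6, 2, 56)
```

This yields approximately 0.505, which rounds to the reported effect size of 0.51, so the reported F and η2 are mutually consistent.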

DISCUSSION
Examination or testing of students has two potential purposes: summative and formative assessment. Summative assessment is used in programs to determine whether a student has achieved a standard that will permit him or her to progress to the next level of learning or stage in a career. Additionally, licensing bodies in medicine and allied health care use a form of summative assessment to ensure the practitioner is safe to practice in the public domain.7,8 Formative assessment consists of evaluation of student performance in a classroom or informal setting where the instructor or fellow students provide immediate feedback that does not contribute to a student's final grade in a course. The students in our study who were exposed to the SOAT or the final grading process performed better than students who were not exposed or whose instructors were exposed but did not explicitly share the SOAT with them. Students who were exposed to the SOAT were able to use it to guide their learning of the orthopaedic assessment clinical competency in a formative way throughout the Fall 2006 and Winter 2007 semesters. Students and instructors who were exposed to the SOAT were not tracked in terms of how they used the tool over the course of our study, but we theorized the exposure had a positive impact based on how well those students performed in the final examination. Moreover, explicit exposure to the SOAT, and the formative learning associated with it, provided evidence for the assessment as learning model proposed by Earl.2
The implicitly exposed treatment group, in which the instructor had access to the SOAT but could not explicitly share it with the students, performed as well (or as poorly) as the comparison group, which had no exposure to the SOAT at all. These two groups reflect situations similar to those found in traditional teaching-learning environments, in which an instructor may not share expectations or learning objectives explicitly even though he or she knows them. This is an example of what Sambell called the “hidden curriculum.”28 In contrast, the instructor who had access to the SOAT was able to share it explicitly through formative evaluation, through discussion with students (ie, assessment for learning), and by providing students with the opportunity to use it however they saw fit (ie, assessment as learning). Ultimately, this group demonstrated higher levels of orthopaedic assessment clinical competence as measured by the SOAT.
Earl's model is a static description of three types of assessment (assessment as, for, and of learning). The PLAM (Figure) is a practical, dynamic application of Earl's model.2 The PLAM demonstrates that assessment as learning is more important and should be prioritized over assessment for learning and assessment of learning in developing orthopaedic assessment clinical competence.
The SOAT is hypothesized to act as a vehicle or a “script” that facilitates a reduction in the cognitive load, thus permitting students to learn more material more easily.29–31 Furthermore, it is thought the SOAT may help transition novice athletic therapists (or students) further along the expertise continuum of orthopaedic assessment clinical competency.29,30 The SOAT has been designed to allow students to use a more detailed, structured linear approach, and thus a deductive reasoning approach. However, the SOAT is also a flexible tool that would permit shortcuts or a more inductive reasoning approach to be taken if the examinee is able to gather information more organically like an expert does in their practice.29,30 The flexible design of the SOAT is thought to create a more valid and authentic testing environment.
Finally, the SOAT was able to discriminate among the three groups in our study. The ability of a tool to discriminate among various treatment groups lends support to its construct validation.23,24 Collectively, previous studies19,21,22 combined with the current results constitute the series of validity and reliability studies needed to establish the overall validity of the SOAT to measure orthopaedic injury evaluation clinical competence. Future research should focus on application to a wider audience and thus continue to build the SOAT's psychometric soundness.
Limitations
Educational research is challenging for a number of reasons, and the sampling process contributes to that challenge.32 In our study, a convenience sample was chosen, and then each of the three groups was randomly assigned. These groups were not equal, as demonstrated in the demographic and psychographic data in Table 2. Every participating institution had a cohort of 30 students who demonstrated some interest in participating in the study when first introduced to the concept in the Winter 2007 semester. However, the final cohort for each group was smaller than anticipated. The potential volunteers did not provide reasons for their lack of participation. Thus, the small sample size could have influenced the results.
Some statistical analysis may be helpful in contextualizing the sample size and, thus, the generalizability of the results. The effect size (η2) was 0.51, which is considered a moderate (or medium) effect size.33,34 The SEM is a statistical measure whose interpretation takes the sample size into account. Furthermore, 95% CIs based on the t distribution, which is appropriate for sample sizes of fewer than 30, were reported to assist the reader in interpreting and generalizing the results.23 Therefore, generalization of the results may be limited to the group tested in our study.35 Regardless, conclusions about the tool and application of the theoretical model should be interpreted accordingly, with future research needed to confirm our study's findings.
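The t-based interval mentioned above can be illustrated with a short calculation: for a small sample, the 95% CI widens because the t critical value (for df = n − 1) exceeds the normal-distribution value of 1.96. The numbers below are hypothetical, not study data, and the critical value is taken from a standard t table.

```python
import math

def t_ci_95(mean, sd, n, t_crit):
    """95% CI for a mean using a two-tailed t critical value (df = n - 1).

    t_crit comes from a standard t table; for large n it approaches 1.96.
    """
    sem = sd / math.sqrt(n)      # standard error of the mean
    half_width = t_crit * sem
    return mean - half_width, mean + half_width

# Hypothetical group: mean 60, SD 12, n = 16; the t table gives 2.131 for df = 15
low, high = t_ci_95(60, 12, 16, 2.131)
```

In practice a statistics package (eg, SPSS, as used here) supplies the critical value; the point of the sketch is that the small-sample interval is wider than the z-based interval would be, which is why the t distribution supports cautious generalization.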
Conclusion
There is a trend in medical and allied health care education toward objective assessment of student performance in both formative and summative formats.31 The assessment of student learning should have a theoretical underpinning. Our study applied an existing model of learning (Earl's model) using a valid, reliable, practical, performance-based assessment tool. There is some evidence to support the implication that the SOAT, or any practical, performance-based assessment tool, can be used as a vehicle to assist student learning, thus supporting Earl's model of learning. Specifically, the PLAM is presented as a model to demonstrate the importance of the various components of Earl's model in learning and how each component is required to ensure complete learning as measured by the SOAT.

The predictive learning assessment model.
Contributor Notes
Dr Lafave is currently a Professor and Athletic Therapy Program Coordinator in Physical Education and Recreation Studies at Mount Royal University.