Do teachers think differently from non-teachers when scoring performance tasks?

Research on human scoring is growing in importance as next-generation student assessments include more performance tasks. One line of investigation attracting growing interest is rater, or scorer, cognition. This research seeks to explain what raters are thinking and how those thoughts influence their scoring of students’ performance on assessment tasks. The methodology used most frequently to study rater cognition is the so-called “think-aloud” technique, which asks raters to verbalize their thoughts as they score. Previous studies of rater cognition using this technique revealed cognitive differences among raters and suggested that there is a meaningful relationship between raters’ thought processes and their scoring performance, as measured by interrater agreement and agreement with expert-assigned scores.
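To make the two performance measures mentioned above more concrete, here is a minimal Python sketch of how agreement between a rater and an expert might be quantified. It is an illustration only: the studies referred to above are not described in enough detail here to say which statistics they used, so the choice of exact agreement and Cohen's kappa is an assumption, the function names are hypothetical, and the scores are made up for the example.

```python
from collections import Counter


def exact_agreement(scores_a, scores_b):
    """Proportion of responses on which the two score lists match exactly."""
    if len(scores_a) != len(scores_b) or len(scores_a) == 0:
        raise ValueError("score lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(scores_a, scores_b))
    return matches / len(scores_a)


def cohens_kappa(scores_a, scores_b):
    """Cohen's kappa: exact agreement corrected for chance agreement."""
    n = len(scores_a)
    observed = exact_agreement(scores_a, scores_b)
    counts_a = Counter(scores_a)
    counts_b = Counter(scores_b)
    # Chance agreement: probability both scorers assign the same score
    # point if each scored independently at their own marginal rates.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both scorers used one identical score point for every response.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Made-up scores on a 1-4 rubric for six student responses (illustration only).
rater_scores = [2, 3, 1, 4, 2, 3]
expert_scores = [2, 3, 2, 4, 2, 1]

print(f"Exact agreement: {exact_agreement(rater_scores, expert_scores):.3f}")
print(f"Cohen's kappa:   {cohens_kappa(rater_scores, expert_scores):.3f}")
```

The same two functions apply to either comparison: pass two raters' scores to get interrater agreement, or one rater's scores and the expert-assigned scores to get agreement with experts.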

Teaching experience is often a required or preferred qualification when scorers are selected for large-scale scoring projects. This is understandable, as some performance tasks, such as extended problems in mathematics and science, require in-depth content knowledge and complex problem-solving skills. Subject-area teachers, who have both, are ideal scorers for those assessments. Some of the research literature shows that, for performance assessments that do not require extensive content knowledge, such as essay writing, people with a college-level education can score effectively after receiving rigorous, standardized training. However, there is also research that claims scorers’ professional training and work experience influence their scoring behaviors, such as their use of scoring rubrics. My colleagues and I decided to do additional research to clarify how teachers’ professional judgments affect their scoring. This line of research grew out of a broader inquiry into the critical area of teachers’ professional judgment and how it affects their effectiveness in the classroom.

Little research exists that links teaching experience to scoring quality. Although teachers are more familiar than non-teachers with the task of assigning grades to student work, the claim that they do a better job at this task under conditions beyond classroom grading is unsubstantiated. Therefore, my colleagues and I will compare the thought processes and cognitive strategies of teachers with those of non-teachers while they are scoring, and relate any differences to measures of scoring accuracy.

If we were to discover that teachers think differently and score more accurately, we could use that information to modify how we train scorers. We would then try to train non-teachers to use the processes and strategies that teachers follow when they are making judgments about student work. In that way, the scoring of performance tasks could be substantially improved, without limiting the pool of potential scorers to current or former teachers.


Hua Wei

Hua Wei is a research scientist with Pearson. Dr. Wei’s primary research interests include multi-dimensional modeling, Bayesian methods, and model fit analyses.