I’ll just come right out and say it: I am fundamentally opposed to automated essay scoring (AES). For all the reasons I’ve mentioned throughout this blog, and throughout this course, I think that AES runs counter to every part of my pedagogical stance. We’ve discussed assessment as technology, and certainly AES is a technology, and I understand its merits. Sure, there are situations that fall outside classroom assessment, times when large-scale, high-stakes tests call for quick, efficient, reliable scoring of massive numbers of essays. But my question is: should there be?
AES does not treat composition as a process, but instead as a product. Machines code essays for criteria that have nothing to do with ideas or expression, focusing instead on whatever aspects of writing can be quantified. Here we say to our students: jump through this hoop, do x, y, and z, and you’ll pass the test. Assessments like this demonstrate a complete break between the taught and the tested curriculum, between pedagogy and assessment practices.

I see the merit of AES as efficient, quick, and technologically motivated, but I ask—again, mirroring assertions by Wardle and Roozen, Broad, Yancey, Reilly and Atkins, and others—isn’t reliability here at risk of overturning validity? AES measures what it is designed to measure, sure. But is what it measures really what we want to measure? Our pedagogy and the current moment in composition theory would indicate that, no, AES does not measure process, diversity, exploration, or reflection. Instead, it makes our students into algorithms, into nodes on a network, into mindless drones churning out five-paragraph essays that fit the formula but hold no pedagogical value whatsoever. There are ways to conduct large-scale assessments, but when we let the machines do the scoring, how far are we from letting the machines do the writing as well?
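To make the hoop-jumping concrete, here is a deliberately crude sketch of the kind of surface-feature scoring I’m describing. Every feature and threshold below is my own invention, a caricature of the genre rather than any real scoring engine, but the logic is the same in kind: count what is countable.

```python
import re

# Stock transition words to reward, wherever they land (invented list).
TRANSITIONS = {"however", "therefore", "moreover", "furthermore", "consequently"}

def score_essay(text: str) -> int:
    """Score an essay purely on quantifiable surface features."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]

    score = 0
    # Reward the five-paragraph formula.
    if len(paragraphs) == 5:
        score += 2
    # Reward sheer length.
    if len(words) >= 500:
        score += 2
    # Reward each stock transition word, regardless of what it connects.
    score += sum(1 for w in words if w.lower() in TRANSITIONS)
    # Reward "mature" average sentence length, whatever the sentences say.
    if sentences and len(words) / len(sentences) >= 15:
        score += 2
    return score
```

Notice what never appears anywhere in that rubric: an idea. An essay can max out this score while saying nothing at all, and that is exactly the break between pedagogy and assessment I’m objecting to.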