Yancey, Kathleen Blake. “Looking back as we look forward: Historicizing Writing Assessment” (19 pages) AW
Yancey articulates a history of writing assessment (from 1950-2000) framed around three waves. Historicizing writing assessment in this manner allows us to see how and why developments have been made in assessment and what we might expect in subsequent waves.
The use of “waves” as a guiding metaphor is not used arbitrarily: waves allow us to note how each wave overlaps, “one wave feeding into another but without completely displacing the waves that came before” (131). The waves are characterized by trends “that constitute a general forward movement, at least chronologically, but a movement that is composed of both kinds of waves, those that move forward and those that don’t…it allows us to mark past non-discrete patterns whose outlines and effects become clearer over time and upon reflection” (131).
The three waves of writing assessment, as outlined by Yancey, are as follows: “During the first wave (1950-1970), writing assessment took the form of objective tests; during the second wave (1970-1986), it took the form of the holistically scored essay; and during the current wave, the third (1986-present), it has taken the form of portfolio assessment and of programmatic assessment” (131).
Expertise: the emerging discipline of Writing Assessment
The movement of these waves can be understood through certain frames. At the onset, she notes that there is often a tension in expertise: who knows what about writing or assessment? And who controls how students are assessed? In the first wave, in particular, testing—and its expertise—was located outside of the classroom (usually for purposes of placement into a course or track). Testing specialists made decisions about how to move students in and out of which classrooms; teachers decided what content students should be taught. However, as teachers soon realized, classrooms were often defined by the technology of testing: the ways in which testing influenced the teaching and learning in the classroom. In the two subsequent waves, the roles of testing specialists and educators merge and overlap, “with administrators and then faculty taking on institutional and epistemological responsibilities for testing previously claimed by testing experts” (133). This merging and overlapping—which resulted in a new kind of expertise—is key in the development of a new discipline: writing assessment.
Yancey attributes the movement from indirect measures of assessment to direct ones to the involvement of teachers in the assessment of student writing. Namely, “(1) teachers saw the difference between what they taught in their classrooms—writing—and what was evaluated—selection of homonyms and sentence completion exercises; (2) they thought that difference mattered; and (3) they continued to address this disjunction rhetorically, as though the testing enterprise could be altered” (134).
Reliability and Validity
The first wave was marked by a central focus on reliability—and an ancillary concern for efficiency. “What it all comes down to is twofold: (prohibitive) cost, of course, but also ‘the inevitable margin of test error,’ a margin that is a given in a testing context, but that can be minimized” through reliability measurements (136). In this way, the first wave was dominated by a central question: “which measure can do the best and fairest job of prediction with the least amount of work and the lowest cost?” Yancey responds, “the answer: the reliable test” (136).
The second wave is dominated by concerns of validity: does a multiple-choice test accurately assess students’ writing ability? The simple answer is no; instead, holistic writing assessment emerges as a means of assessing students’ writing ability through writing. However, as Yancey notes, this movement from multiple-choice tests to writing tests still relied heavily on frames of reliability to retain credibility: “they had to assure that essay tests would perform the same task as the objective tests” (137). As Yancey writes, the second wave is characterized by working both within and against the psychometric paradigm. And, as previously stated, such testing moved closer to teachers’ classroom assessment practices.
The third wave is characterized by the portfolio: “multiple writing samples on different occasions and in various rhetorical modes” (138). This means of assessment is employed by programs, and it offers a construct of writing that comes closest to accepted ideas about how writing works. For example, raters are teachers who aren’t calibrated or trained to agree, but rather negotiate meaning—what Elbow and Belanoff describe as a process of communal assessment in which communal standards are articulated during the assessment process.
Writing Assessment as Social Act
“As a social act, writing assessment exerts enormous influence both explicitly and implicitly, often in ways we, both faculty and students, do not fully appreciate. Certainly, writing assessment has been used historically to exclude entire groups of people” (143-4). In particular, in the same way that writing assessments construct an idea of writing, they also construct identities: “what we are about, in a phrase, is formation of the self: and writing assessment, because it wields so much power, plays a crucial role in what self, or selves, will be permitted—in our classrooms; in our tests; ultimately, in our culture” (144).
In the first wave, the “tested self” was defined in multiple-choice tests: “passive, forced-choice response to external expert’s understanding of language conventions” (145). Such a constructed self is easy to see; however, Yancey notes that the self constructed in holistically scored tests is also fraught with baggage: the conventions and substance of the writing are determined for students. “Experts constrain what is possible, by creating the prompt, designing the scoring guide used to evaluate the text, training the readers who do the scoring” (145). The portfolio, then, allows multiple voices through diverse texts.
An aside here: it seems that the third wave—when written at the time—was more prospectus than reflection of actual practices. Certainly, portfolios emerged in both classroom practices and writing programs, but the large-scale assessments that characterize the first two waves did not widely adopt such assessment, and still haven’t. In this way, the merging between classroom practitioners (educators) and testing specialists (including policy makers) seems exaggerated in the third wave. In fact, I’d argue that the third wave is continuing to emerge, but has been moving slowly due to the bifurcation of practitioners and policy makers.