Writing Is a Journey: Thoughts on Writing, College, and the SAT

James Baldwin, a writer’s writer too often ignored, examines his drive to write in the context of race:

INTERVIEWER

If you felt that it was a white man’s world, what made you think that there was any point in writing? And why is writing a white man’s world?

BALDWIN

Because they own the business. Well, in retrospect, what it came down to was that I would not allow myself to be defined by other people, white or black. It was beneath me to blame anybody for what happened to me. What happened to me was my responsibility. I didn’t want any pity. “Leave me alone, I’ll figure it out.” I was very wounded and I was very dangerous because you become what you hate. It’s what happened to my father and I didn’t want it to happen to me. His hatred was suppressed and turned against himself. He couldn’t let it out—he could only let it out in the house with rage, and I found it happening to myself as well. And after my best friend jumped off the bridge, I knew that I was next. So—Paris. With forty dollars and a one-way ticket. (The Paris Review interview)

Prompted by the College Board’s announcement that the SAT would be revamped in 2016, including dropping the writing section added in 2005, The New York Times has posted a Room for Debate asking Can Writing Be Assessed?

So, unlike the moment when the SAT added writing (a change that heralded only doom for the field of composition), I want to take this opportunity to examine writing and the teaching of writing, because dropping writing from the SAT may prove to be a positive watershed for both.

First, let me offer a few points of context.

I am 53 and have been teaching for 31 years, most of that life and career dedicated to writing and teaching writing. I read and write every day—much of that reading and writing is serious in that it is connected to my professional work. But I also read and write extensively for pleasure, including my life as a poet.

Two facts about my writing life: (1) I write because I must, not because I choose to, and (2) I am always learning to write because writing is a journey, not something one can acquire fully or finish.

As well, I strongly embrace the foundational belief that writing is an essential aspect of human liberty, autonomy, agency, and dignity; this is part of the grounding of my work as a critical educator. Living and learning must necessarily include reading, re-reading, writing, and re-writing the world (see Paulo Freire, bell hooks, and Maxine Greene, just to mention a few).

Writing is also integral to academics, in terms of learning and scholarship. Writing is part of the learning process, but it is also a primary vehicle for scholarly expression.

Next, considering the importance of writing to human agency and education, any effort to standardize the assessment of writing, or to use writing assessments as gatekeepers for any child’s access to further education, is essentially corrupt and corrupting.

Adding writing to the SAT in 2005, then, was one of several powerful forces that have seriously crippled the teaching of writing in formal education.

These forces fail the fundamental value of writing because they distract from the process and act of writing and misread writing as a fixed skill that can be attained at some designated point along the formal education continuum.

As the Faculty Director of First Year Seminars at my university, I focus primarily on how we address the teaching of writing in those seminars (and throughout the curriculum). That role has highlighted for me a lesson I also learned while teaching high school English for 18 years: Many teachers, including English teachers, do not see themselves as writing teachers and often expect students to arrive at their courses as already proficient writers.

Essentially, then, using a writing assessment of some sort to identify students as college-ready as writers perpetuates the idea that we can and should have students demonstrating some fixed writing outcomes before we allow them access to higher education; this presumes in some ways that college will not be a place where people can and should learn to write.

In much the same way that the accountability paradigm is misguided in fixating on outcomes over conditions, seeing writing as a measurable skill useful for gatekeeping college entrance shifts our focus away from what experiences students need so that their continual learning to write in college can be better supported.

Yes, student outcomes matter, and samples of student writing in the right contexts may provide some powerful evidence of what students know as writers and what students need as writers. But something in the addition of writing to the 2005 SAT must not be forgotten: One-draft, timed, and prompted writing scored by rubrics, and even by computers, works against the important goals of writing [1].

Just as grading should be set aside in favor of feedback when teaching writing (see my chapter here), the question is not whether writing can be assessed, but how we ensure that all students have access to the common experiences necessary at all points along the formal education continuum.

What, then, are those common experiences—and once we implement them, how do we document those experiences in order to support both students’ equitable access to higher education and the continual learning to write that must be central throughout higher education?

Some thoughts on common experiences:

  • Rich and multi-genre/media reading experiences that include choice and assigned reading. Students need to develop genre awareness and discipline-specific awareness as readers.
  • Rich and multi-genre/media writing experiences that include the following: choice and assigned writing, peer and teacher feedback and conferences, workshop experiences drafting short and extended multi-draft compositions, and discipline-specific writing experiences.
  • Analysis of and experiences with a wide range of citation and documentation style sheets for integrating primary and secondary sources in original writing.
  • Continual consideration of expectations for writing both in academic/school settings and real world settings—challenging school-based norms such as thesis sentences and template essay formats.

While this list isn’t meant to be exhaustive, the point is that instead of seeking ways to assess test-based writing well, or continuing to explore tests and metrics that correlate strongly with actual writing proficiencies, we must commit ourselves to all students having the sorts of common experiences with writing necessary to grow as writers—both for their own agency and for their academic pursuits.

Finally, if we can commit to these conditions of learning instead of outcomes, we should then find ways to gather artifacts of these common experiences to use instead of metrics as we guide students through—and not gatekeep them from—formal education.

INTERVIEWER

Did what you wanted to write about come easily to you from the start?

BALDWIN

I had to be released from a terrible shyness—an illusion that I could hide anything from anybody. (The Paris Review interview)

[1] See The New Writing Assessments: Where Are They Leading Us? (Newkirk); From Failing to Killing Writing: Computer-Based Grading; and More on Failing Writing, and Students.

NOTE: For a historical perspective on teaching writing see selected works by Lou LaBrant.

If Fewer or No Tests, Then What?

When I responded to Students Should Be Tested More, Not Less by Jessica Lahey and the related study by Henry L. Roediger III and Jeffrey D. Karpicke in the blog post Students Should Be Tested Less, Then Not at All, the resulting comments and Tweets suggested that the topic of moving toward fewer and even no tests needs further discussion and clarification.

One aspect of debating the role of tests in education revolves around the term “test.” For the general public, Lahey’s headline, I am certain, triggers a relatively basic view of tests—students answering questions created by a teacher or a standardized testing company. For the general public, distinguishing between teacher-made tests and high-stakes standardized tests or between summative and formative assessments will likely not change that basic perception.

And thus, Lahey’s headline is certain to do more harm than good in the public debate about accountability, education reform, teacher effectiveness, and student achievement.

Many have noted the headline problem but quickly argue that Lahey’s article and Roediger and Karpicke’s research make a valuable case for formative assessment, adding that the study also raises concerns about high-stakes standardized testing and seeks to encourage more in-class formative assessments.

As I noted in my initial post, however, Roediger and Karpicke’s study is flawed—in its narrow definition of learning as retention and recall as well as in its idealized view of testing (the authors raise concerns but argue that the positives outweigh those negatives).

Here, then, I want to clarify that calling for fewer and then no tests is not hyperbole on my part, nor some idealized goal unfit for the real world of public school. As a co-editor with Joe Bower and building off the work of Alfie Kohn, I have detailed how to de-grade and de-test the writing classroom—practices I began during my 18 years as a public high school English teacher and then expanded as a writing teacher in first-year seminars.

In terms of magnitude, yes, high-stakes standardized tests are by far the most corrosive types of tests, negatively impacting teaching and learning. Standardized tests remain significantly biased by race, class, and gender, and their high-stakes status encourages the worst characterizations of schools, teachers, and students while also draining valuable resources and time from teaching and learning.

Despite the tradition of using standardized tests, U.S. education should end all high-stakes standardized testing—with a reasonable compromise being the periodic use of randomized samplings of NAEP to monitor large trends in measurable student outcomes (recognizing the limitations of measurable outcomes).

While ending standardized testing, or even lessening its frequency and impact, would be a huge move forward, continuing in-class testing would remain a misguided practice. Let me offer a few reasons and then an alternative.

Even the best in-class and teacher-made tests are reductive and only partial representations of learning because testing by its nature is artificial.

For example, consider testing any courses or student activities outside the so-called core curriculum, such as visual art, music, or athletics.

A course in painting that seeks students who can create their own original paintings does not begin with paint-by-numbers, and art teachers would never rely on traditional in-class tests of any kind to represent a student’s ability as a visual artist.

High school football teams, as well, line up each Friday night and actually play football; they don’t sit in desks and take tests to decide the best team (see Childress for an elaboration on this idea).

In other words, education has conceded the least accurate process, testing, to the core courses we deem essential, while allowing the so-called non-essential courses and activities the most authentic demonstrations of learning and teaching.

If tests are inadequate for determining a student’s ability in chorus, art, or soccer (where we allow and require students and players to perform the real task), I suggest that they are also inadequate for English, math, science, and social studies.

Now, before offering a brief consideration of what should replace testing, let me also explain that testing fails because it occupies time better spent doing real activities and receiving authentic feedback from teachers. This is the same issue as with isolated grammar instruction, which fails the teaching of writing.

Isolated grammar instruction does not transfer to student writing and the time spent on that futile grammar instruction would have been better spent asking students to write. Such is the case with testing—as it wastes time better spent doing whole and authentic activities.

A transition to whole and authentic activities by students in class must begin by reconsidering the place of content acquisition and retention. Most commitments to testing see content as fixed and assume that memorization of that content must come before application, evaluation, or synthesis.

This is the distorted traditional view of Bloom’s taxonomy applied both to instruction and assessment in U.S. education—a view that reduces Bloom’s work on assessment to a linear and sequential model of teaching and learning.

To embrace students engaging in whole and authentic activities instead of tests, the acquisition of knowledge must be re-imagined as the result of that engagement, not a prerequisite to that engagement.

We own and know facts, knowledge, and details because, and only once, we have used those facts in whole and authentic ways. Again, consider how we learn to paint a work of art, play an instrument, or participate in an athletic event. All of these require some basics, some practice, some artificial preparation, but the real learning comes from the doing, the feedback while performing as a novice, and then the re-doing, and re-doing.

About 60 years ago, Lou LaBrant (1953) lamented:

It ought to be unnecessary to say that writing is learned by writing; unfortunately there is need. Again and again teachers or schools are accused of failing to teach students to write decent English, and again and again investigations show that students have been taught about punctuation, the function of a paragraph, parts of speech, selection of “vivid” words, spelling – that students have done everything but the writing of many complete papers. Again and again college freshmen report that never in either high school or grammar school have they been asked to select a topic for writing, and write their own ideas about that subject. Some have been given topics for writing; others have been asked to summarize what someone else has said; numbers have been given work on revising sentences, filling in blanks, punctuating sentences, and analyzing what others have written….Knowing facts about language does not necessarily result in ability to use it. (p. 417)

And this is essentially my argument about testing.

If we want students to be better at taking tests, then more testing will certainly accomplish that goal (again, that is basically what Roediger and Karpicke show).

But if we redefine learning and frame our teaching goals toward whole and authentic behaviors by students, we must recognize that students learn by doing those whole and authentic things.

Instead of tests, then, and grades, students need extended blocks of time in school to perform in whole and authentic ways (ways that occur in the real world outside of school; ways that occur in art class, chorus, and band, and on athletic fields and courts), along with teachers who observe and offer rich and detailed feedback that contributes to those students trying those performances again and again.

Not artificial tests, whether we call them formative or summative, but whole and authentic performances and rich feedback leading to more and more performances.

Again, if you seek examples of what should replace the inordinate amount of time spent testing in schools, visit an art class, chorus, or an athletic event—or consider that a central aspect of science courses is the lab.

Commitments to testing are commitments to the static classroom where teachers are active, students are passive, and content is central. These commitments are asking very little of students.

I am calling for de-testing and de-grading the classroom in order to increase student activity, engagement, and thus learning in ways that are whole and authentic.

As Childress concludes in his argument that football is better than high school:

What I am saying is that we have a model for learning difficult skills — a model that appears in sports, in theater, in student clubs, in music, in hobbies — and it’s a model that works, that transmits both skills and joy from adult to teenager and from one teenager to another.

For Further Reading

More on Failing Writing, and Students

Education Done To, For, or With Students?

Teacher Quality, Wiggins and Hattie: More Doing the Wrong Things the Right Ways

Students Should Be Tested Less, Then Not at All

Students Should Be Tested More, Not Less by Jessica Lahey is not a compelling case for testing students more, but another example of journalism failing to accurately represent a relatively limited study related to education.

Several aspects of the article reveal that the title and apparent claim of the need for more testing are misleading:

Henry L. Roediger III, a cognitive psychologist at Washington University, studies how the brain stores, and later retrieves, memories. He compared the test results of students who used common study methods—such as re-reading material, highlighting, reviewing and writing notes, outlining material and attending study groups—with the results from students who were repeatedly tested on the same material. When he compared the results, Roediger found, “Taking a test on material can have a greater positive effect on future retention of that material than spending an equivalent amount of time restudying the material.” Remarkably, this remains true “even when performance on the test is far from perfect and no feedback is given on missed information.”

And to be fair, this is the actual abstract of the study discussed above:

A powerful way of improving one’s memory for material is to be tested on that material. Tests enhance later retention more than additional study of the material, even when tests are given without feedback [emphasis added]. This surprising phenomenon is called the testing effect, and although it has been studied by cognitive psychologists sporadically over the years, today there is a renewed effort to learn why testing is effective and to apply testing in educational settings. In this article, we selectively review laboratory studies that reveal the power of testing in improving retention [emphasis added] and then turn to studies that demonstrate the basic effects in educational settings. We also consider the related concepts of dynamic testing and formative assessment as other means of using tests to improve learning. Finally, we consider some negative consequences of testing that may occur in certain circumstances, though these negative effects are often small and do not cancel out the large positive effects of testing. Frequent testing in the classroom may boost educational achievement at all levels of education.

Not to trivialize the study, but in short, the research associates “learning” with retention (memorization) and assumes a relatively direct correlation between test scores and that narrow view of learning. In other words, if you want to raise summative test scores of retention, a series of smaller (and formative) tests is more effective at raising those scores than the study strategies to which testing was compared.

The problem with this “well, duh” study is that it remains trapped within the testing paradigm, even though the authors do concede (and then marginalize) problems with high-stakes testing and also briefly endorse the power of formative assessment: “the general procedure of using the results of classroom assessments as feedback for teachers to guide future instruction and also for students to guide their future studying” (p. 201).

This study, however, is not, as the title claims, a compelling argument* for more testing.

In fact, it is an ideal opportunity to argue that we must move beyond retention, recall, and memorization as foundational to what counts as learning. We must also begin to reject the notion that traditional testing formats (including selected-response formats in the classroom as well as standardized testing such as the SAT) are credible goals or evidence of learning.

Students should be tested less, and then not at all. Students should be offered opportunities to practice and perform whole and authentic activities (such as playing an instrument, creating a work of art, composing an essay, designing a budget for a project) during class time instead of preparing for and taking a battery of narrow assessments. Additionally, students need ample teacher feedback, and not grades, as part of drafting and revision processes surrounding those activities.

Retention and enhanced memory come from authentic engagement with real behaviors that students want to perform; memorization need not precede authentic displays of understanding, and must not be a primary proxy for learning. Ultimately, memorization is not deep learning, and testing limits, and never enhances, deep learning. Test scores also misrepresent student learning, teacher impact, and school quality.

Lahey’s article and the research on testing do raise valuable concerns about the high stakes associated with testing and lend credibility to formative assessment, but both, in the end, remain trapped within the failed testing paradigm that needs to be lessened and then rejected entirely.

* Broadly, the authors ignore entirely issues related to who decides what should be learned; in other words, critical educators tend to explore education not bound by the traditional testing paradigm within which this study resides like a bug trapped in amber. The narrow and static view of knowledge and learning is as problematic as the idealized view of testing that the study fails to challenge.

Faith-Based Education Reform: Common Core as Standards-and-Testing Redux

Let’s start with irony:

Compelling research suggests that the public in the U.S. is unique in its commitment to belief, often at the expense of evidence—leading me to identify the U.S. as a belief culture.

Additionally, while I remain convinced that the U.S. is a belief culture, I also argue that the political cartoon below, posted at Truthout, captures another important dynamic: Many committed to their own beliefs do not recognize that they are committed to belief even as they belittle others for being committed to theirs:

By Clay Bennett, Washington Post Writers Group | Political Cartoon

And this brings me to advocacy for Common Core standards, with one additional point: Along with embracing belief over evidence, the public (along with political leadership) in the U.S. tends to lack historical context.

Placed in the century-plus commitment to pursuing new and supposedly higher standards for public schools, then, Common Core advocacy falls into only two possible characterizations:

  1. Common Core is a response to the historical failure of all the many standards movements that have come before, and thus, the success of CC depends on CC being somehow a different and better implementation of an accountability/standards/testing paradigm.
  2. CC advocacy is yet another example of finding oneself in a hole and persisting with digging despite evidence to the contrary. In other words, CC may well be yet another commitment to a reform paradigm that isn’t appropriate regardless of how it is implemented, as John Thompson details in his review of The Allure of Order:

Jal Mehta’s masterpiece, The Allure of Order, answers the question, “Why have American [school] reformers repeatedly invested such high hopes in these instruments of control despite their track record of mixed results?” He starts with the review of how the bloom fell off the NCLB rose, explaining why its results in the toughest schools have been “miserable.” In the highest poverty schools the predictable result has been “rampant teaching to the test” which has robbed children of the opportunity to be taught in an engaging manner.

Mehta explains that this “outcome might have been surprising if it were the first time policymakers tried to use standards, tests, and accountability to remake schooling from above.” The contemporary test-driven reform movement is the third time that reformers have used the “alluring but ultimately failing brew” of top down accountability to “rationalize” schools and, again, they failed [emphasis added].

These two claims are themselves evidence-based (and it will be interesting to watch as others respond, as they have to my previous work on CC, by either ignoring evidence or garbling evidence to support what proves to be faith-based commitments to CC), and thus should provide a foundation upon which to continue the debate about CC.

CC advocacy and criticism are often based on false narratives and baseless claims (see Anthony Cody for one example of this problem and Ken Libby‘s [@kenmlibby] cataloguing on Twitter #corespiracy)—again reinforcing the pervasive and corrosive consequences of faith-based, but not evidence-based debates.

Instead, we should start with an evidence-based recognition about standards-driven education reform.

For example, the existence and/or quality of standards are not positively correlated with NAEP or international benchmark test data—leading Mathis (2012) to conclude about CC: “As the absence or presence of rigorous or national standards says nothing about equity, educational quality, or the provision of adequate educational services, there is no reason to expect CCSS or any other standards initiative to be an effective educational reform by itself [emphasis in original]” (p. 2 of 5).

Therefore, if CC advocacy is to be credible and thus effective, it must proceed within some principles:

  • Claims that CC advocacy is separate (and can be separated) from high-stakes testing must show evidence of standards having been implemented without high-stakes tests (and how that was effective) or evidence of some state implementing CC without high-stakes tests attached. Otherwise, this is a faith-based claim.
  • Claims that accountability built on standards and high-stakes testing is an effective education reform strategy must show evidence of how that has worked in the previous state-based accountability era and then explain why those examples of success must now be replaced by the new CC set of standards. Otherwise, this is a faith-based claim.
  • CC advocacy has been endorsed as a logical next step built on the call in NCLB for scientifically based education reform; thus, CC advocates must either comply with the two points above or concede that the CC era is a break from evidence-based reform.

I am no advocate for remaining only within rational, evidence-based, and quantifiable norms for decision making, by the way, but I am convinced we must make clear distinctions between evidence and belief—and I am equally convinced that many education reformers enjoy a flawed freedom to call for evidence from their detractors while practicing faith-based reform themselves.

It is the hypocrisy that bothers me, the hypocrisy of power:

scientist evidence – Married to the Sea

Let’s acknowledge that teachers currently work under the demand for measurable evidence of their impact on students while CC advocates impose faith-based policies: CC itself, new-generation high-stakes testing, merit pay, charter schools, value-added methods of teacher evaluation, and a growing list of commitments to education reform that are at least challenged, if not refuted, by evidence.

CC advocates now bear the burden of either offering the evidence identified above or admitting they are practicing faith-based education reform.

REVIEW: De-Testing and De-Grading Schools, Bower and Thomas

Reviewed by J. Spencer Clark, Utah State University; the review concludes:

The purpose of this book was to offer a map of the high-stakes accountability and standardization landscape, and more importantly to provide ways to navigate this landscape in positive ways. Bower and Thomas are successful in this regard and have provided a powerful critique that equally identifies powerful alternatives to high-stakes accountability. Overall, this is a fresh look at how to meld the theories behind de-grading and de-testing schools with actual classroom practice. This book could be a useful tool for instructors of pre-service methods and assessment courses, and possibly educational foundations courses at all levels, as it provides both an analysis of key aspects of a failing system of accountability and possible alternatives to it.

Bower, J., & Thomas, P. L. (2013). De-testing and de-grading schools: Authentic alternatives to accountability and standardization. New York, NY: Peter Lang USA.

De-Testing and De-Grading Schools

GUEST POST: Continu—what? Sara Newell

Continu—what?

Sara Newell

How do you derive meaning from a number? Should a parent or student respond differently to a 97% than to a 99%? What about a 75%? How do you know what number to assign to a student-created product you’ve never seen before? As a 5th grade teacher at the Charles Townes Center for highly gifted students in grades 3-8, I felt these questions were a constant thorn in my side.

My students qualified for invitation to the center in Greenville, South Carolina, based on scores in the top percentile on nationally normed tests. The current, numerical grading system has always presented quite a challenge to me as a public school classroom teacher—how do I push my students to strive for excellence without encouraging the crippling effects of perfectionism? As a means of giving students and parents a true measure of learning, personal achievement, and goal-setting, the numerical grading system always seemed to me ineffective at best. Since I began teaching gifted students, I have been in a constant struggle to find a more effective way to provide accurate feedback about their current performance while motivating them to continue to give their best effort on whatever challenges are presented next.

The assessment issues faced in our school were exaggerated versions of the problems caused by the numerical grading system in schools across the country. The nature of our students simply intensifies the problem. For example, the vast majority of my students can ace a grade-level multiple-choice test before I even engage in the first lesson. Should they just receive “A’s”? Is that what they earned? And if I instead increased the depth and complexity of my instruction to provide the appropriate intellectual challenge, and a student then mastered only 92% of that material, is it “fair” to assign a less-than-stellar grade?

This issue becomes even more important as students begin to move into high-school level courses. How does a 92 affect their GPA when they are enrolled in high-school and honors courses beginning in 7th grade? Should they be scored less than their peers who attend mainstream schools? Teachers of gifted (and all) students face these types of problems again and again as they are asked to differentiate to meet the needs of diverse learners. How does a teacher maintain some sort of equity and still challenge students appropriately? Some schools have attempted to rectify this disparity by offering higher grade points for honors or AP classes. This does not remedy the problem—it simply magnifies the spectrum of an inaccurate ruler and introduces an additional disadvantage for college applicants whose schools do not offer this option.  The issue of quality feedback and appropriate challenge remains.

For a while, I thought the solution to the problem was that I needed to design better rubrics. If I could just break assignments down into more concrete sections, the students would see what they needed to do and would be able to demonstrate mastery in a way that provided equal access to all while challenging students appropriately. (And I could still put a number on it and feel good about it.) Unfortunately, there were still roadblocks.

In a subject-integrated, inquiry-based classroom, how do you quantify “delightful,” “sophisticated,” “clever,” and all of the other descriptors that address the work of students who clearly went above and beyond the scope of the assignment? The scale model in gingerbread of the Metropolitan Museum of Art or the original musical composition in response to a Langston Hughes poem received a 100% that was “worth” exactly the same amount as the 100% of the student who ploddingly met the minimum requirement for each element. So, the rubric was a start, but it still lacked the depth I was seeking to truly communicate effectively with my students (not to mention their parents) about the quality of their work. Truly authentic assessment with feedback that can guide students into becoming independent learners still seemed out of reach.

Then, our principal brought back the idea of using continua from a school visit in Seattle. These reading, writing and math continua are based on the work of Bonnie Campbell Hill and provide a system to analyze student skill and progress over many years. The lists are simple and concise. They do not include every possible state standard but instead provide an overview of the crucial skills students need to be successful.

I jumped on these tools and piloted using them with my students almost immediately. My students completed self-evaluations, rating themselves at the “beginning,” “developing,” “proficient,” or “independent” levels described on the continua. I then added my own assessment of their skills. We used these in our student-led conferences, and I could see the beginnings of evidence-based discussions in their conversations. Students were using their writing portfolios and math assessments to provide concrete support for their evaluations. This represented a terrific shift in the way students and parents thought and talked about student work.

Instead of parent comments like, “What did you miss?” or “Great job!” I was hearing, “How did you decide you were proficient in reading fluency instead of independent?”  One parent asked his son, “I didn’t know you should be reading different genres. What are you reading right now? Is that the kind of book you always read?” These conversations were so much richer than the previous years’ event which basically consisted of students proudly showing their work while their parents made appreciative mumbles and nodded their heads.  I was excited by the beginnings of the give and take that marks a truly thoughtful discussion, but something was still missing. There was still not a way to communicate the truly exceptional or the gifted student who was playing it safe.

After musing on this initial success and talking repeatedly with a middle school colleague struggling with many of the same frustrations, we decided that we needed to create an additional continuum. The difference between the “minimum doer” and the outstanding student in our school was based not only on the ability to demonstrate skill mastery, but on the willingness to strive to apply critical and creative thinking skills. With this in mind, I pulled together a number of resources and began to hammer out a draft of a critical and creative thinking skills continuum. (I still haven’t hammered out a shorter name, though.) Dr. Richard Paul’s mini-guides on critical and creative thinking, Torrance’s work on creativity and Van Tassel-Baska’s writing on application of these skills in the classroom were all of great benefit to me as I worked. My hope was that this document would bridge the gap between the seemingly arbitrary nature of a number grade and the lightning strike of truly outstanding work. I ended up with a scale more rooted in psychology and child development than pedagogy and standards. This was initially surprising, but it became more satisfying as I realized that perhaps with this tool we might finally get to the roots of why one student was clearly outperforming another and more importantly—what to do about it.

The purpose of this creative and critical thinking skills continuum is to provide specific feedback for students and parents about the students’ current progress as well as to communicate in a straightforward way the next steps in their educational growth. Numeric grades are loaded with judgment, both objective and subjective, as well as academic stigma. Students feel that a 100% means that you are perfect while a 67% means that you are a loser. I’ve even had students tell me that even numbers are better than odd numbers (a 99% means that I am a point away from perfect—the most frustrating thing—but a 98% means that I’m solidly in the high “A” category). The focus on the number rather than what the number represents is a bizarre, yet true manifestation of the problem with attempting to quantify something as variable as knowledge and learning. Students become so focused on the number and what it “means” that they completely lose sight of the true purpose of assessment—reflection and growth. A continuum has no numbers—hence, no judgment. There is no “right” or “wrong” way to evaluate oneself with this method.

When I first used the continua in my classroom, I did not ask my students to provide evidence to support their evaluations. (That came later…) It was absolutely fascinating to see students read through and begin their self-evaluations on the critical and creative thinking continuum. I allowed only one hour of the class period for students to complete their analysis of this one-page document. However, most of my students took much more time than that. The room was silent. My students were incredibly focused on their reading and analysis. As 5th graders are still fairly egocentric at this concrete operational stage (thanks, Piaget), they seemed to feel that an assessment all about them was well worth their time. The questions students asked about concepts like “intellectual humility” and perseverance got to the core of what I had been trying to teach for years. Why is it important to continue to try to find a solution to a difficult problem? What does it mean to demonstrate originality? How do I know if I am taking an intellectual risk? These were the questions that I wanted my students to ask—and this was finally a document that set the stage to ask them.

Another revelation occurred when I reviewed these documents individually. I began to get a much more relevant picture of how each child saw him or herself. It was striking to compare the self-assessments with the list of test-score data that my principal had just sent out. (Yes, we are still in a public school. And yes, we still have to do things like set learning goals based on the number of points students “should” improve on certain tests.) Those standardized test scores have been relatively meaningless to me in the past. However, coupled with the information from the continuum self-assessments, a fascinating phenomenon was revealed. By and large, students in the top-performing test score group had consistently given themselves the lowest evaluations on the continuum while the students with the lowest (comparative) test scores had marked themselves as having mastered all or almost all of the critical and creative thinking skills. The Dunning-Kruger effect in action! We had a tremendous class discussion about this effect—in which less competent people in a field tend to overestimate their abilities. We analyzed how it applied to their attitudes and approaches to learning. I began to see a shift in several students’ attitudes and performance following this one illuminating discussion.

This initial work was very inspiring. I was surprised and pleased at the effort my students put into their evaluations. The vocabulary from the continuum was popping up in our discussions again and again. Instead of “I don’t get it,” I was hearing comments like, “I need to clarify this—do you mean…?” The students were beginning to look at learning through this alternate lens. I continued to have students review the continuum and reflect on their progress as we completed units of instruction. They documented their growth and reflected on their struggles.

We also used the continuum to decide on areas of focus for the next units. I previewed with the students what I felt were the “big ideas” for learning while they made choices about the skills they thought most important to develop. The quality of our communication continued to improve. Our goals were aligned—I was attempting to provide opportunities for them to improve in areas that THEY had identified as needing work. This method gave them a sense of control over their own learning.

Other teachers in my school are currently working to apply the math, reading, writing, and thinking skills continua in their classrooms. In the middle school, students are expected to provide support for their analysis as they complete their initial evaluations. In the lower grades, teachers use the continua to shift the focus from what students can’t do to what students COULD do. These continua are shared between teachers vertically to provide a long-range picture of the student’s development over time. This is something that a numeric grade based on grade-level standards fails to communicate.

At first some teachers struggled with how to make the continua relevant to their students, and all teachers recognized that the thought and effort needed to accurately utilize the continua required more time than typical “grading.” However, the value of the knowledge gained far outweighs the extra effort the analysis requires.

The next breakthrough came when I began to use the critical and creative thinking continuum in one-on-one parent conferences. For years, my conferences followed a fairly typical script. First, I would go over the previous year’s test scores. Then, I would discuss grades. The parent(s) and I would discuss any issues or “concerns,” and then I would try to end on some kind of positive note. For the parents of my highest achievers though, this was not a helpful meeting. While I’m sure they enjoyed hearing me list all of the delightful adjectives that described their child, I’m not sure that they felt that they were getting a clear picture of what their child could do to continue to grow.

The use of the continua has changed our discussions. My conferences conducted this fall focused on which elements their child was clearly demonstrating as well as areas their child could continue to develop. I was able to explain the Dunning-Kruger effect to parents who thought their child was practically perfect but who in reality was barely making an effort. I described to the parents of the perfectionists what intellectual risk-taking was and how their child could begin to do it. The conversations were so much richer than in the past, and parents did not feel that I was judging their parenting or their children.

Instead, the focus was on attributes and evidence.  Parents were surprised and fascinated when reading their child’s reflections. The conferences now were a detailed conversation about the whole child and how he or she interacted with the world. Even more importantly, parents were now able to support our classroom objectives with greater accuracy.  One parent commented, “We were delighted to discuss and learn about the Creative Continuum. The Continuum is visual and the skill sets are clearly presented…Our meeting was one of the most informative conferences I have attended.”

Shifting our focus from a numerical grading system to a continuum-based evaluation has started to address many of the assessment issues I was facing. My students have stopped asking, “Is this for a grade?” as though that alone determines the value of an assignment. I continue to work to provide more opportunities for students to develop those critical and creative thinking skills. Knowing that I am going to be asking them to evaluate their growth—I am very conscious of the need to design learning experiences that require students to demonstrate those skills.

Most importantly, the students themselves feel a sense of ownership over their learning, and now they are making the effort to ask accurate, insightful questions about what they can do and what they still need to learn to do. Removing the focus from the number grade and putting it back on the evaluation of skills and attributes improves the quality of instruction, performance and communication. The end result is a focus on authentic student learning and success.

References

Davis, G.A., & Rimm, S.B. (2009). Education of the gifted and talented (6th ed.). Boston, MA: Allyn and Bacon.

Dunning, D., Johnson, K., Ehrlinger, J., & Kruger, J. (2003). Why people fail to recognize their own incompetence. Current Directions in Psychological Science, 12(3), 83–87.

Elder, L., & Paul, R. (2005). The miniature guide to critical thinking concepts & tools (4th ed.). Dillon Beach, CA: The Foundation for Critical Thinking.

Hill, B. C. (2008). Retrieved from http://www.bonniecampbellhill.com/support.php

Van Tassel-Baska, J., & Stambaugh, T. (2006). Comprehensive curriculum for gifted learners (3rd ed.). New York, NY: Pearson.

For Further Reading

Rubrics

Kohn, A. (2006). The trouble with rubrics. English Journal, 95(4), 12-15.

Wilson, M. (2007). Why I won’t be using rubrics to respond to students’ writing. English Journal, 96(4), 62-66.

Wilson, M. (2006). Rethinking rubrics in writing assessment. Portsmouth, NH: Heinemann.

Self-Assessment

Liberating Grades/Liberatory Assessment, sj Miller

What Are Tests Really Measuring?: When Achievement Isn’t Achievement

High-stakes standardized testing must be the most resilient phenomenon ever to exist on the planet. Joining high-stakes standardized testing in that (dis)honor would be the persistent but misleading claim that test scores are primarily achievement (and a growing future candidate for this honor is the claim that test scores by students, labeled “achievement,” are also credible metrics for “teacher quality”).

Let’s start with a couple of statistical breakdowns of what test scores constitute:

But in the big picture, roughly 60 percent of achievement outcomes is explained by student and family background characteristics (most are unobserved, but likely pertain to income/poverty). Observable and unobservable schooling factors explain roughly 20 percent, most of this (10-15 percent) being teacher effects. The rest of the variation (about 20 percent) is unexplained (error). In other words, though precise estimates vary, the preponderance of evidence shows that achievement differences between students are overwhelmingly attributable to factors outside of schools and classrooms (see Hanushek et al. 1998; Rockoff 2003; Goldhaber et al. 1999; Rowan et al. 2002; Nye et al. 2004).

Just 14 per cent of variation in individuals’ performance is accounted for by school quality. Most variation is explained by other factors, underlining the need to look at the range of children’s experiences, inside and outside school, when seeking to raise achievement.
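Purely to illustrate the arithmetic behind those breakdowns, here is a minimal simulation sketch in Python (my own illustration, not a model from either source quoted above), assuming the rough 60/20/20 split and treating the three components as independent; it builds scores as the sum of background, schooling, and error components and recovers each component’s share of the total score variance:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000

# Hypothetical variance shares, echoing the rough split quoted above:
# ~60% student/family background, ~20% schooling (incl. teacher effects),
# ~20% unexplained error. Components are assumed independent.
background = rng.normal(0.0, np.sqrt(0.60), n)
schooling = rng.normal(0.0, np.sqrt(0.20), n)
error = rng.normal(0.0, np.sqrt(0.20), n)

score = background + schooling + error  # simulated "achievement" score

for name, part in [("background", background),
                   ("schooling", schooling),
                   ("error", error)]:
    print(f"{name:>10}: {np.var(part) / np.var(score):.2f} of score variance")
```

Under such a decomposition, even a dramatic improvement in the schooling component could move only a minority of the variation in scores, which is precisely the point both excerpts are making.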

Next, consider this from the UK:

Differences in children’s exam results at secondary school owe more to genetics than teachers, schools or the family environment, according to a study published yesterday.

The research drew on the exam scores of more than 11,000 16-year-olds who sat GCSEs at the end of their secondary school education. In the compulsory core subjects of English, maths and science, genetics accounted for on average 58% of the differences in scores that children achieved.

While the genetics claim is potentially dangerous, and certainly controversial, the article offers some important clarifications:

The findings do not mean that children’s performance at school is determined by their genes, or that schools and the child’s environment have no influence. The overall effect of a child’s environment – including their home and school life – accounted for 36% of the variation seen in students’ exam scores across all subjects, the study found….

Writing in the journal, the authors point out that genetics emerges as such a strong influence on exam scores because the schooling system aims to give all children the same education. The more school and other factors are made equal, the more genetic differences come to the fore in children’s performance. The same situation would happen if everyone had a healthy diet: differences in bodyweight would be more down to genetic variation, instead of being dominated by lifestyle.

Plomin said one message from the study was that differences in children’s performance were not merely down to effort. “Some children find it easier to learn than others do, and I think it’s appetite as much as aptitude,” he said. “There is a motivation, maybe because you like to do what you are good at.”

Genetics, he said, caused people to create, select and modify their environment, and so nature drives nurture, which in turn reinforces nature. A child with a gift for maths seeks friends who like maths. A child who learns to read easily might join a book club, and work through books on the shelves at home.

Additional points drawn from this research present some strong cautions about continued reliance on not only standardized tests, but also uniform national standards:

“Education is still focused on a one-size-fits-all approach and if genetics tells us anything it’s that children are different in how easily they learn and what they like to learn. Forcing them into this one academic approach is going to make some children confront failure a lot and it doesn’t seem a wise approach. It ought to be more personalised,” he said.

“These things are as heritable as anything in behaviour, and yet when you look in education or in educational textbooks for teachers there is nothing on genetics. It cannot be right that there’s this complete disconnect between what we know and what we do.”

Finally, consider this research on the disconnect between test scores and student abilities:

To evaluate school quality, states require students to take standardized tests; in many cases, passing those tests is necessary to receive a high-school diploma. These high-stakes tests have also been shown to predict students’ future educational attainment and adult employment and income.

Such tests are designed to measure the knowledge and skills that students have acquired in school — what psychologists call “crystallized intelligence.” However, schools whose students have the highest gains on test scores do not produce similar gains in “fluid intelligence” — the ability to analyze abstract problems and think logically — according to a new study from MIT neuroscientists working with education researchers at Harvard University and Brown University.

In a study of nearly 1,400 eighth-graders in the Boston public school system, the researchers found that some schools have successfully raised their students’ scores on the Massachusetts Comprehensive Assessment System (MCAS). However, those schools had almost no effect on students’ performance on tests of fluid intelligence skills, such as working memory capacity, speed of information processing, and ability to solve abstract problems….

Instead, the researchers found that educational practices designed to raise knowledge and boost test scores do not improve fluid intelligence. “It doesn’t seem like you get these skills for free in the way that you might hope, just by doing a lot of studying and being a good student,” says Gabrieli, who is also a member of MIT’s McGovern Institute for Brain Research.

So should we be shocked when students passing high-stakes reading tests in Texas admit they cannot read?:

A female classmate of Tony’s says she can’t get through the stories she reads in school unless someone explains them to her. She’s passed all her state tests, too. How? She says she uses classroom-taught “strategies” on her English reading test and that if she underlines and highlights enough and narrows down her options, she has a better chance of guessing right by playing the odds. She failed her math state test because of the word problems, so she employed her English strategies there on the retry attempt and passed.

Or that the most recent analysis of the teaching of writing in middle and high schools has found that accountability and high-stakes testing have kept best practice in writing from taking hold?:

Overall, in comparison to the 1979–80 study, students in our study were writing more in all subjects, but that writing tended to be short and often did not provide students with opportunities to use composing as a way to think through the issues, to show the depth or breadth of their knowledge, or to make new connections or raise new issues…. The responses make it clear that relatively little writing was required even in English…. [W]riting on average mattered less than multiple-choice or short-answer questions in assessing performance in English…. Some teachers and administrators, in fact, were quite explicit about aligning their own testing with the high-stakes exams their students would face. (Applebee & Langer, 2013, pp. 15-17)

Our educational world has been turned over wholesale to testing, despite ample evidence that test scores are many things (markers of privilege, markers of genetic predispositions, markers of teaching-to-the-test), among the least of which are student achievement and teacher quality.

If we don’t have the political will to de-test our schools, the evidence is clear that the stakes associated with testing must be greatly lessened and that the amount of time spent teaching to the tests and administering the tests must also be reduced dramatically.

More on Failing Writing, and Students

Throughout the 1980s and 1990s, I taught English in the rural South Carolina high school I attended as a student. Many of those years, I taught Advanced Placement courses as part of my load (I taught all levels of English and usually sophomores and seniors) and was department chair.

Over the years, I worked hard to create an English department that served our students well. We made bold moves to provide all students in each grade the same literature textbooks (not different texts for different levels, as was the tradition, thus labeling students publicly) and to stop issuing grammar texts and vocabulary books to students (teachers retained classroom sets to use as they chose).

And a significant part of our English classes was the teaching of writing—having students write often and produce multiple-draft essays. I stressed the need to end isolated grammar instruction (worksheets and textbook exercises) and urged that grammar, mechanics, and usage be addressed directly in the writing process.

Even though the principal was supportive and a former English teacher, at one faculty meeting while the administrators were discussing recent standardized test scores for the school (yes, this test-mania was in full force during the 80s and 90s in SC), the principal prefaced his comments about the English test scores with, “Keep in mind that the English scores may not reflect what we are doing here since we don’t teach grammar.”*

In a nutshell, that sort of mischaracterization and misunderstanding about best practice is at the foundation of my previous post exploring Joan Brunetta’s writing about how standards- and test-based schooling had failed her.

A few comments on the post and a follow-up discussion in the comments with Robert Pondiscio—as well as a subsequent post by Pondiscio at Bridging Differences—have prompted me to continue to address not only how we still fail the teaching of writing but also how that failure is a subset of the larger failure of students by traditional approaches to teaching that are teacher-centered and committed to core knowledge.

Revisiting “The Good Student Trap” in the Accountability Era

Adele Scheele has coined the term “the good student trap,” which perfectly captures how schools create a template for what counts as being a good student and then how that template for success fails students once they attend college and step into the real world beyond school. My one caveat to Scheele’s ideas is that especially during the accountability era—a ramping up of traditional practices and norms for education—this trap affects all students, not just the good ones.

And the trap goes something like this, according to Scheele:

Most of us learned as early as junior high that we would pass, even excel if we did the work assigned to us by our teachers. We learned to ask whether the test covered all of chapter five or only a part of it, whether the assigned paper should be ten pages long or thirty, whether “extra credit” was two book reports on two books by the same author or two books written in the same period. Remember?

We were learning the Formula.

• Find out what’s expected.
• Do it.
• Wait for a response.

And it worked. We always made the grade. Here’s what that process means: You took tests and wrote papers, got passing grades, and then were automatically promoted from one year to the next. That is not only in elementary, junior, and senior high school, but even in undergraduate and graduate school. You never had to compete for promotions, write résumés, or rehearse yourself or even know anyone for this promotion. It happened automatically. And we got used to it….

What we were really learning is System Dependency! If you did your work, you’d be taken care of. We experienced it over and over; it’s now written in our mind’s eye. But nothing like this happens outside of school. Still, we remain the same passive good students that we were at ten or fourteen or twenty or even at forty-four. The truth is, once learned, system dependency stays with most of us throughout our careers, hurting us badly. We keep reinforcing the same teacher-student dichotomy until it is ingrained. Then we transfer it to the employers and organizations for whom we’ll work.

This model of traditional schooling includes a teacher who makes almost all the decisions and students who are rewarded for being compliant—and that compliance is identified as “achievement.”

In English classes, a subset of this process is reflected in how we teach, and fail, writing. As I noted in my earlier post, Hillocks and others have shown that traditional commitments to the five-paragraph essay (and its cousins, template models of the essay) and a return to isolated grammar exercises have resulted from the rise of high-stakes testing of writing. As well, the accountability era has given rubrics a central place in driving what students write, how teachers respond to student writing, and how students revise their essays.

So what is wrong with five-paragraph essays, grammar exercises, and rubrics?

Let’s focus on rubrics to examine why all of these are ways in which we fail writing and students. Alfie Kohn explains:

Mindy Nathan, a Michigan teacher and former school board member told me that she began “resisting the rubric temptation” the day “one particularly uninterested student raised his hand and asked if I was going to give the class a rubric for this assignment.”  She realized that her students, presumably grown accustomed to rubrics in other classrooms, now seemed “unable to function [emphasis added] unless every required item is spelled out for them in a grid and assigned a point value.  Worse than that,” she added, “they do not have confidence in their thinking or writing skills and seem unwilling to really take risks.”

Rubric-based writing and assessment, then, reflect the exact problem I highlighted earlier, one noted by Applebee and Langer: teachers know more today than ever about how to teach writing, but commitments to accountability and testing prevent that awareness from being applied in class; as Kohn explains:

What all this means is that improving the design of rubrics, or inventing our own, won’t solve the problem because the problem is inherent to the very idea of rubrics and the goals they serve.   This is a theme sounded by Maja Wilson in her extraordinary new book, Rethinking Rubrics in Writing Assessment. In boiling “a messy process down to 4-6 rows of nice, neat, organized little boxes,” she argues, assessment is “stripped of the complexity that breathes life into good writing.”  High scores on a list of criteria for excellence in essay writing do not mean that the essay is any good because quality is more than the sum of its rubricized parts.  To think about quality, Wilson argues, “we need to look to the piece of writing itself to suggest its own evaluative criteria” – a truly radical and provocative suggestion.

Wilson also makes the devastating observation that a relatively recent “shift in writing pedagogy has not translated into a shift in writing assessment.”  Teachers are given much more sophisticated and progressive guidance nowadays about how to teach writing but are still told to pigeonhole the results, to quantify what can’t really be quantified.  Thus, the dilemma:  Either our instruction and our assessment remain “out of synch” or the instruction gets worse in order that students’ writing can be easily judged with the help of rubrics.

Once fulfilling the expectations of the rubric becomes the primary, if not exclusive, goal for the student, we arrive at the SAT writing section and its unintended consequences, as Newkirk explains (English Journal, November 2005) about students writing to prompts and rubrics for high-stakes testing:

George Hillocks Jr. has shown that another persistent problem with these types of prompts concerns evidence—the writer must instantly develop instances or examples to be used for support. In a sample of the released papers from the Texas state assessment, some of this evidence looks, well, manufactured….When I first read this essay, I imagined some free spirit, some rebel, flaunting the ethics of composition and inventing evidence to the point of parody. But when I shared this letter with a teacher from Texas, she assured me that students were coached to invent evidence if they were stuck [emphasis added]. In my most cynical moment, I hadn’t expected that cause. And what is to stop these coached students from doing the same on the SAT writing prompt? Who would know?

As but one example above, “the good student trap” is replicated day after day in the ways students are prompted to write and then in how teachers respond to and grade that writing. The failure lies in who makes almost all of the decisions (the teacher) and who is rewarded for being mostly compliant (the students).

Core knowledge advocates and proponents of rubric-driven assessment tend to misrepresent critical and progressive educators who seek authentic learning experiences for students, leveling charges of “not teaching X” or “So what shall we teach?” (with the implication that core knowledge educators want demanding content but critical and progressive educators don’t). The real question we must confront, however, is not what content we teach and students learn, but who decides and why.

If we return to rubrics: well-designed rubrics do everything for students, everything writers need to do for themselves in both college and the real world beyond school (see Education Done To, For, or With Students? for a full discussion of this failure).

Rubric-driven writing asks less of students than authentic writing in a writing workshop does.

Traditional core knowledge classrooms likewise decide for students what knowledge matters and, again, ask less of students than challenging them to identify what knowledge matters and to critique that knowledge as valuable (or not) for themselves as well as for the larger society. The tension of this debate is between mere knowledge acquisition and confronting the norms of knowledge in the pursuit of individual autonomy and social justice—making students aware of the power implications of knowledge so that they live their lives with purpose and dignity instead of having life happen to them.

My call is not for ignoring the teaching of grammar, but for confronting the norms of conventional language so that students gain power over language instead of language having power over them. Why do we feel compelled not to end a sentence with a preposition? Where did that claim come from, and who benefits from such a convention?

Why does academic writing tend to erase the writer from the writing (“No ‘I’!”) and who benefits from that convention?

You see, critical approaches to teaching go beyond the mere acquisition of knowledge that some authority has deemed worthy (what Freire labels the “banking concept” of teaching). Yes, knowledge matters, but not in the fixed ways core knowledge advocates claim and pursue. Critical approaches to knowledge honor the dignity of human autonomy in children, something that many adults seem at least leery of, if not fearful of, allowing in their classrooms.

Core knowledge, rubrics, templates, prescriptions, and prompts are all tools of control, ways to trap students in the pursuit of compliance. They aren’t challenging (or “rigorous,” as advocates like to say), and they aren’t learning.

As Scheele explains:

System dependency is not the only damaging thing we learned in the context of school: We learned our place….

Yet most of us were falsely lulled into a false self labeled “good” by fulfilling the expected curriculum. The alternative was being “bad” by feeling alienated and losing interest or dropping out….

So what’s the problem? The problem is the danger. The danger lies in thinking about life as a test that we’ll pass or fail, one or the other, tested and branded by an Authority. So, we slide into feeling afraid we’ll fail even before we do—if we do. Mostly we don’t even fail; we’re just mortally afraid that we’re going to. We get used to labeling ourselves failures even when we’re not failing. If we don’t do as well as we wish, we don’t get a second chance to improve ourselves, or raise our grades. If we do perform well, we think that we got away with something this time. But wait until next time, we think; then they’ll find out what frauds we are. We let this fear ruin our lives. And it does. When we’re afraid, we lose our curiosity and originality, our spirit and our talent—our life.

Beyond Rigor, Templates, and Compliance

In my position at a small and selective liberal arts university, I now teach mostly good students in my writing-intensive first year seminars. Students are asked to read and discuss Style, a descriptive look at grammar, mechanics, and usage that raises their awareness of, and skepticism about, conventional uses of language but rejects treating conventions as fixed rules. (We ask why high school teachers tend to teach students that fragments are incorrect when many published works contain fragments, a question that leads into a discussion of purposeful language use.)

Throughout the course, students are asked to plan and then write four original essays that must be drafted several times with feedback from peers and from me. The focus, topic, and type of each essay must be chosen by the student. To help them make those choices, we discuss what they were required to do for essays in high school, we explore what different fields expect in college writing, and we read and analyze real-world essays in order to establish the context for the choices, and the consequences of those choices, that writers make—specifically when those writers are students.

I offer this here in case you think somehow I am advocating “fluffy thinking” or a “do-your-own-thing philosophy” of teaching, as some have charged. And I invite you to ask my students which they prefer, which is easier—the template, prompt-based writing of high school that created their good student trap or my class. [HINT: Students recognize that five-paragraph essays and rubrics are easier, and they often directly ask me to just tell them what to write and how. As Mindy Nathan noted above, good students are “unable to function [emphasis added] unless every required item is spelled out for them in a grid and assigned a point value.”]

My students reinforce for me every class session that we have failed the teaching of writing and those students by doing everything for them in school. They are nearly intellectually paralyzed with fear about the consequences of their own decisions.

When challenged and supported to be agents of their own learning, their own coming to understand the world, and their own decisions about what knowledge matters and why, however, they are more than capable of the tasks.

And with them in mind, I must ask, who benefits from compliant, fearful students as intellectual zombies, always doing as they are told?

-----

* Although he phrased his comment poorly, my principal was, in fact, making a valid point that a multiple-choice English (grammar) test was unlikely to fairly represent what our students had learned about composing original essays. He intended to make a swipe at the quality of the test, although he did so gracelessly.

The New York Times in an Era of Kool-Aid Journalism

With Advertisements for the Common Core, the Editorial Board at The New York Times has offered its special brand of Kool-Aid journalism in service of the careless claim that 2013 NAEP data somehow prove education reform is a success:

The country is engaged in a fierce debate about two educational reforms that bear directly on the future of its schoolchildren: first, teacher evaluation systems that are taking hold just about everywhere, and, second, the Common Core learning standards that have been adopted by all but a few states and are supposed to move the schools toward a more challenging, writing-intensive curriculum.

Both reforms — or at least the principles behind them — got a welcome boost from reading and math scores released recently by the federal government. …

Two examples are the District of Columbia and Tennessee, among the first to install more ambitious standards and teacher evaluations. Tennessee jumped from 46th in the country in fourth-grade math two years ago to 37th, and from 41st in the nation to 34th in eighth-grade reading. The District of Columbia, though still performing below the national average, has also shown progress. The scores of its students improved significantly in both math and English.

Moreover, according to Education Secretary Arne Duncan, the eight states that managed to get the Common Core standards in place in time for the latest National Assessment of Educational Progress exams this year showed improvement from 2009 scores in either reading or math.

Kool-Aid journalism occurs when journalists relinquish their work as researchers and reporters to political appointees—in this case, the Editorial Board of the NYT turns Secretary Duncan’s baseless claims into statements of fact that support an editorial position. The Board concludes:

But the progress seen elsewhere — like Tennessee and the District of Columbia — shows that improvement is possible if the states strengthen their resolve and apply solutions that have been shown to work.

However, if the Editorial Board at the NYT had made even a basic effort to confirm Duncan’s claims, the Board would have discovered that NAEP data are complicated and cannot prove in any way that recent reforms are a success.

As I have detailed, and despite my having no training as a journalist or an investigative reporter, the Editorial Board could have benefited from the following clarifications about NAEP, all of which I found easily and all of which discredit Duncan’s claims and the Board’s position:

When I point out that raw changes in state proficiency rates or NAEP scores are not valid evidence that a policy or set of policies is “working,” I often get the following response: “Oh Matt, we can’t have a randomized trial or peer-reviewed article for everything. We have to make decisions and conclusions based on imperfect information sometimes.”

This statement is obviously true. In this case, however, it’s also a straw man. There’s a huge middle ground between the highest-quality research and the kind of speculation that often drives our education debate. I’m not saying we always need experiments or highly complex analyses to guide policy decisions (though, in general, these are always preferred and sometimes required). The point, rather, is that we shouldn’t draw conclusions based on evidence that doesn’t support those conclusions.

This shows that the places with the greatest gains were D.C., Tennessee, and Indiana, three places that have embraced the corporate reform strategy of testing, closing down schools, and opening charters.  If this was the only data we had access to, it would seem to prove that “the ends justify the means” when it comes to education reform….

There are many other things to analyze, and I’m looking forward to reading how others analyze the data.  For example, it is curious that Louisiana had ‘gains’ that were smaller than the national average despite that state having, certainly, the most aggressive reforms occurring.  For ‘reformers’ who are so obsessed with test scores and test score gains, this is certainly something that shouldn’t be ignored.  Also, Washington and Hawaii were pretty high up on the ‘growth’ numbers even though Washington does not have charter schools and Hawaii has been very slow to adopt Race To The Top reforms so their ‘gains’ can’t be attributed to those.

I’m still pretty confident that in the long run education reform based primarily on putting pressure on teachers and shutting down schools for failing to live up to the PR of charter schools will not be good for kids or for the country, in general.  I hope politicians won’t accept the first ‘gains’ chart without putting it into context with the rest of the data.

• Latest NAEP Results, by G.F. Brandenburg, exposes that DC gains pre-date the reforms championed by Duncan and the NYT:

First of all, the increases in some of the scores in DC (my home town) are a continuation of a trend that has been going on since about 2000. As a result of those increases, DC’s fourth grade math students, while still dead last in the nation, have nearly caught up with MISSISSIPPI, the lowest-scoring state in the US.

You will have to strain your imagination to see any huge differences between the trends pre-Rhee and post-Rhee. (She was installed after testing was over in 2007.)…

So, the Educational DEforms instituted by Rhee, Henderson, and their corporate masters have not produced the promised miracles.

Yesterday gave us the release of the 2013 NAEP results, which of course brings with it a bunch of ridiculous attempts to cast those results as supporting the reform-du-jour. Most specifically yesterday, the big media buzz was around the gains from 2011 to 2013 which were argued to show that Tennessee and Washington DC are huge outliers – modern miracles – and that because these two settings have placed significant emphasis on teacher evaluation policy – that current trends in teacher evaluation policy are working – that tougher evaluations are the answer to improving student outcomes – not money… not class size… none of that other stuff.

I won’t even get into all of the different things that might be picked up in a supposed swing of test scores at the state level over a 2 year period. Whether 2 year swings are substantive and important or not can certainly be debated (not really), but whether policy implementation can yield a shift in state average test scores in a two  year period is perhaps even more suspect….

Is Tennessee’s 2-year growth an anomaly? we’ll have to wait at least another two years to figure that out. Was it caused by teacher evaluation policies? That’s really unlikely, given that those states that are equally and even further above their expectations have approached teacher evaluation in very mixed ways and other states that had taken the reformy lead on teacher policies – Louisiana and Colorado – fall well below expectations.

As it stands, the position taken by the NYT Editorial Board lacks even basic credibility, but it does expose the utter failure that is Kool-Aid journalism.