Devaluing Teachers in the Age of Value-Added

“We teach the children of the middle class, the wealthy and the poor,” explains Anthony Cody, continuing:

We teach the damaged and disabled, the whole and the gifted. We teach the immigrants and the dispossessed natives, the transients and even the incarcerated.

In years past we formed unions and professional organizations to get fair pay, so women would get the same pay as men. We got due process so we could not be fired at an administrator’s whim. We got pensions so we could retire after many years of service.

But career teachers are not convenient or necessary any more. We cost too much. We expect our hard-won expertise to be recognized with respect and autonomy. We talk back at staff meetings, and object when we are told we must follow mindless scripts, and prepare for tests that have little value to our students.

During the 1980s and 1990s, U.S. public schools and the students they serve felt the weight of standards- and test-based accountability, a bureaucratic process that has wasted huge amounts of taxpayers’ money and incalculable time and energy assigning labels, rankings, and blame. The Reagan-era launching of accountability has lulled the U.S. into a sort of complacency that rests on maintaining a gaze on schools, students, and test data so that no one must look at the true source of educational failure: poverty and social inequity, including the lingering corrosive influences of racism, classism, and sexism.

The George W. Bush and Barack Obama eras—resting on intensified commitments to accountability such as No Child Left Behind (NCLB) and Race to the Top (RTTT)—have continued that misguided gaze and battering, but during the past decade-plus, teachers have been added to the agenda.

As Cody notes above, however, political leaders, the media, and the public simultaneously claim that teachers are the most valuable part of any student’s learning (a factually untrue claim), but also that high-poverty and minority students can be taught by those without any degree or experience in education (Teach for America) and that career teachers no longer deserve their profession: no tenure, no professional wages, no autonomy, no voice in what or how they teach.

And while the media and political leaders maintain these contradictory narratives and support these contradictory policies, value-added methods (VAM) of evaluating and compensating U.S. public school teachers are being adopted, again simultaneously, even as the research base repeatedly reveals that VAM is yet another flawed use of high-stakes accountability and testing.

When Raj Chetty, John N. Friedman, and Jonah E. Rockoff released (and re-released) reports claiming that teacher quality equates to significant earning power for students, the media and political leaders tripped over themselves to cite (and cite) those reports.

What do we know about the Chetty, et al., assertions?

From 2012:

[T]hose using the results of this paper to argue forcefully for specific policies are drawing unsupported conclusions from otherwise very important empirical findings. (Di Carlo)

These are interesting findings. It’s a really cool academic study. It’s a freakin’ amazing data set! But these findings cannot be immediately translated into what the headlines have suggested – that immediate use of value-added metrics to reshape the teacher workforce can lift the economy, and increase wages across the board! The headlines and media spin have been dreadfully overstated and deceptive. Other headlines and editorial commentary has been simply ignorant and irresponsible. (No Mr. Moran, this one study did not, does not, cannot negate  the vast array of concerns that have been raised about using value-added estimates as blunt, heavily weighted instruments in personnel policy in school systems.) (Baker)

And now, a thorough review concludes:

Can the quality of teachers be measured the way that a person’s weight or height is measured? Some economists have tried, but the “value-added” they have attempted to measure has proven elusive. The results have not been consistent over tests or over time. Nevertheless, a two-part report by Raj Chetty and his colleagues claims that higher value-added scores for teachers lead to greater economic success for their students later in life. This review of the methods of Chetty et al. focuses on their most important result: that teacher value-added affects income in adulthood. Five key problems with the research emerge. First, their own results show that the calculation of teacher value-added is unreliable. Second, their own research also generated a result that contradicts their main claim—but the report pushed that inconvenient result aside. Third, the trumpeted result is based on an erroneous calculation. Fourth, the report incorrectly assumes that the (miscalculated) result holds across students’ lifetimes despite the authors’ own research indicating otherwise. Fifth, the report cites studies as support for the authors’ methodology, even though they don’t provide that support. Despite widespread references to this study in policy circles, the shortcomings and shaky extrapolations make this report misleading and unreliable for determining educational policy.

Similar to the findings in Edward H. Haertel’s analysis of VAM, Reliability and validity of inferences about teachers based on student test scores (ETS, 2013), the American Statistical Association has issued ASA Statement on Using Value-Added Models for Educational Assessment, emphasizing:

Research on VAMs has been fairly consistent that aspects of educational effectiveness that are measurable and within teacher control represent a small part of the total variation in student test scores or growth; most estimates in the literature attribute between 1% and 14% of the total variability to teachers. This is not saying that teachers have little effect on students, but that variation among teachers accounts for a small part of the variation in scores. The majority of the variation in test scores is attributable to factors outside of the teacher’s control such as student and family background, poverty, curriculum, and unmeasured influences.

The VAM scores themselves have large standard errors, even when calculated using several years of data. These large standard errors make rankings unstable, even under the best scenarios for modeling. Combining VAMs across multiple years decreases the standard error of VAM scores. Multiple years of data, however, do not help problems caused when a model systematically undervalues teachers who work in specific contexts or with specific types of students, since that systematic undervaluation would be present in every year of data.
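The ASA's distinction between random error and systematic undervaluation can be illustrated with a small simulation. What follows is only a sketch under stated assumptions, not any state's actual VAM: the function names, the yearly noise level, and the size of the context penalty are all hypothetical numbers chosen for illustration. Each simulated yearly score is the teacher's true effect, plus a fixed context penalty, plus random noise.

```python
import random
import statistics


def simulate_vam_averages(true_effect, context_bias, yearly_noise_sd,
                          n_years, n_trials, seed=0):
    """Return the multi-year average score from each simulated trial.

    Hypothetical model: yearly score = true effect + context bias + noise.
    This is an illustration, not a real VAM specification.
    """
    rng = random.Random(seed)
    averages = []
    for _ in range(n_trials):
        yearly = [true_effect + context_bias + rng.gauss(0.0, yearly_noise_sd)
                  for _ in range(n_years)]
        averages.append(statistics.fmean(yearly))
    return averages


def summarize(averages, true_effect):
    """Return (spread of the averages, remaining systematic offset)."""
    spread = statistics.stdev(averages)                 # empirical standard error
    offset = statistics.fmean(averages) - true_effect   # error that averaging keeps
    return spread, offset


# Assumed values: true effect 0.0, context penalty -0.5, yearly noise SD 1.0.
one_year = summarize(simulate_vam_averages(0.0, -0.5, 1.0, 1, 5000), 0.0)
five_years = summarize(simulate_vam_averages(0.0, -0.5, 1.0, 5, 5000), 0.0)

print("1 year : spread=%.2f offset=%.2f" % one_year)
print("5 years: spread=%.2f offset=%.2f" % five_years)
```

Under these assumptions, averaging five years shrinks the spread of the scores, but the offset caused by the context penalty stays about the same, which is the point the ASA makes about teachers who work in specific contexts or with specific types of students.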

Among Di Carlo, Baker, Haertel, and the ASA, several key patterns emerge regarding VAM: (1) VAM remains an experimental statistical model, (2) VAM is unstable and significantly impacted by factors beyond a teacher’s control and beyond the scope of that statistical model to control, and (3) implementing VAM in high-stakes policies exaggerates the flaws of VAM.

The rhetoric about valuing teachers rings hollow more and more as teaching continues to be dismantled and teachers continue to be devalued by misguided commitments to VAM and other efforts to reduce teaching to a service industry.

VAM as reform policy, like NCLB, is sham-science being used to serve a corporate need for cheap and interchangeable labor. VAM, ironically, proves that evidence does not matter in education policy.

As with all workers in the U.S., we simply do not value teachers.

Political leaders, the media, and the public call for more tests for schools, teachers, and students, but they themselves continue to fail to acknowledge the mounting evidence against test-based accountability.

And thus, we don’t need numbers to prove what Cody states directly: “But career teachers are not convenient or necessary any more.”

VAMboozled by Empty-Suit Leadership in SC

Rep. Andy Patrick, R-Hilton Head Island (SC), has made two flawed claims recently, one about leadership and another about teacher evaluation (“S.C. lawmaker proposes teacher evaluation plan,” Charleston Post and Courier, December 10, 2013).*

First, and briefly, Patrick asserts that SC needs leadership for superintendent of education, discounting the importance of experience or expertise. As I will address below, Patrick’s lack of experience and expertise is, ironically, evidence that leadership is not enough. In fact, leadership begins with experience and expertise; it doesn’t replace those essential qualities.

Next, and more importantly, Patrick’s and current Superintendent Mick Zais’s pursuit of test-based teacher evaluation reform is deeply flawed and discredited by research on value-added methods (VAM) of evaluating teachers.

Endorsing VAM-heavy teacher evaluation joins grade retention, charter schools, and Common Core in a series of policy decisions in SC that are countered by the research base, resulting in a tremendous waste of time and funding that would be better spent on our students and our state.

For example, Edward H. Haertel’s Reliability and validity of inferences about teachers based on student test scores (ETS, 2013) now offers yet another analysis that details how VAM fails, again, as a credible policy initiative. Haertel’s analysis offers the following:

  • First, Haertel addresses the popular and misguided perception that teacher quality is a primary influence on measurable student outcomes. As many researchers have detailed, teachers account for about 10% of student test scores. While teacher quality matters, access to experienced and certified teachers as well as addressing out-of-school factors dwarf narrow measurements of teacher quality.
  • Next, Haertel confronts the myth of the top quintile teachers, outlining three reasons that arguments about those so-called “top” teachers’ impact are exaggerated.
  • Haertel also acknowledges the inherent problems with test scores and what VAM advocates claim they measure, specifically that standardized tests create a “bias against those teachers working with the lowest-performing or the highest performing classes” (p. 8).
  • The next two sections detail the logic behind VAM as well as the statistical assumptions in which VAM is grounded, laying the basis for Haertel’s main assertion about using VAM in high-stakes teacher evaluations.
  • The main section of the report reaches a powerful conclusion that matches the current body of research on VAM:

These 5 conditions would be tough to meet, but regardless of the challenge, if teacher value-added scores cannot be shown to be valid for a given purpose, then they should not be used for that purpose.

So, in conclusion, VAM may have a modest place in teacher evaluation systems, but only as an adjunct to other information, used in a context where teachers and principals have genuine autonomy in their decisions about using and interpreting teacher effectiveness estimates in local contexts. (p. 25)

  • In the last brief section, Haertel outlines a short call for teacher evaluations grounded in three evidence-based “common features”:

First, they attend to what teachers actually do — someone with training looks directly at classroom practice or at records of classroom practice such as teaching portfolios. Second, they are grounded in the substantial research literature, refined over decades of research, that specifies effective teaching practices….Third, because sound teacher evaluation systems examine what teachers actually do in the light of best practices, they provide constructive feedback to enable improvement. (p. 26)

Haertel’s concession that VAM has a “modest” place in teacher evaluation is no ringing endorsement, but it certainly refutes the primary—and expensive—role that VAM is playing in proposals to reform teacher evaluation in SC and across the U.S.

Would SC benefit from focusing on teacher quality, as well as ensuring all children have equitable access to experienced and certified teachers? Absolutely.

But current calls by leaders with no experience or expertise in education are failing that possibility by rushing to implement policy that is contradicted by a growing body of research discounting the value of VAM as a key element of teacher evaluation.

SC students, teachers, and schools cannot afford doubling-down on a failed test-based education culture, and certainly, SC cannot afford more leadership without expertise, which is what Representative Patrick is offering.

* Submitted to, but so far unpublished in, the Charleston Post and Courier.

Please see VAMboozled web site for research refuting VAM.

VAM Fails Test, Again: The Bizarro World of Education Reform

The great state of South Carolina (and for full effect, you should hear that with “great” and “state” rhyming, sort of, with “pet” because that is how the good ol’ boy patriarchy says it around here) continues down a path all too familiar across the U.S.: adopt any and all education reform policies that other states are rushing to implement, even (and maybe especially) when research fails to support the practices.

I have catalogued the inexcusable political and public support in SC for retaining third graders based on high-stakes testing scores—a policy directly linked to Read, Florida.

And despite equally ample evidence to the contrary about basing teacher evaluations on value-added methods (VAM), also a corrosive policy in Florida, Charleston, SC, is moving forward with BRIDGE, characterized by Peter Smyth as A BRIDGE to I Have No Clue Where.

Public policy implementing grade retention, VAM, and lingering commitments to merit pay (just to name a few) continues to thrive in SC and across the U.S., seemingly as a bald-faced snub of the idealistic (and increasingly Orwellian) call in No Child Left Behind that education policy must be “scientifically based.”

Education Reform in Bizarro World

In the DC Universe, Superman has often encountered Bizarro World, Htrae. Education reform is no less bizarre with the political and public mania for policies that have been and continue to be refuted by large bodies of research.

For example, Edward H. Haertel’s Reliability and validity of inferences about teachers based on student test scores (ETS, 2013) now offers yet another analysis that details how VAM fails, again, as a credible policy initiative—with a few caveats*.

Briefly, the analysis by Haertel offers the following:

  • First, Haertel addresses the popular and misguided perception that teacher quality is a primary influence on measurable student outcomes. As many researchers have detailed, teachers account for about 10% of student test scores, as shown in the report’s graphic on teacher influence (see p. 5).
  • Next, Haertel confronts the myth of the top quintile teachers (pp. 6-7*), outlining three reasons that arguments about those so-called “top” teachers’ impact are exaggerated.
  • Haertel also acknowledges the inherent problems with test scores and what VAM advocates claim they measure, specifically that standardized tests create a “bias against those teachers working with the lowest-performing or the highest performing classes” (p. 8).
  • The next two sections detail the logic behind VAM as well as the statistical assumptions in which VAM is grounded (pp. 9-13), laying the basis for Haertel’s main assertion about using VAM in high-stakes teacher evaluations.
  • The main section of the report, An Interpretive argument for value-added model (VAM) teacher effectiveness estimates (pp. 14-25), reaches a powerful conclusion that matches the current body of research on VAM:

These 5 conditions would be tough to meet, but regardless of the challenge, if teacher value-added scores cannot be shown to be valid for a given purpose, then they should not be used for that purpose.

So, in conclusion, VAMs may have a modest place in teacher evaluation systems, but only as an adjunct to other information, used in a context where teachers and principals have genuine autonomy in their decisions about using and interpreting teacher effectiveness estimates in local contexts. (p. 25)

  • In the last brief section, Haertel outlines a short call for teacher evaluations grounded in three evidence-based “common features”:

First, they attend to what teachers actually do — someone with training looks directly at classroom practice or at records of classroom practice such as teaching portfolios. Second, they are grounded in the substantial research literature, refined over decades of research, that specifies effective teaching practices….Third, because sound teacher evaluation systems examine what teachers actually do in the light of best practices, they provide constructive feedback to enable improvement. (p. 26)

Haertel’s concession that VAM has a “modest” place in teacher evaluation is no ringing endorsement, but it certainly refutes the primary—and expensive—role that VAM is playing in the rush to reform teacher evaluation in SC and across the U.S.

In the irony of ironies that can occur only in the Bizarro World of education reform, each time VAM is tested, it fails, and each time it fails, more states line up to implement it.

* Haertel offers a more than generous analysis of the Chetty, Friedman, and Rockoff (2011) claim that teacher impact can be extrapolated into adult earnings for students. I urge readers to examine Bruce Baker’s and Matthew Di Carlo’s more nuanced and cautious analyses of those claims.

When the Shoe Is on the Other Foot: Lessons for Teachers in Misguided Accountability

If we imagined a pictorial representation of the evolution of education accountability, similar to the standard image we associate with human evolution, then we’d have to confront that the accountability era begun in the early 1980s focused first on students, requiring them to pass exit exams (regardless of their having taken and passed all of the required courses for graduation) in order to receive their diplomas.

Next, schools were the target of accountability with the advent and distribution of school report cards.

By the end of the first and beginning of the second decades of the twenty-first century, teachers have found their place at the accountability table, with some suggesting that teachers are now being fed their just desserts. Merit pay linked to student test scores and the more recent flurry of implementing value-added methods (VAM) of teacher evaluation and retention in many ways bring teachers into decades-long predicaments faced by students and schools: the misguided and unfair weight of standardized testing used in dysfunctional and invalid ways.

When I posted about how absurd teacher accountability has become, I expected most on my Twitter feed to recognize the situation in New York as unfair and a harsh warning of the mounting weight of failed accountability:

A Bronx performing arts school’s dance instructor will be judged on students’ English exam scores. Physical education teachers at a transfer school in Brooklyn are going to teach Olympic history lessons to prepare students for the history tests that will help determine their ratings. And teachers in Queens are putting the fate of their evaluations into a final exam that they don’t teach, but yields high pass rates.

The scenarios are not unusual — across [New York City] this year, thousands of teachers will be rated in large part based on test scores of subjects and students that they do not teach.

Rather, the scenarios are examples of how schools have tried to comply with a new teacher evaluation system that must factor student performance into final ratings. They also represent how the original purpose of the evaluations, to differentiate teachers’ effectiveness, has been squeezed by restrictive state laws, limited resources, and a tight timeline for implementation.

“It’s insane to me that 40 percent of my evaluation is going to be based on someone else’s work,” said Jason Zanitsch, a high school drama teacher who will share the same “student growth” score with colleagues in his school this year.

However, the first response I received raised a much different point:

@plthomasEdD if teachers don’t like this then way assign all the group work which is just as bad for the kids? hmmm….

My first response was to note that holding teachers accountable for the work of other teachers and the test scores of students they do not even teach is not truly analogous to having students do group work, and then be graded for that group work.

As @Tim_10_ber and I exchanged tweets, I came to recognize that I was arguing from my idealized position on how best to implement group work (group work must require collaboration—or it is simply students sitting close to each other doing individual work—and any grades assigned to group work must be articulated to reflect participation) and @Tim_10_ber was confronting a position with which I agree—that group work is often implemented and graded carelessly and thus unfairly to students.

It is from that recognition, then, that I want to make an argument about the only potential positive outcome related to the unjustifiable use of merit pay and VAM in teacher evaluation, pay, and retention: teachers need to learn how to teach better now that the shoe is on the other foot. Some ironic lessons teachers should learn from invalid teacher accountability include the following:

  • Testing and grades often do far more educational harm than good; the time has come to consider de-testing and de-grading our teaching. Teacher feedback, student self-assessment, student-created rubrics, and re-imagined assessment situations (such as group assessments) and formats are all better alternatives to tests and grades, if our goal is equitable and effective learning opportunities for students.
  • The central flaw with teacher accountability being linked to student test scores and the standards movement is that teachers have experienced declining autonomy in both their content and pedagogy as well as the high-stakes tests themselves. Accountability without autonomy is tyranny. This lesson translates into how often student learning is reduced to mere compliance. Students being held accountable also must have their autonomy honored; thus, students deserve far more choice in their learning than they have been traditionally allowed.
  • As noted by @Tim_10_ber, teachers must be far more vigilant about designing, assigning, and assessing group work, with a keen eye on autonomy, engagement, and causation/correlation (what are fair associations between each student and the outcomes of the group).

The accountability era has nearly destroyed public education. Little about accountability based on standards and high-stakes testing can be embraced or endorsed.

But oppressive and even capricious mandates tend to be leveled at the least among us first; once those policies trickle up to those in power—in other words, when the shoe is on the other foot—living with inequity, unfair accountability, and unworkable conditions can open our eyes to our own flaws as teachers.

As we continue to fight for our professional autonomy and dignity, taking moral stands of non-cooperation, let’s be sure to bring that fight to our classrooms and honor the autonomy and dignity of all our students as a model for those in power who have yet to see the flaws of their ways through the distorting lens of privilege they wear.

In the words of Henry David Thoreau in “Civil Disobedience”:

If I devote myself to other pursuits and contemplations, I must first see, at least, that I do not pursue them sitting upon another man’s shoulders. I must get off him first, that he may pursue his contemplations too….

If the injustice is part of the necessary friction of the machine of government, let it go, let it go; perchance it will wear smooth — certainly the machine will wear out. If the injustice has a spring, or a pulley, or a rope, or a crank, exclusively for itself, then perhaps you may consider whether the remedy will not be worse than the evil; but if it is of such a nature that it requires you to be the agent of injustice to another, then, I say, break the law. Let your life be a counter friction to stop the machine. What I have to do is to see, at any rate, that I do not lend myself to the wrong which I condemn [emphasis added].

What We Know Now (and How It Doesn’t Matter)

Randy Olson’s Flock of Dodos (2006) explores the evolution and Intelligent Design (ID) debate that represents the newest attack on teaching evolution in U.S. public schools. The documentary is engaging, enlightening, and nearly too fair considering Olson admits upfront that he stands with scientists who support evolution as credible science and reject ID as something outside the realm of science.

Olson’s film, however, offers a powerful message that rises above the evolution debate. Particularly in the scenes depicting scientists discussing (during a poker game) why evolution remains a target of political and public interests, the documentary shows that evidence-based expertise often fails against clear and compelling messages (such as “teach the controversy”)—even when those clear and compelling messages are inaccurate.

In other words, ID advocacy has often won in the courts of political and public opinion despite having no credibility within the discipline it claims to inform—evolutionary biology.

With that sobering reality in mind, please identify what XYZ represents in the following statement about “What We Know Now”:

Is there a bottom line to all of this? If there is one, it would appear to be this: Despite media coverage, which has been exceedingly selective and misrepresentative, and despite the anecdotal meanderings of politicians, community members, educators, board members, parents, and students, XYZ have not been effective in achieving the outcomes they were assumed to aid….

This analysis addresses school uniform policies. It was conducted by sociologist David L. Brunsma, who examined evidence on school uniform effectiveness (did school uniform policies achieve the stated goals of those policies?) “from a variety of data gathered during eight years of rigorous research into this issue.”

This comprehensive analysis of research from Brunsma replicates the message in Flock of Dodos—political, public, and media messaging continues to trump evidence in the education reform debate. Making that reality more troubling is that a central element of No Child Left Behind was a call to usher in an era of scientifically based education research. As Sasha Zucker notes in a 2004 policy report for Pearson, “A significant aspect of the No Child Left Behind Act of 2001 (NCLB) is the use of the phrase ‘scientifically based research’ well over 100 times throughout the text of the law.”

Brunsma’s conclusion about school uniform policies, I regret to note, is not an outlier in education reform but a typical representation of education reform policy. Let’s consider what we know now about the major education reform agendas currently impacting our schools:

Well into the second decade of the twenty-first century, then, education reform continues a failed tradition of honoring messaging over evidence. Neither the claims made about educational failures, nor the solutions for education reform policy today are supported by large bodies of compelling research.

As the fate of NCLB continues to be debated, the evidence shows not only that NCLB has failed its stated goals, but also that politicians, the media, and the public have failed to embrace the one element of the legislation that held the most promise—scientifically based research—suggesting that dodos may in fact not be extinct.

* Santelices, M. V., & Wilson, M. (2010, Spring). Unfair treatment? The case of Freedle, the SAT, and the standardization approach to differential item functioning. Harvard Educational Review, 80(1), 106-133; Spelke, E. S. (2005, December). Sex differences in intrinsic aptitude for mathematics and science? American Psychologist, 60(9), 950-958; see page 4 for 2012 SAT data: http://media.collegeboard.com/digitalServices/pdf/research/TotalGroup-2012.pdf

A Call for Non-Cooperation: So that Teachers Are Not Foreigners in Their Own Profession

Gandhi’s views on enhancing the vernaculars…so that Indians are “not foreigners in their own land” are directly tied to his opinions on developing communities (for “the poorest of the poor”) and making community service an integral part of any education. (Ramanathan, 2006, pp. 235-236)

Standing in the middle of the road offers some statistical advantage to avoiding being run over since you aren’t in the prescribed lanes of traffic, but standing in the middle of the road can never assure the safety that refusing to walk into the road to begin with does.

Writing about a call for a moratorium on implementing and testing Common Core State Standards (CCSS) from union leadership, Anthony Cody ends his blog post with three questions:

What do you think? Should we join Randi Weingarten in pushing for one year’s delay in the harsh consequences attached to Common Core assessments? Will this year put the project on sound footing?

These questions about CCSS have been joined by two other calls for compromise and civility: Matthew Di Carlo challenging charges that value-added methods (VAM) of teacher evaluation are “junk science” and Jennifer Jennings penning an apology to Secretary of Education Arne Duncan for protests at his 2013 talk at the American Educational Research Association (AERA). [1]

Weingarten, Di Carlo [2], and Jennings share a call for standing in the middle of the road, a quest for ways to compromise, and these all appear reasonable positions. Ultimately, however, moratoriums, compromise, and civility are all concessions to the current education reform movement and the policies at the center of those reforms, specifically CCSS and VAM.

Teachers as Foreigners in Their Own Profession

Briefly, I want to identify how arguments about a CCSS moratorium, implementing VAM properly and cautiously, and the need for civility are concessions that render teachers foreigners in their own profession.

As long as the debate about CCSS and VAM remains how best to implement them, the essential questions remain unasked, and the agendas behind both are assured success. While I want to address the civility argument next, let me note here that calls for CCSS and VAM are inherently civil and derogatory, exposing the myopic concern for the civility of those rejecting Duncan’s discourse and policies.

The implied and stated messages of calls for CCSS and more high-stakes testing include the following: (1) Teachers do not know what to teach, or how, and (2) teachers are unlikely to perform at the needed levels of effort in their profession unless they are held accountable by external and bureaucratic means.

The implied and stated messages of calls for VAM and merit pay include the following: (1) The most urgent problem at the core of educational outcomes is teacher quality, and (2) teachers are unlikely to perform at the needed levels of effort in their profession unless they are held accountable by external and bureaucratic means.

Calls for CCSS and VAM also share another implied and stated message: Failed educational outcomes are the result of in-school deficiencies; in effect, out-of-school factors are irrelevant in the pursuit of education reform.

These messages are factually false and, despite the civility of the language, irrevocably offensive.

Standing in the middle of the road of bureaucratic, accountability-based school reform, then, may decrease the likelihood of being run over, but it concedes the road itself to those who have built it, to those who govern the laws of transportation.

To answer Cody’s second and third questions, then, No. And now to his first.

Civility: Standing in the Middle of the Road of Accountability

The call for civility exposes a foundational problem with the current education reform debate because, for all practical purposes, there is no debate.

Civility, CCSS, and VAM may all have some appeal in theory, but all of them fall apart in reality, in their implementation.

Civility is the last recourse of the powerful, those who can afford to appear civil because they hold all the power.

Through the lens of history, we must recognize that CCSS will become “what is tested is what is taught,” as all standards movements have shown.

VAM also sits in a long history of the corrosive consequences of stack ranking, merit pay, and competition.

And this brings us back to standing in the middle of someone else’s road.

Education reform and policy have been historically and are currently under the control of political and corporate leadership who are not educators—many of whom did not even attend public schools, many of whom send their own children to schools unlike the environments they promote and implement.

The locus of power in education is catastrophically inverted; thus, we do not need more or different mechanisms for accountability-based education reform, but we do need a new era of non-cooperation.

The goal of non-cooperation must include seeking ways in which to shift the priorities of the locus of power:

  • First, the central locus of power in education is the student, situated in her/his home and community.
  • Next in importance is the locus of power afforded the teacher in her/his unique classrooms.
  • These must then merge for a locus of power generated within the community of the school.
  • Finally, the locus of power in this school-based community must radiate outward.

A Call for Non-Cooperation

Non-cooperation, as found in the philosophy and actions of Gandhi, represents another inversion—away from in-school only education reform and toward, as Ramanathan explains, “communal and educational change”:

As is evident, the take on “education” presented here is not the usual one—of teaching and learning in formal contexts of classrooms and institutions—but one that is intended to move us toward becoming collectively open to realizing that very valuable “education” often goes on outside the constraints of classrooms: in ashrams, in madrassas, in extracurricular programs, by local, politically minded youth, all drawing on local vernacular ways of healing rifts. Indeed, “education” in both these institutions is civic and community education that seems to assume Gandhian ideals of “Non-Cooperation” (and nonformal education) and that is aimed at primarily effecting changes in the community, sometimes before addressing issues relevant to formal education. (p. 230)

Non-cooperation, then, moves beyond a call for teacher autonomy; instead, non-cooperation is the act of autonomy by “people directly involved” (Ramanathan, p. 231):

Not only do they have Gandhi’s larger philosophy of Non-Cooperation against political hegemonies  [emphasis added] at their core…, but they also opened up for me a way of understanding both how Gandhianism is situated and how particular dimensions of the identities of participants (Kanno, 2003; Menard-Warwick, 2005; Norton, 2000; Pavlenko & Blackledge, 2004) get laminated. I was able to see how Gandhianism is first collaboratively interpreted in workshops, then applied and translated on the ground in most local of contexts, and then recast and reinterpreted by individuals and groups as they regroup. (Ramanathan, p. 232)

Non-cooperation is a new paradigm that begins with those most directly impacted by the institution (here, education)—parents, students, teachers. In other words, the people most directly impacted ask the foundational questions: Do we need formal education? And if so, what does that include and how should that be implemented?

This is not about seeking compromise at someone else’s table, not about standing still in the middle of someone else’s road.

The purpose of universal public education, then, is refocused in ways that address the needs of the least among us, as Gandhi envisioned:

[Nonformal education] … will check the progressive decay of our villages and lay the foundation for a juster social order in which there is no unnatural division between the “haves” and the “have nots” and everybody is assured a living wage and the rights to freedom.…It will provide a healthy and a moral basis of relationship between the city and village and will go a long way towards eradicating some of the worst evils of the present social insecurity and poisoned relationship between the classes. (Harijan, 9-10-37, cited in Prasad, 1924…). (qtd. in Ramanathan, p. 236)

Bureaucratic accountability-based reform is ill equipped to address inequity, mismatched with goals of social justice since the paradigm is authoritarian, the locus of power exclusively with the “haves.”

Non-cooperation seeks instead, as Ramanathan explains:

[an orientation] toward viewing education in broader, community-oriented terms to draw out “the best in children,” to build a “healthy and moral” base for both “the city and the village,” to be entirely secular in its orientation (with “no room … for sectional religious training,” and to eventually transform the “homes of the pupils”[)]. (p. 237)

As well, this call for non-cooperation reframes the civility debate, as Gandhi recognized: “We must welcome them to our political platforms [emphasis added] as honoured guests. We must meet them on neutral platforms as comrades” (qtd. in Ramanathan, p. 237). Civility then follows the re-imagining of the locus of power: “Non-Cooperation…emerges as a deeply historicized awareness committed to doing the opposite of repressive, silencing ills. The quiet way in which both projects bridge perceived gulfs are reminiscent of Gandhi’s insistence on responding to tyranny by searching for nonviolent, quiet alternatives that tap the moral instincts of humans” (Ramanathan, p. 238).

Currently, since calls for CCSS, VAM, and civility all work as “repressive,” “silencing,” and “tyranny,” non-cooperation is the only alternative remaining.

The results must be “interpreting all education as ‘civic education’ and on attending to the most basic of human needs—food, clothing, shelter—before addressing any issues related to formal learning”  (Ramanathan, pp. 241-242) as direct action refusing to compromise on in-school only education reform that drives arguments for how best to implement CCSS and VAM:

This close attention to “educating oneself,” of figuring out and questioning one’s own default assumptions, has echoes of Gandhi’s Non-Cooperation, and finds interesting articulation in the idea that we each need to “not cooperate” with our default views but attempt to step outside them by “educating ourselves” by learning from others. (Ramanathan, p. 244)

In the West, specifically in the United States, we are deeply entrenched in our “default views,” most of which are tinted by commitments to competition, authoritarian structures, and the sanctity of the individual. This call, however, is a call to recognize the importance of community and social justice in our national pursuit of democracy.

Arundhati Roy confronts the tensions at the core of why compromise, moratoriums, and civility fail the narrow education debate as well as the broader democracy:

Fascism is about the slow, steady infiltration of all the instruments of state power. It’s about the slow erosion of civil liberties, about unspectacular, day-to-day injustices.…It means keeping an eagle eye on public institutions and demanding accountability. It means putting your ear to the ground and listening to the whispering of the truly powerless. It means giving a forum to the myriad voices from the hundreds of resistance movements across the country that are speaking about real issues….It means fighting displacement and dispossession and the relentless, everyday violence of abject poverty. (Roy, 2002; qtd. in Ramanathan, p. 246)

Now is the time for non-cooperation, not moratoriums, not compromise, and not civility on other people’s terms.

Now is the time for non-cooperation so that teachers are not foreigners in their own profession and students are not foreigners in their own classrooms.

[1] See also Jeff Bryant.

[2] Of the three calls for moderation, I do not place Di Carlo’s position as essentially equal to those by Weingarten and Jennings. Di Carlo’s nuanced and detailed discussion of VAM contributes a credible position that I find compelling to a point (such as Di Carlo conceding: “Now, I personally am not opposed to using these estimates in evaluations and other personnel policies”); however, Weingarten and Jennings present far more problems and suffer from a far greater lack of credibility.

Conservative Leadership Poor Stewardship of Public Funds

In South Carolina and across the U.S., conservative leadership of education reform has failed to fulfill a foundational commitment of traditional values: good stewardship of public funds. [1]

The evidence of that failed stewardship is best exposed in commitments to three education reform policies: Adopting and implementing Common Core State Standards (CCSS), designing and implementing new tests based on CCSS, and proposing and field-testing revised teacher evaluations based on value-added models (VAM).

SC committed a tremendous amount of time and public funding to the accountability movement thirty years ago as one of the first states to implement state standards and high-stakes testing. After three decades of accountability, SC, like every other state in the union, has declared education still lacking and thus once again proposes a new round of education reform primarily focusing on, yet again, accountability, standards, and high-stakes testing.

Almost absent from the political and public debate about committing to CCSS, new high-stakes tests, and teacher evaluation reform are needs analyses and cost/benefit analyses of these policies.

More of the Same Failed Policies?

If thirty years of accountability has failed, why is more of the same the next course of reform? If thirty years of accountability has failed, shouldn’t SC and other states first clearly establish what the problems and goals of education are before committing to any policies aimed at solving those problems or meeting those goals?

Neither of these questions has been adequately addressed, yet conservative political leadership is racing to commit a tremendous amount of public funding and public workers’ time to CCSS, an increase in high-stakes testing never experienced by any school system, and teacher evaluation proposals based on discredited test-based metrics.

Just as private corporations have reaped the rewards of tax dollars in SC during the multiple revisions of our accountability system, moving through at least three versions of tests and a maze of reformed state standards, the only guaranteed outcomes of commitments to CCSS, new tests, and reformed teacher evaluations are profits for textbook companies, test designers, and private consultants—all of whom have already begun cashing in on branding materials with CCSS and the yet-to-be designed high-stakes tests that will eventually be implemented twice a year in every class taught in the state.

SC as a state and as an education system is burdened by one undeniable major problem: inequity of opportunities in society and in schools spurred by poverty.

Numerous studies in recent years have shown that schools across the U.S. tend to reflect and perpetuate inequity; thus, children born into impoverished homes and communities are disproportionately attending schools struggling against and mirroring the consequences of poverty.

Commitments in SC to CCSS, new high-stakes tests, and reforming teacher evaluations based in large part on those new tests are at their core poor stewardship of public funding in a state that has many more pressing issues needing the support of state government.

A further problem with conservative leadership endorsing these education reforms is that much of the motivation for CCSS, new tests, and reforming teacher evaluations comes from funding mandates by the federal government.

Misguided education reform is not only a blow to conservative economics but also a snub to traditional trust in local government over federal control.

Recently, as well, a special issue on VAM from Education Policy Analysis Archives (EPAA) includes two analyses that should give policy makers in SC and all states key financial reasons to pause if not halt commitments to education reform based on student test scores—the potential for legal action from a variety of stakeholders in education.

Baker, Oluwole, and Green explain: “Overly prescriptive, rigid teacher evaluation mandates, in our view, are likely to open the floodgates to new litigation over teacher due process rights. This is likely despite the fact that much of the policy impetus behind these new evaluation systems is the reduction of legal hassles involved in terminating ineffective teachers.”

Further, Pullin warns: “For public policymakers, there are strong reasons to suggest that high-stakes implementation of VAM is, at best, premature and, as a result, the potential for successful legal challenge to its use is high. The use of VAM as a policy tool for meaningful education improvement has considerable limitations, whether or not some judges might consider it legally defensible.”

Do schools across SC need education reform? Yes, just as social policy in the state needs to address poverty as a key mechanism for supporting those schools once they are reformed.

But in a state driven by traditional values and conservative political leadership, current commitments to CCSS, new high-stakes tests, and reforming teacher evaluations are neither educationally sound nor conservative.

[1] Expanded version of Op-Ed published in The State (Columbia, SC), March 8, 2013: “Conservatives poor stewards of education funds”

Daily Kos: Misreading Teacher Evaluation and Retention

The League of Women Voters of South Carolina has released “How to Evaluate and Retain Effective Teachers” (2011-2013), but this report misreads the evidence on teacher evaluation and thus distracts high-poverty states from needed educational reform. [1]

A review of the report shows it does not establish a clear problem with teacher quality in SC and misrepresents the current body of research on teacher evaluation, particularly value-added methods (VAM) of evaluation.

As a high-poverty and racially diverse state, SC is similar to many other states facing educational hurdles, but those hurdles have less to do with identifying and ranking teacher quality and more to do with the inequitable distribution of teachers. Children of color, children in poverty, English language learners, and special needs students are taught disproportionately by inexperienced and un-/under-certified teachers. SC and other high-poverty states would do well to address teacher assignment and teaching conditions before experimenting with new teacher evaluation systems.

Ultimately, this report misreads and misrepresents the current understanding of how to evaluate and determine teacher quality—specifically through test-based methods.

continue reading at Daily Kos

NFL again a Harbinger for Failed Education Reform?

During the impending NFL strike in 2011—the act of a union—I drew a comparison between how the public in the U.S. responds to unionization in different contexts:

“I am speaking about the possible NFL strike that hangs over this coming Super Bowl weekend: a struggle between billionaires and millionaires, which, indirectly, shines an important light on the rise of teacher and teacher union-bashing in the US. Adam Bessie, in Truthout, identifies how the myth of the bad teacher has evolved.”

Once again, the NFL is facing a situation that I believe and even hope is another harbinger of how education reform can be halted: A suit filed by the family of Junior Seau:

“The family said the league not only ‘propagated the false myth that collisions of all kinds, including brutal and ferocious collisions, many of which lead to short-term and long-term neurological damage to players, are an acceptable, desired and natural consequence of the game,’ but also that ‘the N.F.L. failed to disseminate to then-current and former N.F.L. players health information it possessed’ about the risks associated with brain trauma.”

This lawsuit has prompted a considerable amount of debate concerning whether or not the NFL as we currently know it could be dramatically reconfigured under the pressure of more lawsuits. In other words, the inherent but often ignored or concealed dangers of football are now being exposed by legal action, in much the same way as the tobacco industry was unmasked and thus the entire culture of smoking has radically changed in the last couple of decades.

With the release of the Education Policy Analysis Archives (EPAA) Special Issue on “Value-Added Model (VAM) Research for Educational Policy,” a similar question should now be raised about the future of implementing high-stakes accountability policies that focus on teacher evaluation and retention through VAM-style metrics.

“High-Stakes Implementation of VAM,…Premature”

Two articles in the special issue from EPAA examine the validity and reliability of VAM-based teacher evaluation in high-stakes settings and then place these policies in the context of legal ramifications faced by districts and states for those policies.

“The Legal Consequences of Mandating High Stakes Decisions Based on Low Quality Information: Teacher Evaluation in the Race-to-the-Top Era” (Baker, Oluwole, & Green, 2013) identifies the current trend: “Spurred by the Race-to-the-Top program championed by the Obama administration and a changing political climate in favor of holding teachers accountable for the performance of their students, many states revamped their tenure laws and passed additional legislation designed to tie student performance to teacher evaluations” (p. 3). Because of the political and public momentum behind reforming teacher evaluation, Baker, Oluwole, and Green seek “to bring some urgency to the need to re-examine the current legislative models that put teachers at great risk of unfair evaluation, removal of tenure, and ultimately wrongful dismissal” (p. 5).

While Baker, Oluwole, and Green offer a detailed and evidence-based examination of the VAM-based and student growth model approaches to high-stakes teacher accountability, they ultimately place the weaknesses of reform policies in the context of potential challenges from teachers who believe they have been wrongfully evaluated or dismissed:

“In this section, we address the various legal challenges that might be brought by teachers dismissed under the rigid statutory structures outlined previously in this article. We also address how arguments on behalf of teachers might be framed differently in a context where value-added measures are used versus one where student growth percentiles are used. Where value-added measures are used, we suspect that teachers will have to show that while those measures were intended to attribute student achievement to their effectiveness, the measures failed to do so in a number of ways. That is, where value-added measures are used to assign effectiveness ratings, we suspect that the validity and reliability, as well as understandability of those measures would need to be deliberated at trial. However, where student growth percentiles are used, we would argue that the measures on their face are simply not designed for attributing responsibility to the teacher, and thus making such a leap would necessarily constitute a wrongful judgment. That is, one would not necessarily even have to vet the SGP measures for reliability or validity via any statistical analysis, because on their face they are invalid for this purpose.”

The analysis ultimately discredits both the use of narrow metrics to determine teacher quality and the high-stakes policies being implemented using those metrics, concluding with the ironic consequences of these policies: “Overly prescriptive, rigid teacher evaluation mandates, in our view, are likely to open the floodgates to new litigation over teacher due process rights. This is likely despite the fact that much of the policy impetus behind these new evaluation systems is the reduction of legal hassles involved in terminating ineffective teachers” (pp. 18-19).

In “Legal Issues in the Use of Student Test Scores and Value-added Models (VAM) to Determine Educational Quality” (Pullin, 2013), the rapid increase of VAM-based accountability is further examined in the context of “a wide array of potential legal issues [that] could arise from the implementation of these programs” (p. 2).

Pullin notes the motivation for reforming teacher evaluation:

“VAM initiatives are consistent with a highly publicized press from the business community and many politicians to make government services more like private business, data-driven to measure productivity and accountability (Kupermintz, 2003). VAM approaches are in part a response to concerns that the current system of selecting and compensating teachers based their education and credentials is insufficient for insuring teacher quality (Corcoran, 2011; Gordon, Kane & Staiger, 2006; Hanushek & Rivkin, 2012; Harris, 2011). There have been increasing expressions of concern that teacher evaluation practices are not robust and do not improve practice (Kennedy, 2010). In the contemporary public policy context, much of the support for the use of student test scores for educator evaluation comes from a concern that the current system for evaluation is ineffective and that the current legal protections for teachers are too cumbersome for schools seeking to terminate teachers (Harris, 2009, 2011).”

While a business model for addressing quality control of a work force may seem efficient, Pullin highlights that legal ramifications are likely with these new models.

Pullin’s analysis offers a detailed and useful examination of previous court cases involving the use of test scores to evaluate educators, including recent cases involving VAM, concluding that the picture is not clear on how the courts may rule in the future, but that a pattern exists of “heavy judicial deference to state and local education policymakers and the allure of using test scores to make decisions about education quality” (p. 5).

Further, Pullin notes “there are differences of perspective among social scientists about VAM and the defensibility of using it to make high-stakes decisions about educators,” further complicating the concerns of legal action (p. 9).

While raising many other complications, Pullin also notes that students and parents may enter legal battles using VAM metrics “to substantiate their own legal claims that schools are not meeting their obligations to provide education” (p. 14).

Pullin concludes with a sobering look at teacher quality reform built on VAM and implemented in high-stakes environments:

“In the broad contemporary public policy context for education reform, the desire for accountability and transparency in government, coupled with heavily financed criticisms of public school teachers and their unions, may mean that VAM initiatives will prevail. The concerns of education researchers about VAM, coupled with legal obligations for the validity and reliability of education and evaluation programs should require judges and education policymakers to take a closer look for future decision-making. At the same time, the social science research community should be generating substantial new and persuasive evidence about VAM and the validity and reliability of all of its potential uses. For public policymakers, there are strong reasons to suggest that high-stakes implementation of VAM is, at best, premature and, as a result, the potential for successful legal challenge to its use is high. The use of VAM as a policy tool for meaningful education improvement has considerable limitations, whether or not some judges might consider it legally defensible.” (p. 17)

Like the NFL, federal and state governments may soon be compelled to reform the reform movement under the threat of legal action from a variety of stakeholders since the science of teacher evaluation remains far behind the curve of implementation, particularly when teacher evaluation is high-stakes and based on VAM and other metrics linked to student test scores.

The special issue from EPAA is yet another call for political leadership to pause if not end wide-scale teacher evaluation and retention models that pose legal, statistical, and funding challenges that those leaders appear unwilling to acknowledge or address.

VAM: A Primer

Education reform has existed at some policy and public levels since at least the 1890s in the U.S. The current reform movement grounded in state-based accountability began in the early 1980s with A Nation at Risk, and then was nationalized in 2001 with No Child Left Behind.

In the first decades of the recent accountability era, standards and high-stakes testing were implemented and periodically revised at the state level with the primary focus being on student and school accountability. The current cycle, however, has seen an increase in policies and practices aimed at teacher accountability and increasing teacher quality—despite a solid research base showing that teacher quality constitutes only 10-15% of measurable student outcomes (test data).

The focus on increasing teacher quality and accountability has included both experiments and policies with value-added methods (VAM) of determining teacher quality in order to label, rank, sort, and retain or dismiss teachers. VAM claims to identify teacher quality through pre- and post-testing and to isolate it from the other factors reflected in test scores.
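For readers unfamiliar with the mechanics, the pre- and post-testing idea can be sketched in its very simplest form, a gain-score comparison. This is only an illustration under stated assumptions: the teacher names and scores below are invented, the function name is hypothetical, and the models actually proposed in policy are typically far more elaborate regression models with additional controls. The sketch credits each teacher with the average test-score gain of her or his students, measured against the average gain of all students.

```python
from statistics import mean

# Hypothetical (pre_test, post_test) score pairs for each teacher's students.
# All names and numbers are invented purely for illustration.
scores = {
    "Teacher A": [(60, 70), (55, 68), (72, 80)],
    "Teacher B": [(65, 70), (58, 61), (80, 86)],
}

def gain_score_estimates(scores_by_teacher):
    """Return each teacher's mean student gain minus the mean gain of all students."""
    all_gains = [
        post - pre
        for pairs in scores_by_teacher.values()
        for pre, post in pairs
    ]
    overall_gain = mean(all_gains)
    return {
        teacher: mean(post - pre for pre, post in pairs) - overall_gain
        for teacher, pairs in scores_by_teacher.items()
    }

estimates = gain_score_estimates(scores)
for teacher, estimate in estimates.items():
    print(f"{teacher}: {estimate:+.2f}")
```

In this invented data Teacher A lands above the pooled average and Teacher B below it, yet nothing in the arithmetic separates the teacher's contribution from everything else that shapes a student's gain, which is precisely the attribution problem the questions below press on.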

Before policy-makers and stakeholders in education commit to reforming teacher evaluation and retention, foundational questions must be addressed, and then the current facts about VAM must be acknowledged.

First, the foundational questions:

(1) What evidence exists identifying teacher quality as a primary or significant problem facing a school or district? Where, then, does teacher quality rank as a priority for a school, district, or state in terms of cost effectiveness in committing funding to the reform?

(2) Are all elements of implementing VAM, at any percentage of the revised teacher evaluation process, valid for determining teacher quality? In other words, what measures are taken to account for using student test scores (designed to reflect student learning, and not designed to reflect teacher quality) as data points for teacher quality?

(3) Have the teaching and learning environments of students’ homes and schools been addressed to ensure equitable teaching and learning environments in which determining teacher quality becomes valid?

Next, the current knowledge base about VAM*:

(1) Including VAM at any percentage in reformed teacher evaluation models is currently in the experimental phase. Data and the validity of including VAM are being tested, but almost no researchers currently claim VAM (at any percentage but certainly at high percentages such as 40-50%) to be ready for widespread implementation.

(2) VAM models for labeling teacher quality are highly unstable. Teacher rankings tend to shift with new populations of students.

(3) Researchers agree that VAM is unlikely ever to be completely stable; thus, it is possible that VAM will never be a practical or fair element in teacher evaluation, particularly at the individual teacher level or in any single year.

(4) The statistical and practical requirements to isolate teacher quality in student test scores pose tremendous costs in time and funding that also may prove to be not cost effective in the context of most school systems’ priorities. In order to implement a fair and equitable teacher evaluation system that includes VAM at any percentage, states must create and administer pre- and post-tests for all students in all teachers’ courses, creating a new and costly commitment to education funding.

(5) Decades of research on high-stakes testing have shown many negative unintended consequences to accountability measures focusing on students and schools; including VAM in high-stakes accountability policies focusing on teachers is likely to have similar unintended negative consequences such as discouraging high-quality teachers from working with high-needs populations of students.

Thus, policy-makers and stakeholders are strongly cautioned to consider education reform priorities, the experimental nature of VAM, and the current knowledge base on VAM before committing tax dollars to either field testing or implementing new teacher evaluation policies built in any way on VAM.

* See a recent review of teacher evaluation reform for links and citations to numerous research studies on VAM.