Catherine Joynson and Ottoline Leyser’s The culture of scientific research identifies the motivation of scientists, which:
provide[s] additional insights into how they view research, and the majority of the survey respondents clearly chose a career in science in order to find out more about the world around them. When respondents were asked to rank phrases to describe what they believe motivates them in their work, the top three were:
- Improving my knowledge and understanding
- Making scientific discoveries for the benefit of society
- Satisfying my curiosity
And then, they confront the impact of competition:
High levels of competition in scientific research emerged as a strong theme running through all the project activities. Applying for funding is thought to be very competitive by the majority of the survey respondents (94 per cent), as is applying for jobs and promotions (77 per cent). Around nine in ten think making discoveries and gaining peer recognition is quite or very competitive.
High levels of competition for jobs and funding in scientific research are believed by survey respondents both to bring out the best in people and to create incentives for poor quality research practices, less collaboration, and headline chasing [emphasis added]. For example, behaviours such as rushing to finish and publish research, employing less rigorous research methods and increased corner-cutting in research were raised by 29 per cent of survey respondents who commented on the effects of competition on scientists.
Immediately this analysis reminded me of the increasing political calls for weeding out “bad” teachers and concurrently rewarding good teachers, notably from the right but also from the left, including support for value-added methods (VAM) for teacher evaluation and retention as well as merit pay.
While simplistic calls for rewarding good teachers are politically popular, they fail to confront the inherent negative consequences and to acknowledge the research base on what exactly motivates teachers.
Teaching and learning are highly sensitive to the same problems noted above about science: VAM and merit pay create competitive cultures in schools, discouraging collaboration and incentivizing teachers to view their students as tools of success (and thus, creating winners and losers when we claim a goal of everyone winning).
Research shows that merit pay for teachers is harmful:
Some researchers have warned, however, that merit pay may change the relationships between teachers and students: poor students may pose threats to the teacher’s rating and rewards [emphasis added] (Johnson 1986). Another concern is that merit pay plans may encourage teachers to adjust their teaching down to the program goals, setting their sights no higher than the standards (Coltham 1972).
Odden and Kelley reviewed recent research and experience and concluded that individual merit and incentive pay programs do not work and, in fact, are often detrimental (1997). A number of studies have suggested that merit pay plans often divide faculties, set teachers against their administrators, are plagued by inadequate evaluation methods, and may be inappropriate for organizations such as schools that require cooperative, collaborative work [emphasis added] (Lawler 1983).
Evidence on VAM reveals similar warnings:
High-stakes uses of teacher VAM scores could easily have additional negative consequences for children’s education. These include increased pressure to teach to the test, more competition and less cooperation among the teachers within a school, and resentment or avoidance of students who do not score well. In the most successful schools, teachers work together effectively (Atteberry & Bryk, 2010). If teachers are placed in competition with one another for bonuses or even future employment, their collaborative arrangements for the benefit of individual students as well as the supportive peer and mentoring relationships that help beginning teachers learn to teach better may suffer. (p. 24)
Frase identified two sets of factors that affect teachers’ ability to perform effectively: work context factors (the teaching environment), and work content factors (teaching)….
Work context factors are those that meet baseline needs. They include working conditions such as class size, discipline conditions, and availability of teaching materials; the quality of the principal’s supervision; and basic psychological needs such as money, status, and security.
In general, context factors clear the road of the debris that block effective teaching. In adequate supply, these factors prevent dissatisfaction. Even the most intrinsically motivated teacher will become discouraged if the salary doesn’t pay the mortgage….
Work content factors are intrinsic to the work itself. They include opportunities for professional development, recognition, challenging and varied work, increased responsibility, achievement, empowerment, and authority. Some researchers argue that teachers who do not feel supported in these states are less motivated to do their best work in the classroom (NCES 1997).
Data from the National Center for Education Statistics (1997) confirm that staff recognition, parental support, teacher participation in school decision making, influence over school policy, and control in the classroom are the factors most strongly associated with teacher satisfaction [emphasis added]. Other research concurs that most teachers need to have a sense of accomplishment in these sectors if they are to persevere and excel in the difficult work of teaching.
VAM, merit pay, accountability built on standards and high-stakes testing, “no excuses” ideologies, and zero tolerance policies all remain essential elements of education reform, although they are likely to create the worst possible contexts and cultures for teaching and learning.
Top-down and technocratic approaches to school policy, which de-professionalize teaching and teachers, are creating harmful cultures in public schools, proving further that the partisan political control of education remains tone-deaf to evidence and educators.
Teachers, like scientists above, are already quite likely to have chosen the profession in order to serve others. VAM and merit pay destroy those initial reasons for teaching.
Political commitments to harmful policies suggest the real problem in education is the motivation of those political leaders, not teachers.
Before teaching The Crucible in my American literature courses during my two decades as a high school English teacher in rural Upstate South Carolina, I played the students R.E.M.’s “Exhuming McCarthy,” which “makes an explicit parallel between the red-baiting of Joe McCarthy’s time and the strengthening of the sense of American exceptionalism during the Reagan era, especially the Iran-Contra affair” (Wikipedia).
The song includes audio from the McCarthy hearings, including this soundbite of Joseph Welch confronting Joe McCarthy: “Let us not assassinate this lad further, Senator….You’ve done enough. Have you no sense of decency, sir, at long last? Have you left no sense of decency?”
Part of The Crucible unit asked students to examine how societies continue to repeat the basic flaws of abusing power and oppressing powerless groups of people. Despite the lessons of the Witch Trials and the Red Scare/McCarthy Era (with the Japanese Internment in between), Americans seem hell-bent on doubling down on policies and practices that are authoritarian, hypocritical, and simply mean—especially if those policies can be implemented by people with power onto the powerless.
Current education reform needs a McCarthy hearing, and we need to confront those driving those reforms with “You’ve done enough. Have you no sense of decency, sir, at long last? Have you left no sense of decency?”
For example, consider the following:
- South Carolina plans to join Florida in retaining 3rd graders based on test scores, ensuring that marginalized students (children in poverty, children of color, English language learners, special needs students) will be punished.
- Tennessee seeks to link aid to impoverished families to their children’s achievement.
- Merit pay for teachers across the U.S. will resurrect child labor.
- And “no excuses” charter schools continue to spread despite their harsh policies and paternalism targeted at “other people’s children.”
History is replete with evidence that the ends do not justify the means.
While there remains great political and public support for grade retention, for example, a huge body of evidence shows that retention negatively impacts students retained, taxpayers, and peers not retained—all for mixed results of short-term test scores.
The only justification for grade retention is giving the appearance of being tough (raising a key question about how tough any adult is for lording him/herself over a child).
Americans’ puritanical roots are some of our worst qualities, and especially where children and other marginalized groups are concerned, Americans need to regain our sense of decency.
We would be well advised to begin with how we reform our schools.
What are the problems?
What is the evidence the problems exist?
What is the quality of that evidence?
Who are the stakeholders in the problems and solutions?
What are the perspectives of those stakeholders?
What are the perspectives of the stakeholders with experience and expertise in the problems and solutions?
Who stands to gain personally, professionally, and financially from the problems and solutions?
In the pursuit of any sort of reform, the right questions are essential—as is credible evidence—before solutions can be identified as valid, useful, and potentially effective. The great failure of democracy is that it appears those elected to power have neither the ability to ask the right questions nor the propensity to seek credible solutions. Those leaders are, however, eager to claim problems and support solutions that benefit them.
“In a bold experiment in performance pay, complaints from patients at New York City’s public hospitals and other measures of their care — like how long before they are discharged and how they fare afterward — will be reflected in doctors’ paychecks under a plan being negotiated by the physicians and their hospitals,” announces the lede to “New York City Ties Doctors’ Income to Quality of Care.”
“Bold” apparently means “making decisions based on ideology and not a shred of evidence.”
The article makes no case that doctor pay currently poses any sort of genuine problem—just that doctor pay is “traditional.” Further, the article does acknowledge two important facts:
“Still, doctors are hesitant, saying they could be penalized for conditions they cannot control, including how clean the hospital floors are, the attentiveness of nurses and the availability of beds.
“And it is unclear whether performance incentives work in the medical world; studies of similar programs in other countries indicate that doctors learn to manipulate the system.”
For those of us struggling against a similar baseless current of teacher evaluation and pay reform, these details are all too familiar: (1) concerns about accountability being linked to conditions over which a worker has no control (or autonomy), and (2) a complete disregard for the mountain of evidence that merit pay of all kinds proves ineffective and triggers many negative unintended consequences:
“‘The consequences in a complex system like a hospital for giving an incentive for one little piece of behavior are virtually impossible to foresee,’ said Dr. David U. Himmelstein, professor of public health at the City University of New York and a visiting professor at Harvard Medical School, who has reviewed the literature on performance incentives. ‘There are ways of gaming it without even outright lying that distort the meaning of the measure.’ …
“Dr. Himmelstein also said doctors could try to avoid the sickest and poorest patients, who tend to have the worst outcomes and be the least satisfied. But physicians within the public hospital system have little ability to choose their patients, Mr. Aviles said. He added that he did not expect the doctors to act so cynically because, ‘in the main, physicians are here because they are attracted to that very mission of serving everybody equally.’”
The medical profession is poised to experience the complete failure of democracy that has been the fate of educators for at least three decades now. Democracy has spawned a legion of people with power but no expertise, and the result is a template for reform that ignores clearly identifying problems, fails to gather credible evidence, bypasses a wealth of experience and expertise, and imposes the mechanisms of inequity that brought those in power to that power.
As a result, buried late in this article on doctor pay reform is a cautionary tale:
“But Dr. Himmelstein said there were still hazards in the city’s plan. He said that when primary-care doctors in England were offered bonuses based on quality measures, they met virtually all of them in the first year, suggesting either that quality improved or — the more likely explanation, in his view — ‘they learned very quickly to teach to the test.’”
Educators, sound familiar?
The League of Women Voters of South Carolina has released a report entitled “How to Evaluate and Retain Effective Teachers” (2011-2013) with the identified purpose “to examine the growing movement toward ‘results based’ evaluation nationally and in South Carolina.”
Before examining the substance of the report, several problems with the larger context of teacher evaluation and retention as they intersect with important challenges facing SC need to be identified:
• The report fails to clarify and provide evidence that teacher quality and retention are primary problems facing SC. Without a clear and evidence-based problem, solutions are rendered less credible. However, SC, like most of the US, has a teacher assignment problem that has been clearly documented: students of color, students from poverty, English language learners (ELL), and special needs students are disproportionately assigned to un-/under-certified and inexperienced teachers (Peske & Haycock, 2006). As well, the implied problem of the report marginalizes the greatest obstacles facing SC schools, poverty and the concentration of poverty (notably the Corridor of Shame along I-95).
• The report compiles and bases claims on a selection of references that are not representative of the body of research on teacher quality and value-added methods (VAM) or performance-based systems of identifying teacher quality. As detailed below, the claims and research included in this report misrepresent the reports themselves as well as the current knowledge-base on teacher quality and retention.
In that context, the report fails the larger education reform needs facing SC as well as the stated purpose of the study.
Do Teachers Matter?
The opening claim of the report asserts “the most important school-based factor is an effective teacher,” and then cites Hanushek, among others. While the report is careful to note teacher quality is an important in-school factor, it repeatedly overstates teacher quality’s impact with terms such as “overwhelmingly” and fails to clarify that in-school factors are dwarfed by out-of-school factors. Matthew Di Carlo offers a balanced picture of the proportional impact of teacher quality, including an accurate interpretation of many of the same references (such as Hanushek):
“But in the big picture, roughly 60 percent of achievement outcomes is explained by student and family background characteristics (most are unobserved, but likely pertain to income/poverty). Observable and unobservable schooling factors explain roughly 20 percent, most of this (10-15 percent) being teacher effects. The rest of the variation (about 20 percent) is unexplained (error). In other words, though precise estimates vary, the preponderance of evidence shows that achievement differences between students are overwhelmingly attributable to factors outside of schools and classrooms (see Hanushek et al. 1998; Rockoff 2003; Goldhaber et al. 1999; Rowan et al. 2002; Nye et al. 2004).” 
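The proportions in that passage can be checked with simple arithmetic. The sketch below is only an illustration: the figures are the rounded estimates from the quoted passage, not new data, and the names `SHARES`, `TEACHER_SHARE_RANGE`, and `background_to_teacher_ratio` are hypothetical labels chosen for this example.

```python
# Approximate shares of variation in achievement outcomes, in percent,
# as stated in the Di Carlo passage quoted above (rounded estimates).
SHARES = {
    "student_and_family_background": 60,
    "schooling_factors": 20,  # observable and unobservable; includes teacher effects
    "unexplained": 20,
}

# Teacher effects are a subset of schooling_factors (10-15 percent).
TEACHER_SHARE_RANGE = (10, 15)


def background_to_teacher_ratio(shares, teacher_range):
    """Return (low, high) ratios of the background share to the teacher share."""
    low_teacher, high_teacher = teacher_range
    background = shares["student_and_family_background"]
    # The largest teacher share gives the smallest ratio, and vice versa.
    return background / high_teacher, background / low_teacher


# The three categories should account for all of the variation.
assert sum(SHARES.values()) == 100

low, high = background_to_teacher_ratio(SHARES, TEACHER_SHARE_RANGE)
print(f"Background share is {low:.0f} to {high:.0f} times the teacher share")
```

On these rounded figures, student and family background accounts for roughly four to six times as much of the variation as teacher effects do, which is the sense in which in-school factors are dwarfed by out-of-school factors.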
Along with misrepresenting the impact of teacher quality on measurable student outcomes, the report lends credibility to a misrepresented and flawed study by Chetty, Friedman, and Rockoff (2011):
“[T]hose using the results of this paper to argue forcefully for specific policies are drawing unsupported conclusions from otherwise very important empirical findings.” (Di Carlo)
“These are interesting findings. It’s a really cool academic study. It’s a freakin’ amazing data set! But these findings cannot be immediately translated into what the headlines have suggested – that immediate use of value-added metrics to reshape the teacher workforce can lift the economy, and increase wages across the board! The headlines and media spin have been dreadfully overstated and deceptive. Other headlines and editorial commentary has been simply ignorant and irresponsible. (No Mr. Moran, this one study did not, does not, cannot negate the vast array of concerns that have been raised about using value-added estimates as blunt, heavily weighted instruments in personnel policy in school systems.)” (Baker)
The teacher quality impact is misrepresented in this report and perpetuates popular and agenda-driven research myths such as the need for consecutive years of high-quality teachers; the claim is inaccurate and should not drive policy:
“This is important, because the ‘X consecutive teachers’ argument only carries concrete policy implications if we can accurately identify the ‘top’ teachers. In reality, though, the ability to do so is still extremely limited [emphasis added].
“So, in the context of policy debates, the argument proves almost nothing. All it really does – in a rather overblown, misleading fashion – is illustrate that teacher quality is important and should be improved, not that policies like merit pay, higher salaries, or charter schools will improve it.
“This represents a fundamental problem that I have discussed before: The conflation of the important finding that teachers matter – that they vary in their effectiveness – with the assumption that teacher effects can be measured accurately at the level of the individual teacher (see here for a quick analogy explaining this dichotomy)….
“But the ‘X consecutive teachers’ argument doesn’t help us evaluate whether this or anything else is a good idea. Using it in this fashion is both misleading and counterproductive. It makes huge promises that cannot be fulfilled, while also serving as justification for policies that it cannot justify. Teacher quality is a target, not an arrow.” (Di Carlo)
Has Traditional Teacher Evaluation Failed?
One of the implied and cited reasons for addressing teacher quality and retention rests on widespread criticism of traditional teacher evaluation policies and practices. This report lends a great deal of credibility to those criticisms while relying on The Widget Effect from The New Teacher Project (TNTP). However, a review of that report calls into question, again, the credibility of the study’s claims as well as using it as a basis for policy decisions:
“Overall, the report portrays current practices in teacher evaluation as a broken system perpetuated by a culture that refuses to recognize and deal with incompetence and that fails to reward excellence. However, omissions in the report’s description of its methodology (e.g., sampling strategy, survey response rates) and its sample lead to questions about the generalizability of the report’s findings.”
“I just want to make one quick (and, in many respects, semantic) point about the manner in which TNTP identifies high-performing teachers, as I think it illustrates larger issues. In my view, the term ‘irreplaceable’ doesn’t apply, and I think it would have been a better analysis without it….
“Based on single-year estimates in math and reading, a full 43 percent of the NYC teachers classified as ‘irreplaceable’ in 2009 were not classified as such in 2010. (In fairness, the year-to-year stability may be a bit higher using the other district-specific definitions.)
“Such instability and misclassification are inevitable no matter how the term is defined and how much data are available – it’s all a matter of degree – but, in general, one must be cautious when interpreting single-year estimates (see here, here and here for related analyses).
“Perhaps more importantly, if you look at how they actually sorted teachers into categories, the label ‘irreplaceable,’ at least as I interpret it, seems inappropriate no matter how much data are available.”
Performance-Based Teacher Evaluations: Is VAM Credible?
A significant portion of the report makes claims about value-added methods (VAM) of teacher evaluations in the context of performance-based approaches to identifying teacher quality. Nationally, VAM and other performance-based policies are being implemented quickly, but with little regard to the current understanding of the effectiveness and limitations of those policies; this report fails to represent the current state of research on VAM accurately and depends on research and think tank advocacy (National Council on Teacher Quality [NCTQ]) that distorts the importance of teacher quality and the effectiveness of identifying teacher quality based on measurable student outcomes.
The report remains supportive of performance-based policy recommendations, but does identify cautions about test-based teacher evaluations while also recommending that teacher evaluations include multiple measures and that teachers be included in the creation of a new evaluation system.
Two failures, however, of this report’s endorsement of VAM and/or performance-based teacher evaluation systems include couching that endorsement in the distorted claims about teacher quality’s impact on measurable student outcomes and depending on reports and claims made by NCTQ and the Bill and Melinda Gates Foundation.
What, then, are the current patterns from the research on VAM and performance-based models and how should those patterns shape policy? 
• VAM and test-based evaluations for teachers remain both misleading about teacher quality and misrepresented by research, the media, and political leadership. Numerous researchers have detailed that teachers identified as high-quality or weak one year are identified differently in subsequent years: Numerous factors beyond the control of teachers remain reflected in test scores more powerfully than the individual impact of any specified teacher. The debate over teacher quality and measuring that quality, then, is highly distorted, as Di Carlo explains: “Whether or not we use these measures in teacher evaluations is an important decision, but the attention it gets seems way overblown.” This report makes that mistake.
• VAM and performance-based teacher evaluations in high-stakes settings distort teaching and learning by narrowing the focus of both teaching and learning to teaching to the test and test scores. VAM and test-based data are likely valuable for big picture patterns and in-school or in-district decision making regarding teacher assignment, but VAM and performance-based evaluations of individual teachers remain inaccurate and inappropriate for evaluation, pay, or retention.
Particularly in a state such as SC where poverty and state budget concerns burden the state and the public school system, VAM and performance-based systems that rely on extensive retooling of standards, testing, and teacher evaluation systems are simply not cost effective (Bausell, 2013): “VAM is not reliable or valid, and VAM-based policies are not cost-effective for the purpose of raising student achievement and increasing earnings by terminating large numbers of low-performing teachers” (Yeh, 2014).
And further, rejecting VAM and the use of significant percentages of student test scores to evaluate and retain teachers is not rejecting teacher accountability, but confronting the misuse of data. Ewing (2011) clarifies that VAM is flawed math and thus invalid as a tool in teacher evaluation:
“Of course we should hold teachers accountable, but this does not mean we have to pretend that mathematical models can do something they cannot. Of course we should rid our schools of incompetent teachers, but value-added models are an exceedingly blunt tool for this purpose. In any case, we ought to expect more from our teachers than what value-added attempts to measure.”
If SC chooses to reform teacher evaluation (which remains a project far less urgent than other problems being ignored), the state would be guided better by Gabriel and Allington (2012), who have analyzed and challenged the Gates Foundation’s MET Project, which has prompted misguided and hasty implementation of VAM-style teacher evaluation reform:
“Although we don’t question the utility of using evidence of student learning to inform teacher development, we suggest that a better question would not assume that value-added scores are the only existing knowledge about effectiveness in teaching. Rather, a good question would build on existing research and investigate how to increase the amount and intensity of effective instruction.”
Gabriel and Allington (2012) recommend five questions to guide teacher evaluation reform, instead of VAM or other student-outcome-based initiatives:
“Do evaluation tools inspire responsive teaching or defensive conformity?…
“Do evaluation tools reflect our goals for public education?…
“Do evaluation tools encourage teachers to use text in meaningful ways?…
“Do evaluation tools spark meaningful conversations with teachers?…
“Do evaluation tools promote valuable education experiences?”
Again, these guidelines are evidence-based alternatives to discredited and experimental commitments to the misrepresented evidence in the report from LWV SC, but SC remains overburdened by issues related to equity and opportunity that outweigh the need to reform teacher evaluation at this time.
How Should SC Proceed with Teacher Quality and Retention?
On balance, this report misrepresents teacher quality and overstates the need and ability to identify high-quality teachers using VAM and other performance-based policies. The flaws in this report grow from an over-reliance on misguided and misrepresented research and advocacy while ignoring the rich and detailed evidence from the full body of research on teacher quality. Finally, the report concludes by discrediting SC’s current teacher evaluation system (ADEPT) in the context of the inaccurate and distorted claims in the report.
Ultimately, the report encourages SC to spend valuable time and resources on policies that are dwarfed by more pressing needs facing the state and its public schools—a failure of state leadership replicated in the perpetual retooling of state education standards (Common Core State Standards adoption) and high-stakes testing based on revised standards. In short, SC has a number of social and educational challenges that need addressing before the state experiments with revising teacher evaluation and retention policies, including the following:
• Identify how better to allocate state resources to address childhood and family poverty, childhood food security, children and family access to high-quality health care, and stable, well-paying work for families.
• Replace current education policies based on accountability, standards, and testing with policies that address equity and opportunity for all students.
• Address immediately the greatest teacher quality issue facing SC’s public schools—inequitable distribution of teacher quality among students in greatest need (high-poverty children, children of color, ELL, and special needs students).
• Address immediately the conditions of teaching and learning in the state’s schools, including issues of student/teacher ratios, building conditions and material availability, administrative and community support of teachers, equitable school funding and teacher salaries, teacher job security and academic freedom in a right-to-work state, and school safety.
Any policy changes that further entrench the culture of testing in SC as a mechanism for evaluating students, teachers, and schools perpetuate the burden of inequity in the state and schools.
SC does not need new standards, new tests, or a new teacher evaluation system. All of these practices have been implemented in different versions with high stakes attached for the past thirty years, with the current result being the same identified failures with public schools that were the basis of these policies.
SC, like much of the US, needs to come to terms with identifying problems first before seeking solutions. The problems are ones of equity and opportunity, and no current teacher evaluation plan is facing those realities, including this report.
 Also see Di Carlo, 2011, on TNTP.
 For evaluations of NCTQ and Gates see the following: (a) NCTQ: Benner, 2012; Merrill & Ingersoll, 2010; Baker, 2010, January 29; Baker, 2010, October 8; Baker, 2010, December 4, and (b) Gates: Baker, 2010, December 13; Baker, 2011, March 2; Libby, 2012; [Gates MET Project] Gabriel & Allington, 2012; Rubinstein, 2013, January 13; Rubinstein, 2013, January 9; Baker, 2013, January 9; Glass, 2013, January 14; Rothstein & Mathis, 2013
 For comprehensive examinations of research on VAM, see Di Carlo, 2012, November 13*; Di Carlo, 2012, April 12 (reliability); Di Carlo, 2012, April 18 (validity); Baker, 2012, May 28 (misrepresentations of VAM); Baker, et al., 2010, August 27; Ewing, 2011, May. For recent concerns about legal action and VAM-based teacher evaluation, see Baker, Oluwole, & Green, 2013; Pullin, 2013
* Di Carlo remains a proponent of including VAM in teacher evaluations.
Berliner, D. C. (2014). Effects of inequality and poverty vs. teachers and schooling on America’s youth. Teachers College Record, 116(1). Retrieved from http://www.tcrecord.org/content.asp?contentid=16889
Berliner, D. C. (2009). Poverty and potential: Out-of-school factors and school success. Boulder, CO and Tempe, AZ: Education and the Public Interest Center & Education Policy Research Unit. Retrieved from http://nepc.colorado.edu/publication/poverty-and-potential
Gabriel, R., & Allington, R. (2012, November). The MET Project: The wrong 45 Million dollar question. Educational Leadership, 70(3), 44-49. Retrieved from http://220.127.116.11/Documents/RandD/Teacher%20Evaluation/MET%20Project%20-%20Wrong%2045%20Millon%20Question%20-%20Gabriel.pdf
Hirsch, D. (2007, September). Experiences of poverty and educational disadvantage. York, North Yorkshire, UK: Joseph Rowntree Foundation. Retrieved from http://www.jrf.org.uk/knowledge/findings/socialpolicy/2123.asp
Peske, H. G., & Haycock, K. (2006, June). Teaching inequality: How poor and minority students are shortchanged on teacher quality. Washington DC: The Education Trust, Inc. Retrieved from http://www.edtrust.org/sites/edtrust.org/files/publications/files/TQReportJune2006.pdf
Rothstein, R. (2010, October 14). How to fix our schools. Issue Brief 286. Washington DC: Economic Policy Institute. Retrieved from http://www.epi.org/publications/entry/ib286
Thomas, P. L. (2011, December). Teacher quality and accountability: A failed debate. Daily Censored. Retrieved from http://www.dailycensored.com/teacher-quality-and-accountability-a-failed-debate/
Traub, J. (2000, January 16). What no school can do. New York Times Magazine. Retrieved from http://local.provplan.org/pp170/materials/what%20no%20school%20can%20do.htm