Review [UPDATED]: “How to Evaluate and Retain Effective Teachers” (League of Women Voters of SC)

The League of Women Voters of South Carolina has released a report entitled “How to Evaluate and Retain Effective Teachers” (2011-2013) with the identified purpose “to examine the growing movement toward ‘results based’ evaluation nationally and in South Carolina.”

Before examining the substance of the report, several problems with the larger context of teacher evaluation and retention as they intersect with important challenges facing SC need to be identified:

• The report fails to clarify and provide evidence that teacher quality and retention are primary problems facing SC. Without a clear and evidence-based problem, solutions are rendered less credible. However, SC, like most of the US, has a teacher assignment problem that has been clearly documented: students of color, students from poverty, English language learners (ELL), and special needs students are disproportionately assigned to un-/under-certified and inexperienced teachers (Peske & Haycock, 2006). As well, the implied problem of the report marginalizes the greatest obstacles facing SC schools, poverty and the concentration of poverty (notably the Corridor of Shame along I-95).

• The report compiles and bases claims on a selection of references that are not representative of the body of research on teacher quality and value-added methods (VAM) or performance-based systems of identifying teacher quality. As detailed below, the claims and research included in this report misrepresent the reports themselves as well as the current knowledge-base on teacher quality and retention.

In that context, the report fails the larger education reform needs facing SC as well as the stated purpose of the study.

Do Teachers Matter?

The opening claim of the report asserts “the most important school-based factor is an effective teacher,” and then cites Hanushek, among others. While the report is careful to note teacher quality is an important in-school factor, it repeatedly overstates teacher quality’s impact with terms such as “overwhelmingly” and fails to clarify that in-school factors are dwarfed by out-of-school factors. Matthew Di Carlo offers a balanced picture of the proportional impact of teacher quality, including an accurate interpretation of many of the same references (such as Hanushek):

“But in the big picture, roughly 60 percent of achievement outcomes is explained by student and family background characteristics (most are unobserved, but likely pertain to income/poverty). Observable and unobservable schooling factors explain roughly 20 percent, most of this (10-15 percent) being teacher effects. The rest of the variation (about 20 percent) is unexplained (error). In other words, though precise estimates vary, the preponderance of evidence shows that achievement differences between students are overwhelmingly attributable to factors outside of schools and classrooms (see Hanushek et al. 1998; Rockoff 2003; Goldhaber et al. 1999; Rowan et al. 2002; Nye et al. 2004).” [1]
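To put the quoted proportions side by side, here is a minimal arithmetic sketch. The figures are only the rough estimates from the passage above, not new data, and the variable names are illustrative.

```python
# Rough shares of variation in student achievement, as quoted from Di Carlo.
shares = {
    "student and family background": 60,
    "schooling (all in-school factors)": 20,
    "unexplained (error)": 20,
}
# Teacher effects are a subset of the schooling share (10-15 points of the 20).
teacher_low, teacher_high = 10, 15

# The three top-level shares account for all of the variation.
assert sum(shares.values()) == 100

background = shares["student and family background"]
ratio_low = background / teacher_high   # background vs. the high teacher estimate
ratio_high = background / teacher_low   # background vs. the low teacher estimate

print(f"Background share is roughly {ratio_low:.0f} to {ratio_high:.0f} times the teacher share")
```

On these estimates, the share tied to student and family background is several times the share tied to teachers, which is the proportion the report leaves out.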

Along with misrepresenting the impact of teacher quality on measurable student outcomes, the report lends credibility to a misrepresented and flawed study by Chetty, Friedman, and Rockoff (2011):

“[T]hose using the results of this paper to argue forcefully for specific policies are drawing unsupported conclusions from otherwise very important empirical findings.” (Di Carlo)

“These are interesting findings. It’s a really cool academic study. It’s a freakin’ amazing data set! But these findings cannot be immediately translated into what the headlines have suggested – that immediate use of value-added metrics to reshape the teacher workforce can lift the economy, and increase wages across the board! The headlines and media spin have been dreadfully overstated and deceptive. Other headlines and editorial commentary has been simply ignorant and irresponsible. (No Mr. Moran, this one study did not, does not, cannot negate the vast array of concerns that have been raised about using value-added estimates as blunt, heavily weighted instruments in personnel policy in school systems.)” (Baker)

The teacher quality impact is misrepresented in this report and perpetuates popular and agenda-driven research myths such as the need for consecutive years of high-quality teachers; the claim is inaccurate and should not drive policy:

“This is important, because the ‘X consecutive teachers’ argument only carries concrete policy implications if we can accurately identify the ‘top’ teachers. In reality, though, the ability to do so is still extremely limited [emphasis added].

“So, in the context of policy debates, the argument proves almost nothing. All it really does – in a rather overblown, misleading fashion – is illustrate that teacher quality is important and should be improved, not that policies like merit pay, higher salaries, or charter schools will improve it.

“This represents a fundamental problem that I have discussed before: The conflation of the important finding that teachers matter – that they vary in their effectiveness – with the assumption that teacher effects can be measured accurately at the level of the individual teacher (see here for a quick analogy explaining this dichotomy)….

“But the ‘X consecutive teachers’ argument doesn’t help us evaluate whether this or anything else is a good idea. Using it in this fashion is both misleading and counterproductive. It makes huge promises that cannot be fulfilled, while also serving as justification for policies that it cannot justify. Teacher quality is a target, not an arrow.” (Di Carlo)

Has Traditional Teacher Evaluation Failed?

One of the implied and cited reasons for addressing teacher quality and retention rests on widespread criticism of traditional teacher evaluation policies and practices. This report lends a great deal of credibility to those criticisms while relying on The Widget Effect from The New Teacher Project (TNTP). [2] However, a review of The Widget Effect calls into question, again, the credibility of that study’s claims as well as its use as a basis for policy decisions:

“Overall, the report portrays current practices in teacher evaluation as a broken system perpetuated by a culture that refuses to recognize and deal with incompetence and that fails to reward excellence. However, omissions in the report’s description of its methodology (e.g., sampling strategy, survey response rates) and its sample lead to questions about the generalizability of the report’s findings.”

Di Carlo has also documented that TNTP continues to misrepresent and overstate the impact of teacher quality and the effectiveness of identifying high-quality teachers:

“I just want to make one quick (and, in many respects, semantic) point about the manner in which TNTP identifies high-performing teachers, as I think it illustrates larger issues. In my view, the term ‘irreplaceable’ doesn’t apply, and I think it would have been a better analysis without it….

“Based on single-year estimates in math and reading, a full 43 percent of the NYC teachers classified as ‘irreplaceable’ in 2009 were not classified as such in 2010. (In fairness, the year-to-year stability may be a bit higher using the other district-specific definitions.)

“Such instability and misclassification are inevitable no matter how the term is defined and how much data are available – it’s all a matter of degree – but, in general, one must be cautious when interpreting single-year estimates (see here, here, and here for related analyses).

“Perhaps more importantly, if you look at how they actually sorted teachers into categories, the label ‘irreplaceable,’ at least as I interpret it, seems inappropriate no matter how much data are available.”
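The year-to-year instability described in the passage above can be illustrated with a small simulation. This is a hedged sketch, not TNTP's or Di Carlo's method: it assumes each teacher has a stable "true" effect, that each year's estimate equals that effect plus independent random noise, and that the "top" group is the top 20 percent of estimates. The function name, the noise levels, and the 20 percent cutoff are all hypothetical choices made for illustration.

```python
import random


def churn_rate(noise_sd, n_teachers=5000, top_share=0.2, seed=0):
    """Share of year-1 'top' teachers who are not 'top' in year 2."""
    rng = random.Random(seed)
    # Hypothetical stable "true" effect for each teacher.
    true_effect = [rng.gauss(0, 1) for _ in range(n_teachers)]
    # Each year's estimate is the true effect plus independent noise.
    year1 = [t + rng.gauss(0, noise_sd) for t in true_effect]
    year2 = [t + rng.gauss(0, noise_sd) for t in true_effect]

    k = int(n_teachers * top_share)

    def top_ids(scores):
        # Indices of the k teachers with the highest estimates.
        ranked = sorted(range(n_teachers), key=lambda i: scores[i], reverse=True)
        return set(ranked[:k])

    top1, top2 = top_ids(year1), top_ids(year2)
    return len(top1 - top2) / k


for sd in (0.0, 0.5, 1.0, 2.0):
    rate = churn_rate(sd)
    print(f"noise sd {sd}: {rate:.0%} of year-1 'top' teachers are not 'top' in year 2")
```

Under these assumptions, no teacher changes category when the estimates contain no noise, and the share who fall out of the "top" group rises as the noise grows, even though no teacher's underlying effectiveness has changed.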

Performance-Based Teacher Evaluations: Is VAM Credible?

A significant portion of the report makes claims about value-added methods (VAM) of teacher evaluations in the context of performance-based approaches to identifying teacher quality. Nationally, VAM and other performance-based policies are being implemented quickly, but with little regard to the current understanding of the effectiveness and limitations of those policies; this report fails to represent the current state of research on VAM accurately and depends on research and think tank advocacy (National Council on Teacher Quality [NCTQ]) that distorts the importance of teacher quality and the effectiveness of identifying teacher quality based on measurable student outcomes.

The report remains supportive of performance-based policy recommendations, but does identify cautions about test-based teacher evaluations while also encouraging that teacher evaluations include multiple measures and that teachers be included in the creation of a new evaluation system.

Two failures, however, of this report’s endorsement of VAM and/or performance-based teacher evaluations systems include couching that endorsement in the distorted claims about teacher quality’s impact on measurable student outcomes and depending on reports and claims made by NCTQ and the Bill and Melinda Gates Foundation. [3]

What, then, are the current patterns from the research on VAM and performance-based models and how should those patterns shape policy? [4]

• VAM and test-based evaluations for teachers remain both misleading about teacher quality and misrepresented by research, the media, and political leadership. Numerous researchers have detailed that teachers identified as high-quality or weak one year are identified differently in subsequent years: Numerous factors beyond the control of teachers remain reflected in test scores more powerfully than the individual impact of any specified teacher. The debate over teacher quality and measuring that quality, then, is highly distorted, as Di Carlo explains: “Whether or not we use these measures in teacher evaluations is an important decision, but the attention it gets seems way overblown.” This report makes that mistake.

• VAM and performance-based teacher evaluations in high-stakes settings distort teaching and learning by narrowing the focus of both teaching and learning to teaching to the test and test scores. VAM and test-based data are likely valuable for big picture patterns and in-school or in-district decision making regarding teacher assignment, but VAM and performance-based evaluations of individual teachers remain inaccurate and inappropriate for evaluation, pay, or retention.

Particularly in a state such as SC where poverty and state budget concerns burden the state and the public school system, VAM and performance-based systems that rely on extensive retooling of standards, testing, and teacher evaluation systems are simply not cost effective (Bausell, 2013): “VAM is not reliable or valid, and VAM-based policies are not cost-effective for the purpose of raising student achievement and increasing earnings by terminating large numbers of low-performing teachers” (Yeh, 2014).

And further, rejecting VAM and the use of significant percentages of student test scores to evaluate and retain teachers is not rejecting teacher accountability, but confronting the misuse of data. Ewing (2011) clarifies that VAM is flawed math and thus invalid as a tool in teacher evaluation:

“Of course we should hold teachers accountable, but this does not mean we have to pretend that mathematical models can do something they cannot. Of course we should rid our schools of incompetent teachers, but value-added models are an exceedingly blunt tool for this purpose. In any case, we ought to expect more from our teachers than what value-added attempts to measure.”

If SC chooses to reform teacher evaluation (which remains a project far less urgent than other problems being ignored), the state would be guided better by Gabriel and Allington (2012), who have analyzed and challenged the Gates Foundation’s MET Project, which has prompted misguided and hasty implementation of VAM-style teacher evaluation reform:

“Although we don’t question the utility of using evidence of student learning to inform teacher development, we suggest that a better question would not assume that value-added scores are the only existing knowledge about effectiveness in teaching. Rather, a good question would build on existing research and investigate how to increase the amount and intensity of effective instruction.”

Gabriel and Allington (2012) recommend five questions to guide teacher evaluation reform, instead of VAM or other student-outcome-based initiatives:

“Do evaluation tools inspire responsive teaching or defensive conformity?…

“Do evaluation tools reflect our goals for public education?…

“Do evaluation tools encourage teachers to use text in meaningful ways?…

“Do evaluation tools spark meaningful conversations with teachers?…

“Do evaluation tools promote valuable education experiences?”

Again, these guidelines are evidence-based alternatives to discredited and experimental commitments to the misrepresented evidence in the report from LWV SC, but SC remains overburdened by issues related to equity and opportunity that outweigh the need to reform teacher evaluation at this time.

How Should SC Proceed with Teacher Quality and Retention?

On balance, this report misrepresents teacher quality and overstates the need and ability to identify high-quality teachers using VAM and other performance-based policies. The flaws in this report grow from an over-reliance on misguided and misrepresented research and advocacy while ignoring the rich and detailed evidence from the full body of research on teacher quality. Finally, the report concludes by discrediting SC’s current teacher evaluation system (ADEPT) in the context of the inaccurate and distorted claims in the report.

Ultimately, the report encourages SC to spend valuable time and resources on policies that are dwarfed by more pressing needs facing the state and its public schools—a failure of state leadership replicated in the perpetual retooling of state education standards (Common Core State Standards adoption) and high-stakes testing based on revised standards. In short, SC has a number of social and educational challenges that need addressing before the state experiments with revising teacher evaluation and retention policies, including the following:

• Identify how better to allocate state resources to address childhood and family poverty, childhood food security, children and family access to high-quality health care, and stable, well-paying work for families.

• Replace current education policies based on accountability, standards, and testing with policies that address equity and opportunity for all students.

• Address immediately the greatest teacher quality issue facing SC’s public schools—inequitable distribution of teacher quality among students in greatest need (high-poverty children, children of color, ELL, and special needs students).

• Address immediately the conditions of teaching and learning in the state’s schools, including issues of student/teacher ratios, building conditions and material availability, administrative and community support of teachers, equitable school funding and teacher salaries, teacher job security and academic freedom in a right-to-work state, and school safety.

Any policy changes that further entrench the culture of testing in SC as a mechanism for evaluating students, teachers, and schools perpetuate the burden of inequity in the state and schools.

SC does not need new standards, new tests, or a new teacher evaluation system. All of these practices have been implemented in different versions with high stakes attached for the past thirty years, and the current result is the same identified failures with public schools that were the basis of these policies.

SC, like much of the US, needs to come to terms with identifying problems first before seeking solutions. The problems are ones of equity and opportunity, and no current teacher evaluation plan is facing those realities, including this report.

[1] For additional examinations of out-of-school factors compared to teacher quality and in-school factors see Berliner, 2009, 2014; Hirsch, 2007; Rothstein, 2010; Traub, 2000.

[2] Also see Di Carlo, 2011, on TNTP.

[3] For evaluations of NCTQ and Gates see the following: (a) NCTQ: Benner, 2012; Merrill & Ingersoll, 2010; Baker, 2010, January 29; Baker, 2010, October 8; Baker, 2010, December 4, and (b) Gates: Baker, 2010, December 13; Baker, 2011, March 2; Libby, 2012; [Gates MET Project] Gabriel & Allington, 2012; Rubinstein, 2013, January 13; Rubinstein, 2013, January 9; Baker, 2013, January 9; Glass, 2013, January 14; Rothstein & Mathis, 2013.

[4] For comprehensive examinations of research on VAM, see Di Carlo, 2012, November 13*; Di Carlo, 2012, April 12 (reliability); Di Carlo, 2012, April 18 (validity); Baker, 2012, May 28 (misrepresentations of VAM); Baker, et al., 2010, August 27; Ewing, 2011, May. For recent concerns about legal action and VAM-based teacher evaluation, see Baker, Oluwole, & Green, 2013; Pullin, 2013.

* Di Carlo remains a proponent of including VAM in teacher evaluations.

References

Berliner, D. C. (2014). Effects of inequality and poverty vs. teachers and schooling on America’s youth. Teachers College Record, 116(1). Retrieved from http://www.tcrecord.org/content.asp?contentid=16889

Berliner, D. C. (2009). Poverty and potential: Out-of-school factors and school success. Boulder, CO and Tempe, AZ: Education and the Public Interest Center & Education Policy Research Unit. Retrieved from http://nepc.colorado.edu/publication/poverty-and-potential

Gabriel, R., & Allington, R. (2012, November). The MET Project: The wrong 45 Million dollar question. Educational Leadership, 70(3), 44-49. Retrieved from http://216.78.200.159/Documents/RandD/Teacher%20Evaluation/MET%20Project%20-%20Wrong%2045%20Millon%20Question%20-%20Gabriel.pdf

Hirsch, D. (2007, September). Experiences of poverty and educational disadvantage. York, North Yorkshire, UK: Joseph Rowntree Foundation. Retrieved from http://www.jrf.org.uk/knowledge/findings/socialpolicy/2123.asp

Peske, H. G., & Haycock, K. (2006, June). Teaching inequality: How poor and minority students are shortchanged on teacher quality. Washington DC: The Education Trust, Inc. Retrieved from http://www.edtrust.org/sites/edtrust.org/files/publications/files/TQReportJune2006.pdf

Rothstein, R. (2010, October 14). How to fix our schools. Issue Brief 286. Washington DC: Economic Policy Institute. Retrieved from http://www.epi.org/publications/entry/ib286

Thomas, P. L. (2011, December). Teacher quality and accountability: A failed debate. Daily Censored. Retrieved from http://www.dailycensored.com/teacher-quality-and-accountability-a-failed-debate/

Traub, J. (2000, January 16). What no school can do. New York Times Magazine. Retrieved from http://local.provplan.org/pp170/materials/what%20no%20school%20can%20do.htm
