It was bad enough when No Child Left Behind (NCLB) went into effect and we had to start evaluating schools based on standardized tests. Those tests put the entire focus of public education on math and language arts to the detriment of science, social studies, and the arts, and teachers, for the sake of their jobs and their schools’ survival, had to put all other teaching considerations aside and focus on the tests.
Back in 2002, I was invited to speak to a group of Ukrainian English teachers in a State Department International Visitor Program. They had asked to hear about developments in the U.S. public school system. I began to explain NCLB’s policy of Adequate Yearly Progress (AYP), in which student performance had to improve every year on mandated tests until all schools were performing at 95 or 96% in reading and mathematics. Suddenly, my listeners protested. Some even banged the table. One member summed up their concern: “Don’t let them do that. The Soviets did that to our educational system and they destroyed it.” The essence of their argument was that unrealistic demands had led to (very understandable) corruption.
While the Obama administration has shown considerable interest in revising elements of No Child Left Behind, it seems to have put a new emphasis on evaluating teacher effectiveness based on student test results. Frequently, as much as 50% of a teacher’s effectiveness rating and compensation would be based on those results.
This concerns many of us. Even the most powerful statistical models have difficulty capturing the complexity of what occurs in the classroom. Beyond such measurement problems, there are situational problems that make it difficult to evaluate the impact of individual teachers. These include students who come and go during the year; “summer learning loss” (which seems to affect students of lower socioeconomic status more than others); what the students managed to learn (or failed to learn) from a previous year’s teacher; which students are placed in a particular classroom; the effect of other teachers the students have in the same academic year; the contribution that parents and other family members make (or don’t make) at home; and the effects of inclusion classrooms, team-taught classrooms, and outside tutoring.
The stakes in this debate have shot up recently. On August 15th, the Los Angeles Times provided access to a database of teacher effectiveness ratings derived from student test results using a Value Added Modeling (VAM) approach. When asked about the ethics of publishing this data, Arne Duncan, Obama’s Secretary of Education, replied, “What’s there to hide?”
We should not hide Problems with the Use of Student Test Scores to Evaluate Teachers (2010), a briefing paper that the Economic Policy Institute (EPI) has just released. While the briefing paper acknowledges that the VAM approach can lead to stronger analyses than single-point analyses of student test results, it also reports on a number of studies critical of the VAM approach, including several with surprising results. In a study of one VAM assessment, fewer than one third of the teachers who rated in the top 20% remained in the top 20% the following year, and another third of that top 20% moved to the bottom 40% the following year. Another study found that students’ fifth grade teachers were among the surest predictors of students’ fourth grade test scores. Since a fifth grade teacher cannot have influenced what students learned the year before, that result suggests the model is picking up how students are assigned to classrooms rather than what teachers contribute.
The EPI’s critique focuses in particular on VAM, as it is one of the most sophisticated models in use. However, the paper’s warning about using student test results to rate teachers for raises and firing is more general. It lists five ways that effective (or ineffective) teachers can be misidentified due to statistical limitations. It also describes several practical limitations (such as inappropriate tests) of relating student scores to teacher effectiveness, as well as the unintended consequences of this kind of teacher rating system, which include disincentives to work with the neediest students and decreased teacher collaboration.
The paper makes recommendations that we teachers are very familiar with: that teacher effectiveness can be increased through systematic observation protocols, videos, and teacher interviews, and that items such as lesson plans, assignments, and samples of student work can be used to identify effective and ineffective teachers and provide support for those in need. This kind of teacher collaboration, the essence of professional development, can make ineffective teachers effective and make the effective ones even stronger. With such support, even those who just can’t make it can leave knowing they weren’t disrespected.
We need to stand up and stand together on this issue. If no one makes the counterargument against basing teacher effectiveness decisions on student test results, it will become standard practice. I know these arguments are complicated and it can be hard to translate what you feel into talking points, but Problems with the Use of Student Test Scores to Evaluate Teachers provides the tools to help you do that.
Contact your members of Congress. Send a message to the Department of Education. If you need to find the contact information, go to TESOL’s Advocacy Action Center. And please, talk this issue up among colleagues. It’s the start of a brand new year and a good time to act.
Baker, E. L., Barton, P. E., Darling-Hammond, L., Haertel, E., Ladd, H. F., Linn, R., . . . Shepard, L. A. (2010). Problems with the use of student test scores to evaluate teachers (Briefing Paper No. 278). Retrieved from Economic Policy Institute Website: http://epi.3cdn.net/724cd9a1eb91c40ff0_hwm6iij90.pdf