Shuffling through the pressing crowd at the convention center, I felt very much out of place, but at the same time a bit privileged. I, along with another teacher from the district, had been invited to attend a regional meeting for administrators for the unveiling of some incredible evaluation tool that would change everything when it came to assessment.
That was nearly a decade ago, but I still remember sitting at the table and listening to the presenter protest our misled understanding of success. He said it didn’t matter if 100% of our students passed the state test or not; that meant nothing and was no cause for pride. What really mattered was whether we showed value added to each student. That was the mark of true success, and his program, using a complex statistical formula, was just what we needed. Not only would it deliver reports on every learning objective imaginable, it would also predict the level at which each student should be performing year to year if teachers were doing their job.
As you might guess, the man’s presentation was a huge success, and most schools in the region signed on to his program touting this new tool called Value Added Measure (VAM). To everyone responsible for gathering data, this was a godsend. But for everyone responsible for producing the added value to each student, this was a nightmare.
On the surface, this kind of evaluation doesn’t seem too much to ask of teachers, and I suppose that would be true if the “complex statistical formula” measuring and predicting student performance met the two criteria of testing: validity and reliability.
Validity is the degree to which a test or program measures what it is supposed to measure. In other words, if it is measuring a student’s progress based on in-school instruction, then the tool must have some way of removing possible out-of-school influences that could affect performance, such as parents getting a divorce, a move to a new home, a break-up with a boyfriend, illness, financial difficulties, etc. The problem, of course, is that there is no way to remove any of these variables from students’ lives. They don’t leave them at the door when they step into the school.
The second requirement for standardized measurement is reliability: the extent to which the instrument being used produces the same results on repeated trials. In other words, a test, when administered under like conditions, must produce like results time and time again. Only when reliability is established can one know that the test is measuring true progress in learning and not just providing a snapshot of performance on a single day.
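For readers who like to see the idea in numbers, one common way to check reliability is to give the same test twice and see how closely the two sets of scores agree. The sketch below is only an illustration: the scores are invented, and the function name `pearson` is mine, not something taken from any VAM product.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Invented scores for five students on a first sitting of a test.
first_sitting = [62, 71, 80, 88, 95]

# Two invented retests under "like conditions".
steady_retest = [64, 70, 82, 87, 96]   # students land about where they did before
erratic_retest = [85, 60, 92, 70, 78]  # scores bounce around from day to day

r_steady = pearson(first_sitting, steady_retest)
r_erratic = pearson(first_sitting, erratic_retest)

print(f"steady retest correlation:  {r_steady:.2f}")
print(f"erratic retest correlation: {r_erratic:.2f}")
```

A correlation near 1 suggests the instrument gives like results on repeated trials. A correlation near 0 suggests each sitting is closer to a snapshot of a single day.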
Companies that produce these VAM programs know that their product must deliver validity and reliability. One such company, TAP, even addresses the requirements in their explanation of VAM on their website:
Value-added analysis is a statistical technique that uses student achievement data over time to measure the learning gains students make. This methodology offers a way to estimate the impact schools and teachers have on student learning isolated from other contributing factors such as family characteristics and socioeconomic background. In other words, value-added analysis provides a way to measure the effect a school or teacher has on student academic performance over the course of a school year or another period of time.
TAP attempts to address validity and reliability here, but reality leaves VAM short on both requirements. Even if the programmers can allow for students’ socioeconomic status and their family demographics, there is no way they can remove all the variables that come attached to different students. Reliability, which is dependent upon the stability of validity, cannot be established if the variables are always changing, as variables always do with children.
In short, the programmers of VAM tools have developed wares much like traveling medicine men used to cook up snake oil. Because everyone is so desperate to be cured, they are far too ready to buy the “solution”. The bottom line is that education has no shortcuts. It is messy. It is tumultuous, one day never like another, one student never experiencing the same day twice, teachers juggling strategies to meet each child’s needs. These program designers think they have created something complex, but their formulas do not even come close to the complexity of the human beings in every classroom every single day. There is no measurement for such complexity. Period.
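To put rough numbers on that concern, here is a deliberately oversimplified sketch. It is not TAP’s formula or any vendor’s model. It assumes the crudest possible version of value added, the average gap between each student’s actual score and predicted score, and every number in it is invented.

```python
def value_added(predicted, actual):
    """Average of (actual - predicted) across one teacher's students."""
    gaps = [a - p for p, a in zip(predicted, actual)]
    return sum(gaps) / len(gaps)

# Invented predictions for four students in one classroom.
predicted = [70, 75, 80, 85]

# Suppose the teaching really did lift every student 4 points above prediction.
true_learning = [74, 79, 84, 89]

# Hypothetical out-of-school events on test day (a divorce, a move, an illness)
# that the model has no column for.
shocks = [0, -12, 0, -8]
observed = [t + s for t, s in zip(true_learning, shocks)]

va_without_shocks = value_added(predicted, true_learning)
va_with_shocks = value_added(predicted, observed)

print(va_without_shocks)  # 4.0
print(va_with_shocks)     # -1.0
```

Under these made-up numbers, the same teaching looks like added value in one case and lost value in the other, purely because of events the tool cannot see.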
Another great post that simultaneously informs and turns my stomach. That school districts can purchase yet another battery of tests, blindly following the promises of a salesman – a salesman! – who has probably never spent a day teaching real students – makes me sick. This is likely the magic “Solution” that will be used until the next slick Pied Piper (with a few more bells and whistles) comes down the pike. As you stated so succinctly, there is no single measure that can declare or predict student “success.” But it seems teachers are the only ones who know this.
Thank you, thank you, thank you!! Teachers know that we are being sold snake oil as instant cures for things that require long-term investment in children, families, teachers, schools, communities… I appreciate your use of educational knowns in testing–validity and reliability–because it moves the discussion from what teachers know intuitively to what has been proven in research about how children learn, moving from subjective to objective experience. We, teachers, need to stand up to the traveling medicine shows, use the knowledge that we have acquired in classrooms with our students and in classrooms on the way to becoming the professionals that we are, to counter the madness that seeks only to steal education dollars from American public education and gut the ability of children (and their families, teachers, schools, communities) to become fully engaged participants in American society.
Love your line: “We, teachers, need to stand up to the traveling medicine shows…” Thanks for the reply!
God bless you for being a teacher. I believe teachers have one of the toughest jobs on the planet for many reasons and things like this just complicate their jobs more. I think you need the heart of a saint to even want to do this job.
Lisa, thank you for saying so clearly what I–and others–have been thinking. As you and a commenter above suggest, we’d do well to follow the money. I’ve been teaching for fifteen years. When I started, “test” was a bad word. The pendulum has swung, and the fact that these “reforms” come and go is, I guess, the good and bad news.
Thank you, Steve. Always love hearing from a fellow teacher!
Actually, there are statistical models that can control for the many variables, but your point is well taken in terms of validity. There’s another blogger who addresses VAM through a different lens – that of a wicked smart mathematician (or statistician). Her focus is on how the model can be gamed. Here’s a link to her post about VAM: http://mathbabe.org/2013/02/06/bad-model-high-stakes-gaming/
In Connecticut we’re ramping up a much more comprehensive evaluation system but the validity issue is still prevalent.
Randy, thanks for the reply and the link. Everyone, the link is worth the read. Even more reason to be concerned about VAM.
I love it whenever people come together and share opinions.
Great site, keep it up!