Tuesday, September 29, 2009

Rhee's blindness

Courtesy of the Washington Post, Steve Pearlstein interviews Michelle Rhee on her failed gambit to get teachers to give up tenure protections in exchange for a chance to make more money.   It shines an interesting light on how Rhee thinks.  She completely fails to understand why teachers rejected her ideas -- namely that they distrust her personally, that money by itself is not the strongest motivator of most teachers (else they would be in different professions), and that any teacher who accepted Rhee's "green plan" would be breaking ranks with her fellow teachers, thereby putting herself in an uncomfortable and possibly precarious situation.

Instead, Rhee offers two explanations.  First, a lack of communication, which enabled "misinformation" about the plan to take root.  Second, that a rank-and-file contingent (the leadership is written off) she expected as vocal supporters of the plan failed to materialize.  Pearlstein helpfully offers a "Gresham's Law" (bad money drives out the good) interpretation of the latter, and Rhee riffs on that, bringing us back to her favorite good teacher/bad teacher theme.  The bad teachers (in her view) were the vocal activists at union meetings who fought against the plan, while the good teachers who, Rhee is certain, favored the plan were too busy at home working on the next day's lesson to come out in support.

In other words, Rhee remains convinced that she is right: if teachers had actually understood the plan and could have democratically expressed their preference, they would have endorsed it.  Not for a moment does she consider that teachers might have legitimate reasons for opposing her plan, and Pearlstein smiles and probes not.  By (again) mobilizing the good teacher/bad teacher schtick, she demonstrates that she fails to understand how offensive it is to ALL teachers, because it only serves to fuel the deep anti-teacher sentiment so rife these days.  And that offensiveness breeds loathing and distrust.

Saturday, September 26, 2009

No regrets

"I make mistakes all the time, but I don't have regrets about them."

Thus speaks Michelle Rhee in a Washington Post profile.  What a frightening statement coming from such a powerful person.

Tuesday, September 22, 2009

Maryland high school seniors triumphantly reach the floor

The Washington Post reports that only 11 of 60,000 high school seniors in Maryland were denied diplomas for failing to pass 4 supposedly "rigorous" exams in math, English, science, and government.  (Well, only about 40,000 students actually passed all 4 tests while the others slid past by other means, but never mind the trifling details.)  "Now that we have achieved a floor," crows state superintendent Nancy Grasmick, "I think the next step is to raise the standards."

That's exactly what I said the last time I drank too much and found my nose buried in the carpet!

Sunday, September 20, 2009

How?

"I will always make decisions that are in the best interest of kids." Always spoken with great gravity and sincerity, this is one of the favorite utterances "reform" superintendants like Joel Klein or Michelle Rhee.  I wish, for once, upon hearing this bromide, some intrepid journalist or parent would ask the simple question: How?

Thursday, September 10, 2009

They're doing it again, part 2

More on the NYC teacher "value-added" reports.  In the previous post, I pondered how they managed to come up with a way of creating a percentile rank out of fuzzy data points.  So taken was I with the NYC Dept of Ed's statistical gimmickry that I forgot to note just how fuzzy those data points are. 

Changes in students' standardized test scores from year to year are mostly random.   The phenomenon is well known in the psychometric literature. Every test score is a fuzzy measure of a student's true ability, so when you subtract one fuzzy measure from another, you mostly end up with fuzz.  Because the measuring stick is so imprecise, it's almost impossible to say with any confidence whether one student gained more than another over the course of a year.  It's like measuring grains of sand with a household ruler.
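
If you want to see what that fuzz looks like, here's a toy simulation.  This is my own back-of-the-envelope sketch with invented numbers, not anything resembling the DOE's actual model: when the measurement error on each test is large relative to true growth, the measured gain barely correlates with the true gain.

```python
import numpy as np

rng = np.random.default_rng(0)
n_students = 10_000

true_year1 = rng.normal(0, 1, n_students)      # true ability in year 1
true_gain = rng.normal(0.3, 0.1, n_students)   # true growth over the year
true_year2 = true_year1 + true_gain

meas_error_sd = 0.5                            # hypothetical test measurement error
score1 = true_year1 + rng.normal(0, meas_error_sd, n_students)
score2 = true_year2 + rng.normal(0, meas_error_sd, n_students)

# Differencing two fuzzy scores doubles the error variance,
# so the measured gain is mostly fuzz.
measured_gain = score2 - score1
print("SD of true gains:    ", round(true_gain.std(), 2))
print("SD of measured gains:", round(measured_gain.std(), 2))
print("corr(true, measured):", round(np.corrcoef(true_gain, measured_gain)[0, 1], 2))
```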

Now if there were a few particularly large chunks of sand, you could confidently say they were bigger than the others, but for the most part, all the grains of sand would be indistinguishable as far as you could tell with your household ruler.

Quantity would help.  If you had 2 groups of sand with lots of grains (say, millions in each), and you knew how many grains were in each group, you could confidently measure the average size of the grains in each group by putting them in a big beaker of water, using your household ruler to measure how far the water rose, and calculating the average displaced volume per grain.  (Well, you'd have to go look up some formulas in your kid's 8th grade math book first, but it could be done, in theory.)
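
For the curious, here's the back-of-the-envelope arithmetic for the beaker trick, with every number invented for the sake of the sketch (beaker size, water rise, grain count): the ruler's coarse error gets spread over a million grains, so the per-grain average comes out remarkably precise.

```python
# All of these numbers are invented for illustration.
beaker_area_cm2 = 50.0        # cross-sectional area of the beaker
water_rise_cm = 2.4           # read off the household ruler
ruler_precision_cm = 0.1      # the ruler is only good to about a millimeter
n_grains = 1_000_000          # you counted them (in theory)

total_volume_cm3 = beaker_area_cm2 * water_rise_cm
avg_grain_volume = total_volume_cm3 / n_grains
# The ruler's coarse error gets divided among a million grains:
avg_volume_error = beaker_area_cm2 * ruler_precision_cm / n_grains

print(f"average grain volume: {avg_grain_volume:.1e} +/- {avg_volume_error:.1e} cm^3")
```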

So, the big question when it comes to gains in NYC student test scores is: how many are needed to get a good read?  Evidently, as shown here and here, a whole schoolful of kids is not enough.  These posts show that there is almost no correlation between school-level average test score gains from one year to the next.  Now, we know Klein loves shaking things up, but this suggests a degree of chaos in schools that is hard to believe.  The more plausible explanation for the lack of correlation between years is that average gains, at the school level, are mostly fluff.  It appears that most schools do not have enough kids to provide an accurate measure.
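
The same kind of toy simulation, again with made-up numbers rather than actual NYC data, shows how you get essentially zero year-to-year correlation when school-average gains are mostly noise:

```python
import numpy as np

rng = np.random.default_rng(1)
n_schools = 400
true_effect = rng.normal(0.3, 0.05, n_schools)   # small, stable true school effect
noise_sd = 0.25                                  # noise in a school-average gain

gain_year1 = true_effect + rng.normal(0, noise_sd, n_schools)
gain_year2 = true_effect + rng.normal(0, noise_sd, n_schools)

print("year-to-year correlation of school average gains:",
      round(np.corrcoef(gain_year1, gain_year2)[0, 1], 2))
# With these made-up numbers the correlation comes out close to zero,
# even though nothing about the schools changed between years.
```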

I suspect, though I'm not sure, that most NYC teachers teach far fewer than a whole schoolful of kids.  So if "progress" cannot be accurately measured for entire schools, what does this tell us about the accuracy of progress measured for teachers?

My guess is that the weird confidence intervals the NYC DOE gives for teachers' percentile ranks (and I am truly curious how they get those) would be much, much larger if they accounted for the imprecision in the measurement of teachers' average test score gains.

Tuesday, September 8, 2009

They're doing it again

Just days after nobody but the intellectually blind could fail to see that NYC's school report cards are statistical junk comes news that the DOE is releasing teacher data reports.  Naturally, the DOE doesn't publish details of its procedures, which, by itself, consigns the reports to the trash.

But even without the documentation, from the sample supplied courtesy of the NY Times, it is obvious that these bozos do not know what they're doing.  I'm going to skate past lots of quibbles and get to the weirdest thing, which is those ranges around the teacher value-added percentiles.  Think back to when you got your SAT scores.  Did they give you a possible range?  Did your report say you scored somewhere between the 50th and 90th percentiles, but we think you're probably close to the 70th (which, for what it's worth, is the wrong interpretation of a confidence interval)?  No, of course not, because a percentile, by definition, is a real position on a scale, not a statistical estimate.

"Value-added gain" is a statistical estimate (sort of) being composed of the difference between "actual gain" and "predicted gain,"  where the latter is a predicted value with an associated error range (or, to be precise, the sum of lots of predicted values and their associated error ranges -- it's not known whether DOE did that summation right).  So the value-added gain should have a confidence interval around it (well, not really, because none of these data come from a random sample, but let's suspend disbelief for a bit). 

My guess is DOE tried to translate an imprecise value-added measure into an imprecise percentile rank.  How they did that is truly a mystery, because there ain't no way the percentile ranges should be symmetrically centered around a midpoint, but they are.  Why is this? 

Warning: geekspeak ahead.

The value-added estimate should come as a point estimate plus or minus some error.  For example, 53 plus or minus 5.  Now, if 53 is above the 50th percentile, this minus 5 is going to pull you down more percentile points than the plus 5 is going to pull you up, so on the percentile scale you should get an asymmetrical range around the point estimate.  Think back to ye olde bell curve.  Lots of folks clustered around the middle, so a small score shift moves you up past quite a few others.  But there aren't too many people out in the tails, so the same small shift doesn't go as far.  Nope, symmetrical percentile ranges just don't make sense.
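
If you want to see the asymmetry in actual numbers, here's a quick sketch.  The distributional assumption is mine alone (value-added scores normal with mean 50 and standard deviation 5, purely for illustration -- who knows what the DOE assumes):

```python
from scipy.stats import norm

# Assume (purely for illustration) that value-added scores are
# normally distributed with mean 50 and standard deviation 5.
mean, sd = 50.0, 5.0
point, error = 53.0, 5.0      # the "53 plus or minus 5" example above

def to_percentile(score):
    return 100 * norm.cdf((score - mean) / sd)

low = to_percentile(point - error)
mid = to_percentile(point)
high = to_percentile(point + error)
print(f"percentile range: {low:.0f} to {high:.0f}, point estimate {mid:.0f}")
# => roughly 34 to 95 around 73: the same plus-or-minus 5 in score units
#    reaches much farther below the point estimate than above it.
```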

Darn and golly gee, my curiosity is whetted.  I really wonder how they did it.  Hope they publish details soon.

Sunday, September 6, 2009

Great Moments in Social Science

The Chicago public schools, under new CEO Ron Huberman, have come up with a "newfangled" statistical model predicting which kids will get shot.  According to the Chicago Sun-Times article:
The "brutal facts,'' [Huberman] said, are that such kids were more likely to be black males, homeless, special education students and students at alternative schools.  Such kids also tended to be at least two credits behind in high school, to have been absent for more than 40 percent of the school year and to have committed nearly one serious school violation per school year.
Amazing the stuff those newfangled statistical models can tell you.  No doubt gang membership and criminal activity had something to do with it as well.

I wonder what's the difference between a brutal fact and a gentle one.

Thursday, September 3, 2009

Rick Hess nails it

Frederick Hess works at the American Enterprise Institute, but he is so often right and says it so well (e.g. "the new stupid") that we can't hold that against him. Once again, he nails it:


While the Bush Department of Education was deservedly pummeled for having little use for those who questioned its agenda or actions, it has been fascinating to see that the Obama education team (for all its talk of moving beyond partisan divides) has proven every bit as insular. The leadership seems to find plenty of time for major foundations and sycophantic associations, but has shown little inclination to reach out to researchers, educators, or reformers who might challenge their assumptions and help sharpen their thinking.

One of the things that impressed me most about Obama is that he seemed like a genuinely smart, thoughtful guy. Somehow I still think he is, but that he made a mistake hiring a dunce for his secretary of education who really should be fired for all the deep doo-doo he's gotten the Prez into over stupid stuff at a time when the Prez really doesn't need such distraction. I mean, if Dunc can't even stage his boss to give a nice little talk to schoolkids without drawing fire, what sort of circus can we expect when NCLB renewal comes around? Duncan apparently led a very sheltered life there in Chicago with his protector mayor, and it's becoming increasingly clear that he just wasn't ready for prime time.


And kudos, Mr. Hess.

Wednesday, September 2, 2009

Do they read their own newspaper?

The NY Times ran a story today on a Gates-funded 2-year study of teacher effectiveness. Here is my comment:

Four days ago, the NY Times ran an editorial chiding the NEA and others “clinging to the status quo” who are opposed to regulations proposed by the US Department of Education for states applying for the $5 billion “Race to the Top” funds. A particular sticking point for the NEA — and many highly respected researchers who understand the technical complexities involved — is that the RttT requires states to use student achievement growth measures to evaluate teachers. Now the Times reports that there is a 2-year study underway to “figure out a way to measure exactly who is effective, who is not.” I hope the NY Times editorial board takes note. The reason for NEA and other opposition to the RttT proposed regulations is that WE DO NOT KNOW how to correctly and fairly measure teacher effectiveness, and we are years away from coming up with a scientifically sound approach for doing so. If we knew how to do it, we wouldn’t be needing 2-year studies to figure it out, would we? Since we don’t know how to do it right, the RttT requirements will force states to use whatever half-baked measures can be quickly cobbled together so they can take their place at the federal trough. In other words, teacher evaluations under the proposed RttT regulations would be based on JUNK SCIENCE, of roughly the same quality used to create the school progress report cards for New York City schools.