Wednesday, September 21, 2005

Did He Make It? Bring In the Chains!

In response to yesterday’s entry about mission creep, a friend on the tenure track at a midtier school wrote:


...the research expectations at this place are on the rise. A large percent of my department members, especially the younger faculty, have at least one book, for example. At the same time, we are expected to teach well. What’s been getting me is the steps being taken to determine if we are teaching up to par. First, there are the repeated ‘peer evaluations’ of your teaching, or when a colleague sits through your course. I’m in the middle of that, and besides being annoying, it very much feels like a bureaucratic hoop. My colleague was required to show up to every one of my classes (all three of them), and the requirements say that she has to make ‘repeated’ visits. And I need another person to do the same. Second, I’m supposed to put together a teaching portfolio...[that] will be hundreds of pages. Third, for every single course, I’m supposed to read my course evaluations and use them to write a self-analysis that describes strengths and weaknesses as well as discusses how I would improve on mistakes. Of course, I always find out about these rules secondhand, because the people who come up with them consider them self-evident.

The point being: The problem is only partially that we are being pulled in two directions. It’s also the bureaucratic enforcement. If this were still primarily a teaching institution, all this crap would be more tolerable. But, when you lose valuable research time and energy creating meaningless documents and sitting through other people’s classes, it seriously sucks.


Yes, it does. It sucks from the management side, too. Let’s say your institution values teaching, and someone you know to be a far-below-average teacher is coming up for tenure. You want to deny tenure, since you’re reasonably sure you could hire someone much stronger. (Assume, for the sake of argument, that you’re reasonably sure you won’t lose the position altogether.)

How do you prove that the candidate isn’t up to snuff as a teacher? What can you use? (And yes, in this litigious climate you have to assume that any negative decision will be challenged.)

Peer evaluations don’t work, since they’re subject to all manner of bias. In colleges with ‘consensus’ cultures, the unwritten rule is that no professor ever speaks ill of another to a dean. So the Lake Wobegon effect kicks in, and everybody is far above average, rendering the evaluations worthless. In a conflictual culture, the evaluation will reflect the political fault lines of the department – more interesting to read, but still useless as a sign of actual teaching performance.

Evaluations by chairs are subject to both personal whims and political calculations, so their worth is frequently suspect as well.

Student evaluations are less likely to reflect internal departmental politics, but they have a somewhat justified reputation for being manipulable. More disturbingly, I’ve read that student evaluations correlate with both the physical attractiveness of the teacher (particularly for male teachers, oddly enough), and the extent to which the teacher plays out the assigned gender role (men are supposed to be authoritative, women are supposed to be nurturing – when they switch, students punish them in the evaluations).

Other than evaluations, what should count? Outcomes are tricky, since they’re usually graded by the professor being evaluated. (My previous school used to fire faculty who failed ‘too many’ students. You can imagine what happened to the grading standards.) Outcomes also commonly tell you more about student ability and interest going into the course than about what went on during it.

Attendance isn’t a bad indicator, but it’s hard to get right. If the students hate the class so much that they simply stop showing up, something is probably wrong. But good luck getting that information in a regular, reliable way. And it, too, often reflects time slot and local culture.

Bad measures aren’t unique to academia. On those rare occasions when I actually get to watch football, I always get a kick out of the moments when they aren’t sure whether the runner made quite enough yards (meters) for a first down. The referees put the ball on the ground where they think it should be, then march two poles onto the field, each supporting one end of a ten-yard chain. They plunk the first pole down by sort of eyeballing it, then pull the chain taut and use the location of the ball relative to the second pole to see if the runner made it. I’ve seen decisions hinge on inches (centimeters).

The false precision always makes me chuckle. If they can just eyeball the location of the first pole, then exactly how precise can the second pole really be? It’s just the initial eyeballed spot, plus ten yards.

Measuring the quality of teaching, sadly enough, is sort of like that. We use ungainly and even weird measures because we need to use SOMETHING, and nobody has yet come up with a better, practicable idea. Bad teachers rarely confess, so we need evidence. It’s fine to condemn the evidence we use – I certainly think my friend’s school is going way, way overboard – but I don’t foresee any change until we have an alternative.

Question for the blogosphere: how SHOULD we measure teaching quality? Put differently, what evidence would be fair game to use to get rid of a weak, but not criminal, teacher? If there’s a better, less intrusive way to do this that still gets rid of the weak performers, I’m all for it. Any ideas?