Wednesday, September 21, 2005


Did He Make It? Bring In the Chains!

In response to yesterday’s entry about mission creep, a friend on the tenure track at a midtier school wrote:


...the research expectations at this place are on the rise. A large percent of my department members, especially the younger faculty, have at least one book, for example. At the same time, we are expected to teach well. What’s been getting me is the steps being taken to determine if we are teaching up to par. First, there are the repeated ‘peer evaluations’ of your teaching, or when a colleague sits through your course. I’m in the middle of that, and besides being annoying, it very much feels like a bureaucratic hoop. My colleague was required to show up to every one of my classes (all three of them), and the requirements say that she has to make ‘repeated’ visits. And I need another person to do the same. Second, I’m supposed to put together a teaching portfolio...[that] will be hundreds of pages. Third, for every single course, I’m supposed to read my course evaluations and use them to write a self-analysis that describes strengths and weaknesses as well as discusses how I would improve on mistakes. Of course, I always find out about these rules secondhand, because the people who come up with them consider them self-evident.

The point being: The problem is only partially that we are being pulled in two directions. It’s also the bureaucratic enforcement. If this were still primarily a teaching institution, all this crap would be more tolerable. But, when you lose valuable research time and energy creating meaningless documents and sitting through other people’s classes, it seriously sucks.


Yes, it does. It sucks from the management side, too. Let’s say your institution values teaching, and someone you know to be a far below-average teacher is coming up for tenure. You want to deny, since you’re reasonably sure you could hire someone much stronger. (Assume, for the sake of argument, that you’re reasonably sure you won’t lose the position altogether.)

How do you prove that the candidate isn’t up to snuff as a teacher? What can you use? (And yes, in this litigious climate you have to assume that any negative decision will be challenged.)

Peer evaluations don’t work, since they’re subject to all manner of bias. In colleges with ‘consensus’ cultures, the unwritten rule is that no professor ever speaks ill of another to a dean. So the Lake Wobegon effect kicks in, and everybody is far above average, rendering the evaluations worthless. In a conflictual culture, the evaluation will reflect the political fault lines of the department – more interesting to read, but still useless as a sign of actual teaching performance.

Evaluations by chairs are subject to both personal whims and political calculations, so their worth is frequently suspect, as well.

Student evaluations are less likely to reflect internal departmental politics, but they have a somewhat justified reputation for being manipulable. More disturbingly, I’ve read that student evaluations correlate with both the physical attractiveness of the teacher (particularly for male teachers, oddly enough), and the extent to which the teacher plays out the assigned gender role (men are supposed to be authoritative, women are supposed to be nurturing – when they switch, students punish them in the evaluations).

Other than evaluations, what should count? Outcomes are tricky, since they’re usually graded by the professor being evaluated. (My previous school used to fire faculty who failed ‘too many’ students. You can imagine what happened to the grading standards.) Outcomes also commonly tell you more about student ability and interest going into the course than about what went on during it.

Attendance isn’t a bad indicator, but it’s hard to get right. If the students hate the class so much that they simply stop showing up, something is probably wrong. But good luck getting that information in a regular, reliable way. And it, too, often reflects time slot and local culture.

Bad measures aren’t unique to academia. On those rare occasions when I actually get to watch football, I always get a kick out of the moments when they aren’t sure if the runner made quite enough yards (meters) for a first down. The referees put the ball on the ground where they think it should be, then march two poles onto the field, each supporting one end of a ten-yard chain. They plunk the first pole down by sort of eyeballing it, then pull the chain taut and use the location of the ball relative to the second pole to see if the runner made it. I’ve seen decisions hinge on inches (centimeters).

The false precision always makes me chuckle. If they can just eyeball the location of the first pole, then exactly how precise can the second pole really be? It’s just the initial eyeballed spot, plus ten yards.

Measuring the quality of teaching, sadly enough, is sort of like that. We use ungainly and even weird measures because we need to use SOMETHING, and nobody has yet come up with a better, practicable idea. Bad teachers rarely confess, so we need evidence. It’s fine to condemn the evidence we use – I certainly think my friend’s school is going way, way overboard – but I don’t foresee any change until we have an alternative.

Question for the blogosphere: how SHOULD we measure teaching quality? Put differently, what evidence would be fair game to use to get rid of a weak, but not criminal, teacher? If there’s a better, less intrusive way to do this that still gets rid of the weak performers, I’m all for it. Any ideas?

aren't you really asking the wrong question? shouldn't you ask 'are the students learning?' I frequently find that 'teaching' devolves into opinions of method and philosophy; then, when those are found problematic, teaching indicators become popularity indicators. so teaching evaluations change depending on whether or not the teacher is popular... however, learning evaluations do not. how do you measure learning? I like the model they use in Europe: you send your students' coursework to another university to be graded blind by an expert in the field.
No, that wouldn't work. The problem with that (and all other 'learning' models) is that you don't have a baseline from which to measure the value added. (I know, that's jargon-y, but hey.) Pre-tests don't work, since students don't take them seriously.

Given the different populations that different schools (and different disciplines within schools) attract, I think this approach would quickly penalize anybody who works with high-risk students, and would falsely reward folks at schools with selective admissions.
One thing about attendance--students may like the class, but if it's at 8 or even 9:30am, you just aren't going to get a full house every time.

I can certainly see the difficulties inherent in teaching evals, but still, I find it kind of ridiculous that I don't get evaluated at all as part of my 3rd year review. They just go on the basis of student evals. Shows how little teaching counts at XU...
I imagine that you'd need someone who was a specialist teaching evaluator. Such a person would have to look at a variety of sources to try and find out whether someone was a good teacher or not.


a) What methods does the teacher use to make materials available to students, and how timely and relevant are they?

b) Conduct a focus group of some of the students who attended at least 70% of the lectures on a course, quizzing them with specific, directed questions about the lecturer's performance — not letting vague generalities like "not bad" stand, but getting specifics about the lecturing style, how well information was conveyed, and how that lecturer compares with others.

c) Show their lecture materials to two others elsewhere in the same field to see whether the explanations make sense, how suitable the coverage is with respect to the syllabus.

d) The evaluator could attend a lecture and focus on how attentive the students are (or aren't).

I'm sure there's lots more creative evaluation methods that could be used. But I agree it's problematic. The only thing I am pretty convinced of is that you need a variety of sources of information to do a proper measure, because teaching has so many different aspects to it.
We've gone away from inputs like you've discussed and moved toward outputs (Student Learning). Instead of the football analogy, which I like a lot BTW, we are using specific Learning Outcomes as described by O'Banion's research from the League for Innovation in the Community College. "By the end of the term, the student will be able to X with 90% accuracy." So far it's working OK, but we are still relatively new at the game.
I like the specific learning outcomes concept, but how do you know how much reflects the quality of instruction, and how much reflects what the students brought with them to the class in the first place?
The question of evaluating teaching is hard, because the things we would like to measure are likely to be the results of a whole constellation of factors, only some of which are under the control of the faculty member.

One point I would make is that we might, in some ways, try to make the evaluation of teaching more like our evaluation of research. Now, don't go ballistic on me just yet.

This involves outside, anonymous peer review, and not at the point of decision-making (tenure, promotion), but on an ongoing basis. In research, we peer review articles, for example. In teaching, we could, if we chose to commit the resources to it, peer review the course or something like it. The structure for doing this might be difficult to establish, and it would be expensive. Peer reviewers would have to be trained.

Actually, my real feeling is that universities are generally not really serious about evaluating teaching/learning. They'd like to be seen as being serious, but they're not. The consequence is what Dean Dad refers to as mission creep--in this context the elevation of research, because it's easier to measure, other people do it for you, and it's free.

Then again, if anyone would like to see commentary on performance evaluation in other contexts, take a look at some of the recent posts in Mark A. R. Kleiman's blog.
I have no brilliant ideas but I like the concept of outside review (along the lines of "specialist teaching evaluator" proposed by lossy). Expensive in terms of money and time so it could never really happen. As someone who would appreciate more feedback/review/appreciation, I like to dream.

How about if we convince all schools with master or doctorate degrees in education to require they go to one local school (outside of their own) and review a class/instructor/professor each week for some sort of semester/term credit? Or, sister schools where evaluations would be done by someone who may have less bias but would be coming from a similar teaching background in regards to the type of student body. For example, a CC would only be a sister school for another CC, not a R1 school. Again, time and cost prohibitive but fun to think about.
sorry, master or doctorate degree programs in education - not degrees in education.
baseline does not matter, because we should not be talking about the 'teacher' at all. we should be talking about what the students learn. if a student chooses to enter the course knowing everything, that is their choice. but no course will be entirely full of those students. learning, as such, is measured by external grades: whether or not, in any given class, the majority of students have mastered the material. you could have a blathering rock as a teacher, but if somehow he/she manages to get the students to learn and master the course material... then she/he is an effective teacher. end of story. everything else, in my mind, is either a popularity contest, creeping scientism, or 'theory of the day'.

i don't think the external approach penalizes anyone if it is done with finesse. seeing as significant parts of the world use it with some degree of success, isn't it likely that the resistance in the u.s. is ideological? it isn't that you send the material to the external reviewers blindly: you prepare a report on the class, what was covered, what the demographics are, etc. then the assignments are graded, sometimes by two external reviewers. this process removes instructor bias, and the odd habit of teachers of giving out grades for things other than merit.
When talking about teacher performance, baselines absolutely matter. That’s how you can determine "what students learn" versus "what they already knew."

Measuring strictly by grading outcome will encourage teachers and institutions to shy away from students who aren’t already prepared or are difficult to teach.

The problem is community colleges can’t do that. Defective widgets can be destroyed. Dumb or ill-educated kids at a community college can’t be. Somebody’s got to take on the job. Why rig the system at a CC to cripple the career of that “somebody?”
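The value-added point above is just subtraction, but a toy sketch (invented numbers, purely illustrative) shows why raw outcomes and value added can rank the same two classes in opposite orders:

```python
# Hypothetical illustration with invented numbers: two classes take the
# same pre-test and final exam, scored 0-100.
selective = {"pre": 80, "post": 90}   # well-prepared, selective-admissions students
open_door = {"pre": 40, "post": 65}   # high-risk, open-admissions students

# Judged on raw outcomes, the selective class looks better (90 vs. 65).
# Judged on value added (post minus pre), the open-door class does.
gain_selective = selective["post"] - selective["pre"]
gain_open_door = open_door["post"] - open_door["pre"]

print(gain_selective, gain_open_door)  # the open-door gain is larger
```

Without the baseline (the pre-test column), the two measures are indistinguishable, which is exactly why grading-outcome-only evaluation rewards selective admissions rather than teaching.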

Hiring a designated “teacher assessor” who travelled from class to class sounds like a catastrophically bad idea. Said person would instantly become the Secret Police of the school. Feared, loathed, and the target of more flattery and bribes than anyone you’ve ever met. Also, teachers would gear their classes to please that single assessor. Not a good idea.

What if the assessor is a jerkass? Or plays favorites? Or doesn’t understand a particular teacher’s approach? Those are risks inherent to any system, but concentrating assessment power in a single person’s hands magnifies those risks beyond any level I’d deem acceptable.
It is an important question. Where I work the HoD bases his evaluations on (I quote) "what I see in meetings" and student evaluations only. So basically he evaluates us on our administrative and relational skills, and what students choose to report.

I once argued that he should go himself to observe classes, but to no avail.

Somebody observing is necessary, we can't just count on students, though it's also important. I would argue for some kind of rotating peer evaluation (that is not too time consuming), with the Dean/HoD also actually seeing what is happening, and a meeting together to discuss the feedback.
So overall you have three different sources of information, and the opportunity to put them in context.
Exams etc. should be monitored by someone else too.
Sorry I can't find a way to send an email, DeanDad. You say "I’ve read that student evaluations correlate with both the physical attractiveness of the teacher (particularly for male teachers, oddly enough), and the extent to which the teacher plays out the assigned gender role (men are supposed to be authoritative, women are supposed to be nurturing – when they switch, students punish them in the evaluations)"
Can you tell me where you read this please?
Here's an annotated bibliography of research on how gender affects evaluations.

Hope that helps.
You've already received a number of comments about this, but it seems like a "secret shopper" approach would be best. Have non-students (in your case, possibly community volunteers) receive a brief training course in what you consider the reasonable metrics, and then have them take the course.