Wednesday, July 27, 2011


If I Could Get Away With It...

I’d have everyone on campus read Moneyball, by Michael Lewis.

It’s several years old now, but its point still holds. It was about a general manager of a baseball team who figured out years before his colleagues that he could test the ‘folk wisdom’ of scouts against actual statistics. In some cases, the stats showed that what the scouts took as received truth didn’t hold up. But there’s no way that an individual scout, looking only at his own players, would ever know that; the patterns only showed up when you abstracted from individual experience and looked at large numbers of cases over time. (For example, this general manager realized before most others did that “on base percentage” mattered more than “batting average,” so he was able to make some lopsided trades.)

Some people within baseball read Moneyball but missed the point. They replaced the old rules of thumb with the new ones. The point was that all wisdom needs to be tested empirically, and that what works can change over time. On-base percentage was an example of the point, rather than the point itself. Once the rest of the sport wised up to on-base percentage, that measure lost its usefulness for improving a team. (Now they’re trying to develop good stats for fielding.)

Although the context for the argument is baseball, the point is true outside it. Empirical data over large numbers of cases can contradict folk wisdom that seems right. And when it does, it’s time to call that folk wisdom into question.

I mention this because I keep running across a few programs that believe, with all sincerity and more than a little self-righteousness, that they are the absolute best at what they do. They’re quick with anecdotes and testimonials, and they can tell stories that go back decades. And the numbers -- the actual, honest-to-blog numbers -- show they’re wrong.

But these are the kind of numbers that require looking at a decade of performance and hundreds or thousands of students. You won’t see those, or figure them out, through the daily experience of teaching classes. They aren’t apparent at ground level. The folks who honestly believe that the programs are successful aren’t lying, any more than the scouts who picked the wrong players were lying; they’re just wrong. Sincere and well-meaning, yes, but wrong.
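To put rough numbers on why ground-level experience can miss this, here is a minimal sketch. It assumes a simple binomial model and uses made-up figures (a 50% baseline graduation rate and a 5-point real gap); neither number comes from any actual program, and `standard_error` is just an illustrative helper. It treats a single observed rate rather than a full two-group comparison, so read it as an illustration of scale, not a proper test.

```python
import math


def standard_error(rate: float, n: int) -> float:
    """Standard error of an observed proportion under a simple binomial model."""
    return math.sqrt(rate * (1.0 - rate) / n)


# Hypothetical figures, chosen only for illustration.
baseline_rate = 0.50  # assumed graduation rate without the program
true_gap = 0.05       # assumed real difference the program makes

for n in (30, 300, 3000):
    se = standard_error(baseline_rate, n)
    # How many standard errors wide is the assumed gap at this sample size?
    print(f"n={n:>5}  standard error={se:.3f}  gap/SE={true_gap / se:.1f}")
```

Under these assumptions, a 5-point gap is about half a standard error in a single section of 30 students, which is indistinguishable from noise, but more than five standard errors across 3,000 students.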

This is one area where I believe that administrators have something very real to contribute to discussions of curriculum. If a program that, say, is supposed to improve graduation rates actually harms them, it’s easy not to see that in the face of real success stories and impassioned advocacy. But not seeing it doesn’t make it go away. Having someone whose job it is to give the view from an external perspective has real value. That’s not to discount the experience of the folks in the trenches, but sometimes the aerial view can show you things that the folks in the trenches aren’t well-positioned to see.

Culturally, this has proved a difficult argument to make. It’s hard to tell a proud staff with years of confirmation bias behind it that the numbers don’t mesh with their story. They react defensively, and some decide that the best defense is a good offense. But reality is stubborn.

It’s becoming clear to me that there’s cultural work to be done before the statistical work can show its value. People have to accept the possibility that what they see right in front of them, day after day, doesn’t tell the whole story, and that it can even lie. They need to be willing to get past the “I’m the expert!” posturing and accept that truth is no respecter of rank.

And that’s why, if I could get away with it, I’d have everyone read Moneyball.

Wise and worldly readers, assuming that there’s little appetite for allegories drawn from professional sports, is there a better way to make this point? A way that fits with academic culture, and that allows everyone to save face, but that still gets us past the discredited legends of the scouts?

The big suggestion I'd make is stop categorizing administration as on the side of "truth" and "reality" - implying that faculty and staff are on the side of, well, you do say they aren't liars and you don't want to discount their experiences, but nevertheless imply that they are stupid and wrong because their claims are "folk wisdom" and that what they see day after day is lying to them.

Um, you're talking about your colleagues here. Perhaps showing them some professional courtesy might be useful.

The fact of the matter is, testimonials are "true" insofar as they go, just as stats are only "true" insofar as *they* go. Quantitative data is not, in fact, "more true" than qualitative data, nor is the aerial view more real than the on-the-ground view - it's just the questions and methods are different, and the aims are different and lead toward different outcomes.

So what I'd suggest is that there is a need to educate your colleagues about *why* qualitative measures are appropriate in this case, *what the goal is* of gathering the information, the specifics about the methodology, and *how the information will be used.* And actually let them be part of the process and help to shape the direction of the information-gathering. And all of that needs to happen before you come in making baseball analogies about "the truth." Let your colleagues play a role in deciding how the process of self-study will go; let your colleagues bring their expertise to the table.

If you spring a bunch of numbers on people who are professionals and tell them that they are wrong, that in fact their "reality" is not reality but illusion, yes, they are going to get defensive. I don't think that's an issue of institutional culture. I think it's an issue of human nature. Professionals want respect. People want to be heard and valued. And if you want people to get with the program, you need to start there - not with number-crunching.
I'd recommend How to Lie with Statistics as a companion text, but Moneyball is a good example of putting the focus on a single agreed-upon success measure. Many of our challenges arise because we don't have a single outcome to measure. For example, it is impossible to find out if a student who started at your college graduated somewhere else or started a successful business.

I'm glad you raised this issue of data analysis because it makes it convenient for me to comment on something you slipped into yesterday's article about freeing up IR (institutional research) from mere compliance issues.

There is no reason that IR has to be the only way for a professor or group of professors (say, college algebra) or Deans (8AM classes) to look at what correlates with success and failure in a particular course or course sequence, or try innovative ways of measuring "success". IR should be looking at college-wide and external outcomes, such as how AA graduates do after transfer or job placement of AS graduates, that are hard to get at from your own data sets.

Well, there are two reasons for the reliance on your local Delphic Oracle:

1) The byzantine systems being used within the administrative part of the college make the use of modern data tools impossible.

2) Many of the faculty are innumerate and couldn't even misuse a statistical analysis tool.

Faculty rely on the plural of anecdote because they can't access actual data.

As for your ultimate question, you might need what used to be called a "teach in", and bring with it an attitude of open inquiry. You could start by blaming the administration for sharing in the belief that that particular program was doing well by failing to study the issue carefully and sharing the data widely. "We all need to examine our assumptions."

As a scientist, I know the importance of publishing honest results and correcting errors that show up in print. I also knew scientists who buried their mistakes in the literature. I have also seen an entire campus believe a lie that probably began as an honest misuse of non-comparable statistics by a President who had no quantitative academic background. I have also seen program admins make very large, public changes to advance an agenda, then make an even larger change quietly, without saying the previous idea failed to improve the situation or made things worse.

In contrast, I reserve the greatest respect for a high-level manager who was perfectly honest about the failure of an innovative idea and the need to find a different approach.
With all due respect to reassignedtime's comment (and all due adoration for that blog), how do you know that the respect hasn't been shown, respect for that experience hasn't been communicated, and softer methods of persuasion haven't been comprehensively ignored?

We don't have to look any further than the present Debtpocalypse to see people who are being repeatedly persuaded, as gently as possible, that the current crisis really is a Big Deal and needs to be resolved, and to see those people say "nope, it's better for our country to default than to agree to a deal where I don't get my way". (I personally think there are a lot more of those people on one side of the aisle than there are on the other; however, which side of the aisle you think I'm talking about is completely secondary to the point.)

I don't think there has been any time in history, in fact, where there are so many people who are so sold out to their own version of "folk wisdom" that they COMPLETELY IGNORE facts on the ground, and do so ACTIVELY. In this respect I am totally sympathetic to Dean Dad's point...

...and I honestly wish I could make MY ADMINISTRATION read Moneyball.
"The point was that all wisdom needs to be tested empirically, and that what works can change over time."

From what I've seen lately, I doubt that any of our institutions of higher ed believe this statement. Which surprises me, because I would have expected them to be the source of it.
Edmund Dantes: they are. Who do you think developed all of those statistical tools? It's just that there's a bad culture right now due to a bunch of fairly obvious factors.

DD: I don't know. This seems like traditional politics kind of stuff -- lots and lots of individual meetings, sending out friends to calm down other friends, that sort of thing. The big thing would be not to get bitter, I think. The reasons for their error are obvious and sympathetic. You just have to outlast them in the nice department until folks start seeing you as the reasonable one, at which point the stats will start to sink in.
Anonymous Coward - I was responding to the language of DD's post, which did seem dismissive of those on the ground as deluded by their experience. If people are on the defensive, as I actually was in my comment, I assumed it was because the way the issue was presented in the post is how it's being presented to the players (faculty/staff) who implement this program. That may be a faulty assumption on my part, I acknowledge.

I think in many ways CC Physicist's comment is much more productive than mine, though in truth I entirely agree with him. This is the difference between how a humanities person and how a science person articulates the issues, I think. Just because the data contradicts the "experts" doesn't mean that the "experts" lack expertise, or that they are "wrong," is, I suppose, my broader point.

I suppose one last thing, getting back to the question at the end of the post. From a faculty perspective, when it comes to these issues, I would say that I care a lot about *process*. If I believe in the process, then I will be invested in the end product, regardless of whether I'm in love with the ultimate outcome or not (substantially changing a program, restructuring on a broad scale, whatever). In contrast, in my experience, administrators tend to be focused more on the end product than on how we get there (obviously there are exceptions to this, but I'm just talking generally here). I think the result, if both sides don't try to meet in the middle - in other words, if there isn't cultural change on *both sides* - is faculty belief that administrators are acting in bad faith (which isn't necessarily true) and administrators believing that faculty are delusional, entrenched, and also acting in bad faith (which isn't necessarily true). The issue isn't that administrators need to gently persuade faculty/staff to agree with them - it is, as CC Physicist noted - that an open dialogue between the two needs to occur and that both sides need to acknowledge their blind spots.
One of the reasons people will resist information that “attacks” their pet program is that it would leave them with nothing to do. The thing they were proud of, the thing they got release time for would be gone. For many people even outside the academy, the loss of those things creates a bias against any information that would cause change and pain.

You have several options. You can require that someone outside the school weigh in on the value of the program in some financial way – give them 2 or 3 years to find grant support. If they’re as good as they think they are, some agency somewhere should be able to kick in something. This lets you off the hook and makes the program accountable to an outside agency for outcomes. You could also give the program a chance to improve on a fixed timeline – this has the advantage of allowing the program to clean itself up in a reasonable amount of time and accomplish the goal you want to accomplish.

Whatever you do, the same level of rigor should be applied to all programs. If two are failing on some quantitative measure, they both need to be held accountable the same way. There needs to be a program of improvement or phase-out that applies equally to both.

Last but not least, you might get mileage out of offering faculty that run/participate in the programs an out – something that could soften the blow of losing release time or prestige as their project is winding down.
*Reading* Moneyball is not enough, as your comments indicate--too many people missed the point. Reading Moneyball alongside a discussion of what the point is, finding ways to illustrate that point in other settings...or, I guess, *teaching* Moneyball...that's what's needed.

Even having an institutional research office can be less than helpful. As you've noted, much of their time is tied up in doing compliance work rather than real programmatic effectiveness work. And segregating that work in an IR office almost guarantees that much of what they do will be dismissed as uninformed. Or simply data manipulation without insight.

I'm on the Faculty Focus email list (and if any readers of this blog aren't on it, you should all get on it). Today's email dealt with 4 things any evaluation/assessment system should deal with, which are:

1. How do you define a successful student?

2. What's the evidence that students meet your definition of success?

3. Are you satisfied with your results? Why or why not?

4. If you're not satisfied, what could be the cause and what are you going to do about it?

I think that's a good set of questions for doing any evaluation or assessment (although I'd change the fourth one a little--not "what could be the cause?", but "what does the evidence suggest the cause is?"). And I'd agree with everyone who said that qualitative evidence is also evidence.

Statistical analysis cannot be done in a vacuum. It starts with a question, and stating the question properly is half the battle. We will find, at least some of the time, that the question *calls for* a qualitative answer, rather than a quantitative answer, which means that most (parametric) statistical processes won't help all that much. But when the question is susceptible to being answered on the basis of numerical data, beginning with framing the question helps define both the data needed, and the type of analysis that is appropriate to the question.

But, in the end, the most successful evaluation or assessment systems will be ones that have initially spent time determining what questions to ask, not what data to gather, or how to analyze those data. The question comes first.
Just a tiny thought -- a good analogy for you to use in discussions might be medical doctors. Most of them are indeed more-than-competent and caring, and try and learn from their own experience, but they still pay attention to research and change their treatments based on accumulated data.
Lead by example. Work with your faculty to develop statistically meaningful measures of some aspect of your own office's performance, run the test, and then take your bows or your licks, as appropriate.
Great point, doc, because item 1 gets skipped much too often or gets defined without any discussion or agreement across the campus.

For example, a campus presentation presented results that assumed item 1 (define the successful student) was "probability of being given an A, B, or C grade in a particular class". However, the faculty voiced the opinion that the correct measure would be an A, B, or C in the NEXT (required for graduation) class, not in the revised prep class. They didn't have any data for that, because the IR people didn't collect it.
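
The gap between those two definitions is easy to see in a small sketch. Everything here is hypothetical: the `Record` type, the function names, and the six made-up students are for illustration only and are not real campus data. One further assumption is that students who never took the next class count as "not successful" under the second definition; a campus could reasonably choose a different denominator.

```python
from dataclasses import dataclass
from typing import List, Optional

PASSING = {"A", "B", "C"}


@dataclass
class Record:
    prep_grade: str            # grade in the revised prep class
    next_grade: Optional[str]  # grade in the NEXT class; None if never taken


def prep_success_rate(records: List[Record]) -> float:
    """Definition 1: share earning an A, B, or C in the prep class itself."""
    return sum(r.prep_grade in PASSING for r in records) / len(records)


def next_course_success_rate(records: List[Record]) -> float:
    """Definition 2: share of all prep students earning an A, B, or C in the next class."""
    return sum(r.next_grade in PASSING for r in records) / len(records)


# Six made-up students, purely illustrative.
records = [
    Record("A", "B"),
    Record("B", "D"),
    Record("C", None),
    Record("B", "C"),
    Record("D", None),
    Record("C", "F"),
]

print(f"Success in prep class: {prep_success_rate(records):.0%}")
print(f"Success in next class: {next_course_success_rate(records):.0%}")
```

The same students look very different depending on which definition is agreed on, which is why the second number cannot be reported at all if nobody collected the next-class grades.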
If you've got a program that helps individual students (according to the folks on the ground) but overall hurts graduation rates, your task is not to convince faculty that they are doing it wrong, but to figure out what they are doing right and why it is not applying to everyone.

In biology, the exceptions are very often as interesting as the rule, if you can figure out why they are exceptions.
If I had a new drug that was curing some people of late stage pancreatic cancer (like actually achieving cures- this is unprecedented for this type of cancer), but was overall decreasing survival, I'd get to work on drug metabolism enzyme polymorphisms or analyzing biomarkers for the actual cancers, stat. But then, that might be because I've done academic research; I suppose if I were a drug company who was worried most about profit, I might just go looking for another drug.

I guess the generalizable principles I'm getting at are: First, you can't take away someone's project without helping them develop a better one. Second, you can't expect academics to settle for 'it's not cost-effective' as a reason not to do something.
There certainly have been trends in curriculum RFPs for proposals to include quantitative analyses of success measures. If the relevant depts are applying for external funds, then perhaps some administrative support to develop in-house measures could be useful?

Alas, perhaps this would be more useful at Univ of DD than at CC of DD.
Before you get buy-in, you'll probably have to explain how you are defining and measuring "success" — and you'd better make sure your data holds up to scrutiny.

I've had data-driven administrators cite study after study, without (apparently) realizing that said studies weren't statistically valid, or couldn't be extrapolated to our circumstances. If you're going to use numbers, then expect your statistically-literate faculty to look at them.
Actually, that's a good point -- if you're looking for buy-in, one good tactic might be to invite the most statistically literate of your faculty to a room, feed them donuts, and have them take a crack at your data.