Thursday, December 07, 2006

 

Stats

Although I was never a hardcore stat-head, I had to endure the requisite statistical methods courses in both college and grad school as part of my social science training. (The ugliest course requirements repeat themselves, the first time as tragedy, the second time as farce.) Although they were billed as the 'empirical' part of the discipline, I regarded them as the purest mysticism, and, with the benefit of hindsight, still do.

The graduate course was particularly absurd. It was a disciplinary requirement, so we hated it. The professor was working out some psychological issues before us in real time, which made for some amusing moments but little real education. My defense mechanism of choice was ironic distance; my final paper, an examination of the relationship between two variables (I don't even remember which ones) controlling for sex, found nothing, so I titled it “But Sex Always Affects a Relationship!” Not much of a paper, but a damn fine title, if I do say so myself.

The true highlight of the course, though, occurred when one student – either batshit crazy or a comic genius, history will decide – strolled in about ten minutes late one day brandishing a box of Crunch Berry cereal, offering the nutrition information on the side as a real-world example of statistics. (He was also the one who brought his guitar to a feminist theory seminar to favor the group with – I'm not kidding, and neither was he – “I'm a lesbian, too.” Andy Kaufman had nothing on this guy.)

I think of him sometimes when we use stats on campus to try to make decisions.

Anybody with even the faintest whiff of mathematical training, or intuition, or common sense, can spot flaws in most of the statistics used to make decisions. That's not a shot at the folks generating the statistics – they/we know perfectly well that most of the information is partial, somewhat corrupted by flawed collection methods, and extremely hard to isolate from other variables. (“We changed the course requirements last year, and this year the enrollments went up 5%. Clearly, the new curriculum is responsible.” Um, not really...) The problem, other than the fatal combination of small samples and sheer complexity, is that most of what we want to know derives from problems we didn't anticipate, so we didn't think to collect the data at the time that would address the question we hadn't thought of yet.
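
(To make the small-sample point concrete, here's a quick toy simulation in Python. Every number in it is invented – nothing comes from our actual enrollment data – but it shows how often a program whose underlying appeal never changes will still post a year-to-year swing of 5% or more, purely by chance.)

    # Invented numbers only: each year, 240 hypothetical prospects each
    # enroll with probability 0.5, so the "true" headcount never changes.
    import random

    random.seed(1)
    trials = 10_000
    big_swings = 0
    for _ in range(trials):
        year1 = sum(random.random() < 0.5 for _ in range(240))
        year2 = sum(random.random() < 0.5 for _ in range(240))
        if abs(year2 - year1) / year1 >= 0.05:
            big_swings += 1

    print(f"{100 * big_swings / trials:.0f}% of year pairs differ by 5% or more")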

Faculty tend to be pretty bright people, as a group, so any data-driven argument for a policy change they don't like immediately brings out the “your methods are flawed” crowd. Well, duh. Of course they're flawed – even a relative non-data-head like me can see that. They're flawed in any direction, but the flaws only seem to draw attention when the policies themselves are unpopular.

(Exception: sometimes, statistics can disprove certain things. That's real, but it's limited. It's hard to sustain the “my program is thriving” illusion when its enrollments are down twenty percent in two years. That said, the statistic doesn't tell you what to do about it.)

In the rare cases when it's possible to get good information, I'm a fan of evidence-based management. The catch is that really solid evidence is remarkably hard to come by, especially in the brief moments when decisions are actually possible. Scholars of higher ed can do national studies on foundation dimes and issue reports that say things like “student engagement in campus activities leads to higher graduation rates,” but even they can't really disentangle causation; do campus activities spark the less-driven students to step up, or do type A personalities naturally gravitate to organized activities?
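
(If it helps to see that second possibility in miniature, here's another toy simulation in Python, every number invented: give students an unobserved 'drive' trait that raises both their odds of joining activities and their odds of graduating, and the familiar gap in graduation rates appears even though the activities themselves do nothing.)

    # Pure confounding sketch: activities have no causal effect here.
    import random

    random.seed(2)
    students = []
    for _ in range(10_000):
        drive = random.random()                      # unobserved type A-ness
        joins = random.random() < 0.2 + 0.6 * drive  # driven students join more...
        grads = random.random() < 0.3 + 0.5 * drive  # ...and graduate more
        students.append((joins, grads))

    def grad_rate(group):
        return sum(g for _, g in group) / len(group)

    joiners = [s for s in students if s[0]]
    others = [s for s in students if not s[0]]
    print(f"graduation rate, involved students:   {grad_rate(joiners):.2f}")
    print(f"graduation rate, uninvolved students: {grad_rate(others):.2f}")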

Most of the decisions, though, are much more mundane than that. I once asked one of my chairs to add a weekend section of a popular intro course, since we're trying to reach out to adult students. He didn't want to, so he suggested that we do a study to see if this would work before actually committing resources to it. The academic in me is so used to seeing this line of argument as reasonable that I almost didn't catch what he was doing. The only way to see if it would work is to try it. We could ask a random sample of residents, but what people say and what they do are only vaguely correlated. The only way to do the study is just to run the damn class and see what happens. (We did, and it worked.) Calling for more data is much more respectable than just saying “I don't wanna,” but it often carries the same meaning.

One of the real shocks of moving from the classroom into administration was growing comfortable with making decisions based on far, far less (and less clear) information. Numbers bounce around, depending on when and how they're gathered, and the possible number of intervening variables in determining why program X is down this year is infinite. I can concede all of that, but still need to make a decision. Standards of evidence that even Andy Kaufman Guy would have found laughable sometimes carry the day in administration, simply by default. The owl of Minerva spreads its wings at dusk, but I don't have that long. Semesters start when they start, and we need to make decisions on the fly to make that happen. If you wait for the statistical dust to settle, you'll miss the moment. In faculty ranks, that would be called 'reductionist,' and it is. It has to be. Part of being in administration is being okay with that, and developing the intuition to focus on the two or three facts that actually tell you something. The rest is mysticism.

Comments:
The best way to curtail these problems is PREVENTION.
SEE:
www.ClassroomManagementOnline.com

Best wishes,
Prof. Howard Seeman
 
OK, Prof Seeman, I'll bite: Exactly what are you proposing to PREVENT? Numbers, perhaps? Disagreements about priorities? Administration's necessary reliance on unsatisfactory information? If you're not planning to engage the actual discussion, your comment is SPAM. (Bet he doesn't respond.)

DD: We have the same problem in government. I spend lots of time looking at numbers, often ones I've generated myself. The object is usually to try to understand what's going on around here. We've probably got "better" statistics than you do--after all, our purpose is record-keeping--but the main data systems weren't designed with the questions I ask, and get asked, in mind. So folks (myself included) are obliged to estimate the relationship between the information at hand and the information we actually need to make decisions. Then the deadline shows up, and we do the best we can.
 
mmmm...correlation vs. causation.

(I never took stats, but my husband did a couple of years ago, and I followed along for most of the course. Came away totally convinced that everyone should take statistics.)

and I've wondered for a long time about the real meaning of that correlation between activities involvement and academic success. Most of the students I've met who are involved in student government were already driven (or whatever) when they got here.

(totally tangential: I'm leaving higher education for finance, as a web gal, next week. I've immensely enjoyed your blog all this time, and yours is one of the few higher ed blogs that I'm still going to subscribe to.)
 
We recently did a search for a dean position and one of the candidates doomed himself by answering every question with some variation of "I'd need more data before I could decide about that." If you're not willing to go out on a limb and make decisions, you will never survive as an administrator.
 
"Statistics"--statistical analysis--is not the same as "numbers" or "data." As I tell my students, statistical analysis is a means of answering questions. One should never even begin a statistical analysis until one knows what the question is. And almost always, the question ultimately has the form of "Is this number (measure) different enough from this other number (measure) that it matters?"

For example, "Does Factor X have a different enough effect on dropout rates for black males than for other students that we might want to design a different intervention based on Factor X for black males?"
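
For instance, a bare-bones version of that kind of comparison in Python – the dropout counts below are made up for illustration, and a plain two-proportion z-test is only one of several ways to pose the question:

    from math import sqrt, erf

    def two_proportion_z(drop_a, n_a, drop_b, n_b):
        # is the gap between the two dropout rates bigger than
        # sampling noise alone would explain?
        p_a, p_b = drop_a / n_a, drop_b / n_b
        pooled = (drop_a + drop_b) / (n_a + n_b)
        se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
        z = (p_a - p_b) / se
        p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # normal approx.
        return p_a, p_b, z, p_value

    # hypothetical counts: 28 of 80 students in group A, 45 of 200 in group B
    p_a, p_b, z, p = two_proportion_z(28, 80, 45, 200)
    print(f"rates {p_a:.2f} vs {p_b:.2f}, z = {z:.2f}, p = {p:.3f}")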

Statistical analysis can be fascinating, but it's just one way of answering specific types of questions. (One of the most interesting books I've read in the last 5 years is "The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century," by David Salsburg, and it certainly does not make, or try to make, statistical analysis into the only way to deal with questions.)

(And then you have the Bayesians, who are just batsh*t crazy.)
 
I love this blog and have learned a lot from it about how to function in a large organization. This post is ... let's say my least favorite so far. To rebut briefly, statistics *are* evidence. As with hearsay, public testimony, expert assessment, and all other kinds of evidence, their relevance and accuracy vary. Using statistical evidence is just like using testimony: the "statistical" nature of it is about as (almost completely ir-)relevant as the "testimonial" nature when you're assessing a particular instance. And saying you don't have time is either irrelevant or a copout. Time spent on statistics is spent as wisely or not as time spent assessing any other kind of evidence.
 
I actually agonized over whether to post this one, because I'm not sure I said it quite right. Academic standards for persuasive statistics are so utterly out-of-line with the statistics actually available in the trenches as to be useless. That's not to deny that some fairly basic data or statistics can be useful, but 'fairly basic' is the key phrase.

In my academic discipline, there's a weird blend of extreme mathematical rigor and extreme softness of data, which strikes me as the worst of both worlds. I put much more faith in harder data with less interpolation. If you have to torture the numbers to get them to talk, you probably shouldn't.

Doc is quite right that the key point is choosing the right tool for the right question. In this line of work, 'close enough' really is close enough, since there just isn't the time to wait for every last piece of evidence to come in and subject it all to umpteen levels of calculus. Will that Tuesday night Psych class fill? I have to make the call now. The guess will be educated, but that's the best it will be.
 
Beta blogger ate the first version, so this is the rewrite. The problem is not statistics; the problem is designing courses, majors, and deans with statistical analysis in mind so that you can get meaningful answers. For example, to determine the value of a course, you need to know what the capabilities of the students were before they took the course, how they did in the course (against more than the 10 students taking it that year), and how they did in the next course in the sequence (at a minimum). Faculty resist this sort of thing: it requires time, and on top of everything, ethics requires that you do the whole thing as a formative exercise.

Eli Rabett
 
Stats are a lot like Economics -- they're much better for telling you what's not working than for telling you what will work.
 
Another crucial but as yet unmentioned element here is the small system size. In the physical sciences, since events can happen a lot (say 10^23 molecules do something), the inherent fluctuations (approximately the square root of the number of things) are often small compared to the average. Not so in managing (or even sometimes in biology, I learned today at a seminar).
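
In rough numbers – a quick Python sketch, treating the counts as Poisson-like, which is itself an assumption:

    from math import sqrt

    # typical relative fluctuation is roughly sqrt(N)/N = 1/sqrt(N)
    for n in (1e23, 10_000, 100, 25):
        print(f"N = {n:g}: relative fluctuation ~ {1 / sqrt(n):.1e}")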

Did Bush really win Florida over Gore (or Ohio over Kerry)? Not within some of the statistical analyses that were done. But a decision had to be made anyway, like the result or not. Alas, we couldn't repeat the election 100 times to see what average result would occur.

With all of that said, do I find it frustrating when decisions are made based on interpreting statistical fluctuations as a long-term trend? Of course.
 
"In my academic discipline, there's a weird blend of extreme mathematical rigor and extreme softness of data"

Ah, you must be an economist! An economist I know is always annoyed at the lack of mathematical rigor in physics papers, even theoretical ones, compared to his own field. But he understood the point that we can repeat experiments until we get the data needed to test the model, whereas he has to repeat the statistical analysis until he is happy.

Both face challenges in administration, for the reasons you articulate: decisions need to be made now, with the data you have.
 