Sunday, October 04, 2015

Assessment and the Value of Big, Dumb Questions


How do you know if a curriculum is working?

In the absence of some sort of assessment, the answer too often was “because the people teaching it say so.”  One would think that the conflict of interest there would be obvious; for anyone outside the given department, it usually was.  But the existence of motive and opportunity does not, in itself, prove a crime, so curriculum committees fell back on a sort of mutual non-aggression pact by default.  You don’t attack my program, I don’t attack yours, and we’ll trust that it will all come out in the wash.  As long as nobody else comes sniffing around, that sort of mutual convenience -- usually couched in huffy rhetoric about “professionalism” -- can protect sinecures for a while.

I bring that up because it’s impossible to understand the assessment movement without understanding what it was responding to.  

At its best -- and I’m not arguing for one minute that it’s always at its best -- it serves as a reality check.  If a nursing department claims that it’s the best in the country, yet the majority of its graduates fail the NCLEX, well, that raises a credibility issue.  If the students who transfer from a particular community college consistently and significantly underperform other transfers and native students at a four-year school, I’d raise some questions about that community college.  

Although faculty in many liberal arts programs think that assessment is new, it isn’t.  It has been the coin of the realm in fields with certifications for decades.  In the world of community colleges, for example, nursing programs are typically leaders in assessment, simply because they’ve done it from the outset.  Fields with external accreditations -- allied health, IT, engineering, teacher education, even culinary -- have done outcomes assessment for a long time.  It came later to the liberal arts, where many people responded with shock to the brazen newness of what was actually a longstanding practice.

Tim Burke’s piece on assessment and the curse of incremental improvement is well worth reading, because it acknowledges both the flaws in popular assessment protocols and the need for some sort of reality check.  I’m of similar mind, and would draw a distinction between assessment as it’s often done or used, and assessment as it could be done or used.  Context matters -- a bachelor’s-granting college with mostly traditional-age students has a far easier time tracking students than an associate’s-granting college with a majority of part-time students.  But while the implementation mechanisms will differ, the basic idea is the same.  Students deserve efforts at improvement.

My issue with much of the outcomes assessment that’s actually practiced is that it falls prey to false precision.  I’ve seen too many reporting forms with subcategories that have subcategories.  When every subunit of a curriculum has to respond to the same global questions that entire curricula do, a certain measurement error has been baked into the cake.  Thoughtful assessment requires time and labor, both of which are at a premium when budgets are tight.  And when measures rely on students’ willingness to do tasks that don’t “count,” such as taking pre-tests and post-tests, I don’t blame anyone for being skeptical.

Instead, I’m a fan of the “few, big, dumb questions” approach.  At the end of a program, can students do what they’re supposed to be able to do?  How do you know?  Where they’re falling short, what are you planning to do about it?  Notice that the unit of analysis is the program.  For assessment to work, it can’t be another way of doing personnel evaluations.  And it can’t rely on faculty self-reporting.  The temptation to game the system is too powerful; over time, those who cheat would be rewarded and those who tell the truth would be punished.  That’s a recipe for entropy.  Instead, rely on third-party assessment.  The recent multi-state collaborative project on assessment is a good example of how to do this well: it uses third-party readers to look at graded student work taken from final-semester courses.  Even better, it uses publicly available criteria, developed by faculty across the country.  In other words, it keeps the key faculty role and respect for subject matter expertise, but it gets around the conflict of interest.  

(For that matter, I’m a fan of third-party grading on campus, too.  If Professors Smith and Jones are teaching sections of the same course, and they swap papers for grading purposes and let students know that’s what they’re doing, they can immediately recast the student-professor relationship.  Suddenly, I’m not both helper and judge; I’m the helper, and that so-and-so over there is the judge.  It’s you and me against him.  Someday, I hope to try this at scale…)

When I’ve had conflicts with the folks who do assessment, it has largely been around the specificity of goals.  Here, too, Burke and I are on common ground.  In liberal arts fields in particular, assuming that the whole equals the sum of its parts can be a mistake.  The serious study of, say, history, is partly about learning techniques and facts, but partly about developing a way of thinking.  That latter goal takes time to manifest.  (There’s a famous line that the gift of historical study is a sense for the ways things don’t happen.  This is where many techno-utopians come to grief.)  The “tolerance for ambiguity” that many employers find lacking in new grads is exactly the sort of thing that the study of history, or sociology, or political science can foster.  But almost by definition, it’s hard to pin that down, especially early.  

Too-assiduous obedience to a grid can cut down the future to the size of the present.  If we only measure what we anticipated, we miss moments of discovery.  Excited minds can go in unanticipated directions; I’d argue that’s often a sign of spectacular success.  To the extent that assessment grids become like Procrustes’ bed, cutting off guests’ legs to make them fit, they should be consigned to the flames.  But they don’t have to be used that way.  

To the extent that local assessment offices have fallen into these traps, I can understand suspicion and resentment.  But I can’t understand the position that some people are so special, so far above the rest of us, that they’re simply immune to scrutiny.  Nobody is special.  Nobody is immune.  The task shouldn’t be to try to turn back the clock to 1970; the task should be to adapt the tools of assessment to serve its best purpose.  The more time we waste on the former, the longer we wait for the latter.