Tuesday, May 19, 2015

The Problem with Longitudinal Data

This may be the unsexiest title ever, but the subject matters.

This week we got the latest data on our six-year student success rate.  It’s supposed to tell us how we’re doing, and in a global sense, it does.  But it has a glaring flaw that reduces its usefulness in driving change, and renders it absurd for use in performance funding.

It’s at least six years old.  

In fact, it’s slightly older than that, due to the delay in gathering data.  Which means that we just got numbers for the cohort that entered in the Fall of 2007.

People who study retention data insist that the lion’s share of attrition happens in the first year.  That means that the hot-off-the-presses numbers we’re getting now are mostly reflective of what happened in the Fall of 2007 and the Spring of 2008.  That was before the Great Recession, the enrollment spike of 2009-10, its subsequent retreat, and the largest wave of state cuts in memory.  It reflects what was, demographically, a different era.  And it misses everything we’ve done in the last six years, since someone who dropped out in early 2008 missed the innovations introduced in 2010 or 2012.  

In other words, as a reflection of what we’re doing now, it really doesn’t help.

It’s possible to get much more recent data, of course, but it’s necessarily partial.  In any given year, indicators can point in seemingly contradictory directions; the underlying picture may not become clear until long after it has ceased to be useful.  The owl of Minerva spreads its wings at dusk, by which time it’s too late.

From a system perspective, longitudinal data has real value.  It can serve usefully as a reality check or a diagnostic, especially when the data is chosen to reflect a sound theory.  For example, I’m a fan of the surveys that show the percentage of state university grads who have some community college credits.  The percentages are so much higher than community college graduation rates that they strongly suggest that we’re asking the wrong questions.  They don’t shed much light on individual campuses, but they do imply that the ecosystem is more than the sum of its parts.  We’d be wise to keep that in mind when having discussions of, say, funding policy.

But drilling down from a long-term systemic view to a single campus and year-to-year variations in funding is problematic at best.  

On campus, it’s difficult to run “clean” experiments, since we can’t isolate interventions.  In any given year, we’re trying multiple things, and the external environment is changing in a host of ways.  Did a one-point gain last year reflect a policy shift, a demographic shift, better execution, or random chance?  It’s hard to know.
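
To put a rough number on the “random chance” part of that question, here is a minimal Python sketch.  The cohort size of 1,000 and the rates of 20 and 21 percent are hypothetical, picked only for illustration, and treating each student’s outcome as an independent coin flip is a simplifying assumption.  Under those assumptions, it estimates how many standard errors a one-point gain represents.

```python
import math


def rate_standard_error(rate, cohort_size):
    """Standard error of a success rate under a simple binomial model."""
    return math.sqrt(rate * (1.0 - rate) / cohort_size)


def change_z_score(rate_before, rate_after, cohort_size):
    """Number of standard errors separating two cohorts' rates.

    Assumes two independent cohorts of equal size (a hypothetical setup).
    """
    se_before = rate_standard_error(rate_before, cohort_size)
    se_after = rate_standard_error(rate_after, cohort_size)
    se_diff = math.sqrt(se_before ** 2 + se_after ** 2)
    return (rate_after - rate_before) / se_diff


# Hypothetical numbers, not taken from any real campus.
COHORT_SIZE = 1000
z = change_z_score(0.20, 0.21, COHORT_SIZE)
print(f"one-point gain = {z:.2f} standard errors")
```

With these made-up inputs, the gain comes out at well under one standard error, so chance alone could plausibly account for it.  A larger cohort would shrink the noise, but it still wouldn’t separate a policy shift from a demographic one.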

Has anyone out there found a really good, really early indicator that’s actually useful in improving institutional performance?  Right now, we have to choose between timely and good, and that’s a frustrating choice.