Last week I did a piece about the danger of predictive analytics inadvertently reinforcing negative stereotypes, and doing harm by dressing up stereotype threat in the trappings of science. To my pleasant surprise, IHE commissioned responses from a host of folks at various data companies.
Thank you to everyone who answered. The answers are varied and fascinating.
The gist of my piece, which was occasioned by listening to Claude Steele talk about his research on stereotype threat, was a concern that the students likeliest to get the “you’re unlikely to succeed” message from the algorithm are the same students who are most likely to be harmed by hearing it. (Steele’s research found that simply knowing that your own group is negatively identified in a given context will reduce your performance in that context; it’s a sort of cognitive tax.) I asked whether there could be an ethical duty to withhold data, if sharing that data could do active harm.
The respondents varied in how well they understood the question. Several took the position that data is inherently neutral, and that what matters is what you do with it. But that misses the point. Data exists in a social and political context. If you’ve grown up hearing that people who look like you aren’t good at STEM, and the analytics at your college tell you that you’re unlikely to succeed at STEM, you may connect those dots in ways that wind up sabotaging your own efforts. If the data neutrally reflect a skewed system, and trigger psychological responses that do real harm, then in what sense are they neutral?
A few others took the position that yes, unfiltered data could be harmful, and that it should therefore be shared only with professional advisors, who could craft it into a more useful and positive message. There’s a meaningful difference between “students like you tend to fail this class” and “students like you have done better in this class when they showed up at the tutoring center at least twice a week.” The former message is disempowering; the latter is empowering, because it gives the student information she can actually use.
That’s better, certainly, but it implies a world other than the one in which I work. My community college, like most others, lacks the money to hire armies of advisors to reach out sensitively to every student. In the context of a small, well-funded college, I’d consider the response excellent; in my world, it’s frustrating. “If you had far more money, you could…” may be true, but it doesn’t help.
The best responses mentioned transparency about the factors that go into the algorithm, so students know that, say, their race isn’t being considered. That strikes me as an ethical imperative. But it still leaves in place other issues that can wind up reinforcing the very achievement gaps we’re trying to overcome.
All of that said, though, I’d hate to rule out a potentially useful tool just because I haven’t quite figured out how best to use it.
The feedback helped me refine the question, which is a sign of good feedback. I should have prefaced the question with “given limited resources…” So here goes.
Given limited resources with which to hire people who can sensitively frame the message, and given the reality of stereotype threat, could there be an ethical obligation to withhold certain information? Put differently, which is more important: transparency, or “first, do no harm”?