James Mullenbach

Distill and 'fairness' in AI research

22 Mar 2017

A couple of discussions have popped up on Twitter recently about AI research that relate to the “democratization” of AI. Hal Daumé III tweeted in response to Andrew Ng’s statement on open source and open data in AI, noting with a wink that those are only two parts of the equation. Especially with the explosion of deep learning, physical computational resources are becoming increasingly important, meaning that open source code and data, welcome as they are, are not yet enough to make AI available to all.

This ties in with another tweet thread from Delip Rao on the new Distill “ecosystem” set in motion by Google Brain, apparently led by Chris Olah. My immediate impression of Distill is positive; a stronger focus on clear explanations could certainly benefit the field of ML. Devoting an entire journal to this goal is, to my very limited knowledge, novel. The Distill team published another, quite philosophical, post today further explaining their vision. It makes good points on the value of clear explanations, but may not relieve the qualms some have about the potential to “steal” citations by posting a better explanation than the original paper.

The concerns about resources, and the resource differential between academia and industry, come into play again here. Companies will probably always have more flexibility to buy as many GPUs as they want, and are more likely to hire people dedicated to explaining and visualizing research. Distill’s recent post also discusses an “infrastructure” for building the visualizations they hold in high regard. Perhaps in their ideal world, researchers just learn another tool akin to LaTeX and decide to publish in a web journal like Distill rather than create a traditional PDF. But the time and energy spent devising the right analogies remains a real cost. I don’t know who will solve these discrepancies, or even to what extent they need solving; resource discrepancies exist between universities in every field, after all. But the differences in AI seem large and growing, and could cripple the ways in which academic researchers can contribute. Not many labs could match 32 GPUs for three weeks.

I’m interested to see how this Distill venture unfolds. I think the many takes are influenced both by people’s learning styles and perhaps by their seniority within research; it may be easier to dismiss for someone who learned many of these concepts long ago. As a young person, a newcomer, and someone who learns visually, I welcome the fresh ideas from the group at Distill, and hope they can provide something of distinct value to the community. In my eyes it definitely has the potential to push forward the movement for open sharing of ideas that research is founded upon. At the very least, a move from PDFs to more dynamic web pages seems easy enough to accomplish and by itself would foster better visualizations.

As a side note, Twitter has been unexpectedly great for keeping up with news in this fast-moving field.