Machine Learning Shows Some Human Problems

Fascinating two-part interview on The Entrepreneurs Project website with Richard Sharp, CTO of predictive-marketing company Yieldify, about bias in machine learning.

Wait. Bias in machine learning? How is that possible? Machines have no race or gender, and come as an inherently blank slate. Where does bias come from?

According to Sharp, it comes from the data itself, which already has real-world biases embedded in it. If the real world contains discriminatory biases, like gaps in pay by gender or in wealth by race, the machine can pick up those biases and exploit them. Even the best programmer may have no idea this is happening, since the computer figured it out for itself from the available data.
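To make that concrete, here's a minimal sketch using hypothetical, synthetic data (not anything from the interview) and scikit-learn: if historical hiring decisions penalised one group, a model trained on those decisions learns the penalty on its own, with no one ever writing a biased rule.

```python
# Minimal sketch (hypothetical data): a model trained on biased historical
# outcomes reproduces that bias, even though no rule was written by hand.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

# Sensitive attribute (0/1) and a genuinely job-relevant skill score.
group = rng.integers(0, 2, size=n)
skill = rng.normal(0, 1, size=n)

# Historical hiring decisions: driven by skill, but also penalising group 1.
hired = (skill - 0.8 * group + rng.normal(0, 0.5, size=n)) > 0

model = LogisticRegression().fit(np.column_stack([skill, group]), hired)

# The learned coefficient on `group` comes out negative: the model has
# absorbed the historical penalty and will apply it to new candidates.
print(dict(zip(["skill", "group"], model.coef_[0].round(2))))
```

The point isn't the specific numbers; it's that the bias arrives through the training data, exactly as Sharp describes.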

Sharp cites a real-world example via a study from CMU that showed Google displaying significantly more advertising for jobs paying over $200,000 to men than to women. Another study, from the FTC, revealed that searches for black-identifying names yielded a higher incidence of ads associated with arrest than searches for white-identifying names.

“These issues were almost certainly down to machine learning systems exploiting biases in real-world data,” Sharp points out. “At no point did a sexist Google programmer ever deliberately write a program that encoded any particular rule.”

There are plenty of other examples too. Amazon recently rolled out same-day delivery to new city areas while conspicuously excluding predominantly black zip codes. And a machine learning system actively in use in US courtrooms significantly overestimated the likelihood of black defendants reoffending.

The big problem with all this, Sharp says, is that it creates systems that continue to reinforce the world as it is now, instead of aspiring to, and moving towards, the world as we want it to be. Following that path, machine learning systems can perpetuate, or even exacerbate, existing biases.

Smart stuff, worth reading, not too long.

Part One is here.

Part Two is here.