Data-Backed Decisions

The traditional scientific method taught to elementary school students is to come up with a hypothesis, test that hypothesis by gathering data, and then compare the results against the hypothesis. Sometimes the data affirms your hypothesis. If not, you adjust your hypothesis and test again.

When one receives results that seem to affirm their preconceived notions about a topic, they're even less likely to scrutinize the source of that data. Now put the entire process inside a black box and place your trust in another opaque entity to run it properly. People might begin to believe the outputs of “science” and think, “The computer said x, so x must be true,” so you must understand my concern when Wired reported on a new eye-scanning lie detector made by Converus:

I settle in for a demonstration: a swift 15-minute demo where the test will guess a number I’m thinking of. An infrared camera observes my eye, capturing images 60 times a second while I answer questions on a Microsoft Surface tablet. That data is fed to Converus’ servers, where an algorithm, tuned and altered using machine learning, calculates whether or not I’m being truthful.

“You’ve got a little boy. He shows you his butterfly collection plus the killing jar. What do you do?”

Well that’s pretty neat actually. Converus also claims it has an 86% accuracy rate, which admittedly is pretty impressive. The article continues, however:

Converus derives its 86 percent accuracy rate from a number of lab and field studies. But an upcoming academic book chapter written by the company’s chief scientist and cocreator of EyeDetect, John Kircher, shows that from study to study the accuracy rates can vary quite a bit, even dipping as low as 50 percent for guilty subjects in one experiment.

So… about as good as flipping a coin. That just ruins any interest I had in the tech. Not only that, it actively makes me feel worse, because EyeDetect is already deployed in several government agencies and being piloted in several more. Further, I’m not convinced those agencies will remove their machines now that the article is out, in part because this isn’t a new scenario. Polygraphs are notoriously bad at the single task they were designed for, and yet, for whatever reason, they continue to be used, with very real consequences:

No other country carries out anywhere near the estimated 2.5 million polygraph tests conducted in the US every year, a system that fuels a thriving $2 billion industry. A survey by the Bureau of Justice Statistics from 2007 found that around three-quarters of urban sheriff and police departments use polygraphs when hiring. Each test can cost $700 or more. Apply to become a police officer, trooper, firefighter, or paramedic today, and there is a good chance you will find yourself connected to a machine little changed since the 1950s, subject to the judgment of an examiner with just a few weeks’ pseudoscientific training.

Common sense would dictate replacing all these machines with literally anything that isn’t worse than flipping a coin. However, the disconnect between the subject and the end result, with a high-tech machine-learning black box in the middle, creates an odd, misguided trust in the process. It’s not even a conscious thought like “Many people with PhDs, far smarter than me, created this neural net with math that I don’t understand, so why shouldn’t I believe it?” Usually, it just comes down to “The machine said so.”

Machine learning is being used all around the world to improve efficiency in business processes, but sometimes it’s used for very large societal decisions. Again, my main issue is that these sorts of technologies result in very real consequences. It’s already common practice for police departments to use data analytics platforms to decide where to deploy their officers for the day:

Take the crime prediction software police departments use to deploy officers and equipment. It relies, in part, on past interactions with law enforcement. But people of color are picked up for “nuisance crimes” at disproportionate rates.

The data, in other words, are biased. And if the software uses them to recommend a heavier police presence in black and Latino neighborhoods, that can lead to more arrests for the sort of low-level crimes that go unpunished in other places. Those arrests are then fed into the algorithm, and the cycle continues.
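To see how that cycle plays out, here’s a toy simulation. It’s purely illustrative and assumes nothing about how any real vendor’s software works: two neighborhoods with the exact same underlying crime rate, a simple “hot spot” rule that sends most patrols wherever the arrest record is largest, and a starting record that is slightly biased against neighborhood A.

```python
# Toy model of the arrest -> deployment -> arrest feedback loop described
# above. Both neighborhoods have the SAME true crime rate, but neighborhood
# A starts with a larger recorded arrest history (e.g. from past
# "nuisance crime" enforcement). All numbers and the allocation rule are
# made up for illustration.

TRUE_CRIME_RATE = 0.05            # identical in both neighborhoods
TOTAL_PATROLS = 100               # officers to allocate each day
arrests = {"A": 60.0, "B": 40.0}  # biased historical record

for day in range(1, 31):
    # "Hot spot" style allocation: the neighborhood with the larger
    # arrest record gets the bulk of the patrols.
    hot = max(arrests, key=arrests.get)
    patrols = {hood: (0.8 if hood == hot else 0.2) * TOTAL_PATROLS
               for hood in arrests}

    for hood in arrests:
        # More patrols mean more of the (identical) low-level crime gets
        # observed and recorded, which feeds tomorrow's allocation.
        arrests[hood] += patrols[hood] * TRUE_CRIME_RATE

    if day % 10 == 0:
        share_a = arrests["A"] / sum(arrests.values())
        print(f"day {day:2d}: A's share of recorded arrests = {share_a:.1%}")
```

Both neighborhoods commit crime at identical rates, yet A’s share of recorded arrests climbs steadily, and the record increasingly “confirms” that A is the problem area. The allocation rule never gets a chance to learn otherwise.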

And it’s these implicit biases that aren’t initially accounted before deploying for real-world usage that concern me the most. Maybe it’s a general frustration with the tech industry in general not taking their role seriously. Compressed product roadmap timelines and fierce competition only exacerbate the problem. These blind spots manifests itself in “data” in so many ways and sometimes in ways that might feel flippant, but are pretty obvious in hindsight. Some examples:

Well, “technology marches forward and some people might get hurt by it” is starting to become a theme of this blog, isn’t it? I should just rename my entire blog to that.