In the weeks and months since the Charlottesville terror attack, one of the after-effects has been rising pressure on online services to police hate speech and violent rhetoric on their sites. Folks more qualified than I have sifted (and continue to sift) through the moral quandaries of balancing a desire to stifle hate with a vision of an open and free internet, but I’d like to dig a little into the technical considerations of automated content moderation.
There’s a vision of machine learning as a kind of technological magic. Powerful computers and complicated algorithms sift through large swaths of data and learn to recommend a movie, or to identify your second cousin in a photograph. Even more tech-savvy users, who would never actually use the word magic, still consider this the work of a dispassionate, anonymous machine.
But there are a lot more human eyeballs involved than you might think. Most computers “learn” using a simple paradigm: they look at a large swath of data, yes, but a large swath of data where someone has also provided the answers. You don’t show a computer ten million pictures and ask it to tell you where the dog is in picture ten million and one. You show it ten million pictures and point at every dog. The algorithm then says, “Joe says this right here is a dog, and so is this, and so is this, and so is this. What mathematical set of features do they all have in common?”
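To make that concrete, here’s a toy sketch of the paradigm in Python. The “pictures” are made-up text snippets, the “features” are just words, and the labels are the human-provided answers: count which words show up under each label, then classify new examples by whichever class’s words appear more often.

```python
from collections import Counter

def train(examples):
    """Tally how often each word appears under each human-provided label."""
    counts = {"dog": Counter(), "not_dog": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts

def predict(counts, text):
    """Label new text by which class its words were seen with more often."""
    words = text.lower().split()
    score = lambda label: sum(counts[label][w] for w in words)
    return "dog" if score("dog") >= score("not_dog") else "not_dog"

# A toy labeled set: a human has already "pointed at every dog."
labeled = [
    ("fluffy tail fetch bark", "dog"),
    ("bark leash walk fetch", "dog"),
    ("whiskers purr litter box", "not_dog"),
    ("purr scratch post whiskers", "not_dog"),
]

model = train(labeled)
print(predict(model, "i heard a bark on our walk"))    # prints: dog
print(predict(model, "the purr by the scratch post"))  # prints: not_dog
```

The point of the sketch is not the (crude) math; it’s that every prediction traces directly back to a label some person supplied.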
To teach a computer to do a thing, a human needs to have already done that thing, many many times. Some use cases are fortunate enough to have pre-existing labels—methods for detecting positivity and negativity in language, for example, can learn from movie reviews, each with a helpful star rating or thumbs-up attached. Others rely on users—every thumbs up or down you give Netflix helps their systems learn that someone who liked A will probably also like B.
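As a sketch of that first, fortunate case, here’s how pre-existing star ratings might be converted into sentiment labels. The reviews and the cutoffs (4+ stars is positive, 2 or fewer is negative) are mine, purely for illustration:

```python
def label_from_stars(stars):
    """Map a review's star rating (1-5) to a sentiment label."""
    if stars >= 4:
        return "positive"
    if stars <= 2:
        return "negative"
    return None  # 3-star reviews are too ambiguous to use

# Made-up reviews standing in for a real review corpus.
reviews = [
    ("loved every minute", 5),
    ("a total waste of film", 1),
    ("it was fine i guess", 3),
]

labeled = [(text, label_from_stars(stars))
           for text, stars in reviews
           if label_from_stars(stars) is not None]
print(labeled)
# [('loved every minute', 'positive'), ('a total waste of film', 'negative')]
```

Nobody had to sit down and label anything; reviewers did the pointing for free, years ago.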
Content moderation is not so simple. The end goal for these companies is the automatic detection of language that violates their terms of service. So, how do we teach this algorithm? Past data? We could leverage the content of known hate-group forums and news sites. But those forums are likely to contain a large amount of off-topic, benign content as well. User labeling? Social media platforms have buttons to report or flag inappropriate content, but these are vulnerable to concentrated abuse, or to the whims of a given user’s morality.
At some point, websites need eyeballs. And they have them, armies of content moderating “contractors,” performing the menial labor of the data economy: getting paid 4¢ a post to say “this is fine, this is fine, this is a violation, this is fine”—and if you think that job sounds like fun, you may not have it quite right.
There are (at least) three implications worth considering here, before we decide this is the way to go. (1) These people are in a hurry, (under)paid by the click, and clawing their way toward minimum wage. (2) These people are performing a variety of different tasks in rapid succession. One site might pay them to moderate 1,000 flagged posts; the next might be that guy above, asking them to tell his computer where the dog is in 1,000 pictures. (3) These people are, well, people. Even the most precise terms of service leave some wiggle room for human judgment, and most are instead intentionally vague, to allow for a lot of wiggle in either direction. Everyone’s got some bias in their tank, and the line between what is racist and what is not can vary drastically from person to person.
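That third point is measurable. A standard way to quantify how much two labelers actually agree, beyond what chance alone would produce, is Cohen’s kappa. Here’s a small pure-Python version, with two hypothetical moderators labeling the same ten posts:

```python
def cohens_kappa(a, b):
    """Cohen's kappa: agreement between two raters, corrected for chance.

    `a` and `b` are equal-length lists of labels, one per item.
    1.0 is perfect agreement; 0.0 is no better than chance.
    """
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    labels = set(a) | set(b)
    expected = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    return (observed - expected) / (1 - expected)

# Two hypothetical moderators, ten posts, two verdicts each post.
mod1 = ["ok", "ok", "bad", "ok", "bad", "ok", "ok", "bad", "ok", "ok"]
mod2 = ["ok", "bad", "bad", "ok", "ok", "ok", "ok", "bad", "bad", "ok"]
print(round(cohens_kappa(mod1, mod2), 2))  # prints: 0.35
```

They agree on seven posts out of ten, which sounds decent—until chance correction reveals a kappa of 0.35, meaning the “answers” a model learns from are substantially a function of who happened to be labeling that day.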
This is not a recipe for reasoned, consistent decisions. This is not “here is definitely a dog.” And it’s remarkable how easily even the most sophisticated technology can learn subtle bias, if the examples it learns from are biased in the first place. All it takes is a few people who keep pointing at the cat and insisting that it’s a dog. “Garbage in, garbage out,” is as true in machine learning as it is in cooking.
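Here is a deliberately tiny demonstration of that failure mode: a nearest-centroid classifier over a made-up one-dimensional “score” per post, trained once on clean labels and once after two labelers kept pointing at the cat.

```python
def centroid_classifier(examples):
    """Learn one 'average' score per label; classify by nearest average.

    `examples` is a list of (score, label) pairs -- say, some crude
    numeric feature of each post, plus a human moderator's verdict.
    """
    sums, counts = {}, {}
    for value, label in examples:
        sums[label] = sums.get(label, 0) + value
        counts[label] = counts.get(label, 0) + 1
    centroids = {label: sums[label] / counts[label] for label in sums}
    return lambda v: min(centroids, key=lambda label: abs(v - centroids[label]))

clean = [(1, "fine"), (2, "fine"), (3, "fine"),
         (8, "violation"), (9, "violation"), (10, "violation")]

# Same posts, but two clearly-fine ones were mislabeled as violations.
noisy = [(1, "violation"), (2, "violation"), (3, "fine"),
         (8, "violation"), (9, "violation"), (10, "violation")]

good, bad = centroid_classifier(clean), centroid_classifier(noisy)
print(good(5))  # prints: fine
print(bad(5))   # prints: violation
```

Two bad labels out of six were enough to drag the “violation” centroid down and flip the verdict on a borderline post. At scale the arithmetic is subtler, but the mechanism is the same.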
Does this mean that humans can’t do the job of teaching machines to detect hate? No. But it’s essential that the companies relying on these methods think long and hard about how to make their labeling process as effective as possible. Maybe pay the workers a little more to encourage a slower, more thoughtful response. Maybe recruit a demographically diverse group of workers to stymie bias in any direction. Most importantly, just think. Think as hard about the process you use to train your model as you’d think before handing any other critical decision over to an algorithm.
In addition to all the links above, NPR’s Note to Self has a fascinating interview with a contract content moderator on the ins and outs of the job.