On the Internet, Nobody Knows You’re a Dog

By Ed McLaughlin and Wyn Lydecker

Yesterday, I filtered over forty spam comments from this blog. Many of them are transparent: written in Scandinavian languages or Cyrillic script, listing links to designer handbags, or patently nonsensical. Some are unsolicited invitations to take advantage of SEO services or to get hundreds of Twitter followers for the nominal price of my soul. The tricky ones, though, are the ones that could have been written by a non-native speaker of English. I got excited about a few and showed them to a friend, who had to break the bad news: I’d been taken in by a bot. The experience motivated me to investigate further. I hope my discoveries will help you stay human online.

It turns out that more than one third of all internet traffic is non-human. That statistic is pretty amazing if you think about it: 36% of all traffic on the World Wide Web is made up of mindless bots, chittering away in the web of data. Some bots, like Cleverbot and certain novelty Twitter accounts like Stealth Mountain, have character and niches into which they fit. They stay in those niches, and you must call on them to interact. Many bots, like Google’s spiders, have a legitimate and very useful function. Unfortunately, it’s not these programming experiments with which we share so much of the information highway. The Wall Street Journal ran an article about the shady sites that generate revenue by charging advertisers for fraudulent page views (“A ‘Crisis’ in Online Ads,” from March 23rd, 2014). These sites get advertisers to pay by the click, then have a bot “click” their pages thousands of times, making money without any human actually seeing the ads. In fact, all their actions are behind the scenes. At least I could be thankful I hadn’t been bilked by a bot, just tricked into a conversation.

How did that happen, exactly? I went on the hunt for an explanation of comment bots, as opposed to the bots that stay hidden behind the curtain. A comment bot has to have someone behind it who understands language, at least well enough for the bot to pass as human. Linguists have been analyzing comment bots, and Language Log, a linguistics blog based at the University of Pennsylvania, gives a good account of the language engine behind the most common culprits. The bots rely on just enough variation between words and turns of phrase to avoid getting caught by standard algorithms, and even by humans. Some even have a function that allows them to grab keywords or blog titles to make their contributions seem more topical and relevant. The comment text itself is not where comment bots get their payoff: these bots post under usernames that include the keywords they are trying to optimize. The blatant advertising bots are doing the same thing, but their content and usernames work in concert. (This practice is an SEO tactic considered “black hat,” or ethically unsavory.) The pretend-commenter bots are just a little more subtle about it.
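
For readers who like to see an idea in code, here is a minimal Python sketch of how a blogger might screen for the two signals described above: a username stuffed with keywords, and comment copy so generic that it echoes nothing from the post beyond its title. Everything in it is an assumption made for illustration: the keyword list, the function names, and the "longer than four letters" rule are hypothetical, and this is not how any real spam filter or the bots themselves are known to work.

```python
import re

# Hypothetical keyword list, assumed for illustration only.
SPAM_KEYWORDS = {"seo", "handbags", "followers", "cheap", "loans"}


def tokens(text):
    """Lowercase a string and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def looks_like_comment_bot(username, comment, post_title, post_body):
    """Return True only when both assumed warning signs are present."""
    # Signal 1: the username carries a keyword someone may be optimizing.
    keyword_in_name = any(word in SPAM_KEYWORDS for word in tokens(username))

    # Signal 2: the comment is generic. Ignoring short words, it shares
    # no word with the post body other than words taken from the title.
    title_words = set(tokens(post_title))
    body_words = {w for w in tokens(post_body) if len(w) > 4}
    comment_words = {w for w in tokens(comment) if len(w) > 4}
    specific_overlap = (comment_words & body_words) - title_words
    is_generic = not specific_overlap

    return keyword_in_name and is_generic


# Example inputs, invented for this sketch.
TITLE = "Spam Bots"
BODY = ("Yesterday I filtered forty comments written by programs "
        "pretending to be readers.")

suspect = looks_like_comment_bot(
    "Cheap SEO Services",
    "Great post about spam bots, I learned a lot!",
    TITLE,
    BODY,
)
```

Requiring both signs at once is a deliberate choice in this sketch: a real reader may have an odd username, and a real reader may leave a short, generic compliment, so either sign alone would flag too many humans.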

I had received what I thought was a general, but favorable and encouraging, response to my content. In fact, it was a clever program that knew the patterns of human speech well enough to fool even me. I think other readers would have been taken in too, had I published the comment. Rather than do that, however, I have chosen to forgo the extra material in favor of meaningful dialogue with real people, who bring real insights when they choose to share them. I hope this blog post helps you do likewise.

Ed McLaughlin is currently co-writing the book “The Purpose Is Profit: Secrets of a Successful Entrepreneur from Startup to Exit” with Wyn Lydecker and Paul McLaughlin.

 

Copyright © 2014 by Ed McLaughlin   All rights reserved.