Web log archive.
A Dispatch


March 25, 2007

My Private Battle with Blog Spammers

As regular visitors here know, I don't have comments open on this blog. The primary reason is that blogs are readily compromised by blog spammers who litter blogs with links to their sites selling (or linking further to sites that sell) medz, ringtones, insurance, sex,...well, you know the stuff.

There are ways to keep this junk out, but they can entail requiring visitor registration or constant moderation of comments. I don't like the burden of the first solution on visitors, and I don't like the burden of the second solution on me. I could also install a CAPTCHA system, but I've personally grown weary of deciphering the squiggly letters and numbers at sites that use them.

The reason this issue even arises is that the Contact page of this site uses a well-known form-to-email service provided by the site's hosting company. Any message you submit in that form arrives at one of my email addresses dedicated to that purpose. The system was very easy to set up, and works well.

The problem, however, is that robotic web crawlers can find the submission URL in the form (it identifies the well-known server program right in the URL). Blog spammers can locate the submission URL and then start sending their messages to the URL without even visiting the Contact page. The spammers don't know that the comments they submit to this site don't get automatically posted because they don't monitor the results of their spamming. All that happens is that my inbox fills up with alleged comments/questions that turn out to be nothing but spam.
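
To show how little effort that discovery takes, here is a minimal sketch of the kind of scan a crawler could run over a page's source. The markup, the example.com host, the mailer.cgi path, and the findFormActions function are all hypothetical; they stand in for whatever a real contact form and a real crawler would use.

```javascript
// Hypothetical contact-page markup. The host and handler path are made up
// for illustration; they are not this site's actual values.
const contactPageHtml = `
<form method="post" action="http://www.example.com/cgi-bin/mailer.cgi">
  <input type="text" name="email">
  <textarea name="message"></textarea>
</form>`;

// Collect the action attribute of every <form> tag found in the source.
// A crawler needs nothing more than this to learn where submissions go.
function findFormActions(html) {
  const pattern = /<form\b[^>]*\baction\s*=\s*["']([^"']+)["']/gi;
  const actions = [];
  let match;
  while ((match = pattern.exec(html)) !== null) {
    actions.push(match[1]);
  }
  return actions;
}

console.log(findFormActions(contactPageHtml));
```

Once the URL is in hand, the Contact page itself never needs to be loaded again, which matches what I see in practice.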

Blog spammers try to make it sound as though the comment was submitted by a visitor to the site. Most messages stroke the egos of the blogger and other commenters, starting the message with the likes of the following (taken from actual blog spam, with original spelling):

  • This site is very nise and helpfull! Visit my sites, please:
  • Yo! Cool stuff! Thanks for being here. Please visit my site too:
  • Amazing artwork! This is spectacularly done! Would you please also visit my site?
  • Very well! Your site is neat! Please visit my site too:
  • You have a great page! Please visit my homepage:
  • I liked this site, it's neat. Good job! Please visit my site too:
  • I really enjoyed this page. I will be linking and I will be trying to read and research all that there is to offer from this site! Would you please also visit my homepage?
  • One of the best locations I've come across lately!!! Definately a permanent bookmark! Please also visit my site:
  • Hi people! Great job! Would you please also visit my site?
  • First time here on your site. I am delighted to find your wonderful website online. Please visit my homepage:
  • Nice webpage, lovely, cool design.
  • Nice page greetings to all in this guestbook! Please visit my site too:
  • Fascinating site and well worth the visit. I will be back
  • Excellent site, added to favorites!

The above list came from just two days of blog spam attempts. I know many of these "senders" never visited the site because I have blocked access to spamwars.com from their IP addresses, which makes it kinda hard to "enjoy the page" or make it a "permanent bookmark." I used to have an advisory on the page that blog spammers were wasting their time because there was no automatic posting of submissions. It turned out that my advisory was a waste of bytes and pixels.

Thankfully, I know that blog spammers are just as lazy as I am. In other words, if I remove all vestiges of the contact URL from the Contact page, they (or, rather, their crawling computers) won't know to look deeper to see if some kind of obfuscation is going on. That led me to implement a solution that relies on browser JavaScript to embed the URL into the page for actual visitors. It's not even very sophisticated JavaScript. A scripting newbie could figure it out without any trouble.
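
A minimal sketch of the general idea follows. It is not the script this site actually uses. It assumes the form's static markup carries no action attribute at all and that the URL is stored in pieces. The fragments, the example.com host, the contactForm id, and both function names are hypothetical.

```javascript
// Hypothetical URL fragments. No single string in the page source contains
// the complete submission URL, so a simple text scan has nothing to match.
const fragments = ["http://www.example", ".com/cgi", "-bin/mai", "ler.cgi"];

// Reassemble the submission URL from its pieces.
function buildSubmissionUrl(parts) {
  return parts.join("");
}

// Assign the assembled URL to anything with an `action` property,
// such as a form element in a browser.
function armContactForm(form, parts) {
  form.action = buildSubmissionUrl(parts);
  return form;
}

// In a browser, this might be called once the page has loaded:
//   armContactForm(document.getElementById("contactForm"), fragments);
// Here a plain object stands in for the form so the sketch runs anywhere.
const fakeForm = { action: "" };
armContactForm(fakeForm, fragments);
console.log(fakeForm.action);
```

The trade-off is the one this approach always carries: a crawler that actually executes scripts would still see the assembled URL, and a visitor with JavaScript turned off gets a form that goes nowhere useful.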

But I'm happy to report that my inbox has been completely clear of blog spam attempts for the past week. I'm not happy that JavaScript must be enabled for someone to submit a comment, but there are plenty of other ways for anyone to reach me (via the dannyg.com site).

While I'm ranting on the subject of blog spam, let me also rag on Blogger for being indirectly responsible for blog spam and doing seemingly little or nothing to fight it. Easily 60% of the blog spam aimed my way links to pages at blogspot.com. The blogspot.com pages often contain nothing more than even more links to sites that sell the spammed crap. The purpose of all of this is to help the spammers increase the likelihood of their URLs being picked up by search engine crawlers and rising in the rankings. Search engine optimizers will tell you that having lots of pages point to yours is a good way to bump up your rankings. The blog spammers' dearest wish is that someone searching for "ringtones" or "viagra" on Google or the like will find one of their links in the first page of results.

Blogger (owned by Google) is one of those gigantic sites that make it extremely difficult for a human to get in contact with another human at the company to do things like file complaints. After some research months ago, I reported some of these blogspot.com pages to Blogger as examples of abuse. No response, and no action on the offending pages. Unfortunately, their Terms of Service document doesn't explicitly indicate that blog spamming with links to Blogger pages is taboo. I even posted a public query to the Blogger help forum, asking for an official response to the issue. No response after two weeks.

Okay, I think I've gotten the blog spam issue out of my system. But it's a problem that obviously isn't going away around the blogosphere. To protect themselves, bloggers who open their sites to active comments have to battle this junk constantly. And that's not counting the sites that are no longer actively monitored by their owners but are littered with hundreds of blog spam comments. All of them junking up search engine rankings.

If only we could harness for good all of the intellectual energy that is expended in the name of gaming the Internet's systems.

Posted on March 25, 2007 at 02:46 PM