February 10, 2008
The Art(lessness) of the Subject Line
Once a spam message squeaks past server and client spam filtering, the message's Subject: line is the spammer's primary weapon in overcoming human defenses. There are plenty of obvious spam Subject: lines, but the craftier spammer uses every trick in the book—including illegal (in the U.S.) deception—to lure recipients to open the message.
Although I can spot 99.9999% of spam messages by my combined spot check of the From: and Subject: data in my inbox list, I know that it's not always possible for others to do the same, even if they are very spam aware. In their jobs, many email users receive valid business-related messages from customers or colleagues whose names they don't recognize in the From: field, and whose Subject: lines might look like total spam in other situations. Thus, it's vital to protect yourself from the spam message that gets opened through no fault of your own, and very often through the deception of the spammer.
I decided to get a view of the Subject: lines that came my way in a recent 24-hour period. This data is preserved in my mail server log, which is one of the data sources I use to assemble daily data for the Spam Stats page at this site. The log preserves the Subject: lines of all messages that get processed, including the ninety-some percent that get trashed immediately at the server.
If you're interested in the hard data, you can download the 100KB text file containing all the Subject: lines for Friday, February 8, 2008. I've sorted them alphabetically, which reveals when I get flooded with the same message from the botnets. It also indicates at a glance which opening words are used most often.
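If you'd like to slice your own Subject: lines the same way, here is a minimal Python sketch of a sort-and-count pass. It assumes the Subject: lines have already been pulled out of a log into a list of strings, one per message. The function names and the sample lines are hypothetical, made up for illustration rather than taken from the downloadable file.

```python
from collections import Counter


def sort_subjects(subjects):
    """Return the Subject: lines sorted alphabetically, ignoring case."""
    return sorted(subjects, key=str.casefold)


def repeated_subjects(subjects, min_count=2):
    """Return (subject, count) pairs for lines seen at least min_count times."""
    counts = Counter(s.strip() for s in subjects)
    return [(s, n) for s, n in counts.most_common() if n >= min_count]


def opening_words(subjects, top=3):
    """Count the first word of each line, lowercased, most common first."""
    firsts = Counter()
    for s in subjects:
        words = s.split()
        if words:
            firsts[words[0].lower()] += 1
    return firsts.most_common(top)


# Hypothetical sample data for illustration only.
sample = [
    "Save on meds today",
    "Get your replica watch",
    "save big on software",
    "Get your replica watch",
    "Re: your order",
]

print(sort_subjects(sample))
print(repeated_subjects(sample))
print(opening_words(sample, top=2))
```

Sorting puts identical lines next to each other, so a botnet flood shows up as a long run of the same text. The Counter does the same job without any eyeballing.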
There is a huge batch of Russian-language spam (using the koi8-r character set). This past week, the floodgates opened from Russia (without love) to my addresses. Too bad for them that their messages are so easily detected and trashed at the server.
Opening words "get," "save," and "we have" are the most common. Of the nearly 2500 messages, 53 claimed to be replies (Re:) to messages I never sent. Here are some other counts:
- 184 contained "medications," "meds," or "medz"
- 108 contained the string "hydro" (as in Hydrocodone)
- 82 contained "pain" (usually with "killers" or "relief")
- 46 contained "weight" (with variations of "lose")
- 32 contained "sex" (mostly for meds)
- 47 contained "debt"
- 1 contained "mortgage" (see this dispatch)
- 23 contained "hey" (because I wouldn't see it otherwise)
- 24 contained "penis"
- 6 contained "dick" (and none referring to someone named Richard)
- 101 contained "save"
- 31 contained "free"
- 29 contained "discount"
- 23 contained "replica"
- 23 (different ones) contained "Gucci"
- 7 claimed to be confirmations for things I didn't order
- 5 times I won money (uh huh)
- 10 were Subject: nipples why guys got them
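Counts like the ones above can be approximated with a case-insensitive search over the list of Subject: lines. The sketch below is one way to do it in Python, not necessarily how these numbers were produced. The function name `count_matching`, its `whole_word` option, and the sample lines are all hypothetical.

```python
import re


def count_matching(subjects, terms, whole_word=False):
    """Count lines that contain at least one of the terms, ignoring case.

    With whole_word=True a term only matches as a standalone word,
    so "hey" no longer matches inside "They".
    """
    pattern = "|".join(re.escape(t) for t in terms)
    if whole_word:
        pattern = r"\b(?:" + pattern + r")\b"
    regex = re.compile(pattern, re.IGNORECASE)
    return sum(1 for s in subjects if regex.search(s))


# Hypothetical sample data for illustration only.
sample = [
    "Cheap meds shipped overnight",
    "Hydrocodone without prescription",
    "Pain relief medications",
    "hey",
    "They said it could not be done",
    "Lose weight fast",
]

print(count_matching(sample, ["medications", "meds", "medz"]))
print(count_matching(sample, ["hydro"]))
print(count_matching(sample, ["hey"]))
print(count_matching(sample, ["hey"], whole_word=True))
```

Plain substring matching is what catches "hydro" inside "Hydrocodone", but it also overcounts short terms such as "hey" or "sex" that turn up inside longer, unrelated words. The whole-word option is there for those cases.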
Some Subject: lines really troubled me because of the harm they are likely to inflict on unsuspecting and trusting recipients. Perhaps the most troubling was this one:
Subject: Free Bible Software - Just Download and Go
I didn't see the three messages bearing this or similar Subject: lines, so I can't say for sure what the payload is. I'd wager that it can't be goodware that gets installed. Preying on people's faith to get them to screw themselves is despicable (a tactic many 419ers know well).
Did I really learn anything from this exercise? Hmm, probably not. I did find it interesting to slice and dice the data in this manner for the first time, but it's not something I'm likely to do often. The results certainly reveal that spammers don't seem to care that they are sending the same message over and over to the same recipients on the same day. With the incremental cost of one spam message sent through a world full of bots approaching zero, it's probably cheaper not to manage the process any better.
I do, however, enjoy the fact that spammers are wasting some of their resources, no matter how small. That all but about 50 of the messages listed in the downloadable file were deleted before they ever reached a server directory or file is a great source of glee. I would be even happier if not one of these messages addressed to other recipients triggered visits to spamvertised web sites. Response has to drop to zero for an extended period before the spammers will start to look for other ways to steal.
Posted on February 10, 2008 at 06:36 PM