Phishing Unit Testing And Other Phishy Things

Seeing these results is pretty cool. I hope someone has come up with, or will come up with, a way to run a test like this periodically (at least weekly, if not daily or several times a day) that analyzes phishing sites and how many are being blocked. I’d presume Google and other data services would have some interest in this. It could be as simple as a browser extension (yes, for IE too) which reads a feed, visits each site, and reports the results to a web service, running in a confined environment (a virtual machine or dedicated box) free of tampering. I think the real advantage would be seeing how effectiveness varies over time as phishers become more sophisticated.

Take spammers, for example. At first spam was pretty simple; now they are using animated GIFs, sophisticated techniques to poison Bayesian analysis, botnets, etc. I presume over time we’ll see the exact same thing with phishing attacks. I doubt it’s going to get any better. On the positive side, this is still in its infancy, so we can start learning now and be more aggressive than people were about the spam problem, which got way out of hand before everyone realized it was really something to worry about.

I’d ultimately like to see just the detection percentages of different anti-phishing blacklists/software, updated frequently, so we can keep a running tally. Perhaps it would be a good indicator of when phishing tactics require a software or methodology update. I think overall everyone would benefit from some industry collaboration rather than competition. The problem with phishing is that to be effective, your research must be good. To do good research you need to cast a wide net, and capture only one species of phish while not letting any dolphins get stuck in the net (sorry, couldn’t resist).
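
A running tally like that needs very little data. Here is a minimal sketch in Python, assuming each test run records, for every known phishing URL, whether a given product blocked it. The product names, URLs, and record layout are all made up for illustration.

```python
from collections import defaultdict


def detection_rates(results):
    """Return {product: percent of phishing URLs blocked}.

    `results` is a list of (url, product, blocked) tuples, a
    hypothetical record layout for a single test run.
    """
    seen = defaultdict(int)
    blocked = defaultdict(int)
    for _url, product, was_blocked in results:
        seen[product] += 1
        if was_blocked:
            blocked[product] += 1
    return {p: 100.0 * blocked[p] / seen[p] for p in seen}


# Hypothetical run: two products tested against the same three URLs.
run = [
    ("http://phish.example/a", "filter_a", True),
    ("http://phish.example/b", "filter_a", True),
    ("http://phish.example/c", "filter_a", False),
    ("http://phish.example/a", "filter_b", True),
    ("http://phish.example/b", "filter_b", False),
    ("http://phish.example/c", "filter_b", False),
]
rates = detection_rates(run)
for product, pct in sorted(rates.items()):
    print(f"{product}: {pct:.1f}% blocked")
```

Keeping one such dictionary per run, stamped with the date, would be enough to chart how each product's rate moves over time.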

I’d be curious to know what others (general users as well as anti-phishing/spam vendors) think of such testing and efforts. Is the war against spam effective? Should the same techniques be used? Is it time for coalition building? Should we each go it alone? How do you monitor changes in the techniques phishers use?

I know Google is pretty serious about keeping up with the data in a very timely manner, and from what I can tell, most other vendors are as well. But I wonder how industry-wide statistics could help further. Perhaps simply through the competition of trying to have a higher average score. Perhaps simply through the detection of changes in techniques (noted by everyone’s collective decline in detection rate).

I’d love to hear what others think of Phishing protection. It’s a rather interesting topic that many don’t give too much thought to, but it really is an important part of how browsers make the internet safer.

Hardened Defenses

This weekend my Contact page got spammed. It’s now rewritten and using a few blacklists (including Akismet) among other techniques to eliminate spam. Should be much better now. I also think the handling of attachments should be better.

The spam appeared to be from a botnet, based on the fact that no two messages seemed to come from the same IP address. So just blocking IPs wasn’t an option.

Now things should be even better.

eBay and banks need to implement SPF and DomainKeys

eBay and banks really need to implement SPF (Sender Policy Framework) and DomainKeys. There, I said it.

I see quite a few phishing attacks every day, and just about none of them are caught by SpamAssassin. Technically they aren’t spam, so that does make sense. But what bothers me is that this is easy to mitigate for many potential victims. If eBay and banks supported SPF and DomainKeys, it would be much easier for a filter to tell if a message is legitimate or not. Check out this sample SpamAssassin header from an eBay phishing email I received:

X-Spam-Level: **
X-Spam-Status: No, score=3.0 required=5.0 tests=BAYES_50,HTML_IMAGE_ONLY_28,
	MIME_HTML_ONLY autolearn=no version=3.1.0

That’s really not much of a score for an otherwise pretty bad email. The IP of origin isn’t even in North America (it’s in the Pacific Rim).
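
A header like the one above is easy to pull apart if you want to collect scores across many messages. This is only a sketch: the function name is my own, and it handles just the layout shown in that sample (a verdict, then space-separated key=value pairs, with the tests list possibly folded across lines).

```python
import re


def parse_spam_status(value):
    """Split an X-Spam-Status value into verdict, scores, and test names."""
    # Header folding leaves whitespace after commas in the tests list;
    # collapse it so each key=value pair is a single token.
    unfolded = re.sub(r",\s+", ",", value.strip())
    verdict, _, rest = unfolded.partition(",")
    fields = dict(part.split("=", 1) for part in rest.split() if "=" in part)
    return {
        "is_spam": verdict.strip().lower() == "yes",
        "score": float(fields["score"]),
        "required": float(fields["required"]),
        "tests": [t for t in fields.get("tests", "").split(",") if t],
    }


# The value of the X-Spam-Status header from the sample above.
sample = (
    "No, score=3.0 required=5.0 tests=BAYES_50,HTML_IMAGE_ONLY_28,\n"
    "\tMIME_HTML_ONLY autolearn=no version=3.1.0"
)
status = parse_spam_status(sample)
print(status["score"], "of", status["required"], status["tests"])
```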

Perhaps it’s time to start a campaign to urge institutions subject to having their name used in these attacks to start using a method like SPF and DomainKeys. A mail provider could then throw out emails that don’t match. Anyone know why they still don’t implement one or both of these methods?
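
To show roughly what “throw out emails that don’t match” means for SPF, here is a deliberately simplified Python sketch. It assumes the domain’s SPF record is already in hand (a real check fetches it from DNS as a TXT record) and it only understands the ip4 and all mechanisms; real SPF has several more, such as a, mx, and include. The record and addresses are made-up documentation values, not anything eBay or a bank publishes.

```python
import ipaddress

QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def check_spf_ip4(record, sender_ip):
    """Evaluate only the ip4: and all mechanisms of an SPF record."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"  # not an SPF record at all
    ip = ipaddress.ip_address(sender_ip)
    for term in terms[1:]:
        qualifier = "+"  # a mechanism with no qualifier means "pass"
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        if term.lower().startswith("ip4:"):
            network = ipaddress.ip_network(term[4:], strict=False)
            if ip in network:
                return QUALIFIERS[qualifier]
        elif term.lower() == "all":
            return QUALIFIERS[qualifier]
        # Any other mechanism is ignored in this sketch.
    return "neutral"


# Hypothetical record: mail may only come from 192.0.2.0/24.
record = "v=spf1 ip4:192.0.2.0/24 -all"
print(check_spf_ip4(record, "192.0.2.55"))   # sender inside the listed range
print(check_spf_ip4(record, "203.0.113.9"))  # sender somewhere else entirely
```

A filter could add a heavy score, or reject outright, whenever the result is a fail for a domain that is a frequent phishing target.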

It seems to me they could easily take a giant step to solve the problem. I know Google’s Gmail knows about SPF, and Yahoo knows about DomainKeys. That’s two major email providers right there.

Postage for Email? My Internet != Your Internet?

There’s been a lot of buzz lately over AOL and Yahoo charging to email their customers. I think this quote most likely will end up being the future:

“AOL users will become dissatisfied when they don’t receive the e-mail that they want, and when they complain to the senders, they’ll be told, ‘it’s AOL’s fault,’ ” said Richi Jennings, an analyst at Ferris Research, which specializes in e-mail.

Well said. Just wait until AOL customers realize they aren’t getting order confirmations, notifications, and other emails because the sender won’t pay.

Another concern not really discussed is the possibility of having a Level 3/Cogent style battle where one ISP refuses to let another email their customers, because they aren’t getting paid what they feel they should.

Right now, email is essentially 100% peered. Everyone emails everyone, nobody charges. You pay your ISP to run the mail server, and that’s it. If commercial entities need to pay to email you, you’re going to get separate charges. Want an email when your order ships? Pay extra. Want an email when this item is back in stock? Pay extra.

This is a very slippery slope. Just one or two greedy ISPs are all you need to ruin email. Once you can’t reliably email, the system is dead. Spam can reduce efficiency, but can’t kill email. Remember, email is by far the most used protocol in business.

I doubt this system will do anything to reduce spam for AOL customers. It will however help AOL’s revenue, which I’m assuming is the real goal. A slightly bold move as AOL is assuming their customers won’t mind not getting all the legitimate email they would if they used a free Gmail or even Hotmail account.

There’s also a decent possibility AOL customers might have to pay merchants an email fee when they buy products, to help cover that cost. Of course merchants eventually will sneak in their percentage there, further hiking prices.

Personally, I think the biggest threat is a Level 3/Cogent style dispute.

I should also note there’s currently a lot going on over Net Neutrality. Google’s been thrown into the middle of that, merely because of how ubiquitous the company is. Vint Cerf’s letter on the topic is really a must read. Paying for email is really just an inverted case of network neutrality. Instead of the middle man dictating who you can/can’t communicate with, the next ISP down the line decides. That’s no better.

The Internet as an open medium could drastically change in the next few months if some of this stuff becomes reality. There are quite a few companies out there who believe the internet is enough of a threat to their business that they want to go as far as crippling it.

Is phishing the new spam?

I’m almost convinced now that the majority of stuff SpamAssassin misses isn’t really spam, but phishing messages. I think it’s time for SpamAssassin to start considering detecting it. Perhaps take a look at mscott’s good work for Mozilla Thunderbird.

Odds are a lot of that detection work will also catch spam slipping through by other means.

Spammer Spot Checking

It’s pretty well known by now that a rather large amount of spam comes through regular ISPs. There is a rather large debate on how to get rid of it. Some ISPs just ignore it. Some block port 25. But is there a better way?

I’m going to propose the following:

  • A random 1 out of every 100 emails sent through an ISP’s servers, or via port 25 (for ISPs that allow 3rd party mail servers), gets checked by a spam filter (such as SpamAssassin).
  • If a user gets flagged, the user enters a “gray list”, in which their emails are checked more frequently (1 out of 25) for the next several days.
  • If more than 10% get flagged (a rather large margin for today’s spam filters), that account should be suspended and investigated by the ISP before being re-enabled.
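
The three steps above can be sketched as a small state machine. This is only an illustration in Python, not a tested mail server plugin: the names are mine, `is_spam` stands in for whatever filter is used, `MIN_GRAY_SAMPLES` is my own assumption (so one unlucky sample doesn’t count as 100%), and the “next several days” expiry of the gray list is left out for brevity.

```python
import random
from dataclasses import dataclass

NORMAL_RATE = 1 / 100      # sample 1 in 100 messages from normal accounts
GRAY_RATE = 1 / 25         # sample 1 in 25 once an account is gray-listed
SUSPEND_THRESHOLD = 0.10   # suspend when more than 10% of samples are flagged
MIN_GRAY_SAMPLES = 10      # assumption: collect a few samples before judging


@dataclass
class Account:
    state: str = "normal"  # "normal", "gray", or "suspended"
    gray_scanned: int = 0
    gray_flagged: int = 0


def on_outgoing(account, message, is_spam, rng=random):
    """Update an account's state for one outgoing message."""
    if account.state == "suspended":
        return account.state
    rate = GRAY_RATE if account.state == "gray" else NORMAL_RATE
    if rng.random() >= rate:
        return account.state  # not sampled, so the filter never runs
    flagged = is_spam(message)
    if account.state == "normal":
        if flagged:
            account.state = "gray"
        return account.state
    # Gray-listed: keep a tally of the sampled messages.
    account.gray_scanned += 1
    if flagged:
        account.gray_flagged += 1
    if (account.gray_scanned >= MIN_GRAY_SAMPLES
            and account.gray_flagged / account.gray_scanned > SUSPEND_THRESHOLD):
        account.state = "suspended"
    return account.state


# Simulate an account that sends nothing but spam.
spammer = Account()
rng = random.Random(42)
sent = 0
while spammer.state != "suspended":
    on_outgoing(spammer, "buy now", lambda m: True, rng)
    sent += 1
print(f"suspended after {sent} messages")
```

Note that the filter only runs on the sampled messages, which is where the CPU savings come from.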

The vast majority of the above can be automated. But how would this cut down on spam?


The vast majority of users send fewer than 100 emails a day, so the extra CPU required would be relatively minimal for each legitimate user an ISP has (only 1/100 of outgoing email would be scanned). Odds are a user will have 1 email scanned every 5-7 days (assuming they send between 15 and 20 emails a day). A spammer, or a computer infected with a Trojan, will be sending large amounts of spam (perhaps hundreds an hour), so it is rather likely that one will fall into the group tested by the spam filter. Then, once the account is on the gray list, it will become rather obvious whether it was a fluke (emailing a spouse about Viagra) or a spammer. Spammers need to send bulk amounts of mail to be profitable, since not many who get it actually click and buy something.
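
To put rough numbers on that: if each message is independently sampled with probability 1/100, the chance that at least one of n messages gets scanned is 1 - (99/100)^n. A quick Python sketch, assuming purely random and independent sampling:

```python
def chance_sampled(emails_sent, rate=1 / 100):
    """Probability that at least one of `emails_sent` messages is sampled."""
    return 1 - (1 - rate) ** emails_sent


for n in (20, 100, 500, 1000):
    print(f"{n:>5} emails: {chance_sampled(n):.1%} chance of at least one scan")
```

For a user sending 20 emails a day that works out to about 18% per day, while a machine pushing out 500 messages has better than a 99% chance of being sampled at least once.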

Why would an ISP want to bother?

A spammer can not only put a large burden on a mail server (read: cost), but also cause an ISP to be blacklisted. This is a negative thing for any ISP because it reduces the quality of service for legitimate users, and could cause customers to feel they can get better service elsewhere. The best way to avoid being blacklisted is to keep your mail servers clean.

Wouldn’t this violate privacy policies?

Not likely. Many ISPs already scan incoming email for spam and viruses. This is simply applying it in reverse. There are likely no additional privacy concerns in doing it this way.

Couldn’t this prevent many virus outbreaks?

Yes, it could be done to prevent viruses, simply by doing the above with a virus scanner.

Could this be done without a “gray list” to make it easier to implement?

Yes, in theory it could. You can just flag an account so an admin is aware, or suspend right away. Suspending right away (on 1 catch) may cause more false positives than you would want, so I’d advise against it. I’d opt for flagging the account or perhaps notifying an admin by email. If someone is a real spammer, they will be part of the random sampling a dozen or so times rather quickly, so it will be rather obvious. A “gray list” is more programming, but makes the system more automatic and tolerant, providing a better experience for end users with less work for admins in the long run.

Where did 1 out of 100 come from?

It’s somewhat arbitrary, but should prove effective. I’m sure some analysis could come up with an even better number. The goal is to prevent spam with minimal CPU. Odds are a spammer won’t send just 1 email a day; they will send in volume (since the more they send, the higher the chances a consumer will bite). Hopefully, more often than not, 1 will fall into the filter. You can sample twice as often (1 out of 50) to roughly double your chances, at the expense of system resources.
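
One way to compare sampling rates is to ask how many messages a sender can push out before there is, say, a 95% chance that at least one has been scanned. A small Python sketch, assuming independent random sampling (the 95% figure is just an example target of mine):

```python
import math


def emails_until_likely_sampled(rate, confidence=0.95):
    """Smallest n such that 1 - (1 - rate)**n >= confidence."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - rate))


for label, rate in (("1 in 100", 1 / 100), ("1 in 50", 1 / 50)):
    n = emails_until_likely_sampled(rate)
    print(f"{label}: about {n} messages before a 95% chance of a scan")
```

Doubling the sampling rate roughly halves that number (about 299 messages versus about 149), while also doubling the filter’s workload.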

Wouldn’t this just make email slower?

Not really. You can send the email before you scan it, so this doesn’t slow outbound email. It’s just taking a random sampling at an interval and reacting based on the analysis. Even if the filter goes off, the mail should be sent (it could be a false positive). Only when the user is flagged as a spammer should the account be unable to send email. This results in minimal disruption of service. For a spammer this should happen relatively quickly. The cost of scanning 1% of outgoing email shouldn’t be too substantial. Assuming you keep an eye on your mail server anyway, this should only speed up the detection of a spammer using it. If you go to a 1:50 ratio of scanning, you’ll only improve your odds and speed in catching spammers.

Has anyone implemented this? Is there a tutorial?

To the best of my knowledge, nobody has done this yet, at least based on my theories. If you have done this, and would like to contribute some code, information, wisdom, or just mention who did it, let me know.

Why not just scan all outgoing email?

It’s just not practical for performance/resource reasons. Nor is it really necessary, since spammers need to send in bulk.

Couldn’t spammers work around this?

Well, they can space out when they send mail, say in batches of 50, but each message still has a 1 in 100 chance of being scanned. They could send less, but that would be costly. They need to send in bulk so they can get as many eyes looking at their offers as possible. So for them, just sending less isn’t good business. This would hit them where it hurts, by making their business model ineffective. If they can’t send the mail, they can’t profit.

Doesn’t this protect others, rather than myself?

Yes, and no. We are a community, and communities do look out for each other. If everyone did this, the load on incoming mail servers would be substantially less. As said before, by catching your own spammers, you prevent being blacklisted by the many blacklists out there. That has a direct benefit to your business.

What about bounced email?

Those should be scanned as well, simply because a spammer can bounce their spam off of your mail servers to get around blacklists. If I send an undeliverable email with a spoofed “From:” header, your server will likely “bounce” that email to my real target (whom I put in the “From:” header), quoting the message (my spam). By scanning these as well (1 out of 100), you can effectively cut down on this abuse by leeching spammers.

The bottom line

By using the above method of scanning outgoing email, you can effectively prevent spammers from profiting off of your mail servers. Spammers need to send in bulk. The more they send, the easier it will be to catch them. This is an easy way for an ISP, webhost or mail provider to cripple the spammers’ business without harming legitimate email users.