Spam Filtering in Mozilla

I had a short discussion with David Bienvenu today about spam filtering in Mozilla. Allow me to summarize:

I’m a huge fan of the SpamAssassin project. I use it and love it. It’s not perfect, but it does a great job. SpamAssassin adds a header, “X-Spam-Status”, to every email it scans. The value starts with “Yes” if the message is spam and “No” if it isn’t. If it’s spam, the message also lists the tests that triggered (causing it to be recognized as spam), and SpamAssassin attaches the original message to the email.
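To make the header concrete, here is a minimal Python sketch of reading it. The sample message and the function name `parse_spam_status` are made up for illustration, and the exact header layout depends on the SpamAssassin version and the site's configuration (older releases write `hits=`, newer ones `score=`), so treat this as an assumption-laden example rather than a reference parser.

```python
import re
from email import message_from_string

# Hypothetical sample; real headers vary with SpamAssassin version and config.
RAW = (
    "From: sender@example.com\n"
    "Subject: Cheap pills\n"
    "X-Spam-Status: Yes, hits=7.2 required=5.0 tests=HTML_MESSAGE,MIME_HTML_ONLY\n"
    "\n"
    "body\n"
)

def parse_spam_status(raw_message):
    """Return (is_spam, score, tests), or None when the header is absent."""
    msg = message_from_string(raw_message)
    value = msg.get("X-Spam-Status")  # header lookup is case-insensitive
    if value is None:
        return None
    value = " ".join(value.split())  # collapse folded continuation lines
    is_spam = value.lower().startswith("yes")
    # Older releases write "hits=", newer ones "score=".
    score_match = re.search(r"(?:hits|score)=(-?\d+(?:\.\d+)?)", value)
    score = float(score_match.group(1)) if score_match else None
    # The test list runs until the next "key=" field or the end of the header.
    tests_match = re.search(r"tests=(.*?)(?=\s+\w+=|$)", value)
    tests = []
    if tests_match:
        tests = [t.strip() for t in tests_match.group(1).split(",") if t.strip()]
    return is_spam, score, tests
```

Calling `parse_spam_status(RAW)` on the sample yields the verdict, the numeric score, and the list of triggered test names; a message that was never scanned yields `None`.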

As a result, the email isn’t pure spam anymore: it contains SpamAssassin markings. That’s good and bad, depending on how you look at it. My suggestion was to acknowledge that other anti-spam products do this, and to take advantage of it.

Bug 224318

Several things can be done as noted in the bug:

An option to use X-Spam-Status over Bayes testing.
In essence, this disables Bayes testing in Mozilla and uses the spam status to decide whether a message is spam. The UI works the same (the little garbage icons and the junk folder); only the actual spam checking is done by another product. For end users this is easier than configuring a filter, and the UI is cleaner.
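A rough Python sketch of that option follows. The names `is_junk`, `trust_header`, and `bayes_is_junk` are hypothetical, and falling back to the built-in filter when the header is missing is my own assumption, not something settled in the bug.

```python
from email import message_from_string

def is_junk(raw_message, trust_header, bayes_is_junk):
    """Decide junk status, preferring X-Spam-Status when trust_header is set.

    bayes_is_junk is a hypothetical callable standing in for the mail
    client's own Bayes classifier.
    """
    if trust_header:
        value = message_from_string(raw_message).get("X-Spam-Status")
        if value is not None:
            return value.strip().lower().startswith("yes")
    # Option off, or the message was never scanned upstream (assumed fallback).
    return bayes_is_junk(raw_message)
```

Either way the result is a single junk/not-junk answer, so the icons and the junk folder would not need to know which engine produced it.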

Give weight to X-Spam-Status
This would allow Mozilla to somehow give extra weight to messages that another product has already marked as spam.
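How the weighting would work is left open, so the following Python sketch is only one possible reading: a linear blend of the Bayes probability with the external verdict. The function name, the `header_weight` parameter, and its default of 0.5 are all assumptions for illustration.

```python
def combined_spam_probability(bayes_probability, header_says_spam, header_weight=0.5):
    """Blend a Bayes probability (0..1) with an external verdict.

    header_says_spam is True, False, or None (no X-Spam-Status header).
    header_weight is a hypothetical tuning knob between 0 and 1.
    """
    if not 0.0 <= header_weight <= 1.0:
        raise ValueError("header_weight must be between 0 and 1")
    if header_says_spam is None:
        # Message was never scanned upstream: rely on Bayes alone.
        return bayes_probability
    header_probability = 1.0 if header_says_spam else 0.0
    return header_weight * header_probability + (1.0 - header_weight) * bayes_probability
```

With `header_weight=0` this reduces to plain Bayes testing, and with `header_weight=1` it behaves like the first option, so the two extremes of the blend match the other proposals.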

Feed Mozilla’s Bayes
I suggest this as the default behavior, as SpamAssassin does this for its own Bayes engine, and it’s successful. Emails carrying a spam status are automatically acknowledged by Mozilla as spam or ham and learned by Mozilla’s Bayes system. In essence, the Bayes filter learns automatically, without user interaction.
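The auto-learning idea could look roughly like the Python sketch below. `TokenStore` is a toy stand-in for a Bayes training database, not Mozilla's actual storage, and `auto_train` is a hypothetical name. For simplicity the sketch trains only on a plain-text body and skips multipart messages, which glosses over the fact that SpamAssassin may wrap the original spam as an attachment.

```python
import re
from collections import Counter
from email import message_from_string

class TokenStore:
    """Toy stand-in for a Bayes training database: per-class token counts."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def learn(self, text, label):
        # Very crude tokenizer, for illustration only.
        self.counts[label].update(re.findall(r"[a-z0-9$']+", text.lower()))
        self.messages[label] += 1

def auto_train(store, raw_message):
    """Train from the X-Spam-Status verdict; return the label used, or None."""
    msg = message_from_string(raw_message)
    value = msg.get("X-Spam-Status")
    if value is None:
        return None  # unscanned mail is left for the user to mark by hand
    label = "spam" if value.strip().lower().startswith("yes") else "ham"
    body = msg.get_payload()
    # get_payload() returns a list for multipart mail; this sketch skips those.
    store.learn(body if isinstance(body, str) else "", label)
    return label
```

A user correction (marking a message as junk or not junk by hand) would presumably still need to override whatever was learned automatically.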

There are other possibilities as well. Whichever method(s) are used in the future, there is serious room to enhance an already powerful tool. Comments on the bug would be nice. Mozilla Mail kicks butt thanks to its ability to provide great features. There has to be a way to use this to fight spam better than any other email product on the web.

3 thoughts on “Spam Filtering in Mozilla”

  1. The headers are different from product to product.
    I use SpamPal and it adds x-spampal.
    BTW: Why don’t you use a product that also does Bayes itself, and not only fixed rules (scoring) and DNSBL?

  2. When my college’s mail server detects spam, it adds:

    X-Spam-Status, X-Spam-Level (as a number of stars), X-Spam-Report, X-Spam-Flag, X-Spam-Checker-Version (SpamAssassin 2.53 (1.174.2.15-2003-03-30-exp))

    and the Subject is changed to start with “*****SPAM*****”.

  3. I don’t get it. I’ve only just started looking at this stuff but it doesn’t add up for me.

    Surely Mozilla’s Bayesian filter already looks at the email headers, and therefore these spam headers are already highly correlated with spam.

    So, as far as I can tell, of the three options you outline:

    “Feed Mozilla’s Bayes” is already being done,

    “Give weight to X-Spam-Status” is the kind of über-geeky feature that gives Mozilla a bad name, and…

    “An option to use X-Spam-Status over Bayes testing” would hide a non-personalizable spam filter behind the UI for a personalizable one, rendering marking mail as spam/ham meaningless in regard to how SpamAssassin marks future mail.

    This last option appears to achieve the same effect as simply switching off Mozilla’s spam filter and defining rules for these headers (or subject line additions).

    So perhaps spam-filter specific extensions are the way to go, if people really need that functionality, but I can’t see the benefit myself.
