<?xml version="1.0" encoding="utf-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Bayesian Spam Filter Poisoning With RSS</title>
	<atom:link href="http://robert.accettura.com/blog/2007/01/29/bayesian-spam-filter-poisoning-with-rss/feed/" rel="self" type="application/rss+xml" />
	<link>http://robert.accettura.com/blog/2007/01/29/bayesian-spam-filter-poisoning-with-rss/</link>
	<description>Robert Accettura&#039;s Personal Blog on Web Development and Tech</description>
	<lastBuildDate>Fri, 10 Feb 2012 05:07:57 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: Meddling Around v2.2.0 &#187; Blog Archive &#187; GMail Spam filter poisoning?</title>
		<link>http://robert.accettura.com/blog/2007/01/29/bayesian-spam-filter-poisoning-with-rss/comment-page-1/#comment-136265</link>
		<dc:creator>Meddling Around v2.2.0 &#187; Blog Archive &#187; GMail Spam filter poisoning?</dc:creator>
		<pubDate>Sun, 15 Apr 2007 11:01:17 +0000</pubDate>
		<guid isPermaLink="false">http://robert.accettura.com/archives/2007/01/29/bayesian-spam-filter-poisoning-with-rss/#comment-136265</guid>
		<description>[...] http://robert.accettura.com/archives/2007/01/29/bayesian-spam-filter-poisoning-with-rss/  [...]</description>
		<content:encoded><![CDATA[<p>[...] <a href="http://robert.accettura.com/archives/2007/01/29/bayesian-spam-filter-poisoning-with-rss/ " rel="nofollow">http://robert.accettura.com/ar.....ith-rss/ </a> [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Around the web &#124; alexking.org</title>
		<link>http://robert.accettura.com/blog/2007/01/29/bayesian-spam-filter-poisoning-with-rss/comment-page-1/#comment-117590</link>
		<dc:creator>Around the web &#124; alexking.org</dc:creator>
		<pubDate>Sun, 04 Feb 2007 17:03:23 +0000</pubDate>
		<guid isPermaLink="false">http://robert.accettura.com/archives/2007/01/29/bayesian-spam-filter-poisoning-with-rss/#comment-117590</guid>
		<description>[...] Robert Accettura: Bayesian Spam Filter Poisoning With RSS [...]</description>
		<content:encoded><![CDATA[<p>[...] Robert Accettura: Bayesian Spam Filter Poisoning With RSS [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kevin McGhee</title>
		<link>http://robert.accettura.com/blog/2007/01/29/bayesian-spam-filter-poisoning-with-rss/comment-page-1/#comment-116461</link>
		<dc:creator>Kevin McGhee</dc:creator>
		<pubDate>Wed, 31 Jan 2007 14:54:56 +0000</pubDate>
		<guid isPermaLink="false">http://robert.accettura.com/archives/2007/01/29/bayesian-spam-filter-poisoning-with-rss/#comment-116461</guid>
		<description>Interesting post! 

I agree with Justin that it&#039;s probably the same spammer, but when stuff like this starts to work it generally catches on fairly quickly. 

With the unfortunate rise in image spam, Bayes is becoming less effective. More and more, the only text in the body of the spam is Bayes busting!</description>
		<content:encoded><![CDATA[<p>Interesting post! </p>
<p>I agree with Justin that it&#8217;s probably the same spammer, but when stuff like this starts to work it generally catches on fairly quickly. </p>
<p>With the unfortunate rise in image spam, Bayes is becoming less effective. More and more, the only text in the body of the spam is Bayes busting!</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Justin Mason</title>
		<link>http://robert.accettura.com/blog/2007/01/29/bayesian-spam-filter-poisoning-with-rss/comment-page-1/#comment-116207</link>
		<dc:creator>Justin Mason</dc:creator>
		<pubDate>Tue, 30 Jan 2007 18:37:06 +0000</pubDate>
		<guid isPermaLink="false">http://robert.accettura.com/archives/2007/01/29/bayesian-spam-filter-poisoning-with-rss/#comment-116207</guid>
		<description>A good post -- thanks Robert! 

For what it&#039;s worth, the spams containing CNN headlines are
probably sent by a single spammer or spam team. That gives
you an idea of how _few_ &quot;bad guys&quot; there are, and how
much volume each of them is pushing out.

&quot;Bayes poisoning&quot;, in my opinion, does indeed have an effect;
spammers undoubtedly want it to cause their spams to
match nonspam training more closely.  However, in our
testing, we found that this doesn&#039;t necessarily happen;
instead, when a user trains on spam and nonspam, future
nonspam mails are biased towards looking like spam to
the filter, i.e. increased false positives.

However, it appears that various tweaks that we use in &quot;real-world&quot; Bayesian-style probabilistic classifier filters, including the algorithms
used in SpamAssassin, SpamBayes and Thunderbird, may protect
against this.

This tech report has the details:
http://www.cs.dal.ca/research/techreports/2004/CS-2004-06.pdf .
There&#039;s quite a bit of other research if you go through the
http://www.ceas.cc archives too.</description>
		<content:encoded><![CDATA[<p>A good post &#8212; thanks Robert! </p>
<p>For what it&#8217;s worth, the spams containing CNN headlines are<br />
probably sent by a single spammer or spam team. That gives<br />
you an idea of how _few_ &#8220;bad guys&#8221; there are, and how<br />
much volume each of them is pushing out.</p>
<p>&#8220;Bayes poisoning&#8221;, in my opinion, does indeed have an effect;<br />
spammers undoubtedly want it to cause their spams to<br />
match nonspam training more closely.  However, in our<br />
testing, we found that this doesn&#8217;t necessarily happen;<br />
instead, when a user trains on spam and nonspam, future<br />
nonspam mails are biased towards looking like spam to<br />
the filter, i.e. increased false positives.</p>
<p>However, it appears that various tweaks that we use in &#8220;real-world&#8221; Bayesian-style probabilistic classifier filters, including the algorithms<br />
used in SpamAssassin, SpamBayes and Thunderbird, may protect<br />
against this.</p>
<p>This tech report has the details:<br />
<a href="http://www.cs.dal.ca/research/techreports/2004/CS-2004-06.pdf" rel="nofollow">http://www.cs.dal.ca/research/.....004-06.pdf</a> .<br />
There&#8217;s quite a bit of other research if you go through the<br />
<a href="http://www.ceas.cc" rel="nofollow">http://www.ceas.cc</a> archives too.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

