[clo-devel] Re: [lists] Re: Re: Payment Past Due, account

Anthony Ventimiglia anthony at ventimiglia.org
Thu Nov 13 14:15:59 UTC 2003


Erik Enge writes:
 > Anthony Ventimiglia <anthony at ventimiglia.org> writes:
 > 
 > > I don't know what kind of filtering rates Spamassassin gets, but the
 > > Bayesian filter I use (PopFile) has a success rate over 98%. 
 > 
 > The biggest problem I have here is that I get false positives when I set
 > the this-is-spam score lower than four and most spam I get is between
 > three and four.

So SpamAssassin still has those silly scores; that's the main problem.
A pure Bayesian filter gives a score between 0 and 1, which is a
probability, so basically anything over .50 is spam and anything under
is not.
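
The 0-to-1 score and the .50 cutoff can be sketched roughly as follows.
This is only an illustration, assuming the naive combining formula from
Paul Graham's "A Plan for Spam". The function names, the clamp to
[0.01, 0.99], and the sample per-token probabilities are made up for the
example and are not taken from any filter mentioned in this thread.

```python
from math import prod


def combined_spam_probability(token_probs):
    """Combine per-token spam probabilities into one score in (0, 1).

    Assumed formula (Graham-style naive Bayes):
        P = prod(p_i) / (prod(p_i) + prod(1 - p_i))
    """
    # Clamp so a single 0.0 or 1.0 cannot zero out both products.
    clamped = [min(0.99, max(0.01, p)) for p in token_probs]
    spam = prod(clamped)
    ham = prod(1.0 - p for p in clamped)
    return spam / (spam + ham)


def is_spam(token_probs, threshold=0.5):
    """Anything strictly over the threshold is treated as spam."""
    return combined_spam_probability(token_probs) > threshold


if __name__ == "__main__":
    # Hypothetical per-token probabilities for two messages.
    print(is_spam([0.9, 0.8]))  # two spammy tokens
    print(is_spam([0.1, 0.2]))  # two innocent tokens
```

The point of the single probability is that the cutoff needs no tuning:
there is one threshold with an obvious meaning rather than a hand-picked
score.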

I recommend trying PopFile or bogofilter for your personal use, and
you'll see how quickly it "learns". False positives are the biggest
problem, but after a while you'll see that they become quite rare
(under 0.5%). If you want a good argument, read Paul Graham's web site.
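
For a rough idea of what "learning" means here, the sketch below keeps
per-token counts from messages the user has marked as spam or not spam
and turns them into a per-token spam probability. It is a simplified
illustration, not the actual algorithm of PopFile or bogofilter: the
class name, the whitespace tokenizer, the neutral 0.5 for unseen tokens,
and the clamp are all assumptions made for the example.

```python
from collections import Counter


class TokenStats:
    """Hypothetical trainer: counts how many messages contain each token."""

    def __init__(self):
        self.spam_tokens = Counter()
        self.ham_tokens = Counter()
        self.spam_messages = 0
        self.ham_messages = 0

    def train(self, text, is_spam):
        # Naive tokenizer (assumption): lowercase, split on whitespace,
        # count each token at most once per message.
        tokens = set(text.lower().split())
        if is_spam:
            self.spam_tokens.update(tokens)
            self.spam_messages += 1
        else:
            self.ham_tokens.update(tokens)
            self.ham_messages += 1

    def token_probability(self, token):
        # Fraction of spam and of non-spam messages containing the token.
        spam_rate = self.spam_tokens[token] / max(self.spam_messages, 1)
        ham_rate = self.ham_tokens[token] / max(self.ham_messages, 1)
        if spam_rate + ham_rate == 0:
            return 0.5  # unseen token: treated as neutral (assumption)
        p = spam_rate / (spam_rate + ham_rate)
        return min(0.99, max(0.01, p))  # never fully certain


if __name__ == "__main__":
    stats = TokenStats()
    stats.train("cheap pills now", is_spam=True)
    stats.train("meeting notes now", is_spam=False)
    for word in ("pills", "meeting", "now"):
        print(word, stats.token_probability(word))
```

Every correction the user makes is just another call to train, which is
why the numbers adapt to one person's mail so quickly.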

Like I said, I used SpamAssassin a while ago, before Bayesian filters
came to the forefront. When I learned about Bayesian filters (thanks
to Graham and ESR), I ended up writing my own library (C++) and my own
filter. These filters aren't perfect, but the spam that makes it through
is very un-spam-like. Usually it looks like a spammer trying to beat
the filter, and by that point it's not very effective spam.

I have been slowly converting my C++ library to Lisp, which I'll
eventually bring here. 

-- 
(incf *yankees-world-series-losses*)



