[clo-devel] Re: [lists] Re: Re: Payment Past Due, account
Anthony Ventimiglia
anthony at ventimiglia.org
Thu Nov 13 14:15:59 UTC 2003
Erik Enge writes:
> Anthony Ventimiglia <anthony at ventimiglia.org> writes:
>
> > I don't know what kind of filtering rates Spamassassin gets, but the
> > Bayesian filter I use (PopFile) has a success rate over 98%.
>
> The biggest problem I have here is that I get false positives when I set
> the this-is-spam score lower than four and most spam I get is between
> three and four.
So SpamAssassin still has those silly scores; that's the main problem.
A pure Bayesian filter gives a score 0 < score < 1, which is a
probability, so basically anything over .50 is spam and anything under
is not.
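
For illustration, here is a minimal sketch of how per-token spam
probabilities can be combined into a single score between 0 and 1. It
uses the combining formula from Paul Graham's "A Plan for Spam"; this is
an assumption made for the example, not necessarily what PopFile or
bogofilter do internally. The names combine and is_spam are hypothetical.

```python
import math

def combine(token_probs):
    """Combine per-token spam probabilities into one score in (0, 1)."""
    # Clamp so no single token can force the score to exactly 0 or 1
    # (Graham's essay uses the same .01 / .99 bounds).
    probs = [min(max(p, 0.01), 0.99) for p in token_probs]
    # Work in log space so many small factors do not underflow.
    log_spam = sum(math.log(p) for p in probs)
    log_ham = sum(math.log(1.0 - p) for p in probs)
    # Equivalent to prod(p) / (prod(p) + prod(1 - p)).
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))

def is_spam(token_probs, threshold=0.5):
    """Apply the 'anything over .50 is spam' rule to the combined score."""
    return combine(token_probs) > threshold
```

With this sketch, a message whose tokens score [0.99, 0.95, 0.2] lands
well above .50, while tokens that all score exactly 0.5 give a combined
score of 0.5.
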
I recommend trying PopFile or bogofilter for your personal use and
you'll see how quickly they "learn". False positives are the biggest
problem, but after a while, you'll see that they become quite rare
(under .5%). If you want a good argument for Bayesian filtering, read
Paul Graham's web site.
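
The "learning" comes down to keeping per-token counts from mail you have
already classified. The sketch below is a simplified version of that
idea, assuming whitespace tokenization and a plain ratio of spam rate to
ham rate; the class TokenStats, its methods, and the 0.4 default for
unseen tokens (borrowed from Graham's essay) are choices made for this
example, not the behavior of any particular filter.

```python
from collections import Counter

class TokenStats:
    """Per-token counts learned from messages labeled spam or not spam."""

    def __init__(self):
        self.spam = Counter()   # token -> number of spam messages containing it
        self.ham = Counter()    # token -> number of good messages containing it
        self.n_spam = 0
        self.n_ham = 0

    def train(self, text, is_spam):
        # Count each token once per message (naive whitespace tokenizer).
        tokens = set(text.lower().split())
        if is_spam:
            self.spam.update(tokens)
            self.n_spam += 1
        else:
            self.ham.update(tokens)
            self.n_ham += 1

    def token_prob(self, token):
        """Estimate the probability that a message containing token is spam."""
        s, h = self.spam[token], self.ham[token]
        if s + h == 0 or self.n_spam == 0 or self.n_ham == 0:
            return 0.4          # assumed neutral-ish value for unseen tokens
        spam_rate = s / self.n_spam
        ham_rate = h / self.n_ham
        p = spam_rate / (spam_rate + ham_rate)
        return min(max(p, 0.01), 0.99)
```

Every corrected false positive is just another train(text, False) call,
which pulls the probabilities of that message's tokens down and is why
such mistakes get rarer as the counts grow.
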
Like I said, I used SpamAssassin a while ago, before Bayesian filters
came to the forefront. When I learned about Bayesian filters (thanks
to Graham and ESR), I ended up writing my own library (C++) and my own
filter. Bayesian filters aren't perfect, but the spam that makes it
through is very un-spam-like; usually it looks like a spammer trying to
beat the filter, but by that point it's not very effective spam.
I have been slowly converting my C++ library to Lisp, which I'll
eventually bring here.
--
(incf *yankees-world-series-losses*)