Re: Latest sa-stats from last week

Jay Lee Wed, 10 May 2006 13:44:04 -0700

Bowie Bailey wrote:

Michael Monnerie wrote:

On Wednesday, 10 May 2006 17:27 Bowie Bailey wrote:

So you are saying that I should not feed Bayes with the unsolicited
marketing garbage that I get because it looks like something that
could have been requested?

If it's a newsletter from a seemingly legit company I don't feed it to
bayes. I try to unsubscribe from them. If they keep sending, I write
a rule to filter them. If a customer then complains, I tell them
that the company doesn't behave properly, and that they should make a
filter to get e-mail from that company out of the SPAM folder again.


If it comes to an account that does not subscribe to newsletters
(webmaster, sales, etc), it is spam by definition and is fed to Bayes.

Remember: 10 good SPAM and HAM are better than 200 where 5% are
wrong.

Wrong for whom?  If it looks like marketing, 99% of the time, I don't
want it.  And for most of the accounts that I deal with, this goes
up to 100%.  Not true for my customers, though.

Yes, some manual filters can catch those. If it's stupid SPAM, then
bayes.

My philosophy with Bayes has always been to skip the ham/spam
definitions and go with a wanted/unwanted model.  This way Bayes
learns to filter out the emails you don't want even if some of them
may technically be ham.  (Obviously, I would not be able to do this
on a site-wide installation)

But as you said, your bayes is not quite accurate, so it doesn't
really seem to work. Wouldn't it be better to have a highly accurate
bayes, and set up some filters for you personally? If BAYES_99 were
always SPAM for you, you could give it 4.5 or 5 points, and probably
filter more SPAM than now?
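
The effect of raising the BAYES_99 score can be sketched roughly as follows. This is a minimal illustration, not SpamAssassin's actual code: it only assumes that the scores of all rules a message hits are summed and compared against the required score (5.0 by default). The function name and the 3.5 "lower" score are hypothetical, chosen for the example.

```python
# Hypothetical sketch: a message is tagged as spam once the summed scores of
# the rules it hits reach the required score.
REQUIRED_SCORE = 5.0  # SpamAssassin's default required_score


def is_tagged_spam(hits, scores, required=REQUIRED_SCORE):
    """Return True if the rules in `hits` add up to at least `required`."""
    total = sum(scores[rule] for rule in hits)
    return total >= required


# Illustrative score tables (assumed values, not taken from any ruleset).
lower_scores = {"BAYES_99": 3.5}
raised_scores = {"BAYES_99": 5.0}

# A message that hits nothing but BAYES_99:
print(is_tagged_spam(["BAYES_99"], lower_scores))   # below the threshold
print(is_tagged_spam(["BAYES_99"], raised_scores))  # reaches the threshold
```

With the raised score, a message that hits only BAYES_99 is tagged on its own, which is only safe if BAYES_99 is practically never wrong for that database.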


If I look at my personal database, the spam percentage shown in the
stats is lower than I'd like, but I wouldn't say it's not accurate.  I
very rarely see a true false positive or negative with Bayes and I
watch my account closely.  I do see a few ham with BAYES_99 and spam
with BAYES_00, but that's usually simply because those were either
spam that only hit BAYES_99 or ham (usually from this list) that
tripped a few extra rules.

But then again, I think less than half of my users are even taking
advantage of the spam markup.  Since I don't do any blocking or
sorting on the server, it is up to them to use MUA rules to sort or
delete the spam once my server has marked it.
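
What such an MUA rule does can be sketched in a few lines. This is only an illustration, assuming the server adds SpamAssassin's usual "X-Spam-Flag: YES" header to tagged mail; the function name and the sample messages are made up for the example.

```python
from email import message_from_string


def is_marked_spam(raw_message: str) -> bool:
    """Return True if the message carries an 'X-Spam-Flag: YES' header."""
    msg = message_from_string(raw_message)
    # A missing header is treated as "not marked".
    return msg.get("X-Spam-Flag", "").strip().upper() == "YES"


# Made-up sample messages for illustration.
tagged = (
    "From: someone@example.com\n"
    "Subject: Buy now\n"
    "X-Spam-Flag: YES\n"
    "\n"
    "Body text\n"
)
untagged = (
    "From: friend@example.com\n"
    "Subject: Lunch?\n"
    "\n"
    "Body text\n"
)

print(is_marked_spam(tagged))
print(is_marked_spam(untagged))
```

A client-side filter would then move or delete every message for which the check is true.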

I do the same, just wrote a nice document for Outlook 2003 describing
how to filter SPAM.


I've done the same for both Outlook Express and Thunderbird.  The
Thunderbird setup is a single checkbox. :)

It would be nice if updates.spamassassin.org weren't using mirrors on non-standard ports. sa-update is trying to use http://buildbot.spamassassin.org.nyud.net:8090/updatestage/, which means I'd have to open a port on my firewall just to get updates. Sigh...

Jay
