Re: [SAtalk] SA's performance with mailing lists

Kerry Nice Wed, 20 Mar 2002 13:59:40 -0800

I did email Chris Pirillo of Lockergnome and tried to
enlighten him.  His response basically was that he was
mad that people were using something too powerful that
they didn't know how to use.  OK, fine.  I think the
anger is misdirected, but I see why he is mad, since
his newsletter, which puts food on his table, is going
to /dev/null on a lot of systems.  I thought the
solution of the ISP with tunable levels and caught-spam
folders was great.  I wish my provider did that for me.

Maybe the newsletters I get are more spammy-looking
than those of a lot of others on the list.  I added the
ELIST tests to my rules and they seem to help a bit,
but not much.  Most everything the Denver Post sends,
everything from shagmail.com, and a bunch of other
HTML-formatted, advertiser-supported mail gets tagged,
even the Linux-announcement digest.

As an exercise, I've started running a copy of
everything I get through another user.  All of it is
run through SA, and everything that is tagged is saved
off to a folder.  If anybody wants, I can send this
monthly somewhere to have it added to the corpus.
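
For anyone who wants to try the same exercise, here is a rough
sketch of the sorting step in Python.  It assumes SA marks tagged
mail with an "X-Spam-Flag: YES" header; the function names
is_tagged and split_tagged are made up for illustration, and this
is not the actual setup described above.

```python
from email import message_from_string


def is_tagged(raw_message: str) -> bool:
    """Return True if the message carries an X-Spam-Flag header set to YES.

    Assumption: the filter marks tagged mail with this header.
    """
    msg = message_from_string(raw_message)
    flag = msg.get("X-Spam-Flag") or ""
    return str(flag).strip().upper() == "YES"


def split_tagged(raw_messages):
    """Split raw messages into (tagged, clean) lists, preserving order."""
    tagged, clean = [], []
    for raw in raw_messages:
        (tagged if is_tagged(raw) else clean).append(raw)
    return tagged, clean
```

The tagged list is what would be saved off to the folder and
later offered for the corpus.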

Does it seem worthwhile to create a corpus of
legitimate mailing lists?  Some email account somewhere
could be subscribed to a ton of different mailing
lists, and that mail could be used to create a normal,
non-technical user profile that would weight things
with that sort of mail in mind.

Don't get me wrong, I love SA and it does a great job
for me.  Almost all the email I get is moved by
procmail somewhere before SA even looks at it so I
never see this problem in normal usage.  

I guess it depends on what the focus is here.  Do you
want something that works great for a largely US-based
group with mostly technical email, or is there a wider
goal?  Do you go for 100% spam catching with some
false positives, or do you miss some because you never
want a false positive?

I've also noticed that I get a lot of mailing lists
sent to me as a digest.  That pretty much guarantees
that they will get tagged, since the body is many
times longer and there are more chances for something
spammy to turn up in it.
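
As a toy illustration of the digest effect: assume each message in
a digest independently has some small chance p of containing text
that trips a body rule.  The chance that at least one of n messages
does is 1 - (1 - p)^n.  Both the independence assumption and the
function name below are mine, and real SA scoring adds up rule
scores rather than counting hits, so treat this only as a sketch of
the trend.

```python
def chance_of_any_hit(p_single: float, n_messages: int) -> float:
    """Probability that at least one of n independent messages trips a rule."""
    return 1.0 - (1.0 - p_single) ** n_messages


# One message at 5% versus a 20-message digest.
single = chance_of_any_hit(0.05, 1)
digest = chance_of_any_hit(0.05, 20)
```

With p = 0.05, a single message stays at 5%, while a 20-message
digest comes out at roughly 64%.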

Kerry.

CertaintyTech - Ed Henderson wrote:
> Kerry,
> Could you try adding the tests that Matthew recently posted specifically for
> lists?  Would be interesting to see how or if these change your results.
> Here they are:
> 
> Here's some rules that I have for lists:
> 
> # Only look for 7 bit chars between square brackets, because a lot
> # of spam with 8 bit chars in the subject would match this rule
> header ELIST_1                  Subject =~ /^.{0,6}\[[\000-\177]{2,20}\]/
> describe ELIST_1                Subject has something between square brackets
> 
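
To see what the quoted ELIST_1 pattern does and does not match,
here is a small Python sketch.  The pattern is the one from the
rule above, with the octal range \000-\177 written as \x00-\x7f;
the helper name subject_has_list_tag is hypothetical, and the
sketch assumes it is given an already-decoded Subject string,
which may differ from what SA actually sees.

```python
import re

# Same pattern as the quoted ELIST_1 rule: at most 6 leading characters
# (room for "Re: "), then 2 to 20 seven-bit characters in square brackets.
ELIST_1 = re.compile(r"^.{0,6}\[[\x00-\x7f]{2,20}\]")


def subject_has_list_tag(subject: str) -> bool:
    """Return True if the subject starts with something like "[SAtalk]"."""
    return ELIST_1.search(subject) is not None
```

A subject such as "Re: [SAtalk] SA's performance with mailing
lists" matches.  A bracketed tag made of 8-bit characters does
not, and neither does a tag that starts more than six characters
into the subject, for example after "Fwd: Re: ".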

