[SAtalk] Training SA

2003-12-10 Thread kittonian

Our setup consists of qmail with SA on a Linux server for POP/SMTP, and a
corporate Exchange server running on Win2k through which each client accesses
all of their e-mail (Exchange and POP accounts).  Most of it is working just
fine, with SA tagging e-mails, but some messages are being tagged incorrectly.
Since the clients all download the mail and it ends up stored on the Exchange
server in the user's inbox, how exactly do I train SA to stop marking certain
items?  Our users are all over the place, so if there's something I can set up
where they send the e-mail to a specific address that will make SA stop
tagging that message, that'd be great.  If not, how do I accomplish this?
 
kittonian


[SAtalk] Re: RHSBL Usage

2003-12-10 Thread era
On Wed, 10 Dec 2003 01:33:23 -0500, Jeffrey Posluns (List Address)
<[EMAIL PROTECTED]> posted to spamassassin-talk:
 > Is there a way to use an RHSBL list with SpamAssassin?
 > In searching, I've found a lot of info for custom RBL details, but
 > nothing on RHSBLs (domain based DNS blocklists).

Look at how e.g. dsn.rfc-ignorant.org is being invoked in 20_dnsbl_tests.cf

/* era */

-- 
The email address era at iki dot fi is heavily spam filtered. If you
want to reach me, see the contact information link on my home page instead.
Just for kicks, imagine what it's like to get 500 pieces of spam for
each wanted message.



---
This SF.net email is sponsored by: IBM Linux Tutorials.
Become an expert in LINUX or just sharpen your skills.  Sign up for IBM's
Free Linux Tutorials.  Learn everything from the bash shell to sys admin.
Click now! http://ads.osdn.com/?ad_id=1278&alloc_id=3371&op=click
___
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk


Re: [SAtalk] Training SA

2003-12-10 Thread Matt Kettler
At 02:01 AM 12/10/03 -0500, kittonian wrote:
> Since the clients all download the mail and it becomes stored on the
> exchange server for the user's inbox, how exactly do I train SA to stop
> marking certain items?  Our users are all over the place so if there's
> something I can setup where I can have them send the e-mail to a specific
> address that will make SA stop tagging that message that'd be great.  If
> not, how do I accomplish this?

I'd first suggest a perusal of the manpage for sa-learn

http://www.spamassassin.org/doc/sa-learn.html

and then the Bayes FAQ in the Wiki:

http://wiki.spamassassin.org/w/BayesFaq

in particular:

http://wiki.spamassassin.org/w/SiteWideBayesFeedback

and:
http://wiki.spamassassin.org/w/UsingAnAccountForLearning




[SAtalk] Re: One persistent spammer defeating SA.

2003-12-10 Thread era
On Wed, 10 Dec 2003 01:44:56 -0500, Bryan Hoover <[EMAIL PROTECTED]>
posted to spamassassin-talk:
 > [EMAIL PROTECTED] wrote:
 >> > /^reply-to:[EMAIL PROTECTED](\.org|\.net)[EMAIL PROTECTED](\.org|\.net)\$/igm
 >> This is probably a sufficient pattern, but one distinguishing feature
 >> in the examples was that the same address would be repeated twice.
 > Think there were instances with two different addresses.

Then you can't use a backref after all.

 >> Also the examples are in the .com domain so the restriction to .org/.net
 >> is wrong.
 > Sure.  Would have to add one for each domain.

Or just forego the ambition to keep an up-to-date list of all valid
TLDs in the world, and accept anything which looks vaguely like an
email address. It's not exactly likely to get you a large number of
false positives anyhow.

 >> I'd go with simply:
 >> /^Reply-to:\s+(\S+)\s+\1/i
 > I like the \1 -- "backreference" as I've come to know.
 > /(^reply-to:([EMAIL PROTECTED]){2})/i

You are also requiring a space after the second occurrence. You
shouldn't really be grabbing the spaces inside the parentheses anyway
as simply adding a bit of variation in the spaces would cause the
regex to fail. I'd turn it around like this:

  /^Reply-to:\s*(\w+\@(\w+\.)+\w+)\s+\1/i

Actually I'd probably replace the \w:s with something which is better
tuned to match on domain names, as characters such as dash are valid
in domain names but not included in \w. Also the examples Robert
posted had <>s around them. So here we go again:

  /^Reply-to:\s*(<[-a-z0-9_.]+\@([-a-z0-9_]+\.)+[a-z]+>)\s+\1/i

Underscore is not technically valid in a domain name but you do see
them in practice anyway.

I'm not sure this is any better than what I originally posted, as I
haven't tested this properly. My originally proposed rule would be
prone to false positives in case somebody had the same token twice (or
more :-) at the beginning of their Reply-To:, like so:

  Reply-To: Ma Ma Ma Belle <[EMAIL PROTECTED]>

Your mileage not included when stirred, etc.

 >> I'm guessing the multi-line appearance was simply due to word wraps in
 >> Robert's mail program, and not actually there in the original headers.

Oh, and even if the header was spread over two lines originally, SA
would have folded it back onto a single line before attempting to
match any rules.
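
For anyone who wants to poke at the patterns above outside SA, here is a rough sketch in Python rather than as an SA rule. The sample headers and their example.com addresses are made up for illustration, and the `unfold` helper only approximates the header unfolding described above; it is not SA's own code.

```python
import re

# Simple rule from the thread: any token repeated at the start of Reply-To.
SIMPLE = re.compile(r'^Reply-to:\s+(\S+)\s+\1', re.IGNORECASE)

# Stricter rule from the thread: the repeated token must look like <user@host.tld>.
STRICT = re.compile(
    r'^Reply-to:\s*(<[-a-z0-9_.]+@([-a-z0-9_]+\.)+[a-z]+>)\s+\1',
    re.IGNORECASE,
)


def unfold(header: str) -> str:
    """Join a header that was folded over several lines into one line."""
    return re.sub(r'\r?\n[ \t]+', ' ', header)


def hits(pattern: re.Pattern, header: str) -> bool:
    """Return True if the pattern matches the unfolded header."""
    return pattern.search(unfold(header)) is not None


# Hypothetical sample headers (the addresses are placeholders).
SAMPLES = [
    "Reply-To: <offers@example.com> <offers@example.com>",    # same address twice
    "Reply-To: <offers@example.com>\n <offers@example.com>",  # same, but folded
    "Reply-To: <offers@example.com> <other@example.com>",     # two different addresses
    "Reply-To: Ma Ma Ma Belle <mabelle@example.com>",         # repeated plain word
]

for sample in SAMPLES:
    print(repr(sample), "simple:", hits(SIMPLE, sample), "strict:", hits(STRICT, sample))
```

The last sample is the false-positive case mentioned above: the simple rule fires on the repeated word, while the stricter rule does not because it insists on an address in angle brackets.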

/* era */






RE: [SAtalk] Content Analysis

2003-12-10 Thread David B Funk
On Tue, 9 Dec 2003, Thomas Shoaf (PromoStep) wrote:

> As for correcting the items listed in my original post, I am looking for an
> example of the correct content that should be included in the content of the
> HTML message relating to such items appearing in the Content Analysis when
> checked through the Content Checked at Lyris.

Those were MIME header errors.

Read the MIME RFC (RFC 2045) for the official statement of how
MIME headers should be used. RFCs can be found at the IETF site.


-- 
Dave Funk  University of Iowa
College of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include 
Better is not better, 'standard' is better. B{





Re: [SAtalk] Re: DCC incidence

2003-12-10 Thread Simon Byrnand
> "Covington, Chris" wrote:
>>
>> On my site DCC hits approximately 20% of False Positives also (that is,
>> of the 1-2% of false positives, 20% have Razor hits), so don't give it
>> too much weight.  Razor2 is the worse for that (50% of false
>> positives)... but I've weighted my scoring accordingly.
>
> Setting logic_method=5 in razor_agent.conf might help:
>
> http://www.geocrawler.com/archives/3/2539/2002/11/0/10224077/
>
> As you say, you can adjust when dcc hits, but if it's because of bulk
> and/or list mail, you could white list, or bypass filtering it.  That
> is, why are you getting false positives with DCC?

Just a bit of a "me too" post: I went through my last two days' email, both
Ham and Spam, and checked the hit rates of DCC and RAZOR2. Here are the
results:

Ham: 0 DCC hits, 1 RAZOR2 hit out of 203 Ham messages.
Spam: 174 DCC hits, 57 RAZOR2 hits out of 242 Spam messages.

Out of the total message sample there weren't any FPs or FNs.

Interesting that DCC is triggering on about 72% of my Spam, far higher
than I might have guessed. Meanwhile RAZOR2 hit on only about 24% of Spam.

I didn't check the overlap between the two. (Since it requires more than a
simple search/grep and I can't be bothered ;)
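
For reference, the hit rates follow directly from the raw counts quoted above. A small Python sketch of the arithmetic (the `hit_rate` helper is just an illustrative name, not part of any tool):

```python
def hit_rate(hits: int, total: int) -> float:
    """Percentage of messages that triggered a test, rounded to one decimal."""
    return round(100.0 * hits / total, 1)


HAM_TOTAL = 203
SPAM_TOTAL = 242

# Counts taken from the figures quoted in this message.
COUNTS = {
    ("DCC", "ham"): (0, HAM_TOTAL),
    ("RAZOR2", "ham"): (1, HAM_TOTAL),
    ("DCC", "spam"): (174, SPAM_TOTAL),
    ("RAZOR2", "spam"): (57, SPAM_TOTAL),
}

for (test, kind), (hits, total) in COUNTS.items():
    print(f"{test:7s} {kind:4s} {hits:3d}/{total} = {hit_rate(hits, total)}%")
```

With these counts, DCC comes out at 71.9% of spam and RAZOR2 at 23.6%, while the single RAZOR2 ham hit is about 0.5% of ham.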

Regards,
Simon





Re: [SAtalk] Training SA

2003-12-10 Thread Justin Mason

Matt Kettler writes:
>and then the Bayes FAQ in the Wiki:
>http://wiki.spamassassin.org/w/BayesFaq
>in particular:
>http://wiki.spamassassin.org/w/SiteWideBayesFeedback
>and:
>http://wiki.spamassassin.org/w/UsingAnAccountForLearning

Oh yeah, forgot to mention I finally got around to migrating all the FAQ
stuff onto the Wiki ;)

--j.





RE: [SAtalk] Content Analysis

2003-12-10 Thread David B Funk
On Tue, 9 Dec 2003, Thomas Shoaf (PromoStep) wrote:

>
> The answer to your question, Gary... We are an incentives marketing firm
> with an affiliate element.  Our members can send virtual promotions from
> their account to friends, family, colleagues, etc; however, some email
> services such as Hotmail, Yahoo, etc appear to be blocking these promotions.
>
> So - we have a duty to ensure such promotions are delivered "as perceived"
> by our members.
>
> Likewise, we send various updates/newsletters to our members periodically
> and we feel that a majority of such messages are not being delivered to our
> members.  Therefore, it is pertinent for our company to check such SPAM scoring
> to ensure the customers of our members receive what they send them and that
> our members receive the communications from us as a company.
>
> So we are not trying to see how much spam related content can go into an
> email nor are we trying to find a way around SA... We are trying to allow our
> members to communicate with their customers as well as allow our company to
> communicate to our members.
>

Thomas,
A simple, non-technical solution to your problem is to obtain a Habeas
Warrant Mark and use it.
Any site using SpamAssassin will honor such a mark and pass the message
on, even if the content is "muddy".

This will work for current and future versions of SpamAssassin.

There is a cost associated with obtaining a Warrant mark, but since you
are -not- sending spam it should not be prohibitive. You should be
able to consider it part of the cost of doing marketing business on the
internet.

Dave






[SAtalk] Re: One persistent spammer defeating SA.

2003-12-10 Thread era
On Sun, 7 Dec 2003 00:28:31 -0600, Robert Nicholson <[EMAIL PROTECTED]>
posted to spamassassin-talk:
 > I've got a mailbox full of messages that got past SA
 > They are all from the same spammer.

Hmm, not all of these have the Reply-to pattern which the followups
were concentrating on. Here are the exceptions:

 >  From: Harley Davidson <[EMAIL PROTECTED]>
 >  Subject:Harley Davidson RC Motorcycles are here...
 >  Date:   December 6, 2003 3:21:26 PM CST
 >  To:   Robert David Nicholson <[EMAIL PROTECTED]>
 >
 >  From: Smiles Coffee <[EMAIL PROTECTED]>
 >  Subject:Try our Coffee - Get $70 in Complimentary Gifts
 >  Date:   December 6, 2003 10:58:32 AM CST
 >  To:   Robert David Nicholson <[EMAIL PROTECTED]>
 >
 >  From: Loyalty Interactive <[EMAIL PROTECTED]>
 >  Subject:Loyalty Interactive Homeowner News - December 2003
 >  Date:   December 6, 2003 10:23:27 AM CST
 >  To:   Robert Nicholson <[EMAIL PROTECTED]>
 >  Reply-To: Loyalty Interactive
 > <[EMAIL PROTECTED]>
 >
 >  From: Universal Cable Box <[EMAIL PROTECTED]>
 >  Subject:100% Legal Digital Cable TV Descramblers
 >  Date:   December 6, 2003 6:53:34 AM CST
 >  To:   Robert David Nicholson <[EMAIL PROTECTED]>
 >
 > From:  Credit Incentive <[EMAIL PROTECTED]>
 >  Subject:"Unlimited Cash and Prizes with our MasterCard!"
 >  Date:   December 5, 2003 10:09:36 PM CST
 >  To:   Robert David Nicholson <[EMAIL PROTECTED]>
 >  Reply-To: Credit Incentive <[EMAIL PROTECTED]>
 >
 >  From: Homestore | Everything Home <[EMAIL PROTECTED]>
 >  Subject:See the Latest Trends in Kids’ Rooms
 >  Date:   December 5, 2003 10:27:05 PM CST
 >  To:   Robert David Nicholson <[EMAIL PROTECTED]>
 >  Reply-To: Homestore | Everything Home
 > <[EMAIL PROTECTED]>

For many of these, one can observe that the "user name" in the From:
header often also occurs in the Subject line. This could be a useful
rule pattern, although there are bound to be false positives, so the
score should be rather low.

I don't know off-hand if there is a way to do this in SA currently.
I'd guess it would take a specialized eval: rule. Maybe it's not worth
the effort.
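
To make the idea concrete, here is a rough sketch of the check in Python rather than as an SA eval: rule. The function name and the sample addresses are invented for illustration; the display names and subjects are taken from the headers quoted above. It only looks for the whole From: display name inside the Subject, so it is deliberately crude.

```python
from email.utils import parseaddr


def from_name_in_subject(from_header: str, subject: str) -> bool:
    """Return True if the From: display name appears verbatim in the Subject."""
    name, _address = parseaddr(from_header)
    name = name.strip().lower()
    # An empty display name must never count as a match.
    return bool(name) and name in subject.lower()


# Placeholder addresses; names and subjects come from the quoted examples.
EXAMPLES = [
    ("Harley Davidson <sender@example.com>",
     "Harley Davidson RC Motorcycles are here..."),
    ("Loyalty Interactive <sender@example.com>",
     "Loyalty Interactive Homeowner News - December 2003"),
    ("Smiles Coffee <sender@example.com>",
     "Try our Coffee - Get $70 in Complimentary Gifts"),
]

for from_header, subject in EXAMPLES:
    print(from_name_in_subject(from_header, subject), "|", from_header, "|", subject)
```

The third example shows the limit of a verbatim comparison: only part of the name ("Coffee") shows up in the Subject, so a per-word comparison would be needed to catch it, at the price of more false positives.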

/* era */






[SAtalk] Log Help!

2003-12-10 Thread Ryan Lumsden
Hi all.

How do I get spamd to log to a different file besides messages and mail.log?

I am up2date with SA and I am running Debian woody. Anybody have any ideas?

Thanks in advance.

Ryan






[SAtalk] Re: Log Help!

2003-12-10 Thread era
On Wed, 10 Dec 2003 09:58:17 +0200, Ryan Lumsden <[EMAIL PROTECTED]>
posted to spamassassin-talk:
 > how do I get spamd to log to a different file besides messages and mail.log.
 > I am up2date with sa and I am running debian woody, any body have
 > any ideas.

Does being "up2date" mean that you have installed 2.60 yourself or are
you running a Debian package? The standard one for Woody or a backport?

Assuming you have a 2.60 backport, but it could be that this even
works for the age-old 2.20 (which is what you get with Woody):

Edit /etc/syslog.conf; possibly edit /etc/default/spamd.conf to set it
to use a different syslog facility (or even not log through syslog but
use something else instead).

I can't think of a scenario where it would make sense to run 2.20 so
you should try to find a good backport or build from 2.60 yourself.

/* era */






Re: [SAtalk] Log Help!

2003-12-10 Thread David B Funk
On Wed, 10 Dec 2003, Ryan Lumsden wrote:

> Hi all.
>
> how do I get spamd to log to a different file besides messages and mail.log.
>
> I am up2date with sa and I am running debian woody, any body have any ideas.
>
> Thanks in advance.
>
> Ryan

Yes, look at the man pages for syslogd and spamd. Note the usage
of the syslog facility. Pick a facility that is not already in use
on your system, tell spamd to use that particular facility and tell
your syslogd to log that facility to your desired file.

Dave






[SAtalk] sven and spam rule

2003-12-10 Thread stephane ancelot
Hi,
are there any rules to avoid sven messages ?
bye
steph





[SAtalk] Why no razor_add_header?

2003-12-10 Thread Adam Foxson

Out of curiosity, is there a reason why SpamAssassin omits a configuration
option for razor_add_header when dcc_add_header and pyzor_add_header exist?

Thanks in advance.

-- 
Adam J. Foxson




[SAtalk] Fwd: spamassassin on a relay

2003-12-10 Thread pachox

Hi all.

I'm testing SpamAssassin on a relay server.
Classic configuration: sendmail as a relay to Exchange.

I wrote an /etc/procmailrc like this:

DROPPRIVS=yes

  :0fw
  * < 256000
  | spamc

There is one domain on my Linux machine and another on Exchange, say cippa.it and
pippa.it (internal LAN, of course).
Now when I send mail from a workstation to linux.it, the mail is correctly tagged as
spam, but when I send mail to exchange.it nothing happens; spamd is simply not
called. See the logs here:

Dec 11 11:22:02 geppetto sm-mta[16931]: hBBAM2UF016931: from=<[EMAIL PROTECTED]>,
size=480, class=0, nrcpts=1, msgid=<[EMAIL PROTECTED]>,
proto=SMTP, daemon=MTA, relay=revolution.cippa.it [10.10.2.3]
Dec 11 11:22:02 geppetto spamd[11997]: connection from localhost [127.0.0.1] at
port 33004 
Dec 11 11:22:02 geppetto spamd[16936]: info: setuid to pacho succeeded 
Dec 11 11:22:02 geppetto spamd[16936]: processing message
<[EMAIL PROTECTED]> for pacho:1001. 
Dec 11 11:22:02 geppetto spamd[16936]: identified spam (750.4/5.0) for
pacho:1001 in 0.3 seconds, 747 bytes. 
Dec 11 11:22:03 geppetto sm-mta[16932]: hBBAM2UF016931: to=<[EMAIL PROTECTED]>,
ctladdr=<[EMAIL PROTECTED]> (1001/1001), delay=00:00:01, xdelay=00:00:01,
mailer=local, pri=30675, dsn=2.0.0, stat=Sent



No spamd when mail is sent to pipa.it:

Dec 11 11:22:51 geppetto sm-mta[16937]: hBBAMpUF016937: from=<[EMAIL PROTECTED]>,
size=478, class=0, nrcpts=1, msgid=<[EMAIL PROTECTED]>,
proto=SMTP, daemon=MTA, relay=revolution.cippa.it [10.10.2.3]
Dec 11 11:22:52 geppetto sm-mta[16939]: hBBAMpUF016937: to=<[EMAIL PROTECTED]>,
ctladdr=<[EMAIL PROTECTED]> (1001/1001), delay=00:00:01, xdelay=00:00:01,
mailer=smtp, pri=120478, relay=rellay.pipa.it. [10.10.0.125], dsn=2.0.0,
stat=Sent ( <[EMAIL PROTECTED]> Queued mail for
delivery)


where is the problem?

tnx, pacho











[SAtalk] Re: sven and spam rule

2003-12-10 Thread Nancy McGough
On 10 Dec 2003 stephane ancelot ([EMAIL PROTECTED]) wrote:
> are there any rules to avoid sven messages ?

If your mail flows through Procmail, I recommend Dallman Ross's
elegant and efficient Virus Snaggers procmail recipes. I describe
how to get and install them here:

 

-- 
Nancy McGough
Infinite Ink ~ 
Deflexion & Reflexion ~ 





[SAtalk] Installation Help

2003-12-10 Thread Rahul Baweja



Hi,
 
How can I install SpamAssassin with Exim 3.35 on Linux 7?
 
Rahul


[SAtalk] build problem with spamassassin-2.61-1.src.rpm

2003-12-10 Thread Tayfun Can
I'm trying to build SpamAssassin from the source RPM as suggested.
However, when I execute 

$ rpmbuild --rebuild spamassassin-2.61-1.src.rpm

It fails with 

Checking for unpackaged file(s): /usr/lib/rpm/check-files
/var/tmp/spamassassin-root
error: Installed (but unpackaged) file(s) found:
   /usr/lib/perl5/5.8.0/i386-linux-thread-multi/perllocal.pod

Any ideas?




Re: [SAtalk] build problem with spamassassin-2.61-1.src.rpm

2003-12-10 Thread tibyke

_unpackaged_files_terminate_build 0
_missing_doc_files_terminate_build 0

t

>Checking for unpackaged file(s): /usr/lib/rpm/check-files
>/var/tmp/spamassassin-root
>error: Installed (but unpackaged) file(s) found:
>   /usr/lib/perl5/5.8.0/i386-linux-thread-multi/perllocal.pod




[SAtalk] Preparing a spam corpus?

2003-12-10 Thread Bill
I currently run SA in the mode where it places spam in an attachment to the
report. Right now I extract the attachment and move it to an IMAP folder
where all the FP mail is kept prior to learning. Missed spam gets dragged
into another folder directly.

I am getting ready to move to another server and would like to set up a spam
corpus for the new server. I have several thousand spams saved in an Outlook
folder. Do I need to extract each of those spams from the report, or can I
submit the encapsulated report message? Will SA strip the added
headers/encapsulation?

TIA,
Bill





[SAtalk] Upgrading 2.60 to 2.61

2003-12-10 Thread Alan Munday
I've hit a problem upgrading to 2.61... (on RedHat 8)

I think the root of the problem is that, having installed 2.60, I then
proceeded to install Razor2. Instead of installing the razor-agents-sdk
package, I tried installing the Perl modules from CPAN. When this failed
due to Perl dependency errors, I went back to trying to build from the
tarball. Then I found the documented issue with Digest::SHA1 and failed to
get Razor installed.

I therefore think that this Digest::SHA1 module is the cause of the upgrade
problem but I don't know enough about these modules to fix it.

If you can shed some light I would be grateful.

Thanks

Alan






Running make test
PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e"
"test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/basic_lint................ok
t/cidrs.....................ok
t/db_awl_path...............ok
t/db_based_whitelist........ok
t/db_based_whitelist_ips....ok
t/forged_rcvd...............ok
t/gtube.....................ok
t/html_obfu.................ok
t/lang_pl_tests.............ok
t/nonspam...................ok
t/razor2....................skipped
all skipped: no reason given
t/recips....................ok
t/reportheader..............ok
t/rule_tests................ok
t/sha1......................Use of inherited AUTOLOAD for non-method
Digest::SHA1::sha1_hex() is deprecated at t/sha1.t line 34.
Can't locate auto/Digest/SHA1/sha1_hex.al in @INC (@INC contains: lib
../blib/lib /root/.cpan/build/Mail-SpamAssassin-2.61/blib/lib
/root/.cpan/build/Mail-SpamAssassin-2.61/blib/arch
/usr/lib/perl5/5.8.0/i386-linux-thread-multi /usr/lib/perl5/5.8.0
/usr/lib/perl5/site_perl/5.8.0/i386-linux-thread-multi
/usr/lib/perl5/site_perl/5.8.0 /usr/lib/perl5/site_perl
/usr/lib/perl5/vendor_perl/5.8.0/i386-linux-thread-multi
/usr/lib/perl5/vendor_perl/5.8.0 /usr/lib/perl5/vendor_perl .)
at t/sha1.t line 33
t/sha1......................dubious
Test returned status 255 (wstat 65280, 0xff00)
DIED. FAILED tests 1-15
Failed 15/15 tests, 0.00% okay
t/spam......................ok
t/spamc.....................ok
t/spamc_B...................ok
t/spamc_c...................ok
t/spamc_c_stdout_closed.....ok
t/spamd.....................ok
t/spamd_allow_user_rules....ok
t/spamd_hup.................ok
t/spamd_maxchildren.........ok
t/spamd_maxsize.............ok
t/spamd_port................ok
t/spamd_protocol_10.........ok
t/spamd_report..............ok 5/8  Not found: habeas = HABEAS_SWE
# Failed test 7 in t/SATest.pm at line 388
t/spamd_report..............FAILED test 7
Failed 1/8 tests, 87.50% okay
t/spamd_report_ifspam.......ok
t/spamd_stop................ok
t/spamd_symbols.............ok
t/spamd_unix................ok
t/spamd_utf8................ok
t/strip2....................ok
t/stripmarkup...............ok
t/utf8......................ok
t/whitelist_addrs...........ok
t/whitelist_to..............Use of inherited AUTOLOAD for non-method
Digest::SHA1::sha1_hex() is deprecated at
../blib/lib/Mail/SpamAssassin/SHA1.pm line 57.
t/whitelist_to..............ok
t/zz_cleanup................ok
Failed Test       Stat Wstat Total Fail  Failed  List of Failed
---------------------------------------------------------------
t/sha1.t           255 65280    15   15 100.00%  1-15
t/spamd_report.t                 8    1  12.50%  7
1 test skipped.
Failed 2/39 test scripts, 94.87% okay. 16/281 subtests failed, 94.31% okay.
make: *** [test_dynamic] Error 29
  /usr/bin/make test -- NOT OK
Running make install
  make test had returned bad status, won't install without force






[SAtalk] rule match counting

2003-12-10 Thread Stephen M. Przepiora
Hello, I have constructed a huge list of rules and wish to detect how 
good they are. Is there a way to log the count of rule matches somewhere?

Steve



Re: [SAtalk] build problem with spamassassin-2.61-1.src.rpm

2003-12-10 Thread Kenneth Porter
--On Wednesday, December 10, 2003 7:22 AM -0500 Tayfun Can 
<[EMAIL PROTECTED]> wrote:

> I'm trying to build SpamAssassin from the source RPM as suggested.
> However, when I execute
>
> $ rpmbuild --rebuild spamassassin-2.61-1.src.rpm
>
> It fails with
>
> Checking for unpackaged file(s): /usr/lib/rpm/check-files
> /var/tmp/spamassassin-root
> error: Installed (but unpackaged) file(s) found:
>    /usr/lib/perl5/5.8.0/i386-linux-thread-multi/perllocal.pod

Just downloaded the tarball and ran:

rpmbuild -ta Mail-SpamAssassin-2.61.tar.gz

The unpackaged file is reported as a warning, not an error. (Red Hat 8.)

Maybe the tarball was fixed? I downloaded at about 7:05 am PST from 
spamassassin.rediris.es.



[SAtalk] SA tests performed

2003-12-10 Thread Barb Bautista
Newbie question...sorry for my ignorance.

What do I do with the "tests performed" available here:
http://www.spamassassin.org/tests.html

Could someone please explain if I should just copy this file into a .cf file
in my local.cf?  I am currently running SA site-wide.

This is probably a silly question, but I'd appreciate your help.

Thanks.





Re: [SAtalk] Preparing a spam corpus?

2003-12-10 Thread Matt Kettler
At 08:13 AM 12/10/03 -0500, Bill wrote:
> Do I need to extract each of those spams from the report or can I
> submit the encapsulated report message? Will SA strip the added
> headers/encapsulation?

Read the FAQ:
http://wiki.spamassassin.org/w/LearningMarkedUpMessages

In short, SA will auto-remove its own markup.





Re: [SAtalk] filtering spam tagged email before hitting exchange 2000

2003-12-10 Thread JRiley



SA does have the ability to filter (block/discard) if so configured,
basically by just setting SA to delete any incoming scanned msg with a
score of 5+ (the default score level).

As far as setting up a whitelist on a win32 implementation of SA, read the
SA docs and/or visit some sites with configuration information.
 
-JR
http://www.spamfighter.org
http://spamstats.systemnt.net

  - Original Message -
  From: Efren Pedroza
  To: [EMAIL PROTECTED]
  Sent: Tuesday, December 09, 2003 1:34 PM
  Subject: RE: [SAtalk] filtering spam tagged email before hitting exchange 2000

  I don't know why you are saying that SA does not filter e-mails. It does!
  I'm very new to this, but I did install SA on the same server where
  Exchange 2000 runs and it's doing well.

  The only matter is that I can't find a way to make a whitelist; SA is
  tagging valid e-mails as SPAM. Can someone help me with this?

  Saludos
  Efren Pedroza Huerta

  -Original Message-
  From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of gentian
  Sent: Monday, December 08, 2003 7:50 AM
  To: [EMAIL PROTECTED]
  Subject: [SAtalk] filtering spam tagged email before hitting exchange 2000

  Hi list,

  I am very new to spamassassin and I want to set up a gateway for external
  spam and then relay mail from spamassassin to Exchange 2000. I read that
  spamassassin just tags the mail, doesn't filter it, and that I should
  filter it on Exchange 2000 with some other tools. The problem is that I
  do not want to mess around with Exchange 2000 and install other stuff
  there. It already has enough load and problems, so I was wondering if
  there is any way to filter tagged email before it hits Exchange 2000,
  something that filters it on the same machine where spamassassin lives.

  Any idea is appreciated.

  Thanx in advance

  Gentian


Re: [SAtalk] Log Help!

2003-12-10 Thread Matt Kettler
At 09:58 AM 12/10/03 +0200, Ryan Lumsden wrote:

> how do I get spamd to log to a different file besides messages and mail.log.

Edit your /etc/syslog.conf and use spamd's -s parameter to change which
syslog facility it uses.

Spamd isn't writing to any files at all; it's just doing standard
unix-style system logging, and it's up to the system log daemon to decide
where the messages go. I'd strongly recommend getting familiar with syslog,
since 90% of unix daemons use syslog.

Basically, in the syslog world, a program sends a log message and specifies
a "facility" (i.e. general source) and a "level" (i.e. critical, warning,
informational, debug, etc). By default spamd, like most mail daemons, uses
the facility 'mail', so they all get written to the same file.

Most linux systems have a manpage for syslog.conf; you might want to look
at that.

Assuming an "old-style" syslog daemon (and not something newer like
syslog-ng), if I wanted spamd to log to its own file exclusively, I'd find
a local facility that's not being used...

set spamd to log to that:

   spamd -s local3

then I'd configure syslogd to dump it somewhere:

   local3.* /var/log/spamd.log
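
To make the facility/level idea concrete, here is a minimal Python sketch (it is not code from spamd or syslogd). It uses the numeric codes of the BSD syslog protocol (RFC 3164) and a deliberately simplified selector match. The names `pri` and `selector_matches` are made up for this illustration, and real syslog.conf selectors support more syntax than is handled here.

```python
# Facility and severity codes from the BSD syslog protocol (RFC 3164).
FACILITIES = {
    "kern": 0, "user": 1, "mail": 2, "daemon": 3, "auth": 4,
    "local0": 16, "local1": 17, "local2": 18, "local3": 19,
    "local4": 20, "local5": 21, "local6": 22, "local7": 23,
}
SEVERITIES = {
    "emerg": 0, "alert": 1, "crit": 2, "err": 3,
    "warning": 4, "notice": 5, "info": 6, "debug": 7,
}


def pri(facility, severity):
    """Return the numeric priority sent at the front of a syslog message."""
    return FACILITIES[facility] * 8 + SEVERITIES[severity]


def selector_matches(selector, facility, severity):
    """Simplified 'facility.level' match, as in a classic syslog.conf line.

    Assumption: only a single facility (or '*') and a single level (or '*')
    are handled; lists, '=', '!' and 'none' are left out of this sketch.
    """
    sel_facility, sel_level = selector.split(".", 1)
    if sel_facility not in ("*", facility):
        return False
    if sel_level == "*":
        return True
    # A level selects that severity and everything more severe
    # (lower numbers are more severe).
    return SEVERITIES[severity] <= SEVERITIES[sel_level]


if __name__ == "__main__":
    print(pri("mail", "info"), pri("local3", "info"))
    print(selector_matches("local3.*", "local3", "info"))
```

In this model a message logged with `spamd -s local3` no longer matches a `mail.*` line, but it would still match a catch-all such as `*.info`. If the messages should land only in the new file, a catch-all line may need a `local3.none` exclusion.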







Re: [SAtalk] Training SA

2003-12-10 Thread Matt Kettler
At 11:45 PM 12/9/03 -0800, Justin Mason wrote:
> Oh yeah, forgot to mention I finally got around to migrating all the FAQ
> stuff onto the Wiki ;)

Heh, yeah, it caused me to go "Where the heck is that FAQ link???!!!" for
about 5 seconds before I saw the wiki one..

OK, my real impressions were a little less PG-rated than that, but they
were short-lived :)





Re: [SAtalk] SA tests performed

2003-12-10 Thread Matt Kettler
At 09:34 AM 12/10/03 -0500, Barb Bautista wrote:
> What do I do with the "tests performed" available here:
> http://www.spamassassin.org/tests.html
> Could someone please explain if I should just copy this file into a .cf file
> in my local.cf?  I am currently running SA site-wide.

Um, don't do *anything* with that file, other than peruse it for your own
information.

That's just an easy-to-read summary of the config files that come with
spamassassin. They are already installed in /usr/share/spamassassin.







[SAtalk] Help with Mark Motley's perl script

2003-12-10 Thread Lentz, Wayne
Guys,

I'm trying to use the perl script that Mark posted for feeding bayes with
mail in our Exchange 5.5 public folders.  But when I execute the script,
it gives me the error below.  I know squat about perl and google isn't
giving me much, so I'm hoping you guys can help me with this.
Thanks all.

System info:
OpenBSD 3.3 with Postfix 2.0.13
Amavisd-new with SA 2.55
Perl modules, including SA, installed via CPAN.


When I run the script it gives me this error:

Can't call method "select" on an undefined value at
/usr/local/sbin/my-msex-spam.pl line 17.


And here is my personalized version of his script (thanks much Mark):

--- start of file ---

#!/usr/bin/perl

use Mail::IMAPClient;
use Sys::Syslog;

my $FOLDER_NAME = 'Public Folders/All Public Folders/SpamLearn';
my $SEQ = 1;
openlog('pullspam','cons,pid', 'user');

my $server = Mail::IMAPClient->new(
Server => "",
User => "",
Password => "",
Uid => 1,
Debug => 0 );

$server->select($FOLDER_NAME);
my @msgs = $server->search("ALL");
foreach my $msg (@msgs) {
$server->message_to_file("/var/amavisd/spamlearn/" . $SEQ,$msg);
$server->delete_message($msg);
$SEQ++;
};
$server->expunge($FOLDER_NAME);
print "Pulled ". ($SEQ-1) . " messages from spam folder.\n";
syslog('mail|info', 'Pulled '.($SEQ-1) . ' messages from ham folder.');

--- end of file ---




Re: [SAtalk] sven and spam rule

2003-12-10 Thread Matt Kettler
At 09:48 AM 12/10/03 +0100, stephane ancelot wrote:
> Hi,
> are there any rules to avoid sven messages ?
> bye
> steph

It's not really the point of SA; however, Andreas Kotowicz posted a list of
rules that appear to work well.

My only criticism of this ruleset is that he forgot to name all the 
sub-rules starting with double-underscore, so MDS_Swen_A_* get 1 point each.

I would however suggest adding clamav, or another virus scanner as the best 
way of detecting these.

In any event, here are his rules (line wraps undone):

header   MDS_Swen_A_0  From =~ /(email|inet|internet|mail|microsoft|ms|net|network)/i
header   MDS_Swen_A_1  From =~ /(section|service|system)/i
header   MDS_Swen_A_2  From =~ /^\s*(admin|administrator)\s*$/i
header   MDS_Swen_A_3  Subject =~ /^\s*(advice|announcement|failure\s+report|letter|mail|notice|report)\s*$/i
header   MDS_Swen_A_4  Subject =~ /^\s*(abort|bug|error)\s+\S+/i
header   MDS_Swen_A_5  Subject =~ /^.*\s+(advice|announcement|letter|message|notice)\s*$/i
header   MDS_Swen_A_6  Subject =~ /^\s*(mail:\s+|message|(returned|undeliverable|undelivered)\s+(mail|message))/i
header   MDS_Swen_A_7  Subject =~ /^\s*$/
header   MDS_Swen_A_8  Subject =~ /^(critical|current|internet|last|latest|microsoft|net|network|new|newest|security)\s+/i
header   MDS_Swen_A_9  Subject =~ /^.*\s+(pack|patch|update|upgrade)/i
meta     MDS_Swen_A  ((( MDS_Swen_A_0 && MDS_Swen_A_1 ) || MDS_Swen_A_2 ) || ( MDS_Swen_A_3 || ( MDS_Swen_A_4 && MDS_Swen_A_5 ) || MDS_Swen_A_6 || MDS_Swen_A_7 ) || ( MDS_Swen_A_8 && MDS_Swen_A_9 ))
describe MDS_Swen_A  MDS - Swen A worm
score    MDS_Swen_A  +10.0
#quick scoring bugfixes by mk:
score MDS_Swen_A_0  0.01
score MDS_Swen_A_1  0.01
score MDS_Swen_A_2  0.01
score MDS_Swen_A_3  0.01
score MDS_Swen_A_4  0.01
score MDS_Swen_A_5  0.01
score MDS_Swen_A_6  0.01
score MDS_Swen_A_7  0.01
score MDS_Swen_A_8  0.01
score MDS_Swen_A_9  0.01
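
For readers who want to see how the meta rule combines its sub-rules, here is a rough Python sketch of the same boolean expression. It is only an illustration, not how SpamAssassin evaluates rules internally. The function name `swen_a` and the plain dict of headers are assumptions made for this example, and a missing header is treated as a non-match.

```python
import re

# (header, pattern) pairs copied from the MDS_Swen_A_0 .. _9 sub-rules.
SUB_RULES = {
    "A_0": ("From", r"(email|inet|internet|mail|microsoft|ms|net|network)"),
    "A_1": ("From", r"(section|service|system)"),
    "A_2": ("From", r"^\s*(admin|administrator)\s*$"),
    "A_3": ("Subject", r"^\s*(advice|announcement|failure\s+report|letter|mail|notice|report)\s*$"),
    "A_4": ("Subject", r"^\s*(abort|bug|error)\s+\S+"),
    "A_5": ("Subject", r"^.*\s+(advice|announcement|letter|message|notice)\s*$"),
    "A_6": ("Subject", r"^\s*(mail:\s+|message|(returned|undeliverable|undelivered)\s+(mail|message))"),
    "A_7": ("Subject", r"^\s*$"),
    "A_8": ("Subject", r"^(critical|current|internet|last|latest|microsoft|net|network|new|newest|security)\s+"),
    "A_9": ("Subject", r"^.*\s+(pack|patch|update|upgrade)"),
}


def swen_a(headers):
    """Evaluate the MDS_Swen_A meta expression against a dict of headers."""
    hit = {}
    for name, (header, pattern) in SUB_RULES.items():
        value = headers.get(header)
        # Assumption for this sketch: an absent header never matches.
        hit[name] = value is not None and bool(re.search(pattern, value, re.I))
    return (
        ((hit["A_0"] and hit["A_1"]) or hit["A_2"])
        or (hit["A_3"] or (hit["A_4"] and hit["A_5"]) or hit["A_6"] or hit["A_7"])
        or (hit["A_8"] and hit["A_9"])
    )


if __name__ == "__main__":
    print(swen_a({"From": "Microsoft Network Security Section <x@example.com>",
                  "Subject": "New Security Patch"}))
```

Note that, as written, the expression is a chain of ORs, so a single group (for example an empty Subject) is enough to trigger the 10-point meta rule.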




Re: [SAtalk] SA tests performed

2003-12-10 Thread Terry Milnes
Barb Bautista wrote:
> Newbie question...sorry for my ignorance.
>
> What do I do with the "tests performed" available here:
> http://www.spamassassin.org/tests.html
> Could someone please explain if I should just copy this file into a .cf file
> in my local.cf?  I am currently running SA site-wide.

That's just a web page that displays the built-in tests that
spamassassin currently uses.  No need to add anything from there.

tm.





[SAtalk] Using dccifd instead of dccproc

2003-12-10 Thread Smart,Dan

The DCC documentation says that the dccifd interface is much more efficient
than dccproc.  I see from doing a spamassassin -D that it looks for it.  

Is there any install procedure for dccifd, and should this be the generally
recommended interface for dcc?

Why or why not?

TIA


[SAtalk] Wacky postmaster whitelist questions

2003-12-10 Thread Josh Endries
Hey guys and gals,

I get a lot of postmaster emails, and I'm trying to whitelist them so 
they aren't marked as spam. Even though many are bounces due to spam, I 
would like to whitelist them so I don't miss any legit emails. I turned 
bayes off because it learned these as spam. Anyway I haven't figured out 
a way that works. I'm currently running this:

whitelist_from_rcvd * localhost

I have tried whitelist_to *postmaster*, whitelist_from *MAILER-DAEMON*, 
and a bunch of other methods but none seem to work. This most recent 
version was more out of frustration than anything. Here are the headers 
my server gives me for postmaster emails:

Return-Path: 
Received: from localhost (localhost) by 
From: Mail Delivery Subsystem 
To: postmaster
Any ideas what I could/should whitelist? Maybe my syntax is incorrect.

Thanks!

--
Josh


Re: [SAtalk] rule match counting

2003-12-10 Thread Matt Kettler
At 10:09 AM 12/10/03 -0500, Stephen M. Przepiora wrote:

> Hello, I have constructed a huge list of rules and wish to detect how good
> they are. Is there a way to log the count of rule matches somewhere?

If you've got a spam/ham corpus, you can test your rules using the tools in
the masses/ subdirectory of the tarball.

You'll want to use mass-check and hit-frequencies.

The rule guide has a very short note about it at the bottom (section 3.4),
but I've not added an example run yet. It's my intention to write a
separate guide for corpuses, mass-check, etc.

http://mywebpages.comcast.net/mkettler/sa/SA-rules-howto.txt





Re: [SAtalk] Re: Help with DCC setup for use with Spamassaian

2003-12-10 Thread stan
On Wed, Dec 10, 2003 at 01:22:23AM -0500, Bryan Hoover wrote:
> stan wrote:
> > Yes, I just reviewed the firewall config. It will pass all traffic
> > originating on the inside. I see that may not be a good general case, but
> > it should be OK here (small home network).
> > 
> > BTW, I decided to try (briefly) disabling all packet firewalling. Guess
> > what? cdcc still says "No servers responding".
> 
> Did you try just pinging one or more of the DCC servers?

Yes, as a matter of fact, see this traceroute:


[EMAIL PROTECTED]:~$ traceroute -p 6277 dcc1.dcc-servers.net
traceroute: Warning: dcc1.dcc-servers.net has multiple addresses; using 208.201.249.232
traceroute to dcc.dcc-servers.net (208.201.249.232), 30 hops max, 38 byte packets
 1  koala (205.159.77.234)  1.053 ms  0.757 ms  3.248 ms
 2  10.116.72.1 (10.116.72.1)  9.190 ms  22.048 ms  33.081 ms
 3  172.30.75.81 (172.30.75.81)  9.644 ms  9.809 ms  8.370 ms
 4  172.30.75.122 (172.30.75.122)  8.596 ms  27.574 ms  9.224 ms
 5  12.124.58.77 (12.124.58.77)  19.150 ms  18.641 ms  20.646 ms
 6  gbr5-p80.attga.ip.att.net (12.123.21.74)  21.172 ms  32.190 ms  19.746 ms
 7  tbr2-p013501.attga.ip.att.net (12.122.12.41)  21.487 ms  27.284 ms  24.384 ms
 8  ggr1-p370.attga.ip.att.net (12.123.20.253)  21.536 ms  21.056 ms  20.245 ms
 9  att-gw.ny.cw.net (192.205.32.118)  21.074 ms  21.752 ms  20.048 ms
10  dcr1-loopback.SantaClara.cw.net (208.172.146.99)  82.517 ms  82.504 ms  98.134 ms
11  bpr1-so-0-0-0.SanJoseEquinix.cw.net (208.173.54.65)  81.197 ms  79.956 ms  106.910 ms
12  208.173.54.46 (208.173.54.46)  77.201 ms  76.175 ms  76.272 ms
13  fast5-0-0.border.sr.sonic.net (64.142.0.13)  83.196 ms  512.588 ms  216.541 ms
14  fast0-1.dist2-1.sr.sonic.net (208.201.224.160)  84.227 ms  81.727 ms  89.809 ms
15  eth0.d.spam.sonic.net (208.201.249.232)  108.510 ms  85.306 ms  88.993 ms
[EMAIL PROTECTED]:~$ 
Script done on Tue Dec  9 08:58:31 2003
> 
> Also, in addition to what Alex said about dccproc, dccifd is the daemon
> version.  SA will use either of them (in case you can't or don't want to
> use the daemon).

So, I don't have to run the daemon?
> 

-- 
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
-- Benjamin Franklin




[SAtalk] sa-learn mbox processing?

2003-12-10 Thread Larry Starr
I currently have mimedefang (2.37) and spamassassin (2.60) running on a RH9 
mail gateway.

Spamassassin is configured to block messages with a very high SA score and to 
tag and pass along everything else.

I have two accounts set up, on an internal server, for users to forward 
received spam, and ham to.

My question regards scripts to ease processing of these mailboxes.  Since the 
messages are forwarded, from several different Email clients (netscape, 
kmail, pine, AppleMail, etc), extracting the original message, for sa-learn 
is proving to be non-trivial.

Does anyone on the list have or know of a tool that will reliably extract an
original message from a forwarded message?
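
For the clients that forward "as attachment", the original arrives as a message/rfc822 MIME part, and that case can be handled with Python's standard `email` package. The sketch below is only a starting point under that assumption: the function name `extract_forwarded` is made up here, inline forwards (where the client pastes the text into a new body) cannot be recovered this way, and the re-serialized bytes may not be byte-for-byte identical to what the user received.

```python
from email import message_from_bytes


def extract_forwarded(raw_bytes):
    """Return the raw bytes of every message/rfc822 part found in a message.

    Works only for forward-as-attachment; an inline forward has no such part
    and yields an empty list.
    """
    outer = message_from_bytes(raw_bytes)
    originals = []
    for part in outer.walk():
        if part.get_content_type() != "message/rfc822":
            continue
        payload = part.get_payload()
        # For message/rfc822 the payload is a one-element list holding
        # the embedded message object.
        if isinstance(payload, list) and payload:
            originals.append(payload[0].as_bytes())
    return originals


if __name__ == "__main__":
    import sys
    for original in extract_forwarded(sys.stdin.buffer.read()):
        sys.stdout.buffer.write(original)
```

Each extracted message could then be written to its own file and handed to sa-learn; messages for which the list comes back empty would still need manual handling.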

Thank you,
-- 
Larry G. Starr - [EMAIL PROTECTED] or [EMAIL PROTECTED]
Software Engineer: Full Compass Systems LTD.
Phone: 608-831-7330 x 1347  FAX: 608-831-6330
===
There are only three sports: bullfighting, mountaineering and motor
racing, all the rest are merely games! - Ernest Hemmingway






Re: [SAtalk] rule match counting

2003-12-10 Thread Stephen M. Przepiora
First, thanks for the reply.

This method wouldn't work for us, as I cannot keep a copy of all mail
passing through the server for legal reasons. What I need is a way for
spamd to log the hit count for each rule as it processes the mail. This way
I can prune old rules from the system (we currently run close to 16000
rules) and cut down on the load of the mail server. At one point all four
CPUs were running full bore, so we had to cut the rules in half (down to
16000). Of course this resulted in an increase in spam received by our
users.

My response to this was to find a way to log the hit count for a rule 
and retire our rules that don't do much.
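
One possible way to get such counts without keeping the mail itself is to tally the rule names from the X-Spam-Status header as messages pass through. The Python sketch below assumes the header carries a comma-separated `tests=` list (possibly folded across lines) and that retaining just that header value is acceptable; `count_rule_hits` is a name made up for this example, and how the header values are collected is left open.

```python
import re
from collections import Counter


def count_rule_hits(status_values):
    """Tally rule names from an iterable of X-Spam-Status header values."""
    hits = Counter()
    for value in status_values:
        # Undo header folding inside the comma-separated tests list.
        unfolded = re.sub(r",\s+", ",", value)
        match = re.search(r"\btests=(\S*)", unfolded)
        if not match:
            continue
        for name in match.group(1).split(","):
            if name and name != "none":
                hits[name] += 1
    return hits


if __name__ == "__main__":
    sample = [
        "Yes, hits=9.6 required=5.0 tests=EMAIL_ROT13,OBSCURED_EMAIL,\n"
        "\tRAZOR2_CHECK version=2.61",
        "No, hits=0.0 required=5.0 tests=none version=2.61",
    ]
    for rule, count in count_rule_hits(sample).most_common():
        print(count, rule)
```

Rules that never show up in the tally over a long enough period would be the candidates for retirement.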

Most of the rules that are very effective for us are URI match rules.
Almost all spam comes through with a URL to click on, or a URL to load
an image from. Most spammers have multiple domains they use, and rotate
through them. We match these domains and add 2 to the score. I cannot
stress enough how effective this is. It cuts the spam down from
100-200 a day to 2-3 a day per user (obviously some users do not get
quite as much).

The other way I have to prune rules is to remove a rule after the domain
name has expired. This isn't as accurate in some ways, because
the domain wouldn't get removed for a year. However, if the domain has
expired, it definitely doesn't belong as a rule.

Steve

Matt Kettler wrote:

> At 10:09 AM 12/10/03 -0500, Stephen M. Przepiora wrote:
>
>> Hello, I have constructed a huge list of rules and wish to detect how
>> good they are. Is there a way to log the count of rule matches
>> somewhere?
>
> If you've got a spam/ham corpus, you can test your rules using the
> tools in the masses/ subdirectory of the tarball.
>
> You'll want to use mass-check and hit-frequencies.
>
> The rule guide has a very short note about it at the bottom (section
> 3.4), but I've not added an example run yet. It's my intention to
> write a separate guide for corpuses, mass-check, etc.
>
> http://mywebpages.comcast.net/mkettler/sa/SA-rules-howto.txt





[SAtalk] non-numeric atime in Bayes db? (SA 2.61)

2003-12-10 Thread Gary Funck

Hello,

after running a spam refiling script which invokes 'spamassassin -r', I
received the following diagnostics:

/usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/Conf.pm line 362.
Argument "" isn't numeric in numeric lt (<) at
/usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/BayesStore.pm line 1248.
Argument "" isn't numeric in numeric lt (<) at
/usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/BayesStore.pm line 1248.
[repeated 20 or more times]

Here's the offending line:

   1247 my $oldmagic = $self->{db_toks}->{$OLDEST_TOKEN_AGE_MAGIC_TOKEN};
   1248 if (!defined ($oldmagic) || $atime < $oldmagic) {
   1249   $self->{db_toks}->{$OLDEST_TOKEN_AGE_MAGIC_TOKEN} = $atime;
   1250 }

and the dump of the magic numbers:

0.000  0          2  0  non-token data: bayes db version
0.000  0      45244  0  non-token data: nspam
0.000  0      16623  0  non-token data: nham
0.000  0     477886  0  non-token data: ntokens
0.000  0          0  0  non-token data: oldest atime
0.000  0 1916183140  0  non-token data: newest atime
0.000  0 1071074447  0  non-token data: last journal sync atime
0.000  0 1071045963  0  non-token data: last expiry atime
0.000  0      43200  0  non-token data: last expire atime delta
0.000  0          0  0  non-token data: last expire reduction count

Things had been working fine up to this point.

Should I try an 'sa-learn --rebuild' at this point?
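
As a quick sanity check on a dump like this, the atime values can be read as Unix timestamps and compared with the current time. The Python sketch below does only that; it does not diagnose the warning itself. The function name `check_atimes` and the choice of what counts as suspicious (zero, or later than "now") are assumptions for this illustration.

```python
from datetime import datetime, timezone


def check_atimes(magic, now):
    """Return (name, reason) pairs for atime values that look odd.

    `magic` maps a label from the dump to its integer value; `now` is the
    current time as a Unix timestamp.
    """
    problems = []
    for name, value in magic.items():
        if value == 0:
            problems.append((name, "zero / unset"))
        elif value > now:
            day = datetime.fromtimestamp(value, timezone.utc).strftime("%Y-%m-%d")
            problems.append((name, "in the future: " + day))
    return problems


if __name__ == "__main__":
    import time
    dump = {
        "oldest atime": 0,
        "newest atime": 1916183140,
        "last journal sync atime": 1071074447,
        "last expiry atime": 1071045963,
    }
    for name, reason in check_atimes(dump, now=int(time.time())):
        print(name, "->", reason)
```

Run against the values above with a December 2003 clock, this flags the zero "oldest atime" and a "newest atime" that falls in the year 2030, which may point to a message with a badly wrong date having been learned.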






Re: [SAtalk] Using dccifd instead of dccproc

2003-12-10 Thread Bill Landry
It should already be installed at /var/dcc/libexec/dccifd (depending on your
./configure parameters).  All you have to do is set up the config files at
/var/dcc/dcc_conf and /var/dcc/libexec/start-dccifd, then execute
start-dccifd and you should be good to go.

Oh, and it appears to run faster in our environment than dccproc, since it
does not need to be instantiated for each message scanned.

Bill
- Original Message -
From: "Smart,Dan" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, December 10, 2003 7:57 AM
Subject: [SAtalk] Using dccifd instead of dccproc

> The DCC documentation says that the dccifd interface is much more efficient
> than dccproc.  I see from doing a spamassassin -D that it looks for it.
>
> Is there any install procedure for dccifd, and should this be the generally
> recommended interface for dcc?
>
> Why or why not?
>
> TIA





Re: [SAtalk] Wacky postmaster whitelist questions

2003-12-10 Thread Evan Platt
--On Wednesday, December 10, 2003 10:54 AM -0500 Josh Endries
<[EMAIL PROTECTED]> wrote:

> I get a lot of postmaster emails, and I'm trying to whitelist them so
> they aren't marked as spam. Even though many are bounces due to spam, I
> would like to whitelist them so I don't miss any legit emails. I turned
> bayes off because it learned these as spam. Anyway I haven't figured out
> a way that works. I'm currently running this:
> 
> whitelist_from_rcvd * localhost

How are you calling SpamAssassin? Why not just (assuming you're using
procmail), create a procmail rule?




Re: [SAtalk] Razor and Spamassassin vs spamc

2003-12-10 Thread Adam D. Lopresto
Are you running spamd as yourself, or as another user (nobody, or root, or
whatever)?  It sounds quite likely that the user you're running spamd as doesn't
have a ~/.razor/ set up, but your own user does.

On Mon, 8 Dec 2003, Mark Norton wrote:

> Any reason why I would get razor results against spamassassin and none when
> using the spamc client?
>
> I've been running the spamc client and have not seen any hits against any of
> the spam. After replacing spamc with spamassassin the second message through
> reported this:
>
> Content analysis details:   (9.6 points, 5.0 required)
>
>  pts rule name              description
> ---- ---------------------- --------------------------------------------------
>  2.8 OBSCURED_EMAIL         BODY: Message seems to contain rot13ed address
>  4.3 EMAIL_ROT13            BODY: Body contains a ROT13-encoded email address
>  1.6 RAZOR2_CF_RANGE_51_100 BODY: Razor2 gives confidence between 51 and 100
>                             [cf: 100]
>  0.0 LINES_OF_YELLING       BODY: A WHOLE LINE OF YELLING DETECTED
>  0.9 RAZOR2_CHECK           Listed in Razor2 (http://razor.sf.net/)
>
> I prefer using the spamc client. Any ideas??
>
> Regards,
> Mark Norton

-- 
Adam Lopresto
http://cec.wustl.edu/~adam/

"They say time is the fire in which we burn."
--Solan
"Anyone have any marshmallows?"
--Smeg47




RE: [SAtalk] non-numeric atime in Bayes db? (SA 2.61)

2003-12-10 Thread Gary Funck

> 
> should I try an 'sa-learn --rebuild' at this point?
> 
> 

follow-up. 'sa-learn --rebuild' just printed out more of these messages:

> /usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/Conf.pm line 362.
> Argument "" isn't numeric in numeric lt (<) at
> /usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/BayesStore.pm line 1248.
> Argument "" isn't numeric in numeric lt (<) at
> /usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/BayesStore.pm line 1248.





RE: [SAtalk] Help with Mark Motley's perl script

2003-12-10 Thread Scott Harris

> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Lentz, Wayne
> Sent: Wednesday, December 10, 2003 7:38 AM
> To: '[EMAIL PROTECTED]'
> Subject: [SAtalk] Help with Mark Motley's perl script
> 
> Guys,
> 
> I'm trying to use the perl script that Mark posted for
> feeding bayes with mail in our Exchange 5.5 public folders.
> But when I execute the script, it gives me the error
> below.  I know squat about perl and google isn't giving me
> much, so I'm hoping you guys can help me with this.
> Thanks all.
> 
> System info:
> OpenBSD 3.3 with Postfix 2.0.13
> Amavisd-new with SA 2.55
> Perl modules, including SA, installed via CPAN.
> 
> 
> When I run the script it gives me this error:
> 
> Can't call method "select" on an undefined value at 
> /usr/local/sbin/my-msex-spam.pl line 17.
> 
> 
> And here is my personalized version of his script (thanks much Mark):
> 
> --- start of file ---
> 
> #!/usr/bin/perl
> 
> use Mail::IMAPClient;
> use Sys::Syslog;
> 
> my $FOLDER_NAME = 'Public Folders/All Public Folders/SpamLearn';
> my $SEQ = 1;
> openlog('pullspam','cons,pid', 'user');
> 
> my $server = Mail::IMAPClient->new(
> Server => "",
> User => "",
> Password => "",
> Uid => 1,
> Debug => 0 );
> 
> $server->select($FOLDER_NAME);


I was having trouble with the above my $server line when
I was trying to get the code to run.  The version of this
script I'm running looks like this:

# login to imap server
my $imap = Mail::IMAPClient->new (Server=>$imapserver, User=>$uid,
Password=>$pwd, Debug=>$debug)
or die "Can't connect to [EMAIL PROTECTED]: $@ $\n";

When I tried the above version, it seemed to get confused that there
was both a User and a Uid field; it kept failing the login.

Scott


I'm using the perl script found at:
http://marc.theaimsgroup.com/?l=spamassassin-talk&m=104806917615490&w=2
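
The `or die` in the snippet above makes a failed connection stop the script immediately instead of surfacing later as "Can't call method ... on an undefined value". For comparison only, here is a rough Python sketch of the same pull-and-delete loop that checks every status. It is not a port of Mark's script: the function name `pull_folder` is made up, `conn` is assumed to be an already logged-in `imaplib.IMAP4`-style object, and the destination directory is assumed to exist.

```python
import os


def pull_folder(conn, folder, dest_dir):
    """Save each message in `folder` to dest_dir/1, 2, ... and delete it.

    Every IMAP status is checked so that a failure stops the run with a
    readable message instead of showing up later as an unrelated error.
    """
    # imaplib does not quote mailbox names, and this one contains spaces.
    typ, data = conn.select('"%s"' % folder)
    if typ != "OK":
        raise RuntimeError("cannot select %r: %r" % (folder, data))

    typ, data = conn.uid("SEARCH", None, "ALL")
    if typ != "OK":
        raise RuntimeError("search failed: %r" % (data,))
    uids = data[0].split() if data and data[0] else []

    saved = 0
    for uid in uids:
        uid = uid.decode("ascii")
        typ, parts = conn.uid("FETCH", uid, "(RFC822)")
        if typ != "OK" or not parts or not isinstance(parts[0], tuple):
            raise RuntimeError("fetch of UID %s failed" % uid)
        saved += 1
        with open(os.path.join(dest_dir, str(saved)), "wb") as fh:
            fh.write(parts[0][1])  # raw RFC 822 bytes of the message
        conn.uid("STORE", uid, "+FLAGS", "(\\Deleted)")

    conn.expunge()
    return saved
```

A caller would create the connection with `imaplib.IMAP4(host)` followed by `login(user, password)`, both of which raise an exception on failure, and then pass the object to `pull_folder`.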





[SAtalk] Using sa-learn in a site-wide configuration

2003-12-10 Thread Stephen Westrip
Dear All,

I am trying to set up sa-learn in a site-wide configuration. I have a Red
Hat 9 server, SpamAssassin 2.61 and MIMEDefang 2.39. I have got SA to work
fine and our spam has dropped considerably, but I would also like to use
sa-learn.

What exactly do I need to do to make this work? I have read lots about
adding 'bayes_auto_learn 1' and other bits and pieces to put in the cf file
but whatever I try the Bayes DB never gets added to.

Many thanks,
Stephen Westrip
MetaFour UK

Re: [SAtalk] filtering spam tagged email before hitting exchange 2000

2003-12-10 Thread Bob Apthorpe
Hi,

[N.B. Reformatted into a sensible whole. Please trim your posts, line
wrap and (I know this sounds petty) please don't top post if you expect
follow-ups. Thanks.]

On Wed, 10 Dec 2003 09:22:06 -0600 "JRiley" <[EMAIL PROTECTED]> wrote:
>   From: Efren Pedroza 
> > On Behalf Of gentian
> > > I am very new to spamassassin and i want to setup a gateway for external
> > > spam and then i want to relay mail from spamassassin to Exchange 2000. I
> > > read that spamassassin just tags the mail, doesn'filter it and i should
> > > filter it on Exhange 2000 and that was done by some other tools. The
> > > problems is that i do not want to mess around with Exchange 2000 and
> > > install other stuff in there. It has already enough load and problems so
> > > I was wondering if there is any way to filter tagged email before it
> > > hits Exchange 2000, something that filters it at at tehe same machine
> > > where spamassassin lives. 
> > >
> > > Any idea is apprecciated.

Check the mailing list archives. If you have a spare reliable PC, you
can build a secure, spam-filtering mail relay to pre-filter internet
traffic to your Exchange server using Sendmail+MimeDefang+SA or
Postfix+Amavisd+SA. This is a common question; it may be a FAQ, and I'm
dead sure it has been discussed on this list in the last three months.

> > I don't know why you are saying that SA does not filter e-mails; it does!
> > I'm very much a newbie at this, but I did install SA on the same server
> > where Exchange 2000 runs and it's doing well.
> > 
> > The only matter is that I can't find a way to make a whitelist; SA is
> > tagging valid e-mails as SPAM. Can someone help me with this?

Read the documentation on how to use whitelist_from and
whitelist_from_rcvd. whitelist_from_rcvd is most likely what you want.
Your whitelist will need to go in a user_prefs file or local.cf; not
sure where those are located on Win32 (are you running SA under Cygwin?)

> SA does have the ability to filter (block/discard) if so configured..
> basically by just setting SA to delete any incoming scanned msg with a
> score of 5+ (default score level).

No, SA itself does not and cannot block or discard mail, but it may
provide scoring information for other tools that block or discard mail
(e.g. procmail, milters, etc.) Think of SA as the judge and jury, some
other tool as the executioner. SA just renders an opinion and files the
paperwork. Something else pulls the trigger.
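
To make the judge/executioner split concrete, here is a minimal Python sketch of an "executioner" that only reads the verdict SA has already written into the X-Spam-Status header. The function names, the action labels and the discard threshold of 10.0 are made up for illustration; the header layout assumed is the "hits=... required=..." form that SA 2.6x writes, and other versions may label the score differently.

```python
import re
from email import message_from_string

# Matches the "hits=<score> required=<threshold>" part of X-Spam-Status.
_STATUS_RE = re.compile(r"hits=(-?\d+(?:\.\d+)?)\s+required=(-?\d+(?:\.\d+)?)")


def spam_verdict(raw_message):
    """Return (hits, required) from X-Spam-Status, or None if untagged."""
    msg = message_from_string(raw_message)
    status = str(msg.get("X-Spam-Status", ""))
    match = _STATUS_RE.search(status)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def choose_action(raw_message, discard_at=10.0):
    """Decide what to do with a message that SA has already scored."""
    verdict = spam_verdict(raw_message)
    if verdict is None:
        return "deliver"      # SA never saw it, so pass it through
    hits, required = verdict
    if hits >= discard_at:
        return "discard"      # hypothetical site policy, not an SA feature
    if hits >= required:
        return "quarantine"
    return "deliver"
```

In a real setup this decision is normally taken by procmail, a milter or the MTA itself; the sketch only shows that the blocking logic lives outside SA.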

> As far as setting up a whitelist, on a win32 implementation of SA,  read
> the SA docs, and/or visit some sites with configuration information.

If you don't find your answer in the docs, the FAQ, or the mailing list
archive, you've either got an interesting question or you're not trying
hard enough. :)

-- Bob




[SAtalk] Obfuscation by Punctuation

2003-12-10 Thread Brad Wilkin
I seem to have a rash of spam lately that gets by SA because the subject line
and/or body of the message contains spam phrases, but the words have been
obfuscated by inserting semicolons, periods and other punctuation or special
characters.  In some cases the punctuation displaces a character (s*xual), but
most times it just breaks up the word so it doesn't pattern match
(en;large.ment).

Has anyone had success writing tests that can catch this sort of trickery?  It
seems that if you could measure the level of punctuation WITHIN words, or simply
remove common punctuation from the subject/body before doing the pattern
matching, SA would be able to identify these.

Thanks.


Brad Wilkin
Lewis and Clark College

-
Brad Wilkin    Information Technology
Director of Information Systems Lewis & Clark College
[EMAIL PROTECTED] Portland, OR 97219
 PHONE (503) 768-7244







[SAtalk] making my own Evil rule list

2003-12-10 Thread Douglas Kirkland
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

I am pulling my example off the following url.

http://www.merchantsoverseas.com/wwwroot/gorilla/bigevil.cf

I have setup the following rule in my user_prefs file.

uri EVILLIST_2 /\b(?:dubnh\.us)\b/i
describe  EVILLIST_2 Generated EvilList_2
score EVILLIST_2 3.0

The test message that I have does not seem to be caught by the rule that I 
have put in the user_prefs file.  I am wondering what I missed.

Here is the output of the test message from spamc/spamd.

Content preview:  
URI:http://213.4.130.210/%70%65rson%61%6C7/%62%6Flik17/%70%31/
  URI:http://dubnh.us/patch/?utopia URI:p1_01.gif URI:p1_02.jpg
  URI:p1_03.gif URI:p1_04.gif URI:p1_05.gif URI:p1_06.gif URI:p1_07.gif
  [...] 

Content analysis details:   (8.2 points, 4.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 1.2 MIME_HTML_MOSTLY       BODY: Multipart message mostly text/html MIME
 0.2 HTML_50_60             BODY: Message is 50% to 60% HTML
 0.0 HTML_MESSAGE           BODY: HTML included in message
 1.5 HTML_IMAGE_ONLY_04     BODY: HTML: images with 200-400 bytes of words
 0.8 MIME_MISSING_BOUNDARY  RAW: MIME section missing boundary
 3.0 NORMAL_HTTP_TO_IP      URI: Uses a dotted-decimal IP address in URL
 0.7 HTTP_EXCESSIVE_ESCAPES URI: Completely unnecessary %-escapes inside a URL
 0.8 MSGID_FROM_MTA_HEADER  Message-Id was added by a relay


Thanks,

Douglas
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.1 (GNU/Linux)

iD8DBQE/11isSpWn8R0Z08URAv0eAKC6udw8T4j8+xt9w9Bh6AxrSWvhpQCfdIvN
By6xIN6h6R8fVnNcWzImlIE=
=wlT4
-END PGP SIGNATURE-





[SAtalk] Batching files with spamc

2003-12-10 Thread Dimitar Haralanov
Hi, I was trying to find any information on batching messages with
spamc and could not find anything so I am asking the list:
Is it possible to batch multiple files with spamc? In other words
instead of redirecting messages one at a time to spamc (spamc [options]
< message) give spamc a number of files to send to spamd.
That could potentially make it easier when one has to filter 100
messages.
The reason I am asking is that I often get >100 messages at once, and my
mailer filters them one by one, which can be a little slow
since spamc gets started once per message.

-- 
Mitko Haralanov
voidtrance at comcast dot net
http://voidtrance.home.comcast.net
==
Everybody needs a little love sometime; stop hacking and fall in love!




RE: [SAtalk] Obfuscation by Punctuation

2003-12-10 Thread Gary Funck


> -Original Message-
> From: Brad Wilkin
> Sent: Wednesday, December 10, 2003 9:23 AM
[...]
> Has anyone had success writing tests that can catch this sort of trickery?
> It seems if you could come up with a level of punctuation WITHIN words or
> simply remove common punctuation from the subject/body before doing the
> pattern matching, SA will be able to identify these.

I was looking at this and was thinking that counting the ratio
of punctuation to other letters might be one way to go. Otherwise,
often the punctuation has only letters on each side, and that is unusual
for punctuation marks like ';'.

Here's a line from a recent spam, as an example:
  We do the_work for you. By subrn;itting your infor;mation across_to
hundreds of L;enders, we can_get you the_best int;erest r;ates around.

A pattern like the following:
   /([a-z][;][a-z]+.*){5}/i
might get some traction. This has to be run after the HTML is stripped.
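
As a rough way to try the idea outside SA, the sketch below counts semicolons that have a letter directly on both sides and flags text once the count reaches a threshold. It is written in Python rather than as an SA rule, the function names and the threshold of 5 are made up for illustration, and it counts occurrences instead of using the repeated-group pattern above.

```python
import re

# A semicolon with a letter directly on both sides, e.g. "infor;mation".
IN_WORD_PUNCT = re.compile(r"[a-z][;][a-z]", re.IGNORECASE)


def in_word_punct_count(text):
    """Count semicolons that sit between two letters."""
    return len(IN_WORD_PUNCT.findall(text))


def looks_obfuscated(text, threshold=5):
    """Flag text with at least `threshold` in-word semicolons."""
    return in_word_punct_count(text) >= threshold


# The spam line quoted above, joined onto one logical line.
sample = ("We do the_work for you. By subrn;itting your infor;mation across_to "
          "hundreds of L;enders, we can_get you the_best int;erest r;ates around.")

print(in_word_punct_count(sample), looks_obfuscated(sample))
```

Ordinary prose puts a space after a semicolon, so it should rarely trip this count; the character class could be widened to other punctuation, at the cost of matching things like host names and decimal numbers.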








Re: [SAtalk] sa-learn mbox processing? (forwarded learning)

2003-12-10 Thread Matt Kettler
At 11:32 AM 12/10/2003, Larry Starr wrote:
> My question regards scripts to ease processing of these mailboxes.  Since the
> messages are forwarded, from several different Email clients (netscape,
> kmail, pine, AppleMail, etc), extracting the original message, for sa-learn
> is proving to be non-trivial.

In many cases, it's not only non-trivial, it's impossible. Most MUA's 
re-encode, strip down the headers, etc, etc when forwarding. Unless your 
users are careful, reconstruction is going to be impossible, as some of the 
data has been removed.

> Does anyone, on the list, have or know of a tool that will reliably
> extract an original message from a forwarded message?

I'm unsure if one is even possible.







Re: [SAtalk] Using sa-learn in a site-wide configuration

2003-12-10 Thread Matt Kettler
At 12:04 PM 12/10/2003, Stephen Westrip wrote:
> What exactly do I need to do to make this work? I have read lots about
> adding 'bayes_auto_learn 1' and other bits and pieces to put in the cf file
> but whatever I try the Bayes DB never gets added to.

Did you install DB_File? If not, bayes won't go.





RE: [SAtalk] Using dccifd instead of dccproc

2003-12-10 Thread Smart,Dan
Do you have an example of how to configure the dcc_conf?  The INSTALL.txt
and dccifd.html offer very little on this.


| -Original Message-
| From: Bill Landry [mailto:[EMAIL PROTECTED] 
| Sent: Wednesday, December 10, 2003 10:50 AM
| To: [EMAIL PROTECTED]
| Subject: Re: [SAtalk] Using dccifd instead of dccproc
| 
| It should already be installed at /var/dcc/libexec/dccifd 
| (depending on your ./configure parameters).  All you have to 
| do is setup the config files at /var/dcc/dcc_conf and 
| /var/dcc/libexec/start-dccifd, then execute start-dccifd and 
| you should be good to go.
| 
| Oh, and it appears to run faster in our environment than 
| dccproc, since it does not need to be instantiated for each 
| message scanned.
| 
| Bill
| - Original Message -
| From: "Smart,Dan" <[EMAIL PROTECTED]>
| To: <[EMAIL PROTECTED]>
| Sent: Wednesday, December 10, 2003 7:57 AM
| Subject: [SAtalk] Using dccifd instead of dccproc
| 
| 
| >
| > The DCC documentation says that the dccifd interface is much more
| > efficient than dccproc.  I see from doing a spamassassin -D that it
| > looks for it.
| >
| > Is there any install procedure for dccifd, and should this be the
| > generally recommended interface for dcc?
| >
| > Why or why not?
| >
| > TIA




RE: [SAtalk] filtering spam tagged email before hitting exchange 2000

2003-12-10 Thread Gary Smith
To comment on Bob's approach: that's exactly what got me going in the Linux world... 
Exchange 2K.

Here is my experience with Exchange 2K.  This is a little off topic; I just wanted to 
include some feedback.

Here was the problem that we had (and the solution) when I started running Exchange 
2K.
 
Try 1 2000) I built an Exchange 2K box on an 800mhz machine with 512mb of RAM to handle 
about 200 mailboxes (that was the hardware of the day).  Because of budget 
constraints I could not build a second box.  So we put this box into place, including 
AV, and started using it.  By day 10 the box was overloaded, customers complained, and 
then a mass mailing took it out.
 
Try 2 2000) Having learned a little, we built a second box to act as a front end.  About 
the same specs...  After configuring it the box seemed to work.  Then I started getting a 
"butt load" of spam.  These were guesswork spams...  You know, [EMAIL PROTECTED], 
[EMAIL PROTECTED], etc.  It took its toll quickly as well.  The machine itself would 
spend 90% of its time dealing with it, 10% trying to keep IIS running.
 
Try 3 2000) Tired of maintaining Exchange every morning and night, I decided to try 
that sendmail thing...  I implemented it on a couple of cheap Linux boxes.  The test 
case was two 90mhz Compaq workstations.  Don't laugh, because it seemed to help.  
Basically they just transported the mail to the Exchange DS box (which was internal 
only now because of the FE server).  This helped immensely because when Exchange was 
busy the mail would spool.  After proof of concept we did indeed upgrade those 
90mhz boxes (currently still 450's though)...
 
Try 5 2001/2002 and the somewhat final solution) We revamped our structure for web 
hosting and other solutions to include many Linux email servers, offloading many of the 
accounts to the Linux side (though we still have a couple hundred Exchange accounts).  
We ended up using postfix for our web server builds for several reasons.  The big 
thing that we did was include RBL's on the postfix servers, which reduced the spam by 
80%.  There were some FP's when one RBL (can't remember which now) went down.  We also 
implemented AV on this box as well.  So this is the flow now.
 
Internet -> Postfix spoolers -> RBL -> Spam Assassin -> AV -> (reinjected back to 
postfix) -> Destination (which is E2K, other postfix servers and in some cases 
external client SMTP servers).
 
The 4 postfix servers that we have (which are also running linux-ha software) are at 
two locations: two clusters which consist of two 450mhz machines each.
 
uptime:  10:22:59  up 74 days,  4:54,  1 user,  load average: 0.04, 0.03, 0.00
 
During the day the postfix server receives about 1 email every 2 seconds; around 7 am 
(when all of the friggen spamming seems to occur for us) it's about 5 a second.
 
sar:

               CPU     %user     %nice   %system     %idle
07:20:00 AM    all      4.55      0.01      0.74     94.70
07:30:00 AM    all      1.84      0.04      0.72     97.40
07:40:00 AM    all      3.14      0.03      0.81     96.03
07:50:00 AM    all      3.86      0.03      0.86     95.25
08:00:00 AM    all      4.29      0.01      0.86     94.84
08:10:00 AM    all      2.13      0.03      0.73     97.11
 
Since we decided upon Bob's simple approach we have increased hardware but reduced 
maintenance time significantly.  The cost of hardware is nothing compared to the 
benefits that it provides.  Our offsite location just has DNS and a SMTP/postfix 
cluster.  Best investment we ever made.

Coincidentally, we have 4 new P4 servers waiting to be put into place that have been 
sitting idle for the last 4 months, but because the system as a whole works so well we 
just haven't gotten around to changing it.  So for now they are backups.
 
Gary Smith


Re: [SAtalk] rule match counting

2003-12-10 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Stephen M. Przepiora writes:
> Hello, I have constructed a huge list of rules and wish to detect how 
> good they are. Is there a way to log the count of rule matches somewhere?

Hi -- 

about time I documented this properly; should be helpful for the exit0
folks too!  I've just put up some pages in the wiki at

  http://wiki.spamassassin.org/w/MassCheck
  http://wiki.spamassassin.org/w/HitFrequencies

to cover this.

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.2 (GNU/Linux)
Comment: Exmh CVS

iD8DBQE/12Y3QTcbUG5Y7woRAhyXAJ4xY6ozaiT0g/hdKjQ9BgmkIOhq8QCfVy2G
vYel8Fo36of8TnO6Cn/myuA=
=eTEn
-END PGP SIGNATURE-





Re: [SAtalk] rule match counting

2003-12-10 Thread Justin Mason
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Matt Kettler writes:
>At 10:09 AM 12/10/03 -0500, Stephen M. Przepiora wrote:
>
>>Hello, I have constructed a huge list of rules and wish to detect how good 
>>they are. Is there a way to log the count of rule matches somewhere?
>
>if you've got a spam/ham corpus, you can test your rules using the tools in 
>the masses/ subdirectory of the tarball.
>
>You'll want to use mass_check, and hit_frequencies.
>
>The rule guide has a very short note about it at the bottom (section 3.4) 
>but I've not added an example run yet.. It's my intention to write a 
>separate guide for corpuses, mass_check, etc.

Hi Matt --

feel free to extend the pages I've put on the wiki, in that case ;)

- --j.
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.2.2 (GNU/Linux)
Comment: Exmh CVS

iD8DBQE/12Z/QTcbUG5Y7woRAqzqAKC/AQHalFUq+lYc3A10pjCcfeqb5wCgifcu
QyDm0QZN3VaufVHgEwML0qk=
=1eME
-END PGP SIGNATURE-





[SAtalk] Re: bayes permission errors

2003-12-10 Thread Lukreme
On 07 Dec 2003, at 01:02, David B Funk wrote:
> You've got spamd running as the user "postfix" (that "-u postfix"
> command line argument). Thus the user postfix needs to have write
> permissions to the bayes_* files. but in that directory listing
> you show:
> 4160 -rw-------  1 user  staff  5111808 Dec  4 09:51 bayes_toks
> So 'postfix' has -no- access permissions to "user"s bayes_toks
> Thus the permission errors.

doh!

> You have two different options:
> 1) run spamd as root and be sure that you pass the correct user
>    name via "spamc -u user" for each message.

Spamc is being invoked via the user's procmailrc.  does it still need 
to have the -u flag?  For some reason I thought this "just happened."

hrm.

spamd[38231]: Still running as root: user not specified with -u, not 
found, or set to root.  Fall back to nobody.

Where would the -u be passed for users who do not have a .procmailrc?  
can I pass the -u in the /etc/procmailrc somehow?

> 2) Set the global 'bayes_file_mode' option to 0666 so that the
>    spamd process always has read-write permission, regardless
>    of who it is run as.

that should work for now.
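
For anyone else hitting this, the difference between the two file modes can be checked with a few lines of Python. This is only an illustrative sketch with made-up function names; it looks at the permission bits alone and ignores ownership, groups and ACLs.

```python
import stat


def other_users_can_read_write(mode):
    """True if the 'other' permission bits allow both read and write."""
    return bool(mode & stat.S_IROTH) and bool(mode & stat.S_IWOTH)


def as_ls_string(mode):
    """Render permission bits the way ls shows them for a regular file."""
    return stat.filemode(stat.S_IFREG | stat.S_IMODE(mode))


for mode in (0o600, 0o666):
    print(as_ls_string(mode), other_users_can_read_write(mode))
```

With 0600 only the owning user can touch bayes_toks, which is why a spamd running as a different user gets permission errors; 0666 opens the file to every local account, so it trades some safety for convenience.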

--
There's nothing to do, so you just stay in bed [ah, poor thing] Why 
live in the world when you can live in your head?





[SAtalk] spamassassin procmail

2003-12-10 Thread Daniel Kaliel



I read through the readme and made a config change to the way procmail and
spamassassin work together, however I now get the error:

couldn't create or rename temp file. "/var/spool/mail/il -oi [EMAIL PROTECTED]

I have a default /etc/procmailrc with the lines

:0fw: spamassassin.lock
* < 384000
| spamassassin

and a .procmailrc file for individual users.  Email works great for all
users except those that have a .procmailrc file containing

:0 c
! [EMAIL PROTECTED]

So obviously it is erroring on creating the forward temp file.  I have just
been unsuccessful in finding a solution in the procmail archives and in a
Google search.


[SAtalk] Writing a DNSBL rule for both SPEWS levels

2003-12-10 Thread Justin
Does anyone have a DNSBL rule for SPEWS that conditionally checks both
level 1 and level 2?  Remember that l2 also contains l1.  What I'm looking
for is a way to have SA first query l1.spews.sorbs.net.  If a record
exists in l1 then a score should be assigned (2 for example) and the l2
check should be skipped.  If a record doesn't exist in l1 then l2 should
be checked and a different score assigned (4 for example).  This way I can
appease my users that don't want both levels to be scored the same way.

I suppose as an alternative I could assign a score of 2 to l1 and allow
the l2 check to proceed.  Then if a record also exists in l2, an additional
score (2, say) could be assigned.  That would accomplish the same thing, I
suppose, now that I think about it.  Still, is there a way to
conditionally check/skip a DNSBL rule?

Thanks

Justin





Re: [SAtalk] spamassassin procmail

2003-12-10 Thread Evan Platt


--On Wednesday, December 10, 2003 11:44 AM -0700 Daniel Kaliel
<[EMAIL PROTECTED]> wrote:

> 
> I  read through the readme and made a config change to the way procmail
> and spamassassin work together, however I now get the error: 
>   
> couldn't create or rename temp file. "/var/spool/mail/il -oi
> [EMAIL PROTECTED] 
>   
> I have a default /etc/procmailrc with the lines 
>   
> :0fw: spamassassin.lock 
> * < 384000 
>| spamassassin 
>   
> and a .procmailrc file for individual users.  Emails works great for all
> users except those that have a .procmailrc file containing 
>   
> :0 c 
> ! [EMAIL PROTECTED] 
>   
> SO obviously it is erroring on creating the forward temp file.  I have
> just been unsuccessful in finding a solution in the procmail archives and
> on a googled search. 

If no one can help you here, you may find better luck on a procmail users
group. This doesn't really have anything to do with SpamAssassin other than
the fact that you also use SA on your system. 

I'm unfortunately not that procmail savvy, otherwise I'd offer some help...

Perhaps permissions are incorrect? Does /var/spool/mail exist?

Evan





Re: [SAtalk] sa-learn mbox processing?

2003-12-10 Thread William Stearns
Good afternoon, Larry,

On Wed, 10 Dec 2003, Larry Starr wrote:

> I currently have mimedefang (2.37) and spamassassin (2.60) running on a RH9 
> mail gateway.
> 
> Spamassassin is configured to block messages with a very high SA score and to 
> tag and pass along everything else.
> 
> I have two accounts set up, on an internal server, for users to forward 
> received spam, and ham to.
> 
> My question regards scripts to ease processing of these mailboxes.  Since the 
> messages are forwarded, from several different Email clients (netscape, 

And that's the real problem.  _Forwarding_ screws up the headers, 
as you've found.  You should use _bounce_ or _redirect_, instead.

http://www.stearns.org/doc/spamassassin-setup.current.html#redirect

Cheers,
- Bill

---
"I give up, how DO you keep a mathematician busy for 350 years?"
-- Pierre de Fermat's friend
(Courtesy of Tim Connors <[EMAIL PROTECTED]>)
--
William Stearns ([EMAIL PROTECTED]).  Mason, Buildkernel, freedups, p0f,
rsync-backup, ssh-keyinstall, dns-check, more at:   http://www.stearns.org
Linux articles at: http://www.opensourcedigest.com
--





Re: [SAtalk] sa-learn mbox processing? (forwarded learning)

2003-12-10 Thread Larry Starr
On Wednesday 10 December 2003 12:10 pm, Matt Kettler wrote:
> At 11:32 AM 12/10/2003, Larry Starr wrote:
> >My question regards scripts to ease processing of these mailboxes.  Since
> > the messages are forwarded, from several different Email clients
> > (netscape, kmail, pine, AppleMail, etc), extracting the original message,
> > for sa-learn is proving to be non-trivial.
>
> In many cases, it's not only non-trivial, it's impossible. Most MUA's
> re-encode, strip down the headers, etc, etc when forwarding. Unless your
> users are careful, reconstruction is going to be impossible, as some of the
> data has been removed.
>
> >Does anyone, on the list, have or know of a tool that will reliably
> >extract an
> >original message from a forwarded message?
>
> I'm unsure if one is even possible.
Matt,

Thank you for your reply.  

The more that I look at the problem, the more I believe you are correct 
that extracting the original message is not a solvable problem.  

I'm running mimedefang/spamassassin 2.60 on a mail gateway, with no local 
users.

My current "workaround" is to "quarantine" everything coming from outside my 
domain, thus saving a copy of the original message, which I can extract and 
feed to sa-learn, by searching for the "Subject" in my mimedefang logs.

Fortunately, or unfortunately depending on how you look at it, adding all of 
my normal mail to the quarantine only increases my quarantine directory usage 
by about 20-25%.

From here I guess I'll develop a few scripts to automate the process, unless 
someone has a better idea?
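
A first cut at the lookup step might look like the Python sketch below. It assumes each quarantined message is saved as its own raw RFC 822 file somewhere under one directory; that layout and the function name are assumptions for illustration, not a description of how mimedefang actually stores its quarantine.

```python
import os
from email import message_from_binary_file


def find_by_subject(directory, subject):
    """Return sorted paths of message files whose Subject equals `subject`."""
    wanted = subject.strip()
    matches = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            with open(path, "rb") as handle:
                msg = message_from_binary_file(handle)
            # str() copes with header objects returned for non-ASCII subjects.
            if str(msg.get("Subject") or "").strip() == wanted:
                matches.append(path)
    return sorted(matches)
```

The matching files could then be handed to sa-learn as spam or ham. Subjects are not unique, so anything this returns probably deserves a quick look before it is learned.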

-- 
Larry G. Starr - [EMAIL PROTECTED] or [EMAIL PROTECTED]
Software Engineer: Full Compass Systems LTD.
Phone: 608-831-7330 x 1347  FAX: 608-831-6330
===
There are only three sports: bullfighting, mountaineering and motor
racing, all the rest are merely games! - Ernest Hemmingway





RE: [SAtalk] making my own Evil rule list

2003-12-10 Thread Bret Miller
> I am pulling my example off the following url.
>
> http://www.merchantsoverseas.com/wwwroot/gorilla/bigevil.cf
>
> I have setup the following rule in my user_prefs file.
>
> uri EVILLIST_2 /\b(?:dubnh\.us)\b/i
> describe  EVILLIST_2 Generated EvilList_2
> score EVILLIST_2 3.0

Uri is a privileged option: it will only work in the site .cf files,
not in user_prefs. You can define the rule there and score it 0, then
change the score in your user_prefs if you only want it to run for you.
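
As a quick sanity check that the pattern itself is not the problem, the rule's regex can be run against the URIs from the posted spamd output. The snippet below is a Python stand-in for SA's Perl matching (this particular pattern reads the same in both), and the variable names are made up for illustration.

```python
import re

# Python equivalent of the rule pattern /\b(?:dubnh\.us)\b/i
EVILLIST_2 = re.compile(r"\b(?:dubnh\.us)\b", re.IGNORECASE)

# URIs copied from the "Content preview" in the original post.
uris = [
    "http://213.4.130.210/%70%65rson%61%6C7/%62%6Flik17/%70%31/",
    "http://dubnh.us/patch/?utopia",
    "p1_01.gif",
]

matching = [uri for uri in uris if EVILLIST_2.search(uri)]
print(matching)
```

Only the dubnh.us URI matches, which fits the explanation above: the regex is fine, so the missing hit points at where the rule is defined rather than at how it is written.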

Bret







Re: [SAtalk] Obfuscation by Punctuation

2003-12-10 Thread Christopher Kunz
Gary Funck wrote:

> A pattern like the following:
>   /([a-z][;][a-z]+.*){5}/i
> might get some traction. This has to be run after the HTML is stripped.

That exact message got through here, too. Actually, it was using the 
whitelist_from trick to get a whopping -93.6 points, but OTOH, bayes_60 
and lots of other filters would have given it 6.4 points if it weren't 
for the white list.

These are the SA headers from one of these mails:

X-Spam-Checker-Version: SpamAssassin 2.60 (1.212-2003-09-23-exp) on webby
X-Spam-Status: No, hits=-93.6 required=5.0 tests=ADVERT_CODE2,BAYES_60,
	FROM_ENDS_IN_NUMS,HTML_FONTCOLOR_RED,HTML_FONT_BIG,
	HTML_FONT_INVISIBLE,HTML_MESSAGE,MIME_HTML_ONLY,
	RCVD_IN_BL_SPAMCOP_NET,RCVD_IN_SORBS,USER_IN_WHITELIST autolearn=no 
	version=2.60

BTW: What kind of header is this?

X-Ki: 

--ck





Re: [SAtalk] spamassassin procmail

2003-12-10 Thread Daniel Kaliel
Yes, it exists.  After no luck in the searches, I did try to join a procmail
user group, however their server is rejecting all my attempts to join.
So I thought I'd try here, in the hopes of finding a procmail guru! :)

- Original Message - 
From: "Evan Platt" <[EMAIL PROTECTED]>
To: "SpamAssassin" <[EMAIL PROTECTED]>
Sent: Wednesday, December 10, 2003 11:50 AM
Subject: Re: [SAtalk] spamassassin procmail


>
>
> --On Wednesday, December 10, 2003 11:44 AM -0700 Daniel Kaliel
> <[EMAIL PROTECTED]> wrote:
>
> >
> > I  read through the readme and made a config change to the way procmail
> > and spamassassin work together, however I now get the error:
> >
> > couldn't create or rename temp file. "/var/spool/mail/il -oi
> > [EMAIL PROTECTED]
> >
> > I have a default /etc/procmailrc with the lines
> >
> > :0fw: spamassassin.lock
> > * < 384000
> >| spamassassin
> >
> > and a .procmailrc file for individual users.  Emails works great for all
> > users except those that have a .procmailrc file containing
> >
> > :0 c
> > ! [EMAIL PROTECTED]
> >
> > SO obviously it is erroring on creating the forward temp file.  I have
> > just been unsuccessful in finding a solution in the procmail archives and
> > on a googled search.
>
> If no one can help you here, you may find better luck on a procmail users
> group. This doesn't really have anything to do with SpamAssassin other than
> the fact that you also use SA on your system.
>
> I'm unfortunately not that procmail savvy, otherwise I'd offer some help...
>
> Perhaps permissions are incorrect? Does /var/spool/mail exist?
>
> Evan





RE: [SAtalk] Help with Mark Motley's perl script - part2

2003-12-10 Thread Lentz, Wayne
All,

It was suggested off list that I remove the '<>' brackets from this section:

my $server = Mail::IMAPClient->new(
Server => "",
User => "",
Password => "",
Uid => 1,
Debug => 0 );

So I tried that and it helped, as the script now runs, but it doesn't pull any
messages off Exchange.  It reports that it pulled 1 message, and does create
an empty file named "1" in /var/amavisd/spam.  It produces these results
regardless of how many are in the public folder.  I'm now digging through
perldoc.org, but if someone has an idea I would much appreciate it.


System info:
OpenBSD 3.3 with Postfix 2.0.13
Amavisd-new with SA 2.55
Perl modules, including SA, installed via CPAN.


My updated version of his script (thanks much Mark):

--- start of file ---

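#!/usr/bin/perl
# Pull messages from an Exchange public folder over IMAP so they can be
# fed to sa-learn.

use strict;
use warnings;
use Mail::IMAPClient;
use Sys::Syslog;

my $FOLDER_NAME = 'Public Folders/All Public Folders/SpamLearn';
my $SEQ = 1;
openlog('pullspam', 'cons,pid', 'user');

# Connect and log in; new() returns undef on failure.
my $server = Mail::IMAPClient->new(
    Server   => "msexsrv1.knust.com",
    User     => "spamlearn",
    Password => "houston",
    Uid      => 1,
    Debug    => 0,
) or die "Cannot connect to IMAP server: $@\n";

# Stop here if the folder cannot be selected, instead of "pulling" nothing.
$server->select($FOLDER_NAME)
    or die "Cannot select '$FOLDER_NAME': " . $server->LastError . "\n";

# Save each message to a numbered file, then flag it for deletion.
my @msgs = $server->search("ALL");
foreach my $msg (@msgs) {
    $server->message_to_file("/var/amavisd/spamlearn/" . $SEQ, $msg);
    $server->delete_message($msg);
    $SEQ++;
}
$server->expunge($FOLDER_NAME);
print "Pulled " . ($SEQ-1) . " messages from spam folder.\n";
syslog('mail|info', 'Pulled ' . ($SEQ-1) . ' messages from spam folder.');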

--- end of file ---




Re: [SAtalk] Writing a DNSBL rule for both SPEWS levels

2003-12-10 Thread Matt Kettler
At 01:39 PM 12/10/2003, Justin wrote:
> Still, is there a way to
> conditionally check/skip a DNSBL rule?

No.

However, if there's an aggregate database, you can query multiple lists at 
the same time. Currently the SORBS and OPM rules work this way: only one 
DNS query is made for all the lists in the aggregate.





Re: [SAtalk] Writing a DNSBL rule for both SPEWS levels

2003-12-10 Thread Justin
On Wed, 10 Dec 2003, Matt Kettler wrote:

> At 01:39 PM 12/10/2003, Justin wrote:
> >  Still, is there a way to
> >conditionally check/skip a DNSBL rule?
> 
> No.
> 
> However, if there's an aggregate database, you can query multiple lists at 
> the same time. Currently the SORBS and OPM rules work this way: only one 
> DNS query is made for all the lists in the aggregate.

Shoot.  I was hoping there was a way.  Thanks anyhow.

So that's how check_rbl and check_rbl_sub work?  I always wondered about
that.  So what happens if an IP exists in two subzones at the same time?  
For example, what if a message originated from an IP in SORBS' zombie list
and was then sent through an open relay/proxy?  I'd think that it would
be best to score on each subzone if they contain unique or unrelated data.  
I'd definitely want SPEWS scored separately from the SORBS zone it's 
hosted in.

Thanks for the info

Justin





Re: [SAtalk] Obfuscation by Punctuation

2003-12-10 Thread Fred
Christopher Kunz wrote:
> BTW: What kind of header is this?
>
> X-Ki: 
>
> --ck


That's a fake header name with your e-mail address encoded with base64.

Un-base64 that and you get:
[EMAIL PROTECTED]
I munged most of it for your protection, but having that encoding here is
enough to give your address to a few people ;)


Add the following rule to your local.cf and you will never see those again ;)

# Custom rule to catch spammers using base64 encoding of my domain.
header FVGT_BASE64_MYDOMAIN   ALL =~ /(?:BkZS1wdW5rdC5kZQ==|AZGUtcHVua3QuZGU=|QGRlLXB1bmt0LmRl)/
describe   FVGT_BASE64_MYDOMAIN   Message hides my domain with Base64 encoding
score  FVGT_BASE64_MYDOMAIN   5.0

This is custom for your domain of xx-x.de. I have a script to
auto-produce these rules for me ;)
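
The rule lists three Base64 strings because Base64 packs three bytes into four characters, so the same text encodes differently depending on whether it starts at byte offset 0, 1 or 2 of the encoded data. What follows is a minimal Python sketch of a generator for such fragments. It is not Fred's script (which was not posted), the function names are hypothetical, and `@example.com` stands in for a real domain. Unlike the rule above, it also trims the trailing characters that depend on whatever follows the address.

```python
import base64
import re

def b64_fragments(text):
    """Return the Base64 substrings that must appear whenever `text`
    sits at byte offset 0, 1 or 2 (mod 3) inside encoded data."""
    raw = text.encode("ascii")
    fragments = []
    for offset in range(3):
        # Pad in front to simulate `offset` unknown bytes before the text.
        enc = base64.b64encode(b"\x00" * offset + raw).decode("ascii")
        # Leading characters that mix in bits of the unknown prefix.
        head = {0: 0, 1: 2, 2: 3}[offset]
        # Trailing characters that depend on what follows the text
        # (the last partial character plus any '=' padding).
        tail = {0: 0, 1: 3, 2: 2}[(offset + len(raw)) % 3]
        fragments.append(enc[head:len(enc) - tail])
    return fragments

def rule_regex(text):
    """Join the fragments into an alternation for a /.../ style rule."""
    escaped = [f.replace("+", r"\+").replace("/", r"\/")
               for f in b64_fragments(text)]
    return "(?:" + "|".join(escaped) + ")"

if __name__ == "__main__":
    print(rule_regex("@example.com"))
```

The printed alternation could be pasted between the slashes of a header rule like the one above; whether that is how the original script works is an assumption.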





Re: [SAtalk] Obfuscation by Punctuation

2003-12-10 Thread Christopher Kunz
Fred wrote:
> That's a fake header name with your e-mail address encoded with base64.

I guessed it is some spam devilry. Actually, I don't care if harvesters 
pick up this address, it's also under SA monitoring :-)

> Add the following rule to your local.cf and you will never see those again
> ;)

Isn't header_ a privileged rule? I have quite a lot of domains on the mail 
server, and xx-x.de is only the administrative domain - I don't want 
to clutter a sitewide local.cf with all that b64 and other munching stuff.

Anyhow, I saw the excellent script in some earlier post to the list, and 
gave it a try. If I could use it (partially) for user_prefs, that'd be 
perfect.

--ck





RE: [SAtalk] Obfuscation by Punctuation

2003-12-10 Thread Greg Webster
Here's what I've recently done:
rawbody GWW_PUNCT /([a-z][:punct:]+[a-z])|( [A-Z][:punct:]+[a-z])/i
score   GWW_PUNCT 2.0

It's not perfect, but it does the job.

As well, I've noticed a lot of these include the domain 'doctor45.com',
so I've been giving a good high score for that one.

Greg


-- 
Greg Webster - [EMAIL PROTECTED]
In-Touch Software Corporation
Ph: (604)278-0515 - Fax: (604)608-3112





Re: [SAtalk] Obfuscation by Punctuation

2003-12-10 Thread Christopher Kunz
Greg Webster wrote:

> Here's what I've recently done:
> rawbody GWW_PUNCT /([a-z][:punct:]+[a-z])|( [A-Z][:punct:]+[a-z])/i
> score   GWW_PUNCT 2.0
>
> It's not perfect, but it does the job.
>
> As well, I've noticed a lot of these include the domain 'doctor45.com',
> so I've been giving a good high score for that one.

What is the typical half-life of a spam domain? As far as I can recall 
from my brief glimpses at the stuff in my spam folders, I have never 
seen that a domain was spamvertised in two different spam runs.

--ck





Re: [SAtalk] Writing a DNSBL rule for both SPEWS levels

2003-12-10 Thread Matt Kettler
At 02:08 PM 12/10/2003, Justin wrote:
> So that's how check_rbl and check_rbl_sub work?  I always wondered about
> that.  So what happens if an IP exists in two subzones at the same time?

With SORBS, it's done by returning multiple results for a single query.

host 138.81.106.218.dnsbl.sorbs.net
138.81.106.218.dnsbl.sorbs.net has address 127.0.0.2
138.81.106.218.dnsbl.sorbs.net has address 127.0.0.3

OPM looks like a bit-mask system, so one result can encode 8 different 
DNSBLs at once. 
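
To make the two return styles concrete, here is a minimal Python sketch. The function names are hypothetical, and no claim is made about what any particular code or bit means on SORBS or OPM. It builds the reversed-octet query name used in the `host` lookup above and decodes 127.0.0.x answers either as distinct codes or as a bit mask.

```python
def dnsbl_query_name(ip, zone):
    """Reverse the IPv4 octets and append the DNSBL zone."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address")
    return ".".join(reversed(octets)) + "." + zone

def result_codes(answers):
    """Multiple-A-record style: each answer's last octet is one listing."""
    return sorted(int(a.rsplit(".", 1)[1]) for a in answers)

def result_bits(answer):
    """Bit-mask style: each set bit of the last octet is one listing."""
    value = int(answer.rsplit(".", 1)[1])
    return [1 << bit for bit in range(8) if value & (1 << bit)]

if __name__ == "__main__":
    print(dnsbl_query_name("218.106.81.138", "dnsbl.sorbs.net"))
    print(result_codes(["127.0.0.2", "127.0.0.3"]))
    print(result_bits("127.0.0.5"))
```

With the multiple-record style every listing arrives as its own address; with the mask style a single address carries up to eight flags in its last octet.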





[SAtalk] RE: [RD] Obfuscation by Punctuation

2003-12-10 Thread Gary Funck


> -Original Message-
> From: Greg Webster
> Sent: Wednesday, December 10, 2003 11:45 AM
>
>
> Here's what I've recently done:
> rawbody GWW_PUNCT /([a-z][:punct:]+[a-z])|( [A-Z][:punct:]+[a-z])/i
> score   GWW_PUNCT 2.0
>
> It's not perfect, but it does the job.

I think that pattern is going to catch a lot of lines that aren't what
you're looking for. As a quick check, try the following:

   perl -ne 'print if /([a-z][:punct:]+[a-z])|( [A-Z][:punct:]+[a-z])/i' $MAIL

and notice the lines that it matches. Also, the second alternative
'|( [A-Z][:punct:]+[a-z])' doesn't match anything different than the first
alternative. Maybe you were looking for a space followed by a capital letter?
With /i enabled, the check for [A-Z] will include [a-z] as well. Also,
[:punct:]+ is likely too general and will pick up lots of stuff. Even just
adding '_' will pick up lots of programming language variable names and so on.

This pattern seemed to work pretty well:
  /([a-z][;]+[a-z].{0,20}){3}/i
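
A quick way to compare the two patterns, sketched in Python rather than as a Perl one-liner. One caveat worth stating as an assumption: Python's `re` parses a bare `[:punct:]` as an ordinary character set containing `:`, `p`, `u`, `n`, `c` and `t`, and as far as I know Perl does the same unless the POSIX class is wrapped in a second pair of brackets (`[[:punct:]]`). The sample strings are invented.

```python
import re

# Pattern as posted: the bare [:punct:] is the set of characters ':punct'.
AS_POSTED = re.compile(
    r"([a-z][:punct:]+[a-z])|( [A-Z][:punct:]+[a-z])", re.I)

# Narrower pattern: letter, semicolon(s), letter, three times close together.
SEMICOLONS = re.compile(r"([a-z][;]+[a-z].{0,20}){3}", re.I)

def matches(pattern, text):
    """True if the compiled pattern matches anywhere in the text."""
    return pattern.search(text) is not None

if __name__ == "__main__":
    for sample in ("please check your input", "sp;am",
                   "en;large y;our l;ife", "Hello; world; a;b"):
        print(sample, matches(AS_POSTED, sample), matches(SEMICOLONS, sample))
```

Under that reading, the posted pattern fires on ordinary words such as "check" while missing "sp;am", which is consistent with the false positives described above.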

Question to the group: what's the procedure for running the rules against
the spam/ham samples to come up with hit frequencies?







RE: [SAtalk] Help with Mark Motley's perl script - part2

2003-12-10 Thread Lentz, Wayne
>-Original Message-
>From: Lentz, Wayne 
>
>So I tried that and it helped as the script now runs, but it doesn't pull any
>messages off Exchange.  It reports that it pulled 1 message, and does create
>an empty file named "1" in /var/amavisd/spam.  It produces these results
>regardless of how many are in the public folder.  I'm now digging through
>perldoc.org but if someone has an idea I would much appreciate it.

Ok, I worked it out.  Turns out the 'All Public Folders' level in the
Exchange Public Folders hierarchy should be omitted when defining the path
in the perl script.  Probably goes for any IMAP client as well.  So I
changed this line to:

my $FOLDER_NAME = 'Public Folders/SpamLearn';

And it's rocking on now.  Many thanks to everyone.
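
For anyone doing the same from Python instead of Perl, a rough sketch of the two lessons from this thread: drop the 'All Public Folders' level from the path, and check the result of SELECT so that a wrong folder name fails loudly instead of producing an empty pull. The helper names are hypothetical; `conn` is assumed to be a logged-in `imaplib.IMAP4` (or `IMAP4_SSL`) object, whose `select()` returns a `(status, data)` pair.

```python
def exchange_imap_path(outlook_path, skip="All Public Folders"):
    """Drop the 'All Public Folders' level, which (per this thread) is
    not part of the folder path Exchange exposes over IMAP."""
    parts = [p for p in outlook_path.split("/") if p and p != skip]
    return "/".join(parts)

def select_or_fail(conn, folder):
    """SELECT a folder and raise if the server does not answer OK."""
    # Quote the name because it contains a space.
    status, data = conn.select('"%s"' % folder)
    if status != "OK":
        raise RuntimeError("cannot select %r: %r" % (folder, data))
    return int(data[0])  # message count reported by the server
```

A caller would pass the result of `exchange_imap_path(...)` to `select_or_fail(...)` before searching for messages.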

-Wayne




Re: [SAtalk] sa-learn mbox processing?

2003-12-10 Thread Kris Deugau
William Stearns wrote:
>  You should use _bounce_ or _redirect_, instead.

Which, unfortunately, adds some new headers with most MUAs.  :(  Along
with the extra set of Received: headers that go along with sending a
message (which you could probably work around).

The only way I've seen to get a message from a user's inbox into your
hands unaltered is to have it forwarded as an attachment- even then, a
number of MUAs seem to add some message-status-tracking headers in the
original message.  >:(  Some (like Outlook for some versions and/or
contexts) strip most of the headers out anyway.

-kgd
-- 
"Sendmail administration is not black magic.  There are legitimate
technical reasons why it requires the sacrificing of a live chicken."
   - Unknown




[SAtalk] [RD] raw/rare/folded/plain/alphed body/subject rendering streams

2003-12-10 Thread SpamTalk
It would seem to me that, for purposes of rule simplification, the
subject and body of messages to be scanned should be available in
pre-processed flavors, some of which are currently available. Assume the spam
key is something like that Vuhee drug, V=P i=o e=a n=g s=r u=a (i.e. Poensu)

RAW     untouched

RARE    (de-mimed) eye-readable 8/16 charset with HTML intact

FOLDED  set all lowercase
        remove HTML
        punctuation to be underscore,
        repeated punctuation collapsed to 1 instance

        P.o;[EMAIL PROTECTED] becomes p_o_3_n_s_u

PLAIN   all lowercase, remove all punctuation
        P.o;[EMAIL PROTECTED] becomes po3nsu

ALPHED  strip numerics as well
        P.o;[EMAIL PROTECTED] becomes ponsu

Rules would be defined along the same lines as currently done for
Subject and body, e.g. Subject-PLAIN, Body-ALPHED etc. Bayes should tokenize
the most reduced (ALPHED) stream.
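
A minimal Python sketch of the FOLDED, PLAIN and ALPHED steps as described, under some loud assumptions: the function names are made up, MIME decoding and HTML removal are left out, and the sample input `P.o;3@n-s,u` is only a guess chosen to reproduce the stated outputs (the archive's address munging ate the original example).

```python
import re
import string

# One or more consecutive ASCII punctuation characters.
_PUNCT_RUN = re.compile("[" + re.escape(string.punctuation) + "]+")

def folded(text):
    """Lowercase; collapse each run of punctuation to one underscore."""
    return _PUNCT_RUN.sub("_", text.lower())

def plain(text):
    """Lowercase; remove all punctuation."""
    return _PUNCT_RUN.sub("", text.lower())

def alphed(text):
    """PLAIN with digits stripped as well."""
    return re.sub(r"[0-9]+", "", plain(text))

if __name__ == "__main__":
    sample = "P.o;3@n-s,u"  # hypothetical input
    print(folded(sample), plain(sample), alphed(sample))
```

Each stream is derived from the raw text here; chaining them (ALPHED built on PLAIN) keeps the definitions short.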


Best Regards
Bob
Robert Strickler




Re: [SAtalk] sa-learn mbox processing?

2003-12-10 Thread William Stearns
Good afternoon, Kris,

On Wed, 10 Dec 2003, Kris Deugau wrote:

> William Stearns wrote:
> >  You should use _bounce_ or _redirect_, instead.
> 
> Which, unfortunately, adds some new headers with most MUAs.  :(  Along
> with the extra set of Received: headers that go along with sending a
> message (which you could probably work around).

That article specifically addresses the "Resent*" headers; do a text 
search for "bayes_ignore_header".
You're still correct on the additional Received's, but to the best
of my recollection those go in the right location for you to be able to
tell what the original MTA's were.
Cheers,
- Bill

---
"Windows 2000 installed without a hitch, but the process takes
so long that InfoWorld Senior Analyst Maggie Biggs and I are debating
over the best 10 things you can do while you wait. One of her thoughts
is to run a marathon. And not finish first."
"I'll have to fall back on one of my favorites: spit shining the
Empire State Building."
(Courtesy of Nicholas Petreley, Infoworld)
--
William Stearns ([EMAIL PROTECTED]).  Mason, Buildkernel, freedups, p0f,
rsync-backup, ssh-keyinstall, dns-check, more at:   http://www.stearns.org
Linux articles at: http://www.opensourcedigest.com
--





Re: [SAtalk] Writing a DNSBL rule for both SPEWS levels

2003-12-10 Thread Justin
On Wed, 10 Dec 2003, Matt Kettler wrote:

> At 02:08 PM 12/10/2003, Justin wrote:
> >So that's how check_rbl and check_rbl_sub work?  I always wondered about
> >that.  So what happens if an IP exists in two subzones at the same time?
> 
> With SORBS, it's done by returning multiple results for a single query.
> 
> host 138.81.106.218.dnsbl.sorbs.net
> 138.81.106.218.dnsbl.sorbs.net has address 127.0.0.2
> 138.81.106.218.dnsbl.sorbs.net has address 127.0.0.3
> 
> OPM looks like a bit-mask system, so one result can encode 8 different 
> DNSBLs at once. 

Interesting.  I didn't know that either.  I'm going to have to dig into 
this further I think.  Thanks for the info!

Justin





Re: [SAtalk] [RD] raw/rare/folded/plain/alphed body/subject rendering streams

2003-12-10 Thread Matt Kettler
At 03:48 PM 12/10/2003, SpamTalk wrote:

> FOLDED  set all lowercase
> Remove HTML
> punctuation to be underscore,

Why on earth do you want to "set all lowercase"? Every regex in the ruleset 
can be set to case-sensitive or case-insensitive on its own, so this adjustment 
only makes rules less flexible and doesn't add anything that I can see.

Otherwise, I can see a lot of value in your suggestions. 





Re: [SAtalk] Obfuscation by Punctuation

2003-12-10 Thread Justin
On Wed, 10 Dec 2003, Christopher Kunz wrote:

> What is the typical half-life of a spam domain? As far as I can recall 
> from my brief glimpses at the stuff in my spam folders, I have never 
> seen that a domain was spamvertised in two different spam runs.

I have a 15,000 entry list of spamming domains and netblocks.  I started
it back in January of 2000 and hit the 15k mark last December.  I haven't
touched it since.  I'm still amazed at how much spam it still catches,
even though some of those domains are from spam runs more than 2 years
ago.  YMMV of course.

Justin





RE: [SAtalk] [RD] raw/rare/folded/plain/alphed body/subject rendering streams

2003-12-10 Thread Gary Funck


> -Original Message-
> From: SpamTalk
> Sent: Wednesday, December 10, 2003 12:49 PM
> 
> It would seem to me that, for purposes of rule simplification, that the
> subject and body of messages to be scanned should be available in
> pre-processed flavors, some of which is currently available. 
> Assume the spam
> key is some thing like that Vuhee drug, V=P i=o e=a n=g s=r u=a (i.e.
> Poensu)
> 
[abbreviated description follows]
> RAW (untouched), RARE (de-mimed), FOLDED (all lowercase),
> PLAIN (lc, no punctuation), ALPHED (no numbers).
>

It might be convenient to view each of these transformations as
operating on the output of the previous. I think you were.
Doing so avoids replicating the description of the
previous phase.

Note that numbers are sometimes substituted for letters, such
as Gr8t and zer0, any1, me2, all41 and 14all. This argues for
phoneming and/or spell-checking before ALPHA-ing.
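
A small Python sketch of why the order matters. The dictionary and function names are hypothetical, and a real list of spammer spellings would be far larger; the point is only that a lookup has to happen while the digits are still present.

```python
import re

# Hypothetical custom dictionary of digit-for-letter spellings.
SPAMMER_SPELLINGS = {
    "gr8t": "great",
    "zer0": "zero",
    "any1": "anyone",
    "me2": "me too",
}

def respell(token):
    """Look a token up in the custom dictionary before digits are stripped."""
    lowered = token.lower()
    return SPAMMER_SPELLINGS.get(lowered, lowered)

def alphed_after_respell(text):
    """Respell each word first, then strip any remaining digits."""
    words = [respell(w) for w in text.split()]
    return " ".join(re.sub(r"[0-9]+", "", w) for w in words)

if __name__ == "__main__":
    print(alphed_after_respell("Gr8t zer0 deal4"))
    print(re.sub(r"[0-9]+", "", "gr8t"))  # stripping alone loses the word
```

Stripping digits first turns "gr8t" into "grt", which no dictionary will recognize; respelling first recovers "great".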







RE: [SAtalk] [RD] raw/rare/folded/plain/alphed body/subject rendering streams

2003-12-10 Thread SpamTalk



>>FOLDED  set all lowercase
>> Remove HTML
>> punctuation to be underscore,

>Why on earth do you want to "set all lowercase"?

I guess folding the case might be overkill in the "simplification" process.
As a matter of curiosity, does the objection extend to doing that in the
"ALPHED" stream to be tokenized by Bayes as well?

Best Regards, Bob

[OT/P.S.] All this reduction of messages reminds me of an SF author's
(Heinlein?, Niven?, googling gives no answer) parody of modern product
marketing: "Mushie Tushies, pre-processed, pre-digested, pre-excreted: Just
heat them up and flush them down the toilet".
 




RE: [SAtalk] [RD] raw/rare/folded/plain/alphed body/subject rendering streams

2003-12-10 Thread Gary Funck


> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of Gary
> Funck
> Sent: Wednesday, December 10, 2003 1:09 PM
> To: [EMAIL PROTECTED]
> Subject: RE: [SAtalk] [RD] raw/rare/folded/plain/alphed body/subject
> rendering streams
> 
> 
> 
> 
> > -Original Message-
> > From: SpamTalk
> > Sent: Wednesday, December 10, 2003 12:49 PM
> > 
> > It would seem to me that, for purposes of rule simplification, that the
> > subject and body of messages to be scanned should be available in
> > pre-processed flavors, some of which is currently available. 
> > Assume the spam
> > key is some thing like that Vuhee drug, V=P i=o e=a n=g s=r u=a (i.e.
> > Poensu)
> > 
> [abbreviated description follows]
> > RAW (untouched), RARE (de-mimed), FOLDED (all lowercase),
> > PLAIN (lc, no punctuation), ALPHED (no numbers).
> >
> 
> It might be convenient to view each these transformations as
> operating on the output of the previous. I think you were.
> By doing so, it avoids replicating the description of the
> previous phase.

I meant to add the following suggested additional
transformation:

PHONEMED: in this form, the words are either converted into their
phoneme form and/or spell-checked (perhaps augmented by a custom
dictionary of "popular" spammer spellings). The words would be
de-rooted as well.

This paragraph suggests that the spelling transformation would
precede the ALPHED transformation.

> 
> Note that numbers are sometimes substituted for letters. Such
> as Gr8t and zer0, any1, me2, all41 and 14all. This argues for
> phoneming and/or spell-checking before ALPHA-ing.





RE: [SAtalk] [RD] raw/rare/folded/plain/alphed body/subject rendering streams

2003-12-10 Thread SpamTalk



>It might be convenient to view each these transformations as
>operating on the output of the previous.

Indeed, I was. Elegance + Efficiency + Functionality = GoodCode(TM)

>Note that numbers are sometimes substituted for letters.
>[SNIP] This argues for phoneming and/or spell-checking before ALPHA-ing.

I figured just stripping them would be best, or with maybe an adjunct
dictionary for common ones.

HMMM, how about an additional "PHONEME" rendering stream. An SA 3.1 feature,
I'm sure. From my recollection of R.A. Heinlein's "Farnham's Freehold" (you
don't think my knowledge base is based on TEXTBOOKS, do you?!?) phonemes
have their own individual symbols, IIRC some are Greek like delta and phi,
something in excess of 100 atomic representations of voiced sounds. I would
think there is probably some kind of USASCII cross reference that would
allow them to be represented in a plaintext fashion. I wonder if there is a
Unicode/ISO page definition for them.

Best Regards, Bob






Re: [SAtalk] RE: [RD] Obfuscation by Punctuation

2003-12-10 Thread Chris Thielen
Gary Funck said:
> Question to the group: what's the procedure for running the rules against
> the spam/ham samples to come up with hit frequencies?

mass-check in the masses directory of the SpamAssassin source archive
(methinks)
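
mass-check is the real tool for this. Purely to illustrate what a "hit frequency" is, here is a toy Python sketch (not mass-check, and no substitute for it) that counts how often a regex fires on a list of message bodies; the function name and the sample data are made up.

```python
import re

def hit_frequency(pattern, messages):
    """Fraction of messages in which the pattern matches at least once."""
    if not messages:
        return 0.0
    regex = re.compile(pattern, re.I)
    hits = sum(1 for body in messages if regex.search(body))
    return hits / len(messages)

if __name__ == "__main__":
    rule = r"([a-z][;]+[a-z].{0,20}){3}"
    spam = ["en;large y;our l;ife", "buy now"]   # invented samples
    ham = ["hello; world"]
    print(hit_frequency(rule, spam), hit_frequency(rule, ham))
```

A rule is worth keeping when its spam frequency is high and its ham frequency is close to zero.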


--
Chris Thielen

Easily generate SpamAssassin rules to catch obfuscated spam phrases:
http://www.sandgnat.com/cmos/




Re: [SAtalk] Obfuscation by Punctuation

2003-12-10 Thread Chris Thielen
Brad Wilkin said:
> I seem to have a rash of spam lately that gets by SA because the subject
> line and/or body of the message contains spam phrases but words have been
> obfuscated by inserting semicolons, periods and other punctuation or special
> characters.  In some cases, the punctuation displaces a character (s*xual)
> but most times, just breaks up the word so it doesn't pattern match
> (en;large.ment)
>
> Has anyone had success writing tests that can catch this sort of trickery?
> It

Brad,
I wrote a rules generator for just this purpose.  See the link in my sig.

--
Chris Thielen

Easily generate SpamAssassin rules to catch obfuscated spam phrases:
http://www.sandgnat.com/cmos/




RE: [SAtalk] non-numeric atime in Bayes db? (SA 2.61)

2003-12-10 Thread Gary Funck


> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf Of Gary
> Funck
> Sent: Wednesday, December 10, 2003 8:49 AM
> To: Spamassassin List
> Subject: [SAtalk] non-numeric atime in Bayes db? (SA 2.61)
>
>
>
> Hello,
>
> after running a spam refiling script which invokes 'spamassassin -r', I
> received the
> following diagnostics:
>
> /usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/Conf.pm line 362.
> Argument "" isn't numeric in numeric lt (<) at
> /usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/BayesStore.pm line 1248.
> Argument "" isn't numeric in numeric lt (<) at
> /usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/BayesStore.pm line 1248.
> [repeated 20 or more times]
>
> Here's the offending line:
>
>1247 my $oldmagic = $self->{db_toks}->{$OLDEST_TOKEN_AGE_MAGIC_TOKEN};
>1248 if (!defined ($oldmagic) || $atime < $oldmagic) {
>1249   $self->{db_toks}->{$OLDEST_TOKEN_AGE_MAGIC_TOKEN} = $atime;
>1250 }
>

Follow-up, adding a check to see if $oldmagic is "" made the complaints
go away:

   1248 my $oldmagic = $self->{db_toks}->{$OLDEST_TOKEN_AGE_MAGIC_TOKEN};
   1249 $oldmagic = 0 if (defined($oldmagic) && $oldmagic eq "");
   1250 if (!defined ($oldmagic) || $atime < $oldmagic) {
   1251   $self->{db_toks}->{$OLDEST_TOKEN_AGE_MAGIC_TOKEN} = $atime;
   1252 }

this can be recoded more succinctly, I'm sure.






RE: [SAtalk] [RD] raw/rare/folded/plain/alphed body/subject rendering streams

2003-12-10 Thread SpamTalk

>This paragraph suggests that the spelling transformation would
>precede the ALPHED transformation.

It would probably have to be a fork rather than a pipe: once the text was
phonemed, I would think it would be hard to get it back into recognizable
English. Then again, that's what IBM ViaVoice and Dragon Dictate do. Are
there open-source voice recognition projects with an API into which we could
inject the phoneme stream?
 That should not make too big a dent in CPU utilization 


Another option: running an eval on the ALPHED stream and generating a
found/unfound ratio against a dictionary.

Keeping a hash of the found words in sequence might help too, although there
seems to be a trend of quoting from Project Gutenberg and random blogs to
poison hashing functions. Maybe just hash word tuples? That is probably too
close to current Bayesian methods.
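
The dictionary check mentioned above could look roughly like the sketch
below. It is only an illustration under stated assumptions: the input is taken
to be text that has already been reduced to the ALPHED stream, the word list
DICTIONARY and the function name found_unfound_ratio are hypothetical, and the
"ratio" is computed as found words over all words so that a message with no
unfound words does not divide by zero.

```python
import re


def found_unfound_ratio(text, dictionary):
    """Return (found, unfound, ratio) for the alphabetic words in text.

    ratio is found / total words, or 0.0 when text contains no words.
    """
    # Split on anything that is not a letter, so "ch3ap" becomes "ch", "ap".
    words = re.findall(r"[a-z]+", text.lower())
    found = sum(1 for word in words if word in dictionary)
    unfound = len(words) - found
    ratio = found / len(words) if words else 0.0
    return found, unfound, ratio


# Hypothetical word list; a real check would load a full dictionary file.
DICTIONARY = {"buy", "cheap", "pills", "now", "meeting", "at", "noon"}

print(found_unfound_ratio("Meeting at noon", DICTIONARY))
print(found_unfound_ratio("Buy ch3ap p1lls now", DICTIONARY))
```

A low ratio would then be one input to a rule score rather than a verdict on
its own, since legitimate mail full of names or jargon also misses the
dictionary.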

Best Regards, Bob




RE: [SAtalk] Using sa-learn in a site-wide configuration

2003-12-10 Thread Stephen Westrip
Yes, the DB_File Perl module has been installed.

If I run sa-learn --dump, I get this output:

0.000  0  2  0  non-token data: bayes db version
0.000  0  0  0  non-token data: nspam
0.000  0  0  0  non-token data: nham
0.000  0  0  0  non-token data: ntokens
0.000  0  0  0  non-token data: oldest atime
0.000  0  0  0  non-token data: newest atime
0.000  0  0  0  non-token data: last journal sync atime
0.000  0  0  0  non-token data: last expiry atime
0.000  0  0  0  non-token data: last expire atime delta
0.000  0  0  0  non-token data: last expire reduction count
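
For anyone who wants to check these counters from a script, here is a minimal
Python sketch that reads the "non-token data" lines of such a dump. It assumes
each entry sits on one unwrapped line and that the value is in the third
column, as the version line (2) in the output above suggests; the function
name parse_magic and the SAMPLE text are made up for the illustration.

```python
def parse_magic(dump_text):
    """Map each 'non-token data' label to its integer value."""
    prefix = "non-token data: "
    values = {}
    for line in dump_text.splitlines():
        # Four numeric columns, then the label as the rest of the line.
        fields = line.split(None, 4)
        if len(fields) != 5 or not fields[4].startswith(prefix):
            continue
        label = fields[4][len(prefix):].strip()
        values[label] = int(fields[2])
    return values


SAMPLE = """\
0.000  0  2  0  non-token data: bayes db version
0.000  0  0  0  non-token data: nspam
0.000  0  0  0  non-token data: nham
0.000  0  0  0  non-token data: ntokens
"""

magic = parse_magic(SAMPLE)
if magic["nspam"] == 0 and magic["nham"] == 0:
    print("database exists but nothing has been learned yet")
```

With nspam and nham both at 0, as here, the database files exist but no
message has been learned into them.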

Stephen Westrip
Metafour UK Ltd

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Matt
Kettler
Sent: 10 December 2003 18:15
To: Stephen Westrip; [EMAIL PROTECTED]
Subject: Re: [SAtalk] Using sa-learn in a site-wide configuration


At 12:04 PM 12/10/2003, Stephen Westrip wrote:
>What exactly do I need to do to make this work? I have read lots about 
>adding 'bayes_auto_learn 1' and other bits and pieces to put in the cf 
>file but whatever I try the Bayes DB never gets added to.

did you install DB_File? if not, bayes won't go. 





Re: [SAtalk] non-numeric atime in Bayes db? (SA 2.61)

2003-12-10 Thread Theo Van Dinter
On Wed, Dec 10, 2003 at 01:44:17PM -0800, Gary Funck wrote:
> Follow-up, adding a check to see if $oldmagic is "" made the complaints
> go away:
> 
>    1248 my $oldmagic = $self->{db_toks}->{$OLDEST_TOKEN_AGE_MAGIC_TOKEN};
>    1249 $oldmagic = 0 if (defined($oldmagic) && $oldmagic eq "");
>    1250 if (!defined ($oldmagic) || $atime < $oldmagic) {
>    1251   $self->{db_toks}->{$OLDEST_TOKEN_AGE_MAGIC_TOKEN} = $atime;
>    1252 }
> 
> this can be recoded more succinctly, I'm sure.

FYI: you changed the logic of the code there.  if there is no oldest
token atime (ie: this is the first token to be learned), it should set
the value to the atime.  you've now forced oldest to be 0 until the
first expire occurs and fixes it.

the !defined bit is supposed to catch when there is no oldest token, but
apparently instead of undef, it sometimes comes back as "" (boo DB_File!)

can you open a ticket in bugzilla about this?
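
The difference between the three behaviours can be modelled in a few lines.
This is a Python sketch of the logic only, not the SpamAssassin code: None
stands in for Perl's undef, the helper numeric() is a rough stand-in for
Perl treating "" as 0 in a numeric comparison, and all function names are
invented for the illustration.

```python
def numeric(value):
    """Rough stand-in for Perl's numeric coercion of "" to 0."""
    return 0 if value == "" else value


def update_original(oldest, atime):
    # Only a missing value (None) counts as "no oldest token yet".
    # An empty string reaches the comparison and behaves like 0.
    if oldest is None or atime < numeric(oldest):
        return atime
    return oldest


def update_first_fix(oldest, atime):
    # The follow-up patch: "" is turned into 0 before the comparison,
    # so no real atime is ever smaller and the oldest value stays 0.
    if oldest is not None and oldest == "":
        oldest = 0
    if oldest is None or atime < oldest:
        return atime
    return oldest


def update_intended(oldest, atime):
    # Treat "" exactly like a missing value: the first token sets the atime.
    if oldest is None or oldest == "" or atime < oldest:
        return atime
    return oldest


atime = 1071093857
for update in (update_original, update_first_fix, update_intended):
    print(update.__name__, repr(update("", atime)))
```

For an empty-string entry the first version leaves "" in place, the second
yields 0, and only the third records the new atime; for a missing entry all
three record it.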

-- 
Randomly Generated Tagline:
"The cardinal rule at our school is simple. No shooting at teachers. If
 you have to shoot a gun, shoot it at a student or an administrator."
 - "Word Smart II", from Princeton Review Pub.




Re: [SAtalk] Using sa-learn in a site-wide configuration

2003-12-10 Thread William Stearns
Good afternoon, Stephen,

On Wed, 10 Dec 2003, Stephen Westrip wrote:

> I am trying to set up sa-learn in a site-wide configuration. I have a Red
> Hat 9 server, SpamAssassin 2.61 and MIMEDefang 2.39. I have got SA to work
> fine and our spam has dropped considerably, but I would also like to use
> sa-learn.
> 
> What exactly do I need to do to make this work? I have read lots about
> adding 'bayes_auto_learn 1' and other bits and pieces to put in the cf file
> but whatever I try the Bayes DB never gets added to.

http://www.stearns.org/doc/spamassassin-setup.current.html#autoreporting

might be a good place to start if you're looking to have a 
sitewide bayes database.
Cheers,
- Bill

---
"To disable the Internet to save EMI and Disney is the moral
equivalent of burning down the library of Alexandria to ensure the
livelihood of monastic scribes."
-- John Ippolito (Guggenheim Institute)
(Courtesy of FD Cami <[EMAIL PROTECTED]>)
--
William Stearns ([EMAIL PROTECTED]).  Mason, Buildkernel, freedups, p0f,
rsync-backup, ssh-keyinstall, dns-check, more at:   http://www.stearns.org
Linux articles at: http://www.opensourcedigest.com
--





RE: [SAtalk] non-numeric atime in Bayes db? (SA 2.61)

2003-12-10 Thread Gary Funck
Hi Theo.


> -Original Message-
> From: Theo Van Dinter [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, December 10, 2003 2:02 PM
> To: Gary Funck
> Cc: Spamassassin List
> Subject: Re: [SAtalk] non-numeric atime in Bayes db? (SA 2.61)
> 
> 
> On Wed, Dec 10, 2003 at 01:44:17PM -0800, Gary Funck wrote:
> > Follow-up, adding a check to see if $oldmagic is "" made the complaints
> > go away:
> > 
> >    1248 my $oldmagic = $self->{db_toks}->{$OLDEST_TOKEN_AGE_MAGIC_TOKEN};
> >    1249 $oldmagic = 0 if (defined($oldmagic) && $oldmagic eq "");
> >    1250 if (!defined ($oldmagic) || $atime < $oldmagic) {
> >    1251   $self->{db_toks}->{$OLDEST_TOKEN_AGE_MAGIC_TOKEN} = $atime;
> >    1252 }
> > 
> > this can be recoded more succinctly, I'm sure.
> 
> FYI: you changed the logic of the code there.  if there is no oldest
> token atime (ie: this is the first token to be learned), it should set
> the value to the atime.  you've now forced oldest to be 0 until the
> first expire occurs and fixes it.
> 
> the !defined bit is supposed to catch when there is no oldest token, but
> apparently instead of undef, it sometimes comes back as "" (boo DB_File!)
> 

Are you saying it should've looked like this?

if ((!defined($oldmagic) || $oldmagic eq "") || $atime < $oldmagic) {
  $self->{db_toks}->{$OLDEST_TOKEN_AGE_MAGIC_TOKEN} = $atime;
}

> can you open a ticket in bugzilla about this?

Sure, will be glad to.






Re: [SAtalk] non-numeric atime in Bayes db? (SA 2.61)

2003-12-10 Thread Theo Van Dinter
On Wed, Dec 10, 2003 at 02:08:33PM -0800, Gary Funck wrote:
> Are you saying it should've looked like this?
> 
> if ((!defined($oldmagic) || $oldmagic eq "") || $atime < $oldmagic) {
>   $self->{db_toks}->{$OLDEST_TOKEN_AGE_MAGIC_TOKEN} = $atime;
> }

yeah, although my patch will be a little different.  (ignoring the fact
I'll have to find all the times that needs changing ... )

-- 
Randomly Generated Tagline:
Mac error message:  "Like, dude, something went wrong."




[SAtalk] Re: Bug#223399: spamassassin: not_ok_languages, no way to split Chinese

2003-12-10 Thread Dan Jacobson
>> And what if it doesn't match any of our ok_languages? then it will
>> fail, against our wishes.  Can you guarantee that you know all the
>> possibilities?

D> I still don't understand. You don't speak 1000 languages. Most speak 3
D> or 4 at most. They can add these to ok_languages.

OK, you agree that there should be an easy way to blacklist one
language now... all I must convince you of is why I would want to do it.

Just like when we read a book and there's one line of Russian or
Greek, we still want to read the book.  Just because there's a bit of
Burmese in there, we don't want to throw it into the fire.
Who knows what intentional witty quotes we might encounter... good to
keep an open mind.

However, anything in language [X] is always spam, so let me ban [X]
without having to unban every other possible language.




Re: [SAtalk] Re: Bug#223399: spamassassin: not_ok_languages, no way to split Chinese

2003-12-10 Thread JRiley
While gargling concrete, "Dan Jacobson" <[EMAIL PROTECTED]> spewed:

> However, anything in language [X] is always spam, so let me ban [X]
> without having to unban every other possible language.
>
>

Pretty Draconian. Must be nice to be able to do that.
My clients/customers tend to whine a little when they don't get the email
msg from their business partner overseas with the contract for them to sign
in the next hour, just because .ru tends to send a bit more spam than anyone
else and it is easier to just say damn.. block the whole country.
Might as well go the way of SPEWS and start blocking entire netblocks
because of the few idiots that allow spam from their leased IPs.
No communication is better than spam, right?

-JR





[SAtalk] Punctuation in text rule I sent

2003-12-10 Thread Greg Webster
Please note that the following rule that I sent earlier today...

rawbody GWW_PUNCT /([a-z][:punct:][a-z])|( [A-Z][:punct:]+|[a-z])/i
score   GWW_PUNCT 2.0

...was effectively untested, and I discovered a flaw fairly quickly: it
also matches words like "can't" and "it's".

It also has a case-insensitive flag that nullifies the second part of the
regexp!

Don't use it without knowing that this is the case. Test it, see if you
can do better, and please post the results.
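
As a starting point for such testing, here is a small Python sketch of the
letter-punctuation-letter idea and of the "can't" problem. It is not the rule
above: Python's re module has no POSIX classes, so the punctuation set is
built from string.punctuation, and the names NAIVE, STRICT and ALLOWED are
invented for the example. (In Perl, a POSIX class is only recognised inside a
bracketed class, i.e. written as [[:punct:]].) Treating apostrophes and
hyphens between letters as ordinary English is an assumption, and it means
obfuscation that uses those two characters slips through.

```python
import re
import string

# Every ASCII punctuation character, escaped for use inside [...].
PUNCT = re.escape(string.punctuation)

# Letter, one punctuation character, letter.
NAIVE = re.compile(rf"[a-z][{PUNCT}][a-z]", re.IGNORECASE)

# Assumption: apostrophes and hyphens between letters are normal English.
ALLOWED = "'-"
STRICT_PUNCT = re.escape(
    "".join(ch for ch in string.punctuation if ch not in ALLOWED)
)
STRICT = re.compile(rf"[a-z][{STRICT_PUNCT}][a-z]", re.IGNORECASE)

for sample in ("can't", "it's", "v.i.a.g.r.a", "plain words only"):
    print(sample, bool(NAIVE.search(sample)), bool(STRICT.search(sample)))
```

The naive pattern fires on "can't" and "it's" as well as on "v.i.a.g.r.a";
the stricter one only fires on the last of those.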

Thanks,

Greg

-- 
Greg Webster - [EMAIL PROTECTED]
In-Touch Software Corporation
Ph: (604)278-0515 - Fax: (604)608-3112





Re: [SAtalk] Re: Bug#223399: spamassassin: not_ok_languages, no way to split Chinese

2003-12-10 Thread Bob Apthorpe
Hi,

> While gargling concrete, "Dan Jacobson" <[EMAIL PROTECTED]> spewed:
>
> > However, anything in language [X] is always spam, so let me ban [X]
> > without having to unban every other possible language.
>
> Pretty Draconian. Must be nice to be able to do that.

Not 'nice' but 'convenient'; I don't think anyone really enjoys filtering
their mail by language. Draconian? Perhaps. Let me check my mail archives
to see how many legitimate messages we received written in a
non-ISO-8859-x charset.

> My clients/customers tend to whine a little when they don't get their
> email msg from their business partner overseas with the contract for
> them to sign in the next hour, because .ru tends to send a bit more spam
> than anyone else and is easier to just say damn..block the whole
> country.

YMMV. For those of us that run mail systems that serve only a handful of
users, blocking mail originating from Asia, Russia, and South and Central
America as well as mail written in languages not spoken within 500 miles
of the server closet would keep a lot of crap out of our users' mailboxes
with nary a complaint[1].

Again, YMMV, so don't feel compelled to block more than you're
already comfortable blocking. But please don't presume that filtering that
does (or doesn't) work for you will (or won't) for others. Limiting choice
in blocking tools & criteria doesn't help matters.

> Might as well go the way of SPEWS and start blocking entire netblocks
> because of the few idiots that allow spam from their leased IP's.
> no communication is better than spam right?

Perhaps I should stop rejecting mail from known open proxies and dynamic
allocations because I might hurt the feelings of some suburbanite DSL
customer who thinks kernel patches are some form of military insignia.

I'm not generally a betting man but I'd wager you a dollar that
enlightened self-interest is less effective at convincing an ISP to
enforce its AUP than a bawling throng of irate customers, upset that
their mail is being rejected by the handful of providers using SPEWS.
There are a number of providers that are utterly unresponsive until
someone (metaphorically) smacks them upside the head with a paving brick.

Seriously, suggest a practical, kinder, gentler, more effective
alternative (build a better brick) and SPEWS will disappear overnight.

> 

Taken under advisement. :)

-- Bob

[1] The mail systems under my control don't block by language or
the sender's geographical location but they easily could. Culturally-blind
DNSBLs and envelope, header, and body checks keep most of the garbage out.




[SAtalk] A question abouting teaching Spamassain

2003-12-10 Thread stan
I installed SpamAssassin a couple of days ago, on a Debian machine, and at
first it seemed to work great, catching all but a few spam messages. Then I
used sa-learn to teach it using hundreds of stored mails I have in my mail
folders (mostly from mailing lists). Now it seems to be missing nearly
every spam!

Did I do wrong by teaching it with lots of _good_ messages? Should I reset
it to the base rules, and start over? BTW how can I do that?

-- 
"They that would give up essential liberty for temporary safety deserve
neither liberty nor safety."
-- Benjamin Franklin




Re: [SAtalk] A question abouting teaching Spamassain

2003-12-10 Thread Matt Kettler
At 06:48 PM 12/10/2003, stan wrote:
> Did I do wrong by teaching it with lots of _good_ messages? Should I reset
> it to the base rules, and start over? BTW how can I do that?

Ideally you want to train it with something "realistic" in terms of
spam/ham ratio, i.e. something close to what you get in reality. Bayes will
tolerate a lot of variance from the "ideal" ratio, but you can't go too far
off course with it: 200 spams and 200,000 hams will result in a bad database.

you can wipe out your bayes db and start over just by deleting it.. it's 
stored in a collection of files: ~/.spamassassin/bayes*
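
If you would rather script the reset than delete the files by hand, something
along these lines would do it. This is only a sketch in Python: the function
name reset_bayes_db is made up, and it assumes the per-user location given
above and that nothing is reading or writing the database while it runs.

```python
import glob
import os


def reset_bayes_db(state_dir):
    """Delete every file named bayes* in state_dir; return the removed paths."""
    removed = []
    for path in sorted(glob.glob(os.path.join(state_dir, "bayes*"))):
        if os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed


# Typical call for the per-user database (left commented out on purpose):
# reset_bayes_db(os.path.expanduser("~/.spamassassin"))
```

Other files in the directory, such as user preferences, are left alone, and
the database starts again from empty with the next message that is learned.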




