Re: how to speed up scans of really large text-only emails?

2009-09-08 Thread Jason Haar
On 09/09/2009 12:53 PM, Karsten Bräckelmann wrote: > > Ah, good point, Mark -- that reminds me of the infamous issue of > un-bound or nested quantifiers in RE rules. In some pathological cases, > I've even debugged these to be the culprit of bringing SA down to its > knees. > > Any custom rules? Do
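The nested-quantifier problem mentioned above can be reproduced outside SpamAssassin. The following is a minimal sketch in Python, whose `re` module is a backtracking engine like Perl's. The two patterns and the test string are made up for illustration and are not taken from any real rule set.

```python
import re
import time

# Hypothetical patterns for illustration only; not from any SpamAssassin rule.
NESTED = re.compile(r"(a+)+$")  # nested quantifier: backtracking explodes when the match fails
FLAT = re.compile(r"a+$")       # accepts the same strings, but without nesting

def time_match(pattern, text):
    """Return (match_or_None, elapsed_seconds) for pattern.match(text)."""
    start = time.perf_counter()
    result = pattern.match(text)
    return result, time.perf_counter() - start

# A run of "a" followed by a character that forces the overall match to fail.
# Each extra "a" roughly doubles the work for NESTED, so keep the run short.
subject = "a" * 18 + "b"

nested_result, nested_secs = time_match(NESTED, subject)
flat_result, flat_secs = time_match(FLAT, subject)
print(f"nested: {nested_secs:.4f}s  flat: {flat_secs:.6f}s")
```

Both patterns reject the subject, but the nested one has to try every way of splitting the run of "a" between the inner and outer quantifier before giving up. On a body of several hundred kilobytes, a rule of this shape is a plausible candidate for the kind of slowdown discussed in this thread.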

Re: user prefs from sql problem

2009-09-08 Thread Matt Kettler
Karel Beneš wrote: > Hi, > > I am trying to load user preferences from an SQL database (MySQL). Setup was > done according to "doc/spamassassin/sql/README.gz", but user > preferences are still loaded from files. No error message is written > to the log file in debug mode. DB-based Bayes and AWL work fine. >

Re: how to speed up scans of really large text-only emails?

2009-09-08 Thread Karsten Bräckelmann
On Wed, 2009-09-09 at 02:21 +0200, Mark Martinec wrote: > On Tuesday September 8 2009 21:23:42 Jason Haar wrote: > > Actually, it's HAM - not spam. In the end it's really become clear it > > shows limitations in perl's parsing power - so either we get gruntier > > boxes - or increase the timeout. W

Re: how to speed up scans of really large text-only emails?

2009-09-08 Thread Mark Martinec
On Tuesday September 8 2009 21:23:42 Jason Haar wrote: > Actually, it's HAM - not spam. In the end it's really become clear it > shows limitations in perl's parsing power - so either we get gruntier > boxes - or increase the timeout. We've gone with the latter. Some regexps do perform terribly whe

Re: A silly logging question

2009-09-08 Thread Mark Martinec
On Tuesday September 8 2009 12:10:41 Clunk Werclick wrote: > I'm using syslog-ng, but despite listening to; > unix-stream("/dev/log"); > It gets nothing - but I don't expect it to as the default spamassassin > conf has this line; > > OPTIONS="--create-prefs --max-children 5 --username spamd > --hel

Re: A silly logging question

2009-09-08 Thread Karsten Bräckelmann
> On Tue, 8 Sep 2009, Clunk Werclick wrote: > > > On Tue, 8 Sep 2009, Clunk Werclick wrote: > > > > I have it now - the only disappointment for me is it does not log the > > > > 'to' or 'from' or client ip. Blew away most of this thread already, before it started getting my attention. Anyway, just

Re: A silly logging question

2009-09-08 Thread Martin Gregorie
On Tue, 2009-09-08 at 12:08 -0700, John Hardin wrote: > On Tue, 8 Sep 2009, Clunk Werclick wrote: > > Sadly, no. As Fetchmail is polling a remote POP3 server, the only part > > of the system to see *all* of the information, is Spamassassin. The MTA > > only sees 'localhost' from Fetchmail. Postfi

Re: A silly logging question

2009-09-08 Thread Clunk Werclick
On Tue, 2009-09-08 at 11:50 +0300, Jari Fredriksson wrote: > > This is probably a dumb question, but my looking through > > the docs is just confusing me. > > > > Can I get SpamAssassin to fully log what it is doing? The > > best I can ever get is something like this; > > > > Mon Aug 3 06:27:57

Re: how to speed up scans of really large text-only emails?

2009-09-08 Thread John Hardin
On Wed, 9 Sep 2009, Jason Haar wrote: On 09/09/2009 04:07 AM, John Hardin wrote: Do you have any stats on how spammy this class of mail is? Is it pure ham that you can detect using other methods, e.g. it's sent from a trusted source? Actually, it's HAM - not spam. My point. If those messa

Re: how to speed up scans of really large text-only emails?

2009-09-08 Thread Jason Haar
On 09/09/2009 04:07 AM, John Hardin wrote: > > Do you have any stats on how spammy this class of mail is? Is it pure > ham that you can detect using other methods, e.g. it's sent from a > trusted source? > Actually, it's HAM - not spam. In the end it's really become clear it shows limitations in p

Re: A silly logging question

2009-09-08 Thread John Hardin
On Tue, 8 Sep 2009, Clunk Werclick wrote: On Tue, 2009-09-08 at 09:34 -0700, John Hardin wrote: On Tue, 8 Sep 2009, Clunk Werclick wrote: I have it now - the only disappointment for me is it does not log the 'to' or 'from' or client ip. You may be able to determine that if you correlate mor

Re: whitelist_from_dkim

2009-09-08 Thread McDonald, Dan
On Tue, 2009-09-08 at 18:24 +0100, Martin Gregorie wrote: > On Tue, 2009-09-08 at 18:54 +0200, Benny Pedersen wrote: > > On Tue 08 Sep 2009 06:25:49 PM CEST, Mark Martinec wrote > > > > > Sure, if you want it to be whitelisted. > > > > tidy gives me 95 warnings on the HTML part :) > > > That's no

Re: whitelist_from_dkim

2009-09-08 Thread Martin Gregorie
On Tue, 2009-09-08 at 18:54 +0200, Benny Pedersen wrote: > On Tue 08 Sep 2009 06:25:49 PM CEST, Mark Martinec wrote > > > Sure, if you want it to be whitelisted. > > tidy gives me 95 warnings on the HTML part :) > That's normal. The HTML generated by word processors, etc. is seldom clean but every

Re: A silly logging question

2009-09-08 Thread Clunk Werclick
On Tue, 2009-09-08 at 09:34 -0700, John Hardin wrote: > On Tue, 8 Sep 2009, Clunk Werclick wrote: > > > I have it now - the only disappointment for me is it does not log the > > 'to' or 'from' or client ip. > > You may be able to determine that if you correlate more than one log. SA > logs the m

Re: whitelist_from_dkim

2009-09-08 Thread Benny Pedersen
On Tue 08 Sep 2009 06:25:49 PM CEST, Mark Martinec wrote: Sure, if you want it to be whitelisted. tidy gives me 95 warnings on the HTML part :) In the absence of the second parameter, whitelist_from_dkim whitelists only on author signatures. This makes it simple to dump address books from Horde

Re: A silly logging question

2009-09-08 Thread John Hardin
On Tue, 8 Sep 2009, Clunk Werclick wrote: I have it now - the only disappointment for me is it does not log the 'to' or 'from' or client ip. You may be able to determine that if you correlate more than one log. SA logs the message-ID, and the MTA log should give you enough information to det
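The log-correlation idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes the spamd result line carries a `mid=<...>` field and that the MTA log carries a `message-id=<...>` field. The sample log lines, the function names and both regular expressions are hypothetical and need adjusting to the real log formats.

```python
import re

# Assumed log shapes; adjust both patterns to match your actual logs.
SA_MID = re.compile(r"spamd: result: .*?\bmid=(<[^>]+>)")
MTA_MID = re.compile(r"message-id=(<[^>]+>)")

def index_by_message_id(lines, pattern):
    """Map each Message-ID found by `pattern` to the log lines that mention it."""
    index = {}
    for line in lines:
        match = pattern.search(line)
        if match:
            index.setdefault(match.group(1), []).append(line)
    return index

def correlate(sa_lines, mta_lines):
    """Pair SpamAssassin and MTA log lines that share a Message-ID."""
    sa_index = index_by_message_id(sa_lines, SA_MID)
    mta_index = index_by_message_id(mta_lines, MTA_MID)
    return {mid: (sa_index[mid], mta_index[mid]) for mid in sa_index if mid in mta_index}

# Hypothetical sample lines, made up for this sketch.
sa_log = [
    "spamd: result: . -1 - BAYES_00 scantime=1.2,size=2048,mid=<abc@example.org>,autolearn=ham",
    "spamd: result: Y 12 - SOME_RULE scantime=0.9,size=512,mid=<xyz@example.net>,autolearn=no",
]
mta_log = [
    "postfix/cleanup[1234]: 3F2A1: message-id=<abc@example.org>",
]

pairs = correlate(sa_log, mta_log)
for mid, (sa_hits, mta_hits) in pairs.items():
    print(mid, len(sa_hits), len(mta_hits))
```

Keying on the Message-ID keeps the two logs independent: neither side needs to know the other's queue IDs or process IDs. Messages without a Message-ID header, or with a duplicated one, would need a different key.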

Re: A silly logging question

2009-09-08 Thread Clunk Werclick
On Tue, 2009-09-08 at 09:08 -0700, John Hardin wrote: > On Tue, 8 Sep 2009, Clunk Werclick wrote: > > > Can I get SpamAssassin to fully log what it is doing? The best I can > > ever get is something like this; > > > > Mon Aug 3 06:27:57 2009 [4290] info: logger: removing stderr method > > Mon Aug

Re: whitelist_from_dkim

2009-09-08 Thread Mark Martinec
Benny, > > Still when it is checked by DKIM, it reports "author > > keine-antw...@community36.net, not in any dkim whitelist". > > correct, it happens here as well > > [22718] dbg: dkim: VALID third-party signature > by id keine-antwort=3dcommunity36@mcsv129.net, > author keine-antw...@com

Re: A silly logging question

2009-09-08 Thread John Hardin
On Tue, 8 Sep 2009, Clunk Werclick wrote: Can I get SpamAssassin to fully log what it is doing? The best I can ever get is something like this; Mon Aug 3 06:27:57 2009 [4290] info: logger: removing stderr method Mon Aug 3 06:27:58 2009 [4292] info: spamd: server started on port 783/tcp (runni

Re: how to speed up scans of really large text-only emails?

2009-09-08 Thread John Hardin
On Tue, 8 Sep 2009, Jason Haar wrote: We're having problems with a particular class of email. >400K in size, text-only. Do you have any stats on how spammy this class of mail is? Is it pure ham that you can detect using other methods, e.g. it's sent from a trusted source? If so, you may be

Re: whitelist_from_dkim

2009-09-08 Thread Benny Pedersen
On Tue 08 Sep 2009 10:04:21 AM CEST, Per Jessen wrote: Still when it is checked by DKIM, it reports "author keine-antw...@community36.net, not in any dkim whitelist". correct, it happens here as well [22718] dbg: dkim: performing public key lookup and signature verification [22718] dbg: dkim: sig

Re: Filtering depending mail header

2009-09-08 Thread Theo Van Dinter
There's no way to do that with SpamAssassin itself. Once you send something to SA, it will do the whole process (there's short circuiting, but that's not really what you want here). It sounds like you're trying to not filter internal mail but filter external mail, so I would recommend two things:

Re: how to speed up scans of really large text-only emails?

2009-09-08 Thread Karsten Bräckelmann
On Tue, 2009-09-08 at 13:50 +1200, Jason Haar wrote: > [...] Allowing spamd to only scan the first 50KB of text attachments > would do the trick. I can't think of a way that could be misused by > spammers? (ie they aren't going to send text-spam where the first 50KB > is "bayes killer" and the fina

user prefs from sql problem

2009-09-08 Thread Karel Beneš
Hi, I am trying to load user preferences from an SQL database (MySQL). Setup was done according to "doc/spamassassin/sql/README.gz", but user preferences are still loaded from files. No error message is written to the log file in debug mode. DB-based Bayes and AWL work fine. Debian GNU/Linux 5.0.3, spamas

Re: whitelist_from_dkim [solved]

2009-09-08 Thread Per Jessen
Mark Martinec wrote: > Per, > [snip] > whitelist_from_dkim *...@community36.net mcsv129.net > Just to confirm that it works: dkim: author keine-antw...@community36.net, WHITELISTED by whitelist_from_dkim /Per Jessen, Zürich

Re: whitelist_from_dkim

2009-09-08 Thread Per Jessen
Mark Martinec wrote: > Per, > > Without the second argument to whitelist_from_dkim, it checks for > author signatures, as documented. In your case the mail carries a > signature by domain mcsv129.net, so you have a third-party signature > there. > > If you want to whitelist an author by some thi

Re: whitelist_from_dkim

2009-09-08 Thread Mark Martinec
Per, > >> http://jessen.ch/files/community36.eml > >> whitelist_from_dkim *...@community36.net > >> > >> The actual author is 'keine-antw...@community36.net'; I have run it > >> through SA with debug on and I see it being added to whitelist > >> entries. Still when it is checked by DKIM, it reports

Re: whitelist_from_dkim

2009-09-08 Thread Per Jessen
Matus UHLAR - fantomas wrote: > On 08.09.09 10:04, Per Jessen wrote: >> I still don't seem to be getting more friendly with >> whitelist_from_dkim - >> >> could someone please try feeding this email through your SA setup: >> >> http://jessen.ch/files/community36.eml >> >> with this enabled: >>

Filtering depending mail header

2009-09-08 Thread Daniel Ruiz Molina
Hi, I want to know whether SpamAssassin can be configured so that it runs only when a mail header exists with a defined value. System configuration is the following: SpamAssassin: /etc/spamassassin/ rewrite_header Subject *SPAM* report_safe 0

Re: whitelist_from_dkim

2009-09-08 Thread Matus UHLAR - fantomas
On 08.09.09 10:04, Per Jessen wrote: > I still don't seem to be getting more friendly with > whitelist_from_dkim - > > could someone please try feeding this email through your SA setup: > > http://jessen.ch/files/community36.eml > > with this enabled: > > whitelist_from_dkim *...@community36.n

Re: A silly logging question

2009-09-08 Thread Jari Fredriksson
> This is probably a dumb question, but my looking through > the docs is just confusing me. > > Can I get SpamAssassin to fully log what it is doing? The > best I can ever get is something like this; > > Mon Aug 3 06:27:57 2009 [4290] info: logger: removing > stderr method Mon Aug 3 06:27:58 20

Re: how to speed up scans of really large text-only emails?

2009-09-08 Thread Jason Haar
On 09/08/2009 07:54 PM, Matus UHLAR - fantomas wrote: > > It would also make spamd more complicated for no good reason. Simply use > spamc -t 120 or 180, I think up to 240 is safe at SMTP level unless you are > using other time-consuming test (data phase should end in 5 minutes > otherwise client m
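For anyone calling spamc from a wrapper script, the timeout suggestion quoted above might look roughly like the following Python sketch. Only the `-t` option comes from the thread; the function names, the default of 120 seconds and the extra 10 seconds of slack around the subprocess call are assumptions made for this example.

```python
import subprocess

def spamc_argv(timeout_secs=120):
    """Build a spamc command line that uses the given -t timeout in seconds."""
    return ["spamc", "-t", str(timeout_secs)]

def scan(message_bytes, timeout_secs=120):
    """Pipe a raw message through spamc and return whatever spamc writes to stdout.

    Hypothetical helper: needs spamc on the PATH and a running spamd.
    """
    proc = subprocess.run(
        spamc_argv(timeout_secs),
        input=message_bytes,
        stdout=subprocess.PIPE,
        timeout=timeout_secs + 10,  # assumed slack so spamc times out first
        check=False,
    )
    return proc.stdout

print(spamc_argv(180))  # prints ['spamc', '-t', '180']
```

Keeping the spamc timeout below the SMTP data-phase limit mentioned in the quoted reply matters only when the scan happens during the SMTP transaction; for post-delivery filtering the limit is a matter of patience rather than protocol.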

whitelist_from_dkim

2009-09-08 Thread Per Jessen
SA list, I still don't seem to be getting more friendly with whitelist_from_dkim - could someone please try feeding this email through your SA setup: http://jessen.ch/files/community36.eml with this enabled: whitelist_from_dkim *...@community36.net The actual author is 'keine-antw...@communi

Re: how to speed up scans of really large text-only emails?

2009-09-08 Thread Matus UHLAR - fantomas
> On 09/08/2009 01:50 PM, Jason Haar wrote: > > We're having problems with a particular class of email. >400K in size, > > text-only. spamd takes 40-80sec to process it, and spamc is set with a > > 30sec timeout. The long processing time isn't network-related: it's > > all those "body" searches tha