Re: Calling spamassassin directly yields very different results than calling spamassassin via amavis-new

2013-04-19 Thread Ben Johnson
Apologies for the rapid-fire here folks, but I wanted to correct something. I had these backwards: >> Yes, I believe that me and the system always execute SA commands as the >> "amavis" user. When I was using the SQL setup, I had the following in >> local.cf: >> >> bayes_path /var/lib/amavis/.sp

Re: Calling spamassassin directly yields very different results than calling spamassassin via amavis-new

2013-04-19 Thread Ben Johnson
On 4/19/2013 1:54 PM, Benny Pedersen wrote: > Ben Johnson wrote on 2013-04-19 18:02: > >> Still stumped here... > > for amavisd-new, put spamassassin sql setup into user_prefs file for the > user amavisd-new runs as might be working better than have insecure sql > settings in /etc/mail/spamass

Re: Calling spamassassin directly yields very different results than calling spamassassin via amavis-new

2013-04-19 Thread Ben Johnson
On 4/19/2013 12:12 PM, Axb wrote: > On 04/19/2013 06:02 PM, Ben Johnson wrote: > >> Still stumped here... > > do a bayes sa-learn --backup > > switch to file based in SDBM format (which is fast) > > do a > > sa-learn --restore > > feed it a few thousand NEW spams > > see what happens > >

Reporting matched Rules and Scores (was: Re: sa-exim Terse Rules)

2013-04-19 Thread Karsten Bräckelmann
On Thu, 2013-04-18 at 19:24 -0500, John Traweek CCNA, Sec+ wrote: > I’m new to the list, so if there are web archives that are easily > searchable where I can find this info please point me to it. I am > running sa-exim with SA 3.3.1. I am trying for the life of me to turn > on the Terse report o

Re: bayes - large message

2013-04-19 Thread Joe Acquisto-j4
>>> On 4/19/2013 at 8:26 PM, "Joe Acquisto-j4" wrote: > I thought I had corrected this issue, with someone's assistance, a while ago: > > Apr 19 20:21:02.477 [23670] dbg: bayes: expiry completed > Apr 19 20:21:02.477 [23670] info: archive-iterator: skipping large message > Learned tokens from 0 m

bayes - large message

2013-04-19 Thread Joe Acquisto-j4
I thought I had corrected this issue, with someone's assistance, a while ago:
Apr 19 20:21:02.477 [23670] dbg: bayes: expiry completed
Apr 19 20:21:02.477 [23670] info: archive-iterator: skipping large message
Learned tokens from 0 message(s) (0 message(s) examined)

Re: local score ignored

2013-04-19 Thread Joe Acquisto-j4
>>> On 4/19/2013 at 10:41 AM, John Hardin wrote: > On Fri, 19 Apr 2013, Joe Acquisto-j4 wrote: > >>> What output does the command "sa-learn --dump magic" produce? >> >> 0.000 0 1872 0 non-token data: nspam >> 0.000 0 9184 0 non-token data: nham >

Re: Need rule to catch lots of font changes

2013-04-19 Thread Alex
Hi, > I'm trying to adapt this to work with multiple tags, but I must be doing something wrong. I've tried changing it to match just 10 > instances of , just for testing. Here's what I have: > >> rawbody __LOC_BR // >> tflags __LOC_BR multiple maxhits=11 >> meta LOC_MULT_BR > 10 >> score L

Re: Calling spamassassin directly yields very different results than calling spamassassin via amavis-new

2013-04-19 Thread Benny Pedersen
Ben Johnson wrote on 2013-04-19 18:02: > Still stumped here... For amavisd-new, putting the SpamAssassin SQL setup into the user_prefs file of the user amavisd-new runs as might work better than having insecure SQL settings in /etc/mail/spamassassin :) i dont know if this is really that you have a

Re: Calling spamassassin directly yields very different results than calling spamassassin via amavis-new

2013-04-19 Thread Benny Pedersen
John Hardin wrote on 2013-04-18 04:15: ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin; Unicode is overkill since bayes is just ASCII; if Unicode is used it will create a bigger db, which will slow things down more than ASCII. Please check the SpamAssassin bugzilla to see if this situation is al

Re: local score ignored

2013-04-19 Thread Benny Pedersen
Joe Acquisto-j4 wrote on 2013-04-19 13:10: 0.000 0 1872 0 non-token data: nspam 0.000 0 9184 0 non-token data: nham any use of whitelist_from ? score whitelist_from 0.001 why ?, whitelist_from can be forged, and will poison bayes if not car

Re: sa-exim Terse Rules

2013-04-19 Thread Benny Pedersen
John Traweek CCNA, Sec+ wrote on 2013-04-19 02:24: I'm new to the list, so if there are web archives that are easily searchable where I can find this info please point me to it. I am running sa-exim with SA 3.3.1. http://spamassassin.apache.org/ don't trust mailing list archives, use the web :=)

Re: Calling spamassassin directly yields very different results than calling spamassassin via amavis-new

2013-04-19 Thread Axb
On 04/19/2013 06:02 PM, Ben Johnson wrote:
> Still stumped here...
do a bayes sa-learn --backup
switch to file based in SDBM format (which is fast)
do a
sa-learn --restore
feed it a few thousand NEW spams
see what happens
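The sequence suggested above can be sketched as a small Python helper that only builds and prints the commands, so nothing runs by accident. `sa-learn --backup`, `--restore` and `--spam` are real sa-learn options; the file name, the spam directory, and the helper names are hypothetical. The switch itself is a manual config change, assumed here to mean setting `bayes_store_module Mail::SpamAssassin::BayesStore::SDBM` in local.cf in place of the SQL settings.

```python
import shlex

def plan_bayes_migration(backup_file="bayes-backup.txt", spam_dir="new-spam/"):
    """Return the steps as (description, argv, stdout_file) tuples.

    Nothing is executed here. backup_file and spam_dir are placeholder
    names, not paths taken from the thread.
    """
    return [
        ("dump the current Bayes DB", ["sa-learn", "--backup"], backup_file),
        # Manual step: point local.cf at the SDBM store (assumption, see lead-in).
        ("edit local.cf to use the SDBM store", None, None),
        ("load the dump into the new store", ["sa-learn", "--restore", backup_file], None),
        ("train on fresh spam", ["sa-learn", "--spam", spam_dir], None),
    ]

def render(steps):
    """Turn the plan into shell-style lines for review before running by hand."""
    lines = []
    for description, argv, stdout_file in steps:
        if argv is None:
            lines.append(f"# manual step: {description}")
            continue
        cmd = shlex.join(argv)
        if stdout_file:
            # --backup writes the dump to stdout, so redirect it to a file.
            cmd += " > " + shlex.quote(stdout_file)
        lines.append(f"{cmd}  # {description}")
    return lines

if __name__ == "__main__":
    print("\n".join(render(plan_bayes_migration())))
```

The commands would presumably need to run as the same user amavisd-new runs as, so that the restored database is the one the scanner actually reads.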

Re: local score ignored

2013-04-19 Thread Bowie Bailey
On 4/19/2013 7:10 AM, Joe Acquisto-j4 wrote: 0.000 0 3 0 non-token data: bayes db version 0.000 0 1872 0 non-token data: nspam 0.000 0 9184 0 non-token data: nham 0.000 0 140303 0 non-token data:

Re: Need rule to catch lots of font changes

2013-04-19 Thread Bowie Bailey
On 4/18/2013 7:32 PM, Alex wrote: I'm trying to adapt this to work with multiple tags, but I must be doing something wrong. I've tried changing it to match just 10 instances of , just for testing. Here's what I have: rawbody __LOC_BR // tflags __LOC_BR multiple maxhits=11 meta LOC_MULT

Re: Need rule to catch lots of font changes

2013-04-19 Thread Alexandre Boyer
Hi, your meta is wrong. It should be: meta LOC_MULT_BR __LOC_BR > 10 Note that it will not match "just" 10 instances of this tag. It will match "at least" ten of them. If you want exactly 10, you have to do something like: meta LOC_MULT_BR __LOC_BR = 10 Never done that, maybe you need to
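A rough Python model of the counting semantics may help here. It is not how SpamAssassin is implemented; it only mimics `tflags ... multiple maxhits=11` followed by `meta LOC_MULT_BR __LOC_BR > 10`. The tag inside the rawbody pattern was stripped from the archived message, so `<br>` is an assumption based on the rule name, and the function names are made up for illustration.

```python
import re

# Assumption: the tag lost from the archived rule is <br> (suggested by __LOC_BR).
BR = re.compile(r"<br\s*/?>", re.IGNORECASE)

def count_hits(raw_body, pattern, maxhits):
    """Count regex matches, stopping at maxhits (models `tflags multiple maxhits=N`)."""
    hits = 0
    for _ in pattern.finditer(raw_body):
        hits += 1
        if hits >= maxhits:
            break
    return hits

def loc_mult_br(raw_body):
    """Model of `meta LOC_MULT_BR __LOC_BR > 10`: true only when the count exceeds 10."""
    return count_hits(raw_body, BR, maxhits=11) > 10

if __name__ == "__main__":
    for n in (10, 11, 50):
        print(n, loc_mult_br("<br>" * n))
```

In this model a body with exactly ten tags does not fire, because `> 10` needs an eleventh hit; that is also why capping the count at `maxhits=11` loses nothing.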

Re: Calling spamassassin directly yields very different results than calling spamassassin via amavis-new

2013-04-19 Thread Ben Johnson
On 4/19/2013 11:42 AM, Alex wrote: > Hi, > >> Is this normal? If so, what is the explanation for this behavior? I have > > marked dozens of nearly-identical messages with the subject > "Garden hose > expands up to three times its length" as SPAM (over the course of >

Re: Calling spamassassin directly yields very different results than calling spamassassin via amavis-new

2013-04-19 Thread Alex
Hi, > Is this normal? If so, what is the explanation for this behavior? I have > marked dozens of nearly-identical messages with the subject "Garden hose >> expands up to three times its length" as SPAM (over the course of >> several weeks) as SPAM, and yet SA reports "not enough usable tokens >>

Re: Calling spamassassin directly yields very different results than calling spamassassin via amavis-new

2013-04-19 Thread Alex
Hi, > Might anyone be in a position to offer an authoritative response to > these questions? > > I continue to see messages that are very similar to dozens of messages > that have been marked as SPAM slipping through with *no Bayes scoring* > (this is *after* fixing the SQL syntax error issue): >

Re: Calling spamassassin directly yields very different results than calling spamassassin via amavis-new

2013-04-19 Thread Ben Johnson
On 4/18/2013 12:18 PM, Ben Johnson wrote: > > My concern now is that I am on 3.3.1, with little control over upgrades. > I have read all three bug reports in their entirety and Bug 6624 seems > to be a very legitimate concern. To quote Mark in the bug description: > >> The effect of the bug wit

Re: local score ignored

2013-04-19 Thread John Hardin
On Fri, 19 Apr 2013, Joe Acquisto-j4 wrote: What output does the command "sa-learn --dump magic" produce? 0.000 0 1872 0 non-token data: nspam 0.000 0 9184 0 non-token data: nham Generally you want the ratio of trained messages to reflect the
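To check the trained spam/ham balance discussed here, the `sa-learn --dump magic` output can be parsed with a few lines of Python. This is a minimal sketch: the sample lines are the ones quoted in this thread, the column layout is assumed from those lines, and `parse_magic` is a made-up helper name.

```python
import re

# Sample lines as quoted in the thread (spacing is illustrative).
SAMPLE = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0       1872          0  non-token data: nspam
0.000          0       9184          0  non-token data: nham
"""

# Assumed layout: four whitespace-separated columns, the third being the value.
LINE = re.compile(r"\s*\S+\s+\S+\s+(\d+)\s+\S+\s+non-token data:\s*(.+?)\s*$")

def parse_magic(text):
    """Return {name: value} for every 'non-token data' line."""
    counts = {}
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            counts[m.group(2)] = int(m.group(1))
    return counts

counts = parse_magic(SAMPLE)
nspam, nham = counts["nspam"], counts["nham"]
spam_share = nspam / (nspam + nham)

if __name__ == "__main__":
    print(f"nspam={nspam} nham={nham} spam share of trained mail={spam_share:.1%}")
```

With the numbers from this thread, far more ham than spam has been learned, which is the imbalance the replies are asking about.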

Re: local score ignored

2013-04-19 Thread John Hardin
On Fri, 19 Apr 2013, Joe Acquisto-j4 wrote: On 18.04.13 21:45, Joe Acquisto-j4 wrote: All I can do is feed it. that is what you should do. You need to train on both spam and ham, since the BAYES filter must know how they differ... That has always given me pause, as I get very little ham. Go

Re: local score ignored

2013-04-19 Thread Matus UHLAR - fantomas
Niamh Holding 04/19/13 7:11 AM >>> You only get one ham email a month? On 19.04.13 09:22, Joe Acquisto-j4 wrote: That's all *I* seem to get. Other users may differ, but I have given them instructions on how to forward stuff for training. This is a rather small system compared to what many of y

Re: local score ignored

2013-04-19 Thread Joe Acquisto-j4
That's all *I* seem to get. Other users may differ, but I have given them instructions on how to forward stuff for training. This is a rather small system compared to what many of you deal with. joe a. >>> Niamh Holding 04/19/13 7:11 AM >>> Hello Joe, Friday, April 19, 2013, 12:02:32 PM, you wro

Re: local score ignored

2013-04-19 Thread Matus UHLAR - fantomas
On 4/19/2013 at 6:29 AM, Matus UHLAR - fantomas wrote: that is what you should do. You need to train on both spam and ham, since the BAYES filter must know how they differ... On 19.04.13 07:02, Joe Acquisto-j4 wrote: That has always given me pause, as I get very little ham. Got one this AM.

Re: local score ignored

2013-04-19 Thread Joe Acquisto-j4
>>> On 4/19/2013 at 6:35 AM, Matus UHLAR - fantomas wrote: > On 4/19/2013 at 12:06 AM, John Hardin wrote: >>> BAYES_50 is the bayes classifier's way of saying "insufficient data" or "I >>> don't know". >>> >>> Do you really want to assign 3 points for "I don't know"? > > On 19.04.13 06:09, J

Re: local score ignored

2013-04-19 Thread Niamh Holding
Hello Joe, Friday, April 19, 2013, 12:02:32 PM, you wrote: JAj> That has always given me pause, as I get very little ham. Got one this AM. which I will feed JAj> but that's the first in at least a month. You only get one ham email a month? -- Best regards, Niamh

Re: local score ignored

2013-04-19 Thread Joe Acquisto-j4
>>> On 4/19/2013 at 6:29 AM, Matus UHLAR - fantomas wrote: > On 4/18/2013 at 7:21 AM, Matus UHLAR - fantomas wrote: >>> Train your bayes database, if you get many spams with this small score. > > On 18.04.13 21:45, Joe Acquisto-j4 wrote: >>All I can do is feed it. > > that is what you shoul

Re: local score ignored

2013-04-19 Thread Matus UHLAR - fantomas
On 4/19/2013 at 12:06 AM, John Hardin wrote: BAYES_50 is the bayes classifier's way of saying "insufficient data" or "I don't know". Do you really want to assign 3 points for "I don't know"? On 19.04.13 06:09, Joe Acquisto-j4 wrote: In this case, from the samples I've seen. Absolutely, yes

Re: local score ignored

2013-04-19 Thread Matus UHLAR - fantomas
On 4/18/2013 at 7:21 AM, Matus UHLAR - fantomas wrote: Train your bayes database, if you get many spams with this small score. On 18.04.13 21:45, Joe Acquisto-j4 wrote: All I can do is feed it. that is what you should do. You need to train on both spam and ham, since the BAYES filter must k

Re: local score ignored

2013-04-19 Thread Joe Acquisto-j4
>>> On 4/19/2013 at 12:06 AM, John Hardin wrote: > On Thu, 18 Apr 2013, Joe Acquisto-j4 wrote: > > On 4/18/2013 at 7:21 AM, Matus UHLAR - fantomas wrote: >>> On 18.04.13 06:45, Joe Acquisto-j4 wrote: I was concerned about this: [score: 0.4968] >>> >>> This meant that BAYES has