Re: Bayes Questions

2005-07-11 Thread Daniel J. Cody
Andrew, Andrew Ott wrote: Also is there any way to see the count of spam and ham messages that are in the bayes database, I can't seem to find any info on that. I want to make sure there are a lot in there before I turn the bayes rules on. If you run spamassassin --lint -D you should see a li
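For the spam and ham counts specifically, here is a rough sketch of another route. `sa-learn --dump magic` prints the Bayes database's bookkeeping values, including `nspam` and `nham`. The helper below pulls those numbers out of that output. The function name `parse_dump_magic` and the sample text are made up for illustration, and the column layout is assumed from typical output, so it may need adjusting for your version.

```python
import re

# Matches lines such as:
#   0.000          0       1234          0  non-token data: nspam
# The column layout is an assumption based on typical `sa-learn --dump magic` output.
_MAGIC_LINE = re.compile(
    r"^\s*\S+\s+\S+\s+(\d+)\s+\S+\s+non-token data: (\w[\w ]*)$"
)


def parse_dump_magic(text):
    """Return a dict mapping each non-token data name to its integer value."""
    values = {}
    for line in text.splitlines():
        match = _MAGIC_LINE.match(line.rstrip())
        if match:
            values[match.group(2).strip()] = int(match.group(1))
    return values


# Illustrative sample only: these numbers are invented, not real statistics.
SAMPLE = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0       1234          0  non-token data: nspam
0.000          0       5678          0  non-token data: nham
"""

counts = parse_dump_magic(SAMPLE)
print("spam:", counts.get("nspam"), "ham:", counts.get("nham"))
```

In practice the text would come from running the command itself, for example with `subprocess.run(["sa-learn", "--dump", "magic"], capture_output=True, text=True).stdout`, as the user who owns the Bayes database.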

Bayes Questions

2005-07-11 Thread Andrew Ott
For those of you running large sites ( we have about 12,000 users, with 210,000 messages a day) what do you have for a bayes_expiry_max_db_size? Also is there any way to see the count of spam and ham messages that are in the bayes database, I can't seem to find any info on that. I want to make s

Re: simultaneous sa-learn processes

2005-07-11 Thread Robert Menschel
Hello Chavdar, Monday, July 11, 2005, 3:40:14 AM, you wrote: CV> Hi List, CV> Our mailserver server serves about 100 users. Our config: CV> Sendmail+Procmail+SpamAssassin. CV> The question is: CV> If I got it right, we should run sa-learn for each user in order to benefit CV> from bayes. We int

Re: Fedora changed SpamAssassin default level to 7?

2005-07-11 Thread Kelson
Justin Mason wrote: fyi, if you're using Fedora Core -- http://blog.dave.org.uk/archives/000715.html totally unconfirmed, but worth noting in case that really is the case. My copy of Fedora Core 4 has "required_hits 5" in local.cf using the distribution's RPM for Spamassassin. rpm -Va made n

Re: Bypass URI check

2005-07-11 Thread Daryl C. W. O'Shea
[EMAIL PROTECTED] wrote: Hi All, I have received a few messages like the following. This asks the receiver to copy and paste the link into their web browser. Since the href is missing, there is no URI check. That sucks, because the URIBL is my best friend right now (love black). We are close

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Kelson
Joe Flowers wrote: BTW, if anyone knows a command line program that can easily run through a bunch of mbox files and tell how many messages are in them, I will report back how many ham and how many spam messages that I have fed to bayes. It's far from perfect, but it may offer some interesting info

Fedora changed SpamAssassin default level to 7?

2005-07-11 Thread Justin Mason
fyi, if you're using Fedora Core -- http://blog.dave.org.uk/archives/000715.html totally unconfirmed, but worth noting in case that really is the case. --j.

Re: (repost) bayes_ignore_from with wildcard ?

2005-07-11 Thread Daryl C. W. O'Shea
Matt Kettler wrote: Although by looking at _check_whitelist, I wonder if it works the way the docs say. The docs claim it's file glob and not regex, but _check_whitelist looks a lot like it does a regex. _check_whitelist does use a regexp to do the matching but the config parser (add_to_addrl

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Joe Flowers
> BTW, if anyone knows a command line program that can easily run through a bunch of mbox files and tell how many messages are in them, I will report > back how many ham and how many spam messages that I have fed to bayes. Well, I thought this might give some good stats on the FP:FN ratio, but I for

Help debugging spamc/spamd

2005-07-11 Thread email builder
Hi, We recently changed some of our network topology so that we are temporarily connecting with spamc to spamd over a regular external network connection (we usually keep it inside our LAN, but this is a temporary thing... don't ask). Unfortunately, spamd stops (mostly) responding it seems.

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Kai Schaetzl
Kai Schaetzl wrote on Mon, 11 Jul 2005 22:31:29 +0200: > With the default of 5 we get almost none, not even one per day. That was about FPs. Wrong. We don't get *any* FPs. We do not get even one *FN* per day. Kai -- Kai Schätzl, Berlin, Germany Get your web at Conactive Internet Services: htt

Re: Performance: files or SQL?

2005-07-11 Thread Michael Parker
Cami wrote: > SQL simply doesn't scale very well for bayes. We have a serverfarm of > 12 spamassassin servers and storing bayes in SQL. We see on average > about 4000 queries per second. The MySQL server has been optimized > to hell and back and is running on high-end hardware, but just simply >

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Kris Deugau
jdow wrote: > A few weeks ago I'd have said "Easy, Ducky!" Then I ran into DoveCot > that uses an indexed almost "mbox" file. There is no way to do it > other than "good guess". However, for a traditional UNIX mbox file > you should be able to nail it perfectly simply looking for the "From" > featu

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Kai Schaetzl
Loren Wilton wrote on Mon, 11 Jul 2005 11:30:07 -0700: > Which of course means that by picking the ratio value you can pick pretty > much any fp/fn ratio you want. Only if the distribution was equal. Kai -- Kai Schätzl, Berlin, Germany Get your web at Conactive Internet Services: http://www.c

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Kai Schaetzl
Joe Flowers wrote on Mon, 11 Jul 2005 12:09:29 -0400: > We are very glad and happy about this concept and implementation. Well, the big question is: How many of your spam messages score between the default 5 and your "floating score"? If it is many there's obviously something wrong with your se

Re: SA 2.63 vs 2.64

2005-07-11 Thread Matthias Fuhrmann
On Sun, 10 Jul 2005, Matthias Fuhrmann wrote: [...] > # jm: do not... > > the lines from Bayes.pm fit the error messages. didn't check > PerMsgStatus.pm, but i guess it's the same issue. > can someone explain the difference or the impact to the problem, described > above? > > what about repl

Re: Performance: files or SQL?

2005-07-11 Thread Cami
Mike Jackson wrote: On my personal server, I'm running SA 3.0.4 with the user prefs, Bayes, and AWL in a MySQL database (mostly because it would be "cooler" that way). On my employer's server, I'm running the same SA version, but with file-based DBs and user prefs. We're going to be rolling out

Re: procmail: Could not create INET socket on 127.0.0.1:783: Permission denied

2005-07-11 Thread jdow
From: <[EMAIL PROTECTED]> > Hello, > > I set up spamassassin to work with procmail according to instructions. > Here is what is in ~/.procmailrc: > > #SPAM ASSASSIN SECTION > > :0fw: spamd.lock > * < 256000 > | /usr/sbin/spamd ^ The spamd tool is run as

procmail: Could not create INET socket on 127.0.0.1:783: Permission denied

2005-07-11 Thread prosolutions
Hello, I set up spamassassin to work with procmail according to instructions. Here is what is in ~/.procmailrc: #SPAM ASSASSIN SECTION :0fw: spamd.lock * < 256000 | /usr/sbin/spamd :0: * ^X-Spam-Level: \*\*\*\*\*\*\*\*\*\*\*\*\*\*\* almost-certainly-spam :0: * ^X-Spam-S

Performance: files or SQL?

2005-07-11 Thread Mike Jackson
On my personal server, I'm running SA 3.0.4 with the user prefs, Bayes, and AWL in a MySQL database (mostly because it would be "cooler" that way). On my employer's server, I'm running the same SA version, but with file-based DBs and user prefs. We're going to be rolling out doing filtering for

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread jdow
A few weeks ago I'd have said "Easy, Ducky!" Then I ran into DoveCot that uses an indexed almost "mbox" file. There is no way to do it other than "good guess". However, for a traditional UNIX mbox file you should be able to nail it perfectly simply looking for the "From" feature. The dirt stupid "m
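A minimal sketch of the "look for the From line" approach for a traditional UNIX mbox file. It counts lines that begin with `From ` and either start the file or follow a blank line, which is the usual separator convention. The function name `count_mbox_messages` is hypothetical, and this will not give reliable answers for indexed or non-standard mbox variants such as the one described above.

```python
def count_mbox_messages(path):
    """Count messages in a traditional mbox file via its 'From ' separator lines."""
    count = 0
    previous_blank = True  # the first separator has no blank line before it
    with open(path, "rb") as handle:
        for line in handle:
            # Only treat 'From ' as a separator at the start of the file
            # or directly after an empty line.
            if previous_blank and line.startswith(b"From "):
                count += 1
            previous_blank = line.strip() == b""
    return count
```

To turn this into a small command-line tool, one could loop over `sys.argv[1:]` and print each path next to its count. Reading the file in binary mode sidesteps character-set problems in message bodies.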

RE: SURBL, SA 3.0.4, and firewalls

2005-07-11 Thread Stewart, John
> All it needs is port 53 TCP and UDP open (outbound), > depending on what > firewall product you use, depends on how. A bit of Google with what > ports on what product will yield what you should need. One thing to note... if your firewall is proxying for you, make sure it doesn't think it's a

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Joe Flowers
jdow wrote: > The greater the separation the > better the results for a decision point between them. > But anything you can do that widens the > typical score distribution between ham and spam is a good thing. Amen

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Loren Wilton
> There's another thing worth noting -- the SpamAssassin score distribution > for hams and spams isn't even. I don't necessarily see that those particular curve shapes necessarily in any way invalidate this method, although they do bias the method somewhat. The two curves are essentially smooth cu

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Joe Flowers
Matt: I know you know a lot more about this than I do, but for what it's worth, your impressions/intuitions are very close to mine. Originally back in April, I started off using the "average of the means", but that let through way too much spam. So, what I have now is it set to 30% above th

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Loren Wilton
> > score of -2.1532284. I have the dividing line "set" at 30% of the > > distance between the average ham score and average spam score (30% above > > the average ham score). So, the dividing line is currently floating > > around 0.55416414. > > > The only problem I see with this approach is that i
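As an illustration of the calculation being discussed, here is a small sketch that places the dividing line a fixed fraction of the way from the average ham score to the average spam score. The function name `floating_threshold` and its parameters are hypothetical, and how the scores are collected and how often the line is recomputed are not covered here.

```python
from statistics import mean


def floating_threshold(ham_scores, spam_scores, fraction=0.30):
    """Return a threshold `fraction` of the way from the ham mean to the spam mean."""
    ham_mean = mean(ham_scores)
    spam_mean = mean(spam_scores)
    # fraction=0.30 means "30% above the average ham score";
    # fraction=0.50 would be the plain "average of the means".
    return ham_mean + fraction * (spam_mean - ham_mean)
```

A lower fraction moves the line toward the ham average and so catches more spam at the cost of more false positives, which is the trade-off the thread goes on to debate.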

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread jdow
From: "Matt Kettler" <[EMAIL PROTECTED]> > Joe Flowers wrote: > > I don't know if this will help anyone or not, but I wanted to report > > back just in case. > > > > In early April, I completely unhinged the dividing line between what SA > > score is used to mark a message as spam or ham (5.00 = d

RE: sa-learn on a wide site HOWTO ?

2005-07-11 Thread Aaron Grewell
> Forget about this. Most of you users will only report spams, > not ham, they're going to screw the bayes database. As a > consequence, you'll have more spam, or more fp. > > You should find another solution or educate your users (but > it takes too much time) so they feed correctly the bayes

Re: simultaneous sa-learn processes

2005-07-11 Thread jdow
From: "Chavdar Videff" <[EMAIL PROTECTED]> > On Monday 11 July 2005 14:50, JamesDR wrote: > > Chavdar Videff wrote: > > > Hi List, > > > > > > Our mailserver server serves about 100 users. Our config: > > > Sendmail+Procmail+SpamAssassin. > > > The question is: > > > If I got it right, we should r

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Justin Mason
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 the real-world figures can be seen for various thresholds in the rules/STATISTICS*.txt files... - --j. Matt Kettler writes: > Joe Flowers wrote: > > Matt Kettler wrote: > > > >> The only problem I see with this approach is that it treats false > >>

Re: How can I correctly detect these spams?

2005-07-11 Thread Kai Schaetzl
I repeat myself ;-) > It seems you are not using *any* custom rules. You may want to check out > RDJ and SARE. Kai -- Kai Schätzl, Berlin, Germany Get your web at Conactive Internet Services: http://www.conactive.com IE-Center: http://ie5.de & http://msie.winware.org

Re: simultaneous sa-learn processes

2005-07-11 Thread Kai Schaetzl
Chavdar Videff wrote on Mon, 11 Jul 2005 16:13:44 +0300: > If there is a way to set up a single bayes database I would prefer that There is one, just look in the SA documentation. (documentation for local.cf should do.) Kai -- Kai Schätzl, Berlin, Germany Get your web at Conactive Internet Se

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Matt Kettler
Joe Flowers wrote: > Matt Kettler wrote: > >> The only problem I see with this approach is that it treats false >> positives and >> false negatives as being equally bad. >> >> > > We do get many more false negatives than false positives, even though we > don't get false positives very often - t

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Joe Flowers
Thanks Jason! That's good, new info for me. That'll help me *at the very least* visualize what I am trying to do a little better. I've been very curious to know what the rough shapes of those graphs look like. Joe Justin Mason wrote: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 There'

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Justin Mason
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 There's another thing worth noting -- the SpamAssassin score distribution for hams and spams isn't even. If you draw a graph of hams and spams, plotting the number of mails in each category as the vertical axis and the score they get as the horizonta

Re: sa-learn on a wide site HOWTO ?

2005-07-11 Thread Julien Reveret
On 16:56, Mon 11 Jul 05, Karl.Oulmi wrote: > Hi, > > I always have a box with postfix/amavis and SpamAssassin running. > Now, I'd like to run sa-learn in order my users (~500) learn Spam & Ham > to Spamassassin. > > The idea is the following. > On every mail passed through my mailserver, a header o

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Joe Flowers
Matt Kettler wrote: The only problem I see with this approach is that it treats false positives and false negatives as being equally bad. We do get many more false negatives than false positives, even though we don't get false positives very often - they are rare. We certainly don't get 1

RE: spamassassin with GORDANO

2005-07-11 Thread Bret Miller
> Does anyone know If I can use Spammain with GMS (Gordano > Mail Software for Linux) In theory, you could use MailScanner as a proxy in front of GMS to run SpamAssassin before the message gets to GMS. And, if I recall correctly (I haven't used GMS for several years), I think you can use thei

Re: update on floating dividing score between spam and ham messages

2005-07-11 Thread Matt Kettler
Joe Flowers wrote: > I don't know if this will help anyone or not, but I wanted to report > back just in case. > > In early April, I completely unhinged the dividing line between what SA > score is used to mark a message as spam or ham (5.00 = default). This > allows the system and this dividing l

RE: Bypass URI check

2005-07-11 Thread Chris Santerre
I'm thinking it may be time for SARE to look at this phrase: "then copy // paste the below page into your window: " I'll see what I can do with it. --Chris (I also love the black ;) -----Original Message----- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTEC

sa-learn on a wide site HOWTO ?

2005-07-11 Thread Karl.Oulmi
Hi, I always have a box with postfix/amavis and SpamAssassin running. Now, I'd like to run sa-learn in order my users (~500) learn Spam & Ham to Spamassassin. The idea is the following. On every mail passed through my mailserver, a header or a footer is added to the mail with a mailto link that

Re: Rule: envelope to <> header to - help?

2005-07-11 Thread Matt Kettler
Michael W Cocke wrote: > Does anyone have a rule to check the envelope To: against the header > to: ? I'm sure that there's a reason why it's allowed to be different, > but it doesn't apply here, and almost half of the spam that gets thru > everything else would get stopped by that. No. It's gener

Bypass URI check

2005-07-11 Thread Martin.Carnegie
Hi All, I have received a few messages like the following. This asks the receiver to copy and paste the link into their web browser. Since the href is missing, there is no URI check. That sucks, because the URIBL is my best friend right now (love black). We are cl

Re: SURBL & SA 3.0.4

2005-07-11 Thread Matt Kettler
Dr Robert Young wrote: > Is there a particular "port" and/or "protocol (TCP/UDP) that must be > opened on any firewalls that might be on the network for the plugin to > work? You don't "need" to open any ports, however you must be able to resolve DNS queries. In general you can test it by using "

Re: (repost) bayes_ignore_from with wildcard ?

2005-07-11 Thread Matt Kettler
At 04:43 AM 7/11/2005, [EMAIL PROTECTED] wrote: Hello, Does anyone know if this will work: bayes_ignore_from [EMAIL PROTECTED] The docs don't say specifically if this kind of directive is allowed. They do say that this kind of thing will work for whitelist_from. We all got your message the

Re: simultaneous sa-learn processes

2005-07-11 Thread Chavdar Videff
On Monday 11 July 2005 15:31, Kai Schaetzl wrote: > Chavdar Videff wrote on Mon, 11 Jul 2005 13:40:14 +0300: > > If I got it right, we should run sa-learn for each user in order to > > benefit from bayes. We intend to run a cron job for each user and do it > > at night by supplying a daily snapshot

Re: simultaneous sa-learn processes

2005-07-11 Thread Kai Schaetzl
Chavdar Videff wrote on Mon, 11 Jul 2005 13:40:14 +0300: > If I got it right, we should run sa-learn for each user in order to benefit > from bayes. We intend to run a cron job for each user and do it at night by > supplying a daily snapshot of our spam and ham collections to sa-learn. Do I und

Re: simultaneous sa-learn processes

2005-07-11 Thread Chavdar Videff
On Monday 11 July 2005 14:50, JamesDR wrote: > Chavdar Videff wrote: > > Hi List, > > > > Our mailserver server serves about 100 users. Our config: > > Sendmail+Procmail+SpamAssassin. > > The question is: > > If I got it right, we should run sa-learn for each user in order to > > benefit from bayes

RE: simultaneous sa-learn processes

2005-07-11 Thread Sander Holthaus - Orange XL
JamesDR wrote: > Chavdar Videff wrote: >> Hi List, >> >> Our mailserver server serves about 100 users. Our config: >> Sendmail+Procmail+SpamAssassin. >> The question is: >> If I got it right, we should run sa-learn for each user in order to >> benefit from bayes. We intend to run a cron job for ea

Re: simultaneous sa-learn processes

2005-07-11 Thread JamesDR
Chavdar Videff wrote: Hi List, Our mailserver server serves about 100 users. Our config: Sendmail+Procmail+SpamAssassin. The question is: If I got it right, we should run sa-learn for each user in order to benefit from bayes. We intend to run a cron job for each user and do it at night by su

simultaneous sa-learn processes

2005-07-11 Thread Chavdar Videff
Hi List, Our mailserver server serves about 100 users. Our config: Sendmail+Procmail+SpamAssassin. The question is: If I got it right, we should run sa-learn for each user in order to benefit from bayes. We intend to run a cron job for each user and do it at night by supplying a daily snapshot

Re: How can I filter this kind of spam?

2005-07-11 Thread Michael Moyse
Kai Schaetzl wrote: Michael Moyse wrote on Fri, 08 Jul 2005 17:55:32 +0100: To me it looks like a duck and sounds like a duck I'm probably wrong and missing something here because I'm no expert so I'm happy to be enlightened. Ok, I enlighten you ;-) I hope I'm not wrong. Now that I

bayes_ignore_from with wildcard ?

2005-07-11 Thread lists
Hello, Does anyone know if this will work: bayes_ignore_from [EMAIL PROTECTED] The docs don't say specifically if this kind of directive is allowed. They do say that this kind of thing will work for whitelist_from. Regards, Devin