Re: [SAtalk] New Ruleset: EvilNumbers

2004-01-19 Thread Chris Petersen
ching as much as possible, and go with [\s\(\)\-\.]+ (heck, I would probably go as far as just doing \W+ or \W* to catch any characters the spammers might try to throw in). -- Chris Petersen Programmer / Web Designer Silicon Mechanics: http://www.siliconmechanics.com/ Blade Servers: http://www.

Re: [SAtalk] New Ruleset: EvilNumbers

2004-01-19 Thread Chris Petersen
> Let's not forget parentheses. Here is how I would have it look. > [\s(\(|\-|\)\.]+ > Well I hope it is correct. [\s\(\)\-\.]+
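A quick way to compare the character classes discussed in this thread is to run them against a few sample strings. The sketch below uses Python's `re` module rather than SpamAssassin itself; the digit groups and sample strings are made up for illustration and are not taken from the EvilNumbers ruleset.

```python
import re

QUOTED = r"[\s(\(|\-|\)\.]+"   # class from the quoted message; "|" is a literal inside [...]
CORRECTED = r"[\s\(\)\-\.]+"   # class given in the reply
LOOSE = r"\W+"                 # looser option: any run of non-word characters

def number_pattern(sep, groups=("800", "555", "0100")):
    """Build a regex for digit groups joined by a separator class.
    The digit groups are hypothetical sample data."""
    return re.compile(sep.join(groups))

samples = ["800-555-0100", "(800) 555.0100", "800|555|0100", "800 * 555 * 0100"]
results = {
    name: [bool(number_pattern(sep).search(s)) for s in samples]
    for name, sep in (("quoted", QUOTED), ("corrected", CORRECTED), ("loose", LOOSE))
}
for name, hits in results.items():
    print(name, hits)
```

In this sketch the quoted class also accepts a literal "|" as a separator, the corrected class does not, and `\W+` accepts anything that is not a letter, digit, or underscore.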

Re: [SAtalk] Ann: "Rules De Jour": An automated way to keep up with the latest rulesets

2004-01-17 Thread Chris Petersen
1 ] ; then /etc/init.d/spamassassin restart 2>/dev/null 1>/dev/null; echo "Restarted SpamAssassin"; fi

Re: [SAtalk] Bayes.

2004-01-14 Thread Chris Petersen
de, especially when it's a LOT of code and you don't really know what you're looking for.

Re: [SAtalk] Bayes.

2004-01-14 Thread Chris Petersen
y tokens? (if not, it should) What about message headers? Does it tokenize rawbody or body? Does it tokenize only word-based characters, or would something like "[EMAIL PROTECTED]@" become a token? I'd honestly like some answers to these questions - I've asked before but didn

Re: [SAtalk] Checking for spelling errors

2004-01-13 Thread Chris Petersen
ish email (no way to guarantee that if a server is in the US that all of its users send english-only email), and here I'm counting programming languages as non-english. I often mail code around to friends/clients/whatever, and I can guarantee that more than 75% of it would set off a spell check

Re: [SAtalk] The idea behind Habeas?

2004-01-12 Thread Chris Petersen
eir text is subject to trademark violations, since the right to use the trademark is granted only to those who send messages compliant with their definition of not-spam.

Re: [SAtalk] spamc and maildrop

2004-01-10 Thread Chris Petersen
; same thing you'd use for invoking spamassassin or any other kind of filter. If you have virtual users, you'll need to look at the AuthCourier.pm module submitted to the courier-users mailing list. It makes spamd work with courier's authdaemon and properly handles homedirs for u

Re: [SAtalk] OT: My REALY REALY Crazy idea!! ahahahahahah

2004-01-09 Thread Chris Petersen
> How many do think would end up with an overly outgoing well blessed > healthy abit thin gorilla with a huge desire to descramble tv while > becoming financially secure and well insured don't forget well-endowed in both male and female aspects!

Re: [SAtalk] New Spam Type

2004-01-08 Thread Chris Petersen
that get tossed into the bunch.

RE: [SAtalk] detecting large collections of random words

2004-01-08 Thread Chris Petersen
've done serious regex stuff (and I used to do it for a living).

RE: [SAtalk] detecting large collections of random words

2004-01-08 Thread Chris Petersen
rds score WORDWORD_10 .5 rawbody WORDWORD_15 /(?:\b(?!(?:from|even|more|were|with)\b)[a-z]{4,12}\s+){15}/ describe WORDWORD_15 string of 15+ random words score WORDWORD_15 2.5
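In Perl-style regexes a negative lookahead is written `(?!...)`; the form `(?!=...)` instead asserts that the text does not start with a literal `=` followed by the pattern, so it would not exclude the listed words. The sketch below shows the difference using Python's `re` module, not SpamAssassin. The repeat count is cut from 15 to 3 to keep the sample text short, and the sample strings are invented.

```python
import re

STOP = r"(?:from|even|more|were|with)"

# "(?!=" form: negative lookahead for a literal "=" followed by a stop word.
with_equals = re.compile(r"(?:\b(?!=" + STOP + r"\b)[a-z]{4,12}\s+){3}")
# "(?!" form: negative lookahead for the stop word itself.
without_equals = re.compile(r"(?:\b(?!" + STOP + r"\b)[a-z]{4,12}\s+){3}")

stop_words_only = "from even more\n"       # only excluded words
other_words = "cobalt meadow lantern\n"    # no excluded words

results = {
    "with_equals": (bool(with_equals.search(stop_words_only)),
                    bool(with_equals.search(other_words))),
    "without_equals": (bool(without_equals.search(stop_words_only)),
                       bool(without_equals.search(other_words))),
}
print(results)
```

Only the `(?!` form skips a run made up entirely of the excluded words; both forms still match a run of other words.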

RE: [SAtalk] detecting large collections of random words

2004-01-08 Thread Chris Petersen
mization and not grab backreferences.

[SAtalk] questions about how bayes scores...

2004-01-08 Thread Chris Petersen
he message is spam or not."

RE: [SAtalk] detecting large collections of random words

2004-01-08 Thread Chris Petersen
> Would this regex make more sense? > /([a-z]{4,12}\s){12,}/ Yes, though I used: /(\b[a-z]{4,12}\s+){12}/ Notice the initial \b, and there's no need to make SA continue to search beyond the "minimum" match, so leave off the , in the last {} cluster.
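The effect of the leading \b can be seen with a small test. This is a sketch in Python's `re` module (not SpamAssassin), and the word lists are invented: without \b the first group is free to start in the middle of an over-long word, so a 15-letter word followed by 11 ordinary words still counts as 12 "words".

```python
import re

with_boundary = re.compile(r"(\b[a-z]{4,12}\s+){12}")
without_boundary = re.compile(r"([a-z]{4,12}\s){12,}")

# Eleven hypothetical words, each 4 to 12 lowercase letters long.
words = ["cobalt", "meadow", "lantern", "whisper", "granite", "orchard",
         "velvet", "thimble", "harbor", "pebble", "saddle"]

# A 15-letter word followed by the eleven words above.
long_first = "extraordinarily " + " ".join(words) + "\n"
# Twelve words that all fit the 4 to 12 letter range.
twelve_ok = "silver " + " ".join(words) + "\n"

results = {
    "long_first": (bool(with_boundary.search(long_first)),
                   bool(without_boundary.search(long_first))),
    "twelve_ok": (bool(with_boundary.search(twelve_ok)),
                  bool(without_boundary.search(twelve_ok))),
}
print(results)
```

For a yes/no rule hit, `{12}` and `{12,}` accept the same inputs; the open-ended form only lets the engine keep consuming text after the rule has already matched.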

[SAtalk] detecting large collections of random words

2004-01-08 Thread Chris Petersen
like a/and/the? It'd be easy for spammers to get around, but at least it would keep them out of inboxes for a while.

Re: [SAtalk] auto whitelist ADDS points?

2004-01-01 Thread Chris Petersen
> The whitelist part is a misnomer. It's an automatic score adjuster > (white/black-list if you want). I realize this. Just figure that the name should be more informative. Better yet, shouldn't it be somehow tied to the bayes DB? These messages are correctly scoring "0% chance of spam" from B

[SAtalk] auto whitelist ADDS points?

2004-01-01 Thread Chris Petersen
I recently started receiving spam addressed from someone on one of the mailing lists I'm on, and since at about that time, my own address started "sending" spam, we determined that the web archive of the list had been spidered. Anyway, now, whenever I get mail from him, SA has tagged it as spam du

[SAtalk] spamd + courier authdaemon?

2003-12-30 Thread Chris Petersen
Someone recently posted a spamd modification that allows it to access courier mta's authdaemon to properly find home directories, etc for virtual users. It was requested that he send it over to you guys, but I didn't see anything float through this list about it. Has it been received? It would b

Re: [SAtalk] Re: BIG HUGE EVIL RULE NEWS!!!!

2003-12-04 Thread Chris Petersen
> I'm too embarrassed to tell people I use pico... I was trying to avoid the editor-war, but I have to say that I'm right there with you (though when I can, I use nano because it has a few more features but still keeps the nice EASY TO USE interface). Then again, when I really want to code, I u

Re: [SAtalk] BIG HUGE EVIL RULE NEWS!!!!

2003-12-03 Thread Chris Petersen
> 1) put the full URL of the canonical source into the file itself, so people know from where to get updates Or, for those who like a little automation: #!/bin/sh wget -N http://www.merchantsoverseas.com/wwwroot/gorilla/bigevil.cf \ -O /etc/mail/spamassassin/bigevil.cf \ 2>&1 | gr

[SAtalk] bayes + soundex/similar?

2003-07-29 Thread Chris Petersen
With spammers resorting to misspellings, has anyone thought of combining the bayes token stuff with soundex (or something similar) matching? I don't know much about how bayes works, but this sounds like a decent idea... -Chris - getting tired of "Viagraaa" ads.

Re: [SAtalk] Why wil "sa-learn" not learn?

2003-07-13 Thread Chris Petersen
> Whatever I do, sa-learn will not accept it as spam. :( It takes about 500-1000 learned spams before SA starts marking things with the bayes filter. I've put at least that many in, and still only about half of my spams have bayes markings in them (and about 10% of my daily spam still gets throug

RE: [SAtalk] spamc and homedir

2003-01-07 Thread Chris Petersen
> FYI, works just fine. We've deployed it here using the full email as > the key. We just made the field longer. hmm, maybe I'll have to look into this. shouldn't be too hard to add courier auth tool support to spamd so I wouldn't have to set up a separate mysql database for this kind of thing.

[SAtalk] spamc and homedir

2003-01-06 Thread Chris Petersen
I've asked about this before, but could never get a straight answer, so I figured I'd try again... my problem: I use courier and maildrop, and have virtual users. The virtual users are all owned by the same user/group, which obviously has a different homedir than the virtual users. I also have

Re: [SAtalk] spamd authenticating spamc's uid

2002-11-14 Thread Chris Petersen
> Why do we need to authenticate the user of spamc at all? Are we > worried about a remote user running spamc on their box and forging mail > through ours? A local user forging something through our box? I think this was mentioned in an earlier thread, but as I understand it, the worry is that

RE: [SAtalk] spamd and virtual users. Theo's link?

2002-11-01 Thread Chris Petersen
> Problem: Use SA on Aliases and / or virtusers AND real users combined. yup, that one exactly. > So far the only solution looks like using MySQL. (Which for reasons of my > own I don't want to do right now.) From what I've seen, a simple solution would be to get spamc to pass a couple more bi

Re: [SAtalk] spamd and virtual users

2002-10-31 Thread Chris Petersen
> I doubt that passing a couple of variables to spamd would increase the > overhead of spamc by anything noticeable. But I get your point. I'm > just trying to figure out a way to make this thing work properly with my > setup and that seemed like the easiest solution short of hardcoding > spamd t

Re: [SAtalk] spamd and virtual users

2002-10-28 Thread Chris Petersen
> The purpose of spamc is to be as lightweight as possible. Ideally > spamc will not contain any features that aren't useful by most people. I doubt that passing a couple of variables to spamd would increase the overhead of spamc by anything noticeable. But I get your point. I'm just trying to f

Re: [SAtalk] spamd and virtual users

2002-10-28 Thread Chris Petersen
> Why not use an SQL database instead? That's what we do with vpopmail > (although I'd love for qmail-scanner to have support for virtual users, oh > well) and it works just dandy. sql database for what? authentication? Or everything? The only reason these users are "virtual" is because they'

Re: [SAtalk] spamd and virtual users

2002-10-28 Thread Chris Petersen
ts $ENV{HOME} var to spamd... why not just let spamc handle more of the normal spamassassin commands, anyway? -Chris On Wed, 23 Oct 2002, Chris Petersen wrote: > To: [EMAIL PROTECTED] > From: Chris Petersen <[EMAIL PROTECTED]> > Subject: [SAtalk] spamd and virtual users > Date

[SAtalk] spamd and virtual users

2002-10-23 Thread Chris Petersen
I've recently run into an issue... I use courier-mta's userdb auth to set up virtual accounts for a few domains I host on my machine. This is nice, since I don't need to create system accounts on my machine for people who have no right to be in there. I finally figured out why I hadn't been

[SAtalk] "new" spam host

2002-09-20 Thread Chris Petersen
azoogle.com should be added to the global "bad hosts" list. They don't seem to forge their headers, so a simple from/body check should handle it. and here they go, off to my server's blacklist. -Chris

Re: [SAtalk] Fixing all_spam_to problem in local.cf

2002-07-18 Thread Chris Petersen
> For my personal use (in qmail) if I can have it also look at "Delivered-To:" > then my problem will be solved. I'm not sure how to handle this in sendmail, > but if someone can help me figure out where SA is determining the recipient of > a message I'd like to try to fix this myself by making i

Re: [SAtalk] SpamAssassin - Additional Rating System

2002-07-12 Thread Chris Petersen
> The solution: Change the way it reports on the subject line, and let it > all through. That's how I handle most of my stuff... > In PerMsgStatus.pm (which is where the subject line is changed) I am > adding two additional variables: _SPLV_ and _SPLG_. > Spamlevel = hits/threshold (so hits of 1

[SAtalk] how did this get through?

2002-07-10 Thread Chris Petersen
You'd think that "increate penis size" would trigger some rule or another. Maybe freehostchina.com should also be added for some score or another, too? (I don't know how legit of a service this is, but most things coming out of chinese webhosts these days aren't good). -Chris -- F

RE: [SAtalk] s/SPAM/spam/ it seems

2002-07-08 Thread Chris Petersen
> So if anyone else thinks this would be useful what other categories are > there? Here is what Postini offers: I really like this idea. as you said, it could be used to NOT filter out certain things... so if you want "get rich quick" mail, you dis/enable it in your user file and those scores

Re: [SAtalk] 2.31 released

2002-06-20 Thread Chris Petersen
> > Thanks for releasing the update to 2.31. Is there an ETA on RPM > > availability? > > > ftp://ftp.kluge.net/pub/felicity/SRPMS/spamassassin-2.31-1.src.rpm ... > I usually suggest people rebuild from the SRPM since there may be > different versions of perl involved and such. Is there ever

[SAtalk] rpm version for 2.3?

2002-06-16 Thread Chris Petersen
I just noticed that I'm still running 2.2, and when I went to download the latest rpm, I noticed that the site still only has 2.2... Has anyone made rpm's of 2.3 or later? And would it be possible for some rpm-knowledgeable person on the dev team to move the spec file into the .tar.gz branch so

[SAtalk] mousetrap email address

2002-06-05 Thread Chris Petersen
A friend of mine recently suggested the idea of using a "mousetrap" email address for detecting spam. I've heard of this technique being used for tracking spam, but it had never occurred to me that I could use it to catch spammers. Basically, the idea is to put an email address on a web page, ei

Re: [SAtalk] Spam Tracking

2002-05-17 Thread Chris Petersen
> > 0.01 * 10^34 = 10^32 times. at 1,000,000,000 tries per second, that > > will only take you 10^23 seconds = roughly the age of the universe. > > Not to mention the challenge of coming up with 10^32 unique intelligible > ways of talking about penis enlargement, multilevel marketing, and wild >
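The arithmetic in the quoted text can be checked in a few lines. This sketch takes the quoted figures at face value (1% of 10^34, at 10^9 tries per second); the age of the universe, about 1.38 × 10^10 years, is a value supplied here as an assumption and does not come from the thread.

```python
# Figures as quoted: 1% of 10^34 tries, at 10^9 tries per second.
tries = 10**34 // 100            # 0.01 * 10^34 = 10^32
tries_per_second = 10**9
seconds = tries // tries_per_second

SECONDS_PER_YEAR = 365.25 * 24 * 3600   # Julian year
UNIVERSE_AGE_YEARS = 1.38e10            # assumed value, not from the thread

years = seconds / SECONDS_PER_YEAR
ratio = years / UNIVERSE_AGE_YEARS

print(f"seconds: {seconds:.0e}")
print(f"years: {years:.2e}")
print(f"multiples of the age of the universe: {ratio:.2e}")
```

On these numbers 10^23 seconds comes to roughly 3 × 10^15 years, which is a couple of hundred thousand times the age of the universe rather than roughly equal to it.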

Re: [SAtalk] Spam Tracking

2002-05-17 Thread Chris Petersen
> Subjects being slightly different shouldn't be a problem because you can do > soundex or "like" searches when you have the data set. good point. advanced comparisons like that would help a lot. > I was debating the reply-to and from but maybe it's best just to use all of > them for now. Aw

Re: [SAtalk] Spam Tracking

2002-05-17 Thread Chris Petersen
> It's just an old habit. When I learned SQL I was taught (mostly from > the big SQL books) and of course the little black book of normalization, > _Handbook of Relational Database Design_ that table columns should try > to be unique yet understandable. ahh. I started db stuff with filemaker (a

Re: [SAtalk] Spam Tracking

2002-05-17 Thread Chris Petersen
> 1) Razor uses SHA1, not MD5. ah, noted. > 2) Either way, while you're correct (you _can_ have multiple inputs with the same resulting hash), it's very unlikely to find two sets of different data with the same hash output. So in reality, MD5/SHA1/etc aren't unique, but they're u
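To make the point about hashes concrete, here is a small sketch using Python's `hashlib`. It only shows the digest sizes and that two nearly identical inputs hash differently; it does not reproduce how Razor computes its signatures, and the sample messages are invented.

```python
import hashlib

# Two hypothetical messages that differ only in the last character.
msg_a = b"Buy now, limited offer!"
msg_b = b"Buy now, limited offer?"

sizes = {}
differ = {}
for name in ("md5", "sha1"):
    digest_a = hashlib.new(name, msg_a).hexdigest()
    digest_b = hashlib.new(name, msg_b).hexdigest()
    sizes[name] = hashlib.new(name).digest_size * 8   # digest length in bits
    differ[name] = digest_a != digest_b
    print(name, sizes[name], "bits, digests differ:", differ[name])
```

Because the output is fixed at 128 or 160 bits while the input can be any length, collisions have to exist in principle, but hitting one by accident between two real messages is extremely unlikely.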

Re: [SAtalk] Spam Tracking

2002-05-17 Thread Chris Petersen
> Now I really want to do this. I'll see what I'm up to this weekend. :-) heh, it all looks good to me. I think I'm just not quite sure what you're up to (that, and underscores in field names confuse me for some reason ;). > What really can you track with this besides scoring and the correla

Re: [SAtalk] Spam Tracking

2002-05-17 Thread Chris Petersen
> One thing I want to do is write a little C program that connects to Postgres > (or Perl but with a C client just like spamc/d) and reports on the tests that > *all* messages score on. wouldn't it be easier to integrate this into spamd? You'd already have your db client set up that way. > F

Re: [SAtalk] Undisclosed.Recipients@

2002-05-16 Thread Chris Petersen
> Is it a properly formatted header according to the relevant RFCs? If > not, this entry in an ACL in my exim.conf rejects it at SMTP time. not sure there. presumably it's all ok. > If you want, how about coming up with a test like exim's that looks > for syntactic validity of the header. All

Re: [SAtalk] Undisclosed.Recipients@

2002-05-16 Thread Chris Petersen
> It depends on your setup. If each user is invoking spamassassin > directly from procmailrc, then it's no problem. If spamd is running > as root (or some other user), then there can be security concerns, > especially since some of the rules require an eval. aha, that makes sense, then...Ev

[SAtalk] Undisclosed.Recipients@

2002-05-16 Thread Chris Petersen
like the subject says... This is one of the most common spam recipients that I receive... would be nice to get it added to the master list. also, what's the reasoning for not letting users define filtration regex's in their user files? -Chris