[SAtalk] expand_regex.pl - bug fixes, improvements

2004-01-30 Thread Gary Funck
Attached is version 1.4 of expand_regex.pl. Notable changes are: - improved handling of bracketed regexes in situations like ( ( )? ) where the previous version did not deal with nested balanced expressions correctly - added a -lint option which will run the most helpful warning options. -

RE: [SAtalk] A simple tool to extract URL's from mail folders

2004-01-29 Thread Gary Funck
Wow. I sent that e-mail out last *week*, and it is just dribbling in today. Received: from intrepid.intrepid.com ([192.195.190.1] ident=[1qHbG1J2WyZEN0gY3ydWgHO2WHps6+zg]) by sc8-sf-mx1.sourceforge.net with esmtp (TLSv1:AES256-SHA:256) (Exim 4.30) id 1Ak5VT-TZ-H9 for [

[SAtalk] A simple tool to extract URL's from mail folders

2004-01-28 Thread Gary Funck
Inspired by "Filters that fight back", by Paul Graham http://www.paulgraham.com/ffb.html I found a reference to a short script that scans e-mail for URL's, and then turns around and automatically references the offending page. Well, I'm not interested in doing that at the moment, but I have enhan

[SAtalk] expand_regex: a tool for debugging regex rules

2004-01-27 Thread Gary Funck
Attached is a perl script, expand_regex.pl, which will accept an SA rules file on standard input and will by default output the expansions of those rules, taking into account regex factoring due to parentheses. When invoked with the -verbose option, the program will preface the expansion by the r
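To make the idea of "expanding" a factored rule concrete, here is a minimal Python sketch. It is not taken from expand_regex.pl; the function name `expand` and the tiny supported syntax (literal characters, backslash-escaped literals, `(...)` and `(?:...)` groups, `|` alternation, and a trailing `?`) are assumptions made purely for illustration.

```python
def expand(pattern):
    """Return every literal string matched by a very small regex subset.

    Supported (illustrative assumption): literals, backslash-escaped
    literals, (...) and (?:...) groups, '|' alternation, trailing '?'.
    """
    pos = 0

    def parse_alt():
        # alt := seq ('|' seq)*
        nonlocal pos
        results = parse_seq()
        while pos < len(pattern) and pattern[pos] == "|":
            pos += 1
            results = results + parse_seq()
        return results

    def parse_seq():
        # seq := atom*   (stops at '|' or ')')
        nonlocal pos
        results = [""]
        while pos < len(pattern) and pattern[pos] not in "|)":
            options = parse_atom()
            results = [r + o for r in results for o in options]
        return results

    def parse_atom():
        # atom := (group | literal) ['?']
        nonlocal pos
        if pattern[pos] == "(":
            pos += 1
            if pattern.startswith("?:", pos):
                pos += 2
            options = parse_alt()
            if pos >= len(pattern) or pattern[pos] != ")":
                raise ValueError("unbalanced parentheses")
            pos += 1
        else:
            if pattern[pos] == "\\" and pos + 1 < len(pattern):
                pos += 1  # take the escaped character literally
            options = [pattern[pos]]
            pos += 1
        if pos < len(pattern) and pattern[pos] == "?":
            pos += 1
            options = options + [""]  # the optional atom may be absent
        return options

    expansions = parse_alt()
    if pos != len(pattern):
        raise ValueError("unbalanced parentheses")
    return expansions
```

Under these assumptions, a nested optional group such as `f(r(e)?)?e` expands to three strings, which is the kind of nested balanced expression the later 1.4 announcement mentions.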

RE: [SAtalk] better whitelisting - using feedback?

2004-01-21 Thread Gary Funck
> > I'm not sure I'd do this. One day (for a bunch of reasons) I whitelisted > my own address, and promptly got a bunch of spam "from" myself. > Good point, but all local addresses can (and must) be verified based upon the incoming gateway's Received: header. ---

[SAtalk] better whitelisting - using feedback?

2004-01-21 Thread Gary Funck
One of the speakers at Spamcon 2004 talked about the effectiveness of automatically generated white lists. As I recall, his scheme depended upon two sources of info: the mail addresses that typically appeared in your To: From: and Cc: lines in your corpus of ham, during training and automatically

[SAtalk] (OT) Spam Conference 2004 re-cap?

2004-01-17 Thread Gary Funck
> > There was an excellent presentation by John Graham-Cumming at the > 2004 Spam Conference about this and how your experience is what most > people find. The issue being that spammers don't know what tokens are > considered hammy in your Bayes DB, so random dictionary words tend to fail > very

RE: [SAtalk] sa-learn, mbox deleted messages

2004-01-17 Thread Gary Funck
> > You could use formail/procmail, > > formail -s procmail sa_learn.rc < mbox | sa-learn > > where sa_learn.rc might appear as follows: > > > LOGFILE=$HOME/sa_learn.log # While debugging > VERBOSE=yes# """" > LOGABSTRACT=yes# """" > SENDMAIL=

RE: [SAtalk] sa-learn, mbox deleted messages

2004-01-17 Thread Gary Funck
> From: Barton L. Phillips > Sent: Saturday, January 17, 2004 9:28 AM > > I am using Mozilla and when I delete a message it is marked: > Status: RO > X-Status: D > > When I run sa-learn the deleted messages are learned. I can "compact > this folder" but I sometimes forget. Is there a way to ha

RE: [SAtalk] Spam Collecting

2004-01-16 Thread Gary Funck
> From: cube > Sent: Friday, January 16, 2004 8:52 AM > > Does anyone have a good way of collecting ham for the bayesian > filters. I > can collect spam quite easily, but mixed in with my ham is all > kinds of spam. > (There is a buttload of spam with less hits than 1.) > > I read everywhere t

RE: [SAtalk] FP with backhair

2004-01-14 Thread Gary Funck
> > > > Got my first false positive :-/ > > Backhair scored on a .pdf... > > Any hints how to avoid these? > > > > > > X-Spam-Status: Yes, hits=12.0 tagged_above=3.0 required=5.3 > > tests=J_BACKHAIR_11, J_BACKHAIR_12, J_BACKHAIR_13, J_BACKHAIR_14, > > J_BACKHAIR_21, J_BACKHAIR_22, J_BACKHAIR

RE: [SAtalk] FP with backhair

2004-01-14 Thread Gary Funck
Matt replied (in part): > > >I thought it was only supposed to scan text/html attachments? > > I've never heard anyone claim such. > Here's what the current docs. say: body SYMBOLIC_TEST_NAME /pattern/modifiers Define a body pattern test. pattern is a Perl regular expression. The 'body' in this

RE: [SAtalk] FP with backhair

2004-01-14 Thread Gary Funck
> -Original Message- > From: Rolf Kraeuchi > Sent: Wednesday, January 14, 2004 10:05 AM > > Got my first false positive :-/ > Backhair scored on a .pdf... > Any hints how to avoid these? > > > X-Spam-Status: Yes, hits=12.0 tagged_above=3.0 required=5.3 > tests=J_BACKHAIR_11, J_BACKHAIR_

[SAtalk] (OT) Anti-spam law enacted -- so what's all this junk in my in-box? Risks Digest 23.12

2004-01-12 Thread Gary Funck
[I read Earthlink's suggestions below, and thought "We really are in trouble."] http://www.interesting-people.org/archives/interesting-people/200401/msg00117.html Date: Mon, 12 Jan 2004 10:15:53 -0700 From: "NewsScan" <[EMAIL PROTECTED]> Subject: Anti-spam law enacted -- so what's all this junk

RE: [SAtalk] Duplicate Emails

2004-01-12 Thread Gary Funck
As an aside, formail -D 2 /tmp/dup_id_cache.$$ -s < mbox.txt > mbox_no_dupes.txt; rm -f /tmp/dup_id_cache.$$ will do a decent job of weeding out duplicates (based upon message id), where 2 is the size of the id cache.
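For readers without formail handy, the same "weed out duplicates by message id" idea can be sketched in Python. This is only an illustration under stated assumptions: the function name `drop_duplicates` is hypothetical, the messages are assumed to be already parsed, and unlike formail's fixed-size id cache this sketch remembers every id it has seen.

```python
from email import message_from_string


def drop_duplicates(messages):
    """Keep the first message seen for each Message-ID header.

    Messages without a Message-ID are always kept, since there is
    nothing to compare them on.
    """
    seen_ids = set()
    unique = []
    for msg in messages:
        msg_id = (msg.get("Message-ID") or "").strip()
        if msg_id and msg_id in seen_ids:
            continue  # duplicate id: skip this copy
        if msg_id:
            seen_ids.add(msg_id)
        unique.append(msg)
    return unique
```

To run it over a real mbox file, the standard library's `mailbox.mbox` class can supply the message objects.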

[SAtalk] (OT) Inbox Trauma: New Anti-Spam Tools Falter

2004-01-11 Thread Gary Funck
http://www.interesting-people.org/archives/interesting-people/200401/msg00107.html -Original Message- From: Claudio Gutierrez <[EMAIL PROTECTED]> Date: Sun, 11 Jan 2004 20:56:04 To: Dave Farber <[EMAIL PROTECTED]> Subject: Inbox Trauma: New Anti-Spam Tools Falter Dave I think you

[SAtalk] RE: Neural Net scoring

2004-01-10 Thread Gary Funck
> From: Nix > Sent: Saturday, January 10, 2004 10:35 AM [...] > > See bug 2910. > Thanks. Here's the link: http://bugzilla.spamassassin.org/show_bug.cgi?id=2910 Copyright (c)2003 Henry Stern Fast SpamAssassin Score Learning Tool Henry Stern Faculty of Computer Science Dalhousie University 605

RE: [SAtalk] Re: Is BigEvil for me?

2004-01-10 Thread Gary Funck
> From: Bryan Hoover > Sent: Friday, January 09, 2004 10:52 PM [...] > > > Gary Funck wrote: > > > > > From: Robert Menschel > > Here's an idea that I've been considering for a while: have SA > change its > > scoring strategy to use

RE: Re[2]: [SAtalk] Is BigEvil for me?

2004-01-09 Thread Gary Funck
> From: Robert Menschel > Sent: Friday, January 09, 2004 7:34 PM [...] > > Evil rules, if/when guaranteed, can be scored at or above your spam > threshold. An example from my personal files (where my spam threshold is > 9): > uri RM_u_53x /53x\.net/i > describe RM_u_53

RE: [SAtalk] backhair hits on an attachment

2004-01-09 Thread Gary Funck
> From: Lindsay Snider > Sent: Friday, January 09, 2004 8:03 AM > > > I'm running pbw dated 2003-11-13. Backhair recently hit hard on a piece > of ham. The rules matched on an uuencoded pdf attachment. Does anyone > have any insight on how I might prevent these from hitting in the > future.

RE: [SAtalk] how to filter the MS Update virus?

2004-01-09 Thread Gary Funck
Not with SA, but in procmail, I use a canned recipe fetched off the net: In .procmailrc: # # eliminate virus mail. # MYVIRUS=virus-trap INCLUDERC=/etc/mail/procmail/virussnag.rc virussnag.rc is located here: http://www.spamless.us/pub/procmail/virussnag.rc Leading comments: ##

RE: [SAtalk] Re: Checking URLs in email body against RBLs too?

2004-01-09 Thread Gary Funck
> -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] Behalf Of Bob > Proulx > Sent: Friday, January 09, 2004 12:30 AM > To: [EMAIL PROTECTED] > Cc: Petri Koistinen > Subject: [SAtalk] Re: Checking URLs in email body against RBLs too? > > > Petri Koistinen wrote: >

RE: [SAtalk] detecting large collections of random words

2004-01-08 Thread Gary Funck
> From: Chris Petersen [...] > > Yes. though I used: > > /(\b[a-z]{4,12}\s+){12}/ > > notice the initial \b, and there's no need to make SA continue to search > beyond the "minimum" match, so leave off the , in the last {} cluster. > Looks good. Just running this over a ham mail box with about

RE: [SAtalk] detecting large collections of random words

2004-01-08 Thread Gary Funck
> -Original Message- > From: [EMAIL PROTECTED] > Sent: Thursday, January 08, 2004 12:57 PM > > Would this regex make more sense? > > /([a-z]{4,12}\s){12,}/ Slightly better might be: /(?:(\b[a-z]{4,12}\s+){12,})/
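The pattern discussed in this thread can be tried outside SA with a few lines of Python. This is a sketch only: the helper name `has_random_word_run` is made up for the example, and it uses the fixed `{12}` form (stopping at the minimum match) rather than `{12,}`.

```python
import re

# Twelve consecutive lowercase words of 4 to 12 letters, each followed by
# whitespace. The leading \b keeps a match from starting mid-word.
RANDOM_WORD_RUN = re.compile(r"(?:\b[a-z]{4,12}\s+){12}")


def has_random_word_run(text):
    """Return True if text contains a run of 12 such words."""
    return RANDOM_WORD_RUN.search(text) is not None
```

Ordinary prose tends to break the run quickly because of short words, capitals, and punctuation, which is why the rule keys on a long unbroken sequence; how well that holds on a given ham corpus is something to measure rather than assume.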

RE: [SAtalk] Continuing saga of runaway spamd

2004-01-07 Thread Gary Funck
> but was the advocate for adding spamassassin, so this matters to me :-) > > I did suggest yesterday disabling bayesian and auto learning, but I > could not see any evidence of a spamassassin bayes database anywhere. > We are running spamd from exim, if that has any import, and all > running on

RE: [SAtalk] online degree glop

2004-01-05 Thread Gary Funck
> -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] Behalf Of Brad > Koehn > Sent: Monday, January 05, 2004 8:58 PM > To: [EMAIL PROTECTED] > Subject: [SAtalk] online degree glop > > > I've been getting a bunch of messages either squeaking by SA 2.61 or > nearly so

RE: [SAtalk] Re: Resource conservation

2004-01-04 Thread Gary Funck
> From: Bryan Hoover > Sent: Saturday, January 03, 2004 8:41 PM [...] > > > > > > Notice in, HEADERTAGVAL=`formail -rztx To:`, there must be that space > > > between the options, and the tag -- would not work on my > system without > > > it, and procmail tips page notes it as well. > > > > Intere

[SAtalk] RE: Resource conservation

2004-01-03 Thread Gary Funck
> From: Bryan Hoover [mailto:[EMAIL PROTECTED] > Sent: Saturday, January 03, 2004 7:05 PM [...] > > #extract only the address part -- so that's the format addresses should > appear in #mailing list address file too. > > HEADERTAGVAL=`formail -rztx To:` > > ISMAILINGLIST=no > > :0 > * ? grep -i -x

[SAtalk] OT: History of Spam

2004-01-03 Thread Gary Funck
http://www.interesting-people.org/archives/interesting-people/200401/msg00021.html Delivered-To: [EMAIL PROTECTED] Date: Fri, 02 Jan 2004 23:14:27 -0500 From: Jonathan B Spira <[EMAIL PROTECTED]> Subject: History of Spam To: [EMAIL PROTECTED] Dave, hi and Happy New Year! Our report on the his

RE: [SAtalk] blacklisting in procmail (was: Resource conservation)

2004-01-03 Thread Gary Funck
> > > About the quiet parameter -- maybe Procmail ignores failed match > > output. > > Yup, but why generate output when you don't need it? > follow-up: the '-q' switch to grep, apart from being 'quiet' causes grep to stop immediately when a match is found. Therefore, it is also more efficien

RE: [SAtalk] blacklisting in procmail (was: Resource conservation)

2004-01-03 Thread Gary Funck
> From: Bryan Hoover > Sent: Saturday, January 03, 2004 2:46 AM > [...] > Bob Proulx wrote: > > > > Bryan Hoover wrote: > > > HEADERTAG=From > > > ADDRESSFILE=/usr/home/bhoover/listreply > > > > Use $MAILDIR here? > > > > ADDRESSFILE=$MAILDIR/listreply If you use $MAILDIR, there is no reason t

RE: [SAtalk] procmail

2004-01-02 Thread Gary Funck
> -Original Message- > From: Jack Gostl > Sent: Friday, January 02, 2004 7:00 PM > To: Douglas Kirkland > Cc: [EMAIL PROTECTED] > Subject: Re: [SAtalk] procmail > > > > > On Friday 02 January 2004 17:26, Jack Gostl wrote: > > > This is really more of a procmail question, but its part o

RE: [SAtalk] Spell Checking the Subject Header (RESULTS)

2003-12-31 Thread Gary Funck
Building on Adam's perl script, this rendition will print the words it sees which begin with rare tuples. my (@rare_tuples) = qw/bb bc bd bf bg bh bj bk bm bn bp bq bs bt bv bw bx bz cb cc cd cf cg cj ck cm cn cp cq cs ct cv cw cx db dc dd df dg dj dk dl dm dn dp dq ds dt dv dx dz eh ez fb fc fd
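As a rough Python counterpart to the Perl fragment above (an illustration, not a port of the actual script): the tuple set below is a small hand-picked subset of the list shown, and the function name `rare_prefix_words` is hypothetical.

```python
import re

# A few two-letter word beginnings from the list above; the full list in
# the original message is much longer.
RARE_TUPLES = frozenset({"bb", "bc", "bd", "cq", "dz", "fb"})


def rare_prefix_words(text, rare=RARE_TUPLES):
    """Return the words in text whose first two letters form a rare tuple."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w[:2] in rare]
```

For example, `rare_prefix_words("Get bdfgh and fbxyz today")` picks out the two nonsense words and ignores the rest.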

RE: [SAtalk] running SA on existing mail spools

2003-12-30 Thread Gary Funck
> -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] Behalf Of Joey > Netterville > Sent: Sunday, December 28, 2003 1:55 PM > To: Dave Kliczbor > Cc: Joey Netterville; [EMAIL PROTECTED] > Subject: Re: [SAtalk] running SA on existing mail spools > > > okay, that was a

[SAtalk] suggestion: mail_grep - a tool for scanning e-mail

2003-12-30 Thread Gary Funck
In ferreting out spam phrases, bad URL's, and in scanning e-mail in general, I think it'd be convenient if there were a grep-like utility that understood e-mail. Let's call it mail_grep. Mail_grep would be able to scan e-mail (in mbox format at a minimum) for occurrences of a given string. To do thi

RE: [SAtalk] Re: False positives

2003-12-29 Thread Gary Funck
> From: JRiley > Sent: Monday, December 29, 2003 9:43 PM > [...] > > > > The offending sentence is "We are an online discussion group in > > GA for parents and caregivers of children and young adults with > > disabilities." Sounds really pornographic, doesn't it? [...] > > There are several othe

RE: [SAtalk] remove markup question and bayes question

2003-12-28 Thread Gary Funck
> -Original Message- > From: S. M. C. Butler > Sent: Sunday, December 28, 2003 7:51 PM > [...] > [Simon] I get about 50 spams a day and maybe 10 regular emails of which > 4 are under the -1 threshold for ham. It's going to be somewhat > difficult to get even close to parity for my spam/

RE: [SAtalk] Some spam getting very low scores despite consistently using sa-learn

2003-12-28 Thread Gary Funck
> -Original Message- > From: Ricardo Kleemann > Sent: Sunday, December 28, 2003 9:56 AM [...] > I've placed a tarball at: > > www.americasnet.com/spam_samples/spam_samples.tgz > > If anyone would be kind enough to take a look and give me > some pointers on how I could improve my SA conf

RE: [SAtalk] running SA on existing mail spools

2003-12-28 Thread Gary Funck
> -Original Message- > From: Joey Netterville > Sent: Friday, December 26, 2003 12:44 PM > > i'm running spamassassin and it works wonderfully on new, incoming email. > i'm an administrator hoping to implement this on my machine and give users > the ability to start filtering their email.

RE: [SAtalk] Ebay spoof?

2003-12-26 Thread Gary Funck
> -Original Message- > From: Martin Radford > Sent: Friday, December 26, 2003 4:21 PM > > > At Fri Dec 26 22:31:27 2003, Gary Funck wrote: > > > > As far as 66.135.209.220 goes: > > > > # dig -x 220.209.135.66 +recursive > > You

RE: [SAtalk] Re: Ebay spoof?

2003-12-26 Thread Gary Funck
> -Original Message- > From: Bryan Hoover > Sent: Friday, December 26, 2003 2:52 PM [...] > > Original message attached -- pretty much the same I think, as that > pasted was from a straight 'cat /var/mail/bhoover', but your 'dig' is > interesting. Interesting. The message body appeared a

RE: [SAtalk] Ebay spoof?

2003-12-26 Thread Gary Funck
> -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] Behalf Of Bryan > Hoover > Sent: Friday, December 26, 2003 1:59 PM > To: [EMAIL PROTECTED] > Subject: [SAtalk] Ebay spoof? > > > Does anyone know if the following mail is an ebay spoof? I think I've > got an accou

RE: [SAtalk] The first spam to make it through since Friday...

2003-12-24 Thread Gary Funck
Time to update your BigEvil.cf list: http://www.merchantsoverseas.com/wwwroot/gorilla/bigevil.cf Latest entry: uri BigEvilList_191 /\b(?:starterz\.net|linksss\.com|53x\.net|savvypurchaser\.com|hurricane\-map\.com|webcastingsales\.com|thisishow2market\.com|Opportunity\.com|66\.98\.194\.243

RE: [SAtalk] Single image spams with random info

2003-12-23 Thread Gary Funck
Try this recent thread: http://marc.theaimsgroup.com/?t=107208359200023&r=1&w=2 > From: Greg Webster > Sent: Tuesday, December 23, 2003 12:05 PM > To: [EMAIL PROTECTED] > Subject: [SAtalk] Single image spams with random info > > > We're getting a TON of these, all of similar format. > > > hr

RE: [SAtalk] Amuseing hidden text in spam

2003-12-22 Thread Gary Funck
> > > And, anyone know what the x-stuff-for-pete I often see in > spam is from > > ? > > > Ask Pete. :) After some Googling, it seems that this spammer has it in for [EMAIL PROTECTED] Take a look at news.admin.net-abuse.sightings, and you'll see that Pete was posting a lot of these spam samp

RE: [SAtalk] Image-only spam

2003-12-21 Thread Gary Funck
Try Fred's rules, http://www.merchantsoverseas.com/wwwroot/gorilla/90_FVGT.cf esp. this combo image only rule: meta FVGT_combo_IMAGEONLY1 ((HTML_IMAGE_ONLY_02 + MIME_HTML_ONLY + MIME_HTML_ONLY_MULTI) > 1) describe FVGT_combo_IMAGEONLY1 FVGT - Image only type spam? score

RE: [SAtalk] MSGID_FROM_MTA_SHORT problem

2003-12-21 Thread Gary Funck
Known problem/bug? http://bugzilla.spamassassin.org/show_bug.cgi?id=2311 If it is not working for you, you can disable the test by setting its score to 0: SCORE MSGID_FROM_MTA_SHORT 0 in your user_prefs, or local.cf depending upon whether you're running as a user or a sysadmin. > -Original

RE: [SAtalk] a few newbie questions

2003-12-21 Thread Gary Funck
Couple of things: 1) this SA Talk list can help you more if you attach an example of the offending message, complete with headers (all headers, unchanged). That way we can run it through our collection of tweaked rules, and let you know what's working for us. 2) Mosey on over to Chris Santerre's

[SAtalk] auto whitelist questions

2003-12-20 Thread Gary Funck
[I'm reposting this question. I think it might've gotten lost in the many messages over the past few days.] Hello, I've been using auto whitelist for a while now, but today while doing some experimentation I'm wondering if the explicit (auto) white listing feature is working at all (version 2.

RE: [SAtalk] Possible rule for IE_Redirect exploit

2003-12-19 Thread Gary Funck
> -Original Message- > From: Bill Larson > Sent: Friday, December 19, 2003 9:31 AM > > Comments and suggestions on this rule are appreciated. > > full LOCAL_IEREDIR /[EMAIL PROTECTED](\/|htm|html|php|shtml)?/ > score LOCAL_IEREDIR 150 > describe LOCAL_IEREDIR Possible phishing/URL Mask

RE: [SAtalk] OT: Spam: Behind the scenes

2003-12-19 Thread Gary Funck
> -Original Message- > From: Chris Santerre > Sent: Friday, December 19, 2003 8:54 AM > To: 'Gary Funck'; Spamassassin List > Subject: RE: [SAtalk] Spam: Behind the scenes > > > *snip* > > From: [EMAIL PROTECTED] > > Sent: Thursday, Jun

[SAtalk] Another method for letting users tag false positives/negatives

2003-12-19 Thread Gary Funck
Something Robert N. said on sa-dev, " ... basically reconstructing a message that I'm forwarding myself in order to have it "unlearnt" ... " led me to think that there might be an easy to use method for users to tag mis-classified messages from within their (MIME-aware, html-capable) mail clien

RE: [SAtalk] IE redirect rule

2003-12-18 Thread Gary Funck
There was a lengthy thread on this topic recently, http://marc.theaimsgroup.com/?t=10711571901&r=1&w=3 but it didn't look like it concluded with a definitive rule definition. --- This SF.net email is sponsored by: IBM Linux Tutorials. Bec

[SAtalk] auto whitelist questions

2003-12-18 Thread Gary Funck
Hello, I've been using auto whitelist for a while now, but today while doing some experimentation I'm wondering if the explicit (auto) white listing feature is working at all (version 2.61)? I'm also unsure of the exact syntax for explicitly (auto) white listing an address. I begin by trying to

[SAtalk] Spam: Behind the scenes

2003-12-18 Thread Gary Funck
http://blog.seattlepi.nwsource.com/microsoft/archives/001161.html December 18, 2003 Spam: Behind the scenes Ever wonder about the mindset of the people who clog your inbox with unsolicited e-mail? The lawsuits filed today by Microsoft and the New York Attorney General against an alleged spam ring

RE: [SAtalk] regex expansion tool

2003-12-17 Thread Gary Funck
I think I got the original pointer from this list; maybe this program is what you're looking for? http://weitz.de/regex-coach/ Abstract The Regex Coach is a graphical application for Linux and Windows which can be used to experiment with (Perl-compatible) regular expressions interactively. It ha

RE: [SAtalk] bigevil 2.04 posted

2003-12-17 Thread Gary Funck
Hi Chris, welcome back. I've been running with the prior version of BigEvil, and it's working great. Thanks for all your hard work. Quick question: > For fun, check out http://www.rollie.biz/ , yeah that IP got a listing in > my firewall now. When you say "firewall", above, does that mean in

RE: [SAtalk] Test hit results report or log

2003-12-16 Thread Gary Funck
> From: Chris A > Sent: Tuesday, December 16, 2003 9:35 PM > > Is there a way to get a report or log of the test > results hits that spamassassin finds. The idea is I > want to better fine tune the values assigned to certain > tests. However it is hard to narrow down just which > tests are getting hi

RE: [SAtalk] Received headers disappearing...?

2003-12-15 Thread Gary Funck
With Larry's help, and a quick read of "perldoc Mail::SpamAssassin::Conf", I determined that I need to add the following to local.cf to get the behavior that I'm looking for: report_safe_copy_headers Received > -Original Message- > From: Larry Rosenman > Sent: Monday, December 15, 2003 9:

RE: [SAtalk] Received headers disappearing...?

2003-12-15 Thread Gary Funck
> -Original Message- > From: Brad Koehn > Sent: Monday, December 15, 2003 9:26 AM > To: [EMAIL PROTECTED] > Subject: Re: [SAtalk] Received headers disappearing...? > > > Oh boy am I dumb! > > Never mind, the Received headers are correct, it's the attached spam > message that shows the ful

RE: [SAtalk] Mysterious SA tags in SPAM message?

2003-12-13 Thread Gary Funck
Teergruber? OK, I had to ask Google about this term of art: http://www.iks-jena.de/mitarb/lutz/usenet/teergrube.en.html > From: Nix > Sent: Saturday, December 13, 2003 2:49 PM [...] > > It makes a lot more sense to teergrube the buggers and /dev/null the > results, I'd say. > > (Anyone got a

RE: [WL] [SAtalk] Re: YO DEVELOPER! SA Rule COUNTER?

2003-12-13 Thread Gary Funck
> -Original Message- > From: Charles Gregory > Sent: Saturday, December 13, 2003 12:56 PM [...] > > On Sat, 13 Dec 2003, Bryan Hoover wrote: > > > But if we were able to check the COUNT of how many times a particular > > > rule was matched, we could easily distinguish runaway use of > o

RE: [SAtalk] Mysterious SA tags in SPAM message?

2003-12-12 Thread Gary Funck
> > > > Ah. That's what I was starting to think as I typed up my > > original message. > > > > I agree. It seems funny to do a check for SPAM and not do > > any sort of check for open relay. I'm no expert on Received headers, but: Received: from 212.214.136.47 (EHLO smtp-fe2.ballou

[SAtalk] RE: [RD] raw/rare/folded/plain/alphed body/subject rendering streams

2003-12-11 Thread Gary Funck
> > > One implementation might be to convert the rewrite rules into an > > equivalent flex description, and let flex generate the automaton in > > C. Compile the C, and build a Perl binding to it. Scott replied: > I considered that and did a prototype (which was useful for > performance estimate

RE: [SAtalk] Re: [RD] raw/rare/folded/plain/alphed body/subject rendering streams

2003-12-11 Thread Gary Funck
> -Original Message- > From: Scott A Crosby > Sent: Thursday, December 11, 2003 6:49 AM [...] > > The major catch with this particular implementation is that it cannot > deal with nondeterministic transformations. What this means is that > any consequent for a substitute rule must be a si

RE: [SAtalk] [RD] raw/rare/folded/plain/alphed body/subject rendering streams

2003-12-10 Thread Gary Funck
Soundex might be a practical solution. Perhaps a manageable approach is to first apply a spelling check using both a regular dictionary and augmenting it with a set of spammer mis-spellings. Then, send the output of that step into Soundex. The Soundex is a heuristic for catching the creative alter
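For anyone who wants to experiment with the suggestion, standard American Soundex is short enough to sketch in Python. This is an illustration only, not SA code; the function name `soundex` and the choice to ignore non-ASCII characters are assumptions made for the example.

```python
def soundex(word):
    """Return the 4-character American Soundex code for word ('' if no letters)."""
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in letters:
            codes[ch] = digit

    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""

    result = letters[0]                # the first letter is kept as-is
    prev = codes.get(letters[0], "")   # but its code still suppresses repeats
    for ch in letters[1:]:
        if ch in "HW":
            continue                   # H and W do not separate equal codes
        code = codes.get(ch, "")       # vowels (and Y) have no code
        if code and code != prev:
            result += code
        prev = code                    # a vowel resets prev, allowing a repeat
    return (result + "000")[:4]        # pad or truncate to letter + 3 digits
```

Two spellings that differ only in vowels or in same-sounding consonants, such as "discount" and "diskount", come out with the same code, which is the property the spelling-check-then-Soundex pipeline would rely on.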

RE: [SAtalk] non-numeric atime in Bayes db? (SA 2.61)

2003-12-10 Thread Gary Funck
Hi Theo. > -Original Message- > From: Theo Van Dinter [mailto:[EMAIL PROTECTED] > Sent: Wednesday, December 10, 2003 2:02 PM > To: Gary Funck > Cc: Spamassassin List > Subject: Re: [SAtalk] non-numeric atime in Bayes db? (SA 2.61) > > > On Wed, Dec 10, 200

RE: [SAtalk] non-numeric atime in Bayes db? (SA 2.61)

2003-12-10 Thread Gary Funck
> -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] Behalf Of Gary > Funck > Sent: Wednesday, December 10, 2003 8:49 AM > To: Spamassassin List > Subject: [SAtalk] non-numeric atime in Bayes db? (SA 2.61) > > > > Hello, > &g

RE: [SAtalk] [RD] raw/rare/folded/plain/alphed body/subject rendering streams

2003-12-10 Thread Gary Funck
> -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] Behalf Of Gary > Funck > Sent: Wednesday, December 10, 2003 1:09 PM > To: [EMAIL PROTECTED] > Subject: RE: [SAtalk] [RD] raw/rare/folded/plain/alphed body/subject > rendering streams >

RE: [SAtalk] [RD] raw/rare/folded/plain/alphed body/subject rendering streams

2003-12-10 Thread Gary Funck
> -Original Message- > From: SpamTalk > Sent: Wednesday, December 10, 2003 12:49 PM > > It would seem to me that, for purposes of rule simplification, that the > subject and body of messages to be scanned should be available in > pre-processed flavors, some of which is currently availabl

[SAtalk] RE: [RD] Obfuscation by Punctuation

2003-12-10 Thread Gary Funck
> -Original Message- > From: Greg Webster > Sent: Wednesday, December 10, 2003 11:45 AM > > > Here's what I've recently done: > rawbody GWW_PUNCT /([a-z][:punct:]+[a-z])|( [A-Z][:punct:]+[a-z])/i > score GWW_PUNCT 2.0 > > It's not perfect, but it does the job. I think that pattern is g

RE: [SAtalk] Obfuscation by Punctuation

2003-12-10 Thread Gary Funck
> -Original Message- > From: Brad Wilkin > Sent: Wednesday, December 10, 2003 9:23 AM [...] > Has anyone had success writing tests that can catch this sort of > trickery? It > seems if you could come up with a level of punctuation WITHIN > words or simply > remove common punctuation from

RE: [SAtalk] non-numeric atime in Bayes db? (SA 2.61)

2003-12-10 Thread Gary Funck
> > should I try an 'sa-learn --rebuild' at this point? > > follow-up. 'sa-learn --rebuild' just printed out more of these messages: > /usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/Conf.pm line 362. > Argument "" isn't numeric in numeric lt (<) at > /usr/lib/perl5/site_perl/5.8.0/Mail/Spam

[SAtalk] non-numeric atime in Bayes db? (SA 2.61)

2003-12-10 Thread Gary Funck
Hello, after running a spam refiling script which invokes 'spamassassin -r', I received the following diagnostics: /usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/Conf.pm line 362. Argument "" isn't numeric in numeric lt (<) at /usr/lib/perl5/site_perl/5.8.0/Mail/SpamAssassin/BayesStore.pm line

RE: [SAtalk] OT Help: Mail form CGI script?

2003-12-09 Thread Gary Funck
> -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] Behalf Of Evan > Platt > Sent: Tuesday, December 09, 2003 1:05 PM > To: SpamAssassin > Subject: [SAtalk] OT Help: Mail form CGI script? > > > I'm in need of a decent CGI script, easily implementable by a somewhat

RE: [SAtalk] Content Analysis

2003-12-09 Thread Gary Funck
Interesting mail header on this message: Return-Path: <[EMAIL PROTECTED]> Received: from intrepid.intrepid.com (intrepid.intrepid.com [192.195.190.1]) by screamer.intrepid.com (8.12.8/8.12.8) with ESMTP id hB9H4rhE029668 for <[EMAIL PROTECTED]>; Tue, 9 Dec 2003 09:04:53 -0800 Rece

RE: [SAtalk] Delete mail with a score above n

2003-12-08 Thread Gary Funck
> -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] Behalf Of Bob > Apthorpe > Sent: Monday, December 08, 2003 9:45 AM > To: [EMAIL PROTECTED] > Subject: Re: [SAtalk] Delete mail with a score above n > > > On Mon, 8 Dec 2003 12:25:56 -0500 "Bill" > <[EMAIL PROTECTED

RE: [SAtalk] Negative score redux

2003-12-08 Thread Gary Funck
> From: Matt Kettler > Sent: Monday, December 08, 2003 7:55 AM [...] > If you're doing a procmail/MDA type delivery, you can make accounts for > each user and start passing -u to spamc, and have the > all_spam_to's in each user's home directory. > > Other than that, there's not much you can do. I

RE: [SAtalk] filtering spam tagged email before hitting exchange 2000

2003-12-08 Thread Gary Funck
> -Original Message- > From: Ralf Hildebrandt > Sent: Monday, December 08, 2003 6:52 AM [...] > > Use Postfix+amavisd-new+SpamAssassin -- and use D_DISCARD to drop the > spam on the gateway. What are the pros/cons of using sendmail+MIMEDefang+SA versus Postfix+amavisd-new+SpamAssassin?

RE: [SAtalk] Being BigEvil inspired...

2003-12-05 Thread Gary Funck
> > > > > > But incoming messages encoded in Base64 containing links with > > the above > > > domains are not recognized ?? Why ?? > > one word: rawbody > > > > LER > > > > OUCH! That makes a lot of sense. Hmmm. Should I change bigevil to > URI??? > A related question: when you scan the mess

RE: [SAtalk] How to fix ?

2003-12-04 Thread Gary Funck
> -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] Behalf Of Gary > Lopez > Sent: Thursday, December 04, 2003 9:45 AM > To: Matt Kettler > Cc: [EMAIL PROTECTED] > Subject: Re: [SAtalk] How to fix ? > > > > > Matt Kettler wrote: > > > At 06:38 PM 12/3/2003, Gar

RE: [SAtalk] Spam Statistics

2003-12-04 Thread Gary Funck
> -Original Message- > From: [EMAIL PROTECTED] > Sent: Thursday, December 04, 2003 7:45 AM > To: Rubin Bennett > > > Assuming my minor tweaks to the original script I saw posted here are > correct, here are my latest spam stats.. *sheesh* > > Mail Statistics; >

RE: [SAtalk] empty e-mails

2003-12-02 Thread Gary Funck
As it stands, this recipe only checks that there is a line whose first character is not a space or a tab. And if this sort of message is found, it is deposited in the spam folder. Likely not what was intended. > -Original Message- > From: Aaron Young > Sent: Monday, December 01, 2003

RE: [SAtalk] .procmailrc

2003-12-02 Thread Gary Funck
In addition, this looks wrong: > :0: > * ^X-Spam-Flag: YES > /Spamfolder It looks as if you're trying to deposit the spam into a folder that lives just below the root directory. Perhaps you meant: :0: * ^X-Spam-Flag: YES Spamfolder > -Original Message- > From: Martin Radford > Sent: Mon

RE: [SAtalk] Another Procmail question

2003-11-29 Thread Gary Funck
1. There is a procmail mailing list, where questions like this are discussed. Warning: although not as cranky as the sendmail list, this group will encourage you to RTFM before posting your question. see http://mailman.rwth-aachen.de/mailman/listinfo/procmail for details. 2. Given tha

[SAtalk] (humor) welcome to Spam University

2003-11-05 Thread Gary Funck
A friend sent me this one: http://j-walk.com/other/spamu/index.htm "Not just an education... a career." Welcome to Spam University, the world's top-rated educational institution for the growing spam industry. Are you tired of your dead-end job? Want to make some big-time cash without actually wo

[SAtalk] Easy way to join two Bayes databases?

2003-10-19 Thread Gary Funck
My personal Bayes database is more up-to-date than a co-worker's. I'd like to share the database with him, but since he likely also has Bayes entries which are unique to his own mix of ham and spam, I was wondering if there might be some tool/trick for merging the databases? Is this a meaningful o

RE: [SAtalk] image only porn

2003-10-13 Thread Gary Funck
> -Original Message- > From: John Scully > > I have been running spamassassin 2.55 for some time (am about to go to > 2.60). > I'd recommend trying out 2.60, and seeing if things get better. In particular, it has a pattern which matches HTML redirection through a Yahoo! site. Last ti

RE: [SAtalk] Scan Message Max Size

2003-09-19 Thread Gary Funck
> -Original Message- > From: Tom Meunier > Sent: Friday, September 19, 2003 1:20 PM > To: Spamassassin List > Subject: RE: [SAtalk] Scan Message Max Size > > > > > > Define "near". The latest Microsoft update spoof is about 155K. > > > > That'd be like that New Shimmer! It's a viru

RE: [SAtalk] Scan Message Max Size

2003-09-19 Thread Gary Funck
> -Original Message- > From: Tom Meunier > Sent: Thursday, September 18, 2003 1:44 PM > To: [EMAIL PROTECTED] > Subject: RE: [SAtalk] Scan Message Max Size > > > Define "safe" - I stick with the default of 250kb and have never had > an issue with it. I can't see receiving a spam anywh

RE: [SAtalk] System goes down

2003-09-02 Thread Gary Funck
You might be experiencing hardware problems that only occur under load. SA/spamd uses a lot of cpu and memory cycles. If it runs long enough, on large messages, it might push the cpu past its operating temperature range if, for example, the system cooling is marginal. Likewise, marginal memory mig

[SAtalk] A quick look at differences between 2.55 and 2.60 (rc4) - rules/scores

2003-09-01 Thread Gary Funck
[below, words containing a well-known spam word, were changed to NIAGRA, in order to make it past Source Forge's lame spam filters.] I was curious to get a feeling for the differences between the 2.55 release and the upcoming 2.60 release, and gathered the following brief statistics. New rules in

RE: [SAtalk] big attachments taking too long to process

2003-08-25 Thread Gary Funck
> -Original Message- > From: Bart Schaefer > Sent: Sunday, August 24, 2003 3:02 PM > > > On Sun, 24 Aug 2003, Gary Funck wrote: > > > # Otherwise, just test an excerpt, and deliver spam > > # directly into big-spam.mbox. > > :0E: > &

RE: [SAtalk] big attachments taking too long to process

2003-08-24 Thread Gary Funck
> > The procmail rule might look like this: > > # Filter small messages the regular way > :0fw:spamassassin.lock > * ! > 14999 > | spamassassin > > # Otherwise, just test an excerpt, and deliver spam > # directly into big-spam.mbox. > :0E: > ? (head -c 7500; echo ""; tail -c 7500) | spamassassin -
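The excerpt step in the quoted recipe, `(head -c 7500; echo ""; tail -c 7500)`, can be sketched in Python as follows. This only illustrates the idea of testing the first and last bytes of a large message. The function name `excerpt` is hypothetical, and messages shorter than twice the excerpt size are simply returned unchanged here, since the recipe routes small messages through the regular filter.

```python
def excerpt(message: bytes, n: int = 7500) -> bytes:
    """Return the first n and last n bytes of message, joined by a newline.

    Mirrors `(head -c n; echo ""; tail -c n)` for messages of at least
    2*n bytes. Shorter messages are returned whole (an assumption of
    this sketch, not something the recipe itself does).
    """
    if len(message) < 2 * n:
        return message
    return message[:n] + b"\n" + message[-n:]


if __name__ == "__main__":
    big = b"H" * 10000 + b"T" * 10000
    print(len(excerpt(big)))  # 7500 + 1 + 7500 bytes
```

The point of the design is that the scanner sees a bounded amount of data regardless of attachment size, at the cost of ignoring whatever sits in the middle of the message.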

RE: [SAtalk] Spam filters rated on Slashdot

2003-08-24 Thread Gary Funck
The test was uniformly unfair. The author trained the Bayesian spam detector programs with something like 70 messages of spam and ham each. Thus, even if SA had been run with Bayes enabled, it might've yielded atypical results that fell short of how SA might perform in a production environment. As

RE: [SAtalk] big attachments taking too long to process

2003-08-24 Thread Gary Funck
> -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] Behalf Of > Abigail Marshall > Sent: Friday, August 22, 2003 4:35 PM > To: Paul Adams > Subject: Re: [SAtalk] big attachments taking too long to process > > > > > Hello Paul, > > Wednesday, August 20, 2003, 8:37:30

RE: [SAtalk] Re: How To Change Recipient In User Unknown Message?

2003-08-19 Thread Gary Funck
> -Original Message- > From: [EMAIL PROTECTED] > [mailto:[EMAIL PROTECTED] Behalf Of John > P Verel > Sent: Tuesday, August 19, 2003 6:43 AM > To: spamassassin list > Subject: [SAtalk] Re: How To Change Recipient In User Unknown Message? > > > > On 08/19/03 07:06 +0100, Yorkshire Dave wro

RE: [SAtalk] Re: help with procmail script

2003-08-18 Thread Gary Funck
> -Original Message- > From: John P Verel > Sent: Monday, August 18, 2003 1:24 PM [...] > > I do this, use the following, as I prefer to not have [SAtalk] on the > subject line. This recipe strips out the string and sends the message > along to the Spamassassin_talk folder. You can modi

RE: [SAtalk] Default Bayes scoring, and default cutoff value - too many false positives

2003-08-14 Thread Gary Funck
> -Original Message- > From: Robert Menschel > Sent: Tuesday, August 05, 2003 8:29 PM [...] > > Of those 1100 messages, how many were spam, and how many were ham? I > don't think I've seen more than a half dozen FPs in any *month*, much > less a day. > > GF> Generally, I'm using SA in loc

RE: [SAtalk] SA finally working... but now it needs to learn, how?

2003-08-14 Thread Gary Funck
Angel, what did you do to fix the problem you were having with procmail? > From: Angel Gabriel > Sent: Monday, August 11, 2003 12:43 PM > > Now that my mail is actually getting filtered [...] --- This SF.Net email sponsored by: Free pre-bu
