[SAtalk] [RD] Justified text

2004-01-28 Thread Regis Wilson
Got some new variants on the "justified text" ratware. By going to 66 chars, they have slipped through the rule. So I've fixed it up a bit. Please test and let me know, etc.
    full JUSTIFIED_TEXT /(\n.{60,75}=){4}/
    describe JUSTIFIED_TEXT Body uses 60-75 char wide, justified text
    score JUSTIF
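
To see why widening the rule catches the 66-character variant, the two patterns can be compared side by side. This is only a sketch in Python rather than SpamAssassin itself; the helper `justified_body` and the sample bodies are made up for illustration, and Python's `re` is assumed to behave like Perl for these simple patterns.

```python
import re

# Patterns copied from the fixed-width (74) and widened (60-75) versions of the rule.
OLD_RULE = re.compile(r"(\n.{74}=){4}")
NEW_RULE = re.compile(r"(\n.{60,75}=){4}")


def justified_body(width, lines=4):
    """Build a fake body (hypothetical helper): `lines` consecutive lines
    of `width` characters, each ending in '='."""
    return "\n" + "".join("x" * width + "=\n" for _ in range(lines))


if __name__ == "__main__":
    for width in (66, 74):
        body = justified_body(width)
        print(width, bool(OLD_RULE.search(body)), bool(NEW_RULE.search(body)))
```

Because `.` does not match a newline, each repetition has to be one whole line of the stated width ending in `=`, so four such lines in a row are needed before the pattern fires.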

Re: [SAtalk] [RD] spammers write rules for us

2004-01-28 Thread Regis Wilson
>>MESSAGEID_RATWARE -- 13479s/973h of 91714 corpus (74113s/17601h) 01/22/04 >> >>Hits almost a thousand ham here. 93% of the hits are spam, which is very >>promising. Tighten up the rule a little bit, and we'll probably have a >>winner. >> >header MESSAGEID_RATWARE \ > Message-ID: =~ /<[A-Z0

Re: [SAtalk] Recieved From database

2004-01-28 Thread Regis Wilson
>A friend of mine also has suggested the following (the coding is my own, >so if it doesn't work, I've poorly implemented the suggestion): > > header SYL_BAD_XOIP X-Originating-IP !~ /\[?(\d{1,3}\.){3}\d{1,3}\]?/ > As was noted before, a negative test will false-positive when there is no X-O-IP
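
The false-positive problem with a negative test can be shown with a small model. This is a sketch in Python, not SpamAssassin code: it assumes a missing header is treated as an empty string, and the function names `bad_xoip_naive` and `bad_xoip_guarded` are invented for the example.

```python
import re

# Same pattern as the quoted SYL_BAD_XOIP rule.
IP_RE = re.compile(r"\[?(\d{1,3}\.){3}\d{1,3}\]?")


def bad_xoip_naive(headers):
    """Negative test only: fires whenever the value does not look like an IP.
    A missing header is modelled as an empty string (assumption), so it fires too."""
    value = headers.get("X-Originating-IP", "")
    return IP_RE.search(value) is None


def bad_xoip_guarded(headers):
    """Negative test combined with an existence check on the header."""
    if "X-Originating-IP" not in headers:
        return False
    return IP_RE.search(headers["X-Originating-IP"]) is None


if __name__ == "__main__":
    print(bad_xoip_naive({}))    # fires on mail with no such header
    print(bad_xoip_guarded({}))  # stays quiet
```

The guarded version only fires when the header is present and malformed, which is the behaviour the negative test was meant to have.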

Re: [SAtalk] [RD] spammers write rules for us

2004-01-26 Thread Regis Wilson
>From [EMAIL PROTECTED] Thu Jan 22 20:18:10 2004 Date: Thu, 22 Jan 2004 20:18:08 -0800 From: Robert Menschel <[EMAIL PROTECTED]> To: Regis Wilson <[EMAIL PROTECTED]> CC: [EMAIL PROTECTED] Subject: Re: [SAtalk] [RD] spammers write rules for us >Hello Regis, >RW> Got a

[SAtalk] [RD] spammers write rules for us

2004-01-22 Thread Regis Wilson
Got a spam that's so easy, the spammers write the rules for us: Message-ID: <[EMAIL PROTECTED]> So,
    header MESSAGEID_RATWARE ALL =~ /\nMessage-ID:.<[^-]{7,13}-[^-]{3,11}-[^-]{2,6}/i
    describe MESSAGEID_RATWARE Message-ID has ratware pattern
    score MESSAGEID_RATWARE 0.5
I've s

[SAtalk] 7245 Habeas and counting

2004-01-19 Thread Regis Wilson
I ran a report looking for HABEAS_SWE matches and got 7245 to date. Am I really going to report all of them? No, not on your life. I've sent in two; I've done my duty. The Habeas mark is actually a scam on everyone: Habeas customers don't really have any protection for their mail; spammers don

[SAtalk] Habeas SWE violation

2004-01-12 Thread Regis Wilson
I received the following spam that does not comply with the Habeas agreements: >From [EMAIL PROTECTED] Mon Jan 12 03:38:18 2004 Received: (from [EMAIL PROTECTED]) by replaced for replaced Date: Mon, 12 Jan 2004 03:38:18 -0800 (PST) Received: from ([142.177.249.186]) by replaced a

Re: [SAtalk] Subject contains username

2004-01-11 Thread Regis Wilson
>From [EMAIL PROTECTED] Wed Dec 31 18:12:45 2003 Date: Wed, 31 Dec 2003 21:12:44 -0500 From: Theo Van Dinter <[EMAIL PROTECTED]> To: Regis Wilson <[EMAIL PROTECTED]> Cc: [EMAIL PROTECTED] Subject: Re: [SAtalk] Subject contains username >On Wed, Dec 31, 2003 at 08:03:19AM -

[SAtalk] Subject contains username

2003-12-31 Thread Regis Wilson
I get a lot of spam with the username (even the whole email address) in the subject line. To wit: To: [EMAIL PROTECTED] Subject: user find your auto To: [EMAIL PROTECTED] Subject: At last, secrets of the rich finally revealed user To: [EMAIL PROTECTED] Subject: [EMAIL PROTECTED], grow your manh

[SAtalk] RD: X-Originating-IP

2003-12-28 Thread Regis Wilson
Recently, some ratware has been using the X-Originating-IP: header as one of its idiot tags, in the style of [somdhost.comIP] or similar. Which got me to writing the following:
    header __HAS_XOIP exists:X-Originating-IP
    describe __HAS_XOIP Header contains X-Originating-IP

Re: [SAtalk] new spamming techniques are flooding me. Any suggestions?

2003-12-19 Thread Regis Wilson
I got several of these from different IPs recently. There were several key identifiers that gave it away to me: >Subject: Re: IXCH, the sky over > For example. One slipped through that was pretty dumb but very easy to figure out: >Subject: Re: %RND_UC_CHAR[2-8] > See, the spammers are so dumb

RE: [SAtalk] RD: "justified" HTML

2003-12-16 Thread Regis Wilson
Based on the excellent feedback, the latest generation of the rule looks like:
    full JUSTIFIED_TEXT /(\n.{74}=){4}/
    describe JUSTIFIED_TEXT Body uses 74 char wide, justified text
    score JUSTIFIED_TEXT 2.0
There are a few false positives, so your site may reduce the score as needed. Of course, wi

[SAtalk] RD: "justified" HTML

2003-12-12 Thread Regis Wilson
One type of spam I receive is a weirdly formatted HTML that is exactly 74 characters wide with an "=" at the end of each line. This may be in a normal specification. You can set your own scores. For me, the score has been wildly successful, 3000 spam in a week, no ham. full JUSTIFIED_HTML

RE: [SAtalk] [RD] Help with Subject rule

2003-12-09 Thread Regis Wilson
>What I'm looking for are subject headers as shown below: > >Subject: =?us-ascii?B?MCBNZW4sIGl0IHJlYWxseSB3b3JrcyEgZnA=?= iwsgfb > I've posted some examples. You need to do it thus: header T_SBJT_ENC ALL =~ /\nSubject:.[etc.]/i It works for me, but not always (I don't use + or *, I always limit

Re: [SAtalk] 'Windows-1251' in subject line.

2003-11-24 Thread Regis Wilson
>Okay, not to pick on Miroslav, but, here is a case where a legitimate >English language e-mail has 'Windows-1251' embedded in the subject line. > >> =?windows-1251?Q?Re:=A0[SAtalk]=A0[OT]=A0Switching=A0OS=A0for=A0Gateway?= > Here is the rule I came up with. You will want to put your own score

[SAtalk] RD: make your own RBL

2003-11-24 Thread Regis Wilson
This may or may not be of use to people. I don't have access to the RBLs due to firewall configuration. The firewall doesn't give any lookups in the received line, so I have to extract the IPs and then count them up. This script will spit out a set of rules for the "largest" offenders. Some note

RE: [SAtalk] [RD] second weeds set

2003-11-20 Thread Regis Wilson
>So, while I do not like the inefficiency of the iterations, the >effectiveness on multiple levels is excellent. The only FPs I am seeing > I think you can see why we desperately need to try to get an accumulator rule for spamassassin. Multiple rule hits add up, we need it!

[SAtalk] Spam nets

2003-11-19 Thread Regis Wilson
Just thought I'd share. I don't have access to the RBLs, so I did a grep on two days' worth of spam IPs (Received: line from mailserver). I have this:
no. spam   network
7849       24.0.0.0
6819       68.0.0.0
4723       66.0.0.0
4200       200.0.0.0
3139
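
A tally like the one above could be produced along these lines. This is a rough Python sketch, not the grep pipeline from the post; it assumes the IPs have already been pulled out of the Received: lines into a list of dotted-quad strings, and `tally_networks` is a made-up name.

```python
from collections import Counter


def tally_networks(ips):
    """Group dotted-quad IP strings by first octet and count them,
    returning (network, count) pairs with the largest offenders first."""
    counts = Counter(ip.split(".")[0] + ".0.0.0" for ip in ips)
    return counts.most_common()


if __name__ == "__main__":
    sample = ["24.1.2.3", "24.9.9.9", "68.4.4.4"]  # illustrative addresses only
    for network, count in tally_networks(sample):
        print(count, network)
```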

RE: [SAtalk] [RD] simple rule for consumption

2003-11-12 Thread Regis Wilson
>I had several false positives today based on the BAD_X_HEADERS rule. I'm >using the rules from Chris' site (Nov02). The legitimate emails had an >"X-URL" header. All of the FPs were from a single mailing list. For what >ever reason, they are providing a valid link to some content within this >

[SAtalk] Combining spamc options

2003-11-12 Thread Regis Wilson
I would like to combine the two options "-c" and "-y" for spamc. I would like to retain the functionality of printing the hits/required score and setting the exit code while also printing out the rules hit. I want to log the hits to the rules for accounting purposes, but spamc does not appear to

[SAtalk] Spamproxyd.pl queue

2003-11-05 Thread Regis Wilson
I am contemplating using the spamproxyd.pl code for spam filtering. It seems to function, but I am trying to set up spam filtering on a dedicated machine. The problem is that the mail server may go down or refuse connections and I want the spamproxyd.pl code to either queue or retry during these dow

RE: [SAtalk] [RD] simple rule for consumption

2003-10-23 Thread Regis Wilson
>Nope these are bogus. I have separate rules for them in the last Rule >Emporium update. I used separate, as they often are seen in pairs. Although >I didn't tag X-Email, because I'm not sure about that one. > X-Email: is pretty spammy for me, so it is in there. I grepped my corpus for X-headers

[SAtalk] [RD] simple rule for consumption

2003-10-22 Thread Regis Wilson
Recently had some false negatives come through. Most of them are one sentence saying hello, and a URL. I noticed some strange headers, listed here: X-E: X-I: X-ENVID X-Email So I wrote a quickie rule and it catches about 126 spam per week. header BAD_HEADERS ALL =~ /X-(?:E|Email|E

[SAtalk] Whitelisting problems

2003-10-22 Thread Regis Wilson
The problem is allowing end users to manage their whitelists without intervention and without using vi :) One solution was to export home directories via Samba; this was shot down due to problems with cr/lf, typographical errors that cause spamassassin to choke, etc. Some steps in the right direct

[SAtalk] Default negative scores

2003-10-06 Thread Regis Wilson
I got another spam through last week, and it's funny because one spam now annoys me whereas I used to get 10-12 per day. :) Anyway, the spam claimed to be from paypal and I got several copies at several accounts, so I assume it was widespread and everybody knows about it. I reported it to paypal

[SAtalk] Spam and bounces

2003-09-19 Thread Regis Wilson
I have an interesting one: Is it possible to not bounce mail that is marked as spam? Unfortunately, I believe spamassassin is usually called by procmail and thus this would not be possible. Maybe as a milter? But I have read bad things about the milters. Thanks for any helpful replies.

RE: [SAtalk] Rule frequencies

2003-09-12 Thread Regis Wilson
r! I want to make sure I'm >reading this correct. I'll have to change '/ham/or/spam/file' to my traps. >So I have to run this once for each, correct? > >I don't call this with any parameters, right? The $2 threw me for a loop >here. I'm pretty sure that is th

Re: [SAtalk] Site-wide spam traps

2003-09-11 Thread Regis Wilson
Matt Tencati writes: > We receive lots of SPAM that is addressed to several users at our site some of which > are > old and basically spam traps. I'd like for that message to be tagged as spam for all > users not just for the spamtrap delivered email. > > Does this make sense and if so is anyone

[SAtalk] Rule frequencies

2003-09-11 Thread Regis Wilson
I wrote a quick and dirty hack script to count the number of rules that match in a Spam or Ham file. There are probably better reporting tools. There are probably better ways to do this, as well. The output is comma-delimited, you can import into Excel or similar for sorting, further processing.
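
The original script is not shown here, so the following is only a guess at the general idea, written in Python. It assumes each message in the file carries an unfolded `X-Spam-Status:` header with a `tests=RULE_A,RULE_B` field; `rule_frequencies` and `to_csv` are hypothetical names.

```python
import csv
import io
import re
from collections import Counter

# Assumed header shape: "X-Spam-Status: Yes, hits=7.2 required=5.0 tests=A,B version=..."
TESTS_RE = re.compile(r"tests=([A-Z0-9_,]+)")


def rule_frequencies(mbox_text):
    """Count how often each rule name appears in X-Spam-Status headers."""
    counts = Counter()
    for line in mbox_text.splitlines():
        if line.startswith("X-Spam-Status:"):
            match = TESTS_RE.search(line)
            if match:
                counts.update(name for name in match.group(1).split(",") if name)
    return counts


def to_csv(counts):
    """Render the counts as comma-delimited text, most frequent rule first."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for rule, hits in counts.most_common():
        writer.writerow([rule, hits])
    return out.getvalue()
```

Running it once on a spam file and once on a ham file gives two lists that can be imported into a spreadsheet and compared rule by rule. Headers folded across several lines would need to be unfolded first; this sketch does not handle them.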

[SAtalk] Multi-match?

2003-09-03 Thread Regis Wilson
I have a rule for HTML comments much like:
    rawbody HTML_COMMENTS //
    score HTML_COMMENTS 0.05
But I want them to add up. Five matches is .25, ten matches is .5, etc. Is there a special way to make each match count separately?
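
The arithmetic being asked for, one score increment per match, looks like this outside of SpamAssassin. It is a Python sketch of the desired behaviour only; the comment pattern below is an assumption (the pattern in the rule above did not survive), and `accumulated_score` is a made-up name.

```python
import re

# Assumed pattern for an HTML comment; the pattern from the post was lost.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def accumulated_score(raw_body, per_match=0.05):
    """Return per_match multiplied by the number of HTML comments found."""
    hits = len(HTML_COMMENT.findall(raw_body))
    return round(hits * per_match, 2)


if __name__ == "__main__":
    print(accumulated_score("buy<!--a-->now" * 5))
    print(accumulated_score("buy<!--a-->now" * 10))
```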

Re: [SAtalk] Mail arrival time may be a criteria

2003-08-01 Thread C. Regis Wilson
>Graphing spam and ham amounts, with the help of spamstats >(http://www.gryzor.com/tools), I notice the legitimate >emails mostly arrive from about 9am, until maybe 7pm, on >working days. > I have written about this and others too. The correct procedure is to submit some proposed code to the dev

Re: [SAtalk] those pesky small v*agra ads

2003-08-01 Thread C. Regis Wilson
>Here's the HTML: > >font color="#ff" > This one seems easy to me, but I'm not a programmer. :) Just count the number of "font color" tags. How about a test that is even more generic by counting repeating numbers of HTML tags? EXCESSIVE_REPEATED_HTML something like that.

[SAtalk] G-a-p-p-y text

2003-07-31 Thread C. Regis Wilson
I have been writing these rules to combine various versions of words into one rule. For example: I start with
    /prescription/
and do some minor subs to make
    /pr[e3][s5]cr[i1]pt[i1][o0]n/i
Then, add gappies
    /p.?r.?[e3].?[s5].?c.?r.?[i1].?p.?t.?[i1].?[o0].?n/i
(Side question, what's a good gap
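
The two steps described above (look-alike substitutions, then an optional gap between letters) are mechanical enough to generate. A minimal Python sketch, assuming only the four substitutions used in the example; `gappy_pattern` and `LOOKALIKES` are names invented here.

```python
import re

# Look-alike substitutions taken from the example in the post.
LOOKALIKES = {"e": "[e3]", "s": "[s5]", "i": "[i1]", "o": "[o0]"}


def gappy_pattern(word, gap=".?"):
    """Turn a plain word into a pattern that tolerates digit look-alikes
    and at most one arbitrary character between letters."""
    parts = [LOOKALIKES.get(ch, re.escape(ch)) for ch in word.lower()]
    return gap.join(parts)


if __name__ == "__main__":
    pattern = gappy_pattern("prescription")
    print(pattern)
    print(bool(re.search(pattern, "P-R-3-S-C-R-1-P-T-1-0-N", re.IGNORECASE)))
```

The `gap` parameter is left adjustable because the choice of gap expression is exactly the open question in the post; `.?` is simply the one used in the example.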

Re: [SAtalk] Two-letter domains for SA RULES

2003-07-31 Thread Regis Wilson
I have a quick question or update. I read in a perl guide that \> actually matches the end of a "word". I guess similar to \b, or something? I want to match the literal character ">", as in a rule that matches "<[EMAIL PROTECTED]>". So if the rule ends something like: >From =~ /.\>/i It s

Re: [SAtalk] Two-letter domains for SA RULES

2003-07-29 Thread Regis Wilson
>.cc is not a foreign domain, it's a commercial domain. And, just like >.com, it's being used for various purposes (ex: my home domain is >changing from domain.org to rudd.cc). > I don't care. :)

[SAtalk] Two-letter domains for SA RULES

2003-07-28 Thread Regis Wilson
Here's a rule I wrote to score two-letter domains. I am not banning mail from foreign sites; I am only listing the ones that send us spam, and we do not ban the mail, merely tag it with a score. It may be a steep 3.0 out of 5.0, though. :) Also, note that I am not an "ugly American" because .us

[SAtalk] Time of day rule?

2003-07-23 Thread C. Regis Wilson
I would also like to see if this is a useful concept: a time of day rule that can add a score to mail sent in "the dead of night". This is probably not universally desired, and maybe should only be a localized setting. That's fine. What I'd like (for example): TIMEZONE = Eastern TIMEOFDAY_1900
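
The idea can be prototyped outside SpamAssassin to see whether it separates ham from spam on a given corpus. The following Python sketch is purely illustrative: the function name, the 19:00 to 07:00 window, the 0.5 score, and the fixed UTC-5 offset standing in for "Eastern" (no daylight saving handling) are all assumptions, not part of the proposal.

```python
from datetime import timedelta, timezone
from email.utils import parsedate_to_datetime

# Fixed offset standing in for Eastern time; ignores daylight saving (assumption).
LOCAL_TZ = timezone(timedelta(hours=-5))


def night_score(date_header, start_hour=19, end_hour=7, score=0.5):
    """Return `score` if the Date: header falls between start_hour and
    end_hour local time (window wraps past midnight), else 0.0."""
    sent = parsedate_to_datetime(date_header).astimezone(LOCAL_TZ)
    in_window = sent.hour >= start_hour or sent.hour < end_hour
    return score if in_window else 0.0


if __name__ == "__main__":
    print(night_score("Thu, 22 Jan 2004 20:18:08 -0800"))
    print(night_score("Mon, 12 Jan 2004 12:00:00 -0500"))
```

Note that this trusts the sender's Date: header, which spam often forges; the time from the local Received: line may be a better input.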

Re: [SAtalk] Blacklist-to?

2003-07-23 Thread C. Regis Wilson
>If you read the bug entry, there's a workaround in there to use a custom >rule to have the same basic functionality. > >http://bugzilla.spamassassin.org/show_bug.cgi?id=883 > This worked perfectly. Thank you very much. I did two things: Renamed it for me to LOCAL_BAD_TO_ADDRESS and also adde

[SAtalk] Blacklist-to?

2003-07-22 Thread C. Regis Wilson
There is a function for "whitelist-to" which allows mail to the person in the "to" field (not exactly, but you get my meaning). What about a blacklist-to? We have usernames that consistently show up in the to: or cc: for spam, and we know for sure any mail addressed to those users is spam (the use