Re: [SAtalk] Phoneme and Grammar anti-noise scanning ?

2004-01-21 Thread Matthew Cline
On Wednesday 21 January 2004 12:09 am, John August wrote: > This just an idea, in the tradition of 'I've got a good idea and hope > someone else will carry it through'. I don't expect it, but thought I'd > throw it in :) > > I've noticed a lot of spam which tries to dilute scanners by including a >

Re: [SAtalk] [RD] spammer reactions to antidrug (humorous)

2004-01-30 Thread Matthew Cline
On Friday 30 January 2004 05:52 pm, Kelson Vibber wrote: > At 09:04 AM 1/30/2004, Brian Godette wrote: > >Maybe they'll start writting in Middle English to target that untapped > > market of english lit majors/grads. > > I can just see it: > > "Whan thou wouldst gaine the favour of a lass > Thy Pr

[SAtalk] How to test modifications to SA?

2002-02-27 Thread Matthew Cline
I have some ideas for additions to the filtering rules, but I'd like to test their usefulness before suggesting them to the list. Ideally, I'd get the mass-checking and GA programs, run them against the archive of spam and non-spam messages that the official scores file is generated from, and s

[SAtalk] Filter ideas

2002-02-28 Thread Matthew Cline
I just started using SpamAssassin, and it's real cool. Looking through the rules, there are some rules I've found useful in my home-grown spam filter that might also be useful in SA. - In rule LIMITED_TIME_ONLY, you might want to also search for "Limited Time Offer". - Lots of phrases I've

Re: [SAtalk] spam repositories?

2002-02-28 Thread Matthew Cline
On Thursday 28 February 2002 12:14 pm, Bill Becker wrote: > I see that a lot of people on the list have carefully enshrined their spam > into repositories, and this turns out to be a great thing for when you > want to test rules. > Are there any collections out there available for download? Note

[SAtalk] Kmail and Qmail

2002-02-28 Thread Matthew Cline
I tried setting up SpamAssassin with Kmail as the README says, and it worked just fine. However, with the KDE 2.2.2 version of Kmail, Kmail becomes unresponsive while piping through SpamAssassin, which can take a noticeable amount of time if you're checking with multiple BlackHole lists, and i

[SAtalk] SpamAssasin body results not added, but headers are

2002-02-28 Thread Matthew Cline
The following email was marked as spam, and had the X-Spam-Flag and X-Spam-Status headers set properly, but the result lines starting with "SPAM:" weren't added to the body. Received: (qmail 5437 invoked from network); 1 Mar 2002 00:41:22 - Received: from unknown (HELO sarah.freejokes4u.com)

[SAtalk] Nigerian scam filter improvements

2002-02-28 Thread Matthew Cline
I got 179 Nigerian scam message bodies (though not headers) from http://www.quatloos.com/cm-niger/cm-niger.htm, and used them to test out how SA handles them. Testing with the default 2.2 setup (taken from CVS today), 72 out of 179 are correctly tagged as spam. Overriding some of the weird ne

[SAtalk] PATCH: lib/Mail/SpamAssassin/EvalTests.pm

2002-03-01 Thread Matthew Cline
In check_for_faraway_charset(), get_decoded_stripped_body_text_array() returns a reference to an array, but are_more_high_bits_set() was being called as if it were given a normal string, so are_more_high_bits_set() was never returning true. I added a line to join() the lines of the array before hand

[SAtalk] Filter idea: non-spammish mail agents

2002-03-01 Thread Matthew Cline
Here's something that I'm currently trying out: make a list of mail user agents which are unlikely to be used for spam, and lower the hits for mail that matches. Here's what I currently have: header NON_SPAM_MAILER_1 X-Mailer =~ /(?:CWMail|Edsamail|eGroups|KMail|MailCity|MIME-tools|Moz

Re: [SAtalk] Filter idea: non-spammish mail agents

2002-03-01 Thread Matthew Cline
On Friday 01 March 2002 01:42 am, Nigel Metheringham wrote: > On Fri, 2002-03-01 at 08:40, Matthew Cline wrote: > > Here's something that I'm currently trying out: make a list of mail user > > agents which are unlikely to be used for spam, and lower the hits for >

[SAtalk] Idea: ignore self for auto-whitelist and identifcal to/from

2002-03-01 Thread Matthew Cline
I'm thinking of making patches to SA so that the auto-whitelist and identical to-from rules ignore messages that are from the user. This is because both rules interfere when I send myself messages to test SA, and there are some spammers who forge spam as coming from the user him/herself, so it

Re: [SAtalk] Nigerian scam filter improvements

2002-03-01 Thread Matthew Cline
On Friday 01 March 2002 05:40 am, Greg Ward wrote: > Wow, good work! One question: I have seen several "Nigerian" scams that > are actually about Zimbabwe or Sierra Leone or some other African > country. (They sound like the same scam, though.) Does this collection > include any of those? It

[SAtalk] How to use spamassassin-sightings with KMail?

2002-03-01 Thread Matthew Cline
I tried to send some false-negatives to [EMAIL PROTECTED] with KMail's "Redirect" function, but it was rejected with the following diagnostics: - Your message to mail.sourceforge.net was rejected. I said:     . And mail.sourceforge.net responded with     550 rejected: there is no valid sender

[SAtalk] False FORGED_YAHOO_RCVD

2002-03-01 Thread Matthew Cline
These headers gave me a false FORGED_YAHOO_RCVD: Received: (qmail 3704 invoked from network); 2 Mar 2002 03:34:28 - Received: from n22.groups.yahoo.com (216.115.96.72) by nightrealms.com with SMTP; 2 Mar 2002 03:34:28 - X-eGroups-Return: [EMAIL PROTECTED] Received: from [216.115.97.190

Re: [SAtalk] Setting up Auto Razor Reporting?

2002-03-01 Thread Matthew Cline
When I try using razor-report from the command line, I get this from the diagnostics: debug: Razor Agents 1.20, protocol version 2. debug: Read server list from /usr/home/matt/.razor-report.lst debug: 78733 seconds before closest server discovery debug: Closest server is 64.90.187.2 debug: Agent

[SAtalk] Does "spamassasin -r" strip spam reporting?

2002-03-01 Thread Matthew Cline
Does "spamassassin -r" strip all the spam reporting stuff from the message before it's sent to Razor? Or is it merely a wrapper around "razor-report"? I'd like to use it to report spam that doesn't go above the auto-report threshold, but I don't like having to manually strip out the spam repo

[SAtalk] Misc header filters

2002-03-01 Thread Matthew Cline
header X_ADVERT X-Advertisement =~ /./ describe X_ADVERT X-Advertisement header exists header X_AUTH_WARNING X-Authentication-Warning =~ /./ describe X_AUTH_WARNING X-Authentication-Warning header exists header DATE_WARNING Date-warning

Re: [SAtalk] False FORGED_YAHOO_RCVD

2002-03-02 Thread Matthew Cline
OK, I figured it out. It happens when someone uses Yahoo! Mail to send something to Yahoo! Groups. The "received" for Yahoo! Groups looks like: > Received: from [216.115.97.190] by n22.groups.yahoo.com with NNFMP; 02 Mar > 2002 03:35:46 - Here's a patch that should recognize this. Index:

[SAtalk] Email list rules

2002-03-02 Thread Matthew Cline
Here's a stab at some rules that attempt to detect messages from mailing lists. "List-Unsubscribe", "X-Original-Date" and "Errors-To" might all be from the same mailing list software, in which case they'd be redundant. # Only look for 7 bit chars between square brackets, becau

[SAtalk] Unitialized value in subsitution?

2002-03-02 Thread Matthew Cline
I've been getting these warning messages lately from qmail: Use of uninitialized value in substitution (s///) at /usr/home/matt/develop/spamassassin/lib/Mail/SpamAssassin/PerMsgStatus.pm line 826./Use of uninitialized value in substitution (s///) at /usr/home/matt/develop/spamassassin/lib/Mail/Sp

Re: [SAtalk] Unitialized value in subsitution?

2002-03-02 Thread Matthew Cline
It's apparently because of this rule: header PLEASE_READ /please read/i describe PLEASE_READ Please read this! Please oh please of please! I didn't put in "XYZ =~". Hmmm, maybe there should be better warning messages? Or maybe a lint for the rules files? -- Visit http://dmoz.org, the
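
The "lint for the rules files" idea above could look something like the following. This is only a rough sketch in Python rather than SA's Perl, and it is not how SpamAssassin itself validates rules: the function name `lint_header_rules`, the regular expressions, and the decision to skip `eval:` and `exists:` rules are all assumptions made for illustration.

```python
import re

# Hypothetical pattern for a "header" rule line: keyword, rule name, the rest.
HEADER_RULE = re.compile(r"^header\s+(\w+)\s+(.*)$")
# The rest is expected to look like "Header-Name =~ /regexp/" (or "!~").
HEADER_TEST = re.compile(r"\S+\s*[=!]~\s*/")


def lint_header_rules(lines):
    """Return (line_number, rule_name) for header rules with no 'Name =~' part."""
    problems = []
    for number, line in enumerate(lines, 1):
        match = HEADER_RULE.match(line.strip())
        if not match:
            continue  # not a header rule; this sketch checks nothing else
        name, rest = match.groups()
        if rest.startswith(("eval:", "exists:")):
            continue  # assumed to be valid non-regexp forms
        if not HEADER_TEST.match(rest):
            problems.append((number, name))
    return problems


rules = [
    "header PLEASE_READ /please read/i",
    "describe PLEASE_READ Please read this!",
    "header X_ADVERT X-Advertisement =~ /./",
]
report = lint_header_rules(rules)
```

Run against the three sample lines, only the PLEASE_READ rule from the message above is reported, since it lacks the header name and `=~`.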

Re: [SAtalk] Does "spamassasin -r" strip spam reporting?

2002-03-02 Thread Matthew Cline
On Saturday 02 March 2002 08:38 am, dman wrote: > macro index "!echo reporting to SA\rspamassassin -d | > spamassassin -r ; echo \"report done!\"\r" "spamassassin -d". Right. I should have RTFM'd. -- Visit http://dmoz.org, the world's | Give a man a match, and he'll be warm largest human e

[SAtalk] 0.0 scored rules

2002-03-02 Thread Matthew Cline
Looking in the scores files, I find these rules with score 0.0: score FREQ_SPAM_PHRASE 0.0 score FROM_FORGED_HOTMAIL 0.0 score HUNZA_DIET_BREAD 0.0 score SEXY_PICS 0.0 score SPAM_PHRASES_020 0.0 score SPAM_PHRASES_030

[SAtalk] Attachment checking eval tests

2002-03-02 Thread Matthew Cline
I've gotten lots of spam that's only an attachment. To detect this, I've written two rawbody eval subroutines. One checks if the first part of a multi-part mail has any non-blank lines, and if it has none, it returns true; this is supposed to detect messages that are solely attachments with no

[SAtalk] MIME null block report fix

2002-03-03 Thread Matthew Cline
Here's a CVS patch which fixes the problem of the spam report being added before the first MIME part. Index: lib/Mail/SpamAssassin/PerMsgStatus.pm === RCS file: /cvsroot/spamassassin/spamassassin/lib/Mail/SpamAssassin/PerMsgStatus

Re: [SAtalk] Attachment checking eval tests

2002-03-03 Thread Matthew Cline
On Saturday 02 March 2002 09:37 pm, I wrote: > rawbody ONLY_ATTACHMENTS eval:check_for_only_attachments() > describe ONLY_ATTACHMNETS Only attachmnets, no text Ooops, spelling mistake in "describe". Should be describe ONLY_ATTACHMENTS Only attachmnets, no text > sub check_

Re: [SAtalk] Attachment checking eval tests

2002-03-03 Thread Matthew Cline
On Sunday 03 March 2002 07:54 am, Rob McMillin wrote: > You made the same spelling error twice in the original. Gah! I really must use my spellchecker more often. -- Visit http://dmoz.org, the world's | Give a man a match, and he'll be warm largest human edited web directory. | for a minut

Re: [SAtalk] 2.11 released

2002-03-03 Thread Matthew Cline
On Sunday 03 March 2002 05:58 pm, Craig R Hughes wrote: > I just pushed out the new scores (and a bugfix or two) as 2.11 > > The new scores are done by constraining the GA more, using Michael Moncur's > submitted scores as a starting point, and then hand-tweaking the output > where basically any -

[SAtalk] check_for_spam_reply_to() questions

2002-03-03 Thread Matthew Cline
check_for_spam_reply_to() uses get_address_commonality_ratio(), which checks to see how many characters the two addresses have in common. Why not compare the domains of the hosts for equality? Take the last three parts of the hostname for two letter TLDs ("foobar.co.uk") and the last two part
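
The domain comparison proposed above can be sketched as follows. This is an illustrative Python version, not SA code: the helper names `base_domain` and `same_domain` are hypothetical, and the "last two parts for everything else" branch is an assumption about where the truncated sentence was heading.

```python
def base_domain(address: str) -> str:
    """Rough organisational domain of an email address or hostname."""
    host = address.rsplit("@", 1)[-1].lower().rstrip(".")
    parts = host.split(".")
    # Two-letter TLDs are treated as country codes ("foobar.co.uk"),
    # so keep three labels; otherwise keep two ("example.com").
    keep = 3 if len(parts[-1]) == 2 else 2
    return ".".join(parts[-keep:])


def same_domain(first: str, second: str) -> bool:
    """True when both addresses share the same rough base domain."""
    return base_domain(first) == base_domain(second)


uk = base_domain("sales@mail.foobar.co.uk")
com = base_domain("bob@smtp.example.com")
related = same_domain("a@lists.example.com", "b@mail.example.com")
```

The heuristic is deliberately crude: a country-code domain registered directly under the TLD (say `www.example.de`) keeps one label too many, so two hosts in the same organisation could still compare as different.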

[SAtalk] Improvements to LINE_OF_YELLING

2002-03-04 Thread Matthew Cline
Here's my attempt at improving the LINE_OF_YELLING rule. First I changed it from a rawbody rule to a body rule. I'm not sure why it was a rawbody rule in the first place, since that would have HTML markup, non-decoded text, and such. Then I changed it from a regular expression to an eval test

Re: [SAtalk] A better alternative to test ROUND_THE_WORLD

2002-03-04 Thread Matthew Cline
I don't know if anyone's suggested this yet, but an "optional" sub-dir could be added to the rules directory, in which something like "20_US_centric.cf" could be placed; SUBJ_FULL_OF_8BITS, ROUND_THE_WORLD and so on could be put in it. Put a prominent note of the optional directory in the README

Re: [SAtalk] Determining version

2002-03-04 Thread Matthew Cline
On Monday 04 March 2002 04:06 pm, Mike Loiterman wrote: > How do you determine what version is running? I ran it with -D but I > saw no mention of the version number. Run with the "-h" option for help, and the version number will be in the last line of the output. -- Visit http://dmoz.org, t

[SAtalk] Combined subject and body tests?

2002-03-05 Thread Matthew Cline
There are some body tests that would also work for the subject, like the CASHCASHCASH test, and I've seen some spam where the tests didn't match the body but would have matched the subject. Would it be worth it to make a body_subject test which would add the subject to the body before running the

[SAtalk] Another English-centric rule

2002-03-05 Thread Matthew Cline
For those of you who find that English-centricity helps to filter spam, here's a rule that looks for non-ASCII encoding in the subject line: header NON_ASCII_ENC_SUBJ Subject =~ /=\?(?:euc-kr|big5|iso-8859-1)\?/ describe NON_ASCII_ENC_SUBJ Non-ASCII encoded subject It just does EUC Ko

Re: [SAtalk] Another English-centric rule

2002-03-05 Thread Matthew Cline
On Tuesday 05 March 2002 02:08 am, I wrote: > header NON_ASCII_ENC_SUBJ Subject =~ > /=\?(?:euc-kr|big5|iso-8859-1)\?/ Actually: header NON_ASCII_ENC_SUBJ Subject =~ /=\?(?:euc-kr|big5|iso-8859-1)\?/i as it needs to be case insensitive. -- Visit http://dmoz.org, the world's | G
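
To see why the `/i` flag matters here, the same pattern can be tried with and without case-insensitive matching. A small Python sketch (Python's `re.IGNORECASE` standing in for Perl's `/i`); the sample subject and its encoded payload are made up for illustration.

```python
import re

PATTERN = r"=\?(?:euc-kr|big5|iso-8859-1)\?"

# Hypothetical encoded-word subject with the charset written in upper case.
subject = "=?EUC-KR?B?AAAA?="

case_sensitive = re.search(PATTERN, subject)
case_insensitive = re.search(PATTERN, subject, re.IGNORECASE)
```

Here `case_sensitive` is `None` while `case_insensitive` is a match object, which is the difference the corrected rule is after.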

Re: [SAtalk] Another English-centric rule

2002-03-05 Thread Matthew Cline
On Tuesday 05 March 2002 07:23 am, Rob McMillin wrote: > Already exists -- see CHARSET_FARAWAY_HEADERS. Ooops, didn't realize it did that. Guess I should read all of EvalTests.pm before submitting new rules... -- Visit http://dmoz.org, the world's | Give a man a match, and he'll be warm lar

[SAtalk] NEW_DOMAIN_EXTENSIONS improvement

2002-03-05 Thread Matthew Cline
The current NEW_DOMAIN_EXTENSIONS didn't catch the last bit of domain registry spam I got. body NEW_DOMAIN_EXTENSIONS /new\s*domain\s*extension/i The word "Internet" was inserted between "new" and "domain", so the rule wasn't triggered. This rule catches both variants: body NEW_DOMAIN_EX

Re: [SAtalk] how to take care of false positives?

2002-03-05 Thread Matthew Cline
On Tuesday 05 March 2002 06:11 pm, Ricardo Kleemann wrote: > Hi, > > I have a message from a mailing list which spamassassin flags as spam, > however it is not. > > How can I handle that? If the mailing list puts its email address in the From header, put whitelist_from [EMAIL PROTECTED]

Re: [SAtalk] Combined subject and body tests?

2002-03-06 Thread Matthew Cline
On Wednesday 06 March 2002 01:40 am, Matt Sergeant wrote: > What I suggest is that the body stripping code adds the subject header in. If you did that, then the LINE_OF_YELLING rule will get invoked whenever the SUBJ_ALL_CAPS is invoked. I assume that there should be two lots-of-caps rules, o

Re: [SAtalk] Combined subject and body tests?

2002-03-06 Thread Matthew Cline
On Wednesday 06 March 2002 01:40 am, Matt Sergeant wrote: > What I suggest is that the body stripping code adds the subject header in. Also, if the hits threshold is passed, then SA report has "BODY:" put before the rule description, but now it would be "BODY OR SUBJECT:" ... But I guess that

Re: [SAtalk] Combined subject and body tests?

2002-03-06 Thread Matthew Cline
On Wednesday 06 March 2002 01:40 am, Matt Sergeant wrote: > What I suggest is that the body stripping code adds the subject header in. > > I'll apply this patch if there are no objections: > > diff -u -r1.79 PerMsgStatus.pm > --- lib/Mail/SpamAssassin/PerMsgStatus.pm 5 Mar 2002 17:44:51 -00

[SAtalk] Non-SAtalk messages sent to SAtalk

2002-03-06 Thread Matthew Cline
I'm seeing a bunch of messages from places like [EMAIL PROTECTED] and [EMAIL PROTECTED], which are getting sent through SAtalk (since they end with the SAtalk signature), and which are filtered through some SA I'm not using, since their subjects are being rewritten, which I've turned off. I'v

Re: [SAtalk] relays.osirusoft.com (was Re: false positives since upgrading to 2.11 (1/7))

2002-03-06 Thread Matthew Cline
On Wednesday 06 March 2002 05:01 pm, Daniel Pittman wrote: > Er, does anyone out there know that this is actually a usable source of > information? Can anyone say that it's a success story for them? 17 out of 87 spams detected hit the osirusoft rule, and I've seen no false positives because of

[SAtalk] Modification to null block patch, fixes infinite looping

2002-03-06 Thread Matthew Cline
I found that this line of my patch to fix the "MIME null block" problem was causing an infinite loop sometimes: my $boundary = "--$1"; Since it's used in a regular expression, if the boundary string has regexp meta-characters in it, things will get messed up. Specifically, if the bo
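
The general hazard described above (a MIME boundary interpolated into a regular expression without escaping) can be shown in a few lines. This is a Python sketch, not the patch itself: the boundary string is invented, and `re.escape` plays the role that `quotemeta` or `\Q...\E` plays in Perl.

```python
import re

# Hypothetical boundary containing regexp meta-characters.
boundary = "--" + "=_NextPart_(001)+b"

# Interpolated as-is, "(001)+" is read as a repeated group,
# not as the literal text "(001)+".
unescaped = re.compile("^" + boundary + "$")

# Escaped first, every character is matched literally.
escaped = re.compile("^" + re.escape(boundary) + "$")

literal_hit_unescaped = unescaped.match(boundary)
literal_hit_escaped = escaped.match(boundary)
accidental_hit = unescaped.match("--=_NextPart_001001b")
```

The unescaped pattern fails to match the real boundary line and matches an unrelated one instead, so any loop that waits to see the boundary may never terminate.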

Re: [SAtalk] FWD: SPAM: 213.50 [!CrackMonkey!] SpamAssassin this, MOFO!!!

2002-03-06 Thread Matthew Cline
My copy of SA had this to say about it: > SPAM: Content analysis details: (217.2 hits, 5 required) Jesus! -- Visit http://dmoz.org, the world's | Give a man a match, and he'll be warm largest human edited web directory. | for a minute, but set him on fire, and

Re: [SAtalk] Modification to null block patch, fixes infinite looping

2002-03-06 Thread Matthew Cline
On Wednesday 06 March 2002 05:28 pm, Matthew Cline wrote: > I found that this line of my patch to fix the "MIME null block" problem was > causing an infinite loop sometimes: > > my $boundary = "--$1"; Ugh, that wasn't the only problem. If, for some

[SAtalk] Single line of binary for text array...

2002-03-06 Thread Matthew Cline
When I run a raw body eval test on the attached email, the text array I get consists of only one line of binary; I have no clue as to why. I'm using the latest CVS version of SA. -- Visit http://dmoz.org, the world's | Give a man a match, and he'll be warm largest human edited web directory

Re: [SAtalk] More 2.11 false positives

2002-03-07 Thread Matthew Cline
On Thursday 07 March 2002 09:00 am, Matt Sergeant wrote: > > # 3) Some whitespace > > my $num_lines = scalar grep(/\s/, grep(/^[A-Z]{20,}$/, @lines)); "\s" needs to be added to the stripping regexp and the extraction regexp, or $num_lines will always be 0. That should be: Index: lib/Ma
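
The point about `$num_lines` always being 0 can be checked outside Perl: a line that matches `^[A-Z]{20,}$` contains nothing but capital letters, so a second filter for whitespace can never pass. A Python sketch with made-up sample lines; widening the class to `[A-Z\s]` is an assumption for illustration, since the actual patch is cut off above.

```python
import re

lines = [
    "BUY NOW AND SAVE BIG MONEY TODAY",   # capitals with spaces
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ",         # capitals only
    "hello there",
]

# Original shape: all-caps filter first, then "contains whitespace".
caps_only = [l for l in lines if re.search(r"^[A-Z]{20,}$", l)]
original_count = len([l for l in caps_only if re.search(r"\s", l)])

# Assumed fix: allow whitespace inside the all-caps test itself.
caps_or_space = [l for l in lines if re.search(r"^[A-Z\s]{20,}$", l)]
fixed_count = len([l for l in caps_or_space if re.search(r"\s", l)])
```

With these samples `original_count` is 0 and `fixed_count` is 1.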

Re: [SAtalk] Spammers are catching on...

2002-03-07 Thread Matthew Cline
On Thursday 07 March 2002 02:53 am, Matt Sergeant wrote: > On Thu, 7 Mar 2002, Bart Schaefer wrote: > > On Thu, 7 Mar 2002, Matt Sergeant wrote: > > > Yep, I'm seeing this stuff too (though not in huge numbers yet). I'm > > > going to examine the body rules in a bit more detail, and if it makes

[SAtalk] Some fixes to 20_uri_tests.cf

2002-03-07 Thread Matthew Cline
In HTTP_CTRL_CHARS_HOST and PORN_4, there is no "?" after "https", so it never matches "http://". I'm curious as to how many spam messages include an https URI; I've never seen any. Index: 20_uri_tests.cf === RCS file: /cvsroot/
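
The effect of the missing "?" is easy to reproduce. A minimal Python sketch; the two patterns are reduced to the scheme prefix only and are not the full HTTP_CTRL_CHARS_HOST or PORN_4 rules, and the URLs are placeholders.

```python
import re

without_question_mark = re.compile(r"^https://")
with_question_mark = re.compile(r"^https?://")   # the "s" becomes optional

plain = "http://example.com/"
secure = "https://example.com/"

missed = without_question_mark.match(plain)
caught_plain = with_question_mark.match(plain)
caught_secure = with_question_mark.match(secure)
```

Only the second pattern matches both schemes; the first silently skips every plain `http://` URI.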

[SAtalk] REMOVE_PAGE rule improvement

2002-03-07 Thread Matthew Cline
Currently the rule is: uri REMOVE_PAGE /^https?:\/\/[^\/]+\/remove/ However, the "remove" might not come at the beginning of the file portion of the URI. For instance, http://www.chippynet.com/pharmacy/remove.html The most general rule would be (also making it case insensitive):

[SAtalk] Misc. rule ideas

2002-03-07 Thread Matthew Cline
First a few rules to match non-spam: body SIGNATURE_DELIM /^-- $/ describe SIGNATURE_DELIM Standard signature delimiter present While there would be no effort in faking this, it might take a while for some of the spammers to catch on. uri HTTPS_URL /

Re: [SAtalk] Misc. rule ideas

2002-03-08 Thread Matthew Cline
On Friday 08 March 2002 12:42 am, Rob McMillin wrote: > Matthew Cline wrote: > >First a few rules to match non-spam: > > > > body SIGNATURE_DELIM /^-- $/ > > describe SIGNATURE_DELIM Standard signature delimiter present > > > >While ther

[SAtalk] SA like project: SPASTIC

2002-03-08 Thread Matthew Cline
Found a new anti-spam project announced at FreshMeat, which works via procmail. Homepage at http://spastic.sourceforge.net/index.html According to the homepage: * Filtering based on header and/or body contents * Predefined sets of filters to get started quickly * Whitelist to bypas

Re: [SAtalk] SA like project: SPASTIC

2002-03-08 Thread Matthew Cline
On Friday 08 March 2002 05:26 pm, I wrote: > Maybe we should take a look at their filters. I took a look, and it currently has 28 subject/header strings (non-regexp) to reject and 17 body strings to reject, with no scoring. Most of the stuff seems to already be covered by SA. Some strings SP

[SAtalk] WE_HONOR_ALL

2002-03-09 Thread Matthew Cline
body WE_HONOR_ALL /we (?:honou?r|respect)(?: all|) remov[eal] requests/i Shouldn't "remov[eal]" be "remov(?:e|al)"? The way it is now, it matches "remove", "remova", or "removl". -- Visit http://dmoz.org, the world's | Give a man a match, and he'll be warm largest human edited web directory
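
The difference between the character class and the alternation can be confirmed directly. A short Python sketch using the rule's pattern as quoted above and the suggested replacement; the sample sentences are invented.

```python
import re

char_class = re.compile(
    r"we (?:honou?r|respect)(?: all|) remov[eal] requests", re.IGNORECASE)
alternation = re.compile(
    r"we (?:honou?r|respect)(?: all|) remov(?:e|al) requests", re.IGNORECASE)

samples = [
    "We honor all remove requests",
    "We respect removal requests",
    "We honour all remova requests",
]

class_hits = [s for s in samples if char_class.search(s)]
alternation_hits = [s for s in samples if alternation.search(s)]
```

`[eal]` consumes exactly one character, so it accepts "remove" and the nonsense "remova" but never "removal"; the alternation accepts "remove" and "removal" and nothing else.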

[SAtalk] (?: in regexps

2002-03-09 Thread Matthew Cline
Why is "(?:" used in the rules regexps instead of just "("? Does the engine that applies the rules put normal parens around the whole regexp, and we don't want to interfere with it generating $1, $2 and such? Does "(?:" use less resources? Something else? -- Visit http://dmoz.org, the worl
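
Whatever the project's actual reasons, the observable difference between "(" and "(?:" is that the latter groups without creating a numbered capture. A small Python illustration (the same group semantics apply in Perl); the pattern and input are made up.

```python
import re

text = "new domain"

capturing = re.search(r"(new|old) (domain)", text)
non_capturing = re.search(r"(?:new|old) (domain)", text)

capturing_groups = capturing.groups()          # both groups are numbered
non_capturing_groups = non_capturing.groups()  # only the second one is
```

With plain parentheses "domain" lands in group 2; with `(?:` it is group 1, so surrounding code that relies on group numbers is not disturbed by extra grouping inside a rule.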

[SAtalk] Spam with "To:" including text file names?

2002-03-09 Thread Matthew Cline
I just got this weird spam whose "To:" field is "\unabad.txt, \8un.txt, \8bad.txt"; looks like a piece of Chinese spam. Is this misconfigured spam software, or something else? -- Visit http://dmoz.org, the world's | Give a man a match, and he'll be warm largest human edited web directory.

Re: [SAtalk] Misc. rule ideas

2002-03-09 Thread Matthew Cline
On Saturday 09 March 2002 09:39 pm, Rob McMillin wrote: > Kerry Nice wrote: > >#this one will only work for me, but if it there, it > >is 100% GUARANTEED to be spam > >#not sure how to make a general case for this > >header KERRYSUBJECT Subject =~ /kerry_nice/ > >describe KERRYSUBJECT

[SAtalk] Mime null block report fix *still* not quite right...

2002-03-09 Thread Matthew Cline
Grrr. The previous patch I submitted *still* wasn't quite right; a malicious user could still send a mail which would result in an infinite loop, eating up CPU resources and slowing down the mail server. *This* patch I've stared at, slept on, and stared at again, and I'm positive that i

[SAtalk] "=3D" stuff in rawbody and uri rules

2002-03-10 Thread Matthew Cline
Some of the rawbody and uri rules have stuff to deal with the "=3D" tricks that spammers use. However, get_decoded_text_array() already translated quoted printable encodings, so any "=3D" should be gone. H Looking at get_decoded_stripped_body_text_array(), it redoes the same stuff as

Re: [SAtalk] installing in home dir

2002-03-10 Thread Matthew Cline
On Sunday 10 March 2002 02:18 pm, Will Yardley wrote: > i have something that's both a question and a request... from what i've > seen in the archives, it's not terribly easy to install spamassassin > without root access. From the README (http://spamassassin.taint.org/dist/README): These steps a

Re: [SAtalk] Lot's of spam gets thru because of missing rules

2002-03-10 Thread Matthew Cline
On Sunday 10 March 2002 09:35 am, Rob McMillin wrote: > What bugs me is that CASHCASHCASH doesn't work on the Subject header, > and in your spam example, it would have caught that. This will be fixed in SA 2.20 (or is fixed, if you use the latest CVS version), as the subject is added to the bod

[SAtalk] Join all consecutive whitespace into a single space?

2002-03-10 Thread Matthew Cline
In get_decoded_stripped_body_text_array(), there is: # join all consecutive whitespace into a single space $text =~ s/\s+/ /sg; # reinsert para breaks $text =~ s//\n\n/gis; The first regexp, in addition to compacting down normal whitespace, also turns all newlines into spaces; I guess th
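
The side effect of the first substitution can be seen with any multi-line string, because `\s` also matches newlines. A Python sketch of the same idea (`re.sub` in place of Perl's `s///g`); the sample text and the `[ \t]+` alternative are illustrative assumptions, not a proposed patch.

```python
import re

text = "First  paragraph,\nstill first.\n\nSecond\tparagraph."

# Equivalent of s/\s+/ /g: every run of whitespace, newlines included,
# becomes a single space, so line and paragraph breaks disappear.
collapsed = re.sub(r"\s+", " ", text)

# Restricting the class to spaces and tabs leaves the newlines alone.
line_preserving = re.sub(r"[ \t]+", " ", text)
```

`collapsed` ends up as one long line, while `line_preserving` keeps the original line and paragraph structure.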

Re: [SAtalk] SA should block spam that matches government rules

2002-03-11 Thread Matthew Cline
On Monday 11 March 2002 03:52 pm, William R Ward wrote: > Spammers are required by law in at least some major jurisdictions to > include "ADV" and/or "ADLT" in the subject lines of their spam. SA > should detect this with a very high score. SA does match "ADV:" in the subject, but not without th

[SAtalk] Spamming via sound files, and other intersting techniques

2002-03-11 Thread Matthew Cline
I just got a spam that slipped through SA, which is only a sound file; since you can't find spammish words and phrases in a sound file, it'll get past any filters that there might be. The same spam had another interesting technique, like thus (after decoding quoted-printable): cid:V34tfyBO4160

Re: [SAtalk] Spamming via sound files, and other intersting techniques

2002-03-11 Thread Matthew Cline
On Monday 11 March 2002 08:24 pm, Michael Moncur wrote: > I think that would be a great addition to SA, although I see more virus > emails formatted like that than actual spam. I'm trying the following in my > custom rules file: > > rawbody HTML_FRAMES / describe HTML_FRAMES HTML with an embed

Re: [SAtalk] Spamming via sound files, and other intersting techniques

2002-03-11 Thread Matthew Cline
On Monday 11 March 2002 06:46 pm, Charlie Watts wrote: > Did you play it? (or at least look at it more closely) Ah. Its file type *is* "MS-DOS executable (EXE), OS/2 or MS Windows", so I guess it's a virus. And the raw text of the message has: Content-Type: audio/x-wav; name=speedte

Re: [SAtalk] Messages with empty bodies?

2002-03-12 Thread Matthew Cline
On Tuesday 12 March 2002 08:21 am, Matt Sergeant wrote: > On Tue, 12 Mar 2002, Charlie Watts wrote: > > Does anybody get legit mail with no body? > Yep, and I send a lot too (just mailing each other files in the office > would be one example, and my mail hits the smtp server due to the way it's

Re: [SAtalk] Some more rule ideas

2002-03-13 Thread Matthew Cline
On Tuesday 12 March 2002 09:03 am, Kerry Nice wrote: > Would it be possible to come up with a rule for those > random things that are the final lines of a lot of > spams? These are the kind of things that break razor, > since the hash is different. > > I cut some samples out of some recent spams:

[SAtalk] PORN ideas

2002-03-13 Thread Matthew Cline
-- Visit http://dmoz.org, the world's | Give a man a match, and he'll be warm largest human edited web directory. | for a minute, but set him on fire, and | he'll be warm for the rest of his life. [EMAIL PROTECTED] ICQ: 132152059 | ___

[SAtalk] PORN ideas 2

2002-03-13 Thread Matthew Cline
Ooops, hehe; sorry about that... Interesting subject for a technical mailing list, huh? :-) I started fiddling around with the PORN_3 rule because it wasn't catching any of the porn spam that I got. body PORN_3 /(?:(?:\bcum|\borg[iy]|\bwild|fuck|\bteen|\baction\b|spunk|\bp

[SAtalk] Porn spam which didn't trigger any porn rules

2002-03-13 Thread Matthew Cline
This spam didn't trigger any of the porn rules, not even the PORN_4 URI rule. Maybe "nude" should be added to PORN_4. --- Subject: Jennifer Love Huitt Caught Naked WANT TO SEE SOME CELEBRITY SKIN? Have you ever wondered what Christina Aguillera looks like under her lingerie? Ever wish you

Re: [SAtalk] PORN ideas 2

2002-03-13 Thread Matthew Cline
On Wednesday 13 March 2002 02:57 am, Matthew Cline wrote: > body PORN_12 > /(?:(?:\bxxx|\bsex|\bslut|\bwhore|\bhottest\b|hard-?core|\bhorny\b|\bhornie >st\b|\bvirgin|\bnaughty\b|\bnaughtiest\b|\bwebcam||\ble[sz]b(?:ian|o) > describe PORN_12Uses words

Re: [SAtalk] PORN ideas 2

2002-03-13 Thread Matthew Cline
On Wednesday 13 March 2002 02:56 pm, Matthew Cline wrote: > On Wednesday 13 March 2002 02:57 am, Matthew Cline wrote: > > body PORN_12 > > /(?:(?:\bxxx|\bsex|\bslut|\bwhore|\bhottest\b|hard-?core|\bhorny\b|\bhorn > >ie st\b|\bvirgin|\bnaughty\b|\bnaughtiest\b|\bwebc

[SAtalk] Unique subject IDs the current rule doesn't catch:

2002-03-14 Thread Matthew Cline
I've recently gotten two spams with subject IDs which check_for_unique_subject_id() doesn't match: Seen On Tv Plan Lets You tap Into Explosive Internet Growth 4151dl5 Web Content Management 6217ecfg7-1l10 Hmmm, a "seen on TV" without an "as seen on"... -- Visit http://dmoz.org, the world's

Re: [SAtalk] More filter ideas

2002-03-14 Thread Matthew Cline
On Thursday 14 March 2002 01:19 am, Daniel Pittman wrote: > On Thu, 14 Mar 2002, Rob McMillin wrote: > > I characterize this as follows: > > header TO_8BITTo =~ /[\x80-\xff]/ > > describe TO_8BITAddressee has 8 bit characters > > score TO_8BIT5 > This isn't a legal DNS name

Re: [SAtalk] PORN ideas 2

2002-03-14 Thread Matthew Cline
On Wednesday 13 March 2002 08:28 am, Matt Sergeant wrote: > Geoff Gibbs wrote: > >I believe that the current version of PORN_4 (2.11) is triggered by :- > > > >http://www.essex.ac.uk/ > > > >giving:- > >X-Spam-Status: No, hits=2.3 required=5.0 tests=PORN_4 version=2.11 > Good. That means SpamAss

Re: [SAtalk] What header to add to disable SA

2002-03-14 Thread Matthew Cline
On Thursday 14 March 2002 08:07 pm, Olivier Nicole wrote: > Hi, > > I am writting a small script that will send email to my users. > > I want the email message not to be checked by SA. > > I am wondering if there is any way to do so. I am using > procmail/spamc/spamd. There is no way to do this,

Re: [SAtalk] Contributed rules: stock market spam

2002-03-15 Thread Matthew Cline
On Friday 15 March 2002 12:11 am, Michael Moncur wrote: > Here's my file of rules for stock-market spam. I tried to avoid anything > that would be used in "normal" mail, but people who subscribe to stock > reports could get false positives. I used a bunch of low-scoring rules > rather than fewer h

Re: [SAtalk] HotMail email advertising trips wire...

2002-03-15 Thread Matthew Cline
On Friday 15 March 2002 06:31 pm, Andrew Kohlsmith wrote: > > I recently received some personal mail with the following > > HotMail-generated ad. at the end (linebreaks are mine): > > MSN Photos is the easiest way to > > share and print your photos: > href='http://go.msn.com/bql/hmt

Re: [SAtalk] spamassassin & razor

2002-03-16 Thread Matthew Cline
On Saturday 16 March 2002 12:31 am, Robert Fleming wrote: > I installed Razor as per the docs, and get an error when it is called by > spam-assassin > > Here is the entry from my log files: > 2002-03-16 01:24:29.886330500 delivery 685: success: > razor_check_skipped:__undefined_Razor::Client/did_0

Re: [SAtalk] General comments on Spamassassin, with a few questions

2002-03-16 Thread Matthew Cline
On Saturday 16 March 2002 03:56 pm, Jason White wrote: > Now to my questions: whenever Spamassassin fails to detect a spam > message (which occasionally happens), I pipe the offending message to > spamassassin -r > Is this the correct response? The effect is (or should be) to add the > message (o

Re: [SAtalk] List emails

2002-03-18 Thread Matthew Cline
On Monday 18 March 2002 10:22 am, CertaintyTech - Ed Henderson wrote: > I am very pleased with SA and the job it is doing. Good job to all! > > But...In my situation if SA makes a false positive it is often on mailing > list type emails. Perhaps a user has suscribed to a joke of the day or > som

Re: [SAtalk] SA's performance with mailing lists

2002-03-19 Thread Matthew Cline
On Monday 18 March 2002 07:38 pm, Kerry Nice wrote: > I saw in the Lockergnome newsletter I received today, Spamassassin was > slammed big time. I do see his point though. Does SA really do that > great of a job with newsletters and journals? We could take out the rules that get triggered ofte

[SAtalk] Another unique subject ID regexp

2002-03-19 Thread Matthew Cline
I've occasionally gotten spam with a subject that looks like this: >> Subject: !Beautiful, Custom Websites - $399 Complete! >> (7217vPhZ0-478TLdy5829qicU9-0@26) >> Subject: Custom Websites for $399 Complete! (or yours re-designed) >>(2539OiAs5-871MeWq8@17) The current check_for_unique_subject_i

Re: [SAtalk] SA's performance with mailing lists

2002-03-19 Thread Matthew Cline
On Tuesday 19 March 2002 12:57 pm, Craig Hughes wrote: > Actually, something I've noticed is that otherwise legitimate-looking > email frequently gets tripped up by an ad tacked on the bottom of the > mail -- this happens with mailing lists trying to support themselves, > but also with things lik

Re: [SAtalk] newbie question

2002-03-19 Thread Matthew Cline
On Tuesday 19 March 2002 02:54 pm, dman wrote: > On Tue, Mar 19, 2002 at 02:22:22PM -0800, Byrne Reese wrote: > | Hopefully, someone can tell me to go read a specific FAQ or something, > | but I have nothing that will help me get qmail to work with spam > | assassin. > | > | I need spamassassin to

Re: [SAtalk] Skipping multipart/related is bad

2002-03-19 Thread Matthew Cline
On Tuesday 19 March 2002 03:02 pm, dman wrote: > On Tue, Mar 19, 2002 at 02:34:23PM -0800, Bart Schaefer wrote: > | On Tue, 19 Mar 2002, Daniel Rogers wrote: > | > I guess this would mean having to recurse through all the mime parts? > | Yes. This is now bugzilla #115. > Does perl not have an

[SAtalk] CC new bugzilla bugs to SAtalk?

2002-03-19 Thread Matthew Cline
Don't know how big of a hack this would be, but it might be a good idea to CC newly created bugs to the SAtalk list, so people would be reminded of its existence. -- Visit http://dmoz.org, the world's | Give a man a match, and he'll be warm largest human edited web directory. | for a minut

Re: [SAtalk] Failed test Razor::Client

2002-03-20 Thread Matthew Cline
On Wednesday 20 March 2002 11:46 am, Lewis Bergman wrote: > I have installed SpamAssassin and it is working as it should be. The only > problem I seem to have is this error is reported when it runs: > razor check skipped: No such file or directory undefined Razor::Client If you are using Razor 1.

Re: [SAtalk] ORBZ shutdown

2002-03-20 Thread Matthew Cline
On Wednesday 20 March 2002 08:25 am, Jason wrote: > Looks like ORBZ has been shutdown... > > As seen on slashdot > > http://slashdot.org/article.pl?sid=02/03/20/1528246&mode=thread&tid=111 Mentioned in one of the article's comments was another RBL that SA currently doesn't use, http://njabl.org

Re: [SAtalk] spamassassin -r problem

2002-03-20 Thread Matthew Cline
On Wednesday 20 March 2002 12:20 pm, Michael Blakeley wrote: > With 2.01 installed via CPAN, spamassassin complains about missing > Razor::Client - but it is installed: > > $ perl -MRazor::Client -e 'print "$Razor::Client::VERSION\n";' > 1.20 Downgrade to 1.19 -- Visit http://dmoz.o

[SAtalk] Improvement to NVALID_MSGID

2002-03-20 Thread Matthew Cline
Currently, INVALID_MSGID doesn't catch message IDs like <026b10d87e4c$8543d8d6$8ad36ae8@ihervr>, because it only requires that there be something after the "@". I've changed it so that it requires something like a normal host name after the "@" (with at least one "." in it). While I was at it, I
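The host-name requirement described above can be sketched in Python roughly as follows. The pattern and the helper name are hypothetical; this is not the actual INVALID_MSGID rule or the change from the message, only an illustration of "at least one '.' after the '@'".

```python
import re

# Hypothetical pattern: "<", a local part, "@", then a host name made of
# two or more labels joined by ".", then ">".
MSGID_WITH_DOTTED_HOST = re.compile(
    r"^<[^<>@\s]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+>$"
)

def has_dotted_host(msgid: str) -> bool:
    """Return True if the Message-ID has a dotted host name after '@'."""
    return MSGID_WITH_DOTTED_HOST.match(msgid.strip()) is not None

# The ID quoted in the message has no "." after the "@".
print(has_dotted_host("<026b10d87e4c$8543d8d6$8ad36ae8@ihervr>"))
# A made-up ID with a normal-looking host name.
print(has_dotted_host("<20020320.1234@mail.example.com>"))
```

A check this strict would also flag IDs from hosts that use an unqualified local name, so it is probably better suited to a scored rule than to a hard reject.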

Re: [SAtalk] Improvement to NVALID_MSGID

2002-03-20 Thread Matthew Cline
On Wednesday 20 March 2002 06:45 pm, Theo Van Dinter wrote: > On Wed, Mar 20, 2002 at 06:09:24PM -0800, Matthew Cline wrote: > > Currently, INVALID_MSGID doesn't catch message IDs like > > <026b10d87e4c$8543d8d6$8ad36ae8@ihervr>, because it only requires that > &g

[SAtalk] Some changes to unsub/remove URI rules

2002-03-20 Thread Matthew Cline
I've made some changes to a few of the unsub/remove URI rules. A diff for the changes is included as an attachment... The UNSUB_PAGE regexp was: /^https?:\/\/.*(?!cgi).*unsubscribe/i Using (?!cgi) to exclude URIs with a "cgi" in them doesn't work, because the first ".*" can match everythin
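A small Python check of the problem described above; Python's re module treats this pattern the same way Perl does. The sample URI is made up, and the second pattern is only one possible way to make the exclusion work. It is an assumption for illustration, not the change from the attached diff.

```python
import re

# The original rule's pattern, as quoted in the message.
ORIGINAL = re.compile(r"^https?://.*(?!cgi).*unsubscribe", re.I)

# Hypothetical alternative: test the lookahead at every character
# position between the scheme and "unsubscribe".
TEMPERED = re.compile(r"^https?://(?:(?!cgi).)*unsubscribe", re.I)

uri = "http://example.com/cgi-bin/unsubscribe.pl"  # made-up URI

# ORIGINAL still matches: the first ".*" can stop anywhere that is not
# immediately followed by "cgi", and the second ".*" swallows the rest.
original_hit = ORIGINAL.search(uri) is not None

# TEMPERED refuses to step past a position where "cgi" starts.
tempered_hit = TEMPERED.search(uri) is not None

print(original_hit, tempered_hit)
```

Note that the alternative only excludes a "cgi" that appears before "unsubscribe"; a URI ending in "unsubscribe.cgi" would still match it.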

[SAtalk] Unusual munging of "To" hostname

2002-03-20 Thread Matthew Cline
I've just received two web hosting related spams from "hostingguy" at bigfoot.com. The "To" had random gibberish added as a host name to the actual domain name: [EMAIL PROTECTED] [EMAIL PROTECTED] Is this to avoid spam-trap searches that look for exactly the spam trap address? They passed b

[SAtalk] How to deal with ?

2002-03-20 Thread Matthew Cline
One way that spammers could try to get around some of the URI rules (at least for HTML only spam) is to put the main part of the URI into a tag, so that all of the URIs pulled from won't match rules which look for domain names and "http://". I've modified get_decoded_stripped_body_text_array()

[SAtalk] Some changes to get_decoded_stripped_body_text_array()

2002-03-21 Thread Matthew Cline
I've made a bunch of changes to get_decoded_stripped_body_text_array(). First, rather than decoding hex entities like � directly to ascii characters, I changed it to convert them to decimal before the decimal entities are replaced. Thus ” will get converted first to ň and then to a double quo
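The two-pass idea above (hex entities rewritten as decimal entities, then decimal entities decoded) might look roughly like this in Python. The function names are hypothetical and this is not the code in get_decoded_stripped_body_text_array(). The decimal step here uses plain chr(), which is an assumption about how code points map to characters.

```python
import re

def hex_entities_to_decimal(text: str) -> str:
    """Rewrite hex entities such as &#x22; as decimal entities (&#34;)."""
    return re.sub(
        r"&#[xX]([0-9a-fA-F]+);",
        lambda m: "&#%d;" % int(m.group(1), 16),
        text,
    )

def decode_decimal_entities(text: str) -> str:
    """Replace decimal entities with the character they name."""
    def repl(m):
        code = int(m.group(1))
        # Leave out-of-range values untouched instead of raising.
        return chr(code) if 0 < code <= 0x10FFFF else m.group(0)
    return re.sub(r"&#([0-9]+);", repl, text)

sample = "&#x22;hi&#34;"                  # one hex entity, one decimal
step1 = hex_entities_to_decimal(sample)   # '&#34;hi&#34;'
decoded = decode_decimal_entities(step1)  # '"hi"'
print(step1, decoded)
```

Funnelling both forms through the decimal decoder means any special-case handling of particular code points only has to be written once.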
