Hello Matt, Chris,
Monday, August 4, 2003, 11:03:04 PM, Matt wrote:

MK> At 09:45 PM 8/4/03 -0700, Robert Menschel wrote:
>> uri      L_u_time4more  /time4more\.net/i
>> describe L_u_time4more  Body text references known spammer
>> score    L_u_time4more  9.00 # graphics-only spam Aug 4 03

MK> Personally, I tend to not go over 4.0, even on a sure-fire spam rule.
MK> This is mostly a result of accepting the general SpamAssassin
MK> philosophy that no single rule should be enough (with the exception
MK> of things like GTUBE).

I understand, but then I also firmly believe and accept the philosophy
that there's an exception to every rule. :-) The philosophy of not
flagging spam on a single rule has exceptions in 1) the blacklist, and
2) spam where the spammer gives no other clues. An extreme example of
the latter would be spam which contains nothing but a single URL, itself
not clearly spam, with no spamsign in the subject or other headers.

MK> Admittedly this isn't very likely to false positive. However, rather
MK> than creating one rule worth 9, if at all possible I tend to create
MK> a handful of rules for the same spam which total 6-9.

I normally aim for the same. Since most false negatives get 50-75% of
the way to my required hits, I frequently need to add just 1-2 points to
get the email to score as spam.

MK> I would also try to improve the rule by framing it with \b's, or at
MK> least starting it with one:
MK> /\btime4more\.net\b/

That's a good enhancement, which I'm adding to my rule. This should
match "go to time4more.net." as well as http://www.time4more.net and
http://www.time4more.net/links/spampage.html

Tuesday, August 5, 2003, 6:53:35 AM, Chris wrote:

CS> This is exactly what MY_EVIL does, and it works great. My tips page
CS> talks about how you should mark these as Temp_My_EVIL, as they will
CS> eventually expire.

Good point.

CS> For those who may not see it, these are not the domains of the spam
CS> senders, but the domains of the image hosts, often owned by spammers.
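For reference, folding Matt's word-boundary suggestion back into the
original rule gives a fragment like this for local.cf or user_prefs (the
9.00 score is the one debated above, not a recommendation):

```
uri      L_u_time4more  /\btime4more\.net\b/i
describe L_u_time4more  Body text references known spammer
score    L_u_time4more  9.00  # graphics-only spam, Aug 4 2003
```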
CS> Therefore it is ever-changing, like an RBL. So submissions of these
CS> to the Rule Emporium would be too lengthy. You would almost have to
CS> have an RBL for the rule :)

We could, however, set up a blacklist through a website, such that
anyone can submit an entry: a simple domain name such as time4more.net,
an IP address if that's the reference in the spam, or a more specific
URI (spaml3.time4more.net/spamdir or 123.234.56.78/spamdir). The web
system would track submissions and create a ruleset from them. The
initial score on first submission would be 0.1, with the score
increasing perhaps to 1.0 as additional submissions/reports come in. We
could also have password-authorized trusted submitters, whose
submissions would score higher (allowing scores to reach perhaps 2.5).
Perhaps these scores would be doubled for systems not using DNSBLs?

The system would then dump these scores into an ASCII file that could be
retrieved by anonymous FTP. This file could be stored as auto-uribl.cf
by those who can have multiple local.cf files, and could be
automatically added to the user_prefs file for people like me who are
limited to user_prefs. (Such rules wouldn't do any good unless you use a
system like mine that calls SA a second time.)

CS> This type of rule can also be combined with others. There is almost
CS> no chance of time4more.net showing up in a message at the same time
CS> as tastemysalad.com, so it is easier to combine.

Agreed. The only concern would be readability/editability; rules which
get too long bother me from an esthetic perspective. (That wouldn't
apply if we develop an automated system.)

Since the rules are temporary, perhaps it'd be good to name them
something like L_u_Tmp_AugW1 (rule added the first week in August). When
we then review the rules, we know to check in Sept and Oct whether the
rule has been superseded by a DNSBL. We know to check in Nov and Dec
whether the domains in the rule are no longer being used.
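To make the web-blacklist idea concrete, here is a minimal sketch of the
scoring and ruleset-generation side. Everything here is hypothetical --
the function names, the rule-name prefix L_U_AUTO, and the exact scoring
curve are my illustrative assumptions, not an existing tool:

```python
# Hypothetical sketch of the submission-scored URI blacklist described
# above: scores start at 0.1 on the first report, climb toward 1.0 with
# volume, and trusted (password-authorized) reports lift the cap to 2.5.
import re

def score(reports, trusted_reports=0):
    """Compute a provisional score from report counts (assumed curve)."""
    if reports + trusted_reports == 0:
        return 0.0
    base = min(1.0, 0.1 * (reports + trusted_reports))   # volume component
    bonus = min(1.5, 0.5 * trusted_reports)              # trusted lift
    return round(min(2.5, base + bonus), 2)

def make_ruleset(domains):
    """Emit an auto-uribl.cf-style fragment from a mapping of
    domain -> (reports, trusted_reports)."""
    lines = []
    for i, (dom, (n, t)) in enumerate(sorted(domains.items())):
        name = "L_U_AUTO_%03d" % i
        lines.append("uri %s /%s/i" % (name, re.escape(dom)))
        lines.append("describe %s Reported spamvertised domain %s" % (name, dom))
        lines.append("score %s %.2f" % (name, score(n, t)))
    return "\n".join(lines)

print(make_ruleset({"time4more.net": (1, 0)}))
```

The emitted text would be dumped to an ASCII file (auto-uribl.cf) for
retrieval by FTP, as described above; low first-report scores keep a
bad or malicious submission from causing false positives on its own.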
We know in Jan and Feb that if the rule is still active and beneficial,
maybe we should remove the Tmp flag.

>> header L_s_CorelWPOffice Subject =~ /(?:Corel|WordPerfect).{1,15}Office/i

MK> More \b action, on general principle, although not strictly needed.

Agreed. Thanks.

CS> Yeah, I have the Norton SystemWorks rule like this. If you don't use
CS> WP Office, then by all means make a rule. But an ISP would shy away
CS> from this one.

Actually, we DO use WP Office, and we frequently share files from WP
Office. But we don't refer to WP Office as such in subject headings,
just as we don't name each other in subject headings either. As for an
ISP, I would think it's still a valid rule; they'd just need to be
careful to score it low enough to be incremental rather than
definitional.

>> header L_hr_lattelekom Received =~ /lattelekom\.net/

MK> Seems fine, although a bit of a duplication of effort with DNSBLs --
MK> have you enabled them?

DNSBLs are enabled by my host; I wouldn't be without them. This was a
spam that didn't score from them -- apparently it's too new a pathway.
This rule should probably be given a temporary name/flag, and removed
once the DNSBLs catch up.

CS> Hmmm... this is interesting. This would help me greatly if I listed
CS> IPs. My blocked IP access list has stopped a few legit emails that
CS> I've had to fix. However, if I had SA read in that list and simply
CS> score some points for matches, it would be less painful on FPs.

As an end user, with no access to procmail or similar, SA is my only
method of providing a fixed access list, and it works well for me.
Perhaps a web site like the one theorized above could provide a set of
Received header rules, by domain name and/or IP address, which indicate
spam, with the same scoring considerations applying (so provisional or
wrong submissions don't cause false positives).

CS> This comes down to the same problem of SA and my lack of perl.
CS> Having an SA function that reads a text file and looks for matches
CS> in the email, such as a list of domains or IPs -- there was mention
CS> that 2.6 might have some sort of eval like this. That would be sweet.

My understanding is that if you can create/update your local.cf, you
also have the ability to have multiple *.cf files in that directory, and
all of them will be used. So you can have a relays.cf file which you
replace daily or weekly, and it takes effect whenever spamd is restarted
or SA is run manually.

As for people like me with only user_prefs files: if we can get rules
activated as in my system, it's simple enough to keep multiple *.cf
files in our $HOME/.spamassassin directory, which are normally invisible
to SA, and to "cat *.cf > user_prefs" on a daily basis to do the same
type of updating. I am considering doing something like that to automate
the implementation of William Stearns' blacklist collection.

CS> I think your method has some potential. But most of the spam I see
CS> fakes the domain names and comes right from an open relay.

Can they be identified? If we discard the domain names, are there
reliable IP addresses which identify the open relays? If so, the same
idea should work with those.

Thanks to both of you for your ideas. (Anyone else?)

Bob Menschel