Botnet spam not being caught
Hi all, I'm using SA-3.2.5 on Linux and my system is being deluged with spam that isn't being caught, apparently from botnets. I'm using Botnet 0.7. The subject is random and the "Received from" header is always an unresolvable IP. Is there a more robust botnet plugin that may be more effective? Botnet 0.8 was catching too many FPs here (its score was too high). The body is also quite random -- enough so to keep Bayes usually at 50 or less. Is there a later version of SA that's stable? Here are the relevant headers:

Received: from [78.97.185.89] (unknown [78.97.185.89])
Message-ID:
Subject: Where is this bar?
MIME-Version: 1.0
Content-Type: text/html; charset="utf-8"
Content-Transfer-Encoding: 7bit
Date: Sat, 13 Jun 2009 04:05:44 -0400 (EDT)
X-Virus-Scanned: by amavisd-new at mydomain.com
X-Spam-Status: No, hits=4.9 tagged_above=-300.0 required=5.0 use_bayes=1 tests=BAYES_50, BOTNET, HTML_MESSAGE, MIME_HTML_ONLY, RDNS_NONE, URIBL_BLACK
X-Spam-Level:

The body is HTML and contains the following: Click here to view this message as a web page. Copyright © 2002-2009 by the Pyahqql, Inc. All rights reserved. Click here if this picture is blocked Home | Contact Us | Privacy Policy | Terms of Use | Unsubscribe | Where can I go from here? Thanks, Alex
Re: Botnet spam not being caught
Hi John,

> Botnet seems to have caught that just fine (it's listed in the rules
> which were triggered). The problem is either that you're running it
> at a lower score (which you could also do for Botnet 0.8 if you wanted
> to upgrade -- their default scores are exactly the same), or you need
> other rules/configs to supplement your overall scoring system.

Yes, I didn't intend to blame it on Botnet; I realize the rule is being triggered. I was concerned about raising the score above my current 1.5, and was thinking that some other rule might be available, or in use by someone on the list, in conjunction with Botnet to catch these. If not, can you recommend an approach for calculating the right Botnet score for my environment, so it doesn't tag so many FPs, or what an appropriate value should be with my threshold set to 5.0? Thanks again, Alex
Re: Botnet spam not being caught
Hi Charles,

>> Received: from [78.97.185.89] (unknown [78.97.185.89])
>> Message-ID:
>
> Do they all have message IDs that include the IP?

Yeah, it looks like they all do. Would something like this work?

header MYMSGIP Message-ID =~ /78.97.185.89/
score MYMSGIP 0.3
describe MYMSGIP Message-ID from botnet

Can someone help to write a rule that wildcards this safely?

> Also give a bit more score to the RDNS rules

Great idea. It's currently only 0.1. I also see BOTNET_NORDNS in Botnet.cf, but it isn't being triggered. It's also weighted at 0.0. Is there a reason for this?

> You also might want to block that line that says "if picture is blocked".

There are a couple of variations, but this also looks like it would work well. Thanks, Alex
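As a rough, hedged sketch of the "wildcarded" version asked about above -- assuming these Message-IDs always embed a bare dotted-quad IP, with a made-up rule name and a deliberately low score since legitimate MUAs occasionally put IP-like strings in Message-IDs:

  header   LOCAL_MSGID_HAS_IP  Message-ID =~ /\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b/
  score    LOCAL_MSGID_HAS_IP  0.3
  describe LOCAL_MSGID_HAS_IP  Message-ID contains a bare IP address

Tying it to the same IP as the connecting relay would be safer still, but that needs a plugin or relay-metadata check rather than a plain header regex.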
Debugging and scripting
Hi. I'm relatively new to SpamAssassin and perl scripting, and I must already be doing a few things wrong that I hoped the list could help me solve. I'm receiving the following output when running "spamassassin -D < spam-test.txt 2>&1 | less":

[32692] warn: Number found where operator expected at (eval 607) line 1, near "0 0"
[32692] warn: (Missing operator before 0?)

Where is this coming from? Perhaps local.cf, but where? It's not line 607. I'm also having a problem with one of my rules:

[32692] info: config: invalid expression for rule LOCAL_XPS: "Subject =~ /Free\ DELL\ XPS/i": syntax error

Here is the full rule:

meta LOCAL_XPS Subject =~ /Free\ DELL\ XPS/i
score LOCAL_XPS 1.5
describe LOCAL_XPS Rule by AS: XPS Dell

Do I need the backslashes to escape the spaces? Will that match the pattern anywhere on the line, or only that text on the line? Can you explain to me the meaning of '(.+)' as in:

header LOCAL_RULE1 Subject =~ /(.+)Spam\ Sample(.+)/i
score LOCAL_RULE1 5.0
describe LOCAL_RULE1 Subject Spam Sample

How about without the parens? I believe this is somehow causing emails to hit the "MISSING_SUBJECT" rule, even though the email clearly has a subject. Any help greatly appreciated. Thanks, Alex
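For what it's worth, the "invalid expression" lint error above is what you would expect when a header-style expression is declared with the meta keyword; a minimal hedged sketch of the same rule written as a header rule (untested, name and score unchanged):

  header   LOCAL_XPS  Subject =~ /Free DELL XPS/i
  score    LOCAL_XPS  1.5
  describe LOCAL_XPS  Rule by AS: XPS Dell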
Re: Debugging and scripting
Hi Dan, > Do I need the backslashes to escape the spaces? > > no, although \s would be fine. > Okay, so either \s or nothing at all works just the same? > this can be much more effectively written as: > /.spam\ssample./i > That will match the words "spam sample" in the subject as long as there > is at least 1 character before and one after. But you had previously written that /Spam Sample/ will also match that text anywhere on the line. Is that not the case? Thanks again, Alex
Re: Debugging and scripting
Hi Matus (and list :-)

> I'm not Dan. This is a mailing list. Many people read it and many can
> respond to your mail.

Yes, thanks, I had responded to him directly and probably didn't need to, but the reply-to must not be set to the list address?

> /spam sample/ will match the text anywhere on the line.
> /.spam sample./ will match the text anywhere on the line, except the beginning and
> the end, since it must be preceded by at least one character.
>
> /(.+)spam sample(.+)/ will match exactly the same, but the match will be
> slower since the (.+) will need to compare all text before/after the "spam
> sample" and store them both in capture buffers.

Okay, that's great. Thanks so much for your help. Best regards, Alex
BAYES_99 score & lint
Hi all, When I run "spamassassin -D --lint", I receive this output: [14406] info: rules: meta test LOCAL_BAYES_RTF has dependency 'BAYES_99' with a zero score Which is it saying has a zero score? BAYES_99 in 50_scores.cf is shown as: score BAYES_99 0 0 3.5 3.5 The LOCAL_BAYES_RTF is a meta rule that combines BAYES_99 with a mimeheader rule with 0.1 score that catches RTF files. Ideas greatly appreciated. Thanks, Alex
Re: BAYES_99 score & lint
> Post your entire scoring block for LOCAL_BAYES_RTF

meta LOCAL_BAYES_RTF (BAYES_99 && LOCAL_CTYP_RTF)
score LOCAL_BAYES_RTF 1.5
describe LOCAL_BAYES_RTF Rule by AS: Probably an Inline RTF spam

mimeheader LOCAL_CTYP_RTF Content-Type =~ /^application\/octet-stream.\.rtf/i
score LOCAL_CTYP_RTF 0.1
describe LOCAL_CTYP_RTF Rule by AS: Content-Type: RTF

I also looked a bit further, and don't see where else BAYES_99 might be redefined, and I'm sure that it's scoring above zero. Is there a way to print out all the rules with their scores? Thanks, Alex
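A hedged aside on both questions: in a four-value line like "score BAYES_99 0 0 3.5 3.5", the first two values are the score sets used when Bayes is not in play, so a lint run in which Bayes isn't active can legitimately see BAYES_99 as zero-scored. And for listing scores, there doesn't appear to be a single built-in dump command in 3.2; a rough workaround with standard tools (paths are the usual Linux defaults and may differ) is:

  grep -rn 'score[[:space:]]*BAYES_99' /usr/share/spamassassin /var/lib/spamassassin /etc/mail/spamassassin

Dropping the rule name shows every score line SA will read, along with the file it comes from.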
SA & amavisd & scanning attachments
Hi, I'm not sure this is an SA question specifically, but perhaps an amavisd-new question that I hoped someone could help me to answer. I'm using amavisd-new, postfix, and spamassassin for multiple domains. I'd like to know if it's possible to permit per-domain forwarding of certain attachment types while stripping others? It appears $banned_filename_re is sitewide, but I thought there might be another way to permit attachments on a per-domain basis? Thanks, Alex
Re: perms problems galore
Hi, I guess I have more of a general sa-update question. I have sa-update running against updates.spamassassin.org and these others:

70_sare_stocks.cf.sare.sa-update.dostech.net
70_sc_top200.cf.sare.sa-update.dostech.net
70_sare_adult.cf.sare.sa-update.dostech.net
90_2tld.cf.sare.sa-update.dostech.net

They never seem to update, however. Am I doing something wrong? Are there others I should consider? Thanks, Alex

On Fri, Jul 3, 2009 at 11:05 PM, Gene Heskett wrote:
> Greetings all;
>
> I _thought_ I had sa-update running ok, but it seemed that the effectiveness
> was stagnant, so I found the cron entry that was running sa-update &
> discovered a syntax error there, which when I fixed it, disclosed that I had
> all sorts of perms problems that I don't seem to be able to fix readily.
>
> sa-update is being run as the user saupdate, which is a member of the group
> mail. I have made the whole /var/lib/spamassassin/keys tree an saupdate:mail,
> with very limited rights as in:
> drw--- 2 saupdate mail 4096 2008-12-19 16:05 keys
>
> But sa-update appears not to have perms to access or create gpg keys there.
> --
>
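One hedged thing to check with the SARE channels: they are signed with the channel maintainer's own GPG key, so sa-update quietly refuses them unless that key has been imported and listed as trusted. A rough sketch (the key file name is only an example -- fetch the channel's published public key first; and, if I recall correctly, sa-update exits 0 only when it actually installed something, while exit code 1 just means "no update available"):

  # one-time: import the channel's public key into sa-update's keyring
  sa-update --import sare-channel.gpg.key

  # regular run: name the trusted key id and each channel explicitly
  sa-update --gpgkey <key-id-from-the-imported-key> \
      --channel updates.spamassassin.org \
      --channel 70_sare_stocks.cf.sare.sa-update.dostech.net \
      --channel 90_2tld.cf.sare.sa-update.dostech.net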
Spam troubleshooting
Hi all, I am stuck trying to figure out why the attached spam isn't caught properly. In fact, BAYES_99 isn't flagged and I know it should be, and the total score is 0.0, despite several rules being flagged. The LOCAL_BODY_1577053434 and LOCAL_BODY_4046600451 rules both catch the phone numbers and have a 2.01 value. The X-MailCleaner headers were there when I received the email. I've obfuscated our customer's domain for security. Any ideas greatly appreciated. Where can I start? Am I doing something wrong, or is there something in the header that is reducing the score? Thanks so much. Best regards, Alex

[Attachment: phone-spam-out.txt.gz -- GNU Zip compressed data]
Re: Spam troubleshooting
Hi,

> spamassassin 2>&1 -D --lint
>
> search here for missing perl modules

How effective are razor/pyzor and SPF/DKIM? I've always been a bit hesitant to use any of those.

> and the spam mail have all_trusted? You trust a spammer in
> trusted_networks

trusted_networks isn't defined at all. It looks like it was previously defined with just 127.0.0.1, but it's now commented out. What should it be? You are referring to the spamassassin trusted_networks, not postfix, right? Thanks, Alex
Re: Spam troubleshooting
Hi again,

> and the spam mail have all_trusted? You trust a spammer in
> trusted_networks

I meant to add, how can I determine which IP it is that's being trusted, anyway? Thanks again, Alex
Spam gathering contact details
Hi, I'm receiving a lot of spam that I can't catch containing fields where the recipient is supposed to enter their contact details, like this: Full Legal Name : Address : City : State : Zip code : Country : Nationality : Home and Cell # : I've added specific rules that look for, say /Full Legal Name :/, but it otherwise only hits BAYES_99. Does anyone have any suggestions for catching these more effectively? Thanks, Alex
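A hedged sketch of one way to score on several of those fields together rather than one by one -- the subrule and meta names are made up, the field list comes from the sample above, and the threshold and score would need local tuning:

  body  __LOC_FORM_NAME     /Full Legal Name\s*:/i
  body  __LOC_FORM_NATION   /Nationality\s*:/i
  body  __LOC_FORM_COUNTRY  /Country\s*:/i
  body  __LOC_FORM_ZIP      /Zip code\s*:/i

  meta     LOC_CONTACT_FORM  ((__LOC_FORM_NAME + __LOC_FORM_NATION + __LOC_FORM_COUNTRY + __LOC_FORM_ZIP) >= 3)
  score    LOC_CONTACT_FORM  1.5
  describe LOC_CONTACT_FORM  Asks for several contact-detail form fields

Requiring three of the four fields keeps a stray "Country:" or "Zip code:" in a legitimate order confirmation from firing it on its own.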
Re: Spam gathering contact details
Hi,

> ...actually, the rules sandbox in svn has been rearranged a bit since that
> announcement. The current ruleset lives here:
>
> http://svn.apache.org/viewvc/spamassassin/trunk/rulesrc/sandbox/jhardin/20_fillform.cf
>
> The updated ReplaceTags.pm is available at:
>
> http://svn.apache.org/viewvc/spamassassin/trunk/lib/Mail/SpamAssassin/Plugin/ReplaceTags.pm

Okay, I've updated both, and it's already catching some. Thanks so much, John. However, it still doesn't catch the original one because the score is too low, although it's close. It only hits FILL_THIS_FORM_SHORT/LONG and the rule I created for /Full Legal Name/. I've added it to bayes, but for some reason it isn't being tagged. How did you determine the scores for FILL_THIS_FORM? How safe would it be to raise each by 0.5? Thanks, Alex
Re: Spam troubleshooting
Hi,

> ALL_TRUSTED is a bit odd. If you look back through the debug, it
> has identified untrusted relays:
>
> [11689] dbg: metadata: X-Spam-Relays-Untrusted: [ ip=194.230.33.137
> rdns=mx.xm-rz.net helo=mail.xm-rz.net by=myhost.mydomain.com ident=
> envfrom= intl=0 id=B94C2118004 auth= msa=0 ] [ ip=62.2.104.4 rdns=

Yes, after noticing xm-rz and t-p.com in 'Received:' headers on several of these, I've since added a header rule to add points for those relays. Is this the proper way to do it?

header LOCAL_RECVD_TP Received =~ /.\.t-p\.com/
score LOCAL_RECVD_TP 3.6
describe LOCAL_RECVD_TP Recvd from botnet

Thanks, Alex
Re: Spam troubleshooting
Hi again, I have more information on those untrusted hosts.

>> ALL_TRUSTED is a bit odd. If you look back through the debug, it
>> has identified untrusted relays:
>>
>> [11689] dbg: metadata: X-Spam-Relays-Untrusted: [ ip=194.230.33.137
>> rdns=mx.xm-rz.net helo=mail.xm-rz.net by=myhost.mydomain.com ident=
>> envfrom= intl=0 id=B94C2118004 auth= msa=0 ] [ ip=62.2.104.4 rdns=

Now, for some reason, when I run this spam through SA, I see this:

X-Spam-Report:
 * -4.0 RCVD_IN_DNSWL_MED RBL: Sender listed at http://www.dnswl.org/,
 *      medium trust
 *      [194.230.33.137 listed in list.dnswl.org]
 *  0.0 STOX_REPLY_TYPE STOX_REPLY_TYPE
 *  3.6 LOCAL_RECVD_TP Recvd from botnet
 *  3.6 LOCAL_RECVD_XM Recvd from botnet
 *  2.0 LOCAL_BODY_4046600451 BODY: This message contained the string
 *      "1.845.709.8044"
 *  2.0 LOCAL_BODY_1577053434 BODY: This message contained the string
 *      "845.709.8044"
X-Spam-Status: Yes, score=7.2 required=5.0 tests=LOCAL_BODY_1577053434,
 LOCAL_BODY_4046600451,LOCAL_RECVD_TP,LOCAL_RECVD_XM,RCVD_IN_DNSWL_MED,
 STOX_REPLY_TYPE shortcircuit=no autolearn=disabled version=3.2.5

What the hell is RCVD_IN_DNSWL_MED, and why is that relay trusted by dnswl.org? Thanks, Alex
Re: Spam troubleshooting
Hi,

> Have any of you tried going to the dnswl.org homepage? Even tried to look up
> the IP? Got refused when submitting a new ticket?

Yes, I went to the site, but didn't try to resolve either of them because I knew they were already on the list. They now appear to no longer be on the list. Now I know to submit a ticket. Thanks, Alex
Eliminating unnecessary rules
Hi, I have created a routine where I can enter a string into a text file and it gets converted into a set of rules that form a cf file. They are all of the form LOCAL_RULE_N, where N is a random 6-digit number. Two points are added if the rule is triggered. There are now about 3800 of these rules, dating back chronologically about a year or so. I've learned a lot over the past year, and I now think some of these patterns may be catching valid mail, so I'd like to figure out how best to prune at least the ones that are no longer triggered or are triggered but don't cause the email to become spam. IOW, the message would be spam regardless of whether the rule fired. What is the best way to do this? An awk script on mail.log over the past few weeks? How can I wildcard the script with so many rules, and when they have random numbers at the end? I'm still surprised how many are hitting for things like "Acai Berry" or "PO Box 1845 | Ft. Worth | TX", for example. Thanks for any ideas. Alex
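As a rough, hedged starting point for the pruning -- assuming amavisd logs the hit rules in a "tests=..." list as in the other examples in this thread, and that the log is /var/log/mail.log -- this counts how often each generated rule has fired recently; rules that never show up are the obvious pruning candidates:

  grep -o 'LOCAL_RULE_[0-9]\{6\}' /var/log/mail.log | sort | uniq -c | sort -rn

Finding rules that fire only on mail that would have crossed the threshold anyway needs the per-message score as well, so that second pass is probably easier as a small perl script over the hits= and tests= fields than as a one-liner.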
Re: Spam troubleshooting
>> How effective are razor/pyzor and SPF/DKIM?
>
> Very effective, razor/pyzor together with DCC.
>
> SPF also helps a lot, although it should be implemented at the SMTP level and
> refuse all messages that cause a (hard) fail.
>
> While DKIM is currently in SA, the only place it currently applies is
> whitelisting, since it has scores of +/-0.001. Different scores were
> mentioned here, but not incorporated into SA scores yet.
>
>> I've always been a bit hesitant
>> to use any of those.
>
> Why?

Because how often do spammers have DNS entries with valid SPF or DKIM information? How often do spammers use compromised hosts with valid SPF or DKIM information? Will they help with emails that only contain a random URL and a line or two of text, like:

: Get your Nursing Degree here http://spamsite.com/

Or would that be DCC? Oftentimes these types of emails get through, apparently before the URL is listed in spamcop, SURBL, or URIBL_BLACK. Can I also ask where the best place is to start with implementing razor and/or pyzor in SA 3.2 on Linux with postfix? Thanks, Alex
Re: boosting PBL score suggestions
> when that was set a couple of years back, PBL had a few FPs -- the FP > rate has dropped greatly since then, going by recent ruleqa results. > go ahead and bump it up. I just checked many of my FPs that have RCVD_IN_PBL, and increasing the score there would sure help me too! Thanks for spotting that, Aaron. Best, Alex
whitelist_from questions
Hi all, Some time ago someone mentioned never to use whitelist_from but instead whitelist_from_rcvd. Where is whitelist_from_rcvd documented? It doesn't appear in the SA docs in the same place that whitelist_from is listed. I have been using whitelist_from forever and have probably a thousand entries. Given that it doesn't appear to be well documented, is it okay to do a one-to-one translation of my whitelist_from rules to whitelist_from_rcvd? Do these entries have to be in local.cf, or can I create a whitelist_from.cf file to place them in? Thanks, Alex
Re: whitelist_from questions
> It is documented on the Mail::SpamAssassin::Conf man page just like > whitelist_from. Ugh, thanks. > whitelist_from_rcvd a...@lists.sourceforge.net sourceforge.net > Use this to supplement the whitelist_from addresses with a check against the > Received headers. The first parameter is the > address to whitelist, and the second is a string to match the relay’s rDNS. Okay, so for example if I was going to whitelist j...@orbitz.com, the appropriate line would be: whitelist_from_rcvd j...@orbitz.com psmtp.com psmtp.com is the domain that controls mail for orbitz, according to the MX records. Thanks, Alex
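On the earlier file-layout question: SA reads every *.cf file in the site config directory, so keeping these in their own file should work fine; a minimal hedged sketch (path and file name are just the usual defaults, nothing required):

  # /etc/mail/spamassassin/whitelist_from_rcvd.cf
  whitelist_from_rcvd j...@orbitz.com             psmtp.com
  whitelist_from_rcvd a...@lists.sourceforge.net  sourceforge.net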
Re: sa-stats.pl and SpamAssassin 3.2.4
Hi, Are spamd and amavisd-new mutually exclusive? I'm also trying to use sa-stats.pl, and it is reporting zeros because I've just learned it relies on spamd, which I'm apparently not using. Here is the relevant log information from my mail.log:

Jul 22 00:01:24 mail02 amavis[30729]: (30729-266) SPAM, -> , Yes, hits=40.6 tag1=-300.0 tag2=5.0 kill=5.0 use_bayes=1 tests=BAYES_99, BODY_8BITS, BOTNET, FORGED_YAHOO_RCVD, FROM_ILLEGAL_CHARS, HEAD_ILLEGAL_CHARS, HTML_IMAGE_RATIO_02, HTML_MESSAGE, HTML_TAG_BALANCE_BODY, MIME_HTML_ONLY, MIME_HTML_ONLY_MULTI, MPART_ALT_DIFF, MSGID_RANDY, RCVD_DOUBLE_IP_LOOSE, RCVD_HELO_IP_MISMATCH, RCVD_IN_XBL, RCVD_NUMERIC_HELO, RDNS_NONE, REPTO_QUOTE_YAHOO, SUBJECT_NEEDS_ENCODING, SUBJ_ILLEGAL_CHARS, TVD_RCVD_IP, TVD_RCVD_IP4, quarantine spam-d55bdeb21a3775a8f250921df74e14d7-20090722-000123-30729-266 (spam-quarantine)

Jul 22 00:01:24 mail02 amavis[30729]: (30729-266) TIMING [total 785 ms] - SMTP EHLO: 1 (0%), SMTP pre-MAIL: 1 (0%), create email.txt: 0 (0%), SMTP pre-DATA-flush: 1 (0%), SMTP DATA: 80 (10%), body hash: 0 (0%), mime_decode: 6 (1%), get-file-type: 13 (2%), decompose_part: 1 (0%), parts: 0 (0%), AV-scan-1: 4 (0%), AV-scan-2: 6 (1%), SA msg read: 13 (2%), SA parse: 2 (0%), SA check: 519 (66%), write-header: 25 (3%), save-to-local-mailbox: 8 (1%), delete email.txt: 105 (13%), unlink-1-files: 0 (0%), rundown: 0 (0%)

Can sa-stats.pl be configured to parse this output? Other ideas? Thanks, Alex
Re: Lotto/Money & email address spam
> Please use pastebin. Yes, will do, thanks. >>It hit BAYES_99, but that's it. Are there any rules that pertain to >>'loan' or this type of mail that can somehow block these? > > FreeMail.pm and the SOUGHT_FRAUD rules. Some time ago you were speaking about the AOL tunome.com freemail domain, and that Dan was going to create an updated list. Any progress on that? I thought FreeMail was part of SA proper, but apparently not. Who maintains that, and how do I find it? I found the SOUGHT_FRAUD rules in jm's sandbox. Are those the proper ones to use? Are the testing ones safe? Thanks, Alex
Re: Lotto/Money & email address spam
Hi,

>> I found the SOUGHT_FRAUD rules in jm's sandbox. Are those the proper ones
>> to use? Are the testing ones safe?
>
> Subscribe your sa-update to the sought rules channel. The rulesets are
> regenerated too often for manual maintenance to be feasible.

Okay, I have configured sa-update to download the following rulesets:

70_sare_stocks.cf.sare.sa-update.dostech.net
70_sc_top200.cf.sare.sa-update.dostech.net
sought.rules.yerp.org
updates.spamassassin.org

Do people have a script that lints the rules, copies them to /etc/mail/spamassassin/, and restarts amavisd? SA should automatically pick up the new rules, correct? I'm somewhat concerned about there being some type of error and SA failing, or a typo in a rule that then catches all my mail as spam. Also, the system this is running on has a really old compiler and glibc that are incompatible with sa-compile. Can the rules be compiled on another system and migrated to the server where SA is running? An upgrade is planned for late in the year, but it's just too involved to do now :-( Thanks, Alex
Re: Spam troubleshooting
>> Can I also ask where the best place to start with to implement razor >> and/or pyzor in SA3.2 on Linux with postfix? > > EHM? implement it on your mailserver... Heh, no, I mean where can I go to learn how to implement it? Where's the docs? :-) I think I'm headed towards razor first, as it doesn't require python and appears to be simpler and more effective, even? Thanks, Alex
URL Block Lists
Hi, What is the preferred set of URL block lists that everyone uses? I'm currently using SURBL and a few others, but oftentimes there are URLs like 'learningbetter.net' that aren't tagged. We've set up our own internal URL block list that gets trained manually by inspecting email visually, until the URL is added to URIBL or SURBL, but I must be missing something, because lately there are far too many not being tagged. Thanks, Alex
Re: Lotto/Money & email address spam
>> I thought FreeMail was part of SA proper, but apparently not. Who >> maintains that, and how do I find it? > > You need three files: > http://sa.hege.li/FreeMail.pm > http://sa.hege.li/FreeMail.cf > http://sa.hege.li/freemail_domains.cf > > And it's also worthwhile to add the > 90_sare_freemail.cf.sare.sa-update.dostech.net channel to sa-updates To update my previous post, I've now also added the 90_sare_freemail channel. Wouldn't it be more efficient or effective to combine the two lists, 90_sare_freemail and freemail_domains? Thanks for putting up with my newbie questions. Best, Alex
Re: Lotto/Money & email address spam
Hi, >> Please don't paste examples to this list. >> >> Please post them to pastebin (or a similar service) and then include the >> link. .. Yes, understood. FWIW, I know enough to not post an entire message with headers to the list -- I'm sure half the time it would be filtered anyway. This time it was just a snippet, but in the future I'll post even those online, too. Thanks, Alex
Re: Lotto/Money & email address spam
Hi, > sa-update lint checks the rules in a sandbox, and does not update the > local channel, if there are any issues. Moreover, do NOT copy these > updates to your site config dir -- but keep it in the update dir where > sa-update puts them [1]. SA knows how to use them instead of the > "install-time" default conf. Okay, great. That is what I have now done. I actually have multiple mail servers, none of which have direct access to the Internet other than inbound SMTP, so I have sa-update running on another box, which creates a tarball, which is then scp'd to the mail servers and extracted. For me, this now means the sa-update channels are in /var/lib/spamassassin/3.0005/ and my local site-config is /etc/mail/spamassassin, where local.cf and init.pre reside. I also spent much of the day reading docs. I've worked with Linux now for many years, and have been involved with SA, just not to the level that I'm involved now. > It's a rather bizarre picture I'm sensing here. From your recent posts I > understand you are running a mail server for a large organization. Yet > there is this cannonade with rather basic questions... guenther, I knew you were a smart guy :-) Yes, there is a bigger picture; hopefully I get some cred for trying to tackle this on my own (with the help of others more experienced). Anyway, I'm trying to use sa-update to install the SOUGHT rules, and linting them shows this: [17021] warn: config: invalid regexp for rule __SEEK_AY2NNY: /This place is so exclusive, how did you get an invite\x{e2}\x{80}\x{a6} /: /This place is so exclusive, how did you get an invite\x{e2}\x{80}\x{a6} /: Can't use \x{} without 'use utf8' declaration I'm using perl-5.6.0; is that the cause? Thanks again, Alex
Re: whitelist_from questions
Hi, > Firstly, before you convert all these to whitelist_from_rcvd, perhaps you > ought to ask yourself whether you really need 1000 entries on your > whitelist. I'm surprised you were the first to make that very comment, so thanks. > Does mail from these addresses actually get miscategorised as > spam, or would SA get it right without the whitelist? Mail was being tagged as spam, and the organization became concerned that others would be tagged, so it seemed anytime there was a high-profile external business contact that they couldn't risk being tagged, they had it added to the whitelist. The list used to be much larger until we spent quite a while (months and months) going through it with them to prune it. I don't doubt that if we removed a substantial amount of them that SA would do what's right, but there doesn't seem to be any scientific way to do that successfully. > Secondly, don't forget about whitelist_from_spf. If a domain has an SPF > record, this is a better solution than whitelist_from_rcvd as it avoids the > need for *you* to work out which are the outgoing servers. Is there a way to script that for the 1000 or so entries, to see which have SPF records? > Lastly, if you do use whitelist_from_rcvd, remember that there may be > multiple outgoing servers for a given domain, and worse they may change over > time. Yeah, I thought of that too, so it doesn't sound like that's going to work well here. Thanks, Alex
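A rough, hedged sketch of the SPF check, assuming the whitelist_from lines live in the site config directory, one address per entry (wildcard entries like *@example.com still yield a usable domain; the grep only tests for the presence of a v=spf1 TXT record, not its contents):

  for d in $(grep -h '^whitelist_from ' /etc/mail/spamassassin/*.cf \
             | awk '{print $2}' | cut -d@ -f2 | sort -u); do
      if dig +short TXT "$d" | grep -qi 'v=spf1'; then
          echo "$d: SPF record found"
      else
          echo "$d: no SPF record"
      fi
  done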
Re: Low Scoring Lotto Spam
Hi, > * 3.0 RCVD_IN_UCEPROTECT2 RBL: Received via a relay in > * dnsbl-2.uceprotect.net > * [81.202.69.68 listed in dnsbl-2.uceprotect.net] > * 2.0 RCVD_IN_UCEPROTECT3 RBL: Received via a relay in > * dnsbl-3.uceprotect.net > * [81.202.69.68 listed in dnsbl-3.uceprotect.net] How successful have you been with the UCEPROTECT lists? Seems like a nice project. How come more people aren't using it? IOW, you seemed to be the only one of the four or five people that posted their output from this lotto spam. Why such a disparity in the rules that people use? Thanks, Alex
Re: whitelist_from questions
Hi, I'm looking at an email that appears to be from one of the users on the whitelist, but was actually from:

From probesqt...@segunitb1.freeserve.co.uk Mon Jul 27 19:49:19 2009

Why can't a comparison be made between the "From:" info and the actual sender? Is this because of virtual domains and/or users? Thanks, Alex
Upgrading perl modules for SA
Hi, I recently upgraded perl from 5.6.0 to perl-5.10.0, along with all the modules necessary for sa-3.2.5 and amavisd-new (an old version still). I'm now having a problem that I really don't understand:

Jul 30 14:24:30 bigship amavis[1757]: (01757-175) TROUBLE in check_mail: decoding2-get-file-types FAILED: 'file' utility (/usr/bin/file) failed, status=1 (256 ) at /usr/sbin/amavisd line 4019.
Jul 30 14:24:30 bigship amavis[1757]: (01757-175) PRESERVING EVIDENCE in /var/amavis/amavis-20090730T142430-01757

The amavisd children are running as a regular user. When I su to that user and run "/usr/bin/file" with the files listed above, it successfully returns the correct type of file. The lines in amavisd surrounding 4019 are:

  $file ne '' or die "Unix utility file(1) not available, but is needed";
  for my $part (@$partslist) {
    my($filename) = "$tempdir/parts/$part";
    my($filetype) = '';
    my($proc_fh) = run_command(undef, undef, $file, $filename);
    while( defined($_ = $proc_fh->getline) ) { $filetype .= $_ }
    my($err); $proc_fh->close or $err=$!;
    my($ret) = retcode($?);                                        <= 4019
    $ret==0 or die "'file' utility ($file) failed, status=$ret ($? $err)";
    chomp($filetype);
    my($taint) = substr($filetype,0,0);
    # remove file name
    $filetype = $1.$taint if $filetype=~/^.+?:[\t ](.*)$(?!\n)/s;
    section_time('get-file-type');
    local($_) = $filetype; my($ty);
    # try to classify some common types and give them short type name
    # _last_ match wins!

Running spamassassin --lint returns no errors or warnings. Amavis complains that I'm missing a few modules, like SPF, DKIM, and IO::Socket::SSL, but I don't think they're related, and I guess they weren't on there before when it was working fine. Thanks, Alex
Re: Upgrading perl modules for SA
Hi,

>> check_mail: decoding2-get-file-types FAILED: 'file' utility
>> (/usr/bin/file) failed, status=1 (256 ) at /usr/sbin/amavisd line
> How's this a SA question?

Yes, my apologies. I don't know enough about amavis yet, and thought it may be related to all the modules I upgraded, and not amavis itself. I've since reverted my changes back to perl-5.6.0, and am going to subscribe to that list too. I also upgraded Berkeley DB to db4 and have left db3, db2, and db1 on the system too. However, now I'm having a problem with bayes:

[10496] dbg: bayes: tie-ing to DB file R/O /home/sscan/.spamassassin/bayes_toks
[10496] dbg: bayes: tie-ing to DB file R/O /home/sscan/.spamassassin/bayes_seen
[10496] dbg: bayes: found bayes db version 0
[10496] warn: bayes: bayes db version 0 is not able to be used, aborting! at /usr/lib/perl5/site_perl/5.6.0/Mail/SpamAssassin/BayesStore/DBM.pm line 196.

I guess I don't understand the logic, because around line 196 is the following, which appears to say that if $self->_check_db_version doesn't equal zero, then fail, but we know it equals version zero from what is stated above...

  $self->{db_version} = ($self->get_storage_variables())[6];
  dbg("bayes: found bayes db version ".$self->{db_version});

  # If the DB version is one we don't understand, abort!
  if ($self->_check_db_version() != 0) {
    warn("bayes: bayes db version ".$self->{db_version}." is not able to be used, aborting!");
    $self->untie_db();
    return 0;
  }

Thanks, Alex
Bayes training
Hi, We have accumulated quite a large list of whitelisted users, primarily because they were previously tagged incorrectly. I've extracted a copy of all whitelisted mail into a separate mbox. Certainly there is some spam in there as well, but assuming I only learn the ham, would it make sense to train bayes using the emails from this folder? It's all business-related, but I'm concerned that it may have things in the email that caused it to be tagged in the first place, like excessive HTML, sent from a host with no reverse DNS, etc. -- all the reasons for it being whitelisted in the first place. Looking at the logs before the addresses were added to the whitelist, I see quite a few that were BAYES_99, probably because they resemble mailing lists, such as those from networkworld, for example. IOW, I wouldn't want to whitelist an email from networkworld.com, but one of the company's partners could send the company an email that had many of those characteristics. Someone may also send them a one-line email with a small GIF as an attachment, such as their corporate logo in their signature. This would be a valid email, but also very much resembles the characteristics of a typical spam. This is all being done to hopefully train bayes to better recognize corporate email, and hopefully cut down on the number of whitelisted senders that must be added in the future (or, corporate email that gets tagged then must be whitelisted). Ideas greatly appreciated. Thanks, Alex
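If you do go ahead with it, the hedged mechanics would be roughly the following (the mbox path is only an example); the main practical point is to run sa-learn as the same user whose Bayes DB amavisd/SA actually uses, otherwise you train a database nobody consults:

  sa-learn --ham --mbox /path/to/whitelisted-ham.mbox
  sa-learn --dump magic     # nham should rise by roughly the message count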
Upgrading bayes DB
Hi, I'm still working on my bayes training project, but also trying to upgrade the bayes DB due to upgrading perl and all the associated modules. I started with this output from "sa-learn --dump magic":

0.000          0          3          0  non-token data: bayes db version
0.000          0       1786          0  non-token data: nspam
0.000          0       3698          0  non-token data: nham
0.000          0     198349          0  non-token data: ntokens
0.000          0  929232460          0  non-token data: oldest atime
0.000          0 1249369370          0  non-token data: newest atime
0.000          0 1249369387          0  non-token data: last journal sync atime
0.000          0 1249342872          0  non-token data: last expiry atime
0.000          0          0          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count

After the upgrade (sa-learn --sync -D), it zeroed the nham and nspam. How could this happen? What could I have done wrong? This is after the upgrade:

0.000          0          3          0  non-token data: bayes db version
0.000          0          0          0  non-token data: nspam
0.000          0          0          0  non-token data: nham
0.000          0          0          0  non-token data: ntokens
0.000          0 1249438016          0  non-token data: oldest atime
0.000          0 1249438016          0  non-token data: newest atime
0.000          0 1249438016          0  non-token data: last journal sync atime
0.000          0 1249438016          0  non-token data: last expiry atime
0.000          0          0          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count

It seemed to indicate that it was upgrading from db version 0 to db version 2, then db version 3, although the first sa-learn output shows that it was already version 3. Thanks, Alex
RelayCountry Config
Hi, I'm trying to configure RelayCountry. I have it installed, and SA recognizes it:

# spamassassin --lint -D 2>&1 | grep -i country
[4278] dbg: diag: module installed: IP::Country::Fast, version 604.001
[4278] dbg: plugin: loading Mail::SpamAssassin::Plugin::RelayCountry from @INC
[4278] dbg: plugin: Mail::SpamAssassin::Plugin::RelayCountry=HASH(0x8fb9648) implements 'extract_metadata', priority 0
[4278] dbg: plugin: Mail::SpamAssassin::Plugin::RelayCountry=HASH(0x8fb9648) implements 'parsed_metadata', priority 0

I've loaded the plugin, and added "add_header" according to the wiki page:

add_header all Relay-Country _RELAYCOUNTRY_
loadplugin Mail::SpamAssassin::Plugin::RelayCountry

I can create rules for each country I'd like to identify, and that successfully adds them to the header:

header RELAYCOUNTRY_RU X-Relay-Countries =~ /RU/
describe RELAYCOUNTRY_RU Relayed through Russian Federation
score RELAYCOUNTRY_RU 2.0

I was hoping to also have the X-Spam-Countries header added, but that doesn't seem to work. I'm using v3.2.5, so it has the RelayCountries.pm patch to add that support. What am I missing? Somewhat of a basic question, but once I do manage to get that header working, I know I can parse it and make decisions based on it. Are there any pre-written perl routines or utilities that can make that information useful? Also, I believe I read it adds bayes metadata to the email. Is that just through the additional headers, or is it supposed to add something else? Thanks, Alex
Re: RelayCountry Config
Hi,

> I don't know if it makes a difference, but I call it Relay-Countries to
> match the name of the pseudo-header used in the tests
>
> add_header all Relay-Countries _RELAYCOUNTRY_

It doesn't appear to make a difference. I must be doing something else wrong. Using "spamassassin --lint -D 2>&1 | less" shows the X-Relay-Countries header, but it's null:

# spamassassin --lint -D 2>&1 | egrep -i 'relay|country|countries'
[23760] dbg: diag: module installed: IP::Country::Fast, version 604.001
[23760] dbg: config: read file /etc/mail/spamassassin/70_relay_country.cf
[23760] dbg: plugin: loading Mail::SpamAssassin::Plugin::RelayCountry from @INC
[23760] dbg: plugin: loading Mail::SpamAssassin::Plugin::RelayEval from @INC
[23760] dbg: Botnet: adding (\b|\d)relay(\b|\d) to botnet_serverwords
[23760] dbg: Botnet: adding (\b|\d)relay(\b|\d) to botnet_serverwords
[23760] dbg: metadata: X-Spam-Relays-Trusted:
[23760] dbg: metadata: X-Spam-Relays-Untrusted:
[23760] dbg: metadata: X-Spam-Relays-Internal:
[23760] dbg: metadata: X-Spam-Relays-External:
[23760] dbg: plugin: Mail::SpamAssassin::Plugin::RelayCountry=HASH(0x8fb9698) implements 'extract_metadata', priority 0
[23760] dbg: metadata: X-Relay-Countries:
[23760] dbg: plugin: Mail::SpamAssassin::Plugin::RelayCountry=HASH(0x8fb9698) implements 'parsed_metadata', priority 0
[23760] dbg: rules: ran eval rule NO_RELAYS ==> got hit (1)
[23760] dbg: Botnet: no trusted relays
[23760] dbg: check: tests=MISSING_DATE,MISSING_HEADERS,MISSING_SUBJECT,NO_RECEIVED,NO_RELAYS,RELAYCOUNTRY_LOW

I've added your rules in 70_relay_country.cf, and they trigger in the "tests=", but the header isn't added. I've added the "add_header" in init.pre, above the loadplugin line, as well as adding it in local.cf when it didn't work in init.pre. I've also checked email that has actually been tagged by these rules, and not just from a "-D" run, and it's not there either. Thanks again, Alex
Anti-Phishing and Spear-Phishing Version 2
Hi, Has anyone tried the phishing rules generated by Julian Field and developed by Google? It looks really neat: http://www.jules.fm/Logbook/files/anti-phishing-v2.html It's basically a list of 3.5k email addresses found in email thought to be spam. Looks to be developed by Google, so it's "safe?" Thanks, Alex
Re: RelayCountry Config
Hi,

>> [23760] dbg: metadata: X-Relay-Countries:
>
> The --lint test is *NOT* valid for this. --lint is *ONLY* to verify your
> config files are parseable.

Yes, thanks, I should have known that, and I think I did. I mentioned in the previous post that I tried it with a real message, and even viewed a number already in quarantine, with the same result. I found this message on nabble:

http://www.nabble.com/Question-about-RelayCountry-td18309349.html#a18339974

Same problem, back in '08, with no resolution. I even downgraded to the IP::Country::Fast released in Jan 09, and no difference. Could this be a problem with one of the modules, or is this most likely a configuration issue? What I don't understand is that it knows which country it's relayed through, because it prints the rules in the "tests=" section:

X-Spam-Status: Yes, hits=21.8 tag1=-300.0 tag2=4.9 kill=4.9 use_bayes=1 tests=BAYES_50, BODY_ENHANCEMENT, BOTNET, FH_HELO_EQ_D_D_D_D, RDNS_NONE, RELAYCOUNTRY_UK, SARE_ADULT2, SARE_RECV_IP_FROMIP3, URIBL_AB_SURBL, URIBL_BLACK, []

Curiously, why doesn't it print each of them in a column with a description, instead of all together? Thanks, Alex
Re: RelayCountry Config
Hi, > This is also why the plugin works and you do get the per-country rule > hits, but don't get the SA Relay-Countries header. Yes, you are correct. Thanks for the lead and the explanation. Here's a thread that talks about how to add the header for amavisd: http://www.mail-archive.com/amavis-u...@lists.sourceforge.net/msg12416.html I'm not sure it's really necessary after all, though, because the rules work without it, and it still doesn't print the header in quarantined mail. > char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4"; > main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i (c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}} How did you get line noise from your modem to look so much like perl code? :-) Thanks, Alex
Re: RelayCountry Config
Hi, > I find ordinary header and meta rules are all I need: > > http://pastebin.com/f5e5232d1 Among those rules you have: meta RELAYCOUNTRY_MED ! RELAYCOUNTRY_HIGH && ( __RELAYCOUNTRY_AF || __RELAYCOUNTRY_AS || __RELAYCOUNTRY_EU_S || __RELAYCOUNTRY_OC_S || __RELAYCOUNTRY_AM_S ) It's probably hard to read, but doesn't this exclude the US? RELAYCOUNTRY_AM_S are all the Americas except US and CA. If I understand correctly, this says NOT RELAYCOUNTRY_HIGH and all countries except US and CA, which means that RELAYCOUNTRY_MED would trigger on all US and CA relays. Thanks, Alex
Scores, razor, and other questions
Hi, After another day of hacking, I have a handful of general questions that I hoped you could help me to answer. - How can I find the score of a particular rule, without having to use grep? I'm concerned that I might find it at some score, only for it to be redefined somewhere else that I didn't catch. Something I can do from the command-line? - How do I find out what servers razor is using? What is the current license now that it's hosted on sf, or are the query servers not also running there? It doesn't list any restrictions on the web site. - The large majority of the spam that I receive these days is a result of a URL not being listed in one of the SBLs. I'm using SURBL, URIBL, and spamcop. For example, I caught guadelumbouis.com several hours ago, and it's still not listed in any of the SBLs. Am I doing something wrong or am I missing an SBL? Has anyone else's spam with URLs increased a lot lately? Thanks, Alex
Elusive spam
Hi, I'm having trouble catching a particular type of spam, and hoped someone had some time to take a look: http://pastebin.com/d57336542 It doesn't match RAZOR2, or any of the URI lists, and it's only BAYES_50. I have a pretty well-established BAYES db, so I'm surprised it's only BAYES_50. What can I do to block spam like this in the future? Thanks, Alex
Re: Elusive spam
Hi, >> Maybe this will sound dumb but wouldn't it be perfectly >> safe to blacklist "example.com" after all, that isn't a >> domain your ever going to get mail from. > > I could be wrong, but I'm guessing the example.com is the OP's munging. Yes, that's correct. My apologies. Best, Alex
Re: Elusive spam
Hi, > Are we to make guesses on what else might be munged? > Is just example.com munged or the 172.0.0.1 also munged? Just the domain was munged. Thanks for the info. I should have been able to figure that out. Thanks, Alex
Re: Elusive spam
Hi,

> it hits spamhaus, and spamcop, what more do you want ?
>
> meta haus_cop (spamhaus && spamcop)
> score haus_cop 5

X-Spam-Status: No, hits=4.8 tagged_above=-300.0 required=5.0 use_bayes=1 tests=BAYES_50, DATE_IN_PAST_03_06, RCVD_IN_BL_SPAMCOP_NET, RCVD_IN_SORBS_WEB, RCVD_IN_XBL, RELAYCOUNTRY_US, URI_HEX

50_scores.cf:score RCVD_IN_BL_SPAMCOP_NET 0 2.188 0 1.960 # n=0 n=2
50_scores.cf:score RCVD_IN_XBL 0 2.896 0 3.033 # n=0 n=2
70_relay_country.cf:score RELAYCOUNTRY_US 0.1
50_scores.cf:score RCVD_IN_SORBS_WEB 0 1.117 0 0.619 # n=0 n=2
50_scores.cf:score BAYES_50 0 0 0.001 0.001
50_scores.cf:score URI_HEX 1.777 1.316 1.395 0.368
50_scores.cf:score DATE_IN_PAST_03_06 2.299 1.394 1.306 0.044

Something doesn't seem right. Am I adding them wrong? It sure seems to equal more than 5.0. Is it possible the rules are being scored differently in another location? The meta rule is a good one. I'll create that now. Thanks, Alex
Re: Elusive spam
Hi,

> 50_scores.cf:score RCVD_IN_BL_SPAMCOP_NET 0 2.188 0 1.960 # n=0 n=2
> 50_scores.cf:score RCVD_IN_XBL 0 2.896 0 3.033 # n=0 n=2
> 70_relay_country.cf:score RELAYCOUNTRY_US 0.1
> 50_scores.cf:score RCVD_IN_SORBS_WEB 0 1.117 0 0.619 # n=0 n=2
> 50_scores.cf:score BAYES_50 0 0 0.001 0.001
> 50_scores.cf:score URI_HEX 1.777 1.316 1.395 0.368
> 50_scores.cf:score DATE_IN_PAST_03_06 2.299 1.394 1.306 0.044
>
> Something doesn't seem right. Am I adding them wrong? It sure seems to
> equal more than 5.0. Is it possible the rules are being scored
> differently in another location?

It does look like the XBL scores may have been modified in another config file by a previous admin, ugh. Thanks, now I know. Thanks, Alex
Post trips pastebin spam filter
Hi, I have another spam message that is very elusive, and thought someone might be able to take a look. I tried to post it to pastebin, and its spam filter apparently catches it, and prevents me from posting. It's definitely in the header. Is there something else I can do to post it, or does someone know how their spam filter works? I tried even obfuscating the spam URLs, but it still catches it. The spam has BAYES_99, and is also DKIM signed and verified, and passes SPF, and despite having "Congratulations!", "Wal-Mart" and several URLs in the body, it's not caught. Thanks, Alex
Re: Barracuda RBL in first place
Hi,

> Unknown user                    32.00% (32.00%)  87427696
> Greylisted                      24.88% (16.92%)  46225401
> Throttled                       11.03% (5.64%)   15399444
> Relay access denied              0.01% (0.00%)       7034
> Bogus DNS (Broadcast)            0.01% (0.00%)      11692
> Bogus DNS (RFC 1918 space)       0.07% (0.03%)      82135
> Spoofed Address                  0.26% (0.12%)     319551
> Unclassified Event               0.77% (0.35%)     949388
> Temporary Local Problem          0.01% (0.00%)       8165
> Require FQDN sender address      0.04% (0.02%)      51022
> Require FQDN for HELO hostname   8.97% (4.02%)   10988455
[...]

Can I ask how you produced those stats? They look very helpful. Thanks, Alex
Re: Barracuda RBL in first place
Hi, >> What log script do you good people use to generate the list above ? Is it >> a home brew or one we can download so we can compare our own hits ? > > http://www.rulesemporium.com/programs/sa-stats.txt Any chance someone knows where there is a compatible one that parses amavisd instead of spamd? I've tried, but guess I don't know enough perl to get it right. Any chance someone has a bit of time to hack on it on this lazy Saturday afternoon? :-) Thanks, Alex
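As a quick, hedged stopgap until sa-stats.pl is adapted properly -- assuming the amavisd lines carry a "tests=..." list as in the examples earlier in this thread, and that rule names are all upper case -- a pipeline like this gives a per-rule hit count (no scores, just counts):

  grep -o 'tests=[A-Z0-9_, ]*' /var/log/mail.log \
    | sed 's/^tests=//' | tr -d ' ' | tr ',' '\n' \
    | grep -v '^$' | sort | uniq -c | sort -rn | head -40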
Counting RAZOR2 hits
Hi, I thought "grep -c RAZOR2_CHECK" through my mail logs would give me a good approximation of the number of times RAZOR2 was consulted, but that doesn't seem to be the case. There are some mails that don't have it listed in the "tests=" section. I've also tried the razor-* commands, and they don't appear to be able to help here either. What am I missing? Does RAZOR2_CHECK mean that it was found in the RAZOR2 db, or that it merely consulted the db? Thanks, Alex
Re: Barracuda RBL in first place
Hi, > So perhaps instead of adding another RBL, maybe some admins need to > consider adding in some HELO checking / rejection. Can you explain a bit more here? What are you checking for, that the host is valid? Thanks, Alex
Re: Counting RAZOR2 hits
Hi,

> You can also set your min_cf in your razor config files, which will
> affect when the RAZOR2_CHECK rule fires. This does work in SpamAssassin,
> as I have overridden the min_cf on my own system, and have done so for
> years.

Thanks to everyone for their great ideas thus far. I'm looking forward to working through it to learn more. I'm seeing a lot of FNs that include various RAZOR rules but still don't have enough points to be tipped over the threshold. Are there meta rules that people have created and can share that might help? How about combining it with BOTNET? The ones that have BAYES_99 and most of the SURBLs and RAZOR* are all properly tagged already, but many only have BAYES_50. Some have only RAZOR2_CHECK and contain an inline image.

X-Spam-Status: No, hits=4.1 tagged_above=-300.0 required=5.0 use_bayes=1 tests=BAYES_50, HTML_MESSAGE, RAZOR2_CF_RANGE_51_100, RAZOR2_CF_RANGE_E8_51_100, RAZOR2_CHECK, RDNS_NONE, RELAYCOUNTRY_US, SPF_HELO_PASS, SPF_PASS

score RAZOR2_CHECK 0 0.9 0 0.9
score RAZOR2_CF_RANGE_51_100 0 0.8 0 0.8
score RAZOR2_CF_RANGE_E4_51_100 0 1.8 0 1.8
score RAZOR2_CF_RANGE_E8_51_100 0 1.5 0 1.5

I see now that RAZOR2_CF_RANGE_E8 should also be at least 1.8, which I've now changed. Does everyone do their own mass-checks these days? How do you go about analyzing the FNs to figure out why they aren't caught, and then adjust the scores? Of course they need to be looked at individually for additional patterns, but how are the scores of the triggered rules best "personalized"? Thanks, Alex
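One hedged possibility along the lines asked about above: metas that add a little when the Razor confidence rules coincide with BOTNET or with other weak-but-suspicious signs. The LOCAL_* names are made up and the scores are guesses to be mass-checked, or at least watched for FPs, before trusting them:

  meta     LOCAL_RAZOR_BOTNET  (RAZOR2_CHECK && RAZOR2_CF_RANGE_51_100 && BOTNET)
  score    LOCAL_RAZOR_BOTNET  1.5
  describe LOCAL_RAZOR_BOTNET  Razor-listed with decent confidence plus a Botnet hit

  meta     LOCAL_RAZOR_NORDNS  (RAZOR2_CHECK && RDNS_NONE && !BAYES_00)
  score    LOCAL_RAZOR_NORDNS  1.0
  describe LOCAL_RAZOR_NORDNS  Razor-listed, no reverse DNS, and Bayes not clearly hammy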
Re: sa-update: stuck at 795855?
Hi, > The problem is that the spammers test with the SA rulesets as soon > as they are released, which is why the rulesets become ineffective. I'm not sure I agree with that. If this were the case, I would have a lot less spam with scores of 50 or more, which obviously aren't even trying to do something as easy as pass it through SA first. Also, couldn't we then draw conclusions from this that, since vendors like Symantec have rules which never are seen by spammers, that their rules are better? Incidentally, are there technologies that vendors like Symantec, Proofpoint, Cisco, Google, etc, use that we don't have or don't have access to? Thanks, Alex
Re: Assistence needed with spamassasin under RedHat 5.2
Hi, > spamassasin. I have a test message which is genuine. Running this through > spamassasin with -t (test) mode as described below gives the output below: > > Running : spamassassin -t /tmp/rose2 gives at the bottom the following > (edited for privacy) report. Try adding some debugging output, and first look for something obviously wrong: # spamassassin -D -t /tmp/rose2 2>&1 | less Go line-by-line looking for something that stands out as obviously wrong. Consider obfuscating your message, replacing your domain with "example.com", for instance, and uploading it to pastebin.com. Then post a link here so we can all view the message for further ideas. Regards, Alex
Re: gpgkey failures with sa-update
Hi, > list. No errors reported then, and I've now forgotten the url. www.yerp.org > now gets me a webmail login screen, so obviously that wasn't it. Toss that > url to me and I'll replay it again. You should be able to search through your browser history, no? With Firefox v3.5, you can also just type "yerp" in the location bar, and it will do a more aggressive search through your previous URLs for anything containing those letters. Regards, Alex
Re: spam mail with flagged style images
Hi,

> Text added to e-mail is a bogus one, never repeated, same as the old styled
> spam mail with attached images. The OCR doesn't detect nothing, I understand
> because of flagged effect. Also, image file name changes, if it have.

A few of these have slipped through on my systems, but for the most part, these rules have worked here:

mimeheader AS_090505_CDIS_INLINE Content-Disposition =~ /inline/
score AS_090505_CDIS_INLINE 0.5
describe AS_090505_CDIS_INLINE Rule by AS: Content-Disposition: inline

mimeheader AS_090508_CTYP_PNG Content-Type =~ /image\/png/
score AS_090508_CTYP_PNG 0.5
describe AS_090508_CTYP_PNG Rule by AS: Content-Type: PNG

mimeheader AS_090508_CTYP_JPG Content-Type =~ /image\/jpg/
score AS_090508_CTYP_JPG 0.5
describe AS_090508_CTYP_JPG Rule by AS: Content-Type: JPG

mimeheader AS_090508_CTYP_JPEG Content-Type =~ /image\/jpeg/
score AS_090508_CTYP_JPEG 0.5
describe AS_090508_CTYP_JPEG Rule by AS: Content-Type: JPEG

meta AS_090508_PNGSPAM (AS_090505_CDIS_INLINE && AS_090508_CTYP_PNG)
score AS_090508_PNGSPAM 0.5
describe AS_090508_PNGSPAM Rule by AS: Probably an Inline PNG spam

meta AS_090508_JPGSPAM (AS_090505_CDIS_INLINE && AS_090508_CTYP_JPG)
score AS_090508_JPGSPAM 0.5
describe AS_090508_JPGSPAM Rule by AS: Probably an Inline JPEG spam

meta AS_090508_JPEGSPAM (AS_090505_CDIS_INLINE && AS_090508_CTYP_JPEG)
score AS_090508_JPEGSPAM 0.5
describe AS_090508_JPEGSPAM Rule by AS: Probably an Inline JPEG spam

meta LOCAL_BOTNET_JPG (BOTNET && AS_090508_JPGSPAM)
score LOCAL_BOTNET_JPG 1.5
describe LOCAL_BOTNET_JPG Rule by AS: Probably an Inline JPEG spam

meta LOCAL_BOTNET_JPEG (BOTNET && AS_090508_JPEGSPAM)
score LOCAL_BOTNET_JPEG 1.5
describe LOCAL_BOTNET_JPEG Rule by AS: Probably an Inline JPEG spam

The LOCAL_* are mine, adapted from others I found some time ago. I'd be interested in people's input on these. Can they be simplified? Do you agree with the scoring? How about bayes poisoning? The messages also all have random text, mostly spelled correctly, but nonsensical. If they are trained, could it adversely affect my bayes db? Thanks, Alex
Junkmailfilter rules
Hi, I've been using the junkmailfilter rules for a few days now, and they're doing quite well. It occurred to me that I might be able to use the RCVD_IN_JMF_W rule to filter whitelisted-domain mail, and use that to train bayes ham. Would this work? There of course would be mail from constantcontact.com, mailing list mail, "newsletters", etc., that all contain a lot of HTML and other components that could equally be seen in spam. How do people typically train bayes ham? I can't rely on my users not to mix up spam and ham, surely corrupting the database. I did find this in one of the emails, passed through delivery.net:

X-Spam-Status: No, hits=4.9 tagged_above=-300.0 required=5.0 use_bayes=1 tests=BAYES_50, BOTNET, DKIM_SIGNED, DKIM_VERIFIED, HTML_MESSAGE, RAZOR2_CF_RANGE_51_100, RAZOR2_CF_RANGE_E4_51_100, RAZOR2_CHECK, RCVD_IN_JMF_W, RELAYCOUNTRY_US, SPF_HELO_PASS, SPF_PASS

It was a citibank credit card email. How could it be in RAZOR and also whitelisted, and hit BOTNET? Certainly there were no domains in there that it was relayed through that were part of a botnet. Ideas greatly appreciated. Thanks, Alex
Re: spam mail with flagged style images
Hi,

>> mimeheader AS_090508_CTYP_PNG Content-Type =~ /image\/png/
>> mimeheader AS_090508_CTYP_JPG Content-Type =~ /image\/jpg/
>> mimeheader AS_090508_CTYP_JPEG Content-Type =~ /image\/jpeg/
>
> All scored the same. Can be written as a single rule.

I've spent some time and tried to refine my rules based on your advice, guenther. Can I ask you to check them over again and see if this is any better, or at least more inclusive?

mimeheader LOC_CDIS_INLINE Content-Disposition =~ /inline/
score LOC_CDIS_INLINE 0.1
describe LOC_CDIS_INLINE Content-Disposition: inline

mimeheader LOC_CTYP_IMG ((Content-Type =~ /image\/png/) || (Content-Type =~ /image\/jpg/) || (Content-Type =~ /image\/jpeg/) || (Content-Type =~ /^application\/octet-stream.\.rtf/))
score LOC_CTYP_IMG 0.1
describe LOC_CTYP_IMG Content-Type: PNG-JPG-JPEG-RTF

meta LOC_IMGSPAM (LOC_CDIS_INLINE && LOC_CTYP_IMG)
score LOC_IMGSPAM 0.1
describe LOC_IMGSPAM Probably inline image

meta LOC_BOTNET_IMG ((BOTNET && LOC_IMGSPAM) || (BAYES_99 && LOC_IMGSPAM))
score LOC_BOTNET_IMG 1.5
describe LOC_BOTNET_IMG Probably inline image spam

> Generally, no. A spam advertising body part enhancers also has
> correctly spelled words. Training them doesn't "poison" Bayes either.
> And there usually are still useful tokens around.

That's great, thanks! Thanks, Alex
Re: spam mail with flagged style images
Hi,

> mimeheader LOC_CTYP_IMG ((Content-Type =~ /image\/png/) ||
> (Content-Type =~ /image\/jpg/) || (Content-Type =~ /image\/jpeg/) ||

I thought this passed my --lint, but I only caught it the second time. I was looking around for the (new) right way to do it, and found this in 80_additional.cf:

mimeheader __ANY_IMAGE_ATTACH Content-Type =~ /image\/(?:gif|jpeg|png)/

Now I know. Does the rest look like it will work as expected? Thanks, Alex
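Putting the pieces of this thread together, a hedged sketch of the simplified set -- a single-regex mimeheader in the style of the stock __ANY_IMAGE_ATTACH, the RTF case kept separate since it isn't an image type, and names and scores that are only suggestions:

  mimeheader __LOC_CDIS_INLINE  Content-Disposition =~ /inline/
  mimeheader __LOC_CTYP_IMG     Content-Type =~ /image\/(?:png|jpe?g|gif)/i
  mimeheader __LOC_CTYP_RTF     Content-Type =~ /^application\/octet-stream.*\.rtf/i

  meta     LOC_IMGSPAM     (__LOC_CDIS_INLINE && (__LOC_CTYP_IMG || __LOC_CTYP_RTF))
  score    LOC_IMGSPAM     0.1
  describe LOC_IMGSPAM     Inline image or RTF attachment

  meta     LOC_BOTNET_IMG  (LOC_IMGSPAM && (BOTNET || BAYES_99))
  score    LOC_BOTNET_IMG  1.5
  describe LOC_BOTNET_IMG  Inline image plus Botnet or a very spammy Bayes score

The double-underscore prefix keeps the subrules unscored, so only the metas add points.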
Re: lottery message scored hammy by bayes
Hi, > If you're using autolearning, what are your learning thresholds? What do you recommend for thresholds? I'm considering using autolearning, but very concerned about corrupting the database. I think I would use something like +15 for spam. There are FNs on occasion in the 2.x range with low bayes numbers (or BAYES_50) that I wouldn't want to be tagged as ham. Should that be a concern? Even mail that has been whitelisted could also contain spam, so would a ham threshold of like -100 work, or present the same problem? Thanks, Alex
Training spam as ham and forwarding
Hi SA users, I have a few messages found in the quarantine that I need to train as ham because they were marked as spam incorrectly. To do this, I added the following to the top of the file so it becomes a normal email: From DUMMY-LINE Thu Jan 1 00:00:00 1970 Is this correct? (without the leading spaces) I can now accurately access and index it using pine, whereas before it didn't acknowledge it as a normal email. I'd also now like to forward it to the intended recipient as an attachment, but the recipient isn't able to read it as a normal email, but instead as plain text. How can I accomplish this? Are there mail tools, like procmail or formail, I believe, that were designed to automate this? Does anyone request ham from their users to be trained by bayes, or is autolearning typically the only way (or only real effective way) to do this? Also, on another note, how can I have all email destined for a particular user sent to them, including spam? This is what all_spam_to is for, correct? Thanks, Alex
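On the training half of this: sa-learn reads individual RFC-822 message files directly (that is its default input format), so the added "From " separator is only needed when you want to open the file in pine or feed a whole mbox. A hedged sketch, with the paths as examples only:

  # a single quarantined message, learned as ham
  sa-learn --ham /var/quarantine/some-quarantined-message

  # or a whole mbox of released messages
  sa-learn --ham --mbox /path/to/released-ham.mbox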
Google/Yahoo Spam
Hi all, I'm seeing an increase in Google Reader and Yahoo groups/personals/profile spam. Here's an example of the Google Reader spam:

http://pastebin.com/m1021fc5f

Any ideas on how to catch this one? For the Yahoo spam (with links to yahoo sites ending in '/1'), I've created these:

uri LOC_YAHOO1 m{http://groups\.yahoo\.com\/}i
score LOC_YAHOO1 0 1.5 0 1.5
describe LOC_YAHOO1 Contains groups.yahoo.com uri

uri LOC_YAHOO2 m{http://profile\.yahoo\.com\/}i
score LOC_YAHOO2 0 1.5 0 1.5
describe LOC_YAHOO2 Raw body contains profile.yahoo

uri LOC_YAHOO3 m{http://personals\.yahoo\.com\/}i
score LOC_YAHOO3 0 1.5 0 1.5
describe LOC_YAHOO3 Raw body contains personals.yahoo

They're somewhat pared down because I'm not very good at pattern matching, so I thought someone could improve on this? Thanks, Alex
Converting spam to email message
Hi all, I thought I understood, but I'm still having trouble converting a message in the quarantine back into a normal email message that I can forward on to a recipient. Does anyone know how to do this? Thanks so much. Best regards, Alex
Re: Converting spam to email message
Hi, >> I thought I understood, but I'm still having trouble converting a >> message in the quarantine back into a normal email message that I can >> forward on to a recipient. Does anyone know how to do this? > > Maybe I missed something, but SpamAssassin doesn't have a quarantine. > > http://wiki.apache.org/spamassassin/SpamQuarantine Yes, my apologies. I guess it would then be amavisd-new that's managing the quarantine. I didn't realize that amavisd manipulated the mail in that way. Hopefully someone can still help. Thanks, Alex
Re: Porn-portal spammers
Hi,

> I am getting rather tired of messages spamming porn-portals. They typically
> originate from hotmail.com, and advertise a porn-portal based on
> google.com/groups, google.com/reader, groups.yahoo.com, pipes.yahoo.com,
> spaces.live.com, docs.google.com, sites.google.com and livejournal.com.

This was posted by Martin a week or so ago in response to a similar question by me. This should catch your set and more:

uri LOC_YAHOO /^http:.{1,40}\.yahoo[.,]com/i
score LOC_YAHOO 0 1.5 0 1.5
describe LOC_YAHOO Contains *.yahoo.com uri

Or, if you want to be more specific, try this:

uri LOC_YAHOO /^http:\/\/(groups|profile|personals)\.yahoo[.,]com/i
score LOC_YAHOO 0 1.5 0 1.5
describe LOC_YAHOO Contains yahoo.com groups/profile/personals uri

Does this help? Best regards, Alex
Re: 3.3.0 alpha 2 on production mail servers / clusers ???
Hi, > On Saturday August 29 2009 19:47:32 R-Elists wrote: >> have many, or any of you folks on the list migrated your production servers >> to the 3.3.0 alpha 2 or later release? > > We are certainly one of them (actually running CVS head, > which is pretty close to alpha2). About 1000 users here. Is there a timeline yet for the next alpha and/or the final production release? How about dependencies? Will perl-5.8 work okay? What modules will need to be updated? How about use with amavis? Will I need to upgrade that? A list of the top five best new features would also be great! *salivates* :-) I'm trying to anticipate what I can do ahead of time to get it into place as soon as possible. Thanks, Alex
Shortcircuit info
Hi all, I'm trying to understand how shortcircuit works to ease some of the load on the servers. First, does anyone have any recommended metas that they use in their environment that might help? Can I add shortcircuit to an existing rule, or does the rule have to be designed to be used with shortcircuit? In other words, I have a meta that combines spamcop with spamhaus: meta META_HAUS_COP (RCVD_IN_BL_SPAMCOP_NET && RCVD_IN_XBL) describe META_HAUS_COP Contains SPAMHAUS XBL and SPAMCOP score META_HAUS_COP 0 4.0 0 4.0 shortcircuit META_HAUS_COP spam In order for it to actually be shortcircuited, however, I have to make the score 100, correct? Thanks, Alex
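For what it's worth, here is a minimal sketch of how a shortcircuited meta is usually wired up, using the rule names from the post; the loadplugin and priority lines are the pieces that are easy to miss. When a rule shortcircuits as 'spam', the message is scored at shortcircuit_spam_score (default 100), so the rule's own score does not need to be 100:

loadplugin Mail::SpamAssassin::Plugin::Shortcircuit

meta         META_HAUS_COP (RCVD_IN_BL_SPAMCOP_NET && RCVD_IN_XBL)
describe     META_HAUS_COP Contains SPAMHAUS XBL and SPAMCOP
score        META_HAUS_COP 4.0
# negative priority so the rule is evaluated early
priority     META_HAUS_COP -500
shortcircuit META_HAUS_COP spam
# the score assigned when a spam shortcircuit fires
shortcircuit_spam_score 100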
URL rule creation question
Hi all, I've seen this pattern in spam quite a bit lately: href="http://doubleheaderover.com/jazert/html/?39.6d.3d.31.66.67.6b.79.77.63.77.63.65.6e.74.69.6e.6e.69.61.6c.5f.68.31.33.33.2e.6f.39.39.41.4d.2e.30.30.45.33.39.2e.30.32.30.61.64.6b.37.61.76.61.67.63.31.66.62.2e.6a.61.7a.65.72.74.2e.68.74.6d.6c3az8fO" Would it be reasonable to create a rule that looks for this two-chars-then-dot pattern, or is it likely to appear in legitimate email too frequently? If possible, how would you create a rule to capture this? Thanks, Alex
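In case it helps, a hedged sketch of a uri rule for that repeated two-hex-digits-then-dot query string; the rule name is made up here, and the repetition count and score would need testing against ham before being trusted:

uri      LOC_HEXDOT_URI m{\?(?:[0-9a-f]{2}\.){10,}}i
describe LOC_HEXDOT_URI URL query string built from many dot-separated hex pairs
score    LOC_HEXDOT_URI 1.5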
JMF whitelist and RAZOR conflict
Hi, I have several emails that are tagged with RCVD_IN_JMF_W, SPF_SOFTFAIL, and RAZOR2_CHECK such as this one: http://pastebin.com/m4a4d990e Is the criterion for being listed on the JMF_W simply that it contains a domain that is whitelisted, regardless of whether it contains another URL that is blacklisted? Would I be advised to make the JMF_W score very low, or create a meta that doesn't really whitelist it unless it isn't also blacklisted? meta META_NOT_JMF_RAZOR (RCVD_IN_JMF_W && !RAZOR2_CHECK) It also appears to spoof the kraftfoods.com mail server, correct? Is there a possible rule to be created here? Thanks, Alex
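For completeness, a sketch of how that meta could be given a score and extended so the whitelist only counts when no URI blacklist hit is present either; the rule name and scores are illustrative:

meta     META_JMF_W_CLEAN (RCVD_IN_JMF_W && !RAZOR2_CHECK && !URIBL_BLACK)
describe META_JMF_W_CLEAN JMF whitelisted and not flagged by Razor2 or URIBL_BLACK
score    META_JMF_W_CLEAN -1.0
# keep the raw whitelist rule itself close to neutral
score    RCVD_IN_JMF_W -0.1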
Re: JMF whitelist and RAZOR conflict
Hi, >> http://pastebin.com/m4a4d990e >> >> Is the criterion for being listed on the JMF_W simply that it contains >> a domain that is whitelisted, regardless of whether it contains another URL >> that is blacklisted? > > I'm not sure what you are saying here, it's not as if the people > running the whitelist could look up the IP address on razor. I'm saying that it appears odd that it would be listed on both RAZOR and JMF_W, unless the JMF_W found the kraftfoods.com URL and the RAZOR rules found the bogus http://ADSENSETREASUREONLINE.yolasite.com URL. Unless yolasite.com is a legitimate kraftfoods site? >> meta META_NOT_JMF_RAZOR (RCVD_IN_JMF_W && !RAZOR2_CHECK) > > Why RAZOR2_CHECK? Why not other positive scoring rules? The trouble is > that the whitelist rule is then pointless. Set its score at a value > that's commensurate with its effectiveness on your email. Does my question now make sense? I was looking at it from more of a validation point of view for JMF_W, because of the apparent conflict with RAZOR. >> It also appears to spoof the kraftfoods.com mail server, correct? Is >> there a possible rule to be created here? > > No, it was almost certainly sent through kraftfoods.com. It's based on > an IP address recorded by your trusted network. Maybe I should have used a better example. Can I ask you to look at this one? http://pastebin.com/m7d61b26f This uses IP 66.132.135.108 as its URL (xybersleuth.com), and unless xybersleuth.com is somehow not a spammer's site, there's something wrong. This email includes JMF_W and RAZOR2_CF_RANGE_51_100 and URIBL_BLACK in the same message, although it has a very low bayes score. Which is correct? Thanks, Alex
Re: URL rule creation question
Hi, > The 'doubleheadedrover' domain currently shows up in Razor(E8), > uribl_black, surbl_jp, and invaluement. > > But it wasn't in all of those when he first started posting about it. Yes, that's correct. Thanks for your help. That's already caught a few. I have another that I thought you could help with. I'd like to create a rule that matches a specific letter with up to 5 spaces after it, repeated ten times. I'm thinking something like this: /s\ {5}o\ {5}n\ {5}i\ {5}c\ {5}\ m\ {5}e\ {5}d\ {5}i\ {5}a/i I'm still learning regexes, so hopefully this isn't too far off. The opportunities for rules are coming faster than my ability to learn. Thanks, Alex
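A hedged take on that pattern: {5} means exactly five, so "up to 5 spaces" is closer to \s{1,5} (or {0,5} if the gap may be absent), and \s also covers tabs. A sketch of a body rule along those lines, with an invented rule name and a score that would need testing:

body     LOC_SPACED_BRAND /s\s{1,5}o\s{1,5}n\s{1,5}i\s{1,5}c\s{1,5}m\s{1,5}e\s{1,5}d\s{1,5}i\s{1,5}a/i
describe LOC_SPACED_BRAND Spaced-out "s o n i c  m e d i a" string in the body
score    LOC_SPACED_BRAND 1.0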
Re: JMF whitelist and RAZOR conflict
Hi, >> I have several emails that are tagged with RCVD_IN_JMF_W, >> SPF_SOFTFAIL, and RAZOR2_CHECK such as this one: >> http://pastebin.com/m4a4d990e > > why accept SPF_SOFTFAIL? > > can't this be solved? I don't understand. I'm still learning how the SPF rules work. Shouldn't I be adding points for an SPF_FAIL? This indicates a spoof attempt, no? > are you receiving forwarded emails from spf domains? If I understand correctly, no. I have no relationship with any external source and their SPF records. > if so add the forward ip to trusted_networks (so SPF will be disabled for > these hosts) Do you mean to avoid the processing overhead? IOW, don't bother checking SPF records for trusted domains? >> Is the criterion for being listed on the JMF_W simply that it >> contains a domain that is whitelisted, regardless of whether it >> contains another URL that is blacklisted? > > this is spamassassin working, if there is a blacklisted domain add it to > your uridnsbl_skip_domain list Ah, you mean if the domain is erroneously on the blacklist, right? >> Would I be advised to make the JMF_W score very low, or create a >> meta that doesn't really whitelist it unless it isn't also blacklisted? > > this is IP and not domains On a somewhat related note, how does BOTNET differ from RDNS_NONE? What is the logic behind the BOTNET rule? Is there some known list that it's checking, or is it just likely to be a dynamic IP or compromised host if it doesn't have a reverse DNS entry? Thanks so much for the clarification, and confirmation about Gevalia/Kraft. Thanks, Alex
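For the forwarded-mail case mentioned above, the trusted_networks entry is just the forwarding host's address or range in local.cf; the address below is a placeholder, not a real recommendation:

# placeholder: replace with the forwarding host's actual IP or CIDR range
trusted_networks 203.0.113.25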
Re: URL rule creation question
>>> \s is the proper way to represent whitespace. >> >> lol, yes, I know that; I was actually trying to match 's' and the >> slash is the start of the pattern match. > > I wasn't referring to the beginning of the RE. Yeah, I realized that just after I sent this, if anyone cares :-) Thanks again, Alex
URIBL_BLACK vs RCVD_IN_JMF_W
Hi, I have been going through about 15MB of email generated from a procmail recipe searching for RCVD_IN_JMF_W, and you would not believe how many also match URIBL_BLACK or URIBL_GREY. Call me naive, but are there really that many providers that are unaware their clients are sending spam? (okay, rhetorical question :-) IOW, I guess this email is more of an informational note to those who may not be aware, and perhaps for others to comment on whether they even use it? The winner for me was a Bank of America scam with the following two relays: Received: from User (channelf.5460.net [61.137.93.80]) Received: from ortiz.unizar.es (ortiz.unizar.es [155.210.1.52]) No b-of-a relays, of course. This message also hit RAZOR2_CHECK and SPF_FAIL. There's also a money scam that passed through nasa.gov, hit RCVD_IN_JMF_W, and a few fraud rules: Received: from ALTPHYEMBEVSP30.RES.AD.JPL ([128.149.137.84]) by Received: from mail.jpl.nasa.gov (altvirehtstap02.jpl.nasa.gov [128.149.137.73]) Received: from mail.jpl.nasa.gov (sentrion2.jpl.nasa.gov [128.149.139.106]) X-Spam-Status: No, hits=1.1 tagged_above=-300.0 required=5.0 use_bayes=1 tests=AE_ADVICE_WITH_MONEY, AE_FRAUD_ADVICE, BAYES_50, LOTS_OF_MONEY, MILLION_USD, MONEY_TO_NO_R, RCVD_IN_DNSWL_MED, RCVD_IN_JMF_W, RELAYCOUNTRY_US I have RCVD_IN_JMF_W set to 0.5 points. It was also listed in RCVD_IN_DNSWL_MED? Running it a bit later, it scored as spam with the RAZOR rules: X-Spam-Report: * 0.9 RAZOR2_CHECK Listed in Razor2 (http://razor.sf.net/) * -0.5 RCVD_IN_JMF_W RBL: Sender listed in JMF-WHITE * [128.149.139.106 listed in hostkarma.junkemailfilter.com] * -4.0 RCVD_IN_DNSWL_MED RBL: Sender listed at http://www.dnswl.org/, * medium trust * [128.149.139.106 listed in list.dnswl.org] * 0.0 RELAYCOUNTRY_US Relayed through United States * 1.0 AE_FRAUD_ADVICE BODY: Someone offering free advice * 1.8 MILLION_USD BODY: Talks about millions of dollars * 2.1 RAZOR2_CF_RANGE_E4_51_100 Razor2 gives engine 4 confidence level * above 50% * [cf: 56] * 0.9 RAZOR2_CF_RANGE_51_100 Razor2 gives confidence level above 50% * [cf: 56] * 0.0 LOTS_OF_MONEY Huge... sums of money * 2.0 AE_ADVICE_WITH_MONEY Has advice and mentions much money * 1.0 MONEY_TO_NO_R Lots of money and bare, missing or undisclosed To * 0.2 MONEY_INHERIT Lots of money from a dead guy X-Spam-Relay-Country: US US US X-Spam-Status: Yes, score=5.4 required=5.0 tests=AE_ADVICE_WITH_MONEY, AE_FRAUD_ADVICE,LOTS_OF_MONEY,MILLION_USD,MONEY_INHERIT,MONEY_TO_NO_R, RAZOR2_CF_RANGE_51_100,RAZOR2_CF_RANGE_E4_51_100,RAZOR2_CHECK, RCVD_IN_DNSWL_MED,RCVD_IN_JMF_W,RELAYCOUNTRY_US shortcircuit=no autolearn=disabled version=3.2.5 Thanks, Alex
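For anyone who wants to collect these the same way, a sketch of the kind of procmail recipe used here; the folder name is arbitrary, and if the tests= list is folded onto a continuation line this simple match can miss it:

:0:
* ^X-Spam-Status:.*RCVD_IN_JMF_W
jmf-white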
Re: Problems with high spam
Hi, > also if using amavisd make its temp dir on ram speed up scanning and it > considered safe, mta have it on disk for the backup :) How about mounting /var with noatime? Does anyone do that? Do you think it helps? What Linux filesystem is best suited for this? ext4? Thanks, Alex
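If anyone wants to try it, noatime is just a mount option in /etc/fstab; the device and filesystem below are placeholders, and relatime is the more conservative middle ground:

# placeholder device/filesystem: add noatime (or relatime) to the options field
/dev/mapper/vg0-var   /var   ext4   defaults,noatime   0 2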
Re-running SA on an mbox
Hi, I have an mbox with about 100 messages in it from a few days ago. The mbox is a combination of spam and ham. What is the best way to run SA through these messages again, so I can catch the ones that have URLs in them that weren't on the blacklist at the time they were received? Must I break them all apart to do this, or can SA somehow parse the whole mbox? If not, what program do you suggest I use to accomplish this? Thanks, Alex
Re: Re-running SA on an mbox
Hi, > Do you just want to re-scan the whole mbox and see what rules hit now > for research reasons? That's a good start, but I'd like to see if I can break out the ham to train bayes. > There's no way to (directly) get SA to modify email that's already in an > mbox file. The mass-check and sa-learn tools can read them, but nothing > in SA can write to that. However, there might be a utility out there to > do this (although I'm not aware of any).. Yeah, that's kind of what I thought. Maybe a program that can split each message back into an individual file? Would procmail even help here? Or even a simple shell script that looks for '^From ', redirects it to a file, runs spamassassin -d on it, then re-runs SA on each file? I could then concatenate each of them back together and pass it through sa-learn. Thanks, Alex
Re: Re-running SA on an mbox
Hi, > You probably want "spamassassin --mbox". :) > It won't modify the messages in-place, but you can do something like > "spamassassin --mbox infile > outfile". My apologies if it wasn't clear, but these messages have already been marked by SA. Some are ham, and the rest are FNs that I'd like to re-run through SA, in the hope that it will now properly detect them as spam. Thank you all for your help. The "mbox split" suggestion is a good one. I'll follow that route and post my experience later. Thanks again, Alex
Re: Re-running SA on an mbox
Hi, >> You probably want "spamassassin --mbox". :) >> It won't modify the messages in-place, but you can do something like >> "spamassassin --mbox infile > outfile". > > My apologies if it wasn't clear, but these messages have already been Wait, my mistake. I read that too fast. Does that work, and rewrite the X-Spam-Status header? I guess I could find out for myself, but it contradicts my experience and what I've learned previously. Thanks again, Alex
Re: Re-running SA on an mbox
Hi, >> Thank you all for your help. The "mbox split" suggestion is a good >> one. I'll follow that route and post my experience later. > > formail -s is the way to go. I thought about that as a component of procmail. Sounds great. Thanks, Alex
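In case it saves someone a step, the usual formail -s idiom (filenames here are just examples) is to split the mbox and feed each message to a command, collecting a new mbox on stdout:

# strip existing SA markup from every message in the mbox
formail -s spamassassin -d < marked.mbox > clean.mbox
# then rescan the cleaned copy
formail -s spamassassin < clean.mbox > rescanned.mbox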
Re: Re-running SA on an mbox
> but this will invalidate DKIM if those headers are signed; is > spamassassin aware of this problem? (in general) Are you saying there is a bug? > mutt -f mbox > > in mutt save to another folder if misclassified Yes, I use pine for that, but would like to eliminate as many of the FNs as possible, particularly ones that I can't determine visually. Thanks, Dave
Re: Re-running SA on an mbox
Hi, > IIRC you previously mentioned using Pine. Just in case you're not aware > the default format for Pine/Alpine is MBX, an extended version of > MBOX. You can tell the difference because MBX mailboxes start with a > dummy email that's hidden by the software. It seems that if you save messages into a separate folder it does not add the DUMMY information at the top. I believe this is why the system was set up to use "mbox" and not "mbx". Does this sound correct? > I'd be very wary about allowing any tool to modify an MBX file unless > you know it's safe. Where locking is an issue, Mark Crispin recommends > that they only be accessed via the c-client library. This isn't the actual spool file, but a copy in the home directory. Thanks, Alex
Re: Re-running SA on an mbox
Hi, It's certainly not a fast operation, but using the following will split an mbox into individual messages: export FILENO=0; mkdir msgs; formail -s sh -c 'cat - >msgs/$FILENO' < mbox-name.mbox I also created a loop that strips all the SA headers from the messages: for file in *; do echo Processing: $file; spamassassin -d < $file > $file.txt; done This worked for a few hundred messages, but then started to fail on my production system with: [22135] warn: bayes: cannot open bayes databases /home/user/.spamassassin/bayes_* R/W: lock failed: File exists How can I tell when another process is using the database and when it is free for my script to use? Is there a faster way to run spamassassin just to strip the SA headers? Perhaps passing the messages through the running amavisd instead of starting a new spamassassin process for each message? Thanks, Alex
Re: Re-running SA on an mbox
Hi, > Try using a local SA setup for stripping the headers. By local, I mean > don't use your main production SA - run a separate copy with its own > (cut down) configuration and all data base accesses and UBL calls etc > turned off. Much better idea, thanks. Thanks for the script, too. Best, Alex
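A sketch of what that separate, cut-down setup could look like; the directory name is hypothetical, and the two options shown just keep Bayes and the DNS blocklists out of the picture while stripping headers:

mkdir -p ~/sa-strip
printf 'use_bayes 0\nskip_rbl_checks 1\n' > ~/sa-strip/local.cf
for file in msgs/*; do
    spamassassin -d --siteconfigpath=$HOME/sa-strip < "$file" > "$file.txt"
done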
New money/fraud spam
Hi John, Another batch of money spam attached. Everything is the same as the last time. Thanks, Alex [attachment: money-spam-092709.gz]
Re: New money/fraud spam
Okay, my bad, please ignore. Damn google auto-complete. Alex On Sun, Sep 27, 2009 at 6:46 PM, MySQL Student wrote: > Hi John, > > Another batch of money spam attached. Everything is the same as the last time. > > Thanks, > Alex >
Sought regex problem
Hi, I posted bug 6198 a few weeks ago, and there have been no comments or fixes on it since, so I'm unsure what to do next. Either it's not a bug and I'm doing something wrong, or it's not considered significant enough to bother with given the focus on v3.3. I thought someone might have some ideas here. I'm using perl-5.6. Anyone else using perl-5.6 with the sought rules? [13204] dbg: config: read file /var/lib/spamassassin/3.002005/sought_rules_yerp_org/20_sought.cf [13204] warn: config: invalid regexp for rule __SEEK_D52BRW: / Don\'t want to lose your potential of a lover\? Lucky you are, in 21th century all bed-related male problems can be solved by the powerful remedy, the all-mighty blue caplet\! This solution will give you the right support for 50\(\!\) hours\. Rock-like and ready to go\. more\x{bb}/: / Don\'t want to lose your potential of a lover\? Lucky you are, in 21th century all bed-related male problems can be solved by /: Can't use \x{} without 'use utf8' declaration Maybe it's a perl module that's incompatible? Ideas greatly appreciated. Thanks, Alex
Re: Sought regex problem
Hi, >> [13204] dbg: config: read >> file /var/lib/spamassassin/3.002005/sought_rules_yerp_ >> org/20_sought.cf [13204] warn: config: invalid regexp for rule >> __SEEK_D52BRW: > > grep doesn't find __SEEK_D52BRW in my copy of the rules. This was from the sa-update when I submitted the bug report. Thanks to all for the feedback and the update to the bugzilla. I'm in the process of upgrading perl, but there are still a few applications that depend on it. Mark suggested in the bugzilla update that I "change SpamAssassin to add 'use utf8' into code generated from rules when it sees it is being run with a pre-5.8 version of perl." How do I do this for the time being? Thanks, Alex
Re: Hostkarma Blacklist Climbing the Charts
Hi, > header RCVD_IN_JMF_W eval:check_rbl_sub('JMF-lastexternal', '127.0.0.1') > describe RCVD_IN_JMF_W Sender listed in JMF-WHITE > tflags RCVD_IN_JMF_W net nice > score RCVD_IN_JMF_W -5 Hopefully my comment isn't out of place in the current discussion of JMF/Hostkarma. I think -5 is a really bad default score; it should be reduced to -0.5, or perhaps the rule shouldn't be used at all. I have a money/fraud email that hit RCVD_IN_JMF_W and passed through these servers: Received: from 41.220.75.3 Received: from webmail.stu.qmul.ac.uk (138.37.100.37) by mercury.stu.qmul.ac.uk Received: from qmwmail2.stu.qmul.ac.uk ([138.37.100.210] Received: from mail2.qmul.ac.uk (mail2.qmul.ac.uk [138.37.6.6]) It also hit these other rules: X-Spam-Status: No, hits=1.3 tagged_above=-300.0 required=5.0 use_bayes=1 tests=AE_GBP, BAYES_50, LOTS_OF_MONEY, LOTTERY_PH_004470, LOTTO_RELATED, MONEY_TO_NO_R, RCVD_IN_DNSWL_MED, RCVD_IN_JMF_W, RELAYCOUNTRY_UK, SPF_FAIL, SPF_HELO_FAIL Unless I'm really missing something, which of these servers does JMF/Hostkarma have whitelisted that shouldn't be? This happens time after time. Thanks, Alex > > header RCVD_IN_JMF_BL eval:check_rbl_sub('JMF-lastexternal', '127.0.0.2') > describe RCVD_IN_JMF_BL Sender listed in JMF-BLACK > tflags RCVD_IN_JMF_BL net > score RCVD_IN_JMF_BL 3.0 > > header RCVD_IN_JMF_BR eval:check_rbl_sub('JMF-lastexternal', '127.0.0.4') > describe RCVD_IN_JMF_BR Sender listed in JMF-BROWN > tflags RCVD_IN_JMF_BR net > score RCVD_IN_JMF_BR 1.0 > ===8<--- > > You pick the names and then the world can use them. The JMF names are out > there today. > > {^_^} Joanne >
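Overriding the score locally doesn't have to wait for the default to change; a one-line sketch for local.cf:

# local override: keep the Hostkarma/JMF whitelist nearly neutral
score RCVD_IN_JMF_W -0.5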
Re: Hostkarma white list
Hi, > For those of you getting spam from IPs/Hostnames on my hostkarma > white list, if you could email me a list of false hits (IP or host name) I > could probably clean out the bad entries in the white list pretty quick. I'm not sure this is the best approach. I have a procmail recipe that filters specifically on JMF_W, and I go through it every day before training the folder as ham. I'd say around a quarter of the messages are spam. How many entries are on the whitelist? How were they added? I'd almost rather start from scratch (or from a more proven list) with a percentage known to be valid and build from there. At the least, wouldn't it be best to move the default score closer to zero on your wiki page for the time being? Maybe another method could be created for submitting FPs rather than emailing them to you? Wouldn't the veracity of the list be better assured if you built it from a pile of known ham? Mail originating from priorityoneemail.com [69.10.237.52] would be one prime suspect for removal consideration. On a somewhat related topic, how do people classify topica.com? That's one that definitely sends junk, but it looks like people may actually request it, heh. Thanks, Alex
Re: .cn Oddity
Hi All, Regarding the .cn oddity, I added these to my rules, and of about 79k messages today so far, I have the following: uri LOC_URI_CN m;^https?://[^/?]+\.cn\b; uri T_CN_8_URL /[\/.]+\w{8}\.cn(?:$|\/|\?)/i LOC_URI_CN: 2926 T_CN_8_URL: 1634 HTH, Alex
Re: OT bad news
Hi, > It's a shame that, living in Denver, I will be *just* out of range of > hearing the screams as the mailspools fill with viruses, malware, and > massive payloads of Spanish Prisoner spams. Aw, c'mon now. Yes, I agree SA is a better solution, but Microsoft didn't get to be a multi-billion-dollar company solely because of its marketing. Certainly a competent admin following some SANS guides can secure an Exchange box well enough to avoid it getting hacked, and a properly-installed version of Symantec will keep most spam away. It /is/ possible, I suppose :-) I'd bet that if he kept the FreeBSD box in place and just told his boss he "upgraded" to Exchange, they'd never even know :-) Regards, Alex
Re: Uppercase E-mail in Latin America
Hi, > doesnt it appear to everyone else that this has the (slim to none) makings > of a new urban legend? I have to admit that when Warren posted this, I went to snopes to check, and there was nothing there :-) Regards, Alex
Re: SpamAssassin Ruleset Generation
Hi, > Other than the sought rules, all the rules are manually generated? Are there > any statistics on how frequently new rules/regexes are adopted by > spamassassin? Who are the people who write them? Any details related to Information on Justin Mason's SOUGHT rules is here: http://taint.org/2007/08/15/004348a.html Use sa-update to update your SA rules once or twice per day with the new stuff. His ongoing development work is here: http://svn.apache.org/viewvc/spamassassin/trunk/rulesrc/sandbox/jm/?sortby=date HTH, Alex
Re: Subject Rewrite Based on Score
Hi, > I actually would be doing that but the filter does not know how to > handle int(), so I would have to build a filter for all possible number > combinations, but if I could just get SA to do the basic math for me and > write a header or subject I can filter off of that. We do something similar here using a procmail/formail recipe that calls a perl script to match on X-Spam-Status and prepend the bayes score to the subject. We then use a few procmail rules to filter the mail based on the bayes score for analysis. Regards, Alex
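Roughly what that looks like without the perl helper: a procmail sketch that captures the BAYES_* rule name from X-Spam-Status and prepends it to the subject. The variable names are illustrative, and a folded tests= list would need extra handling:

SUBJ=`formail -zxSubject:`

:0
* ^X-Spam-Status:.*\/BAYES_[0-9]+
{
  :0 fhw
  | formail -I "Subject: [$MATCH] $SUBJ"
}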
Re: Subject Rewrite Based on Score
Hi, > That sounds overly complicated and like a lot of wasted cycles. Calling > a Perl script for each message? What you just described sounds a hell of a > lot like this light-weight SA configuration: Yes, I should have mentioned that it is a copy of the mail that users receive and is visible only to a single account. It also only occurs once every four hours, as the mail is pulled from the spool. Regards, Alex