On Jun 9, 2014, at 3:10 PM, Axb <axb.li...@gmail.com> wrote:

> On 06/09/2014 11:03 PM, Philip Prindeville wrote:
>>
>> On Jun 6, 2014, at 3:50 PM, Axb <axb.li...@gmail.com> wrote:
>>
>>> If you have to post a spam sample, pls use pastebin and post the full msg
>>>
>>> On 06/06/2014 11:32 PM, Philip Prindeville wrote:
>>>> We’re getting a lot of spam that contains URLs which look like (remove
>>>> the ####):
>>>>
>>>> http://mab####sut.com/20220362/vuxtxumsrnsst6unlornt3umtfuwznvv~5v0nmro0ysnx_u_usqzxsrwlln_t_t_tomtdyumplnl_ts_tn_ttce/unnt7uqs_mrn_ttdfw3yuw_h_03xo_gl_67_8gw_buutxveumpomte3yuo_tlltcx3yumsrnsstziaumte3umm/lst0x0ut0xut7eunty1um_ttf1umnrt2utezdeuteutyutw2utv3utvaut0u_0czz_xz66_a298zty8ux97xvd/e_o8zetdy97utd3aut09ultcdaumtd3un_unsrrtw3utwv8utweut80utecegutfnutaeut263yutdzeumt9cul_ol
>>>
>>>> Some observations… The URLs should be fairly easy to filter against via a
>>>> regex. Anyone have some working rules they could share?
>>>
>>> Pls note that any rule shared via lists usually loses its teeth within a
>>> few hours .-)
>>
>> Well, it depends on the nature of the rule… Some characteristics are less
>> fungible than others.

BTW, I found that the last N characters of the above URLs were always the same, and tried to do a “body” rule based on those last N characters, but I couldn’t get the rule to match. Still not sure why. The entire <a ...> sequence is only 382 characters long. Any ideas?

>>>> The other thing is, the URL is almost always hosted by solarvps.com, in
>>>> the CIDR block 65.181.64.0/18.
>>>>
>>>> Is there an easy way to do a domain lookup on the host portion of the URL
>>>> and then filter it if it’s in this subnet?
>>>
>>> Yes, there is:
>>>
>>> run a local A record blacklist with rbldnsd
>>>
>>> 65.181.64.0/18
>>>
>>> and a rule like, for example:
>>>
>>> uridnssub YOUR_A_URIBL yourabl.example.net. A 127.0.0.2
>>> body YOUR_A_URIBL eval:check_uridnsbl('YOUR_A_URIBL')
>>> describe YOUR_A_URIBL URL domain A rec listed by YOUR_A_URIBL
>>> score YOUR_A_URIBL 5.0
>>> tflags YOUR_A_URIBL net a
>>
>> If I used local A records, for a /18 network, I’d need all 2^14 records,
>> right?
>>
>> Because a lookup is always on a full dotted-quad (in reverse order)…
>
> nope... with rbldnsd you set your BL zone to use the ip4trie dataset
>
> which as per http://www.corpit.ru/mjt/rbldnsd/rbldnsd.8.html
>
> ip4trie Dataset
> Set of IP4 CIDR ranges with corresponding (A, TXT) values. This dataset is
> similar to ip4set, but uses a different internal representation. It accepts
> CIDR ranges only (not a.b.c.d−e.f.g.h), and allows for the specification of
> A/TXT values on a per CIDR range basis. (If multiple CIDR ranges match a
> query, the value for longest matching prefix is returned.) Exclusions are
> supported too.

Okay, and what would 65.181.64.0/18 look like as a BIND RR? I wasn’t able to infer this from the documentation you pointed at.

>> I tried using multi.uribl.com and couldn’t get this to work.
>>
>> I had:
>>
>> urirhssub L_URIBL_BLACK multi.uribl.com. A 2
>> body L_URIBL_BLACK eval:check_uridnsbl('L_URIBL_BLACK')
>> describe L_URIBL_BLACK Contains a URL listed in the URIBL blacklist
>> tflags L_URIBL_BLACK net
>> score L_URIBL_BLACK 20.0
>
> URIBL is enabled by default in SA - no need to add extra rules.
>
>> set, and also:
>>
>> skip_rbl_checks 0
>>
>> at the end of /etc/mail/spamassassin/sa-mimedefang.cf.
>>
>> Running this over the message in a file:
>>
>> spamassassin -t --lint -D < /tmp/cable.eml
>>
>> I get:
>>
>> …
>> Jun 9 14:57:13.029 [32297] dbg: rules: compiled meta tests
>> Jun 9 14:57:13.032 [32297] dbg: check: is spam? score=-2.348 required=5
>> Jun 9 14:57:13.032 [32297] dbg: check:
>> tests=L_EMPTY_SENDER,MISSING_DATE,MISSING_HEADERS,NO_RECEIVED,NO_RELAYS
>> Jun 9 14:57:13.032 [32297] dbg: check:
>> subtests=__BODY_TEXT_LINE,__EMPTY_BODY,__EMPTY_SENDER,__GATED_THROUGH_RCVD_REMOVER,__HAS_FROM,__HAS_MESSAGE_ID,__HAS_MSGID,__HAS_SUBJECT,__L_UNDISCLOSED2,__MISSING_REF,__MISSING_REPLY,__MSGID_OK_DIGITS,__MSGID_OK_HOST,__MSOE_MID_WRONG_CASE,__NONEMPTY_BODY,__NOT_SPOOFED,__SANE_MSGID,__TO_NO_ARROWS_R,__UNUSABLE_MSGID
>> Jun 9 14:57:13.033 [32297] dbg: timing: total 1908 ms - init: 1384 (72.5%),
>> parse: 1.17 (0.1%), extract_message_metadata: 11 (0.6%),
>> get_uri_detail_list: 1.06 (0.1%), tests_pri_-1000: 9 (0.5%), compile_gen:
>> 202 (10.6%), compile_eval: 37 (1.9%), tests_pri_-950: 6 (0.3%),
>> tests_pri_-900: 7 (0.4%), tests_pri_-400: 6 (0.3%), tests_pri_0: 404
>> (21.2%), tests_pri_500: 75 (3.9%)
>>
>> so I’m not sure why it’s failing to find nqtel.com in the uribl.com database.
>> What am I missing?
>
> --lint doesn't do network tests

Okay, taking out --lint changed the results.

Thanks,

-Philip
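
As an aside on the /18 question discussed above, here is a minimal Python sketch that uses only the standard-library ipaddress module. It checks that a /18 spans 2^14 addresses, tests whether an already-resolved address falls inside 65.181.64.0/18, and builds the reversed dotted-quad name that an IP-based DNSBL lookup would query. The helper names in_block and dnsbl_query_name and the sample addresses are hypothetical, chosen only for illustration; yourabl.example.net is the placeholder zone from the example rule, not a real list.

```python
import ipaddress

# CIDR block named in the thread.
BLOCK = ipaddress.ip_network("65.181.64.0/18")


def in_block(addr: str, block=BLOCK) -> bool:
    """Return True if the dotted-quad address falls inside the block."""
    return ipaddress.ip_address(addr) in block


def dnsbl_query_name(addr: str, zone: str = "yourabl.example.net") -> str:
    """Build the reversed-octet name an IP-based DNSBL client looks up.

    The zone default is the placeholder from the example rule.
    """
    octets = addr.split(".")
    return ".".join(reversed(octets)) + "." + zone


if __name__ == "__main__":
    # A /18 leaves 32 - 18 = 14 host bits, i.e. 2**14 addresses.
    print(BLOCK.num_addresses)
    # First and last address covered by the block.
    print(BLOCK[0], BLOCK[-1])
    # Hypothetical sample addresses, one inside and one outside the block.
    print(in_block("65.181.100.7"))
    print(in_block("65.181.128.1"))
    print(dnsbl_query_name("65.181.100.7"))
```

This only illustrates the arithmetic and the query-name shape. It does not resolve hostnames or talk to rbldnsd, and with the ip4trie dataset quoted above the server matches the CIDR range itself, so the individual addresses would not need to be enumerated.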