Re: Darxus's LOCAL_8X_TAGS

2011-04-21 Thread darxus
On 04/21, dar...@chaosreigns.com wrote:
> rawbody MUCH_HTML_SPACE /(?:<\s*(?:p|br)[\s\/]*>\W*){8}/is

A little better:

rawbody MUCH_HTML_SPACE /(?:<\s*(?:p|br)[\s\/]*>[^[:alnum:]]*){8}/is

Same results on current corpora. Hits 15 out of 57 most recently missed spams, and none of 5,841 hams.
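For anyone who wants to poke at the revised rule outside SpamAssassin, here is a rough Python sketch. It is an approximation, not the rule itself: Python's re module has no POSIX classes, so [^[:alnum:]] is rewritten as [^A-Za-z0-9], and the sample bodies and the hits() helper are made up for illustration.

```python
import re

# Approximation of MUCH_HTML_SPACE. Python's re lacks POSIX classes,
# so [^[:alnum:]] is assumed to be equivalent to [^A-Za-z0-9] here.
MUCH_HTML_SPACE = re.compile(
    r"(?:<\s*(?:p|br)[\s/]*>[^A-Za-z0-9]*){8}",
    re.IGNORECASE | re.DOTALL,
)

def hits(raw_body: str) -> bool:
    """Return True if eight <p>/<br> tags occur with no letters or digits between them."""
    return MUCH_HTML_SPACE.search(raw_body) is not None

# Hypothetical sample bodies, not taken from any corpus.
spammy = "Hello" + "<br />\n" * 8 + "click here"
normal = "<p>First paragraph.</p><p>Second paragraph.</p><br>Thanks"

print(hits(spammy), hits(normal))
```

Closing tags such as </p> do not count toward the eight, because the pattern requires p or br directly after the opening angle bracket (optional whitespace aside).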

Re: Regex help

2011-04-21 Thread Karsten Bräckelmann
On Thu, 2011-04-21 at 16:08 -0800, Kevin Miller wrote: > Karsten Bräckelmann wrote: > > That should do the trick indeed. > > > > After this, I strongly suggest to carefully re-read the entire > > thread, and read some docs specifically about the points raised. That > > includes RE peculiarities [1

Re: Darxus's LOCAL_8X_TAGS

2011-04-21 Thread darxus
On 04/21, Adam Katz wrote: > > rawbody LOCAL_8X_TAGS /(?:<[^>]*>[\s\r\n]{0,4}){8}/mi > I'm not sure about email clients specifically, but it is (or rather, > used to be -- I'm way out of date here) a common WYSIWYG foible to > create empty tags when the user plays with various formatting buttons
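The concern about editors leaving empty formatting tags behind can be checked with a small Python sketch. The regex is the quoted LOCAL_8X_TAGS pattern carried over to Python's re module; the two message bodies are invented examples, so this only shows the mechanism and says nothing about real-world hit rates.

```python
import re

# LOCAL_8X_TAGS as quoted in the thread, with Perl's /mi flags mapped to Python.
LOCAL_8X_TAGS = re.compile(r"(?:<[^>]*>[\s\r\n]{0,4}){8}", re.MULTILINE | re.IGNORECASE)

def hits(raw_body: str) -> bool:
    """Return True if eight tags occur in a row, separated by at most four whitespace characters each."""
    return LOCAL_8X_TAGS.search(raw_body) is not None

# Hypothetical ham: a WYSIWYG editor left a run of empty formatting tags.
wysiwyg_ham = "<p><b></b><i></i><b><i></i></b>Hi Bob,</p>"
# Hypothetical ham with ordinary markup and text between the tags.
plain_ham = "<p>Hi Bob,</p><p>See you <b>soon</b>.</p>"

print(hits(wysiwyg_ham), hits(plain_ham))
```

The first body contains nine consecutive tags before any text, so the generalized rule would fire on it even though nothing about the message is spammy.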

Re: Regex help

2011-04-21 Thread John Hardin
On Thu, 21 Apr 2011, Adam Katz wrote:
> rawbody LOCAL_5X_BR_TAGS /(?:[\s\r\n]{0,4}){5}/mi

...when does \s{0,4} not match the same text as [\s\r\n]{0,4}? (i.e., \r and \n are whitespace, no?)
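A quick way to convince yourself of this is to compare the two quantified classes directly. The sketch below uses Python's re module rather than Perl, on the assumption that \s behaves the same for these characters (it includes \r and \n in both); the sample strings and the same_result() helper are arbitrary.

```python
import re

PLAIN = re.compile(r"\s{0,4}")
REDUNDANT = re.compile(r"[\s\r\n]{0,4}")

def same_result(text: str) -> bool:
    """Return True if both patterns agree on whether they match the whole string."""
    return bool(PLAIN.fullmatch(text)) == bool(REDUNDANT.fullmatch(text))

# Arbitrary whitespace-heavy samples, including one that is too long and one with a letter.
samples = ["", " ", "\r\n", "\n\n", " \r\n\t", "\r\r\n\n", "\r\n\r\n\r", "a"]

print(all(same_result(s) for s in samples))
```

Since \r and \n are already members of \s, listing them again inside the brackets changes nothing.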

Re: DOB list dead?

2011-04-21 Thread John Hardin
On Thu, 21 Apr 2011, Michael Monnerie wrote: Does anyone know about the state of the "day old bread" list? dob.sibl.support-intelligence.net doesn't seem to get any hits on new domains anymore, and on their contact e-mail address nobody responded to my requests. It appears to be working for

RE: Regex help

2011-04-21 Thread Kevin Miller
Karsten Bräckelmann wrote: > On Thu, 2011-04-21 at 15:47 -0800, Kevin Miller wrote: >> Karsten Bräckelmann wrote: >>> What you want. The string '', repeated five times (or more). For >>> the quantifier, you need to group the string. >>> >>> /(?:){5}/ > >> Great. I've changed my rule to that, a

RE: Regex help

2011-04-21 Thread Karsten Bräckelmann
On Thu, 2011-04-21 at 15:47 -0800, Kevin Miller wrote: > Karsten Bräckelmann wrote: > > What you want. The string '', repeated five times (or more). For > > the quantifier, you need to group the string. > > > > /(?:){5}/ > Great. I've changed my rule to that, and am going to look at Adam's >

RE: Regex help

2011-04-21 Thread Kevin Miller
Stupid Outlook. Meant to reply to the list again. Sigh. Karsten Bräckelmann wrote: > > What you want. The string '', repeated five times (or more). For > the quantifier, you need to group the string. > > /(?:){5}/ > > Besides the above, do not use {5,} as a quantifier, UNLESS there is > s

Re: Darxus's LOCAL_8X_TAGS

2011-04-21 Thread Karsten Bräckelmann
On Thu, 2011-04-21 at 16:35 -0700, Adam Katz wrote: > Broken apart from previous thread to prevent confusion. > > On 04/21/2011 04:18 PM, dar...@chaosreigns.com wrote: > > I wonder if it would be useful to generalize this as: > > > > rawbody LOCAL_8X_TAGS /(?:<[^>]*>[\s\r\n]{0,4}){8}/mi Rawbod

RE: Regex help

2011-04-21 Thread Martin Gregorie
On Thu, 2011-04-21 at 14:55 -0800, Kevin Miller wrote: > I know it may trigger on some ham which is why I set the initial score > to 0.01. Better ideas are most welcome though! > It may be a good idea to look at the headers, especially From, From: and Message-ID: and at body URIs to see if there

RE: Regex help

2011-04-21 Thread Kevin Miller
Adam Katz wrote: > On 04/21/2011 03:55 PM, Kevin Miller wrote: >> Thanks (also to Martin who replied). I posted one of the spams >> here: http://pastebin.com/9aBAxR7m >> >> You can see the long series of break codes in it. > > Yes I can. I can also see several other diagnostic bits in it, such

Darxus's LOCAL_8X_TAGS

2011-04-21 Thread Adam Katz
Broken apart from previous thread to prevent confusion. On 04/21/2011 04:18 PM, dar...@chaosreigns.com wrote: > On 04/21, Adam Katz wrote: >> rawbody LOCAL_5X_BR_TAGS /(?:[\s\r\n]{0,4}){5}/mi > > I wonder if it would be useful to generalize this as: > > rawbody LOCAL_8X_TAGS /(?:<[^>]*>[\s\r\n

RE: Regex help

2011-04-21 Thread Karsten Bräckelmann
On Thu, 2011-04-21 at 14:55 -0800, Kevin Miller wrote: > I did get it to work from the CLI, and wrote the following rule: > > body CBJ_GiveMeABreak /\[""]{5,}/ This still is wrong. Something that has been mentioned, but not properly explained to you is the char class, denoted by square brac

RE: Regex help

2011-04-21 Thread Kevin Miller
dar...@chaosreigns.com wrote: > On 04/21, Adam Katz wrote: >> rawbody LOCAL_5X_BR_TAGS /(?:[\s\r\n]{0,4}){5}/mi > > I wonder if it would be useful to generalize this as: > > rawbody LOCAL_8X_TAGS /(?:<[^>]*>[\s\r\n]{0,4}){8}/mi > > Just a mess of tags in a row without any content. I'll leav

Re: Regex help

2011-04-21 Thread Adam Katz
On 04/21/2011 03:55 PM, Kevin Miller wrote: > Thanks (also to Martin who replied). I posted one of the spams here: > http://pastebin.com/9aBAxR7m > > You can see the long series of break codes in it. Yes I can. I can also see several other diagnostic bits in it, such as the domain: http://www.

Re: Regex help

2011-04-21 Thread darxus
On 04/21, Adam Katz wrote: > rawbody LOCAL_5X_BR_TAGS /(?:[\s\r\n]{0,4}){5}/mi I wonder if it would be useful to generalize this as: rawbody LOCAL_8X_TAGS /(?:<[^>]*>[\s\r\n]{0,4}){8}/mi Just a mess of tags in a row without any content. On 04/21, Kevin Miller wrote: > body CBJ_GiveMeA

RE: Regex help

2011-04-21 Thread Kevin Miller
Oops - this should have gone to the list. Sorry. Adam Katz wrote: > Before I help you with your shell and regex issues, I should point out > that this is not a very strong rule. It will hit ham. SNIP > > Better solution: put some examples up on a pastebin and link them to > us so we can help

Re: Regex help

2011-04-21 Thread Adam Katz
> "egrep '[]{5,}' p3L..." prevents the shell from trying to interpret > your query but still has a bad query, as it looks for five or more > consecutive occurrences of any character listed between the angle > brackets, so "brr" will match up to the slash. Between the square brackets ("[" and "]"),

Re: Regex help

2011-04-21 Thread Adam Katz
atching one, zero, or dot. The grouping symbol you are looking for is a curly-bracket, and the dot (when outside a square bracket) must be escaped as it otherwise means "any single character." > However, doing this fails: > mxg:/var/spool/MailScanner/quarantine/20110421/nonspam # e

Re: Regex help

2011-04-21 Thread Martin Gregorie
On Thu, 2011-04-21 at 13:54 -0800, Kevin Miller wrote: > mxg:/var/spool/MailScanner/quarantine/20110421/nonspam # egrep \[]{5,} > p3LJZSnX024470 > That won't do what you want anyway, since its asking for "a sequence of 5 characters, each of which must be one of <,>,b or
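The difference between a character class and a grouped string is easy to see in a short Python sketch. The literal tag was stripped from the archived messages, so <br> below is an assumption, and the test strings are made up.

```python
import re

TAG = "<br>"  # assumed tag; the archive lost the literal markup

# Character class: any five characters drawn from the set {<, b, r, >}, in any order.
char_class = re.compile(r"[<br>]{5}")
# Non-capturing group: the exact string "<br>" repeated five times.
grouped = re.compile(r"(?:<br>){5}")

one_tag_plus_junk = "<br>>"   # five characters, but only one real tag
five_tags = TAG * 5

print(bool(char_class.search(one_tag_plus_junk)))  # class is satisfied by any mix of its characters
print(bool(grouped.search(one_tag_plus_junk)))     # group needs the whole tag five times
print(bool(grouped.search(five_tags)))
```

The square brackets describe a set of single characters, so the quantifier counts characters; the (?:...) group makes the quantifier count whole tags.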

Regex help

2011-04-21 Thread Kevin Miller
eating characters and it returns expected results: mkm@mis-mkm-lnx:~$ egrep \[10.]{3} DomainLiterals.txt you can add a line containing only [10.10.10.10] to /etc/mail/local-host-names where 10.10.10.10 is the IP address you However, doing this fails: mxg:/var/spool/MailScanner/quarantine/20110421

DOB list dead?

2011-04-21 Thread Michael Monnerie
Does anyone know about the state of the "day old bread" list? dob.sibl.support-intelligence.net doesn't seem to get any hits on new domains anymore, and on their contact e-mail address nobody responded to my requests. Any replacement known? -- With kind regards, Michael Monnerie, Ing. B

Fwd: [#IHH-446659]: spam

2011-04-21 Thread Michael Scheidell
so, a while back, I got rid of the linked in spam. (added 4 points to any emails from them). didn't want to blacklist them outright, in case users wanted to whitelist them. now it is back. seems linked spam is not covered under can spam laws, because its 'transactional' (is an 'invitation' f