On Tue, 16 May 2006, Craig McLean wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> [snipped]
>
> I use this style to catch a couple of common text formatting oddities
> caused by machine-generated input, see:
> http://fukka.co.uk/sa-rules/local/textstyles.cf
>
> Thinking about it, this stuff will nest fairly well, so this should work:
>
> rawbody T_30_DODGY_DIVS m'(?:<DIV>\s{0,}?[\$%\w]\s{0,}?</DIV>.{1,40}?){30}'i
>
> Stick with rawbody, you don't need full. Also, you'll probably want
> case-insensitive, and \s{0,}? to match zero or more whitespace.

The only problem with that is that "rawbody" processes the original message
one line at a time, unlike "full" or "body", which concatenate the whole
message into one large string. So if you're looking for some characteristic
of a message that is spread across multiple lines of input, you cannot use
"rawbody". Thus you are -very- unlikely to find 30 repetitions of your
pattern within any single line of the input message.

This 'feature' of rawbody has already been the subject of various threads
on this list.

-- 
Dave Funk                                  University of Iowa
<dbfunk (at) engineering.uiowa.edu>        College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{
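
To illustrate the line-at-a-time point, here is a rough Python sketch (not SpamAssassin code, and not how SpamAssassin is implemented internally). It applies the quoted pattern first to each line separately and then to the lines joined into one string. The sample message is made up, and joining with a single space is an assumption chosen only to mimic "one large string".

```python
import re

# The pattern from the quoted rule, carried over to Python's re module.
# re.IGNORECASE stands in for the trailing 'i' modifier.
PATTERN = re.compile(
    r'(?:<DIV>\s{0,}?[\$%\w]\s{0,}?</DIV>.{1,40}?){30}',
    re.IGNORECASE,
)

# Hypothetical message body: 31 lines, one single-character DIV per line.
lines = ["<DIV>%s</DIV>" % ch for ch in "abcdefghijklmnopqrstuvwxyz01234"]

# Line-at-a-time matching: each line is tested on its own, so no single
# line can ever contain 30 repetitions.
per_line_hit = any(PATTERN.search(line) for line in lines)

# Whole-message matching: the lines are joined into one string first
# (assumption: joined with a space), so the repetitions are contiguous.
whole_hit = PATTERN.search(" ".join(lines)) is not None

print("per-line match:     ", per_line_hit)
print("whole-message match:", whole_hit)
```

The same pattern fails per line and succeeds on the joined string, which is the behaviour described above for a characteristic spread across multiple lines.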