Mark Martinec wrote:
> Theo Van Dinter writes:
>
>> body rules aren't run on lines, they're run on paragraphs,
>> so that text is in the middle of a string.
>>
>
> Matt Kettler writes:
>
>> Use rawbody for this. Body rules have CR/LF stripped out.
>>
>
> Giving whole paragraphs to regexp is fine, but why are newlines
> stripped out in 'body' rules?

In order to normalize whitespace. This way rules don't have to care about whitespace; they can just be written normally.
Otherwise

    /Hello I'm a spammer/i

would fail to match:

    Hello
    I'm a spammer.

SA also collapses excess spaces in the text that normal body rules see, so spammers can't obfuscate text by simply inserting piles of spaces.

It would be a real pain to have to rewrite the above rule as:

    /Hello\s+I'm\s+a\s+spammer/i

And it would also be much slower if you had to do that for a few hundred rules.

> Perl regexp modifiers m (and s)
> would be handy:
>
>   body L_TEST /^[A-Z]\s[A-Z]\s[A-Z]\s[A-Z]$/m
>
> but as it stands now the m modifier is of no use in 'body' rules
> (unlike in 'rawbody').

True. If you care about whitespace formatting and EOLs, use rawbody. If you want to match text in a straightforward way, use body and let SA's pre-processing of the text deal with simplifying whitespace.
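
To make the trade-off concrete, here is a rough Python sketch. It is not SpamAssassin's actual code: normalize_whitespace is a made-up helper that only approximates the pre-processing described above (newlines removed, runs of whitespace collapsed), and the sample strings are invented for illustration. The regexes are the ones from this thread and behave the same way in Python's re module as in Perl.

```python
import re


def normalize_whitespace(paragraph: str) -> str:
    """Hypothetical stand-in for SA's 'body' pre-processing:
    collapse every run of whitespace (spaces, tabs, CR/LF) into one space."""
    return re.sub(r"\s+", " ", paragraph).strip()


# The plain rule, the whitespace-tolerant rewrite, and the line-anchored rule.
simple_rule = re.compile(r"Hello I'm a spammer", re.IGNORECASE)
tolerant_rule = re.compile(r"Hello\s+I'm\s+a\s+spammer", re.IGNORECASE)
line_rule = re.compile(r"^[A-Z]\s[A-Z]\s[A-Z]\s[A-Z]$", re.MULTILINE)

# Invented sample: a line break plus piles of spaces as obfuscation.
raw = "Hello\r\nI'm   a      spammer."
normalized = normalize_whitespace(raw)

# The plain rule only works once whitespace has been normalized.
simple_on_raw = simple_rule.search(raw) is not None
simple_on_normalized = simple_rule.search(normalized) is not None

# The tolerant rule works on raw text, but every rule would need rewriting.
tolerant_on_raw = tolerant_rule.search(raw) is not None

# A line-anchored rule needs the original line breaks ('rawbody'-style input).
spaced = "some text\nA B C D\nmore text"
line_on_raw = line_rule.search(spaced) is not None
line_on_normalized = line_rule.search(normalize_whitespace(spaced)) is not None

print(simple_on_raw, simple_on_normalized, tolerant_on_raw)
print(line_on_raw, line_on_normalized)
```

With these inputs the plain rule misses the raw text but hits the normalized text, while the ^...$ rule with /m only hits when the line breaks are still there. That is the same split as body versus rawbody.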