Hi Eric,

Actually, the "full" rules don't ignore HTML at all. They can search within HTML tags just fine, and can also take line breaks into account, because they are run before SA does any decoding of the email. I use a bunch of custom full rules for this exact purpose.
From http://spamassassin.apache.org/dist/doc/Mail_SpamAssassin_Conf.html#rule_definitions_and_privileged_settings:

  "The full message is the pristine message headers plus the pristine message body, including all MIME data such as images, other attachments, MIME boundaries, etc."

To take line breaks into account you probably need the /s modifier at the end of the rule, which enables "single-line mode" (so that "." also matches a newline). E.g.:

  full IMG_SRC /<img src cid:[0-9]+>/is

...although I don't think this exact rule will actually hit on anything, as the HTML will take a form something like this:

  <img src="cid:223505420@08042006-0FEA">

...with the equals sign and quote mark after "src", and with not only digits but also other characters within the cid part, such as @ or hyphens. You also have to take into account other tag attributes, such as height and width, which could appear between "img" and "src".

Furthermore, if the email was encoded in Quoted-Printable, it will probably look more like this (actual example from one of my emails):

  <IMG height=3D72 =
  src=3D"cid:223505420@08042006-0FEA" width=3D494=20
  border=3D0>

Note the extra end-of-line equals sign on the first row, and the "3D" and "=20" bits, which are put there by the Quoted-Printable encoding and which will not be removed by SA before the full rule is run.

So what I'd do is write a rule like this:

  full IMG_SRC /<img.{1,100}cid:/is

Or perhaps more efficiently, this one, which should need less backtracking:

  full IMG_SRC /<img ([^>](?!cid))+.cid:/is

I wouldn't bother trying to detect the string after the "cid:" bit, i.e. the digits etc., unless you had a particular need to. Simply detecting the existence of "cid:" within the IMG tag is enough to determine that the email has an embedded/inline image within the HTML.

Hope that helps!
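In case it is useful, here is a minimal sketch for sanity-checking the three patterns above outside of SA. It is written in Python rather than as SA rules, on the assumption that re.IGNORECASE and re.DOTALL behave like Perl's /i and /s for these particular patterns. The names RAW_QP_HTML, PLAIN_HTML, REMOTE_HTML, TWO_TAGS, PATTERNS and hits() are made up for the example, and the REMOTE_HTML and TWO_TAGS samples are invented test strings, not taken from a real message.

```python
import re

# Raw, still Quoted-Printable encoded HTML, as a "full" rule would see it.
# "=\n" is a QP soft line break; "=3D" is an encoded "=".
RAW_QP_HTML = (
    '<IMG height=3D72 =\n'
    'src=3D"cid:223505420@08042006-0FEA" width=3D494=20\n'
    'border=3D0>'
)

# The same kind of tag without Quoted-Printable encoding.
PLAIN_HTML = '<img src="cid:223505420@08042006-0FEA">'

# Invented samples (assumptions): a remote image, and an image tag
# followed by a different tag that happens to contain "cid:".
REMOTE_HTML = '<img src="http://example.com/a.png">'
TWO_TAGS = '<img src="x.png"><a href="cid:1">'

# Python stand-ins for the Perl /i and /s modifiers.
FLAGS = re.IGNORECASE | re.DOTALL

PATTERNS = {
    "literal": r'<img src cid:[0-9]+>',
    "bounded": r'<img.{1,100}cid:',
    "lookahead": r'<img ([^>](?!cid))+.cid:',
}


def hits(name, text):
    """Return True if the named pattern matches anywhere in text."""
    return re.search(PATTERNS[name], text, FLAGS) is not None


if __name__ == "__main__":
    samples = {
        "RAW_QP_HTML": RAW_QP_HTML,
        "PLAIN_HTML": PLAIN_HTML,
        "REMOTE_HTML": REMOTE_HTML,
        "TWO_TAGS": TWO_TAGS,
    }
    for sample_name, text in samples.items():
        for pattern_name in PATTERNS:
            print(sample_name, pattern_name, hits(pattern_name, text))
```

On these samples the literal pattern matches nothing, while the other two match both cid: forms and leave the remote image alone. One difference worth knowing: the .{1,100} version can run past the closing ">" and pick up a "cid:" in a later tag (the TWO_TAGS case), which the [^>] version can't.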
Cheers,
Jeremy

---------------------------------------------------------------
"Eric Hart" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]

Hi folks,

Let's say that I want to recognize this HTML tag in a rawbody rule:

  <img src cid:[random number]>

It's easy to write a rule that recognizes this. I use "rawbody" because "full" and "body" ignore html.

Now suppose that there's a line break in the html tag. This is legal, and is still recognized by mail clients:

  <img src
  cid:[random number]>

It's not possible to write a rawbody rule that recognizes this! The problem seems to be that rawbody looks at the message "one line at a time". I won't bore you with every way I've tried to create a rule that spans this line break, but none of them have worked.

Has anyone encountered/resolved this issue?

Cordially,
Eric Hart
ehart at npi dot net