On Sat, December 28, 2013 7:57 pm, John Hardin wrote:
> And in case you actually did mean From: rather than recipient address...

Sorry, no, I meant To: as you surmised.

> Unfortunately __SUBJ_HAS_TO_1 isn't performing well enough against the
> current masscheck corpora to be published. It's possible that the majority
> of mail has the Subject: header appear before the To: header (which would
> be the __SUBJ_HAS_TO_2 rule I haven't gotten around to writing yet).
>
> It also doesn't work on a "bare" email address in the To: header, the RE
> requires proper angle-bracket form.

Yes, I definitely noticed that.  As you can see from the spample (link
below), none of the above rules are hitting properly; the To: line is a
bare email, not properly angle-bracketed.  Or, if any of the rules are
hitting, the meta rule that they are supposed to trigger still does not
trigger.

> If you can post a spample or two somewhere I'll see what I can do.

Spample:
http://pastebin.com/0jEMBA1X

The other unfortunate thing is that this SHOULD have hit
HTML_COMMENT_GIBBERISH (my own home-baked version, since there's no
public one), but it didn't.  As I have posted previously, I've had
trouble getting SA to hit HTML_COMMENT_GIBBERISH even when it should:
when I paste the mail into regexpal.com, the regex matches, but SA, for
some unknown reason, reports no hit.  So I guess it's not surprising that
HTML_COMMENT_GIBBERISH didn't hit here, but I still don't know why not.

(For posterity, my HTML_COMMENT_GIBBERISH rule is the following:
rawbody HTML_COMMENT_GIBBERISH /<!--\s*(?:[\w'"?.:;-]+\s+){100,}\s*-->/im

I'm sure this isn't the best possible version (clearly not, since it
doesn't hit when it should), but I haven't found anything better yet.
John, you said you had your own homebrew version of this that you were
thinking of making public; what's the status of that?)
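
One thing that might be worth ruling out is a difference in what the regex
is applied to.  regexpal.com tests the pattern against the whole pasted
message at once, whereas SA may hand a rawbody rule smaller pieces of the
body (individual lines or chunks; this is an assumption to check against
the Mail::SpamAssassin::Conf documentation for the installed version, not
something confirmed in this thread).  A comment of 100+ words almost
certainly spans several lines, so a pattern that needs both "<!--" and
"-->" inside the same piece could never match.  The Python sketch below
only illustrates that effect: the sample message and the helper names are
made up, and Python's re module stands in for Perl's regex engine.

```python
import re

# The pattern from the rule above, with the same /i and /m modifiers.
PATTERN = re.compile(
    r"""<!--\s*(?:[\w'"?.:;-]+\s+){100,}\s*-->""",
    re.IGNORECASE | re.MULTILINE,
)


def make_sample(words=120, per_line=10):
    """Build a hypothetical HTML body whose comment holds `words` filler words,
    wrapped at `per_line` words per line (not the real spample)."""
    tokens = ["word%d" % i for i in range(words)]
    lines = [" ".join(tokens[i:i + per_line]) for i in range(0, words, per_line)]
    return "<html><body>\n<!--\n" + "\n".join(lines) + "\n-->\n</body></html>\n"


def hits_whole(text):
    """Match against the entire text at once, as an online regex tester does."""
    return PATTERN.search(text) is not None


def hits_per_line(text):
    """Match each line separately (ASSUMPTION: a stand-in for however SA
    actually splits the raw body before running the rule)."""
    return any(PATTERN.search(line) for line in text.splitlines())


sample = make_sample()
print("whole text:", hits_whole(sample))
print("line by line:", hits_per_line(sample))
```

If the two results differ for the real spample as well, the rule itself may
be fine and the issue is the unit of text it gets tested against.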

Thanks in advance for the help.

                                                --- Amir
