From: "Ralph Seichter" <[EMAIL PROTECTED]>

> Kris Deugau wrote:
>
> > If I recall the appropriate RFC correctly, you're looking for
> > something that - by definition - doesn't exist. Whitespace is
> > whitespace, so the content of a header begins with the first
> > non-whitespace character after the colon.
>
> I checked <http://www.faqs.org/rfcs/rfc2822.html> for this:
>
> 2.2. Header Fields
> Header fields are lines composed of a field name, followed by a
> colon (":"), followed by a field body, and terminated by CRLF.
> A field name MUST be composed of printable US-ASCII characters
> (i.e., characters that have values between 33 and 126,
                                             ^^ NOTE
> inclusive), except colon. A field body may be composed of any
> US-ASCII characters, except for CR and LF. [...]
>
> 2.2.1. Unstructured Header Field Bodies
> Some field bodies in this standard are defined simply as
> "unstructured" (which is specified below as any US-ASCII
> characters, except for CR and LF) with no further restrictions.
> These are referred to as unstructured field bodies. Semantically,
> unstructured field bodies are simply to be treated as a single
> line of characters with no further processing (except for header
> "folding" and "unfolding" as described in section 2.2.3).
>
> 3.6.5. Informational fields
> The informational fields are all optional. The "Keywords:"
> field contains a comma-separated list of one or more words or
> quoted-strings. The "Subject:" and "Comments:" fields are
> unstructured fields as defined in section 2.2.1, and therefore
> may contain text or folding white space.
>
> subject = "Subject:" unstructured CRLF
>
> If I understand this correctly, the field body always starts with
> the character after the colon, whitespace or not. I'm quite certain
> that many SW implementations share your point of view, though.

NOTE: Character 32 is space. Character 33 is "!". The subject does NOT
begin with the space character. It begins with the first character past
the space. The field body may contain space characters, and indeed
virtually any other character.

Now, how SpamAssassin actually parses the Subject field is open to
question. A lot of rules appear to start by presuming zero or more blank
characters followed by the real search string.

{^_^}
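
To make the two readings concrete, here is a minimal Python sketch. It is only an illustration, not how SpamAssassin parses headers; the helper `split_header`, the sample subject, and both patterns are made up for the example. It splits a header line at the first colon, then shows that a rule beginning with "zero or more blanks" matches whether or not the leading space is counted as part of the body, while a rule anchored directly on the text only matches the stripped form.

```python
import re


def split_header(line: str) -> tuple[str, str]:
    """Split one unfolded header line into (field name, raw field body).

    The raw body is everything after the first colon, leading space
    included, with the terminating CRLF removed.
    """
    name, sep, body = line.rstrip("\r\n").partition(":")
    if not sep:
        raise ValueError("not a header field: no colon found")
    return name, body


# Hypothetical sample header, as it would appear on the wire.
name, raw_body = split_header("Subject: Cheap meds\r\n")

# Reading 1: the body starts right after the colon (leading space kept).
# Reading 2: the body starts at the first non-whitespace character.
stripped_body = raw_body.lstrip(" \t")

# A rule written defensively: optional blanks, then the real search string.
tolerant_rule = re.compile(r"^\s*Cheap")
# A rule that assumes the leading whitespace has already been removed.
strict_rule = re.compile(r"^Cheap")

tolerant_on_raw = tolerant_rule.search(raw_body) is not None
tolerant_on_stripped = tolerant_rule.search(stripped_body) is not None
strict_on_raw = strict_rule.search(raw_body) is not None
strict_on_stripped = strict_rule.search(stripped_body) is not None
```

Writing rules in the tolerant style sidesteps the question entirely, which may be why so many rules start that way.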