From: "Ralph Seichter" <[EMAIL PROTECTED]>

> jdow wrote:
>
> > > 2.2. Header Fields
> > > Header fields are lines composed of a field name, followed by a
> > > colon (":"), followed by a field body, and terminated by CRLF.
> > > A field name MUST be composed of printable US-ASCII characters
> > > (i.e., characters that have values between 33 and 126,
> >                                              ^^ NOTE
> > >
> > > inclusive), except colon. A field body may be composed of any
> > > US-ASCII characters, except for CR and LF. [...]
> >
> > NOTE: Character 32 is space. Character 33 is !. The subject does NOT
> > begin with the space character. It begins with the first character
> > past the space.
>
> Perhaps you misread the RFC excerpt a bit? only the field name (!)
> must be composed of characters between 33 and 126. The definition

No - zero or more spaces are ignored, with the first real character being
"!" through "~". For the rest of the message the space character is
allowed.

> subject = "Subject:" unstructured CRLF
>
> implies that, as far as I understand, the field body starts with the
> character immediately after the colon.

As long as that first character is not a space. (Arguably Outlook Express
gets it wrong, presuming any character past the first space is part of the
subject. However, for OE I believe the subject header can be either
"Subject:" or "Subject: ". The latter is used if it matches; otherwise
the former is used.)

> > Now, as to how SpamAssassin parses the Subject field is open for
> > question. It appears a lot of rules seem to start presuming zero
> > or more blank characters followed by the real search string.
>
> As I wrote before: I believe that many software products dealing
> with email assume that the field body starts with the first
> non-whitespace character after zero or more whitespaces, or that they
> make use of functions like trim() to remove any leading/trailing
> whitespaces as they see fit, i.e. when storing or displaying
> messages. I don't know if checking for "surplus" whitespaces in
> field bodies has a realistic chance of success.

Darned few presume ANY first character is part of the body of the
subject. Most, in my experience, skip at least the first one. Often
(usually?) they will skip all space characters following the colon until
the first non-space character.

I've never run across an email program that treats "Subject: Spoo" as
having a subject body, for presentation to the user, of " Spoo". I've run
across many that will treat "Subject:  Spoo" (two spaces) as " Spoo" for
presentation to the user. (That many may be most.) The ones that do not
will treat it as having a subject of "Spoo" instead. I have also noticed
that all the email programs I've played with accept the line
"Subject:Spoo" as having a subject of "Spoo".

This seems to be the reading they have taken on the quoted three
paragraphs. I take 2.2 as defining the rule and 2.2.2 as a subset of that
definition that is a trifle ambiguous. Certainly "Subject:Spoo" is legal.
It is unconventional. "Subject: Spoo" is the general convention. And
"Subject:  Spoo" (two spaces) is open to interpretation regarding whether
or not that second space is part of the subject. The first space is not,
by paragraph 2.2.

{^_^}
(My, that's a lot of slow-moving tasty creatures. JMS would be proud.)
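
For what it's worth, the three readings argued over in this thread can be
sketched in a few lines of Python. This is only an illustration under
stated assumptions: the names split_field, is_valid_field_name and
body_readings are made up for this post, the input is a single header line
that has already been unfolded, and none of it claims to be how
SpamAssassin, OE, or any other mail program actually does it.

```python
def split_field(line: str) -> tuple[str, str]:
    """Split one unfolded header line at the first colon.

    Returns (field_name, raw_body), where raw_body is everything after
    the colon with only the trailing CR/LF removed.
    """
    name, sep, body = line.rstrip("\r\n").partition(":")
    if not sep:
        raise ValueError("no colon in header line")
    return name, body


def is_valid_field_name(name: str) -> bool:
    """Field-name check per the quoted 2.2 excerpt:
    characters with values 33..126 inclusive, except colon."""
    return bool(name) and all(33 <= ord(c) <= 126 and c != ":" for c in name)


def body_readings(line: str) -> dict[str, str]:
    """Return the field body under the three readings from the thread."""
    _, raw = split_field(line)
    return {
        # Body starts with the character immediately after the colon.
        "raw": raw,
        # Exactly one space after the colon is a separator, the rest is body.
        "skip_one_space": raw[1:] if raw.startswith(" ") else raw,
        # All spaces after the colon are skipped.
        "skip_all_spaces": raw.lstrip(" "),
    }
```

With a two-space line, body_readings("Subject:  Spoo\r\n") gives "  Spoo",
" Spoo" and "Spoo" for the three readings, while "Subject:Spoo" gives
"Spoo" for all of them. That is the whole disagreement in one table.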