>>>...
>>>Date: Thu, 17 Mar 2005 00:29:43 +0100
>>>From: mouss <[EMAIL PROTECTED]>
>>>...
>>>To: List Mail User <[EMAIL PROTECTED]>
>>>Cc: [EMAIL PROTECTED], [EMAIL PROTECTED],
>>>    [EMAIL PROTECTED]
>>>Subject: Re: Is this Received header correctly formatted?
>>>...
>>>
>>>List Mail User wrote:
>>>
>>>>In other words, lowercase is conformant, and your first point is
>>>>not correct (though all the examples do show uppercase). However, you are
>>>>completely correct that the "helo=" is flat out wrong,
>>>
>>>why? it's inside a comment, no?
>>>
>>>>but with a slight
>>>>variation, and it becomes something like "(watson1 [4.16.241.28])" which
>>>>is not only conformant, but is the typical behavior of both sendmail
>>>>and postfix.
>>>
>>>except that here the situation is reversed.
>>>while postfix and sendmail use "from heloname (client_name
>>>[client_ip])", others such as qmail prefer "from client_name
>>>([client_ip]) (helo heloname)" or other variants.
>>>
>>
>> Mouss,
>>
>> You're correct about the reversal; I realized that *after* I sent
>> the message. Also, technically the area after the [client_ip] is not white
>> space. Eric properly pointed out that the entire header field already has
>> an assigned use, and the comment in the definition states
>> specifically not to use information from the HELO.
>>
>> To requote:
>>
>> "TCP-info = Address-literal / ( Domain FWS Address-literal )
>>     ; Information derived by server from TCP connection
>>     ; not client EHLO."
>>
>> Notice the definition does not use any specification for white space after
>> the address literal. The single "space" character does not count; the
>> notation uses that to delineate between atoms and/or tokens. There would have
>> to be a reference to either "FWS" or "WSP" (maybe even "LWSP" might qualify).
>> But since none of those atoms are part of the definition, the area after the
>> literal and before the ')' does not qualify as white space.
>> So the clause
>> "([4.16.241.28] helo=watson1)" seems to be clearly non-conformant.
>
>ahem. the specs provide for comments, and don't restrict comments. so
>whatever is in between parens is ok. the specs even allow silly things
>like Fr(foo)om. btw, unlike what a lot of people seem to think, rfc2821
>is only a "standards track".

I've made this argument myself, but it has been upgraded to "Best
Practices". Also, your "Fr(foo)om" case is not allowed: as you can read
below, a comment is to be parsed as if it were a single space character, so
your example would parse to "Fr om", which is meaningless. Anyway, let's go
back to RFC822, which is a "Standard" and still stands despite the intention
for 2822 to replace it. To quote the `old' restriction on comments:

RFC822 Section 3.4.3

"3.4.3.  COMMENTS

A comment is a set of ASCII characters, which is enclosed in
matching parentheses and which is not within a quoted-string
The comment construct permits message originators to add text
which will be useful for human readers, but which will be
ignored by the formal semantics. Comments should be retained
while the message is subject to interpretation according to
this standard. However, comments must NOT be included in
other cases, such as during protocol exchanges with mail servers.

Comments nest, so that if an unquoted left parenthesis occurs
in a comment string, there must also be a matching right
parenthesis. When a comment acts as the delimiter between a
sequence of two lexical symbols, such as two atoms, it is
lexically equivalent with a single SPACE, for the purposes of
regenerating the sequence, such as when passing the sequence
onto a mail protocol server. Comments are detected as such
only within field-bodies of structured fields.

If a comment is to be "folded" onto multiple lines, then the
syntax for folding must be adhered to. (See the "Lexical
Analysis of Messages" section on "Folding Long Header Fields"
above, and the section on "Case Independence" below.) Note
that the official semantics therefore do not "see" any
unquoted CRLFs that are in comments, although particular
parsing programs may wish to note their presence.
For these programs, it would be reasonable to interpret a "CRLF LWSP-char"
as being a CRLF that is part of the comment; i.e., the CRLF is
kept and the LWSP-char is discarded. Quoted CRLFs (i.e., a
backslash followed by a CR followed by a LF) still must be
followed by at least one LWSP-char."

and RFC822 Section 3.4.6

"3.4.6.  BRACKETING CHARACTERS

There is one type of bracket which must occur in matched pairs
and may have pairs nested within each other:

    o   Parentheses ("(" and ")") are used to indicate comments.
..."

So even in RFC822, comments require parentheses.

>
>> Also, the
>> inclusion of the parenthesis seems to be incorrect for a bare literal;
>
>as far as this is in comments, there is no issue. so
>Received: from foo (whatever is here)
>is ok.

As has already been agreed by both Eric and me, nonconformance in the
headers is no good reason to refuse mail. Violating the rules for
commands is a separate issue. There I would and do refuse mail.

>
>> They
>> are only specified for the second alternative with both the "Domain" and
>> "Address-literals". I do agree that it is not enough of an error that mail
>> should be refused on that basis alone, but if a server were to do so, it
>> would be within its prerogative (and seemingly legal to do so).
>
>as far as I can see, the std allows for a lot of received stuff. the std
>even manages to create a notion of domain that is not compatible with a
>dns domain. after all, smtp has apparently been defined by sendmail....
>
>

There are allowable domains that aren't DNS domains, but they are
all required to be defined in some other RFC (I used .UUCP for many years,
and the old Bitnet domains were never in DNS, but were defined in an RFC).
Again there are restrictions in 822 (the ones I just mentioned require you to
follow the references; 822 has many problems with being incomplete).

RFC822 Section 6.2.1

"6.2.1.  DOMAINS

A name-domain is a set of registered (mail) names.
A name-domain specification resolves to a subordinate name-domain
specification or to a terminal domain-dependent string. Hence,
domain specification is extensible, permitting any number of
registration levels.

Name-domains model a global, logical, hierarchical addressing
scheme. The model is logical, in that an address specification
is related to name registration and is not necessarily tied to
transmission path. The model's hierarchy is a directed graph,
called an in-tree, such that there is a single path from the
root of the tree to any node in the hierarchy. If more than
one path actually exists, they are considered to be different
addresses.

The root node is common to all addresses; consequently, it is
not referenced. Its children constitute "top-level" name-domains.
Usually, a service has access to its own full domain
specification and to the names of all top-level name-domains.

The "top" of the domain addressing hierarchy -- a child of the
root -- is indicated by the right-most field, in a domain
specification. Its child is specified to the left, its child
to the left, and so on.

Some groups provide formal registration services; these
constitute name-domains that are independent logically of
specific machines. In addition, networks and machines
implicitly compose name-domains, since their membership usually
is registered in name tables.

In the case of formal registration, an organization implements
a (distributed) data base which provides an address-to-route
mapping service for addresses of the form:

    [EMAIL PROTECTED]

Note that "organization" is a logical entity, separate from
any particular communication network.

A mechanism for accessing "organization" is universally
available. That mechanism, in turn, seeks an instantiation of the
registry; its location is not indicated in the address
specification. It is assumed that the system which operates under
the name "organization" knows how to find a subordinate registry.
The registry will then use the "person" string to determine
where to send the mail specification.

The latter, network-oriented case permits simple, direct,
attachment-related address specification, such as:

    [EMAIL PROTECTED]

Once the network is accessed, it is expected that a message
will go directly to the host and that the host will resolve
the user name, placing the message in the user's mailbox."

Here the flaw is the lack of a definition for "registered" in this
particular document, but it is (I believe) in RFC733. Notice the (undefined)
reference to host.network, where network is meant to be one of the
"registered" networks defined in other RFCs.

Paul Shupak
[EMAIL PROTECTED]
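
The comment rule quoted above from RFC822 Section 3.4.3 (comments nest, are not recognized inside a quoted-string, and are lexically equivalent to a single SPACE between two lexical symbols) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not a full RFC822 lexer: the name strip_comments is hypothetical, header folding and CRLF handling are ignored, and a stray ")" outside any comment is passed through unchanged.

```python
def strip_comments(field_body: str) -> str:
    """Replace each top-level parenthesized comment with one space.

    Hypothetical helper illustrating RFC822 3.4.3; it does not handle
    folded headers or validate the rest of the field.
    """
    out = []
    depth = 0          # current comment nesting depth
    in_quotes = False  # inside a quoted-string, parentheses are literal
    i = 0
    while i < len(field_body):
        ch = field_body[i]
        # A backslash inside a comment or quoted-string quotes the next char.
        if ch == "\\" and i + 1 < len(field_body) and (depth > 0 or in_quotes):
            if depth == 0:
                out.append(field_body[i:i + 2])
            i += 2
            continue
        if in_quotes:
            out.append(ch)
            if ch == '"':
                in_quotes = False
        elif depth > 0:
            if ch == "(":
                depth += 1      # comments nest
            elif ch == ")":
                depth -= 1
                if depth == 0:
                    out.append(" ")  # whole comment becomes a single SPACE
        elif ch == '"':
            in_quotes = True
            out.append(ch)
        elif ch == "(":
            depth = 1
        else:
            out.append(ch)
        i += 1
    if depth != 0:
        raise ValueError("unbalanced parentheses in comment")
    return "".join(out)


if __name__ == "__main__":
    print(repr(strip_comments("Fr(foo)om")))
    print(strip_comments("from foo (whatever (nested) is here) by bar").split())
```

Under this reading, "Fr(foo)om" comes out as "Fr om" rather than "From", and the Received example reduces to the tokens from, foo, by, bar, with whatever was inside the parentheses playing no part in the formal syntax.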