Bugs item #504152, was opened at 2002-01-16 01:31
Message generated for change (Settings changed) made by gbrandl
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=504152&group_id=5470

Please note that this message will contain a full copy of the comment
thread, including the initial issue submission, for this request,
not just the latest update.

Category: Python Library
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Richard Jones (richard)
>Assigned to: Barry A. Warsaw (bwarsaw)
Summary: rfc822 long header continuation broken

Initial Comment:
I don't believe this is fixed in 2.1.2 or 2.2, but haven't checked.

The code in rfc822.Message.readheaders incorrectly unfolds long
message headers. The relevant information from rfc2822 is in
section 2.2.3. In short:

"""
The process of moving from this folded multiple-line representation
of a header field to its single line representation is called
"unfolding". Unfolding is accomplished by simply removing any CRLF
that is immediately followed by WSP. Each header field should be
treated in its unfolded form for further syntactic and semantic
evaluation.
"""

This means that the code in readheaders:

    if headerseen and line[0] in ' \t':
        # It's a continuation line.
        list.append(line)
        x = (self.dict[headerseen] + "\n " + line.strip())
        self.dict[headerseen] = x.strip()
        continue

should be:

    if headerseen and line[0] in ' \t':
        # It's a continuation line.
        list.append(line)
        x = self.dict[headerseen] + line
        self.dict[headerseen] = x.strip()
        continue

ie. no stripping of the leading whitespace and no adding the newline.

----------------------------------------------------------------------

Comment By: Richard Jones (richard)
Date: 2003-11-10 22:28

Message:
Logged In: YES
user_id=6405

OK, I've sent a message, but I don't have the time to sign up to
the list.

----------------------------------------------------------------------

Comment By: Barry A. Warsaw (bwarsaw)
Date: 2003-11-10 22:04

Message:
Logged In: YES
user_id=12800

Since this was never addressed in the email package either, perhaps
you'd like to bring it up in the email-sig?

----------------------------------------------------------------------

Comment By: Richard Jones (richard)
Date: 2003-11-10 21:35

Message:
Logged In: YES
user_id=6405

Hurm. This issue has been lost to the void, but it's as valid today
as it ever was. I've just had another user of Roundup run into the
same thing:

    RE: [issue51] Mails being delayed [assignedto=stuartm;priority=medium]

(that should be one long line) became

    RE: [issue51] Mails being delayed [assignedto=stuartm;priority=me dium]

when sent by Outlook. Note that the current code reconstructs that
line as "me\ndium" which is about as wrong as it can get, as there's
no way for my code to determine whether that *should* be "me dium" or
"medium" since the other whitespace has been stripped (so just
stripping out the newline, as my code currently does, doesn't help).

I stand by my original post, requesting that the code be fixed as
stated.

----------------------------------------------------------------------

Comment By: Barry A. Warsaw (bwarsaw)
Date: 2002-04-15 17:28

Message:
Logged In: YES
user_id=12800

There is some value in not unfolding long lines by default. FWIW,
the email package also retains the line breaks for such multi-line
headers. The advantage to retaining this is that message
input/output can be idempotent (i.e. you get the same thing in as
you get out). This can be useful when using the message to generate
a hash value, and for other user-friendly reasons.

That being said, there is also some use in providing a way to return
the unfolded line. I don't see a lot of benefit in adding such a
feature to the rfc822 module, but I could see adding it to the email
package. Specifically, I would propose to add it to the
email.Header.Header class, either as a separate method (e.g.
Header.unfold()) or as a default argument to the Header.encode()
method (e.g. Header.encode(self, unfold=0)). If we did the latter,
then I'd change Header.__str__() to call .encode(unfold=1).

Assigning to Ben to get his feedback. Ben, feel free to comment and
re-assign this bug to me.

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-01-16 12:14

Message:
Logged In: YES
user_id=21627

Even though it might not matter, I don't think changing it would
hurt, either, and the change brings it definitely closer to
following the word of RFC 2822. If no case is brought forward where
it matters, fixing it for 2.3 alone should be sufficient.

----------------------------------------------------------------------

Comment By: Richard Jones (richard)
Date: 2002-01-16 12:12

Message:
Logged In: YES
user_id=6405

Yes, we had someone submit a bug report on the roundup users mailing
list because someone had sent a message to the roundup mail gateway
which was split. The client was extra-specially broken, since it
split in the middle of a word (which is not to spec), but the more
general case of folding on whitespace will cause roundup problems
since I hadn't expected there to be any newlines in the header.

I can modify roundup to strip out the newline, but it'd be nice to
have rfc822.Message not put it in there...

----------------------------------------------------------------------

Comment By: Guido van Rossum (gvanrossum)
Date: 2002-01-16 02:47

Message:
Logged In: YES
user_id=6380

Richard, have you found a situation where it matters? I thought that
usually the next phase calls for normalizing whitespace by squashing
repeated spaces/tabs and removing them from front and back.

----------------------------------------------------------------------

_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
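
For reference, the unfolding rule quoted in the initial comment (RFC 2822, section 2.2.3) can be sketched in a few lines of Python. This is only an illustration, not a patch to rfc822. The names unfold and legacy_join are hypothetical; legacy_join merely imitates the joining behaviour of the readheaders snippet quoted in the report; and accepting a bare LF alongside CRLF is an assumption made here because lines read from a file object often carry only "\n".

```python
import re

# A line break (CRLF, or a bare LF as an assumption) that is immediately
# followed by WSP, i.e. a space or a tab. The lookahead keeps the WSP itself.
_FOLD = re.compile(r"\r?\n(?=[ \t])")


def unfold(raw_value):
    # Hypothetical helper: remove every line break that is immediately
    # followed by WSP, leaving all other characters untouched.
    return _FOLD.sub("", raw_value)


def legacy_join(lines):
    # Hypothetical helper imitating the behaviour quoted in the report:
    # each continuation line is stripped and appended after "\n ".
    value = lines[0].strip()
    for continuation in lines[1:]:
        value = (value + "\n " + continuation.strip()).strip()
    return value


if __name__ == "__main__":
    folded = "[assignedto=stuartm;priority=me\r\n dium]"
    print(repr(unfold(folded)))
    print(repr(legacy_join(folded.split("\r\n"))))
```

With the Outlook example from the thread, unfold keeps the whitespace that followed the line break, so the caller can see that the client inserted a space in the middle of a word. legacy_join returns a value with an embedded newline and a single normalized space, so the original whitespace can no longer be recovered.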