> On 30 Jul 2022, at 00:30, Peter Pearson <pkpearson@nowhere.invalid> wrote:
>
> The following code produces a nonsense result with the input
> described below:
>
> import mailbox
> box = mailbox.Maildir("/home/peter/Temp/temp",create=False)
> x = box.values()[0]
> h = x.get("X-DSPAM-Factors")
> print(type(h))
> # <class 'email.header.Header'>
>
> The output is the desired "str" when the message file contains this:
>
> To: recipi...@example.com
> Message-ID: <123>
> Date: Sun, 24 Jul 2022 15:31:19 +0000
> Subject: Blah blah
> From: f...@from.com
> X-DSPAM-Factors: a'b
>
> xxx
>
> ... but if the apostrophe in "a'b" is replaced with a
> RIGHT SINGLE QUOTATION MARK, the returned h is of type
> "email.header.Header", and seems to contain inscrutable garbage.

Include in any bug report the exact bytes that are in the header.
It may not be UTF-8 encoded; it may be Windows cp1252, etc.
The repr() of the header bytes will show this.

Barry

>
> I realize that one should not put non-ASCII characters in
> message headers, but of course I didn't put it there, it
> just showed up, pretty much beyond my control. And I realize
> that when software is given input that breaks the rules, one
> cannot expect optimal results, but I'd think an exception
> would be the right answer.
>
> Is this worth a bug report?
>
> --
> To email me, substitute nowhere->runbox, invalid->com.
> --
> https://mail.python.org/mailman/listinfo/python-list
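
One way to capture the exact header bytes for a report is sketched below. It assumes the message is available as raw bytes (for a Maildir, mailbox.Maildir.get_bytes(key) should provide them). The helper name raw_header_lines and the sample message are hypothetical, and the sample assumes the RIGHT SINGLE QUOTATION MARK arrived as UTF-8; the real file may use cp1252 or something else, which is what the repr would reveal. The helper does not handle folded continuation lines.

```python
import email
import email.header

# Hypothetical sample message. The \xe2\x80\x99 sequence is U+2019 in UTF-8,
# which is an assumption about the encoding, not a fact from the report.
raw = (
    b"Subject: Blah blah\n"
    b"X-DSPAM-Factors: a\xe2\x80\x99b\n"
    b"\n"
    b"xxx\n"
)


def raw_header_lines(data, name):
    """Return the undecoded header lines whose field name matches `name`.

    Only the first physical line of each matching header is returned;
    folded continuation lines are ignored in this sketch.
    """
    prefix = name.lower().encode("ascii") + b":"
    found = []
    for line in data.splitlines():
        if not line:  # the first blank line ends the header block
            break
        if line.lower().startswith(prefix):
            found.append(line)
    return found


lines = raw_header_lines(raw, "X-DSPAM-Factors")
print(repr(lines[0]))  # b'X-DSPAM-Factors: a\xe2\x80\x99b'

# With the default (compat32) policy, the parsed value of a header that
# contains raw non-ASCII bytes comes back as a Header object, not a str.
msg = email.message_from_bytes(raw)
h = msg.get("X-DSPAM-Factors")
print(type(h))
```

Pasting the repr line into the bug report shows the reader exactly which bytes were in the file, independent of how any mail client or terminal renders them.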