Cameron Simpson <c...@cskk.id.au> wrote:
> On 27Aug2020 23:54, Marco Sulla <marco.sulla.pyt...@gmail.com> wrote:
> >Are you sure you want `str()`?
> >
> >>>> str(b'aaa')
> >"b'aaa'"
> >
> >Probably you want:
> >
> >map(lambda x: x.decode(), bbb)
>
> _And_ you need to know the encoding of the text in the bytes. The above
> _assumes_ UTF-8 because that is the default for bytes.decode, and if
> that is _not_ what is in the bytes objects you will get mojibake.
>
> Because a lot of stuff is "mostly ASCII", this is the kind of bug which
> can lurk until much later when you have less usual data.
>
If there's an encoding given in the header of the incoming E-Mail then
one (hopefully) knows what the encoding is. However, you have to be able
to handle the more general case where the encoding either isn't given
or is wrong. In the real world E-Mail survives having an incorrect
encoding in the header: what you see is missing or garbled characters,
with the remainder being OK. Garbling the whole lot isn't a good
approach.
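
Something along these lines is roughly what I mean. It's only a sketch: the name decode_tolerantly and its parameters are made up for illustration, and falling back to UTF-8 is an assumption on my part, not a rule. It tries the declared charset strictly, then the fallback, and only as a last resort replaces the undecodable bytes, so the damage stays local to the bad characters.

```python
def decode_tolerantly(raw, declared=None, fallback="utf-8"):
    """Decode bytes without ever raising, keeping as much text as possible.

    'declared' is whatever charset the message header claimed (may be
    None or wrong); 'fallback' is an assumed default, not a standard.
    """
    # First try the declared charset, then the fallback, both strictly.
    for charset in (declared, fallback):
        if not charset:
            continue
        try:
            return raw.decode(charset)
        except (UnicodeDecodeError, LookupError):
            # UnicodeDecodeError: the bytes don't fit this charset.
            # LookupError: the charset name itself is unknown to Python.
            continue
    # Last resort: keep the decodable text and replace only the bad
    # bytes with U+FFFD instead of failing on the whole message.
    return raw.decode(fallback, errors="replace")
```

With that, the map() from earlier in the thread could become map(decode_tolerantly, bbb), passing the header's charset as the second argument where one is available.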
-- 
Chris Green