Cameron Simpson <c...@cskk.id.au> wrote:
> On 27Aug2020 09:16, Chris Green <c...@isbd.net> wrote:
> >Cameron Simpson <c...@cskk.id.au> wrote:
> >> But note: joining bytes like strings is uncommon, and may indicate that
> >> you should be working in strings to start with. Eg you may want to
> >> convert popmsg from bytes to str and do a str.join anyway. It depends on
> >> exactly what you're dealing with: are you doing text work, or are you
> >> doing "binary data" work?
> >>
> >> I know many network protocols are "bytes-as-text", but that is
> >> accomplished by implying an encoding of the text, eg as ASCII, where
> >> characters all fit in single bytes/octets.
> >>
> >Yes, I realise that making everything a string before I start might be
> >the 'right' way to do things but one is a bit limited by what the mail
> >handling modules in Python provide.
>
> I do ok, though most of my message processing happens to messages
> already landed in my "spool" Maildir by getmail. My setup uses getmail
> to get messages with POP into a single Maildir, and then I process the
> message files from there.
>
Most of my mail is delivered by SMTP; I run a Postfix SMTP *server* on
my desktop machine which stays on permanently.
The POP3 processing is solely to collect E-Mail that ends up in the
'catchall' mailbox on my hosting provider. It empties the POP3 catchall
mailbox, checks for anything that *might* be for me or other family
members then just deletes the rest.

> >E.g. in this case the only (well the only ready made) way to get a
> >POP3 message is using poplib and this just gives you a list of lines
> >made up of "bytes as text" :-
> >
> >    popmsg = pop3.retr(i+1)
>
> Ok, so you have bytes? You need to know.
>
The documentation says (and it's exactly the same for Python 2 and
Python 3):-

    POP3.retr(which)
        Retrieve whole message number which, and set its seen flag.
        Result is in form (response, ['line', ...], octets).

Which isn't amazingly explicit unless 'line' implies a string.

> >I join the lines to feed them into mailbox.mbox() to create a mbox I
> >can analyse and also a message which can be sent using SMTP.
> >
> >Should I be converting to string somewhere?
>
> I have not used poplib, but the Python email modules have a BytesParser,
> which gets you a Message object; I would feed the poplib bytes to that
> to parse the received message. A Message object can then be transcribed
> as text via its .as_string method. Or you can do other things with it.
>
> I think my main points are:
>
> - know whether you're using bytes (uninterpreted data) or text (strings
>   of _characters_); treating bytes _as_ text implies an encoding, and
>   when that assumption is incorrect you get mojibake[1]
>
> - look at the email modules' parsers, which return Messages, a
>   representation of the message in a structure (so that MIME subparts
>   etc are correctly broken out, and the character sets are _known_, post
>   parse)

OK, thanks Cameron.

-- 
Chris Green
-- 
https://mail.python.org/mailman/listinfo/python-list
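
For reference, a rough sketch of the BytesParser route suggested above. It assumes Python 3, where poplib returns the response and each line as bytes with the line ending already stripped. The popmsg tuple here is a made-up stand-in for the result of pop3.retr(i+1), so the addresses, subject, body and octet count are purely illustrative.

```python
from email import policy
from email.parser import BytesParser

# Hypothetical stand-in for pop3.retr(i+1) under Python 3:
# (response, [line, ...], octets), with the response and lines as bytes.
# The octet count is illustrative only and is not used below.
popmsg = (
    b"+OK 100 octets",
    [
        b"From: sender@example.com",
        b"To: chris@example.com",
        b"Subject: test",
        b"",
        b"Hello from the catchall mailbox.",
    ],
    100,
)

# Join as bytes (not str) so no encoding is assumed at this point.
raw = b"\r\n".join(popmsg[1]) + b"\r\n"

# Let the email package work out headers, MIME structure and charsets.
msg = BytesParser(policy=policy.default).parsebytes(raw)

print(msg["Subject"])     # header value, decoded to text
print(msg.get_content())  # body of a simple text message, decoded to text

# Text form of the whole message, if a str is wanted afterwards.
text = msg.as_string()
```

The point of the sketch is only that the data stays as bytes until the parser has had a look at it. The resulting Message object should be usable directly with mailbox.mbox.add() and smtplib.SMTP.send_message(), though that is not exercised here.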