On Tue, Apr 23, 2019 at 10:58 AM Eli the Bearded <*@eli.users.panix.com> wrote:
>
> Here's some code I wrote today:
>
> ------ cut here 8< ------
> HEXCHARS = (b'0', b'1', b'2', b'3', b'4', b'5', b'6', b'7', b'8', b'9',
>             b'A', b'B', b'C', b'D', b'E', b'F',
>             b'a', b'b', b'c', b'd', b'e', b'f')
>
>
> # decode a single hex digit
> def hord(c):
>     c = ord(c)
>     if c >= ord(b'a'):
>         return c - ord(b'a') + 10
>     elif c >= ord(b'A'):
>         return c - ord(b'a') + 10
>     else:
>         return c - ord(b'0')
>
>
> # decode quoted printable, specifically the MIME-encoded words
> # variant which is slightly different than the body text variant
> def decodeqp(v):
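
A side note on hord() as quoted: the uppercase branch subtracts ord(b'a') rather than ord(b'A'), so b'A' through b'F' come out negative (b'A' gives 65 - 97 + 10 = -22). A minimal corrected sketch, keeping the same structure and assuming the argument is always a length-1 bytes object that has already been checked against HEXCHARS:

```python
def hord(c):
    """Decode a single hex digit given as a length-1 bytes object."""
    c = ord(c)
    if c >= ord(b'a'):
        return c - ord(b'a') + 10
    elif c >= ord(b'A'):
        # The quoted version subtracts ord(b'a') here, which goes negative.
        return c - ord(b'A') + 10
    else:
        return c - ord(b'0')
```

For valid digits this should agree with the builtin int(c, 16), which accepts bytes as well as str, so the helper may not be needed at all.
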
Have you checked to see if Python can already do this? You mention
quopri from the stdlib (that's
https://docs.python.org/3/library/quopri.html for those following
along at home), so I'm curious in which ways your code differs from
it; it might be that the easiest approach is to use that module and
then add some extra framing around the outside of it.

> But the bytes() thing is really confusing me. Most of this is translated
> from C code I wrote some time ago. I'm new to python and did spend some
> time reading:
>
> https://docs.python.org/3/library/stdtypes.html#bytes-objects
>
> Why does "bytes((integertype,))" work? I'll freely admit to stealing
> that trick from /usr/lib/python3.5/quopri.py on my system. (Why am I not
> using quopri? Well, (a) I want to learn, (b) it decodes to a file
> not a variable, (c) I want different error handling.)

The bytes constructor will take an iterable of integers (each in
range(256)) and return a byte string with those values. For instance,
bytes([1, 2, 3, 4, 5]) is the same as bytes(range(1, 6)) and is the
same as b"\1\2\3\4\5". In this case, the iterable is a tuple of one
byte value.

> Is there a more python-esque way to convert what should be plain ascii

What does "plain ASCII" actually mean, though?

> into a binary "bytes" object? In the use case I'm working towards the
> charset will not be ascii or UTF-8 all of the time, and the charset
> isn't the responsibility of the python code. Think "decode this if
> charset matches user-specified value, then output in that same charset;
> otherwise do nothing."

I'm not sure what this means, but I would strongly recommend just
encoding and decoding regardless. Use text internally and bytes at the
outside.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
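
To make the points above concrete, here is a rough sketch rather than a finished decoder. It shows bytes() called with an iterable versus a bare integer, quopri.decodestring() returning a value instead of writing to a file, and the "text internally, bytes at the outside" pattern. The sample input and the charset variable are made up for illustration; in real use the charset would come from the encoded word or from the user.

```python
import quopri

# bytes() given an iterable of ints builds one byte per int.
one_byte = bytes((0x41,))         # a tuple holding a single value
several = bytes([1, 2, 3, 4, 5])  # same as bytes(range(1, 6))

# bytes() given a bare int is a different call: that many zero bytes.
# This is why the one-element tuple is needed for a single byte value.
zeros = bytes(3)

# quopri.decodestring() returns bytes directly. header=True selects the
# encoded-word rules, in which "_" stands for a space.
decoded = quopri.decodestring(b"caf=E9_au_lait", header=True)

# Text internally, bytes at the outside: decode once on the way in,
# work on str, encode once on the way out.
charset = "iso-8859-1"  # hypothetical user-specified charset
text = decoded.decode(charset)
output = text.upper().encode(charset)
```

Note that decodestring() does not give you custom error handling, so point (c) in the quoted message may still justify a hand-written decoder.
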