Hello list! I'm trying to figure out how to flatten a MIMEText message to bytes using an 8bit Content-Transfer-Encoding in Python 3.3. Here's what I've tried so far:
# -*- encoding: utf-8 -*- import email.encoders from email.charset import Charset from email.generator import BytesGenerator from email.mime.text import MIMEText import sys body = 'Ζεύς' encoding = 'utf-8' charset = Charset(encoding) charset.body_encoding = email.encoders.encode_7or8bit message = MIMEText(body, 'plain', encoding) del message['Content-Transfer-Encoding'] message.set_payload(body, charset) try: BytesGenerator(sys.stdout.buffer).flatten(message) except UnicodeEncodeError as e: print('error with string input:') print(e) message = MIMEText(body, 'plain', encoding) del message['Content-Transfer-Encoding'] message.set_payload(body.encode(encoding), charset) try: BytesGenerator(sys.stdout.buffer).flatten(message) except TypeError as e: print('error with byte input:') print(e) The `del m[…]; m.set_payload()` bits work around #16324 [1] and should be orthogonal to the encoding issues. It's possible that #12553 is trying to address this issue [2,3], but that issue's comments are a bit vague, so I'm not sure. The problem with the string payload is that email.generator.BytesGenerator.write is getting the Unicode string payload unencoded and trying to encode it as ASCII. It may be possible to work around this by encoding the payload so that anything that doesn't encode (using the body charset) to a 7bit value is replaced with a surrogate escape, but I'm not sure how to do that. The problem with the byte payload is that _has_surrogates (used in email.generator.Generator._handle_text and BytesGenerator._handle_text) chokes on byte input: TypeError: can't use a string pattern on a bytes-like object For UTF-8, you can get away with: message.as_string().encode(message.get_charset().get_output_charset()) because the headers are encoded into 7 bits, so re-encoding them with UTF-8 is a no-op. However, if the body charset is UTF-16-LE or any other encoding that remaps 7bit characters, this hack breaks down. Thoughts? Trevor [1]: http://bugs.python.org/issue16324 [2]: http://bugs.python.org/issue12553 [3]: http://bugs.python.org/issue12552#msg140294 -- This email may be signed or encrypted with GnuPG (http://www.gnupg.org). For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy
signature.asc
Description: OpenPGP digital signature
-- http://mail.python.org/mailman/listinfo/python-list