New submission from Ross Patterson <m...@rpatterson.net>:

Due to repeated use of StringIO as a way to "look ahead" into subparts while checking that multipart boundaries are unique, memory consumption during email.generator.Generator.flatten() can be up to 3 times the original message size.
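
The look-ahead pattern described above can be sketched roughly as follows. This is a simplified illustration under assumptions, not the actual email.generator source: the helper name `flatten_with_lookahead` is hypothetical, the parts are plain strings rather than Message objects, and it uses Python 3's io.StringIO rather than the Python 2.6 StringIO module. It only shows where the extra copies of the message text come from.

```python
from io import StringIO


def flatten_with_lookahead(parts, boundary, out):
    """Write `parts` to `out`, separated by `boundary`.

    Hypothetical helper that mimics the look-ahead pattern; it is not
    standard library code.
    """
    texts = []
    for part in parts:
        # Each part is first rendered into its own temporary buffer...
        buf = StringIO()
        buf.write(part)
        # ...and the buffer contents are then pulled out as a string.
        texts.append(buf.getvalue())

    # All rendered parts are joined into one more string, only so that
    # it can be searched for the boundary.
    alltext = "\n".join(texts)
    if boundary in alltext:
        raise ValueError("boundary occurs in message text")

    # Only now is anything written to the real output.
    for text in texts:
        out.write("--" + boundary + "\n")
        out.write(text + "\n")
    out.write("--" + boundary + "--\n")


out = StringIO()
flatten_with_lookahead(["first part", "second part"], "BOUNDARY", out)
```

While `alltext` is alive, the original parts, the per-part rendered strings, and the joined text all exist at the same time, which is the kind of duplication this report is about.
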
I implemented a subclass of email.generator.Generator that works around this by using email.message.Message.walk() to check message headers and string (final) payloads for the boundary without duplicating their contents into a StringIO. It assumes that the boundary can only be duplicated within a single part's headers, or within a single part's payload when that payload is a string. In other words, it assumes that the boundary will not be produced by some combination of the headers and string payloads of all the parts and their recursive subparts. If this assumption is safe, then this implementation should work. If it is not safe, then perhaps a different boundary format could be used that makes it safe?

You can find my implementation at
http://gitorious.org/rpatterson-imappipe/rpatterson-imappipe/blobs/master/rpatterson/imappipe/generator.py

----------
components: Library (Lib)
messages: 92853
nosy: rpatterson
severity: normal
status: open
title: email.generator.Generator memory consumption
type: resource usage
versions: Python 2.6

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue6942>
_______________________________________