Hi,

A bytearray is pickled (using the highest protocol) as follows:
>>> pickletools.dis(pickle.dumps(bytearray([255]*10), 2))
    0: \x80 PROTO      2
    2: c    GLOBAL     '__builtin__ bytearray'
   25: q    BINPUT     0
   27: X    BINUNICODE u'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
   52: q    BINPUT     1
   54: U    SHORT_BINSTRING 'latin-1'
   63: q    BINPUT     2
   65: \x86 TUPLE2
   66: q    BINPUT     3
   68: R    REDUCE
   69: q    BINPUT     4
   71: .    STOP
>>> bytearray("\xff"*10).__reduce__()
(<type 'bytearray'>, (u'\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff', 'latin-1'), None)

Is there a particular reason it is encoded so inefficiently? Most notably, the actual *bytes* in the bytearray are represented by a UTF-8 string, which has to be decoded into a unicode string and then encoded back into bytes when unpickled. Since the thing is a bytearray, I would expect it to be pickled as such: a sequence of bytes, which could then be turned back into a bytearray using the constructor that takes the bytes directly (via the BINSTRING/BINBYTES pickle opcodes).

The above occurs on both Python 2.x and 3.x. Any ideas? Candidate for a patch?

Irmen.
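P.S. A quick size comparison illustrating the overhead. This is a sketch run on Python 3 (the session above is Python 2), and it compares the two encodings of the payload directly rather than the full REDUCE sequence; protocol 3 is used for the raw-bytes case because Python 3's protocol 2 has no native bytes opcode:

```python
import pickle

data = b'\xff' * 10

# Current scheme: the bytes are decoded to a unicode string via latin-1
# and written with BINUNICODE, whose payload is UTF-8. Every byte
# >= 0x80 therefore occupies two bytes in the pickle stream.
as_unicode = pickle.dumps(data.decode('latin-1'), 2)

# Proposed scheme: write the raw bytes directly with a bytes opcode
# (SHORT_BINBYTES here) -- one byte per byte, no decode/encode round-trip.
as_bytes = pickle.dumps(data, 3)

print(len(as_unicode), len(as_bytes))  # the unicode form is larger
```

For a payload of high bytes like this, the unicode form is roughly twice the size of the raw payload, on top of the decode/encode work done at unpickling time.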