fedor <[EMAIL PROTECTED]> wrote:

> Hi all, happy new year,
>
> I was trying to pickle an instance of a subclass of a tuple when I ran
> into a problem. Pickling doesn't work with HIGHEST_PROTOCOL. How should
> I rewrite my class so I can pickle it?
You're falling afoul of an optimization in pickle's protocol 2, which is
documented in pickle.py as follows:

    # A __reduce__ implementation can direct protocol 2 to
    # use the more efficient NEWOBJ opcode, while still
    # allowing protocol 0 and 1 to work normally.  For this to
    # work, the function returned by __reduce__ should be
    # called __newobj__, and its first argument should be a
    # new-style class.  The implementation for __newobj__
    # should be as follows, although pickle has no way to
    # verify this:
    #
    # def __newobj__(cls, *args):
    #     return cls.__new__(cls, *args)
    #
    # Protocols 0 and 1 will pickle a reference to __newobj__,
    # while protocol 2 (and above) will pickle a reference to
    # cls, the remaining args tuple, and the NEWOBJ code,
    # which calls cls.__new__(cls, *args) at unpickling time
    # (see load_newobj below).  If __reduce__ returns a
    # three-tuple, the state from the third tuple item will be
    # pickled regardless of the protocol, calling __setstate__
    # at unpickling time (see load_build below).

Essentially, and simplifying just a little: you're inheriting
__reduce_ex__ (because you're not overriding it), but you ARE overriding
__new__ *and changing its signature*. So the inherited __reduce_ex__ is
used and, with this protocol 2 optimization, it essentially assumes that
the inherited __new__ is in use too -- or, at least, that whatever
__new__ is in use does not arbitrarily change the signature!

So, if you want to change __new__'s signature, and yet be picklable by
protocol 2, you have to override __reduce_ex__ to return the right
"args": the ones your class's __new__ expects!
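To make the failure concrete, here is a minimal sketch that reproduces it. It assumes Python 3 (hence the zero-argument super()) and a two-argument class A like the one discussed in this thread; the variable names are only illustrative. On Python 3 the pickling step itself succeeds and the error only shows up when the NEWOBJ opcode calls __new__ at unpickling time; other versions may fail at a different point.

```python
import pickle


class A(tuple):
    # __new__ takes two separate arguments instead of tuple's single iterable.
    def __new__(cls, arg1, arg2):
        return super().__new__(cls, (arg1, arg2))


a = A(1, 2)

# The inherited __reduce_ex__ still describes the object the way tuple would:
# the args are (class, whole_tuple_value), not (class, arg1, arg2).
func, args = a.__reduce_ex__(2)[:2]
print(func.__name__, args[1:])

# Protocol 1 rebuilds the object through a different path and round-trips.
ok_proto1 = pickle.loads(pickle.dumps(a, 1)) == (1, 2)

# Protocol 2 ends up calling A.__new__(A, (1, 2)), which the override rejects.
try:
    pickle.loads(pickle.dumps(a, 2))
    failed = False
except TypeError as exc:
    failed = True
    print("protocol 2 failed:", exc)
```

The single extra argument, the whole tuple value, is exactly what the changed __new__ signature cannot accept.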
For example, you could consider something like:

    def __newobj__(cls, *args):
        return cls.__new__(cls, *args)

    class A(tuple):
        def __new__(klass, arg1, arg2):
            return super(A, klass).__new__(klass, (arg1, arg2))

        def __reduce_ex__(self, proto=0):
            if proto >= 2:
                return __newobj__, (A, self[0], self[1])
            else:
                return super(A, self).__reduce_ex__(proto)

Note the key difference between A's __reduce_ex__ (for proto=2) and
tuple's (which is the same as object's). This is after an "import a",
where a.py has this code as well as an 'a = A(1, 2)':

    >>> a.a.__reduce_ex__(2)
    (<function __newobj__ at 0x3827f0>, (<class 'a.A'>, 1, 2))
    >>> tuple.__reduce_ex__(a.a, 2)
    (<function __newobj__ at 0x376770>, (<class 'a.A'>, (1, 2)), {}, None, None)

Apart from the additional tuple items (not relevant here), tuple's
reduce returns args as (<class 'a.A'>, (1, 2)) -- two items: the class
and the tuple value. So with protocol 2 this ends up calling
A.__new__(A, (1, 2))... BOOM, because, unlike tuple.__new__, YOUR
override doesn't accept this signature! So, I suggest tweaking A's
reduce so it returns args as (<class 'a.A'>, 1, 2), apparently the only
signature you're willing to accept in your A.__new__ method.

Of course, if A.__new__ can have some flexibility, you COULD have it
accept the same signature as tuple.__new__, and then you wouldn't have
to override __reduce_ex__. Or, you could override __reduce_ex__ in other
ways, say:

    def __reduce_ex__(self, proto=0):
        if proto >= 2:
            proto = 1
        return super(A, self).__reduce_ex__(proto)

This would avoid the specific optimization that's tripping you up due to
your signature change in __new__.
The best solution may be to forget __reduce_ex__ and take advantage of
the underdocumented special method __getnewargs__:

    class A(tuple):
        def __new__(klass, arg1, arg2):
            return super(A, klass).__new__(klass, (arg1, arg2))

        def __getnewargs__(self):
            return self[0], self[1]

This way, you're explicitly telling the "normal" __reduce_ex__ which
arguments you want passed to the __new__ call that reconstructs your
object on unpickling! This highlights even better the crucial
difference, due strictly to the change in __new__'s signature:

    >>> a.a.__getnewargs__()
    (1, 2)
    >>> tuple.__getnewargs__(a.a)
    ((1, 2),)

It IS, I guess, somewhat unfortunate that you have to understand
pickling in some depth in order to change __new__'s signature and yet
fully support pickling... On the other hand, when you're overriding
__new__ you ARE messing with some rather deep infrastructure,
particularly if you alter its signature so that it doesn't accept
"normal" calls any more, so it's not _absurd_ that a compensatory depth
of understanding is required ;-).

Alex
-- 
http://mail.python.org/mailman/listinfo/python-list
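For reference, the __getnewargs__ approach can be checked end to end with a short script. This is a sketch assuming Python 3; the names reduced and roundtrips are illustrative only.

```python
import pickle


class A(tuple):
    def __new__(cls, arg1, arg2):
        return super().__new__(cls, (arg1, arg2))

    def __getnewargs__(self):
        # These are the arguments passed to A.__new__ on unpickling
        # with protocol 2 and above.
        return self[0], self[1]


a = A(1, 2)

# The inherited __reduce_ex__ now picks up the args from __getnewargs__.
reduced = a.__reduce_ex__(2)
print(reduced[0].__name__, reduced[1][1:])

# Round-trip through every protocol, including HIGHEST_PROTOCOL.
roundtrips = [
    pickle.loads(pickle.dumps(a, proto))
    for proto in range(pickle.HIGHEST_PROTOCOL + 1)
]
print(all(type(r) is A and r == (1, 2) for r in roundtrips))
```

No __reduce_ex__ override is needed here: protocols 0 and 1 never call the changed __new__, and protocol 2 and above call it with the args that __getnewargs__ supplies.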