[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-24 Thread Antoine Pitrou
Antoine Pitrou added the comment: I've committed the latest patch (pickle_nonportable_size_2.patch). Thank you for working on this! -- resolution: -> fixed stage: patch review -> committed/rejected status: open -> closed

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-24 Thread Roundup Robot
Roundup Robot added the comment: New changeset 55fe4b57dd9c by Antoine Pitrou in branch '3.2': Issue #12848: The pure Python pickle implementation now treats object lengths as unsigned 32-bit integers, like the C implementation does. http://hg.python.org/cpython/rev/55fe4b57dd9c New changeset c

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-24 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Patch updated (comment for load_binstring added). -- Added file: http://bugs.python.org/file28097/pickle_nonportable_size_2.patch

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-17 Thread Antoine Pitrou
Antoine Pitrou added the comment: > OTOH, I also think that it won't matter much in practice: if you try to > unpickle a string with more than 2GiB on a 32-bit system, chances are > really high that you run out of memory. Agreed. I think this issue is mostly about 64-bit systems, even though we

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-17 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Here is a patch for 3.x which extends the supported size to 4G on 64-bit. -- Added file: http://bugs.python.org/file28010/pickle_nonportable_size.patch

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-17 Thread Martin v . Löwis
Martin v. Löwis added the comment: IMO, the right solution is to finish PEP 3154, and support large strings in the format. For the time being, I'd claim that signed lengths in the existing implementations are just a bug, and that unsigned lengths are the intended semantics of these opcodes. I

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-17 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: The issue is not only the difference between the Python and C implementations, but also between 32-bit and 64-bit. pickle.py on 32-bit accepts data up to 2G. pickle.py on 64-bit accepts data up to 2G. _pickle.c on 32-bit accepts data up to 2G. _pickle.c on 64-bit

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-17 Thread Antoine Pitrou
Antoine Pitrou added the comment: I'd like to add that anyone wanting to serialize large data will certainly be using _pickle (or its ancestor cPickle), since using pickle.py is probably excruciatingly slow. Meaning we should favour preserving _pickle/cPickle's behaviour over preserving pickle

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-17 Thread Antoine Pitrou
Antoine Pitrou added the comment: > Here is a patch for 3.x. It unifies the behavior of the Python and C > implementations and unifies behavior on 32- and 64-bit platforms. For > backward compatibility Pickler can pickle up to 2G data, but Unpickler > can unpickle up to 4G on 64-bit. I agree the right tr

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-16 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Here is a patch for 3.x. It unifies the behavior of the Python and C implementations and unifies behavior on 32- and 64-bit platforms. For backward compatibility Pickler can pickle up to 2G data, but Unpickler can unpickle up to 4G on 64-bit. -- keywords: +patc
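The asymmetry described in this comment (write conservatively, read generously) can be sketched roughly as follows. This is not the committed patch: the helper names `pack_length` and `unpack_length`, the exact limit `2**31 - 1`, and the little-endian `'<I'` layout are assumptions made for illustration.

```python
import struct

# Assumed limit: the writer stays within the range that a signed 32-bit
# reader could still decode, for backward compatibility.
WRITE_LIMIT = 2**31 - 1


def pack_length(n):
    """Encode a length as 4 bytes, refusing values old readers would misread."""
    if not 0 <= n <= WRITE_LIMIT:
        raise OverflowError("length does not fit the conservative write limit")
    return struct.pack('<I', n)


def unpack_length(data):
    """Decode 4 bytes as an unsigned length, accepting the full 32-bit range."""
    (n,) = struct.unpack('<I', data)
    return n


print(unpack_length(pack_length(10)))
print(unpack_length(b'\xff\xff\xff\xff'))
```

The point of the design is that data produced this way stays readable by older signed readers, while the reader itself no longer rejects lengths above 2**31 written by other implementations.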

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-15 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: The C implementation writes and reads BINBYTES and BINUNICODE up to 4G (on 64-bit platforms). The Python implementation writes and reads BINBYTES and BINUNICODE up to 2G. What would be a compatible fix? Allow the Python implementation to write and read up to 4

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-06 Thread Alexandre Vassalotti
Alexandre Vassalotti added the comment: pickle.py is the buggy one here. Its use of the marshal module is really a hack. Plus, it is slower than both struct and int.from_bytes. 14:40:57 [~/cpython]$ ./python -m timeit "int.from_bytes(b'\xff\xff\xff\xff', 'big')" 100 loops, best of 3: 0.209
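A comparison along the lines of the one quoted above can be reproduced from within Python using the `timeit` module. This is only a sketch: the iteration count `N`, the byte order, and the choice of `struct.unpack` as the second candidate are assumptions, and absolute numbers will vary by machine and Python version.

```python
import timeit

DATA = b'\xff\xff\xff\xff'
N = 100_000  # hypothetical iteration count, chosen to keep the run short

# Time decoding the same 4 bytes with int.from_bytes ...
t_from_bytes = timeit.timeit(
    "int.from_bytes(data, 'little')",
    globals={'data': DATA},
    number=N,
)

# ... and with struct.unpack, importing the function once in the setup step.
t_struct = timeit.timeit(
    "unpack('<I', data)",
    setup="from struct import unpack",
    globals={'data': DATA},
    number=N,
)

print(f"int.from_bytes: {t_from_bytes:.4f}s for {N} calls")
print(f"struct.unpack:  {t_struct:.4f}s for {N} calls")
```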

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-06 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Or you can use len = struct.unpack('>I', self.read(4)). --
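One detail worth keeping in mind with the `struct` approach: `struct.unpack` always returns a tuple, so the single value has to be unpacked from it. The sketch below also uses `'<I'` rather than `'>I'`, since the 4-byte length fields of these pickle opcodes are stored little-endian; the helper name `read_length` and the `io.BytesIO` stream are illustrative assumptions, not code from the patch.

```python
import io
import struct


def read_length(stream):
    """Read a 4-byte unsigned little-endian length from a binary stream."""
    # unpack() returns a 1-tuple; the trailing comma pulls out the int.
    length, = struct.unpack('<I', stream.read(4))
    return length


stream = io.BytesIO(b'\x00\x00\x00\x80' + b'payload')
result = read_length(stream)
print(result)
```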

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-06 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: Ah, for unpacking 32-bit unsigned big-endian bytes you can use len = int.from_bytes(self.read(4), 'big'). --
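As a small illustration of `int.from_bytes`: it decodes as unsigned by default (`signed=False`), and the byte-order argument decides how the same four bytes are interpreted. The sample byte strings below are arbitrary; note that the length fields of these pickle opcodes are little-endian, so `'little'` is the order that would apply there.

```python
all_ones = b'\xff\xff\xff\xff'
one_le = b'\x01\x00\x00\x00'

# Unsigned by default, so the result is never negative.
max_value = int.from_bytes(all_ones, 'big')

# Byte order matters: the same bytes give very different lengths.
as_big = int.from_bytes(one_le, 'big')
as_little = int.from_bytes(one_le, 'little')

print(max_value, as_big, as_little)
```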

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2012-11-06 Thread Serhiy Storchaka
Serhiy Storchaka added the comment: What if just add " & 0x"? -- nosy: +serhiy.storchaka versions: +Python 3.4
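The masking idea can be sketched as below. The archive cuts the constant off, so the mask `0xFFFFFFFF` (all 32 bits set) is an assumption here, as is the helper name `to_unsigned32`. Masking a signed 32-bit value with it yields the unsigned value that the same bit pattern represents.

```python
MASK32 = 0xFFFFFFFF  # assumed mask: the literal is truncated in the archive


def to_unsigned32(n):
    """Reinterpret a signed 32-bit integer as its unsigned counterpart."""
    return n & MASK32


print(to_unsigned32(-1))
print(to_unsigned32(-2**31))
print(to_unsigned32(5))
```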

[issue12848] pickle.py treats 32bit lengths as signed, but _pickle.c as unsigned

2011-08-27 Thread Antoine Pitrou
New submission from Antoine Pitrou: In several opcodes (BINBYTES, BINUNICODE... what else?), _pickle.c happily accepts 32-bit lengths of more than 2**31, while pickle.py uses marshal's "i" typecode which means "signed"... and therefore fails reading the data. Apparently, pickle.py uses marshal
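A minimal sketch of the mismatch described in this report, using `struct` format codes as a stand-in for marshal's "i" typecode (the report names marshal; the `struct` calls and the sample value are illustrative assumptions). The same four bytes decode to a negative number when read as signed and to the intended length when read as unsigned:

```python
import struct

# A length just above 2**31, encoded as a 4-byte little-endian field.
raw = struct.pack('<I', 2**31 + 5)

signed = struct.unpack('<i', raw)[0]    # signed reading, as with typecode "i"
unsigned = struct.unpack('<I', raw)[0]  # unsigned reading, as the C code does

print(signed, unsigned)
```

A negative length cannot be used to read the payload, which is why the signed reader fails on data the unsigned reader accepts.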