New submission from Andreas Jung <aj...@users.sourceforge.net>:

We encountered some pretty bizarre behavior in Python 2.4.6 while encoding a unicode string 'data' of about 600 million characters:

Python 2.4.6 (8GB RAM, 64 bit)

(Pdb) type(data)
<type 'unicode'>
(Pdb) len(data)
601794657
(Pdb) data2=data.encode('utf-8')
*** SystemError: Negative size passed to PyString_FromStringAndSize

Assuming that this has something to do with a 512MB limit:

(Pdb) data2=data[:512*1024*1024].encode('utf-8')
*** SystemError: Negative size passed to PyString_FromStringAndSize

Same bug...now with 512MB - 1 byte:

(Pdb) data2=data[:(256*1024*1024)-1].encode('utf-8')
OverflowError

Cross-check on a different Linux box (4GB RAM, 4 GB Swap, 64 bit)

aj...@blackmoon:~> python2.4
Python 2.4.5 (#1, Jun 9 2008, 10:35:12)
[GCC 4.2.1 (SUSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> data = u'x'*601794657
>>> data2= data.encode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
MemoryError

Where is this different behavior coming from?

----------
messages: 96695
nosy: ajung
severity: normal
status: open
title: SystemError/MemoryError/OverflowErrors on encode() a unicode string
versions: Python 2.4

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue7551>
_______________________________________