New submission from Antoine Pitrou <pit...@free.fr>:

On an 8GB RAM box (more than 6GB free), serializing many small objects can eat 
all available memory, even though the end result would only take around 600MB on an UCS2 build:

$ LANG=C time opt/python -c "import json; l = [1] * (100*1024*1024); encoded = json.dumps(l)"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/antoine/cpython/opt/Lib/json/__init__.py", line 224, in dumps
    return _default_encoder.encode(obj)
  File "/home/antoine/cpython/opt/Lib/json/encoder.py", line 188, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/home/antoine/cpython/opt/Lib/json/encoder.py", line 246, in iterencode
    return _iterencode(o, 0)
MemoryError
Command exited with non-zero status 1
11.25user 2.43system 0:13.72elapsed 99%CPU (0avgtext+0avgdata 27820320maxresident)k
2920inputs+0outputs (12major+1261388minor)pagefaults 0swaps


I suppose the encoder internally builds a large list of very small unicode 
objects and only joins them at the end. We could probably join them in chunks 
instead, so as to avoid this behaviour.
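
As an illustration of the "join in chunks" idea, here is a rough sketch of a 
workaround built on the public JSONEncoder.iterencode() API; the function name 
iterencode_in_chunks and the chunk_items parameter are hypothetical, and this 
is not a patch for dumps() itself:

    import json

    def iterencode_in_chunks(obj, chunk_items=10000):
        """Sketch: yield the JSON encoding of obj as a series of larger
        str chunks, so that at most chunk_items tiny fragments are kept
        alive at any one time instead of one huge list joined at the end."""
        encoder = json.JSONEncoder()
        pieces = []
        for fragment in encoder.iterencode(obj):
            pieces.append(fragment)
            if len(pieces) >= chunk_items:
                yield ''.join(pieces)
                pieces = []
        if pieces:
            yield ''.join(pieces)

    # Example: stream the result to a file without ever holding
    # millions of small str objects at once.
    # with open('out.json', 'w') as f:
    #     for chunk in iterencode_in_chunks([1] * (100 * 1024 * 1024)):
    #         f.write(chunk)

Note that calling iterencode() this way goes through the pure-Python encoder 
rather than the C-accelerated one used by dumps(), so it is slower, but peak 
memory stays roughly bounded by the chunk size plus the chunks themselves.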

----------
messages: 142338
nosy: ezio.melotti, pitrou, rhettinger
priority: normal
severity: normal
status: open
title: JSON-serializing a large container takes too much memory
type: resource usage
versions: Python 3.3

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue12778>
_______________________________________