[issue12778] JSON-serializing a large container takes too much memory

2011-08-19 Thread poq
poq added the comment:

> Is iterencode() used much? I would think dump() and dumps() see the most use.

Of course. I'd just prefer an elegant & complete solution. But I agree accelerating just dump() would already be much better than the current situation.

[issue12778] JSON-serializing a large container takes too much memory

2011-08-19 Thread Antoine Pitrou
Antoine Pitrou added the comment:

> > It would just need to call a given callable (fp.write) at regular
> > intervals and that would be enough to C-accelerate dump().
>
> True, but that would just special case dump(), just like dumps() is
> special-cased now. Ideally JSONEncoder.iterencode() would be accelerated, [...]

Is iterencode() used much? I would think dump() and dumps() see the most use.

[issue12778] JSON-serializing a large container takes too much memory

2011-08-19 Thread poq
poq added the comment:

> It would just need to call a given callable (fp.write) at regular
> intervals and that would be enough to C-accelerate dump().

True, but that would just special case dump(), just like dumps() is special-cased now. Ideally JSONEncoder.iterencode() would be accelerated, [...]
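For reference, the split between the entry points under discussion, using only the public json API (the exact fragment boundaries are an implementation detail):

import io
import json

data = {"k": list(range(5))}

# dumps(): with default options and the C accelerator available, the
# one-shot encoder builds the whole result string in memory before
# returning it.
text = json.dumps(data)

# dump(): consumes the encoder's output fragment by fragment and writes
# each piece to fp, so the full document never has to exist at once.
fp = io.StringIO()
json.dump(data, fp)
assert fp.getvalue() == text

# iterencode(): the generator-based pure-Python path; each item of the
# container comes back as one or more short str fragments.
for fragment in json.JSONEncoder().iterencode(data):
    assert isinstance(fragment, str)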

[issue12778] JSON-serializing a large container takes too much memory

2011-08-19 Thread Roundup Robot
Roundup Robot added the comment:

New changeset 47176e8d7060 by Antoine Pitrou in branch 'default':
Issue #12778: Reduce memory consumption when JSON-encoding a large container of many small objects.
http://hg.python.org/cpython/rev/47176e8d7060

--
nosy: +python-dev

[issue12778] JSON-serializing a large container takes too much memory

2011-08-19 Thread Antoine Pitrou
Changes by Antoine Pitrou:

--
resolution: -> fixed
stage: -> committed/rejected
status: open -> closed

[issue12778] JSON-serializing a large container takes too much memory

2011-08-19 Thread Antoine Pitrou
Antoine Pitrou added the comment:

> I actually looked into doing this for issue #12134, but it didn't seem
> so simple; since C has no yield, I think the iterator would need to
> maintain its own stack to keep track of where it is in the object tree
> it's encoding...

The encoder doesn't have to be an iterator. It would just need to call a given callable (fp.write) at regular intervals and that would be enough to C-accelerate dump().
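A rough Python illustration of that shape (a hypothetical dump_buffered() helper, not the actual C change): it reuses the stdlib's pure-Python iterencode() and flushes through the write callable at regular intervals.

import json

def dump_buffered(obj, fp, flush_at=8192):
    # Hypothetical sketch: collect encoder fragments and call fp.write()
    # whenever roughly flush_at characters have accumulated, so peak
    # memory is bounded by the buffer size rather than the document size.
    buf, buffered = [], 0
    for fragment in json.JSONEncoder().iterencode(obj):
        buf.append(fragment)
        buffered += len(fragment)
        if buffered >= flush_at:
            fp.write("".join(buf))
            buf.clear()
            buffered = 0
    if buf:
        fp.write("".join(buf))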

[issue12778] JSON-serializing a large container takes too much memory

2011-08-18 Thread Antoine Pitrou
Antoine Pitrou added the comment:

This patch does the trick.

--
keywords: +patch
Added file: http://bugs.python.org/file22942/jsonacc.patch

[issue12778] JSON-serializing a large container takes too much memory

2011-08-18 Thread poq
poq added the comment:

I think this is because dumps() uses the C encoder. Making the C encoder incremental (i.e. iterator-based) like the Python encoder would solve this.

I actually looked into doing this for issue #12134, but it didn't seem so simple; since C has no yield, I think the iterator would need to maintain its own stack to keep track of where it is in the object tree it's encoding...
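To make the "own stack" idea concrete: a hypothetical sketch, in Python for readability, of an encoder driven by one flat loop plus an explicit stack instead of nested generators, which is roughly the bookkeeping a C implementation without yield would need. It handles only lists, dicts with string keys, and scalars.

import json

_scalar = json.JSONEncoder().encode  # reuse the stdlib for leaf values

def iterencode_with_stack(obj):
    # Each frame is [closing char, child iterator, "first child" flag];
    # the top of the stack records where we are in the object tree.
    if isinstance(obj, list):
        yield "["
        stack = [["]", iter(obj), True]]
    elif isinstance(obj, dict):
        yield "{"
        stack = [["}", iter(obj.items()), True]]
    else:
        yield _scalar(obj)
        return
    while stack:
        frame = stack[-1]
        descended = False
        for child in frame[1]:
            if not frame[2]:
                yield ","
            frame[2] = False
            if frame[0] == "}":            # dict frame: emit the key first
                key, child = child
                yield _scalar(key) + ":"
            if isinstance(child, list):
                yield "["
                stack.append(["]", iter(child), True])
                descended = True
                break
            if isinstance(child, dict):
                yield "{"
                stack.append(["}", iter(child.items()), True])
                descended = True
                break
            yield _scalar(child)
        if not descended:                  # frame exhausted: close and pop
            yield frame[0]
            stack.pop()

# "".join(iterencode_with_stack({"a": [1, 2], "b": 3})) -> '{"a":[1,2],"b":3}'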

[issue12778] JSON-serializing a large container takes too much memory

2011-08-18 Thread Antoine Pitrou
New submission from Antoine Pitrou:

On an 8GB RAM box (more than 6GB free), serializing many small objects can eat all memory, while the end result would take around 600MB on a UCS2 build:

$ LANG=C time opt/python -c "import json; l = [1] * (100*1024*1024); encoded = json.dumps(l)"
Traceback (most recent call last):
  [...]
MemoryError
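A streaming workaround under the same conditions (a sketch; shrink the multiplier to try it on a smaller machine): iterencode() yields the encoded output piece by piece, so only one small fragment is alive at a time, and json.dump() performs the equivalent write loop internally.

import json

l = [1] * (100 * 1024 * 1024)   # same container as above

# dumps() needs the entire encoded result (plus intermediate fragments)
# in memory at once -- the failure mode reported here.

# Streaming instead keeps memory roughly constant, at the cost of speed:
with open("big.json", "w") as fp:
    for chunk in json.JSONEncoder().iterencode(l):
        fp.write(chunk)

# json.dump(l, fp) performs the same fragment-by-fragment write loop.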