New submission from Valentin Kuznetsov:
Hi, I found that parsing an XML file with identical structure leads to a
missing child item at some point. In my test case, which I attach, it happens
at id=183. Basically I have XML with a bunch of elements of the following structure
Valentin Kuznetsov added the comment:
Hi,
I just found this bug and would like to add my experience with the
performance of large JSON docs. I have a few JSON docs, about 180MB in
size, which I read from data-services. I use python2.6 and run on a Linux
64-bit node w/ 16GB of RAM and an 8-core CPU, Intel
Valentin Kuznetsov added the comment:
Hi,
I'm sorry for the delay, I was busy. Here is a test data file:
http://www.lns.cornell.edu/~vk/files/mangled.json
Its size is 150 MB, 50 MB less than the original, because I was forced to
scramble the values.
The tests with the stock json module in python 2.6
Valentin Kuznetsov added the comment:
Oops, that explains why I saw such small memory usage with cjson. I
constructed the tests on the fly.
Regarding the data structure: unfortunately, it's out of my hands. The
data comes from a data-service, so I can't do much and can only report
Valentin Kuznetsov added the comment:
Antoine,
indeed, both patches improved time and memory footprint. The latest
patch shows only 1.1GB RAM usage and is very fast. What worries me,
though, is that memory is not released back to the system. Is this the
case? I just added time.sleep
Valentin Kuznetsov added the comment:
Nope, none of the three json implementations releases the memory. I used
your patched one, the one shipped with 2.6, and cjson. The one which comes
with 2.6 reaches 2GB, then releases 200MB and stays at 1.8GB during
sleep. The cjson reaches 1.5GB mar
Valentin Kuznetsov added the comment:
I made data local, but adding del shows the same behavior.
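This is the test:

import json

def test():
    source = open('mangled.json', 'r')
    data = json.load(source)  # decode the whole file into Python objects
    source.close()
    del data                  # drop the only reference to the decoded data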
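
A minimal sketch, not part of the original report, of one way to check whether the decoded objects are freed at the Python level after `del data`. It assumes Python 3.4 or later (for `tracemalloc`, which is not available in the 2.6 discussed here). `measure_load` and `make_sample` are hypothetical helper names, and the generated file is only a small stand-in for mangled.json. Note that `tracemalloc` counts memory allocated by Python, not the resident size of the process, so it cannot by itself show whether memory was returned to the system.

```python
import gc
import json
import os
import tempfile
import tracemalloc


def measure_load(path):
    """Return (bytes_while_loaded, bytes_after_del) as traced by tracemalloc."""
    tracemalloc.start()
    try:
        with open(path, 'r') as source:
            data = json.load(source)
        # Python-level allocations while the decoded data is still alive.
        loaded, _peak = tracemalloc.get_traced_memory()
        del data
        gc.collect()
        # Python-level allocations after the only reference is dropped.
        released, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return loaded, released


def make_sample(path, n=20000):
    # Hypothetical stand-in for mangled.json: a list of small dicts.
    with open(path, 'w') as out:
        json.dump([{'id': i, 'name': 'item%d' % i} for i in range(n)], out)


if __name__ == '__main__':
    sample = os.path.join(tempfile.mkdtemp(), 'sample.json')
    make_sample(sample)
    loaded, released = measure_load(sample)
    print('while loaded: %d bytes, after del: %d bytes' % (loaded, released))
```

If the second number drops well below the first while the process size reported by the operating system stays high, the objects were freed by the interpreter and the remaining footprint is held by the allocator rather than by live data.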
Valentin Kuznetsov added the comment:
I wonder if you can make a patch for 2.6 python branch.
--
___
Python tracker
<http://bugs.python.org/issue7451>
___