[issue6594] json C serializer performance tied to structure depth on some systems

2010-11-30 Thread Antoine Pitrou
Antoine Pitrou added the comment: Raymond, I'll follow up in private with Shawn. All the recent performance improvements done on JSON (in 3.2) mean the issue can be closed IMO. -- resolution: -> out of date status: open -> closed

[issue6594] json C serializer performance tied to structure depth on some systems

2010-11-30 Thread Shawn
Shawn added the comment: I specifically mentioned *SPARC* as the performance problem area, but the reply about "0.5s to dump" fails to mention what platform it was tested on. My problem is not "undiagnosable". I'll be happy to provide you with even more data files. But I believe that there is

[issue6594] json C serializer performance tied to structure depth on some systems

2010-11-30 Thread Raymond Hettinger
Raymond Hettinger added the comment: Antoine, what do you want to do with this one? Without a good test case the OP's original issue is undiagnosable. -- assignee: rhettinger -> pitrou versions: +Python 3.1

[issue6594] json C serializer performance tied to structure depth on some systems

2009-12-07 Thread Antoine Pitrou
Antoine Pitrou added the comment: Your example takes 0.5s to dump here.

[issue6594] json C serializer performance tied to structure depth on some systems

2009-12-07 Thread Antoine Pitrou
Changes by Antoine Pitrou: Removed file: http://bugs.python.org/file15450/json-opts2.patch

[issue6594] json C serializer performance tied to structure depth on some systems

2009-12-07 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> However, this bug is about the serializer (encoder). So perhaps the
> decode performance patch should be a separate bug?
You're right, I've filed a separate bug for it: issue7451. -- stage: patch review -> needs patch

[issue6594] json C serializer performance tied to structure depth on some systems

2009-12-07 Thread Shawn
Shawn added the comment: I've attached a sample JSON file that is much slower to write out on some systems as described in the initial comment. If you were to restructure the contents of this file into more of a tree structure instead of the flat array structure it uses now, you will notice tha

[issue6594] json C serializer performance tied to structure depth on some systems

2009-12-07 Thread Shawn
Shawn added the comment: You are right, an environment anomaly led me to falsely believe that this had somehow affected encoding performance. I had repeated the test many times with and without the patch using simplejson trunk and wrongly concluded that the patch was to blame. After correcting

[issue6594] json C serializer performance tied to structure depth on some systems

2009-12-07 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> The attached patch doubles write times for my particular case when
> applied to simplejson trunk using python 2.6.2. Not good.
What do you mean by "write times"? The patch only affects decoding.

[issue6594] json C serializer performance tied to structure depth on some systems

2009-12-07 Thread Shawn
Shawn added the comment: The attached patch doubles write times for my particular case when applied to simplejson trunk using python 2.6.2. Not good.

[issue6594] json C serializer performance tied to structure depth on some systems

2009-12-07 Thread Valentin Kuznetsov
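Valentin Kuznetsov added the comment: I made data local, but adding del shows the same behavior. This is the test:

    import json
    import time

    def test():
        # Load the whole document, then drop the only reference to it.
        source = open('mangled.json', 'r')
        data = json.load(source)
        source.close()
        del data

    test()
    # Keep the process alive so its memory usage can be watched externally.
    time.sleep(20)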
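A hedged sketch of how the "memory is not released" observation could be narrowed down. It uses the standard `tracemalloc` module (available in Python 3.4+, so not in the 2.6 discussed in this thread) to check whether the decoded objects are actually freed at the Python level after `del`. The helper name `measure_decode_memory` and the synthetic sample document are assumptions made for illustration; `mangled.json` itself is not used. Note that even when Python frees the objects, the process size reported by the operating system may stay high, because the allocator does not necessarily hand freed memory back to the system.

```python
import gc
import json
import tracemalloc


def measure_decode_memory(text):
    """Return (bytes traced while the decoded data is alive, bytes traced after del)."""
    tracemalloc.start()
    try:
        data = json.loads(text)
        held, _peak = tracemalloc.get_traced_memory()
        del data            # drop the only reference to the decoded structure
        gc.collect()        # make sure nothing lingers in reference cycles
        released, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return held, released


# Synthetic stand-in for a large document (hypothetical data, not mangled.json).
sample = json.dumps([{"name": "item%d" % i, "value": i} for i in range(20000)])
held, released = measure_decode_memory(sample)
print("traced while alive:", held, "bytes; traced after del:", released, "bytes")
```

If the second number drops sharply while the process size seen in `top` does not, the remaining footprint is held by the allocator rather than by live Python objects.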

[issue6594] json C serializer performance tied to structure depth on some systems

2009-12-07 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> Nope, none of the three json implementations releases the memory. I used
> your patched one, the one shipped with 2.6 and cjson. The one which comes
> with 2.6 reaches 2GB, then releases 200MB and stays at 1.8GB during
> sleep. The cjson reaches 1.5GB mark a

[issue6594] json C serializer performance tied to structure depth on some systems

2009-12-07 Thread Valentin Kuznetsov
Valentin Kuznetsov added the comment: Nope, none of the three json implementations releases the memory. I used your patched one, the one shipped with 2.6 and cjson. The one which comes with 2.6 reaches 2GB, then releases 200MB and stays at 1.8GB during sleep. The cjson reaches 1.5GB mark and s

[issue6594] json C serializer performance tied to structure depth on some systems

2009-12-07 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> Antoine,
> indeed, both patches improved time and memory footprint. The latest
> patch shows only 1.1GB RAM usage and is very fast. What worries me,
> though, is that memory is not released back to the system. Is this the
> case? I just added time.sleep aft

[issue6594] json C serializer performance tied to structure depth on some systems

2009-12-07 Thread Valentin Kuznetsov
Valentin Kuznetsov added the comment: Antoine, indeed, both patches improved time and memory footprint. The latest patch shows only 1.1GB RAM usage and is very fast. What worries me, though, is that memory is not released back to the system. Is this the case? I just added time.sleep after jso

[issue6594] json C serializer performance tied to structure depth on some systems

2009-12-04 Thread Antoine Pitrou
Changes by Antoine Pitrou: Removed file: http://bugs.python.org/file15444/json-opts.patch

[issue6594] json C serializer performance tied to structure depth on some systems

2009-12-04 Thread Antoine Pitrou
Antoine Pitrou added the comment: Here is a new patch with an internal memo dict to reuse equal keys, and some tests. -- stage: -> patch review versions: +Python 3.2 Added file: http://bugs.python.org/file15450/json-opts2.patch
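To illustrate the idea of a memo dict that reuses equal keys, here is a minimal pure-Python sketch. It is not the patch itself (which changes the C decoder); it only approximates the effect through the standard `object_pairs_hook` parameter of `json.loads`. The names `make_key_memoizing_hook` and `load_with_shared_keys` are hypothetical helpers introduced for this example.

```python
import json


def make_key_memoizing_hook():
    """Build an object_pairs_hook that stores each distinct key string only once."""
    memo = {}

    def hook(pairs):
        # memo.setdefault returns the first string object seen for an equal key,
        # so every later dict reuses that object instead of keeping its own copy.
        return {memo.setdefault(key, key): value for key, value in pairs}

    return hook


def load_with_shared_keys(text):
    """Decode JSON text so that equal keys across all objects share one string."""
    return json.loads(text, object_pairs_hook=make_key_memoizing_hook())


# Demonstrate the hook directly with two equal but separately built key strings.
hook = make_key_memoizing_hook()
key_a = "".join(["na", "me"])
key_b = "".join(["na", "me"])
first = hook([(key_a, 1)])
second = hook([(key_b, 2)])
print(next(iter(second)) is key_a)

docs = load_with_shared_keys('[{"name": "a"}, {"name": "b"}]')
```

For documents made of many small objects with the same field names, sharing keys this way trades one dict lookup per key for not keeping a separate copy of every repeated key string.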

[issue6594] json C serializer performance tied to structure depth on some systems

2009-12-02 Thread Valentin Kuznetsov
Valentin Kuznetsov added the comment: Oops, that explains why I saw such small memory usage with cjson. I constructed the tests on the fly. Regarding the data structure: unfortunately, it's out of my hands. The data comes from a data service, so I can't do much and can only report to the developers.

[issue6594] json C serializer performance tied to structure depth on some systems

2009-12-02 Thread Antoine Pitrou
Antoine Pitrou added the comment: That said, it is possible to further improve json by reducing the number of memory allocations and temporary copies. Here is an experimental (meaning: not polished) patch which gains 40% in decoding speed in your example (9 seconds versus 15). We could also add

[issue6594] json C serializer performance tied to structure depth on some systems

2009-12-02 Thread Antoine Pitrou
Antoine Pitrou added the comment:
> Using cjson module, I observed 180MB of RAM utilization
> source = open('mangled.json', 'r')
> data = cjson.encode(source.read())
>
> cjson is about 10 times faster!
This is simply wrong. You should be using cjson.decode(), not cjson.encode(). If you do so,
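The encode/decode mix-up can be reproduced with the standard library alone. In this sketch, `json.dumps` and `json.loads` stand in for `cjson.encode` and `cjson.decode` (an assumption made so the example runs without the third-party cjson module), and the sample text is made up. Encoding the file contents just wraps the raw text in one large JSON string literal, which is cheap in time and memory but does not parse anything, so it cannot be compared with a real decode.

```python
import json

# Small stand-in for the contents of a JSON file (hypothetical data).
text = '{"files": [{"name": "a", "size": 1}, {"name": "b", "size": 2}]}'

# Encoding the text (what the quoted test did): the result is one JSON string
# literal containing the original text with its quotes escaped.
encoded = json.dumps(text)

# Decoding the text (what the test should have done): the result is the
# parsed structure of dicts and lists.
decoded = json.loads(text)

print(type(encoded).__name__, type(decoded).__name__)
```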

[issue6594] json C serializer performance tied to structure depth on some systems

2009-12-02 Thread Valentin Kuznetsov
Valentin Kuznetsov added the comment: Hi, I'm sorry for the delay, I was busy. Here is a test data file: http://www.lns.cornell.edu/~vk/files/mangled.json Its size is 150MB, 50MB less than the original, because I was forced to scramble the values. The tests with stock json module in python 2.6.2 is 2GB

[issue6594] json C serializer performance tied to structure depth on some systems

2009-11-19 Thread Bob Ippolito
Bob Ippolito added the comment: Did you try the trunk of simplejson? It doesn't work quite the same way as the current json module in Python 2.6+. Without the data or a tool to produce data that causes the problem, there isn't much I can do to help.

[issue6594] json C serializer performance tied to structure depth on some systems

2009-11-19 Thread Valentin Kuznetsov
Valentin Kuznetsov added the comment: Hi, I just found this bug and would like to add my experience with the performance of large JSON docs. I have a few JSON docs about 180MB in size which I read from data-services. I use python2.6 on Linux, on a 64-bit node w/ 16GB of RAM and an 8-core CPU, Intel

[issue6594] json C serializer performance tied to structure depth on some systems

2009-08-06 Thread Shawn
Shawn added the comment: First, I want to apologise for not providing more detail initially. Notably, one thing you may want to be aware of is that I'm using python 2.4.4 with the latest version of simplejson. So my timings and assumptions here are based on the fact that simplejson was adopted

[issue6594] json C serializer performance tied to structure depth on some systems

2009-08-06 Thread Antoine Pitrou
Antoine Pitrou added the comment: As Raymond said, and besides, when you talk about "penalty", please explain what the baseline is. Otherwise it's a bit hard to follow. (and I stress again that SPARC is a niche platform, even Niagara :-); moreover, Niagara is throughput-oriented rather than late

[issue6594] json C serializer performance tied to structure depth on some systems

2009-08-05 Thread Raymond Hettinger
Raymond Hettinger added the comment: Are you sure that recursion depth is the issue? Have you tried the same number and kind of objects listed serially (unnested)? This would help rule out memory allocation issues and would instead confirm that it has something to do with the C stack. It woul
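The comparison suggested here can be sketched as a small benchmark: serialize the same number and kind of objects once as a flat list and once as nested chains, then compare timings on the platform in question. Everything below is an assumption made for illustration, including the item shape, the object count of 10000, the chain depth of 50, and the helper names; it uses the standard `json` module rather than simplejson.

```python
import json
import time


def make_item(i):
    """One small object with mixed value types (hypothetical shape)."""
    return {"id": i, "name": "item", "tags": ["a", "b"]}


def build_flat(count):
    """All objects listed serially in a single list."""
    return [make_item(i) for i in range(count)]


def build_nested(count, depth):
    """The same objects, linked into chains that are `depth` levels deep."""
    chains = []
    for start in range(0, count, depth):
        root = make_item(start)
        node = root
        for i in range(start + 1, min(start + depth, count)):
            child = make_item(i)
            node["child"] = child
            node = child
        chains.append(root)
    return chains


def time_dumps(obj, repeats=5):
    """Best-of-N wall-clock time for json.dumps(obj), in seconds."""
    best = float("inf")
    for _ in range(repeats):
        started = time.perf_counter()
        json.dumps(obj)
        best = min(best, time.perf_counter() - started)
    return best


flat = build_flat(10000)
nested = build_nested(10000, depth=50)
print("flat:   %.4fs" % time_dumps(flat))
print("nested: %.4fs" % time_dumps(nested))
```

A large gap between the two timings on one machine but not another would point at depth-related costs; similar timings would suggest looking at allocation or data volume instead.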

[issue6594] json C serializer performance tied to structure depth on some systems

2009-08-05 Thread Shawn
Shawn added the comment: As I mentioned, there are also noticeable performance penalties on recent SPARC systems, such as Niagara T1000, T2000, etc. The degradation is just less obvious (a 10-15 second penalty instead of a 20 or 30 second penalty). While x86 enjoys no penalty at all (in my testin

[issue6594] json C serializer performance tied to structure depth on some systems

2009-08-05 Thread Antoine Pitrou
Antoine Pitrou added the comment: I'm not sure there's anything we should do about this. Some architectures are unreasonably slow at some things, and the old SPARC implementations are a niche nowadays. I suppose you may witness the same kinds of slowdowns if you use cPickle rather than json. (I

[issue6594] json C serializer performance tied to structure depth on some systems

2009-07-28 Thread Brett Cannon
Changes by Brett Cannon: -- priority: -> low

[issue6594] json C serializer performance tied to structure depth on some systems

2009-07-28 Thread Shawn
New submission from Shawn: The json serializer's performance (when using the C speedups) appears to be tied to the depth of the structure being serialized on some systems. In particular, dict structures that are more than a few levels deep, especially when they contain mixed values (lists, strin