Re: Tremendous slowdown due to garbage collection

2008-05-01 Thread Dieter Maurer
John Nagle <[EMAIL PROTECTED]> writes on Mon, 28 Apr 2008 11:41:41 -0700: > Dieter Maurer wrote: > > Christian Heimes <[EMAIL PROTECTED]> writes on Sat, 12 Apr 2008 18:47:32 > > +0200: > >> [EMAIL PROTECTED] schrieb: > >>> which made me suggest to use these as defaults, but then > > > We observed

Re: Tremendous slowdown due to garbage collection

2008-04-30 Thread Aaron Watters
> I do not argue that Python's default GC parameters must change -- only > that applications with lots of objects may want to consider a > reconfiguration. I would argue that changing the GC to some sort of adaptive strategy should at least be investigated. Having an app which doesn't need gc spe

Re: Tremendous slowdown due to garbage collection

2008-04-30 Thread s0suk3
On Apr 12, 11:11 am, [EMAIL PROTECTED] wrote: > I should have been more specific about possible fixes. > > > > python2.5 -m timeit 'gc.disable();l=[(i,) for i in range(200)]' > > > 10 loops, best of 3: 662 msec per loop > > > > python2.5 -m timeit 'gc.enable();l=[(i,) for i in range(200)]'

Re: Tremendous slowdown due to garbage collection

2008-04-28 Thread Martin v. Löwis
> I do not argue that Python's default GC parameters must change -- only > that applications with lots of objects may want to consider a > reconfiguration. That's exactly what I was trying to say: it's not that the parameters are useful for *all* applications (that's why they are tunable parameter

Re: Tremendous slowdown due to garbage collection

2008-04-28 Thread John Nagle
Dieter Maurer wrote: Christian Heimes <[EMAIL PROTECTED]> writes on Sat, 12 Apr 2008 18:47:32 +0200: [EMAIL PROTECTED] schrieb: which made me suggest to use these as defaults, but then We observed similar very bad behaviour -- in a Web application server. Apparently, the standard behaviour i

Re: Tremendous slowdown due to garbage collection

2008-04-28 Thread Dieter Maurer
"Martin v. Löwis" wrote at 2008-4-27 19:33 +0200: >> Martin said it but nevertheless it might not be true. >> >> We observed similar very bad behaviour -- in a Web application server. >> Apparently, the standard behaviour is far from optimal when the >> system contains a large number of objects an

Re: Tremendous slowdown due to garbage collection

2008-04-27 Thread Paul Rubin
"Terry Reedy" <[EMAIL PROTECTED]> writes: > Can this alternative be made easier by adding a context manager to gc > module to use with 'with' statements? Something like > > with gc.delay() as dummy: > That sonuds worth adding as a hack, but really I hope there can be an improved gc someday.

Re: Tremendous slowdown due to garbage collection

2008-04-27 Thread Terry Reedy
"Dieter Maurer" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] | We observed similar very bad behaviour -- in a Web application server. | Apparently, the standard behaviour is far from optimal when the | system contains a large number of objects and occationally, large | numbers of o

Re: Tremendous slowdown due to garbage collection

2008-04-27 Thread Martin v. Löwis
> Martin said it but nevertheless it might not be true. > > We observed similar very bad behaviour -- in a Web application server. > Apparently, the standard behaviour is far from optimal when the > system contains a large number of objects and occasionally, large > numbers of objects are created

Re: Tremendous slowdown due to garbage collection

2008-04-27 Thread Dieter Maurer
Christian Heimes <[EMAIL PROTECTED]> writes on Sat, 12 Apr 2008 18:47:32 +0200: > [EMAIL PROTECTED] schrieb: > > which made me suggest to use these as defaults, but then > > Martin v. Löwis wrote that > > > >> No, the defaults are correct for typical applications. > > > > At that point I felt los

Re: Tremendous slowdown due to garbage collection

2008-04-15 Thread Paul Rubin
Aaron Watters <[EMAIL PROTECTED]> writes: > Even with Btree's if you jump around in the tree the performance can > be awful. The Linux file cache really helps. The simplest approach is to just "cat" the index files to /dev/null a few times an hour. Slightly faster (what I do with Solr) is mmap
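The "cat the index files to /dev/null" trick above can be done portably from Python: read the files sequentially and throw the bytes away, which pulls their pages into the OS file cache. This is a sketch of that idea, not anything Solr-specific:

```python
import os

def warm_cache(paths, chunk_size=1 << 20):
    """Read each file sequentially and discard the data, pulling its
    pages into the OS file cache (the effect of `cat file > /dev/null`)."""
    for path in paths:
        with open(path, "rb") as f:
            while f.read(chunk_size):
                pass
```

Running this a few times an hour (as suggested above) keeps the index pages resident; mmap-ing the files achieves a similar effect with less copying.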

Re: Tremendous slowdown due to garbage collection

2008-04-15 Thread Aaron Watters
On Apr 14, 11:18 pm, Carl Banks <[EMAIL PROTECTED]> wrote: > However, that is for the OP to decide. The reason I don't like the > sort of question I posed is it's presumptuous--maybe the OP already > considered and rejected this, and has taken steps to ensure the in > memory data structure won't

Re: Tremendous slowdown due to garbage collection

2008-04-14 Thread Carl Banks
On Apr 14, 4:27 pm, Aaron Watters <[EMAIL PROTECTED]> wrote: > > A question often asked--and I am not a big fan of these sorts of > > questions, but it is worth thinking about--of people who are creating > > very large data structures in Python is "Why are you doing that?" > > That is, you should

Re: Tremendous slowdown due to garbage collection

2008-04-14 Thread Aaron Watters
> A question often asked--and I am not a big fan of these sorts of > questions, but it is worth thinking about--of people who are creating > very large data structures in Python is "Why are you doing that?" > That is, you should consider whether some kind of database solution > would be better.

Re: Tremendous slowdown due to garbage collection

2008-04-13 Thread Rhamphoryncus
On Apr 12, 6:58 pm, Steve Holden <[EMAIL PROTECTED]> wrote: > Paul Rubin wrote: > > Steve Holden <[EMAIL PROTECTED]> writes: > >> I believe you are making surmises outside your range of competence > >> there. While your faith in the developers is touching, the garbage > >> collection scheme is some

Re: Tremendous slowdown due to garbage collection

2008-04-12 Thread Martin v. Löwis
> I still don't see what is so good about defaults that lead to O(N*N) > computation for a O(N) problem, and I like Amaury's suggestion a lot, > so I would like to see comments on its disadvantages. Please don't > tell me that O(N*N) is good enough. For N>1E7 it isn't. Please understand that chang

Re: Tremendous slowdown due to garbage collection

2008-04-12 Thread Steve Holden
Paul Rubin wrote: > Steve Holden <[EMAIL PROTECTED]> writes: >> I believe you are making surmises outside your range of competence >> there. While your faith in the developers is touching, the garbage >> collection scheme is something that has received a lot of attention >> with respect to performa

Re: Tremendous slowdown due to garbage collection

2008-04-12 Thread Paul Rubin
Steve Holden <[EMAIL PROTECTED]> writes: > I believe you are making surmises outside your range of competence > there. While your faith in the developers is touching, the garbage > collection scheme is something that has received a lot of attention > with respect to performance under typical worklo

Re: Tremendous slowdown due to garbage collection

2008-04-12 Thread Steve Holden
[EMAIL PROTECTED] wrote: >> Martin said that the default settings for the cyclic gc works for most >> people. > > I agree. > >> Your test case has found a pathological corner case which is *not* >> typical for common application but typical for an artificial benchmark. > > I agree that my "corner"

Re: Tremendous slowdown due to garbage collection

2008-04-12 Thread andreas . eisele
> Martin said that the default settings for the cyclic gc works for most > people. I agree. > Your test case has found a pathological corner case which is *not* > typical for common application but typical for an artificial benchmark. I agree that my "corner" is not typical, but I strongly disagr

Re: Tremendous slowdown due to garbage collection

2008-04-12 Thread andreas . eisele
Sorry, I have to correct my last posting again: > > Disabling the gc may not be a good idea in a real application; I suggest > > you to play with the gc.set_threshold function and set larger values, at > > least while building the dictionary. (700, 1000, 10) seems to yield good > > results. > > py
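The `gc.set_threshold(700, 1000, 10)` setting quoted above can be applied directly; the snippet below shows it against CPython's defaults. This is just the tuning reported in the thread, not a universally recommended value:

```python
import gc

# CPython's default thresholds are (700, 10, 10): the youngest
# generation is collected once allocations minus deallocations
# exceed 700, and older generations after 10 collections of the
# generation below them.
print(gc.get_threshold())

# Raising the second threshold makes full collections (which touch
# every tracked object) run far less often while a large, mostly
# cycle-free structure is being built.  (700, 1000, 10) is the
# setting reported to yield good results in this thread.
gc.set_threshold(700, 1000, 10)
```

Unlike `gc.disable()`, this keeps cycle collection running, just much less frequently, so genuinely cyclic garbage is still reclaimed eventually.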

Re: Tremendous slowdown due to garbage collection

2008-04-12 Thread Christian Heimes
[EMAIL PROTECTED] schrieb: > which made me suggest to use these as defaults, but then > Martin v. Löwis wrote that > >> No, the defaults are correct for typical applications. > > At that point I felt lost and as the general wish in that thread was > to move > discussion to comp.lang.python, I bro

Re: Tremendous slowdown due to garbage collection

2008-04-12 Thread John Nagle
[EMAIL PROTECTED] wrote: > In an application dealing with very large text files, I need to create > dictionaries indexed by tuples of words (bi-, tri-, n-grams) or nested > dictionaries. The number of different data structures in memory grows > into orders beyond 1E7. > > It turns out that the def

Re: Tremendous slowdown due to garbage collection

2008-04-12 Thread andreas . eisele
I should have been more specific about possible fixes. > > python2.5 -m timeit 'gc.disable();l=[(i,) for i in range(200)]' > > 10 loops, best of 3: 662 msec per loop > > > python2.5 -m timeit 'gc.enable();l=[(i,) for i in range(200)]' > > 10 loops, best of 3: 15.2 sec per loop > > In the l
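The command lines quoted above need `-s 'import gc'` to run as written, and `timeit` itself disables the collector during timing unless the statement re-enables it. A runnable version of the same comparison, with a placeholder list size since the quoted previews truncate the actual number:

```python
import gc
import timeit

N = 200000  # assumption: stands in for the (truncated) size in the quoted posts

def build():
    return [(i,) for i in range(N)]

# timeit disables gc around the timed statement by default, so the
# "gc enabled" case must call gc.enable() inside the statement --
# exactly what the quoted benchmarks do.
t_off = timeit.timeit('gc.disable(); build()', globals=globals(), number=3)
t_on = timeit.timeit('gc.enable(); build()', globals=globals(), number=3)
gc.enable()
print(f"gc disabled: {t_off:.3f}s   gc enabled: {t_on:.3f}s")
```

On the Python 2.5 of this thread the enabled case was dramatically slower; later CPython releases shrank the gap considerably (e.g. by untracking tuples that cannot participate in cycles), so the measured ratio depends on the interpreter version.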

Re: Tremendous slowdown due to garbage collection

2008-04-12 Thread Carl Banks
On Apr 12, 7:02 am, [EMAIL PROTECTED] wrote: > I would suggest to configure the default behaviour of the garbage > collector in such a way that this squared complexity is avoided > without requiring specific knowledge and intervention by the user. Not > being an expert in these details I would like

Re: Tremendous slowdown due to garbage collection

2008-04-12 Thread Steve Holden
[...] > I would suggest to configure the default behaviour of the garbage > collector in such a way that this squared complexity is avoided > without requiring specific knowledge and intervention by the user. Not > being an expert in these details I would like to ask the gurus how > this could be d

Tremendous slowdown due to garbage collection

2008-04-12 Thread andreas . eisele
In an application dealing with very large text files, I need to create dictionaries indexed by tuples of words (bi-, tri-, n-grams) or nested dictionaries. The number of different data structures in memory grows into orders beyond 1E7. It turns out that the default behaviour of Python is not very
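The workload described above (a dictionary keyed by word n-gram tuples, grown to 1E7+ objects) can pause the collector during the bulk build, since tuples of strings and dict entries create no reference cycles. A minimal sketch of that workaround, with hypothetical names:

```python
import gc
from collections import defaultdict

def count_bigrams(tokens):
    """Build a bigram -> count dict with the cyclic collector paused.
    The tuples and dict entries created here contain no reference
    cycles, so pausing gc during the bulk build is safe; the final
    gc.collect() reclaims any cycles created elsewhere in the meantime."""
    counts = defaultdict(int)
    gc.disable()
    try:
        for a, b in zip(tokens, tokens[1:]):
            counts[(a, b)] += 1
    finally:
        gc.enable()
        gc.collect()
    return counts

counts = count_bigrams("the quick brown fox the quick".split())
```

This avoids the repeated full-generation scans that make the default settings quadratic on this allocation pattern, at the cost of deferring cycle collection until the build finishes.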