Re: [Python-Dev] Issue 2722

2008-06-22 Thread Facundo Batista
2008/6/21 Neil Muller <[EMAIL PROTECTED]>:

> Could someone have a look at the suggested fix for issue 2722? While
> not a particularly common problem, it can be an annoyance, and the
> patch looks reasonable.

I'm on it... Python Bug Day!

-- 
. Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/


Re: [Python-Dev] Proposal: Run GC less often

2008-06-22 Thread Antoine Pitrou
Greg Ewing  canterbury.ac.nz> writes:
> 
> What happens if the program enters a phase where it's not
> producing any new cyclic garbage, but is breaking references
> among the old objects in such a way that cycles of them
> are being left behind? Under this rule, the oldest
> generation would never be scanned, so those cycles would
> never be collected.

We could introduce a kind of "timing rule" such that there is at least
one full collection, say, every minute. While timing is not relevant
to memory management, it is relevant to the user behind the keyboard.
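
(A user-level sketch of such a rule, for illustration only; the one-minute
interval and the timer thread are assumptions here, and a real change would
of course live inside the collector itself:)

    import gc
    import threading

    FULL_COLLECTION_INTERVAL = 60.0   # assumed: at most one minute between full collections

    def periodic_full_collection():
        gc.collect()   # with no argument this also collects the oldest generation
        timer = threading.Timer(FULL_COLLECTION_INTERVAL, periodic_full_collection)
        timer.setDaemon(True)         # don't keep the process alive just for this
        timer.start()

    periodic_full_collection()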

In any case, I think MvL's suggestion is worth trying.

Regards

Antoine.




Re: [Python-Dev] Proposal: Run GC less often

2008-06-22 Thread Neil Schemenauer
Greg Ewing <[EMAIL PROTECTED]> wrote:
> Martin v. Löwis wrote:
>
>> Under my proposal, 10 middle collections must have passed,
>> PLUS the number of survivor objects from the middle generation
>> must exceed 10% of the number of objects in the oldest
>> generation.
>
> What happens if the program enters a phase where it's not
> producing any new cyclic garbage, but is breaking references
> among the old objects in such a way that cycles of them
> are being left behind? Under this rule, the oldest
> generation would never be scanned, so those cycles would
> never be collected.

Another problem is that the program could be slowly leaking and a
full collection will never happen.

>> As a consequence, garbage collection becomes less frequent
>> as the number of objects on the heap grows
>
> Wouldn't it be simpler just to base the collection frequency
> directly on the total number of objects in the heap?
>  From what another poster said, this seems to be what
> emacs does.

I like simple.  The whole generational collection scheme was dreamed
up by me early in the GC's life.  There was not a lot of thought or
benchmarking put into it since at that time I was more focused on
getting the basic GC working.  At some later point some tuning was
done on the collection frequencies but that 10 middle collections
scheme was never deeply investigated, AFAIK.
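
(For reference, the scheme's knobs are already exposed at the Python level,
so experiments don't need C changes; the values below are just an example:)

    import gc

    # (threshold0, threshold1, threshold2): allocations minus deallocations
    # that trigger a generation-0 collection, gen-0 collections per gen-1
    # collection, and gen-1 collections per gen-2 collection.
    print(gc.get_threshold())     # the stock default is (700, 10, 10)

    # Example: make full (oldest-generation) collections ten times rarer.
    gc.set_threshold(700, 10, 100)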

BTW, I suspect that documentation needs updating since I understand
that the GC is no longer optional (the stdlib and/or the Python
internals create reference cycles themselves).

  Neil



[Python-Dev] forceful exit

2008-06-22 Thread tomer filiba
hi

i'm having trouble when forking child processes to serve sockets. the
skeleton is something like the following:

import os
import sys

def work():
    try:
        while True:
            sock = listener.accept()[0]
            log("hello %s", sock)
            if os.fork() == 0:
                # child: serve this client, then exit
                try:
                    serve_client(sock)
                finally:
                    sys.exit()     # (1)
            else:
                # parent: the child owns the connection now
                log("forked child")
                sock.close()
    except KeyboardInterrupt:
        log("got ctrl+c")
    finally:
        log("server terminated")
        listener.close()

the problem is that sys.exit() raises an exception (SystemExit), which
propagates all the way up and is handled by the finally clauses... the
real code does more than just print logs, so i can't allow that.

i'm forced to resort to os._exit, which does the trick but doesn't
perform cleanup, which made me realize there's no clean way in python
to *force* exit.
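
to make the difference concrete, here's a minimal standalone example
(not the real server code):

    import sys

    try:
        sys.exit(1)              # raises SystemExit...
    finally:
        print("cleanup runs")    # ...so enclosing finally blocks still execute

    # os._exit(1), by contrast, terminates the process immediately:
    # no SystemExit, no finally blocks, no atexit handlers, no cleanup.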

i think there ought to be something like sys.forcedexit(), which
collects all objects nicely and then exits immediately without letting
anything propagate up. this will also solve another problem i came
across, with threads. turns out only the main thread can kill the
process -- if another thread issues sys.exit, it only kills itself.
there's no clean way for a thread to terminate the process... i think
there must be.

i can contribute a patch in a matter of days... i think it's pretty
straightforward (calling Py_Finalize and then exit). aye or nay?


-tomer


Re: [Python-Dev] Proposal: Run GC less often

2008-06-22 Thread Martin v. Löwis
> Another problem is that the program could be slowly leaking and a
> full collection will never happen.

I don't think that can happen. If the program slowly leaks,
survivor objects leave the middle generation and count towards
the 10%. As the count of objects in the oldest generation doesn't
change, collection will eventually occur.

However, it may occur much later than it currently does, if you have
many objects on the heap, and each middle collection only has few
survivors. One may argue that if the machine had the space to keep
N objects in memory, it probably can also keep 1.1N objects in memory.
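
(Spelled out as a sketch, with illustrative names rather than actual
CPython identifiers:)

    def should_collect_oldest(middle_collections_since_full,
                              survivors_from_middle,
                              objects_in_oldest):
        # Both conditions must hold before a full collection runs.
        return (middle_collections_since_full >= 10 and
                survivors_from_middle > 0.10 * objects_in_oldest)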

Regards,
Martin


Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-22 Thread Tim Peters
[Antoine Pitrou]
>> Would it be helpful if the GC was informed of memory growth by the
>> Python memory allocator (that is, each time it either asks or gives back
>> a block of memory to the system allocator) ?

[Martin v. Löwis]
> I don't see how. The garbage collector is already informed about memory
> growth; it learns exactly when a container object is allocated or
> deallocated. That the allocator then requests memory from the system
> only confirms what the garbage collector already knew: that there are
> lots of allocated objects. From that, one could infer that it might
> be time to perform garbage collection - or one could infer that all
> the objects are really useful, and no garbage can be collected.

Really the same conundrum we currently face:  cyclic gc is currently
triggered by reaching a certain /excess/ of allocations over
deallocations.  From that we /do/ infer it's time to perform garbage
collection -- but, as some examples here showed, it's sometimes really
the case that the true meaning of the excess is that "all the objects
are really useful, and no garbage can be collected -- and I'm creating
a lot of them".

pymalloc needing to allocate a new arena would be a different way to
track an excess of allocations over deallocations, and in some ways
more sensible (since it would reflect an excess of /bytes/ allocated
over bytes freed, rather than an excess in the counts of objects
allocated-over-freed regardless of their sizes -- an implication is,
e.g., that cyclic gc would be triggered much less frequently by mass
creation of small tuples than of small dicts, since a small tuple
consumes much less memory than a small dict).
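
(The size gap is easy to check on an interpreter that has sys.getsizeof,
new in 2.6; exact numbers vary by platform and build:)

    import sys

    # Each of these counts as exactly one container allocation towards the
    # current GC trigger, yet their memory footprints differ several-fold.
    print(sys.getsizeof((1, 2, 3)))                    # a few dozen bytes
    print(sys.getsizeof({1: None, 2: None, 3: None}))  # typically a few hundred bytes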

Etc. ;-)


Re: [Python-Dev] Proposal: Run GC less often

2008-06-22 Thread Greg Ewing

Martin v. Löwis wrote:
>> Wouldn't it be simpler just to base the collection frequency
>> directly on the total number of objects in the heap?
>
> Using what precise formula?

The simplest thing to try would be

 middle_collections >= num_objects_in_heap * some_constant
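
(With a purely made-up constant, that would read something like:)

    SOME_CONSTANT = 1e-4   # made-up value; picking it well is the open question

    def time_for_full_collection(middle_collections, num_objects_in_heap):
        return middle_collections >= num_objects_in_heap * SOME_CONSTANT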

--
Greg


Re: [Python-Dev] Proposal: Run GC less often

2008-06-22 Thread Martin v. Löwis
>>> Wouldn't it be simpler just to base the collection frequency
>>> directly on the total number of objects in the heap?
>>
>> Using what precise formula?
> 
> The simplest thing to try would be
> 
>  middle_collections >= num_objects_in_heap * some_constant

So what value is some_constant?

Regards,
Martin



Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-22 Thread Martin v. Löwis
> pymalloc needing to allocate a new arena would be a different way to
> track an excess of allocations over deallocations, and in some ways
> more sensible (since it would reflect an excess of /bytes/ allocated
> over bytes freed, rather than an excess in the counts of objects
> allocated-over-freed regardless of their sizes -- an implication is,
> e.g., that cyclic gc would be triggered much less frequently by mass
> creation of small tuples than of small dicts, since a small tuple
> consumes much less memory than a small dict).
> 
> Etc. ;-)

:-) So my question still is: how exactly?

Currently, only youngest collections are triggered by allocation
rate; middle and old are triggered by frequency of youngest collection.
So would you now specify that a youngest collection should occur
if-and-only-if a new arena is allocated? Or count arenas returned
against arenas allocated? Or apply this to triggering collections of
the other generations, but not the youngest? How would that help with
the quadratic behavior (which really needs a factor applied somewhere)?

Regards,
Martin


Re: [Python-Dev] Proposal: Run GC less often

2008-06-22 Thread Terry Reedy



Neil Schemenauer wrote:
> BTW, I suspect that documentation needs updating since I understand
> that the GC is no longer optional (the stdlib and/or the Python
> internals create reference cycles themselves).

Is it possible and might it be useful for those internal cycle-creating
operations to increment a counter that was part of the gc trigger?
Doing millions of 'safe' operations would then leave the counter alone
and could have less effect in triggering gc.


tjr
