Lawrence D'Oliveiro <l...@geek-central.gen.new_zealand> writes: >> GC's for large systems generally don't free (or even examine) individual >> garbage objects. They copy the live objects to a new contiguous heap >> without ever touching the garbage, and then they release the old heap. > > And suddenly you’ve doubled the memory requirements. And on top of that, > since you’re moving the valid objects into different memory, you’re forcing > cache misses on all of them as well.
A minimal naive implementation indeed doubles the memory requirements, but from a Python perspective where every integer takes something like 24 bytes already, even that doesn't seem so terrible. More sophisticated implementations use multiple small heaps or other tricks. It still costs something in memory footprint, but less than the minimal implementation's 2x cost. The new heap is filled sequentially so accesses to it will have good locality. You do have to jump around within the old heap, but again, with generational schemes, in the more frequent collections, the old heap fits entirely in cache. For example, GHC's minor heap size is 256kB. For major collections, GHC switches (or used to) from copying to a mark/compact scheme once the amount of live data in the heap goes over some amount, giving the best of both worlds. It's also the case that programs with very large memory consumption tend to use most of the memory for large arrays that don't contain pointers (think of a database server with a huge cache). That means the gc doesn't really have to think about all that much of the memory. > This is the continuing problem with garbage collection: all the attempts to > make it cheaper just end up moving the costs somewhere else. Same thing with manual allocation. That moves the costs off the computer and onto the programmer. Not good, most of the time. Really, I'm no gc expert, but the stuff you're saying about gc is quite ill-informed. You might want to check out some current literature. -- http://mail.python.org/mailman/listinfo/python-list