On Sun, May 07, 2006 at 10:50:54PM +0200, Laurynas Biveinis wrote:
> > I suspect Cygwin is blameless here. The runtime page size detection
> > would probably work better (but it's slower).
>
> How much slower would that be, if it needs to be executed once per
> invocation? Looks like it's the way to go.
Calculating the page size is constant overhead; that's ignorable. However, having to load the page size from memory and do, e.g., variable shifts on it when processing page-sized chunks of data is a bigger slowdown - IIRC it's the marking and sweeping loops that are the issue. I made the current hack after much profiling.

The comments say that GGC_PAGE_SIZE must be no larger than the system's page size. In fact, the code already looks like it tries to handle G.pagesize > GGC_PAGE_SIZE. It looks to me as if the munmap thing you've found is simply something I didn't think about.

It would probably need some changes to the free page list handling, but refusing to unmap things in chunks smaller than G.pagesize would probably not be hard. I think you could just leave them on the free_pages list.

The thing to beware of there is that the code in release_pages reads as if it expects the free list to be in some sort of order, but really that's just an optimization to reduce the number of syscalls made; the free pages list is in no particular order. So if we _need_ to coalesce when releasing pages, then we'll have to do something like sort the free list first. That only happens once per zone per collection, so sorting the free list before releasing pages probably won't add measurable time.

--
Daniel Jacobowitz
CodeSourcery
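To make the first point concrete, here is an illustrative fragment (not code from ggc-zone.c; GGC_PAGE_SIZE, G.pagesize and the function names are stand-ins): with a compile-time page size the mask folds into an immediate, while a runtime-detected size has to be loaded and recomputed on every page calculation in the mark and sweep loops.

/* Illustrative sketch only -- not code from ggc-zone.c.  With a
   compile-time GGC_PAGE_SIZE, the mask below folds into an immediate
   AND, so the per-object page arithmetic in the hot loops is
   essentially free.  With a runtime page size (G.pagesize here is a
   stand-in), each use loads the value and computes the mask instead.  */

#include <stdint.h>
#include <stddef.h>

#define GGC_PAGE_SIZE 4096              /* compile-time constant */

/* In real code this would be filled in once, e.g. from getpagesize ().  */
static struct { size_t pagesize; } G = { 4096 };

static size_t
offset_in_page_const (uintptr_t addr)
{
  return addr & (GGC_PAGE_SIZE - 1);    /* single AND-immediate */
}

static size_t
offset_in_page_runtime (uintptr_t addr)
{
  return addr & (G.pagesize - 1);       /* load + subtract + AND */
}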
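And a rough sketch of the sort-then-release idea (again not the actual release_pages code; page_entry, release_pages_sketch and sys_page_size are made-up names, and it assumes each free-list entry is a single GGC_PAGE_SIZE chunk carved out of system-page-aligned mappings, with a power-of-two system page size): sort the free list by address, munmap only the system-page-aligned part of each contiguous run, and leave everything else on the free list.

/* Sketch only.  Sort the free chunks by address, coalesce adjacent
   ones into runs, unmap the system-page-aligned portion of each run,
   and keep the rest on the free list.  */

#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>

#define GGC_PAGE_SIZE 4096              /* example value */

struct page_entry
{
  char *page;                           /* start of a GGC_PAGE_SIZE chunk */
  struct page_entry *next;
};

static int
by_address (const void *a, const void *b)
{
  char *pa = (*(struct page_entry *const *) a)->page;
  char *pb = (*(struct page_entry *const *) b)->page;
  return (pa > pb) - (pa < pb);
}

/* Unmap whatever whole system pages the free list covers; return the
   chunks we could not release so they stay on the free list.  */
static struct page_entry *
release_pages_sketch (struct page_entry *free_pages, size_t sys_page_size)
{
  size_t n = 0;
  for (struct page_entry *p = free_pages; p; p = p->next)
    n++;
  if (n == 0)
    return NULL;

  /* Sort by address so adjacent free chunks become contiguous runs.
     This is the "sort the free list" step; it runs once per zone per
     collection, so the qsort cost is in the noise.  */
  struct page_entry **vec = malloc (n * sizeof *vec);
  if (!vec)
    return free_pages;
  size_t i = 0;
  for (struct page_entry *p = free_pages; p; p = p->next)
    vec[i++] = p;
  qsort (vec, n, sizeof *vec, by_address);

  struct page_entry *kept = NULL, **tail = &kept;
  for (i = 0; i < n; )
    {
      /* Find the maximal run of address-contiguous chunks.  */
      size_t j = i + 1;
      while (j < n && vec[j]->page == vec[j - 1]->page + GGC_PAGE_SIZE)
        j++;

      /* Trim the run to system-page alignment; only that part can be
         handed back to the OS with a single munmap.  */
      uintptr_t lo = (uintptr_t) vec[i]->page;
      uintptr_t hi = (uintptr_t) vec[j - 1]->page + GGC_PAGE_SIZE;
      uintptr_t rel_lo = (lo + sys_page_size - 1) & ~(uintptr_t) (sys_page_size - 1);
      uintptr_t rel_hi = hi & ~(uintptr_t) (sys_page_size - 1);

      if (rel_lo < rel_hi)
        munmap ((void *) rel_lo, rel_hi - rel_lo);

      /* Chunks outside the released range go back on the free list.  */
      for (size_t k = i; k < j; k++)
        {
          uintptr_t c = (uintptr_t) vec[k]->page;
          if (c < rel_lo || c >= rel_hi)
            {
              vec[k]->next = NULL;
              *tail = vec[k];
              tail = &vec[k]->next;
            }
          /* Released entries would be freed or recycled in real code.  */
        }
      i = j;
    }

  free (vec);
  return kept;
}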