On Mon, May 21, 2007 at 11:27:42AM +0200, Nick Piggin wrote:
>> ... yeah, something like that would bypass
On Mon, May 21, 2007 at 05:43:16PM -0500, Matt Mackall wrote:
> As long as we're throwing out crazy unpopular ideas, try this one:
> Divide struct page in two such that all the most commonly used
> elements are in one piece that's nicely sized and the rest are in
> another. Have two parallel arrays containing these pieces and accessor
> functions around the unpopular bits.
> Whether there's a sensible divide between popular and unpopular bits
> isn't clear to me. But hey, I said it was crazy.

I have a crazier and even less popular idea. Eliminate struct page
entirely as an accounting structure (and, of course, mem_map with it).
Filesystems can keep the per-page metadata they need in their own
accounting structures, slab mutatis mutandis, etc.

The brilliant bit here is that devolving the accounting structures this
way allows the fs and/or subsystem to arrange for strong cache locality,
for file offset adjacency to imply memory adjacency of the page
accounting fields, etc., whereas grabbing random structures out of some
array is a real cache thrasher.

The page allocation and page replacement algorithms would have to be
adjusted, and things would have to allocate their own refcounts,
supposing they want/need refcounts, but it's not so far out. Refer to
filesystem pages by <mapping, index> pairs, refer to slab pages by
address (virtual and physical are trivially inter-convertible), mock up
something akin to what filesystems do for anonymous pages, etc.

The real objection everyone's going to have is that driver writers will
stain their shorts when faced with the rules for handling such things.
The thing is, I'm not entirely sure who the driver writers that would
have such trouble are, since the driver writers I know personally are
sophisticates rather than the walking disaster areas that objection
would imply. I suppose they may not be representative of the whole.

-- wli

P.S. This idea is not plucked out of the air; it has precedents.
A number of microkernels do this, and IIRC k42 does so also.
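
To make the <mapping, index> idea concrete, here is a rough userspace
sketch of what per-mapping page accounting could look like. Everything
in it is hypothetical: struct fs_mapping, struct fs_page_meta and the
fs_* helpers are names made up for illustration and are not kernel
interfaces. It assumes a dense array per mapping, which is only one of
several ways an owner might lay out its metadata; the point is that the
records for adjacent file offsets end up adjacent in memory and that
the owner carries its own refcount.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-page record kept by the owning filesystem,
 * standing in for what struct page / mem_map would otherwise hold. */
struct fs_page_meta {
    uintptr_t phys;      /* backing frame address, 0 if not resident */
    uint32_t  refcount;  /* the owner allocates its own refcount */
    uint32_t  flags;     /* owner-defined state bits */
};

/* Hypothetical stand-in for a file's mapping: one metadata record per
 * page index, stored contiguously (an assumption of this sketch). */
struct fs_mapping {
    struct fs_page_meta *meta;
    size_t nr_pages;
};

/* Allocate zeroed metadata for nr_pages pages; returns 0 or -1. */
int fs_mapping_init(struct fs_mapping *m, size_t nr_pages)
{
    if (nr_pages == 0)
        return -1;
    m->meta = calloc(nr_pages, sizeof(*m->meta));
    if (!m->meta)
        return -1;
    m->nr_pages = nr_pages;
    return 0;
}

void fs_mapping_destroy(struct fs_mapping *m)
{
    free(m->meta);
    m->meta = NULL;
    m->nr_pages = 0;
}

/* A page is named by <mapping, index>; lookup is plain array indexing,
 * so index i and index i + 1 sit next to each other in memory. */
struct fs_page_meta *fs_page_lookup(struct fs_mapping *m, size_t index)
{
    if (index >= m->nr_pages)
        return NULL;
    return &m->meta[index];
}

/* Take a reference on <mapping, index>; returns 0 or -1 if out of range. */
int fs_page_get(struct fs_mapping *m, size_t index)
{
    struct fs_page_meta *p = fs_page_lookup(m, index);

    if (!p)
        return -1;
    p->refcount++;
    return 0;
}
```

A sequential read that touches indices 3, 4, 5 walks consecutive
records here, rather than chasing whatever scattered entries of a
global array happen to back those pages. Locking, sparse files and
page replacement are all left out of the sketch.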