On Mon, May 21, 2007 at 04:26:03AM -0700, William Lee Irwin III wrote: > On Mon, May 21, 2007 at 01:08:13AM -0700, William Lee Irwin III wrote: > >> Choosing k distinct integers (mem_map array indices) from the interval > >> [0,n-1] results in k(n-k+1)/n non-adjacent intervals of contiguous > >> array indices on average. The average interval length is > >> (n+1)/(n-k+1) - 1/C(n,k). Alignment considerations make going much > >> further somewhat hairy, but it should be clear that contiguity arising > >> from random choice is non-negligible. > > On Mon, May 21, 2007 at 11:27:42AM +0200, Nick Piggin wrote: > > That doesn't say anything about temporal locality, though. > > It doesn't need to. If what's in the cache is uniformly distributed, > you get that result for spatial locality. From there, it's counting > cachelines.
OK, so your 'k' is the number of struct pages that are in cache? Then that's fine. I'm not sure how many that is going to be, but I would be surprised if it were a significant proportion of mem_map, even on not-so-large memory systems. > On Mon, May 21, 2007 at 01:08:13AM -0700, William Lee Irwin III wrote: > >> In any event, I don't have all that much of an objection to what's > >> actually proposed, just this particular cache footprint argument. > >> One can motivate increases in sizeof(struct page), but not this way. > > On Mon, May 21, 2007 at 01:08:13AM -0700, William Lee Irwin III wrote: > > Realise that you have to have a run of I think at least 7 or 8 contiguous > > pages and temporally close references in order to save a single cacheline. > > Then also that if the page being touched is not partially in cache from > > an earlier access, then it is statistically going to cost more lines to > > touch it (up to 75% if you touch the first and the last field, obviously 0% > > if you only touch a single field, but that's unlikely given that you > > usually take a reference then do at least something else like check flags). > > I think the problem with the cache footprint argument is just whether > > it makes any significant difference to performance. But.. > > The average interval ("run") length is (n+1)/(n-k+1) - 1/C(n,k), so for > that to be >= 8 you need (n+1)/(n-k+1) - 1/C(n,k) >= 8 which also happens > when (n+1)/(n-k+1) >= 9 or when n >= (9/8)*k - 1 or k <= (8/9)*(n+1). > Clearly a lower bound on k is required, but not obviously derivable. > k >= 8 is obvious, but the least k where (n+1)/(n-k+1) - 1/C(n,k) >= 8 > is not entirely obvious. Numerically solving for the least such k finds > that k actually needs to be relatively close to (8/9)*n. A lower bound > of something like 0.87*n + O(1) probably holds. Ah, you worked it out... yeah I'd guess this is going to be pretty difficult a condition to satisfy (given that it isn't possible for a 4GB system, even if you had 32MB of cache to fill entirely with struct pages). > On Mon, May 21, 2007 at 01:08:13AM -0700, William Lee Irwin III wrote: > >> Now that I've been informed of the ->_count and ->_mapcount issues, > >> I'd say that they're grave and should be corrected even at the cost > >> of sizeof(struct page). > > On Mon, May 21, 2007 at 11:27:42AM +0200, Nick Piggin wrote: > > ... yeah, something like that would bypass > > Did you get cut off here? Must have. I was going to say it would bypass the whole speed/size discussion anyway :P - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/