On Mon, 2007-10-01 at 10:03 -0700, Dave Hansen wrote: > On Mon, 2007-10-01 at 19:43 +1000, Rusty Russell wrote: > > On Fri, 2007-09-28 at 16:00 -0700, Dave Hansen wrote: > > > Module refcounts currently use a percpu counter stored > > > in the 'struct module'. However, we also have a more > > > generic implementation that does stuff like handle > > > hotplug cpus. > > > > > > I'm not actually all that convinced that this refcount > > > actually does a lot of good, with cpus racing bumping > > > the counters at the same time that they're being > > > summed up. But, it certainly isn't any worse than > > > what was there before. > > > > That's why we look at the counters inside stop_machine_run(). > > Ahhh. That makes sense. Although it wasn't apparent during my 3-second > perusal of the code. > > > Note that (1) the module implementation handles hotplug CPUs > > > You're saying it handles hotplug because of stop_machine_run()?
No, it handles hotplug because it does every possible CPU, not every online CPU. percpu_counters empties cpu's counter presumably to avoid systematic error. Renaming percpu_counters to approximate_counters here would be nice. > > But it might be a useful cleanup (although a slight de-optimization). > > If you want I'll queue for 2.6.24 (there are several other module > > patches pending too). > > Might as well. It removed a very small amount of code, and opens the > door a bit for future optimizations in a single place. You missed removing struct module_ref, too. That's a little more code. > > In an ideal world, (1) we would have percpu pointers using the same > > percpu mechanism as percpu variables, (2) we would have a modal variant > > of percpu counters which would collapse to a single counter when we > > cared about the precise value (probably using stop_machine for the > > transition). This would be useful for many other cases. > > Yeah, but before we do that, we need some kind of flag to get the > percpu_counter_mod() fast path shoved into the slow path that takes the > lock. Well, there is already a branch in the fast path. BTW, comparing before and after applying your patch for a try_module_get/module_put pair, I get 5.9 ns vs 20.6 ns. We perhaps win something back on NUMA-like machines, but it's not clear. My initial implementation of such a counter used atomic ops via a pointer. The pointer was aimed at a shared counter for slow-mode. The problem is that you need to disable preemption around the counter update (so you can use rcu to ensure everyone has seen the changeover). > I'm not sure the stop_machine() mechanism will work very well if we try > to expand this much further for other users. What would the SGI guys > think if these happened more than once in a blue moon? Indeed, that's why I called it "stop_machine". The real-time coders hate it too. Cheers, Rusty. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/