Thanks, Paul, that's a great idea! The approach I'm testing right now just does a module_get(mod), which is released when you manually disable the module by echoing its name into /sys/kernel/security/stacker/unload, so that must be done before you can rmmod. This way no refcounting is needed at all in the time-critical loops, but it involves a change in the LSM API, which may well be rejected, so your approach may be the way to go.
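Roughly, it looks something like the sketch below (a hand-wavy sketch of the idea, not the actual patch; struct stacked_lsm, stacker_register(), stacker_disable(), and stacker_list are made-up names for illustration):

#include <linux/module.h>
#include <linux/list.h>
#include <linux/rculist.h>
#include <linux/rcupdate.h>
#include <linux/spinlock.h>
#include <linux/errno.h>

struct stacked_lsm {
        struct list_head list;
        struct module *owner;   /* module implementing this LSM */
};

static LIST_HEAD(stacker_list);
static DEFINE_SPINLOCK(stacker_lock);

/* Registration pins the stacked module, so the hot paths need no refcounting. */
static int stacker_register(struct stacked_lsm *lsm)
{
        if (!try_module_get(lsm->owner))
                return -ENOENT;
        spin_lock(&stacker_lock);
        list_add_rcu(&lsm->list, &stacker_list);
        spin_unlock(&stacker_lock);
        return 0;
}

/*
 * The write handler behind /sys/kernel/security/stacker/unload ends up
 * here: unhook the entry, wait out current list walkers, then drop the
 * module reference so that a subsequent rmmod can succeed.
 */
static void stacker_disable(struct stacked_lsm *lsm)
{
        spin_lock(&stacker_lock);
        list_del_rcu(&lsm->list);
        spin_unlock(&stacker_lock);
        synchronize_rcu();
        module_put(lsm->owner);
}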
thanks,
-serge

Quoting Paul E. McKenney ([EMAIL PROTECTED]):
> On Thu, Jul 14, 2005 at 12:13:57PM -0500, [EMAIL PROTECTED] wrote:
> > Quoting Paul E. McKenney ([EMAIL PROTECTED]):
> > > On Thu, Jul 14, 2005 at 08:44:50AM -0500, [EMAIL PROTECTED] wrote:
> > > > Quoting Paul E. McKenney ([EMAIL PROTECTED]):
> > > > > My guess is that the reference count is indeed costing you quite a bit. I glanced quickly at the patch, and most of the uses seem to be of the form:
> > > > >
> > > > >         increment ref count
> > > > >         rcu_read_lock()
> > > > >         do something
> > > > >         rcu_read_unlock()
> > > > >         decrement ref count
> > > > >
> > > > > Can't these cases rely solely on rcu_read_lock()? Why do you also need to increment the reference count in these cases?
> > > >
> > > > The problem is on module unload: is it possible for CPU1 to be on "do something", and sleep, and, while it sleeps, CPU2 does rmmod(lsm), so that by the time CPU1 stops sleeping, the code it is executing has been freed?
> > >
> > > OK, but in the above case, "do something" cannot be sleeping, since it is under rcu_read_lock().
> >
> > Oh, but that's not quite what the code is doing; rather, it is doing:
> >
> >         rcu_read_lock
> >         while get next element from list
> >                 inc element.refcount
> >                 rcu_read_unlock
> >                 do something
> >                 rcu_read_lock
> >                 dec refcount
> >         rcu_read_unlock
>
> Color me blind this morning... :-/ Yes, "do something" can legitimately sleep. Sorry for my confusion!
>
> > What I plan to try next is:
> >
> >         rcu_read_lock
> >         while get next element from list
> >                 if (element->owning_module->state != LIVE)
> >                         continue
> >                 rcu_read_unlock
> >                 do something
> >                 rcu_read_lock
> >         rcu_read_unlock
> >
> > > > Because stacker won't remove the lsm from the list of modules until mod->exit() is executed, and module_free(mod) happens immediately after that, the above scenario seems possible.
> > >
> > > Right, if you have some other code path that sleeps (outside of rcu_read_lock(), right?), then you need the reference count for that code path. But the code paths that do not sleep should be able to dispense with the reference count, reducing the cache-line traffic.
> >
> > Most if not all of the codepaths can sleep, however. So unfortunately that doesn't seem a feasible solution. That's why I'm hoping there is something inherent in the module unload code that I can take advantage of to forego my own refcounting.
>
> OK, so the only way that elements are removed is when a module is unloaded, right?
>
> If your module trick does not pan out, how about the following:
>
> o       Add a "need per-element reference count" global variable.
>
> o       Have a per-CPU reference-count variable.
>
> o       Make your code snippet do something like the following:
>
>         rcu_read_lock()
>         while get next element from list
>                 if (need per-element reference count)
>                         ref = &element.refcount
>                 else
>                         ref = &__get_cpu_var(stacker_refcounts)
>                 atomic_inc(ref)
>                 rcu_read_unlock()
>                 do something
>                 rcu_read_lock()
>                 atomic_dec(ref)
>         rcu_read_unlock()
>
> o       The point is to (hopefully) reduce the cache thrashing associated with the reference counts.
>
>         At module unload time, do something like the following:
>
>         need per-element reference count = 1
>         synchronize_rcu()
>         for_each_cpu(cpu)
>                 while (per_cpu(stacker_refcounts, cpu) != 0)
>                         sleep for a bit
>
>         /* At this point, all CPUs are using per-element reference counts */
>
> If this approach does not reduce cache thrashing enough, one could use a per-task reference count instead of a per-CPU reference count. The downside of doing this per-task approach is that you have to traverse the entire task list at unload time. But module unloading should be quite rare. If doing the per-task approach, you don't need atomic increments and decrements for the reference count, and you have excellent cache locality.
>
>                                                         Thanx, Paul
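For reference, a rough C rendering of the per-CPU scheme above might look like the sketch below (again just an illustration, not tested code; stacker_refcounts, need_elem_ref, stacker_call_hook(), and stacker_quiesce_cpus() are invented names, and this_cpu_ptr() stands in for the __get_cpu_var() above):

#include <linux/percpu.h>
#include <linux/atomic.h>
#include <linux/rculist.h>
#include <linux/rcupdate.h>
#include <linux/delay.h>

struct stacked_lsm {
        struct list_head list;
        atomic_t refcount;      /* per-element count, used only around unload */
};

static LIST_HEAD(stacker_list);
static DEFINE_PER_CPU(atomic_t, stacker_refcounts);
static int need_elem_ref;       /* set while a module unload is in progress */

void stacker_call_hook(struct stacked_lsm *lsm);        /* the "do something" step; may sleep */

/* Reader side: per-CPU counts normally, per-element counts during unload. */
static void stacker_for_each_lsm(void)
{
        struct stacked_lsm *lsm;
        atomic_t *ref;

        rcu_read_lock();
        list_for_each_entry_rcu(lsm, &stacker_list, list) {
                if (READ_ONCE(need_elem_ref))
                        ref = &lsm->refcount;
                else
                        ref = this_cpu_ptr(&stacker_refcounts);
                atomic_inc(ref);
                rcu_read_unlock();

                stacker_call_hook(lsm);

                rcu_read_lock();
                atomic_dec(ref);
        }
        rcu_read_unlock();
}

/* Unload side: push new readers onto per-element counts, then drain. */
static void stacker_quiesce_cpus(void)
{
        int cpu;

        WRITE_ONCE(need_elem_ref, 1);
        synchronize_rcu();
        for_each_possible_cpu(cpu)
                while (atomic_read(&per_cpu(stacker_refcounts, cpu)) != 0)
                        msleep(1);
        /* All CPUs are now using per-element reference counts. */
}

It shouldn't matter which CPU's counter a given reader ends up bumping; what matters is that it decrements the same counter it incremented, and that the unload path waits for every CPU's counter to drain before any element is freed.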