On Fri, Jan 28, 2005 at 08:45:46PM +0100, Ingo Molnar wrote: > * Trond Myklebust <[EMAIL PROTECTED]> wrote: > > If you do have a highest interrupt case that causes all activity to > > block, then rwsems may indeed fit the bill. > > > > In the NFS client code we may use rwsems in order to protect stateful > > operations against the (very infrequently used) server reboot recovery > > code. The point is that when the server reboots, the server forces us > > to block *all* requests that involve adding new state (e.g. opening an > > NFSv4 file, or setting up a lock) while our client and others are > > re-establishing their existing state on the server. > > it seems the most scalable solution for this would be a global flag plus > per-CPU spinlocks (or per-CPU mutexes) to make this totally scalable and > still support the requirements of this rare event. An rwsem really > bounces around on SMP, and it seems very unnecessary in the case you > described. > > possibly this could be formalised as an rwlock/rwlock implementation > that scales better. brlocks were such an attempt.
>From how I understand it, you'll have to have a global structure to denote an exclusive operation and then take some additional cpumask_t representing the spinlocks set and use it to iterate over when doing a PI chain operation. Locking of each individual parametric typed spinlock might require a raw_spinlock manipulate lists structures, which, added up, is rather heavy weight. No only that, you'd have to introduce a notion of it being counted since it could also be aquired/preempted by another higher priority thread on that same procesor. Not having this semantic would make the thread in that specific circumstance effectively non-preemptable (PI scheduler indeterminancy), where the mulipule readers portion of a real read/write (shared-exclusve) lock would have permitted this. http://people.lynuxworks.com/~bhuey/rt-share-exclusive-lock/rtsem.tgz.1208 Is our attempt at getting real shared-exclusive lock semantics in a blocking lock and may still be incomplete and buggy. Igor is still working on this and this is the latest that I have of his work. Getting comments on this approach would be a good thing as I/we (me/Igor) believed from the start that this approach is correct. Assuming that this is possible with the current approach, optimizing it to avoid CPU ping-ponging is an important next step bill - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/