On Wed, Oct 31, 2012 at 12:41 PM, Oleg Nesterov <o...@redhat.com> wrote:
> Currently the writer does msleep() plus synchronize_sched() 3 times
> to acquire/release the semaphore, and during this time the readers
> are blocked completely. Even if the "write" section was not actually
> started or if it was already finished.
>
> With this patch down_read/up_read does synchronize_sched() twice and
> down_read/up_read are still possible during this time, just they use
> the slow path.
The changelog is wrong (it's the write path, not the read path, that does
the synchronize_sched).

>  struct percpu_rw_semaphore {
> -	unsigned __percpu *counters;
> -	bool locked;
> -	struct mutex mtx;
> +	int __percpu *fast_read_ctr;

This change is wrong. You must not make the 'fast_read_ctr' thing be
an int. Or at least you need to be a hell of a lot more careful about
it.

Why? Because the readers update the counters while possibly moving
around CPUs, so the increment and decrement of a counter may happen on
different CPUs. But that means that when you add all the counters
together, the individual counters can overflow (only the final sum is
meaningful).

And THAT in turn means that you should not use a signed count, for the
simple reason that signed integers don't have well-behaved overflow
behavior in C.

Now, I doubt you'll find an architecture or C compiler where this will
actually ever make a difference, but the fact remains that you
shouldn't use signed integers for counters like this. You should use
unsigned, and you should rely on the well-defined modulo-2**n
semantics.

I'd also like to see a comment somewhere in the source code about the
whole algorithm and the rules.

Other than that, I guess it looks ok.

          Linus