On 11/07, Mikulas Patocka wrote:
>
> On Wed, 7 Nov 2012, Oleg Nesterov wrote:
>
> > On 11/07, Mikulas Patocka wrote:
> > >
> > > It looks sensible.
> > >
> > > Here I'm sending an improvement of the patch - I changed it so that there
> > > are not two-level nested functions for the fast path and so that both
> > > percpu_down_read and percpu_up_read use the same piece of code (to reduce
> > > cache footprint).
> >
> > IOW, the only change is that you eliminate "static update_fast_ctr()"
> > and fold it into down/up_read which takes the additional argument.
> >
> > Honestly, personally I do not think this is better, but I won't argue.
> > I agree with everything but I guess we need the ack from Paul.
>
> If you look at generated assembly (for x86-64), the footprint of my patch
> is 78 bytes shared for both percpu_down_read and percpu_up_read.
>
> The footprint of your patch is 62 bytes for update_fast_ctr, 46 bytes for
> percpu_down_read and 20 bytes for percpu_up_read.
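
To make the two shapes being compared concrete, here is a rough userspace
sketch. It is not the actual kernel code: struct toy_sem, its fields,
down_read_a/up_read_a, down_up_read_b and shapes_agree are made-up names,
and plain ints stand in for the per-CPU counter and the slow-path reader
count. Only update_fast_ctr() is a name taken from the thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the semaphore; all fields are hypothetical. */
struct toy_sem {
	bool writer_active;	/* models "a writer holds the lock" */
	int fast_ctr;		/* models the per-CPU fast-path counter */
	int slow_ctr;		/* models the slow-path reader count */
};

/* Shape A: a static helper for the fast path, used by two thin wrappers. */
static bool update_fast_ctr(struct toy_sem *s, int val)
{
	if (s->writer_active)
		return false;	/* fast path not allowed, caller falls back */
	s->fast_ctr += val;
	return true;
}

static void down_read_a(struct toy_sem *s)
{
	if (update_fast_ctr(s, +1))
		return;
	s->slow_ctr++;		/* simplified slow path */
}

static void up_read_a(struct toy_sem *s)
{
	if (update_fast_ctr(s, -1))
		return;
	s->slow_ctr--;		/* simplified slow path */
}

/* Shape B: one shared body, the direction is passed as an extra argument. */
static void down_up_read_b(struct toy_sem *s, int val)
{
	if (!s->writer_active) {
		s->fast_ctr += val;
		return;
	}
	s->slow_ctr += val;	/* simplified slow path */
}

/* Run the same sequence through both shapes and compare the counters. */
bool shapes_agree(void)
{
	struct toy_sem a = { false, 0, 0 };
	struct toy_sem b = { false, 0, 0 };

	down_read_a(&a);	down_up_read_b(&b, +1);	/* fast path */
	a.writer_active = b.writer_active = true;
	down_read_a(&a);	down_up_read_b(&b, +1);	/* slow path */
	up_read_a(&a);		down_up_read_b(&b, -1);	/* slow path */
	a.writer_active = b.writer_active = false;
	up_read_a(&a);		down_up_read_b(&b, -1);	/* fast path */

	return a.fast_ctr == b.fast_ctr && a.slow_ctr == b.slow_ctr;
}
```

In this toy model both shapes leave the counters in the same state. The
difference is only in where the code lives: shape A gives the caller two
small entry points plus the helper, shape B shares one body between
down and up.
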

Still, I think the code looks cleaner this way, and personally I think
this is more important. Plus, this lessens the footprint for the caller,
although I agree this is minor.

Please send the incremental patch if you wish, I won't argue. But note
that with the lockdep annotations (and I'll send the patch soon) the
code will look even worse. Either you need another "if (val > 0)" check
or you need to add rwsem_acquire_read/rwsem_release into .h

And if you do this change, please also update the comments; they still
refer to update_fast_ctr() you folded into down_up ;)

Oleg.