* Paul E. McKenney <paul...@linux.vnet.ibm.com> wrote: > From: Patrick Marlier <patrick.marl...@gmail.com> > > The current list_entry_rcu() implementation copies the pointer to a stack > variable, then invokes rcu_dereference_raw() on it. This results in an > additional store-load pair. Now, most compilers will emit normal store > and load instructions, which might seem to be of negligible overhead, > but this results in a load-hit-store situation that can cause surprisingly > long pipeline stalls, even on modern microprocessors. The problem is > that it takes time for the store to get the store buffer updated, which > can delay the subsequent load, which immediately follows. > > This commit therefore switches to the lockless_dereference() primitive, > which does not expect the __rcu annotations (that are anyway not present > in the list_head structure) and which, like rcu_dereference_raw(), > does not check for an enclosing RCU read-side critical section. > Most importantly, it does not copy the pointer, thus avoiding the > load-hit-store overhead. > > Signed-off-by: Patrick Marlier <patrick.marl...@gmail.com> > [ paulmck: Switched to lockless_dereference() to suppress sparse warnings. ] > Signed-off-by: Paul E. McKenney <paul...@linux.vnet.ibm.com> > --- > include/linux/rculist.h | 5 +---- > 1 file changed, 1 insertion(+), 4 deletions(-) > > diff --git a/include/linux/rculist.h b/include/linux/rculist.h > index 17c6b1f84a77..5ed540986019 100644 > --- a/include/linux/rculist.h > +++ b/include/linux/rculist.h > @@ -247,10 +247,7 @@ static inline void list_splice_init_rcu(struct list_head > *list, > * primitives such as list_add_rcu() as long as it's guarded by > rcu_read_lock(). > */ > #define list_entry_rcu(ptr, type, member) \ > -({ \ > - typeof(*ptr) __rcu *__ptr = (typeof(*ptr) __rcu __force *)ptr; \ > - container_of((typeof(ptr))rcu_dereference_raw(__ptr), type, member); \ > -}) > + container_of(lockless_dereference(ptr), type, member)
So this commit: 8db70b132dd5 ("rculist: Make list_entry_rcu() use lockless_dereference()") when merged with Linus's latest tree, triggers the following build failure on allyesconfig/allmodconfig x86: triton:~/tip> make fs/fs-writeback.o CHK include/config/kernel.release CHK include/generated/uapi/linux/version.h CHK include/generated/utsrelease.h CHK include/generated/bounds.h CHK include/generated/timeconst.h CHK include/generated/asm-offsets.h CALL scripts/checksyscalls.sh CC fs/fs-writeback.o In file included from fs/fs-writeback.c:16:0: fs/fs-writeback.c: In function ‘bdi_split_work_to_wbs’: include/linux/compiler.h:281:20: error: lvalue required as unary ‘&’ operand __read_once_size(&(x), __u.__c, sizeof(x)); \ ^ include/linux/kernel.h:811:49: note: in definition of macro ‘container_of’ const typeof( ((type *)0)->member ) *__mptr = (ptr); \ ^ include/linux/compiler.h:286:22: note: in expansion of macro ‘__READ_ONCE’ #define READ_ONCE(x) __READ_ONCE(x, 1) ^ include/linux/compiler.h:533:26: note: in expansion of macro ‘READ_ONCE’ typeof(p) _________p1 = READ_ONCE(p); \ ^ include/linux/rculist.h:250:15: note: in expansion of macro ‘lockless_dereference’ container_of(lockless_dereference(ptr), type, member) ^ fs/fs-writeback.c:782:29: note: in expansion of macro ‘list_entry_rcu’ struct bdi_writeback *wb = list_entry_rcu(&bdi->wb_list, ^ include/linux/compiler.h:283:28: error: lvalue required as unary ‘&’ operand __read_once_size_nocheck(&(x), __u.__c, sizeof(x)); \ ^ include/linux/kernel.h:811:49: note: in definition of macro ‘container_of’ const typeof( ((type *)0)->member ) *__mptr = (ptr); \ ^ include/linux/compiler.h:286:22: note: in expansion of macro ‘__READ_ONCE’ #define READ_ONCE(x) __READ_ONCE(x, 1) ^ include/linux/compiler.h:533:26: note: in expansion of macro ‘READ_ONCE’ typeof(p) _________p1 = READ_ONCE(p); \ ^ include/linux/rculist.h:250:15: note: in expansion of macro ‘lockless_dereference’ container_of(lockless_dereference(ptr), type, member) ^ fs/fs-writeback.c:782:29: note: in expansion of macro ‘list_entry_rcu’ struct bdi_writeback *wb = list_entry_rcu(&bdi->wb_list, ^ scripts/Makefile.build:258: recipe for target 'fs/fs-writeback.o' failed make[1]: *** [fs/fs-writeback.o] Error 1 Makefile:1526: recipe for target 'fs/fs-writeback.o' failed make: *** [fs/fs-writeback.o] Error 2 It's this new usage in fs/fs-writeback.c: static void bdi_split_work_to_wbs(struct backing_dev_info *bdi, struct wb_writeback_work *base_work, bool skip_if_busy) { struct bdi_writeback *last_wb = NULL; struct bdi_writeback *wb = list_entry_rcu(&bdi->wb_list, struct bdi_writeback, bdi_node); so AFAICS the problem is lockless_dereference() not being able to take this valid but non-lvalue list head, because READ_ONCE() does not take non-lvalues. All other existing uses of list_entry_rcu() happen to use lvalues, so this bug didn't get triggered before. For now I've reverted this commit locally, to make the kernel build and so, but we need a cleaner solution I suspect. Thanks, Ingo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/