* Peter Zijlstra <pet...@infradead.org> wrote: > Implement a latched RB-tree in order to get unconditional RCU/lockless > lookups. > > Cc: Oleg Nesterov <o...@redhat.com> > Cc: Michel Lespinasse <wal...@google.com> > Cc: Andrea Arcangeli <aarca...@redhat.com> > Cc: David Woodhouse <david.woodho...@intel.com> > Cc: Rik van Riel <r...@redhat.com> > Cc: Mathieu Desnoyers <mathieu.desnoy...@efficios.com> > Cc: "Paul E. McKenney" <paul...@linux.vnet.ibm.com> > Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org> > --- > include/linux/rbtree_latch.h | 212 > +++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 212 insertions(+) > > --- /dev/null > +++ b/include/linux/rbtree_latch.h > @@ -0,0 +1,212 @@ > +/* > + * Latched RB-trees > + * > + * Copyright (C) 2015 Intel Corp., Peter Zijlstra <pet...@infradead.org> > + * > + * Since RB-trees have non atomic modifications they're not immediately > suited > + * for RCU/lockless queries. Even though we made RB tree lookups non-fatal > for > + * lockless lookups; we cannot guarantee they return a correct result. > + * > + * The simplest solution is a seqlock + rb-tree, this will allow lockless > + * lookups; but has the constraint (inherent to the seqlock) that read sides > + * cannot nest in write sides. > + * > + * If we need to allow unconditional lookups (say as required for NMI context > + * usage) we need a more complex setup; this data structure provides this by > + * employing the latch technique -- see @raw_write_seqcount_latch -- to > + * implement a latched RB-tree which does allow for unconditional lookups by > + * virtue of always having (at least) one stable copy of the tree. > + * > + * However, while we have the guarantee that there is at all times one stable > + * copy, this does not guarantee an iteration will not observe modifications. > + * What might have been a stable copy at the start of the iteration, need not > + * remain so for the duration of the iteration. > + * > + * Therefore, this does require a lockless RB-tree iteration to be non-fatal; > + * see the comment in lib/rbtree.c. Note however that we only require the > first > + * condition -- not seeing partial stores -- because the latch thing isolates > + * us from loops. If we were to interrupt a modification the lookup would be > + * pointed at the stable tree and complete while the modification was halted.
Minor nit: so this text has 3 variants to spell RB-trees: RB-tree RB tree rb-tree I suggest we pick one! :-) Thanks, Ingo -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/