Thanks for your feedback, I will update the patch in the next few days,
addressing the comments and reorganizing classify_object_over_fdes.
Concerning your question:
+
+restart:
+ struct btree_node *iter;
+ uintptr_t lock;
+ {
+ // Accessing the root node requires defending against concurrent
pointer
+ // changes Thus we couple rootLock -> lock on root node ->
validate rootLock
Can you elaborate a bit more in the comment on why it's necessary and
sufficient to validate the containing node after "locking" the child node?
+ if (!version_lock_lock_optimistic (&(t->root_lock), &lock))
+ goto restart;
+ iter = RLOAD (t->root);
+ if (!version_lock_validate (&(t->root_lock), lock))
+ goto restart;
+ if (!iter)
+ return NULL;
+ uintptr_t child_lock;
+ if ((!btree_node_lock_optimistic (iter, &child_lock))
+ || (!version_lock_validate (&(t->root_lock), lock)))
+ goto restart;
+ lock = child_lock;
+ }
+
+ // Now we can walk down towards the right leaf node
We are reading here without explicit locks, i.e., memory can change
behind our back at any time. Thus, before we act on a value that we have
read, we must check if that value is still valid.
The invariants that we rely upon are:
I1) a node pointer will always be either a nullptr or point to a b-tree
node. The node might have been logically deleted, but the memory will
always remain valid due to our free list.
I2) The root pointer is only modified by a thread that holds an
exclusive lock on root_lock.
I3) valide calls fail if a) somebody else currently holds the lock, or
b) if the version value has changed since locking, which happens when
somebody releases an exlusive lock.
Using that, the code does the following steps
1. We lock root_lock optimistic. In case of an exclusive lock we restart
immediately, otherwise we remember the current version
2. We load the root pointer and store it in iter. iter might contain
arbitrary garbage due to races
3. We validate the root_lock, and restart in case of errors. Now we know
that a) nobody has locked the root exclusively or modified it since step
1 (I3), and b) iter indeed points to the node that corresponds the to
the version that we have read in step 1 (due to I2)
At this point we now that iter is indeed the right root pointer. If it
is a nullptr we stop here. Otherwise we continue with
4. We lock the root node itself (iter), storing the validated version in
child_lock. We need that lock to detect changes to the content of the
root, the root_lock only protects the pointer to the root. Loading the
lock is safe due to I1. As our tree can change at any point in time, we
need another valide call on root_lock to make sure that we still have
the right root pointer.
At this point we have both a pointer to the root node and an optimistic
lock that allows us to detect changes to the content of the root. Thus,
we can safely navigate down, any changes to the b-tree that could affect
us, like, e.g., node splits in the root, will be detected when
validating the lock.
I hope that answered both the necessary and sufficient part. As a
general rule, you must always use the pair relax-atomic-read + validate
to avoid data races. Acting on any value that you have not validated is
unsafe, as you could get races.
Best
Thomas