Re: [lttng-dev] RCU consistency guarantees
On Sat, Dec 14, 2019 at 01:31:31AM -0500, Yuxin Ren wrote: > Hi Paul > > On Sat, Dec 7, 2019 at 5:42 PM Paul E. McKenney wrote: > > > On Sat, Dec 07, 2019 at 03:04:42PM -0500, Yuxin Ren wrote: > > > Thanks a lot for your help. I have some questions below. > > > > > > On Sat, Dec 7, 2019 at 1:37 AM Paul E. McKenney > > wrote: > > > > > > > On Fri, Dec 06, 2019 at 07:00:13PM -0500, Yuxin Ren wrote: > > > > > Thanks so much for your great help. > > > > > I definitely will look at those resources and papers! > > > > > > > > > > One more thing that I am confused > > > > > As I mentioned earlier, someone said One key distinction is that both > > > > MVCC > > > > > and RLU provide much stronger consistency guarantees to readers than > > does > > > > > RCU ...) (https://lwn.net/Articles/777036/). > > > > > > > > That someone was in fact me. ;-) > > > > > > > > > I am not sure if the above statement is correct or not. But in > > general, > > > > > How can we compare RCU consistency guarantees to other techniques > > (such > > > > as > > > > > RLU)? > > > > > How to reason about which one has stronger or weaker guarantees? > > > > > > > > I suggest starting from the use case. For concreteness, let's assume > > > > that we are using a hash table. At one extreme, imagine a use case in > > > > which each event makes exactly one hash-table operation. No > > information > > > > is carried from one event to the next. (This might well be the case > > > > for simple web server.) Such a use case cannot tell the difference > > > > between RCU on the one hand and MVCC/RLU on the other. > > > > > > > > At the other extreme, suppose that each event either accesses or > > updates > > > > multiple entries in the hash table. In this case, MVCC/RLU will rule > > > > out outcomes that RCU would permit. For example, suppose we had four > > > > events accessing two different elements in different buckets of the > > > > hash table: > > > > > > > > E1: Adds 32 to the hash table. > > > > E2: Adds 1729 to the hash table. > > > > E3: Within a read-side critical section, looks up 32 then 1729. > > > > E4: Within a read-side critical section, looks up 1729 then 32. > > > > > > > > Given either MVCC or RLU, it will not be possible for E3 to find 32 but > > > > not 1729 and at the same time for E4 to find 1729 but not 32. Given > > RCU, > > > > this outcome is possible. > > > > > > > When you say "Within a read-side section", do you mean within a single > > same > > > read section? such as > > > > > > > read_lock() > > > > lookup(32) > > > > lookup(1729) > > > > read_unlock() > > > > > > > > > How about putting two lookups into two read-side sections? Do we still > > have > > > the problem, then? > > > > > > > read_lock() > > > > lookup(32) > > > > read_unlock() > > > > read_lock() > > > > lookup(1729) > > > > read_unlock() > > > > Without in any way agreeing with your characterization of this as a > > problem, because rcu_read_lock() and rcu_read_unlock() provide > > absolutely no ordering guarantees in the absence of a grace period, > > any non-grace-period-related reordering that can happen with a single > > RCU read-side critical section can also happen when that critical > > section is split in two as you have done above. > > > > > Could you kindly give me more clues why RCU can see such reorder, while > > RLU > > > can prevent it? > > > > Here are minimal C-language implementations for RCU that can (and are) > > actually used: > > > Great. We use the same thing in our real-time work [1] It has been a popular choice for 40 years now. ;-) > > #define rcu_read_lock() > > #define rcu_read_unlock() > > > > Please compare these to the read-side markers presented in the RLU paper, > > and then tell me your thoughts on the answer to your question. ;-) > > > I submit my homework here, but I do not think I did it well. > 1. I believe in the default URCU implementation, it has memory barrier > inside the read_lock / read_unlock. It certainly was at one time. But you would of course need to check to see what any given worker actually used. > 2. From the RLU paper, it shows the code for read-side operation. > RLU_READER_LOCK(ctx) > ctx.is-writer←false > ctx.run-cnt←ctx.run-cnt+1. > memory fence >ctx.local-clock←global-clock > RLU_READER_UNLOCK(ctx) > ctx.run-cnt←ctx.run-cnt+1 > if (ctx.is-writer) > RLU_COMMIT_WRITE_LOG(ctx) > > We can ignore the writer check, as in our case, the reader never do update. > My understanding is that read-side operations are only used to facilitate > the quiescence detection. > the run cnt is used to decide if a thread is active (if it is currently > inside a read-side section). > the local clock is used to check if the active thread goes into the > read-side section very late, so it does not prevent reclaiming memory > unlinked before it enters the read section. > However i see no difference between RLU and RCU read-side operation > regar
Re: [lttng-dev] RCU consistency guarantees
On Sun, Dec 15, 2019 at 3:30 PM Paul E. McKenney wrote: > On Sat, Dec 14, 2019 at 01:31:31AM -0500, Yuxin Ren wrote: > > Hi Paul > > > > On Sat, Dec 7, 2019 at 5:42 PM Paul E. McKenney > wrote: > > > > > On Sat, Dec 07, 2019 at 03:04:42PM -0500, Yuxin Ren wrote: > > > > Thanks a lot for your help. I have some questions below. > > > > > > > > On Sat, Dec 7, 2019 at 1:37 AM Paul E. McKenney > > > wrote: > > > > > > > > > On Fri, Dec 06, 2019 at 07:00:13PM -0500, Yuxin Ren wrote: > > > > > > Thanks so much for your great help. > > > > > > I definitely will look at those resources and papers! > > > > > > > > > > > > One more thing that I am confused > > > > > > As I mentioned earlier, someone said One key distinction is that > both > > > > > MVCC > > > > > > and RLU provide much stronger consistency guarantees to readers > than > > > does > > > > > > RCU ...) (https://lwn.net/Articles/777036/). > > > > > > > > > > That someone was in fact me. ;-) > > > > > > > > > > > I am not sure if the above statement is correct or not. But in > > > general, > > > > > > How can we compare RCU consistency guarantees to other techniques > > > (such > > > > > as > > > > > > RLU)? > > > > > > How to reason about which one has stronger or weaker guarantees? > > > > > > > > > > I suggest starting from the use case. For concreteness, let's > assume > > > > > that we are using a hash table. At one extreme, imagine a use > case in > > > > > which each event makes exactly one hash-table operation. No > > > information > > > > > is carried from one event to the next. (This might well be the > case > > > > > for simple web server.) Such a use case cannot tell the difference > > > > > between RCU on the one hand and MVCC/RLU on the other. > > > > > > > > > > At the other extreme, suppose that each event either accesses or > > > updates > > > > > multiple entries in the hash table. In this case, MVCC/RLU will > rule > > > > > out outcomes that RCU would permit. For example, suppose we had > four > > > > > events accessing two different elements in different buckets of the > > > > > hash table: > > > > > > > > > > E1: Adds 32 to the hash table. > > > > > E2: Adds 1729 to the hash table. > > > > > E3: Within a read-side critical section, looks up 32 then > 1729. > > > > > E4: Within a read-side critical section, looks up 1729 > then 32. > > > > > > > > > > Given either MVCC or RLU, it will not be possible for E3 to find > 32 but > > > > > not 1729 and at the same time for E4 to find 1729 but not 32. > Given > > > RCU, > > > > > this outcome is possible. > > > > > > > > > When you say "Within a read-side section", do you mean within a > single > > > same > > > > read section? such as > > > > > > > > > read_lock() > > > > > lookup(32) > > > > > lookup(1729) > > > > > read_unlock() > > > > > > > > > > > > How about putting two lookups into two read-side sections? Do we > still > > > have > > > > the problem, then? > > > > > > > > > read_lock() > > > > > lookup(32) > > > > > read_unlock() > > > > > read_lock() > > > > > lookup(1729) > > > > > read_unlock() > > > > > > Without in any way agreeing with your characterization of this as a > > > problem, because rcu_read_lock() and rcu_read_unlock() provide > > > absolutely no ordering guarantees in the absence of a grace period, > > > any non-grace-period-related reordering that can happen with a single > > > RCU read-side critical section can also happen when that critical > > > section is split in two as you have done above. > > > > > > > Could you kindly give me more clues why RCU can see such reorder, > while > > > RLU > > > > can prevent it? > > > > > > Here are minimal C-language implementations for RCU that can (and are) > > > actually used: > > > > > Great. We use the same thing in our real-time work [1] > > It has been a popular choice for 40 years now. ;-) > > > > #define rcu_read_lock() > > > #define rcu_read_unlock() > > > > > > Please compare these to the read-side markers presented in the RLU > paper, > > > and then tell me your thoughts on the answer to your question. ;-) > > > > > I submit my homework here, but I do not think I did it well. > > 1. I believe in the default URCU implementation, it has memory barrier > > inside the read_lock / read_unlock. > > It certainly was at one time. But you would of course need to check > to see what any given worker actually used. > > > 2. From the RLU paper, it shows the code for read-side operation. > > RLU_READER_LOCK(ctx) > > ctx.is-writer←false > > ctx.run-cnt←ctx.run-cnt+1. > > memory fence > >ctx.local-clock←global-clock > > RLU_READER_UNLOCK(ctx) > > ctx.run-cnt←ctx.run-cnt+1 > > if (ctx.is-writer) > > RLU_COMMIT_WRITE_LOG(ctx) > > > > We can ignore the writer check, as in our case, the reader never do > update. > > My understanding is that read-side operations are only used to facilitate > > the quiescence detection. > > the run cnt is used to decide if a t
Re: [lttng-dev] RCU consistency guarantees
On Sun, Dec 15, 2019 at 05:10:11PM -0500, Yuxin Ren wrote: > On Sun, Dec 15, 2019 at 3:30 PM Paul E. McKenney wrote: > > > On Sat, Dec 14, 2019 at 01:31:31AM -0500, Yuxin Ren wrote: > > > Hi Paul > > > > > > On Sat, Dec 7, 2019 at 5:42 PM Paul E. McKenney > > wrote: > > > > > > > On Sat, Dec 07, 2019 at 03:04:42PM -0500, Yuxin Ren wrote: > > > > > Thanks a lot for your help. I have some questions below. > > > > > > > > > > On Sat, Dec 7, 2019 at 1:37 AM Paul E. McKenney > > > > wrote: > > > > > > > > > > > On Fri, Dec 06, 2019 at 07:00:13PM -0500, Yuxin Ren wrote: > > > > > > > Thanks so much for your great help. > > > > > > > I definitely will look at those resources and papers! > > > > > > > > > > > > > > One more thing that I am confused > > > > > > > As I mentioned earlier, someone said One key distinction is that > > both > > > > > > MVCC > > > > > > > and RLU provide much stronger consistency guarantees to readers > > than > > > > does > > > > > > > RCU ...) (https://lwn.net/Articles/777036/). > > > > > > > > > > > > That someone was in fact me. ;-) > > > > > > > > > > > > > I am not sure if the above statement is correct or not. But in > > > > general, > > > > > > > How can we compare RCU consistency guarantees to other techniques > > > > (such > > > > > > as > > > > > > > RLU)? > > > > > > > How to reason about which one has stronger or weaker guarantees? > > > > > > > > > > > > I suggest starting from the use case. For concreteness, let's > > assume > > > > > > that we are using a hash table. At one extreme, imagine a use > > case in > > > > > > which each event makes exactly one hash-table operation. No > > > > information > > > > > > is carried from one event to the next. (This might well be the > > case > > > > > > for simple web server.) Such a use case cannot tell the difference > > > > > > between RCU on the one hand and MVCC/RLU on the other. > > > > > > > > > > > > At the other extreme, suppose that each event either accesses or > > > > updates > > > > > > multiple entries in the hash table. In this case, MVCC/RLU will > > rule > > > > > > out outcomes that RCU would permit. For example, suppose we had > > four > > > > > > events accessing two different elements in different buckets of the > > > > > > hash table: > > > > > > > > > > > > E1: Adds 32 to the hash table. > > > > > > E2: Adds 1729 to the hash table. > > > > > > E3: Within a read-side critical section, looks up 32 then > > 1729. > > > > > > E4: Within a read-side critical section, looks up 1729 > > then 32. > > > > > > > > > > > > Given either MVCC or RLU, it will not be possible for E3 to find > > 32 but > > > > > > not 1729 and at the same time for E4 to find 1729 but not 32. > > Given > > > > RCU, > > > > > > this outcome is possible. > > > > > > > > > > > When you say "Within a read-side section", do you mean within a > > single > > > > same > > > > > read section? such as > > > > > > > > > > > read_lock() > > > > > > lookup(32) > > > > > > lookup(1729) > > > > > > read_unlock() > > > > > > > > > > > > > > > How about putting two lookups into two read-side sections? Do we > > still > > > > have > > > > > the problem, then? > > > > > > > > > > > read_lock() > > > > > > lookup(32) > > > > > > read_unlock() > > > > > > read_lock() > > > > > > lookup(1729) > > > > > > read_unlock() > > > > > > > > Without in any way agreeing with your characterization of this as a > > > > problem, because rcu_read_lock() and rcu_read_unlock() provide > > > > absolutely no ordering guarantees in the absence of a grace period, > > > > any non-grace-period-related reordering that can happen with a single > > > > RCU read-side critical section can also happen when that critical > > > > section is split in two as you have done above. > > > > > > > > > Could you kindly give me more clues why RCU can see such reorder, > > while > > > > RLU > > > > > can prevent it? > > > > > > > > Here are minimal C-language implementations for RCU that can (and are) > > > > actually used: > > > > > > > Great. We use the same thing in our real-time work [1] > > > > It has been a popular choice for 40 years now. ;-) > > > > > > #define rcu_read_lock() > > > > #define rcu_read_unlock() > > > > > > > > Please compare these to the read-side markers presented in the RLU > > paper, > > > > and then tell me your thoughts on the answer to your question. ;-) > > > > > > > I submit my homework here, but I do not think I did it well. > > > 1. I believe in the default URCU implementation, it has memory barrier > > > inside the read_lock / read_unlock. > > > > It certainly was at one time. But you would of course need to check > > to see what any given worker actually used. > > > > > 2. From the RLU paper, it shows the code for read-side operation. > > > RLU_READER_LOCK(ctx) > > > ctx.is-writer←false > > > ctx.run-cnt←ctx.run-cnt+1. > > > memory fence > > >ctx.local-clock←global-clock > > > RLU_READER_UNLOCK(ctx) > > >