> From: Honnappa Nagarahalli [mailto:honnappa.nagaraha...@arm.com]
> Sent: Saturday, 2 April 2022 21.31
> 
> <snip>
> 
> > > +__rte_experimental
> > > +static inline bool
> > > +rte_seqlock_read_tryunlock(const rte_seqlock_t *seqlock, uint32_t
> > > +*begin_sn) {
> > > + uint32_t end_sn;
> > > +
> > > + /* make sure the data loads happens before the sn load */
> > > + rte_atomic_thread_fence(__ATOMIC_ACQUIRE);
> > > +
> > > + end_sn = __atomic_load_n(&seqlock->sn, __ATOMIC_RELAXED);
> >
> > Since we are reading and potentially returning the sequence number
> here
> > (repeating the read of the protected data), we need to use load-
> acquire.
> > I assume it is not expected that the user will call
> > rte_seqlock_read_lock() again.
> Good point, we need a load-acquire (due to changes done in v3).
> 
> >
> > Seeing this implementation, I might actually prefer the original
> > implementation, I think it is cleaner. But I would like for the begin
> function
> > also to wait for an even sequence number, the end function would only
> have
> > to check for same sequence number, this might improve performance a
> little
> > bit as readers won't perform one or several broken reads while a
> write is in
> > progress. The function names are a different thing though.
> I think we need to be optimizing for the case where there is no
> contention between readers and writers (as that happens most of the
> time). From this perspective, not checking for an even seq number in
> the begin function would reduce one 'if' statement.

I might be siding with Ola on this, but with a twist: The read_lock() should 
not wait, but test. (Or both variants could be available. Or all three, 
including the variant without checking for an even sequence number.)

My argument for this is: The write operation could take a long time to 
complete, and while this goes on, it is good for the reading threads to know at 
entry of their critical read section that the read operation will fail, so they 
can take the alternative code path instead of proceeding into the critical read 
section. Otherwise, the reading threads have to waste time reading the 
protected data, only to discard them at the end. It's an optimization based on 
the assumption that reading the protected data has some small cost, because 
this small cost adds up if done many times during a longwinded write operation.

And, although checking for the sequence number in read_trylock() adds an 'if' 
statement to it, that 'if' statement should be surrounded by likely() to reduce 
its cost in the case we are optimizing for, i.e. when no write operation is 
ongoing.

This means that read_trylock() returns a boolean, and the sequence number is 
returned in an output parameter.

Please note that it doesn't change the fact that read_tryunlock() can still 
fail, even though read_trylock() gave the go-ahead.

I'm trying to highlight that while we all agree to optimize for the case of 
reading while no writing is ongoing, there might be opportunity for optimizing 
for the opposite case (i.e. trying to read while writing is ongoing) at the 
same time.

I only hope it can be done with negligent performance cost for the primary case.

I'll respectfully leave the hardcore implementation details and performance 
considerations to you experts in this area. :-)

Reply via email to