> 
> > > > Mixed feelings about this one.
> > > >
> > > > Love to see RCU used for more things since it is much better than
> > > > reader/writer locks for many applications. But hate to see DPDK
> > > > reinventing every other library and not reusing code. Userspace
> > > > RCU https://liburcu.org/ is widely supported by distro's, more
> > > > throughly tested and documented, and more flexiple.
> > > >
> > > > The issue with many of these reinventions is a tradeoff of DPDK
> > > > growing another dependency on external library versus using common
> code.
> > > >
> > Agree with the dependency issues. Sometimes flexibility also causes 
> > confusion
> and features that are not necessarily required for a targeted use case. I have
> seen that much of the functionality that can be left to the application is
> implemented as part of the library.
> > I think having it in DPDK will give us control over the amount of 
> > capability this
> library will have and freedom over changes we would like to make to such a
> library. I also view DPDK as one package where all things required for data
> plane development are available.
> >
> > > > For RCU, the big issue for me is the testing and documentation of
> > > > how to do RCU safely. Many people get it wrong!
> > Hopefully, we all will do a better job collectively :)
> >
> > >
> 
> Reinventing RCU is not helping anyone.
IMO, this depends on what the rte_tqs has to offer and what the requirements 
are. Before starting this patch, I looked at the liburcu APIs. I have to say, 
fairly quickly (no offense) I concluded that this does not address DPDK's 
needs. I took a deeper look at the APIs/code in the past day and I still 
concluded the same. My partial analysis (analysis of more APIs can be done, I 
do not have cycles at this point) is as follows:

The reader threads' information is maintained in a linked list[1]. This linked 
list is protected by a mutex lock[2]. Any additions/deletions/traversals of 
this list are blocking and cannot happen in parallel.

The API, 'synchronize_rcu' [3] (similar functionality to rte_tqs_check call) is 
a blocking call. There is no option provided to make it non-blocking. The 
writer spins cycles while waiting for the grace period to get over.

'synchronize_rcu' also has grace period lock [4]. If I have multiple writers 
running on data plane threads, I cannot call this API to reclaim the memory in 
the worker threads as it will block other worker threads. This means, there is 
an extra thread required (on the control plane?) which does garbage collection 
and a method to push the pointers from worker threads to the garbage collection 
thread. This also means the time duration from delete to free increases putting 
pressure on amount of memory held up.
Since this API cannot be called concurrently by multiple writers, each writer 
has to wait for other writer's grace period to get over (i.e. multiple writer 
threads cannot overlap their grace periods).

This API also has to traverse the linked list which is not very well suited for 
calling on data plane.

I have not gone too much into rcu_thread_offline[5] API. This again needs to be 
used in worker cores and does not look to be very optimal.

I have glanced at rcu_quiescent_state [6], it wakes up the thread calling 
'synchronize_rcu' which seems good amount of code for the data plane.

[1] 
https://github.com/urcu/userspace-rcu/blob/master/include/urcu/static/urcu-qsbr.h#L85
[2] https://github.com/urcu/userspace-rcu/blob/master/src/urcu-qsbr.c#L68
[3] https://github.com/urcu/userspace-rcu/blob/master/src/urcu-qsbr.c#L344
[4] https://github.com/urcu/userspace-rcu/blob/master/src/urcu-qsbr.c#L58
[5] 
https://github.com/urcu/userspace-rcu/blob/master/include/urcu/static/urcu-qsbr.h#L211
[6] 
https://github.com/urcu/userspace-rcu/blob/master/include/urcu/static/urcu-qsbr.h#L193

Coming to what is provided in rte_tqs:

The synchronize_rcu functionality is split in to 2 APIs: rte_tqs_start and 
rte_tqs_check. The reader data is maintained as an array.

Both the APIs are lock-free, allowing them to be called from multiple threads 
concurrently. This allows multiple writers to wait for their grace periods 
concurrently as well as overlap their grace periods. rte_tqs_start API returns 
a token which provides the ability to separate the quiescent state waiting of 
different writers. Hence, no writer waits for other writer's grace period to 
get over. 
Since these 2 APIs are lock-free, they can be called from writers running on 
worker cores as well without the need for a separate thread to do garbage 
collection.

The separation into 2 APIs provides the ability for writers to not spin cycles 
waiting for the grace period to get over. This enables different ways of doing 
garbage collection. For ex: a data structure delete API could remove the entry 
from the data structure, call rte_tqs_start and return back to the caller. On 
the invocation of next API call of the library, the API can call rte_tqs_check 
(which will mostly indicate that the grace period is complete) and free the 
previously deleted entry.

rte_tqs_update (mapping to rcu_quiescent_state) is pretty small and simple.

rte_tqs_register and rte_tqs_unregister APIs are lock free. Hence additional 
APIs like rcu_thread_online and rcu_thread_offline are not required. The 
rte_tqs_unregister API (when compared to rcu_thread_offline) is much simple and 
conducive to be used in worker threads.

> 
> 
> DPDK needs to fix its dependency model, and just admit that it is ok to build 
> off
> of more than glibc.
> 
> Having used liburcu, it can be done in a small manner and really isn't that
> confusing.
> 
> Is your real issue the LGPL license of liburcu for your skittish customers?
I have not had any discussions on this. Customers are mainly focused on having 
a solution on which they have meaningful control. They want to be able to 
submit a patch and change things if required. For ex: barriers for Arm [7] are 
not optimal. How easy is it to change this and get it into distros (there are 
both internal and external factors here)?

[7] 
https://github.com/urcu/userspace-rcu/blob/master/include/urcu/arch/arm.h#L44

Reply via email to