Hi.

Are there any points of differentiation from liburcu [1]?
Is there any benefit in having our own implementation inside DPDK?

[1] http://liburcu.org/
    https://lwn.net/Articles/573424/

Best regards, Ilya Maximets.

> Lock-less data structures provide scalability and determinism.
> They enable use cases where locking may not be allowed
> (for example, real-time applications).
>
> In the following paragraphs, the term 'memory' refers to memory allocated
> by typical APIs like malloc or anything that is representative of
> memory, for example, an index of a free element array.
>
> Since these data structures are lock-less, the writers and readers
> are accessing the data structures simultaneously. Hence, while removing
> an element from a data structure, the writers cannot return the memory
> to the allocator, without knowing that the readers are not
> referencing that element/memory anymore. Hence, it is required to
> separate the operation of removing an element into 2 steps:
>
> Delete: in this step, the writer removes the element from the
> data structure but does not return the associated memory to the allocator.
> This will ensure that new readers will not get a reference to the removed
> element. Removing the reference is an atomic operation.
>
> Free: in this step, the writer returns the memory to the
> memory allocator, only after knowing that all the readers have stopped
> referencing the removed element.
>
> This library helps the writer determine when it is safe to free the
> memory.
>
> This library makes use of Thread Quiescent State (TQS). TQS can be
> defined as 'any point in the thread execution where the thread does
> not hold a reference to shared memory'. It is up to the application to
> determine its quiescent state.
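
To make sure I read the Delete/Free split correctly, here is a minimal sketch of it in plain C11 atomics. Everything in it (slot, quiescent, element_delete, reader_quiescent, element_try_free) is a hypothetical name made up for illustration; none of it is taken from the proposed rte_tqs API or from liburcu, and it assumes a single writer and a fixed, small set of readers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

#define MAX_READERS 4

/* Hypothetical shared slot that publishes one element to readers. */
static _Atomic(int *) slot;

/* Hypothetical per-reader flag: true once that reader has passed
 * through a quiescent state after the most recent delete. */
static atomic_bool quiescent[MAX_READERS];

/* Delete: atomically unpublish the element but keep its memory alive.
 * The flags are cleared only after the pointer is gone, so a report
 * made before the delete can never be mistaken for one made after it. */
static int *element_delete(void)
{
    int *old = atomic_exchange(&slot, NULL);

    for (int i = 0; i < MAX_READERS; i++)
        atomic_store(&quiescent[i], false);
    return old;
}

/* Reader side: called at a point where the reader holds no reference. */
static void reader_quiescent(int id)
{
    assert(id >= 0 && id < MAX_READERS);
    atomic_store(&quiescent[id], true);
}

/* Free: return the memory only once every reader has reported.
 * Returns true if the element was freed, false if it is still unsafe. */
static bool element_try_free(int *elem, int nreaders)
{
    for (int i = 0; i < nreaders; i++)
        if (!atomic_load(&quiescent[i]))
            return false;
    free(elem);
    return true;
}
```

With two readers, element_try_free() keeps returning false after element_delete() until both readers have called reader_quiescent(); only then is the memory released.
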
> Let us consider the following diagram:
>
> Time -------------------------------------------------->
>
>                               |     |
> RT1   $++++****D1****+++***D2*|**+++|+++**D3*****++++$
>                               |     |
> RT2          $++++****D1****++|+**D2|***++++++**D3*****++++$
>                               |     |
> RT3      $++++****D1****+++***|D2***|++++++**D2*****++++$
>                               |     |
>                               |<--->|
>                              Del | Free
>                                  |
>                         Cannot free memory
>                         during this period
>
> RTx - Reader thread
> $ - Start and end of while(1) loop
> ***Dx*** - Reader thread is accessing the shared data structure Dx.
>            i.e. critical section.
> +++ - Reader thread is not accessing any shared data structure.
>       i.e. non-critical section or quiescent state.
> Del - Point in time when the reference to the entry is removed using
>       atomic operation.
> Free - Point in time when the writer can free the entry.
>
> As shown, thread RT1 accesses data structures D1, D2 and D3. When it is
> accessing D2, if the writer has to remove an element from D2, the
> writer cannot return the memory associated with that element to the
> allocator. The writer can return the memory to the allocator only after
> the reader stops referencing D2. In other words, reader thread RT1
> has to enter a quiescent state.
>
> Similarly, since thread RT3 is also accessing D2, writer has to wait till
> RT3 enters quiescent state as well.
>
> However, the writer does not need to wait for RT2 to enter quiescent state.
> Thread RT2 was not accessing D2 when the delete operation happened.
> So, RT2 will not get a reference to the deleted entry.
>
> It can be noted that the critical sections for D2 and D3 are quiescent states
> for D1. i.e. for a given data structure Dx, any point in the thread execution
> that does not reference Dx is a quiescent state.
>
> For DPDK applications, the start and end of while(1) loop (where no shared
> data structures are getting accessed) act as perfect quiescent states.
> This will combine all the shared data structure accesses into a single
> critical section and keep the overhead introduced by this library to a
> minimum.
>
> However, the time taken to identify the end of the critical section is
> proportional to the length of the critical section and the number of
> reader threads. So, if the application desires, it should be possible to
> identify the end of the critical section for each data structure.
>
> To provide the required flexibility, this library has a concept of TQS
> variable. The application can create one or more TQS variables to help it
> track the end of one or more critical sections.
>
> The application can create a TQS variable using the API rte_tqs_alloc.
> It takes a mask of lcore IDs that will report their quiescent states
> using this variable. This mask can be empty to start with.
>
> rte_tqs_register_lcore API will register a reader thread to report its
> quiescent state. This can be called from any control plane thread or from
> the reader thread. The application can create a TQS variable with no reader
> threads and add the threads dynamically using this API.
>
> The application can trigger the reader threads to report their quiescent
> state status by calling the API rte_tqs_start. It is possible for multiple
> writer threads to query the quiescent state status simultaneously. Hence,
> rte_tqs_start returns a token to each caller.
>
> The application has to call rte_tqs_check API with the token to get the
> current status. Option to block till all the threads enter the quiescent
> state is provided. If this API indicates that all the threads have entered
> the quiescent state, the application can free the deleted entry.
>
> The separation of triggering the reporting from querying the status provides
> the writer threads flexibility to do useful work instead of waiting for the
> reader threads to enter the quiescent state.
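
For comparison purposes, this is roughly how I picture the token-based start/check flow working. It is only a sketch under my own assumptions: the sketch_* names, the 64-reader bitmask and the counter layout are all hypothetical, the real rte_tqs signatures are not shown in this cover letter, and a production version would need weaker memory orderings and cache-line padding.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_READERS 64

/* Hypothetical stand-in for a "TQS variable". */
struct sketch_tqs {
    _Atomic uint64_t reader_mask;              /* bit i set: reader i must report */
    _Atomic uint64_t token;                    /* last token handed to a writer */
    _Atomic uint64_t seen[SKETCH_MAX_READERS]; /* last token each reader acknowledged */
};

static void sketch_init(struct sketch_tqs *v)
{
    atomic_init(&v->reader_mask, 0);
    atomic_init(&v->token, 0);
    for (unsigned i = 0; i < SKETCH_MAX_READERS; i++)
        atomic_init(&v->seen[i], 0);
}

/* Add a reader; it starts out as having acknowledged the current token. */
static void sketch_register(struct sketch_tqs *v, unsigned id)
{
    assert(id < SKETCH_MAX_READERS);
    atomic_store(&v->seen[id], atomic_load(&v->token));
    atomic_fetch_or(&v->reader_mask, UINT64_C(1) << id);
}

/* Remove a reader; later checks no longer wait for it. */
static void sketch_unregister(struct sketch_tqs *v, unsigned id)
{
    assert(id < SKETCH_MAX_READERS);
    atomic_fetch_and(&v->reader_mask, ~(UINT64_C(1) << id));
}

/* Writer: trigger a query. Each caller gets its own token. */
static uint64_t sketch_start(struct sketch_tqs *v)
{
    return atomic_fetch_add(&v->token, 1) + 1;
}

/* Reader: report a quiescent state by acknowledging the latest token. */
static void sketch_update(struct sketch_tqs *v, unsigned id)
{
    assert(id < SKETCH_MAX_READERS);
    atomic_store(&v->seen[id], atomic_load(&v->token));
}

/* Writer: non-blocking check. True once every registered reader has
 * acknowledged a token at least as new as the one passed in. */
static bool sketch_check(struct sketch_tqs *v, uint64_t token)
{
    uint64_t mask = atomic_load(&v->reader_mask);

    for (unsigned id = 0; id < SKETCH_MAX_READERS; id++)
        if (((mask >> id) & 1) && atomic_load(&v->seen[id]) < token)
            return false;
    return true;
}
```

A writer would call sketch_start() right after the delete, carry on with other work, and free the entry once sketch_check() returns true for its token. Because tokens only grow, one reader update can satisfy several outstanding writer queries at once.
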
>
> rte_tqs_unregister_lcore API will remove a reader thread from reporting its
> quiescent state using a TQS variable. The rte_tqs_check API will not wait
> for this reader thread to report the quiescent state status anymore.
>
> Finally, a TQS variable can be deleted by calling rte_tqs_free API.
> Application must make sure that the reader threads are not referencing the
> TQS variable anymore before deleting it.
>
> The reader threads should call rte_tqs_update API to indicate that they
> entered a quiescent state. This API checks if a writer has triggered a
> quiescent state query and updates the state accordingly.
>
> Next Steps:
> 1) Add more test cases
> 2) Convert to patch
> 3) Incorporate feedback from community
> 4) Add documentation
>
> Dharmik Thakkar (1):
>   test/tqs: Add API and functional tests
>
> Honnappa Nagarahalli (2):
>   log: add TQS log type
>   tqs: add thread quiescent state library
>
>  config/common_base                      |   6 +
>  lib/Makefile                            |   2 +
>  lib/librte_eal/common/include/rte_log.h |   1 +
>  lib/librte_tqs/Makefile                 |  23 +
>  lib/librte_tqs/meson.build              |   5 +
>  lib/librte_tqs/rte_tqs.c                | 249 +++++++++++
>  lib/librte_tqs/rte_tqs.h                | 352 +++++++++++++++
>  lib/librte_tqs/rte_tqs_version.map      |  16 +
>  lib/meson.build                         |   2 +-
>  mk/rte.app.mk                           |   1 +
>  test/test/Makefile                      |   2 +
>  test/test/autotest_data.py              |   6 +
>  test/test/meson.build                   |   5 +-
>  test/test/test_tqs.c                    | 540 ++++++++++++++++++++++++
>  14 files changed, 1208 insertions(+), 2 deletions(-)
>  create mode 100644 lib/librte_tqs/Makefile
>  create mode 100644 lib/librte_tqs/meson.build
>  create mode 100644 lib/librte_tqs/rte_tqs.c
>  create mode 100644 lib/librte_tqs/rte_tqs.h
>  create mode 100644 lib/librte_tqs/rte_tqs_version.map
>  create mode 100644 test/test/test_tqs.c
>
> --
> 2.17.1