On 11-Dec-18 6:40 AM, Honnappa Nagarahalli wrote:


+
+/* Add a reader thread, running on an lcore, to the list of threads
+ * reporting their quiescent state on a TQS variable.
+ */
+int __rte_experimental
+rte_tqs_register_lcore(struct rte_tqs *v, unsigned int lcore_id)
+{
+	TQS_RETURN_IF_TRUE((v == NULL || lcore_id >= RTE_TQS_MAX_LCORE),
+				-EINVAL);

It is not good practice for a function to return different
values and behave differently in debug and non-debug mode.
I'd say that for slow-path functions (in .c files) it is always good
to check input parameters.
For fast-path functions (in .h files) we sometimes skip such checks,
but debug mode can probably use RTE_ASSERT() or similar.

Makes sense, I will change this in the next version.
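The distinction can be sketched in plain C (hypothetical names; plain assert() stands in for RTE_ASSERT, and MAX_THREADS for RTE_TQS_MAX_LCORE):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_THREADS 128

struct tqs {                    /* stand-in for struct rte_tqs */
	unsigned long w[MAX_THREADS];
};

/* Slow-path (.c): always validate inputs and return an error code. */
int
tqs_register(struct tqs *v, unsigned int id)
{
	if (v == NULL || id >= MAX_THREADS)
		return -EINVAL;
	/* ... registration work ... */
	return 0;
}

/* Fast-path (inline, .h): skip checks in release builds; catch
 * programming errors with an assert in debug builds only, so the
 * function's return behaviour is the same in both modes.
 */
static inline void
tqs_update(struct tqs *v, unsigned int id)
{
#ifdef DEBUG
	assert(v != NULL && id < MAX_THREADS);
#endif
	/* ... quiescent-state update ... */
	v->w[id]++;
}
```

The point is that the debug build only adds a crash-on-misuse check; it never changes what the caller observes on valid input.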



lcore_id >= RTE_TQS_MAX_LCORE

Is this limitation really necessary?

I added this limitation because currently a DPDK application cannot
take a core mask wider than 64 bits; otherwise, this should be as big as
RTE_MAX_LCORE. I see that with the '-lcores' option, the number of
lcores can be more than the number of PEs. In that case, we still need
a MAX limit (but it can be bigger than 64).

First, it means that only lcores can use that API (at least the
data-path part); second, even today many machines have more than 64
cores.
I think you can easily avoid such a limitation: instead of requiring
lcore_id as an input parameter, make the function return the index of
the next available entry in w[].
Then tqs_update() can take that index as an input parameter.

I had thought about a similar approach based on IDs. I was concerned
that the ID would be one more thing for the application to manage, but
I see the limitations of the current approach now. I will change it to
be allocation based.
This will support even non-EAL pthreads as well.

Yes, with such approach non-lcore threads will be able to use it also.
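A minimal, single-threaded sketch of that allocation-based approach (hypothetical names, not the actual DPDK API): registration returns the index of the next free entry in w[], and that index is what the update call later takes.

```c
#include <errno.h>
#include <stdint.h>

#define TQS_MAX_THREADS 64

struct tqs_var {
	uint64_t reg_mask;              /* bit i set => w[i] is in use */
	uint64_t w[TQS_MAX_THREADS];
};

/* Claim the next free slot in w[] and return its index, or -ENOSPC
 * when all entries are taken. Any thread (EAL or not) can use the
 * returned index; no lcore_id is required.
 * NOTE: not thread-safe as written; a real version would claim the
 * bit with an atomic compare-and-swap on reg_mask.
 */
int
tqs_alloc_thread_id(struct tqs_var *v)
{
	for (unsigned int i = 0; i < TQS_MAX_THREADS; i++) {
		if (!(v->reg_mask & ((uint64_t)1 << i))) {
			v->reg_mask |= (uint64_t)1 << i;
			return (int)i;
		}
	}
	return -ENOSPC;
}

void
tqs_free_thread_id(struct tqs_var *v, unsigned int id)
{
	v->reg_mask &= ~((uint64_t)1 << id);
}
```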

I realized that rte_tqs_register_lcore/rte_tqs_unregister_lcore need to be
efficient, as they can be called from the worker's packet processing loop
(rte_event_dequeue_burst allows blocking, so the worker thread needs to
call rte_tqs_unregister_lcore before calling rte_event_dequeue_burst and
rte_tqs_register_lcore before starting packet processing). Allocating the
thread ID in these functions would make them more complex.

I suggest that we change the argument 'lcore_id' to 'thread_id'. The
application could use 'lcore_id' as 'thread_id' if threads are mapped to
physical cores 1:1.
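The register/unregister pattern around a blocking dequeue could look like the following sketch (all names are hypothetical; the stubs stand in for rte_event_dequeue_burst and the TQS register/unregister calls):

```c
#include <stdbool.h>

static bool registered;

/* Stubs for illustration only. */
static void tqs_register_thread(unsigned int id)   { (void)id; registered = true; }
static void tqs_unregister_thread(unsigned int id) { (void)id; registered = false; }

/* Stub for a dequeue call that may block indefinitely. */
static int event_dequeue_burst_blocking(void) { return 1; }

int
worker_loop_iteration(unsigned int thread_id)
{
	/* Drop out of the reader list before a call that may block,
	 * so writers are not stalled waiting for this thread to
	 * report a quiescent state. */
	tqs_unregister_thread(thread_id);

	int nb = event_dequeue_burst_blocking();

	/* Rejoin before touching any shared, TQS-protected data. */
	tqs_register_thread(thread_id);

	/* ... process nb events ... */
	return nb;
}
```

Since this pair runs every loop iteration, both calls need to be cheap, which is why ID allocation should happen elsewhere.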

If the threads are not mapped 1:1 to physical cores, the threads need to
use a thread_id in the range 0 - RTE_TQS_MAX_THREADS. I do not see a
thread_id concept in DPDK. For TQS, the thread IDs are global (i.e. not
per TQS variable). I could provide APIs to do the thread-ID allocation,
but I think the thread-ID allocation should not be part of this library.
Such thread IDs might be useful for other libraries.

<snip>


Thread ID is problematic since glibc doesn't want to give it out;
you have to roll your own wrapper to do gettid().
So it is not as easy as just that. Plus, what about preemption?

Agree. I looked into this further. The rte_gettid function uses a system
call (on both BSD and Linux). I am also not clear on the range of the ID
returned; I do not think it is guaranteed to be within the narrow range
required here.

My suggestion would be to add a set of APIs that allow allocation of
thread IDs within a given range of 0 to <predefined MAX>.
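Such a standalone allocator could be sketched as a process-wide bitmap with atomic claim/release (hypothetical names, not a proposed DPDK API; GCC/Clang __atomic builtins assumed):

```c
#include <errno.h>
#include <stdint.h>

#define THREAD_ID_MAX 128       /* the <predefined MAX> of the proposal */

static uint64_t id_bitmap[THREAD_ID_MAX / 64];

/* Return the lowest free ID in [0, THREAD_ID_MAX), or -ENOSPC.
 * The fetch-or atomically claims the bit, so concurrent callers
 * cannot receive the same ID. */
int
thread_id_alloc(void)
{
	for (unsigned int w = 0; w < THREAD_ID_MAX / 64; w++) {
		for (unsigned int b = 0; b < 64; b++) {
			uint64_t bit = (uint64_t)1 << b;
			if (!(__atomic_fetch_or(&id_bitmap[w], bit,
						__ATOMIC_RELAXED) & bit))
				return (int)(w * 64 + b);
		}
	}
	return -ENOSPC;
}

void
thread_id_free(unsigned int id)
{
	__atomic_fetch_and(&id_bitmap[id / 64],
			   ~((uint64_t)1 << (id % 64)), __ATOMIC_RELAXED);
}
```

Because the pool is per-process, the IDs are dense and unique within the process, unlike system-provided thread IDs.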


System-provided thread IDs would probably also be non-unique in a multiprocess scenario?

--
Thanks,
Anatoly
