> > > > > > +/* Add a reader thread, running on an lcore, to the list of threads
> > > > > > + * reporting their quiescent state on a TQS variable.
> > > > > > + */
> > > > > > +int __rte_experimental
> > > > > > +rte_tqs_register_lcore(struct rte_tqs *v, unsigned int lcore_id) {
> > > > > > +	TQS_RETURN_IF_TRUE((v == NULL || lcore_id >= RTE_TQS_MAX_LCORE),
> > > > > > +		-EINVAL);
> > > > >
> > > > > It is not very good practice to make function return different
> > > > > values and behave in a different way in debug/non-debug mode.
> > > > > I'd say that for slow-path (functions in .c) it is always good
> > > > > to check input parameters.
> > > > > For fast-path (functions in .h) we sometimes skip such checking,
> > > > > but debug mode can probably use RTE_ASSERT() or so.
> > > > Makes sense, I will change this in the next version.
> > > > >
> > > > > lcore_id >= RTE_TQS_MAX_LCORE
> > > > >
> > > > > Is this limitation really necessary?
> > > > I added this limitation because currently DPDK application cannot
> > > > take a mask more than 64bit wide. Otherwise, this should be as big
> > > > as RTE_MAX_LCORE.
> > > > I see that in the case of '-lcores' option, the number of lcores
> > > > can be more than the number of PEs. In this case, we still need a
> > > > MAX limit (but can be bigger than 64).
> > > >
> > > > > First it means that only lcores can use that API (at least
> > > > > data-path part), second even today many machines have more than
> > > > > 64 cores.
> > > > > I think you can easily avoid such limitation, if instead of
> > > > > requiring lcore_id as input parameter, you'll just make it
> > > > > return index of next available entry in w[].
> > > > > Then tqs_update() can take that index as input parameter.
> > > > I had thought about a similar approach based on IDs. I was
> > > > concerned that ID will be one more thing to manage for the
> > > > application. But, I see the limitations of the current approach
> > > > now. I will change it to allocation based.
> > > > This will support even non-EAL pthreads as well.
> > >
> > > Yes, with such approach non-lcore threads will be able to use it also.
> > >
> > I realized that rte_tqs_register_lcore/rte_tqs_unregister_lcore need to
> > be efficient as they can be called from the worker's packet processing
> > loop (rte_event_dequeue_burst allows blocking. So, the worker thread
> > needs to call rte_tqs_unregister_lcore before calling
> > rte_event_dequeue_burst and rte_tqs_register_lcore before starting
> > packet processing). Allocating the thread ID in these functions will
> > make them more complex.
> >
> > I suggest that we change the argument 'lcore_id' to 'thread_id'. The
> > application could use 'lcore_id' as 'thread_id' if threads are mapped
> > to physical cores 1:1.
> >
> > If the threads are not mapped 1:1 to physical cores, the threads need
> > to use a thread_id in the range of 0 - RTE_TQS_MAX_THREADS. I do not
> > see that DPDK has a thread_id concept. For TQS, the thread IDs are
> > global (i.e. not per TQS variable). I could provide APIs to do the
> > thread ID allocation, but I think the thread ID allocation should not
> > be part of this library. Such thread ID might be useful for other
> > libraries.
> >
> > <snip>
>
> Thread id is problematic since Glibc doesn't want to give it out.
> You have to roll your own function to do gettid().
> It is not as easy as just that. Plus what about preemption?
Agree. I looked into this further. The rte_gettid function uses a system call (on both BSD and Linux), and I am also not clear on the range of the ID it returns; I do not think it is guaranteed to fall within the narrow range required here. My suggestion would be to add a set of APIs that allow allocating thread IDs within a given range of 0 to <predefined MAX>.
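
To make the proposal concrete, below is a minimal sketch of what such an allocator could look like. All the names here (rte_thread_id_alloc, rte_thread_id_free, RTE_THREAD_ID_MAX) are hypothetical, not existing DPDK APIs. It assumes a simple global claim bitmap and the GCC/clang __atomic builtins:

/* Hypothetical upper bound on thread IDs; kept a multiple of 64 so
 * the bitmap words divide evenly.
 */
#include <stdint.h>
#include <limits.h>

#define RTE_THREAD_ID_MAX 128
#define BITS_PER_WORD (sizeof(uint64_t) * CHAR_BIT)
#define NUM_WORDS (RTE_THREAD_ID_MAX / BITS_PER_WORD)

/* One bit per thread ID; a set bit means the ID is in use. */
static uint64_t thread_id_bitmap[NUM_WORDS];

/* Allocate the lowest free thread ID; returns -1 when exhausted. */
int
rte_thread_id_alloc(void)
{
	unsigned int w;
	uint64_t old;
	int bit;

	for (w = 0; w < NUM_WORDS; w++) {
		old = __atomic_load_n(&thread_id_bitmap[w], __ATOMIC_RELAXED);
		while (old != UINT64_MAX) {
			bit = __builtin_ctzll(~old); /* lowest clear bit */
			/* Claim the bit; on CAS failure 'old' is reloaded
			 * and we retry within this word.
			 */
			if (__atomic_compare_exchange_n(&thread_id_bitmap[w],
					&old, old | (UINT64_C(1) << bit), 0,
					__ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
				return (int)(w * BITS_PER_WORD) + bit;
		}
	}
	return -1; /* all RTE_THREAD_ID_MAX IDs are in use */
}

/* Return a previously allocated thread ID to the pool. */
void
rte_thread_id_free(unsigned int id)
{
	if (id >= RTE_THREAD_ID_MAX)
		return;
	__atomic_fetch_and(&thread_id_bitmap[id / BITS_PER_WORD],
		~(UINT64_C(1) << (id % BITS_PER_WORD)), __ATOMIC_RELEASE);
}

A worker would then call rte_thread_id_alloc() once when it starts, pass the resulting thread_id to the (renamed) register/unregister calls around the blocking rte_event_dequeue_burst as described above, and free the ID only when the thread exits, so nothing is allocated on the data path. This also keeps the allocation outside the TQS library itself, so other libraries could reuse the same IDs.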