> > On 11-Dec-18 6:40 AM, Honnappa Nagarahalli wrote: > >> > >>>>> > >>>>>>> + > >>>>>>> +/* Add a reader thread, running on an lcore, to the list of > >>>>>>> +threads > >>>>>>> + * reporting their quiescent state on a TQS variable. > >>>>>>> + */ > >>>>>>> +int __rte_experimental > >>>>>>> +rte_tqs_register_lcore(struct rte_tqs *v, unsigned int lcore_id) { > >>>>>>> + TQS_RETURN_IF_TRUE((v == NULL || lcore_id >= > >>>>>> RTE_TQS_MAX_LCORE), > >>>>>>> + -EINVAL); > >>>>>> > >>>>>> It is not very good practice to make function return different > >>>>>> values and behave in a different way in debug/non-debug mode. > >>>>>> I'd say that for slow-path (functions in .c) it is always good to > >>>>>> check input parameters. > >>>>>> For fast-path (functions in .h) we sometimes skip such checking, > >>>>>> but debug mode can probably use RTE_ASSERT() or so. > >>>>> Makes sense, I will change this in the next version. > >>>>> > >>>>>> > >>>>>> > >>>>>> lcore_id >= RTE_TQS_MAX_LCORE > >>>>>> > >>>>>> Is this limitation really necessary? > >>>>> I added this limitation because currently DPDK application cannot > >>>>> take a mask more than 64bit wide. Otherwise, this should be as big > >>>>> as > >>>> RTE_MAX_LCORE. > >>>>> I see that in the case of '-lcores' option, the number of lcores > >>>>> can be more than the number of PEs. In this case, we still need a > >>>>> MAX limit (but > >>>> can be bigger than 64). > >>>>> > >>>>>> First it means that only lcores can use that API (at least > >>>>>> data-path part), second even today many machines have more than > >>>>>> 64 > >> cores. > >>>>>> I think you can easily avoid such limitation, if instead of > >>>>>> requiring lcore_id as input parameter, you'll just make it return > >>>>>> index of > >>>> next available entry in w[]. > >>>>>> Then tqs_update() can take that index as input parameter. > >>>>> I had thought about a similar approach based on IDs. I was > >>>>> concerned that ID will be one more thing to manage for the > >>>>> application. But, I see the > >>>> limitations of the current approach now. I will change it to > >>>> allocation > >> based. > >>>> This will support even non-EAL pthreads as well. > >>>> > >>>> Yes, with such approach non-lcore threads will be able to use it also. > >>>> > >>> I realized that rte_tqs_register_lcore/ rte_tqs_unregister_lcore > >>> need to be > >> efficient as they can be called from the worker's packet processing > >> loop (rte_event_dequeue_burst allows blocking. So, the worker thread > >> needs to call rte_tqs_unregister_lcore before calling > >> rte_event_dequeue_burst and rte_tqs_register_lcore before starting > >> packet processing). Allocating the thread ID in these functions will make > them more complex. > >>> > >>> I suggest that we change the argument 'lcore_id' to 'thread_id'. The > >> application could use 'lcore_id' as 'thread_id' if threads are mapped > >> to physical cores 1:1. > >>> > >>> If the threads are not mapped 1:1 to physical cores, the threads > >>> need to use > >> a thread_id in the range of 0 - RTE_TQS_MAX_THREADS. I do not see > >> that DPDK has a thread_id concept. For TQS, the thread IDs are global > >> (i.e. not per TQS variable). I could provide APIs to do the thread ID > >> allocation, but I think the thread ID allocation should not be part > >> of this library. Such thread ID might be useful for other libraries. > >>> > >>> <snip > >> > >> > >> Thread id is problematic since Glibc doesn't want to give it out. > >> You have to roll your own function to do gettid(). > >> It is not as easy as just that. Plus what about preemption? > > > > Agree. I looked into this further. The rte_gettid function uses a system > > call > (BSD and Linux). I am not clear on the space of the ID returned (as well). I > do > not think it is guaranteed that it will be with in a narrow range that is > required > here. > > > > My suggestion would be to add a set of APIs that would allow for > > allocation of thread IDs which are within a given range of 0 to > > <predefined MAX> > > > > System-provided thread-ID's would probably also be potentially non-unique in > multiprocess scenario? For linux, rte_gettid is implemented as: int rte_sys_gettid(void) { return (int)syscall(SYS_gettid); }
Came across [1] which states, thread-IDs are unique across the system. For BSD, thr_self is used. [2] says it provides system wide unique thread IDs. [1] https://stackoverflow.com/questions/6372102/what-is-the-difference-between-pthread-self-and-gettid-which-one-should-i-u [2] https://nxmnpg.lemoda.net/2/thr_self > > -- > Thanks, > Anatoly