Hi Ferruh,

> >> 22/10/2021 23:14, Bing Zhao:
> >>> In the function "eth_dev_fp_ops_reset", a structure assignment
> >>> operation is used to reset one queue's callback functions, etc., but
> >>> it is not thread safe.
> >>>
> >>> The structure assignment is not atomic, a lot of instructions will
> >>> be generated. Right now, since not all the fields are needed, the
> >>> fields in the "dummy_ops" which is not set explicitly will be 0s
> >>> based on the specification and compiler behavior. In order to make
> >>> "fpo" has the same content with "dummy_ops", some clearing to 0
> >>> operation is needed.
> >>>
> >>> By checking the object instructions (e.g. with GCC 4.8.5)
> >>>    0x0000000000a58317 <+35>: mov %rsi,%rdi
> >>>    0x0000000000a5831a <+38>: mov %rdx,%rcx
> >>> => 0x0000000000a5831d <+41>: rep stos %rax,%es:(%rdi)
> >>>    0x0000000000a58320 <+44>: mov -0x38(%rsp),%rax
> >>>    0x0000000000a58325 <+49>: lea -0xe0(%rip),%rdx
> >>>    // # 0xa5824c <dummy_eth_rx_burst>
> >>>
> >>> It shows that "rep stos" will clear the "fpo" structure before
> >>> assigning new values.
> >>>
> >>> In the other thread, if some data path Tx / Rx functions are still
> >>> running, there is a risk to get 0 instead of the correct dummy
> >>> content.
> >>> 1. qd = p->rxq.data[queue_id]
> >>> 2. (void **)&p->rxq.clbk[queue_id]
> >>> "data" and "clbk" may be observed with NULL (0) in other threads.
> >>> Even it is temporary, the accessing to a NULL pointer will cause a
> >>> crash. Using "memcpy" could get rid of this.
> >>>
> >>> Fixes: c87d435a4d79 ("ethdev: copy fast-path API into separate structure")
> >>> Cc: konstantin.anan...@intel.com
> >>>
> >>> Signed-off-by: Bing Zhao <bi...@nvidia.com>
> >>> ---
> >>> --- a/lib/ethdev/ethdev_private.c
> >>> +++ b/lib/ethdev/ethdev_private.c
> >>> @@ -206,7 +206,7 @@ eth_dev_fp_ops_reset(struct rte_eth_fp_ops *fpo)
> >>> .txq = {.data = dummy_data, .clbk = dummy_data,},
> >>> };
> >>>
> >>> - *fpo = dummy_ops;
> >>> + rte_memcpy(fpo, &dummy_ops, sizeof(struct rte_eth_fp_ops));
> >>
> >> That's not trivial.
> >> Please add a comment to briefly explain that memcpy avoids zeroing of a
> >> simple assignment.
> >>
> >
> > I think that patch is based on two totally wrong assumptions:
> > 1) ethdev data-path and control-path API is MT-safe.
> > With current design it is not.
> > When calling rx/tx_burst it is caller responsibility to make sure that
> > given port is already properly configured and started. Also it is user
> > responsibility to guarantee that none other thread doing dev_stop for
> > the same port simultaneously.
> > And vice versa when calling dev_stop(), it is user responsibility to
> > ensure that none other thread doing rx/tx_burst for given port
> > simultaneously.
> > If your app doesn't follow these principles, then it is a bug that
> > needs to be fixed.
> > 2) rte_memcpy() provides some sort of atomicity and it is safe to use it on
> > its own in MT environment. That's totally wrong.
> > In both cases compiler has total freedom to perform copy in any order
> > it likes (let say it can first read whole source data in some temporary
> > buffer (SIMD register), and then write it in one go, or it can do the
> > same trick with 'rep stos' as above).
> > Moreover CPU itself can reorder instructions.
> > So if you need this copy to be atomic you need to use some sort of
> > sync primitives along with it (mutex, rwlock, rcu, etc.).
> > But as I said above right now ethdev API is not MT-safe, so it is not
> > required.
> >
> > To summarise - there is no point to make these changes,
> > and patch comment is wrong and misleading.
> > Can we mark this patch as rejected now?
I believe so.

> Patch seems trying to cover a wrong application usage, and it should
> be addressed in the application level.
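
For illustration only, here is a minimal sketch of what "some sort of sync primitives along with it" could look like if a reset like this ever had to be safe against concurrent readers. It is not ethdev code and not a proposal: struct fp_ops, fp_ops_set(), fp_ops_reset(), fp_rx_burst() and echo_rx_burst() are hypothetical stand-ins, and a plain pthread mutex is assumed only because it is the simplest of the options listed above. With the lock held on both sides, it no longer matters whether the copy is a structure assignment or a memcpy, or how the compiler implements it.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_eth_fp_ops (not the DPDK layout). */
typedef uint16_t (*burst_fn)(void *queue, void **pkts, uint16_t n);

struct fp_ops {
	burst_fn rx_burst;
	void *rxq_data;
};

/* Dummy burst function: never receives anything. */
static uint16_t dummy_rx_burst(void *queue, void **pkts, uint16_t n)
{
	(void)queue; (void)pkts; (void)n;
	return 0;
}

/* Hypothetical "driver" burst function: pretends all n packets arrived. */
static uint16_t echo_rx_burst(void *queue, void **pkts, uint16_t n)
{
	(void)queue; (void)pkts;
	return n;
}

static int dummy_queue; /* non-NULL placeholder for per-queue data */

static struct fp_ops ops = {
	.rx_burst = dummy_rx_burst,
	.rxq_data = &dummy_queue,
};
static pthread_mutex_t ops_lock = PTHREAD_MUTEX_INITIALIZER;

/* Control path: replace the whole table while holding the lock. */
static void fp_ops_set(const struct fp_ops *new_ops)
{
	pthread_mutex_lock(&ops_lock);
	ops = *new_ops; /* plain assignment; readers are excluded by the lock */
	pthread_mutex_unlock(&ops_lock);
}

/* Control path: put the dummy content back. */
static void fp_ops_reset(void)
{
	const struct fp_ops dummy = {
		.rx_burst = dummy_rx_burst,
		.rxq_data = &dummy_queue,
	};

	fp_ops_set(&dummy);
}

/* Data path: takes the same lock, so it never sees a half-written table. */
static uint16_t fp_rx_burst(void **pkts, uint16_t n)
{
	uint16_t ret;

	pthread_mutex_lock(&ops_lock);
	ret = ops.rx_burst(ops.rxq_data, pkts, n);
	pthread_mutex_unlock(&ops_lock);
	return ret;
}
```

Taking a lock on every burst costs cycles on the data path. As said above, the ethdev API is not MT-safe today, so nothing like this is required there; the application has to make sure that dev_stop() and rx/tx_burst never run concurrently on the same port.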