> > Not sure I understand the problem you are referring to.
> > Are you saying that original rte_memcpy() code breaks strict aliasing?
> > If so, could you point where exactly?
>
> As far as I understand, yes, it does break strict aliasing. For
> example, in the following line:
>
> *(uint64_t *)dstu = *(const uint64_t *)srcu;
>
> IIUC, both casts break strict aliasing rules. While the src/dst
> parameters are void* and can therefore be cast to something else
> without breaking strict aliasing rules, the type of src/dst in the
> calling code might be something other than uint64_t*. This can result
> in src/dst pointers being cast to different unrelated types. AFAICT,
> the fact that rte_memcpy is "always inline" increases the risk of the
> compiler making an optimization that results in broken code.
>
> I was able to come up with an example where the latest version of GCC
> produces broken code when strict aliasing is enabled:
>
> https://godbolt.org/z/3Yzvjr97c
>
> With -fstrict-aliasing, it reorders a write and results in broken
> code. If you change the compiler flags to -fno-strict-aliasing, it
> produces the expected result.

Indeed, it looks like a problem.
Thanks for pointing it out.
I was able to reproduce it with gcc 11 (clang 13 seems fine).
Actually, adding '__attribute__ ((__may_alias__))' for both dst and src
didn't cure the problem.
To overcome it, I had to either:
- add the '-fno-strict-aliasing' CC flag (as you mentioned above), or
- add:
	if (__builtin_constant_p(n))
		return memcpy(dst, src, n);
  on top of the rte_memcpy() code.

Though I suppose the problem might be much wider than just rte_memcpy().
We do have similar inline copying code in other places too.
As I understand it, some of those cases might also be affected.
For example: '_rte_ring_(enqueue|dequeue)_elems_*'.
Not sure what would be the best approach in general for such cases:
- Always compile DPDK code with '-fno-strict-aliasing'.
  But that wouldn't prevent people from using our inline functions
  without that flag.
  Also, I wonder what performance impact it would have.
- Try to fix all such occurrences manually
  (but it would be hard to catch all of them upfront).
- Something else ...?
I wonder what people think about it?
Konstantin
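
P.S. To make the "fix manually" option a bit more concrete, below is a
minimal sketch of how a single 8-byte copy could be written without
dereferencing a type-punned pointer. The helper name copy8_noalias() is
hypothetical and is not part of DPDK; this is only an illustration under
the assumption that the compiler turns a fixed-size memcpy() into a plain
load/store pair, which would need to be checked for each target.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not an existing DPDK function): copy 8 bytes
 * from src to dst without casting either pointer to uint64_t *.
 * memcpy() accesses the buffers as bytes, so the copy is valid
 * regardless of the effective type of the caller's objects.
 */
static inline void
copy8_noalias(void *dst, const void *src)
{
	uint64_t tmp;

	/* Read 8 bytes from src into a local of the right type. */
	memcpy(&tmp, src, sizeof(tmp));
	/* Write the same 8 bytes out to dst. */
	memcpy(dst, &tmp, sizeof(tmp));
}
```

Whether this performs as well as the current cast-based code on all
supported compilers is something I have not measured.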