On 07/10/2018, 06:03, "Jerin Jacob" <[email protected]> wrote:
In the arm64 case, it will have ATOMIC_RELAXED followed by asm volatile
("":::"memory") of rte_pause().
I wouldn't have any issue if the generated code is the same as or better
than the existing case. But that is not the case, right?
The existing case is actually not interesting (IMO) as it exposes undefined
behaviour which allows the compiler to do anything. But you seem to be
satisfied with "works for me, right here right now". I think the cost of
avoiding undefined behaviour is acceptable (actually I don't think it will even
be noticeable).
Skipping the compiler memory barrier in rte_pause() potentially allows for
optimisations that provide much more benefit, e.g. hiding some cache miss
latency for later loads. The DPDK ring buffer implementation is defined so as to
enable inlining of the enqueue/dequeue functions into the caller, so any code
could immediately follow these calls.
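
To make the point concrete, here is a minimal sketch of a tail-update wait loop that uses a relaxed atomic load and a pause without the compiler memory barrier. This is not the actual DPDK code: cpu_relax(), wait_for_tail() and update_tail() are hypothetical names, and the __atomic builtins assume GCC or Clang.

```c
#include <stdint.h>

/* Hypothetical stand-in for rte_pause(), deliberately WITHOUT the
 * "memory" clobber, so it does not act as a compiler barrier. */
static inline void cpu_relax(void)
{
#if defined(__aarch64__)
    __asm__ volatile("yield");
#elif defined(__x86_64__) || defined(__i386__)
    __asm__ volatile("pause");
#endif
}

/* Spin until *tail equals expected. The relaxed atomic load forces a
 * fresh read on every iteration and avoids a data race with the thread
 * writing *tail, without constraining unrelated memory accesses. */
static inline void wait_for_tail(const uint32_t *tail, uint32_t expected)
{
    while (__atomic_load_n(tail, __ATOMIC_RELAXED) != expected)
        cpu_relax();
}

/* Wait for our turn, then publish the new tail with release ordering
 * so that earlier stores to the ring slots become visible first. */
static inline void update_tail(uint32_t *tail, uint32_t old_val,
                               uint32_t new_val)
{
    wait_for_tail(tail, old_val);
    __atomic_store_n(tail, new_val, __ATOMIC_RELEASE);
}
```

Because nothing in the loop clobbers all of memory, the compiler remains free to schedule independent loads from the inlined caller around it, which is the kind of optimisation I have in mind.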
From INTERNATIONAL STANDARD ISO/IEC 9899:201x
Programming languages — C
5.1.2.4
4 Two expression evaluations conflict if one of them modifies a memory location
and the other one reads or modifies the same memory location.
25 The execution of a program contains a data race if it contains two
conflicting actions in different threads, at least one of which is not atomic,
and neither happens before the other. Any such data race results in undefined
behavior.
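
As an illustration of those two paragraphs, a small C11 sketch (the names payload, ready, publish() and try_consume() are made up for this example). The flag is accessed only through atomic operations, and the release/acquire pair makes the write of payload happen before its read, so neither access is a data race. If ready were a plain bool read in a loop by another thread, the accesses would conflict without being atomic or ordered, and paragraph 25 would apply.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static uint32_t payload;   /* ordinary, non-atomic data */
static atomic_bool ready;  /* atomic flag, zero-initialised to false */

/* Writer side: store the data, then set the flag with release ordering. */
static void publish(uint32_t value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Reader side: the acquire load synchronises with the release store,
 * so reading payload afterwards is ordered after the write to it. */
static bool try_consume(uint32_t *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```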
-- Ola