On 07/10/2018, 06:03, "Jerin Jacob" <jerin.ja...@caviumnetworks.com> wrote:

    In arm64 case, it will have ATOMIC_RELAXED followed by asm volatile
    ("":::"memory") of rte_pause().
    I wouldn't have any issue if the generated code is the same or better
    than the existing case. But that is not the case, right?

The existing case is actually not interesting (IMO) as it relies on undefined
behaviour, which allows the compiler to do anything. But you seem to be
satisfied with "works for me, right here right now". I think the cost of
avoiding undefined behaviour is acceptable (actually I don't think it will even
be noticeable).

Skipping the compiler memory barrier in rte_pause() potentially allows for
optimisations that provide much more benefit, e.g. hiding some cache miss
latency for later loads. The DPDK ring buffer implementation is designed to
enable inlining of the enqueue/dequeue functions into the caller, so any code
could immediately follow these calls.
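
To make the two variants concrete, here is a minimal sketch, not the actual
rte_ring code. The names demo_ring, demo_wait_with_barrier, demo_wait_relaxed
and demo_wait_then_read are hypothetical, and the payload field stands in for
caller-private data that is not published through the tail index. It assumes
GCC or Clang for the inline asm syntax.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a ring tail index plus some unrelated
 * caller-private data that is read right after the wait. */
struct demo_ring {
	_Atomic uint32_t tail;
	uint32_t payload;
};

/* Variant 1: relaxed atomic load plus a compiler-only barrier in the
 * loop body. The "memory" clobber tells the compiler that all memory
 * may have changed, so it must not move other memory accesses across it. */
static inline void demo_wait_with_barrier(struct demo_ring *r, uint32_t expected)
{
	while (atomic_load_explicit(&r->tail, memory_order_relaxed) != expected)
		__asm__ __volatile__("" ::: "memory");
}

/* Variant 2: relaxed atomic load only. Still free of data races on
 * tail, but the compiler keeps its freedom to schedule unrelated
 * loads from the inlined caller earlier. */
static inline void demo_wait_relaxed(struct demo_ring *r, uint32_t expected)
{
	while (atomic_load_explicit(&r->tail, memory_order_relaxed) != expected)
		;
}

/* Models caller code that immediately follows the inlined wait. */
static uint32_t demo_wait_then_read(struct demo_ring *r, uint32_t expected,
				    int use_barrier)
{
	assert(r);
	if (use_barrier)
		demo_wait_with_barrier(r, expected);
	else
		demo_wait_relaxed(r, expected);
	return r->payload; /* unrelated load that follows the wait */
}
```

Both variants return the same result. The difference is only in how much
freedom the compiler has when scheduling the load of payload relative to the
loop, which is where a later cache miss could potentially be started earlier.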

From ISO/IEC 9899:201x (Programming languages - C):

5.1.2.4
4 Two expression evaluations conflict if one of them modifies a memory location 
and the other one reads or modifies the same memory location.

25 The execution of a program contains a data race if it contains two 
conflicting actions in different threads, at least one of which is not atomic, 
and neither happens before the other. Any such data race results in undefined 
behavior.
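
Applied to a spin-wait, the two paragraphs above distinguish the following
cases. This is an illustrative sketch with hypothetical names (demo_wait_plain,
demo_wait_atomic, demo_set_atomic, demo_roundtrip), not code taken from DPDK.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Plain read in a loop. If another thread concurrently writes *flag
 * with a plain store, the two accesses conflict, neither is atomic and
 * neither happens before the other: a data race, hence undefined
 * behaviour. The compiler may, for example, read *flag only once. */
static inline void demo_wait_plain(const uint32_t *flag, uint32_t expected)
{
	while (*flag != expected)
		;
}

/* Same loop with an atomic load. Together with an atomic store on the
 * writer side, both conflicting actions are atomic, so there is no
 * data race even with relaxed ordering. */
static inline void demo_wait_atomic(_Atomic uint32_t *flag, uint32_t expected)
{
	while (atomic_load_explicit(flag, memory_order_relaxed) != expected)
		;
}

/* Writer side: atomic store, here also relaxed. */
static inline void demo_set_atomic(_Atomic uint32_t *flag, uint32_t value)
{
	atomic_store_explicit(flag, value, memory_order_relaxed);
}

/* Single-threaded round trip, only to show that the calls fit together. */
static uint32_t demo_roundtrip(uint32_t value)
{
	_Atomic uint32_t flag;

	atomic_init(&flag, 0);
	demo_set_atomic(&flag, value);
	demo_wait_atomic(&flag, value);
	assert(atomic_load_explicit(&flag, memory_order_relaxed) == value);
	return atomic_load_explicit(&flag, memory_order_relaxed);
}
```

Relaxed ordering is enough to remove the data race on the flag itself. Any
ordering needed for other data published via the flag is a separate question.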

-- Ola

    
