On Tue, Jun 09, 2015 at 05:18:17PM +0530, Vineet Gupta wrote:
> When auditing cmpxchg call sites, Chuck noted that gcc was optimizing
> away some of the desired LDs.
> 
> |     do {
> |             new = old = *ipi_data_ptr;
> |             new |= 1U << msg;
> |     } while (cmpxchg(ipi_data_ptr, old, new) != old);
> 
> was being compiled to the code below:
> 
> | 8015cef8:   ld         r2,[r4,0]  <-- First LD
> | 8015cefc:   bset       r1,r2,r1
> |
> | 8015cf00:   llock      r3,[r4]  <-- atomic op
> | 8015cf04:   brne       r3,r2,8015cf10
> | 8015cf08:   scond      r1,[r4]
> | 8015cf0c:   bnz        8015cf00
> |
> | 8015cf10:   brne       r3,r2,8015cf00  <-- Branch doesn't go to orig LD
> 
> Although this was fixed by adding an ACCESS_ONCE in this call site, it
> seems safer (for now at least) to add a compiler barrier to the LLSC
> based cmpxchg.

This is in fact required. cmpxchg() should include a full memory barrier
_before_ and _after_ the op, and both imply a compiler barrier.
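
To illustrate, here is a rough userspace sketch of that shape, not the
actual ARC patch. my_cmpxchg() and set_ipi_bit() are hypothetical names,
and gcc's __atomic_compare_exchange_n() builtin stands in for the
llock/scond sequence. The SEQ_CST builtin already orders memory on its
own, so the explicit barrier() calls are redundant here and only mark
where the compiler barriers belong in a hand-written asm version.

```c
/* Compiler-only barrier: gcc may not keep memory values cached in
 * registers across this point, so later loads are re-issued. */
#define barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-in for cmpxchg(): returns the value found at *ptr.
 * The store happened only if that value equals 'old'. */
static unsigned int my_cmpxchg(unsigned int *ptr, unsigned int old,
                               unsigned int new_val)
{
        unsigned int seen = old;

        barrier();      /* barrier _before_ the atomic op */
        /* On failure the builtin writes the current value into 'seen'. */
        __atomic_compare_exchange_n(ptr, &seen, new_val, 0,
                                    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
        barrier();      /* barrier _after_ the atomic op */

        return seen;
}

/* Same loop as the quoted call site. With barriers inside my_cmpxchg(),
 * the plain load of *ipi_data_ptr is redone on every retry. */
static void set_ipi_bit(unsigned int *ipi_data_ptr, unsigned int msg)
{
        unsigned int old, new_val;

        do {
                new_val = old = *ipi_data_ptr;
                new_val |= 1U << msg;
        } while (my_cmpxchg(ipi_data_ptr, old, new_val) != old);
}
```

With the barriers in place the retry branch has to go back to the first
LD, because the compiler can no longer assume the value in r2 is still
current.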

Acked-by: Peter Zijlstra (Intel) <pet...@infradead.org>