Micahel K. Edwards points out in PR/48126 that the sync is in the wrong place relative to the branch target of the compare, since the load could float up beyond the ldrex. PR target/48126 * config/arm/arm.c (arm_output_sync_loop): Move label before barrier
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 5161439..6e7105a 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -24214,8 +24214,11 @@ arm_output_sync_loop (emit_f emit, } } - arm_process_output_memory_barrier (emit, NULL); + /* Note: label is before barrier so that in cmp failure case we still get + a barrier to stop subsequent loads floating upwards past the ldrex + pr/48126. */ arm_output_asm_insn (emit, 1, operands, "%sLSYB%%=:", LOCAL_LABEL_PREFIX); + arm_process_output_memory_barrier (emit, NULL); } static rtx