Alex Bennée writes:

> Pranith Kumar <bobby.pr...@gmail.com> writes:
>
>> This patch applies on top of the fence generation patch series.
>>
>> This commit optimizes fence instructions. Two optimizations are
>> currently implemented. These are:
>>
>> 1. Unnecessary duplicate fence instructions
>>
>>    If the same fence instruction is detected consecutively, we remove
>>    one instance of it.
>>
>>    ex: mb; mb => mb, strl; strl => strl
>>
>> 2. Merging weaker fence with subsequent/previous stronger fence
>>
>>    load-acquire/store-release fence can be combined with a full fence
>>    without relaxing the ordering constraint.
>>
>>    ex: a) ld; ldaq; mb => ld; mb
>>        b) mb; strl; st => mb; st
>
> What test cases do you have for this?
>
> Currently the litmus tests don't fire (as they have exactly what they
> need). Most programs don't seem to trigger multiple barriers.
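
Indeed, these cases are not commonly seen in the wild. To test the
optimizations, I wrote small test programs that use the C11-style
__atomic builtins to generate the appropriate instructions, then
verified the result by running 'qemu-aarch64 -d in_asm,out_asm' on them.

    /* Assumption: barrier() is a full (seq_cst) fence. */
    #define barrier() __atomic_thread_fence(__ATOMIC_SEQ_CST)

    _Atomic int val;

    int main(void)
    {
        /* load-acquire followed by store-release */
        val = __atomic_load_n(&val, __ATOMIC_ACQUIRE);
        __atomic_store_n(&val, val, __ATOMIC_RELEASE);

        /* two consecutive full fences (mb; mb) */
        barrier();
        barrier();
    }

--
Pranith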
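
For readers following the thread, the two rewrites described in the quoted commit message can be modelled on a toy instruction stream. This is only a minimal sketch under stated assumptions, not the code from the patch: the `enum op` values and `optimize_fences()` are hypothetical names invented for illustration and are not TCG opcodes or QEMU functions. It considers only directly adjacent fences, as in the quoted examples, and treats "mb" as stronger than both "ldaq" and "strl".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical op kinds for a toy instruction stream (not TCG opcodes). */
enum op {
    OP_LD,    /* plain load */
    OP_ST,    /* plain store */
    OP_LDAQ,  /* load-acquire fence ("ldaq" in the commit message) */
    OP_STRL,  /* store-release fence ("strl") */
    OP_MB     /* full fence ("mb") */
};

static int is_fence(enum op o)
{
    return o == OP_LDAQ || o == OP_STRL || o == OP_MB;
}

/*
 * Rewrites ops[0..n) in place and returns the new length.
 * Hypothetical helper: only directly adjacent fences are merged.
 */
size_t optimize_fences(enum op *ops, size_t n)
{
    size_t out = 0;  /* number of ops kept so far */

    assert(ops != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        enum op cur = ops[i];

        if (is_fence(cur) && out > 0) {
            enum op prev = ops[out - 1];

            /* 1. duplicate fence, or 2. weaker fence right after a full fence */
            if (prev == cur || prev == OP_MB) {
                continue;
            }
            /* 2. full fence right after weaker fences: drop the weaker ones */
            if (cur == OP_MB) {
                while (out > 0 &&
                       (ops[out - 1] == OP_LDAQ || ops[out - 1] == OP_STRL)) {
                    out--;
                }
            }
        }
        ops[out++] = cur;
    }
    return out;
}
```

Feeding it the sequences from the commit message (for example `ld; ldaq; mb` or `mb; strl; st`) should reproduce the rewritten forms quoted there. An `ldaq; strl` pair is left untouched because neither fence is stronger than the other.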