On Tue, 2017-07-25 at 19:17 +0530, Santosh Sivaraj wrote:
> I get the point. I looked at the generated assembly a bit closer, the update
> count is optimized out. Will send the alternative asm only patch.

We could do it in C the way x86 does it, using some helpers for begin/end.
The ordering would then come from either an lwsync (though I think that would
be slower than the data dependency) or from carefully crafting the helpers to
create such a dependency (by making them return the pointer).

If you go down that path, though, you need to make sure we do not generate
any TOC references, as the vDSO doesn't have a TOC.

Cheers,
Ben.
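
For illustration only, here is a rough, portable sketch of the begin/end helper shape mentioned above. It is not the kernel code: the `demo_*` names and the struct layout are hypothetical, and C11 fences stand in for the lwsync variant. It does not show the data-dependency trick (which would need the helpers to return the data pointer, most likely via inline asm), and it says nothing about avoiding TOC references.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the vDSO data page; field names are illustrative. */
struct demo_vdso_data {
    atomic_uint seq;        /* update count: odd while a write is in progress */
    _Atomic uint64_t sec;
    _Atomic uint64_t nsec;
};

/* "begin" helper: wait for an even count, then order later loads after it. */
static inline unsigned demo_read_begin(struct demo_vdso_data *d)
{
    unsigned seq;

    while ((seq = atomic_load_explicit(&d->seq, memory_order_acquire)) & 1)
        ;   /* writer active, spin */
    return seq;
}

/* "end" helper: returns nonzero if the count changed and the read must retry. */
static inline int demo_read_retry(struct demo_vdso_data *d, unsigned start)
{
    /* Fence plays the role the lwsync would play in the barrier variant. */
    atomic_thread_fence(memory_order_acquire);
    return atomic_load_explicit(&d->seq, memory_order_relaxed) != start;
}

/* Single-writer update: bump to odd, write data, bump to even. */
static void demo_write(struct demo_vdso_data *d, uint64_t sec, uint64_t nsec)
{
    unsigned seq = atomic_load_explicit(&d->seq, memory_order_relaxed);

    assert((seq & 1) == 0);     /* assumes exactly one writer */
    atomic_store_explicit(&d->seq, seq + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&d->sec, sec, memory_order_relaxed);
    atomic_store_explicit(&d->nsec, nsec, memory_order_relaxed);
    atomic_store_explicit(&d->seq, seq + 2, memory_order_release);
}

/* Reader loop built from the two helpers. */
static void demo_read(struct demo_vdso_data *d, uint64_t *sec, uint64_t *nsec)
{
    unsigned seq;

    do {
        seq = demo_read_begin(d);
        *sec = atomic_load_explicit(&d->sec, memory_order_relaxed);
        *nsec = atomic_load_explicit(&d->nsec, memory_order_relaxed);
    } while (demo_read_retry(d, seq));
}
```

Because the count is read through atomic accesses inside the helpers, the compiler cannot drop it the way it did with the plain C version discussed earlier in the thread.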