Hi,

On Wed, May 01, 2019 at 11:57:14AM +0530, Mihir Luthra wrote:
> For making my implementation of shared memory data structure more
> space efficient, I was trying to implement a stack which stores
> offsets to unused locations in the shared memory file. But as stack is
> being shared it also needs to be edited in a lock free way. While
> editing stack I need to atomically CAS both top of stack and the
> element on it. For this I found double compare and swap. Also, in my
> data structure at one point I need DCAS as well to ensure correct
> editing.
> To implement DCAS I came across some instruction “cmpxchg16”. But I
> think its still not as per my need. [1]
>
> Do you know any alternative with which I can DCAS atomically or
> anything which atomically checks 2 old values before replacing value
> at an address?
>
> [1]
> https://stackoverflow.com/questions/7646018/sse-instructions-which-cpus-can-do-atomic-16b-memory-operations

From what I understand from the Stack Overflow post, you're right that
cmpxchg16b will not give a consistent view of the 16 bytes of memory
across multiple NUMA nodes. However, maybe two 4-byte values right next
to each other would be sufficient for your use case? They could then be
cast to a single 8-byte value and swapped with one ordinary CAS.

-- 
Clemens
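
P.S. Here is a rough sketch of what I mean by packing two 4-byte values into one 8-byte word. It assumes C11 <stdatomic.h> is available, that both values fit in 32 bits (offsets into the shared memory file rather than pointers), and that they sit next to each other in memory. The names pack, unpack_top, unpack_val and cas_pair are made up for illustration. Note this is a single CAS on one packed word, not a true DCAS on two unrelated addresses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the two fields together fill exactly one 64-bit word. */
static_assert(sizeof(uint64_t) == 2 * sizeof(uint32_t),
              "two 32-bit fields must fill one 64-bit word");

/* Hypothetical layout: "top" in the high half, "val" in the low half. */
static inline uint64_t pack(uint32_t top, uint32_t val)
{
    return ((uint64_t)top << 32) | val;
}

static inline uint32_t unpack_top(uint64_t word) { return (uint32_t)(word >> 32); }
static inline uint32_t unpack_val(uint64_t word) { return (uint32_t)word; }

/*
 * Replace (old_top, old_val) with (new_top, new_val) only if BOTH old
 * values still match. Returns true when the swap happened.
 */
static bool cas_pair(_Atomic uint64_t *word,
                     uint32_t old_top, uint32_t old_val,
                     uint32_t new_top, uint32_t new_val)
{
    uint64_t expected = pack(old_top, old_val);
    return atomic_compare_exchange_strong(word, &expected,
                                          pack(new_top, new_val));
}
```

For use across processes the 64-bit atomic has to be lock-free on the target, which atomic_is_lock_free() can tell you at run time. If the second field is used as a version counter rather than the element itself, the same trick also helps against ABA on the stack top.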