On May 7, 2011, at 6:05 AM, erik quanstrom <quans...@quanstro.net> wrote:

i'm confused by the recent change to the thread library.
the old code simply did a locked incl.  the new code
does a locked compare-and-exchange /within a loop/ until it
sees that nobody else has updated the value at the same time,
thus ensuring that the value has indeed been updated.

since the expensive operation is the MESI(F) negotiation
behind the scenes to get exclusive access to the cacheline,
i don't understand the motivation for replacing _xinc
with ainc, since ainc can loop on an expensive lock instruction.

that is, i think the old version was wait free, and the new version
is not.

can someone explain what i'm missing here?

thanks!

- erik

----

TEXT    _xinc(SB),$0    /* void _xinc(long *); */

   MOVL    l+0(FP),AX
   LOCK
   INCL    0(AX)
   RET

----

TEXT ainc(SB), $0    /* long ainc(long *); */
   MOVL    addr+0(FP), BX
ainclp:
   MOVL    (BX), AX
   MOVL    AX, CX
   INCL    CX
   LOCK
   BYTE    $0x0F; BYTE $0xB1; BYTE $0x0B    /* CMPXCHGL CX, (BX) */
   JNZ    ainclp
   MOVL    CX, AX
   RET


Just guessing. Maybe the new code allows more concurrency? If the value is not in the processor's cache, will the old code block other processors for much longer? The new code pulls the line into the cache with the first read, so there may be a high likelihood that the cmpxchg finishes faster. I haven't studied x86 cache behavior, so this guess could be completely wrong. I suggest asking on comp.arch, where people like Andy Glew can give you a definitive answer.
