+Correct email for Thomas.
On Mon, Sep 04, 2017 at 02:02:05PM +0100, Bruce Richardson wrote:
> On Fri, Feb 10, 2017 at 11:53:06AM +0100, Thomas Monjalon wrote:
> > 2017-02-10 10:39, Hunt, David:
> > >
> > > On 9/2/2017 4:53 PM, Thomas Monjalon wrote:
> > > > 2016-11-06 22:09, Thomas Monjalon:
> > > >> 2016-09-29 18:34, Thomas Monjalon:
> > > >>> 2016-09-30 02:54, Nikhil Rao:
> > > >>>> The original code used movl instead of xchgl, this caused
> > > >>>> rte_atomic64_cmpset to use ebx as the lower dword of the source
> > > >>>> to cmpxchg8b instead of the lower dword of function argument "src".
> > > >>> Could you please start the explanation with a statement of
> > > >>> what is wrong from an user point of view?
> > > >>> It could help to understand how severe it is.
> > > >> Please, we need a clear explanation of the bug, and an acknowledgement.
> > > > Should we close this bug?
> > >
> > > I took a few minutes to look at this, and the issue can easily be
> > > reproduced with a small snippet of code.
> > > With the 'mov', the lower dword of the result is incorrect. This is
> > > resolved by using 'xchgl'.
> > >
> > > void main()
> > > {
> > >     uint64_t a = 0xff000000ff;
> > >
> > >     rte_atomic64_cmpset( &a, 0xff000000ff, 0xfa000000fa);
> > >     printf("0x%lx\n", a);
> > > }
> > >
> > > When using 'mov', the result is 0xfa00000000
> > > When using 'xchgl', the result is 0xfa000000fa, as expected.
> >
> > This operation is used a lot in drivers for link status.
> >
> > I think we need to clearly explain what was the consequence of this bug.
>
> Resurrecting this old thread, with my analysis.
>
> The issue is indeed as described above, the low dword of the result of
> the 64-bit cmpset is incorrect, if the exchange takes place. This is due
> to the incorrect source value not being placed in the ebx register.
>
> What is meant to happen is that, if the old value (from EDX:EAX) matches
> the value in the memory location, that memory location is written to by
> the new value from ECX:EBX. However, for PIC code, we can't use EBX
> register so the parameter is placed in EDI register instead. The first
> line is meant to be moving the EDI value to EBX, but instead is doing
> the opposite, of moving the current EBX value to EDI. This leads to the
> incorrect result.
>
> An alternative fix would be the following code:
>
>     asm volatile (
>             "push %%ebx;"
>             "mov %%edi, %%ebx;"
>             MPLOCKED "cmpxchg8b (%[dst]);"
>             "setz %[res];"
>             "mov %%ebx, %%edi;"
>             "pop %%ebx;"
>             : [res] "=a" (res)      /* result in eax */
>             : [dst] "S" (dst),      /* esi */
>               "D" (_src.l32),       /* edi, copied to ebx */
>               "c" (_src.h32),       /* ecx */
>               "a" (_exp.l32),       /* eax */
>               "d" (_exp.h32)        /* edx */
>             : "memory" );           /* no-clobber list */
>
> However, the xchg to swap the registers at the start and swap them back
> at the end is shorter.
>
> Couple of other comments on this code area that should be taken into
> account:
> 1. the indentation of the asm code looks wrong, and should probably be
>    fixed to make it more readable.
> 2. the comment on the "D" register is wrong as it refers to ebx
> 3. the fact that we can't use ebx, and instead use edi and swap twice
>    should be commented.
>
> For the fix itself:
>
> Acked-by: Bruce Richardson <bruce.richard...@intel.com>
>
> Regards,
> /Bruce
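
For anyone who wants to follow the register flow described above without a 32-bit PIC build, here is a rough software model in plain C. It is only a sketch, not the DPDK implementation: model_cmpxchg8b, model_cmpset, stale_ebx and use_xchg are hypothetical names invented for illustration, the model is not atomic, and stale_ebx stands in for whatever EBX happened to hold on entry (assumed to be 0 in the reproduction quoted above).

```c
#include <stdint.h>

/*
 * Software model of cmpxchg8b (illustration only, not atomic):
 * if *dst equals EDX:EAX, store ECX:EBX into *dst and return 1,
 * otherwise leave *dst alone and return 0.
 */
static int
model_cmpxchg8b(uint64_t *dst, uint32_t eax, uint32_t edx,
		uint32_t ebx, uint32_t ecx)
{
	uint64_t expected = ((uint64_t)edx << 32) | eax;

	if (*dst != expected)
		return 0;
	*dst = ((uint64_t)ecx << 32) | ebx;
	return 1;
}

/*
 * Hypothetical model of the 32-bit PIC path of the 64-bit cmpset.
 * stale_ebx: assumed content of EBX on entry (e.g. the GOT pointer).
 * use_xchg:  non-zero models "xchgl %ebx, %edi" (the fix),
 *            zero models "movl %ebx, %edi" (the bug).
 */
static int
model_cmpset(uint64_t *dst, uint64_t exp, uint64_t src,
	     uint32_t stale_ebx, int use_xchg)
{
	uint32_t edi = (uint32_t)src;		/* "D" (_src.l32) */
	uint32_t ecx = (uint32_t)(src >> 32);	/* "c" (_src.h32) */
	uint32_t ebx = stale_ebx;

	if (use_xchg) {
		/* swap: EBX now holds the low dword of src */
		uint32_t tmp = ebx;
		ebx = edi;
		edi = tmp;
	} else {
		/* buggy move: EDI is overwritten, EBX keeps its stale value */
		edi = ebx;
	}
	(void)edi;	/* the real code swaps EDI back into EBX afterwards */

	return model_cmpxchg8b(dst, (uint32_t)exp, (uint32_t)(exp >> 32),
			       ebx, ecx);
}
```

Starting from a = 0xff000000ff with stale_ebx = 0, model_cmpset(&a, 0xff000000ff, 0xfa000000fa, 0, 0) leaves a at 0xfa00000000, while the same call with use_xchg = 1 leaves it at 0xfa000000fa, which matches the two results reported in the quoted reproduction.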