On Mon, Aug 17, 2020 at 02:03:23PM +0100, Julien Grall wrote:
> 
> 
> On 17/08/2020 12:50, Roger Pau Monné wrote:
> > On Mon, Aug 17, 2020 at 12:05:54PM +0100, Julien Grall wrote:
> > > Hi,
> > > 
> > > On 17/08/2020 11:33, Roger Pau Monné wrote:
> > > > On Mon, Aug 17, 2020 at 10:42:54AM +0100, Julien Grall wrote:
> > > > > Hi,
> > > > > 
> > > > > On 17/08/2020 10:24, Roger Pau Monné wrote:
> > > > > > On Sat, Aug 15, 2020 at 06:21:43PM +0100, Julien Grall wrote:
> > > > > > > From: Julien Grall <jgr...@amazon.com>
> > > > > > > 
> > > > > > > The IOREQ code is using cmpxchg() with 64-bit value. At the 
> > > > > > > moment, this
> > > > > > > is x86 code, but there is plan to make it common.
> > > > > > > 
> > > > > > > To cater 32-bit arch, introduce two new helpers to deal with 
> > > > > > > 64-bit
> > > > > > > cmpxchg.
> > > > > > > 
> > > > > > > The Arm 32-bit implementation of cmpxchg64() is based on the 
> > > > > > > __cmpxchg64
> > > > > > > in Linux v5.8 (arch/arm/include/asm/cmpxchg.h).
> > > > > > > 
> > > > > > > Signed-off-by: Julien Grall <jgr...@amazon.com>
> > > > > > > Cc: Oleksandr Tyshchenko <oleksandr_tyshche...@epam.com>
> > > > > > > ---
> > > > > > > diff --git a/xen/include/asm-x86/guest_atomics.h 
> > > > > > > b/xen/include/asm-x86/guest_atomics.h
> > > > > > > index 029417c8ffc1..f4de9d3631ff 100644
> > > > > > > --- a/xen/include/asm-x86/guest_atomics.h
> > > > > > > +++ b/xen/include/asm-x86/guest_atomics.h
> > > > > > > @@ -20,6 +20,8 @@
> > > > > > >         ((void)(d), test_and_change_bit(nr, p))
> > > > > > >     #define guest_cmpxchg(d, ptr, o, n) ((void)(d), cmpxchg(ptr, 
> > > > > > > o, n))
> > > > > > > +#define guest_cmpxchg64(d, ptr, o, n) ((void)(d), cmpxchg64(ptr, 
> > > > > > > o, n))
> > > > > > > +
> > > > > > >     #endif /* _X86_GUEST_ATOMICS_H */
> > > > > > >     /*
> > > > > > > diff --git a/xen/include/asm-x86/x86_64/system.h 
> > > > > > > b/xen/include/asm-x86/x86_64/system.h
> > > > > > > index f471859c19cc..c1b16105e9f2 100644
> > > > > > > --- a/xen/include/asm-x86/x86_64/system.h
> > > > > > > +++ b/xen/include/asm-x86/x86_64/system.h
> > > > > > > @@ -5,6 +5,8 @@
> > > > > > >         ((__typeof__(*(ptr)))__cmpxchg((ptr),(unsigned long)(o),  
> > > > > > >           \
> > > > > > >                                        (unsigned 
> > > > > > > long)(n),sizeof(*(ptr))))
> > > > > > > +#define cmpxchg64(ptr, o, n) cmpxchg(ptr, o, n)
> > > > > > 
> > > > > > Why do you need to introduce an explicitly sized version of cmpxchg
> > > > > > for 64bit values?
> > > > > > 
> > > > > > There's no cmpxchg{8,16,32}, so I would expect cmpxchg64 to just be
> > > > > > handled by cmpxchg detecting the size of the parameter passed to the
> > > > > > function.
> > > > > That works quite well for 64-bit arches. However, for 32-bit, you 
> > > > > would need
> > > > > to take some detour so 32-bit and 64-bit can cohabit (you cannot 
> > > > > simply
> > > > > replace unsigned long with uint64_t).
> > > > 
> > > > Oh, I see. Switching __cmpxchg on Arm 32 to use unsigned long long or
> > > > uint64_t would be bad, as you would then need two registers to pass
> > > > the value to the function, or push it on the stack?
> > > 
> > > We have only 4 registers (r0 - r4) available for the arguments. With 
> > > 64-bit
> > > value, we will be using 2 registers, some will end up to be pushed on the
> > > stack.
> > > 
> > > This is assuming the compiler is not clever enough to see we are only 
> > > using
> > > the bottom 32-bit with some cmpxchg.
> > > 
> > > > 
> > > > Maybe do something like:
> > > > 
> > > > #define cmpxchg(ptr,o,n) ({                                             
> > > > \
> > > >         typeof(*(ptr)) tmp;                                             
> > > > \
> > > >                                                                         
> > > > \
> > > >         switch ( sizeof(*(ptr)) )                                       
> > > > \
> > > >         {                                                               
> > > > \
> > > >         case 8:                                                         
> > > > \
> > > >                 tmp = __cmpxchg_mb64((ptr), (uint64_t)(o),              
> > > > \
> > > >                                 (uint64_t)(n), sizeof(*(ptr))))         
> > > > \
> > > >                 break;                                                  
> > > > \
> > > >         default:                                                        
> > > > \
> > > >                 tmp = __cmpxchg_mb((ptr), (unsigned long)(o),           
> > > > \
> > > >                                 (unsigned long)(n), sizeof(*(ptr))))    
> > > > \
> > > >                 break;                                                  
> > > > \
> > > >         }                                                               
> > > > \
> > > >         tmp;                                                            
> > > > \
> > > > })
> > > 
> > > 
> > > Unfortunately this can't compile if o and n are pointers because the
> > > compiler will complain about the cast to uint64_t.
> > 
> > Right, we would have to cast to unsigned long first and then to
> > uint64_t, which is not very nice.
> 
> If you use (uint64_t)(unsigned long) in the 64-bit case, then you would lose
> the top 32-bit. So cmpxchg() wouldn't work as expected.
> 
> > 
> > > 
> > > We would also need a cast when assigning to tmp because tmp may not be a
> > > scalar type. This would lead to the same compiler issue.
> > 
> > Yes, we would have to do a bunch of casts.
> 
> I don't think there is a way to solve this using just cast.

Right. I certainly failed to see it.

> > 
> > I don't think the union is so bad, but let's wait to see what others
> > think.
> 
> I am not concerned about the code itself but the assembly generated. I don't
> want to increase the number memory access or instructions just for the sake
> of trying to get cmpxchg() to work with 64-bit.
> 
> I will have to implement it and see the code generated.

Since we already have a 128bit version I don't think there's a problem
in introducing a 64bit version (and forcing it's usage in common
code).

> > 
> > FWIW x86 already has a specific handler for 128bit values: cmpxchg16b.
> > Maybe it would be better to name this cmpxchg8b? Or rename the
> > existing one to cmpxchg128 for coherence.
> 
> I dislike the name cmpxchg8b(). This is much easier to match the type and
> the name with cmpxchg64().
> 
> I would be happy to rename cmpxchg16b() if the x86 folks would want it.

That would be fine by me.

Reply via email to