Wilco Dijkstra <wilco.dijks...@arm.com> writes:
> Hi Richard,
>
> I've reworded the commit message a bit:
>
> The CPU features initialization code uses CPUID registers (rather than
> HWCAP).  The equality comparisons it uses are incorrect: for example FEAT_SVE
> is not set if SVE2 is available.  Using HWCAPs for these is both simpler and
> correct.  The initialization must also be done atomically to avoid multiple
> threads causing corruption due to non-atomic RMW accesses to the global.

Thanks, sounds good.

>> What criteria did you use for choosing whether to keep or remove
>> the system register checks?
>
> Essentially anything covered by HWCAP doesn't need an explicit check. So I 
> kept
> the LS64 and PREDRES checks since they don't have a HWCAP allocated (I'm not
> entirely convinced we need these, let alone having 3 individual bits for 
> LS64, but
> that's something for the ACLE spec to sort out). The goal here is to fix all 
> obvious
> bugs so one can use FMV as intended.

Didn't we take the opposite approach for libatomic though?

/* LSE128 atomic support encoded in ID_AA64ISAR0_EL1.Atomic,
   bits[23:20].  The expected value is 0b0011.  Check that.  */

#define AT_FEAT_FIELD(isar0)    (((isar0) >> 20) & 15)

static inline bool
has_lse128 (unsigned long hwcap, const __ifunc_arg_t *features)
{
  if (hwcap & _IFUNC_ARG_HWCAP
      && features->_hwcap2 & HWCAP2_LSE128)
    return true;
  /* A 0 HWCAP2_LSE128 bit may be just as much a sign of missing HWCAP2 bit
     support in older kernels as it is of CPU feature absence.  Try fallback
     method to guarantee LSE128 is not implemented.

     In the absence of HWCAP_CPUID, we are unable to check for LSE128.
     If feature check available, check LSE2 prerequisite before proceeding.  */
  if (!(hwcap & HWCAP_CPUID) || !(hwcap & HWCAP_USCAT))
     return false;

  unsigned long isar0;
  asm volatile ("mrs %0, ID_AA64ISAR0_EL1" : "=r" (isar0));
  if (AT_FEAT_FIELD (isar0) >= 3)
    return true;
  return false;
}

I suppose one difference is that the libatomic code is gating a
choice between a well-defined, curated set of routines, whereas the
libgcc code is providing a general user-facing feature.  So maybe
libgcc should be more conservative for that reason?

Thanks,
Richard

Reply via email to