I was looking into some bitfield code for aarch64 and was wondering
why SLOW_BYTE_ACCESS is set to 0.  I can't figure out the reasoning
behind it.
The header says:
   Although there's no difference in instruction count or cycles,
  in AArch64 we don't want to expand to a sub-word to a 64-bit access
  if we don't have to, for power-saving reasons.  */

But that does not make sense, because with SLOW_BYTE_ACCESS set to 0,
GCC expands a sub-word access to a 64-bit access.

When I set SLOW_BYTE_ACCESS to 1, I get a speedup of between 38% and
208% for bitfield accesses inside a loop on ThunderX CN88xx.
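
As a rough illustration of the pattern in question (a hypothetical
test case, not the benchmark behind the numbers above; the struct
layout and the name sum_b are assumptions made up for this sketch),
a bitfield read inside a loop looks like this:

```c
#include <stddef.h>

/* Hypothetical layout, chosen only for illustration.  */
struct flags
{
  unsigned int a : 3;
  unsigned int b : 5;
  unsigned int c : 8;
};

/* Sum one sub-word bitfield across an array.  How wide a load GCC
   uses for p[i].b is the behavior under discussion.  */
unsigned int
sum_b (const struct flags *p, size_t n)
{
  unsigned int sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += p[i].b;
  return sum;
}
```

Comparing the -O2 -S output from a compiler built with
SLOW_BYTE_ACCESS at 0 against one built with it at 1 should show
whether the load width for the field changes.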

Should we change SLOW_BYTE_ACCESS (or maybe better yet get rid of it)?

Thanks,
Andrew Pinski
