On Tue, Jul 11, 2017 at 3:09 AM, Richard Earnshaw (lists)
<richard.earns...@arm.com> wrote:
> On 11/07/17 05:16, Andrew Pinski wrote:
>> I was looking into some bitfield code for aarch64 and was wondering
>> why SLOW_BYTE_ACCESS is set to 0.  I can't seem to figure out why
>> though.
>> The header says:
>>    Although there's no difference in instruction count or cycles,
>>   in AArch64 we don't want to expand to a sub-word to a 64-bit access
>>   if we don't have to, for power-saving reasons.  */
>>
>> But that does not make sense, because with SLOW_BYTE_ACCESS set to 0, GCC
>> expands a sub-word access to a 64-bit access.
>> When I set SLOW_BYTE_ACCESS to 1, I get a 38% to 208% speedup
>> for bitfield accesses inside a loop on ThunderX CN88xx.
>
> What's the test case?
>
>>
>> Should we change SLOW_BYTE_ACCESS (or maybe better yet get rid of it)?
>>
>
> The documentation for SLOW_BYTE_ACCESS is just plain confusing, IMO.
> And your comment above seems to be contrary to the documentation as well.

Here is the testcase which shows the issue:
typedef unsigned long long u64;
typedef struct
{
  u64 a:10;
  u64 b:10;
  u64 c:9;
  u64 d:7;
  u64 e:14;
  u64 f:14;
} s_t;
void setting(s_t *a)
{
  a->a = 0x2AA;
  a->b = 0x2AA;
  a->c = 0x155;
  a->d = 0x2A;
  a->e = 0x2AAA;
  a->f = 0x2AAA;
}
void set(s_t *a, int b, int c, int d, int e, int f, int g)
{
  a->a = b;
  a->b = c;
  a->c = d;
  a->d = e;
  a->e = f;
  a->f = g;
}
--- CUT ---
If SLOW_BYTE_ACCESS is set to 0, we get many more instructions.  See
the logic in bit_field_mode_iterator::next_mode, which calls
bit_field_mode_iterator::prefer_smaller_modes, which in turn checks
SLOW_BYTE_ACCESS.
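
For comparison, here is a hand-written sketch of what a single wide
read-modify-write of s_t looks like: one 64-bit load, six field inserts,
one 64-bit store.  This is not GCC output.  insert_field and set_wide are
hypothetical names used only for illustration, and the bit positions
assume the bitfields are allocated starting from the least significant
bit (a at bit 0, b at 10, c at 20, d at 29, e at 36, f at 50), which is
an assumption about the ABI rather than something the C standard
guarantees.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: replace WIDTH bits of WORD, starting at bit POS,
   with the low WIDTH bits of VALUE.  */
static inline uint64_t
insert_field (uint64_t word, unsigned pos, unsigned width, uint64_t value)
{
  assert (width > 0 && width < 64 && pos + width <= 64);
  uint64_t mask = ((UINT64_C (1) << width) - 1) << pos;
  return (word & ~mask) | ((value << pos) & mask);
}

/* Hand-written equivalent of set() from the testcase, operating on the
   raw 64-bit word.  Assumed layout (LSB first): a:10 at bit 0, b:10 at
   bit 10, c:9 at bit 20, d:7 at bit 29, e:14 at bit 36, f:14 at bit 50.  */
static inline void
set_wide (uint64_t *p, int b, int c, int d, int e, int f, int g)
{
  uint64_t w = *p;                              /* one 64-bit load */
  w = insert_field (w, 0, 10, (uint64_t) b);    /* a */
  w = insert_field (w, 10, 10, (uint64_t) c);   /* b */
  w = insert_field (w, 20, 9, (uint64_t) d);    /* c */
  w = insert_field (w, 29, 7, (uint64_t) e);    /* d */
  w = insert_field (w, 36, 14, (uint64_t) f);   /* e */
  w = insert_field (w, 50, 14, (uint64_t) g);   /* f */
  *p = w;                                       /* one 64-bit store */
}
```

The six widths add up to exactly 64 bits, so the whole struct fits in the
one word that set_wide loads and stores.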

Note that the only other place that checks SLOW_BYTE_ACCESS is dojump.c,
and I think that code might be dead now that we expand directly from SSA.

Thanks,
Andrew Pinski

>
> R.
