Hi,

On Thu, 5 Mar 2015 12:24:56, Richard Biener wrote:
>
> On Thu, Mar 5, 2015 at 12:00 PM, Bernd Edlinger
> <bernd.edlin...@hotmail.de> wrote:
>> Hi,
>>
>> On Thu, 5 Mar 2015 10:00:26, Richard Biener wrote:
>>>
>>> On Wed, Mar 4, 2015 at 3:13 PM, Bernd Edlinger
>>
>> Maybe one thing is missing from strict_volatile_bitfield_p, I am not sure.
>>
>> Maybe it should check that MEM_ALIGN (op0) >= GET_MODE_ALIGNMENT (fieldmode),
>> because the strict volatile bitfields handling will inevitably try to use
>> the fieldmode to access the memory.
>>
>> Or would it be better to say MEM_ALIGN (op0) >= GET_MODE_BITSIZE (fieldmode),
>> because that is easier to explain when someone asks when we guarantee the
>> semantics of strict volatile bitfields?
>
> But on non-strict-align targets you can even for 1-byte aligned MEMs
> access an SImode field directly.  So the old code looks correct to me
> here and the fix needs to be done somewhere else.
>
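(To make the "1-byte aligned MEM with an SImode field" case concrete, such a layout
would come e.g. from the following; this is a made-up illustration, not one of the
existing testcases:)

/* Made-up example: because of the packed attribute the volatile 32-bit
   field sits at byte offset 1, so MEM_ALIGN of the access is only 8 bits,
   while GET_MODE_ALIGNMENT (SImode) is 32 on typical targets.
   -fstrict-volatile-bitfields would still want a single SImode access.  */
struct dev
{
  unsigned char pad;
  volatile unsigned int reg : 32;
} __attribute__ ((packed));

extern struct dev d;

unsigned int
get_reg (void)
{
  return d.reg;   /* one SImode load at a byte-aligned address */
}

On a !strict-align target this can indeed be done as a single unaligned SImode
load; on a strict-align target the access would have to be split.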
But this SImode access is split up into several QImode or HImode accesses in the
processor's execution pipeline; on an external bus like AXI, all memory
transactions are ultimately aligned.  The difference is just that some processors
can split up the unaligned accesses themselves and some need help from the
compiler.  Even on x86 unaligned accesses have different semantics: they are no
longer atomic, while an aligned access is always executed as an atomic
transaction on an x86 processor.

>> Probably there is already something in the logic in expr.c that prevents
>> these cases, because otherwise it would be way too easy to find an example
>> for unaligned accesses to unaligned memory on STRICT_ALIGNMENT targets.
>>
>> Ok, what would you think about this variant?
>>
>> --- expmed.c.jj 2015-01-16 11:20:40.000000000 +0100
>> +++ expmed.c    2015-03-05 11:50:09.400444416 +0100
>> @@ -472,9 +472,11 @@ strict_volatile_bitfield_p (rtx op0, uns
>>       return false;
>>
>>     /* Check for cases of unaligned fields that must be split.  */
>> -   if (bitnum % BITS_PER_UNIT + bitsize > modesize
>> -       || (STRICT_ALIGNMENT
>> -           && bitnum % GET_MODE_ALIGNMENT (fieldmode) + bitsize > modesize))
>> +   if (bitnum % modesize + bitsize > modesize)
>> +     return false;
>> +
>> +   /* Check that the memory is sufficiently aligned.  */
>> +   if (MEM_ALIGN (op0) < modesize)
>
> I think that only applies to strict-align targets and then only for
> GET_MODE_ALIGNMENT (modesize).  And of course what matters
> is the alignment at bitnum - even though op0 may be not sufficiently
> aligned it may have known misalignment so that op0 + bitnum is
> sufficiently aligned.
>
> Testcases would need to annotate structs with packed/aligned attributes
> to get at these cases.
>
> For the testcase included in the patch, what does the patch end up doing?
> Not going the strict-volatile bitfield expansion path?  That looks unnecessary
> on !strict-alignment targets but resonable on strict-align targets where the
> access would need to be splitted.  So, why does it end up being splitted
> on !strict-align targets?
>

With

  gcc -fstrict-volatile-bitfields -S 20150304-1.c

and without the patch we find this in 20150304-1.s:

main:
.LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        movzbl  global+1(%rip), %eax
        orl     $-1, %eax
        movb    %al, global+1(%rip)
        movzbl  global+2(%rip), %eax
        orl     $-1, %eax
        movb    %al, global+2(%rip)
        movzbl  global+3(%rip), %eax
        orl     $-1, %eax
        movb    %al, global+3(%rip)
        movzbl  global+4(%rip), %eax
        orl     $127, %eax
        movb    %al, global+4(%rip)
        movl    global(%rip), %eax
        shrl    $8, %eax
        andl    $2147483647, %eax
        cmpl    $2147483647, %eax
        je      .L2
        call    abort
.L2:
        movl    $0, %eax
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc

So the write path is OK, because strict_volatile_bitfield_p returns false due to
this check:

  /* Check for cases where the C++ memory model applies.  */
  if (bitregion_end != 0
      && (bitnum - bitnum % modesize < bitregion_start
          || bitnum - bitnum % modesize + modesize - 1 > bitregion_end))
    return false;

Here bitregion_start = 8, bitregion_end = 39,
bitnum - bitnum % modesize = 0, and
bitnum - bitnum % modesize + modesize - 1 = 31,
so the SImode access would start below the bit region and the check rejects it.
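(Just to spell out the arithmetic of that check, here is a throwaway snippet with
those numbers hard-coded; purely illustrative, nothing GCC-specific:)

#include <stdbool.h>
#include <stdio.h>

/* Illustration only: the C++ memory model check above, with the values
   from the write path of the testcase plugged in.  */
static bool
covered_by_bitregion (unsigned int bitnum, unsigned int modesize,
                      unsigned int bitregion_start, unsigned int bitregion_end)
{
  unsigned int first = bitnum - bitnum % modesize;  /* first bit the SImode access touches */
  unsigned int last = first + modesize - 1;         /* last bit it touches */
  return first >= bitregion_start && last <= bitregion_end;
}

int
main (void)
{
  /* bitnum = 8, modesize = 32, bit region = [8, 39]: the access would touch
     bits 0..31, i.e. bits below the region, so strict_volatile_bitfield_p
     has to return false.  */
  printf ("%d\n", covered_by_bitregion (8, 32, 8, 39));   /* prints 0 */
  return 0;
}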
This does not happen in the read path, and the problem is that the access here
does not work:

      str_rtx = narrow_bit_field_mem (str_rtx, fieldmode, bitsize, bitnum,
                                      &bitnum);
      /* Explicitly override the C/C++ memory model; ignore the
         bit range so that we can do the access in the mode mandated
         by -fstrict-volatile-bitfields instead.  */
      store_fixed_bit_field_1 (str_rtx, bitsize, bitnum, value);

Here str_rtx is unchanged, bitnum = 8 and bitsize = 31, but
store_fixed_bit_field_1 cannot handle that.

BTW: I can make the write code path fail as well if I change this bit in the
test case and add ":8" to char x:

  struct s { char x:8; unsigned int y:31; } __attribute__((packed));

Now the bit region is from 0..39, and we get this code:

main:
.LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        movl    global(%rip), %eax
        orl     $-256, %eax
        movl    %eax, global(%rip)
        movl    global(%rip), %eax
        shrl    $8, %eax
        andl    $2147483647, %eax
        cmpl    $2147483647, %eax
        je      .L2
        call    abort
.L2:
        movl    $0, %eax
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc

The patch (though I think I would still prefer my initial proposal) fixes this
by skipping the strict alignment code path.  That looks OK to me, because the
structure is always packed when that happens, and it cannot be used to access
"device registers", which are by definition always naturally aligned, at least
to the machine word size.

With the patch we get:

main:
.LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        movzbl  global+1(%rip), %eax
        orl     $-1, %eax
        movb    %al, global+1(%rip)
        movzbl  global+2(%rip), %eax
        orl     $-1, %eax
        movb    %al, global+2(%rip)
        movzbl  global+3(%rip), %eax
        orl     $-1, %eax
        movb    %al, global+3(%rip)
        movzbl  global+4(%rip), %eax
        orl     $127, %eax
        movb    %al, global+4(%rip)
        movzbl  global+1(%rip), %eax
        movzbl  %al, %eax
        movzbl  global+2(%rip), %edx
        movzbl  %dl, %edx
        salq    $8, %rdx
        orq     %rax, %rdx
        movzbl  global+3(%rip), %eax
        movzbl  %al, %eax
        salq    $16, %rax
        orq     %rax, %rdx
        movzbl  global+4(%rip), %eax
        movzbl  %al, %eax
        andl    $127, %eax
        salq    $24, %rax
        orq     %rdx, %rax
        cmpl    $2147483647, %eax
        je      .L2
        call    abort
.L2:
        movl    $0, %eax
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc

Every access is split into 4 QImode accesses, but that is as expected, because
the structure is byte-aligned.

Thanks
Bernd.
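P.S.: In case it helps, the four QImode accesses of the read path above amount to
roughly the following C.  This is only an illustration of what the generated code
does (the pointer p and the function name are made up), assuming the little-endian
x86_64 layout:

/* Illustration only: the byte-wise read of the 31-bit field y that the
   generated code above performs, written out in C.  p points at the start
   of the packed struct.  */
unsigned int
read_y_bytewise (const volatile unsigned char *p)
{
  unsigned int b1 = p[1];          /* global+1 */
  unsigned int b2 = p[2];          /* global+2 */
  unsigned int b3 = p[3];          /* global+3 */
  unsigned int b4 = p[4] & 0x7f;   /* global+4: only 7 bits belong to y */
  return b1 | (b2 << 8) | (b3 << 16) | (b4 << 24);
}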