On Tue, Sep 17, 2013 at 2:08 PM, Bernd Edlinger <bernd.edlin...@hotmail.de> wrote:
> On Tue, 17 Sep 2013 12:45:40, Richard Biener wrote:
>>
>> On Tue, Sep 17, 2013 at 12:00 PM, Richard Biener
>> <richard.guent...@gmail.com> wrote:
>>> On Sun, Sep 15, 2013 at 6:55 PM, Bernd Edlinger
>>> <bernd.edlin...@hotmail.de> wrote:
>>>> Hello Richard,
>>>>
>>>> attached is my second attempt at fixing PR 57748. This time the
>>>> movmisalign path is completely removed, and a similar bug in the read
>>>> handling of misaligned structures with a non-BLKmode is fixed too.
>>>> There are several new test cases for the different possible failure
>>>> modes.
>>>>
>>>> This patch was boot-strapped and regression-tested on
>>>> x86_64-unknown-linux-gnu and i686-pc-linux-gnu.
>>>>
>>>> Additionally I built eCos and an eCos application (on ARMv5, using
>>>> packed structures) with an arm-eabi cross compiler and looked for
>>>> differences in the disassembled code with and without this patch,
>>>> but there were none.
>>>>
>>>> OK for trunk?
>>>
>>> I agree that the existing movmisalign path that you remove is simply
>>> bogus, so removing it looks fine to me. Can you give a rationale for
>>>
>>> @@ -4773,6 +4738,8 @@ expand_assignment (tree to, tree from, b
>>>        if (MEM_P (to_rtx)
>>>            && GET_MODE (to_rtx) == BLKmode
>>>            && GET_MODE (XEXP (to_rtx, 0)) != VOIDmode
>>> +          && bitregion_start == 0
>>> +          && bitregion_end == 0
>>>            && bitsize > 0
>>>            && (bitpos % bitsize) == 0
>>>            && (bitsize % GET_MODE_ALIGNMENT (mode1)) == 0
>
> OK, as already said, I think it could be dangerous to set bitpos = 0
> without considering bitregion_start/end, but I think it may well be that
> this case cannot happen: if bitsize is a multiple of ALIGNMENT and bitpos
> is a multiple of bitsize, we probably do not have a bit-field at all.
> And of course I have no test case that fails without this hunk.
> Maybe it would be better to add an assertion here, like:
>
>   {
>     gcc_assert (bitregion_start == 0 && bitregion_end == 0);
>     to_rtx = adjust_address (to_rtx, mode1, bitpos / BITS_PER_UNIT);
>     bitpos = 0;
>   }
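>
> To make that concrete (an illustrative sketch of mine, not one of the
> new test cases, and assuming a typical target with 32-bit int): for an
> ordinary byte-aligned field like b below, bitpos (160) is a multiple of
> bitsize (32), bitsize is a multiple of GET_MODE_ALIGNMENT (SImode), and
> as far as I can see bitregion_start/end are only given a nonzero range
> for genuine bit-field stores, so the assertion should hold:
>
>   /* Sketch: the array member keeps the struct in BLKmode, and s->b is
>      a plain subobject, not a bit-field, so no bit region is computed
>      for the store.  */
>   struct S { int a[5]; int b; };
>
>   void set_b (struct S *s, int v)
>   {
>     s->b = v;
>   }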
>
>>> and especially for
>>>
>>> @@ -9905,7 +9861,7 @@ expand_expr_real_1 (tree exp, rtx target
>>>                         && modifier != EXPAND_STACK_PARM
>>>                         ? target : NULL_RTX),
>>>                        VOIDmode,
>>> -                      modifier == EXPAND_SUM ? EXPAND_NORMAL : modifier);
>>> +                      EXPAND_MEMORY);
>>>
>>>      /* If the bitfield is volatile, we want to access it in the
>>>         field's mode, not the computed mode.
>>>
>>> which AFAIK makes "memory" expansion of loads/stores from/to registers
>>> change (fail? go through stack memory?) -- see the handling of non-MEM
>>> return values from that expand_expr call.
>
> I wanted to make the expansion of MEM_REF and TARGET_MEM_REF not go
> through the final misalign handling, which is guarded by
> "if (modifier != EXPAND_WRITE && modifier != EXPAND_MEMORY && ...".
>
> What we want here is most likely EXPAND_MEMORY, which returns a memory
> context if possible.
>
> Could you specify more explicitly what you mean by "handling of non-MEM
> return values from that expand_expr call"? Then I could try to find test
> cases for that.
>
>> In particular this seems to disable all movmisalign handling for
>> MEM_REFs wrapped in component references, which looks wrong. I was
>> playing with
>>
>> typedef long long V
>>   __attribute__ ((vector_size (2 * sizeof (long long)), may_alias));
>>
>> struct S { long long a[11]; V v; } __attribute__((aligned(8),packed));
>> struct S a, *b = &a;
>> V v, w;
>>
>> int main()
>> {
>>   v = b->v;
>>   b->v = w;
>>   return 0;
>> }
>>
>> (use -fno-common) and I see that we use unaligned stores too often
>> (even with a properly aligned MEM).
>>
>> The above at least shows movmisalign opportunities wrapped in
>> component-refs.
>
> Hmm, interesting. This does not compile differently with or without this
> patch.
>
> I have another observation, regarding the test case pr50444.c:
>
> method:
> .LFB4:
>         .cfi_startproc
>         movq    32(%rdi), %rax
>         testq   %rax, %rax
>         jne     .L7
>         addl    $1, 16(%rdi)
>         movl    $3, %eax
>         movq    %rax, 32(%rdi)
>         movdqu  16(%rdi), %xmm0
>         pxor    (%rdi), %xmm0
>         movdqu  %xmm0, 40(%rdi)
>
> Here the first movdqu could just as well be a movdqa, because 16(%rdi)
> is 128-bit aligned. In the ctor method a movdqa is used, but SRA is very
> pessimistic and generates an unaligned MEM_REF. This example does not
> compile any differently with this patch either.
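>
> As a reduced sketch of the effect (hand-written, not the actual
> pr50444.c): when the known alignment of the access is 16 bytes the
> compiler may emit movdqa/movaps, while an access whose alignment was
> pessimized to 1 has to use movdqu or an equivalent unaligned sequence:
>
>   typedef long long V __attribute__ ((vector_size (16)));
>
>   struct T { V x; V y; };    /* naturally 16-byte aligned */
>
>   V load_known_aligned (struct T *t)
>   {
>     return t->y;             /* &t->y is 16-byte aligned: movdqa is safe */
>   }
>
>   V load_unknown_alignment (const char *p)
>   {
>     V r;                     /* alignment of p is only 1 here */
>     __builtin_memcpy (&r, p, sizeof r);  /* needs an unaligned load */
>     return r;
>   }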
>
>>> That is, do you see anything break with just removing the movmisalign
>>> path? I'd rather install that (with the new test cases that then pass)
>>> separately, as this is a somewhat fragile area and being able to more
>>> selectively bisect/backport would be nice.
>
> No, I think that is a good idea.
>
> Attached is the first part of the patch, which only removes the
> movmisalign path.
>
> Should I apply this one after regression testing?

It seems you already have...?

Richard.

> Bernd.
>
>>> Thanks,
>>> Richard.
>>>
>>>> Regards
>>>> Bernd.