Just for completeness, these were the test examples on this private communication:
On Fri, 6 Sep 2013 11:19:18, Richard Biener wrote: > On Fri, Sep 6, 2013 at 10:35 AM, Bernd Edlinger > <bernd.edlin...@hotmail.de> wrote: >> Richard, >> >>> But the movmisalign path skips all this code and with the >>> current code thinks the actual access is in the mode of the >>> whole structure. (and also misses the address adjustment >>> as shown in the other testcase for the bug) >>> >>> The movmisalign handling in this path is just broken. And >>> it's _not_ just for optimization! If you have >>> >>> struct __attribute__((packed)) { >>> double x; >>> v2df y; >>> } *p; >>> >>> and do >>> >>> p = malloc (48); // gets you 8-byte aligned memory >>> p->y = (v2df) { 1., 2. }; >>> >>> then you cannot skip the movmisaling handling because >>> we'd generate an aligned move (based on the mode) then. >>> >> >> Ok, test examples are really helpful here. >> >> This time the structure is BLKmode, unaligned, >> movmisalign = false anyway. >> >> I tried to make a test case out of your example, >> and as I expected, the code is also correct: >> >> foo: >> .cfi_startproc >> movdqa .LC1(%rip), %xmm0 >> movq $-1, (%rdi) >> movl $0x4048f5c3, 8(%rdi) >> movdqu %xmm0, 12(%rdi) >> ret >> >> movdqu. >> >> The test executes without trap. >> And I did everything to make the object unaligned. > > Yeah, well - for the expand path with BLKmode it's working fine. > We're doing all > the dances because of correctness for non-BLKmode expand paths. > >> I am sure we could completely remove the >> movmisalign path, and nothing would happen. >> probably. except maybe for a performance regression. > > In this place probably yes. But we can also fix it to use the proper mode and > properly do the offset adjustment. But just adding a bounds check looks > broken to me. > > Note that there may have been a correctness reason for this code in the > light of the IPA-SRA code. Maybe Martin remembers. > > If removing the movmisalign handling inside the handled-component > case in expand_assignment works out (make sure to also test a > strict-alignment target) then that is probably fine. > > Richard. > >> >> Bernd.
#include <stdio.h> #include <string.h> typedef long long V __attribute__ ((vector_size (2 * sizeof (long long)), may_alias)); union x { long long a; float b; } __attribute__((aligned(1))) ; struct s { union x xx[0]; V x; } __attribute__((packed)); void __attribute__((noinline, noclone)) foo(struct s * x) { x->xx[0].a = -1; x->xx[0].b = 3.14; x->x[1] = 0x123456789ABCDEF; } int main() { struct s ss; memset(&ss, 0, sizeof(ss)); foo (&ss); printf("%f %llX\n", ss.xx[0].b, ss.xx[0].a); printf("%llX %llX\n", ss.x[0], ss.x[1]); }
#include <stdio.h> #include <string.h> typedef long long V __attribute__ ((vector_size (2 * sizeof (long long)), may_alias)); struct __attribute__((packed)) s { long long a; float b; V x; }; void __attribute__((noinline, noclone)) foo(struct s * x) { V a = { 1, 2 }; x->a = -1; x->b = 3.14; x->x = a; } int main() { long long buf[100]; struct s* ss = (struct s*) ((char*)buf+1); memset(ss, 0, sizeof(*ss)); foo (ss); printf("%f %llX\n", ss->b, ss->a); printf("%llX %llX\n", ss->x[0], ss->x[1]); }