---------------------------------------- > Date: Tue, 3 Dec 2013 14:51:21 +0100 > Subject: Re: [REPOST] Invalid Code when reading from unaligned zero-sized > array > From: richard.guent...@gmail.com > To: bernd.edlin...@hotmail.de > CC: l...@redhat.com; gcc-patches@gcc.gnu.org; ja...@redhat.com > > On Tue, Dec 3, 2013 at 2:10 PM, Richard Biener > <richard.guent...@gmail.com> wrote: >> On Tue, Dec 3, 2013 at 1:48 PM, Bernd Edlinger >> <bernd.edlin...@hotmail.de> wrote: >>> Hi Jeff, >>> >>> please find attached the patch (incl. test cases) for the unaligned read >>> BUG that I found while investigating >>> on PR#57748: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57748 >>> >>> one test case is this one: >>> >>> pr57748-3.c: >>> /* PR middle-end/57748 */ >>> /* { dg-do run } */ >>> /* wrong code in expand_expr_real_1. */ >>> >>> #include <stdlib.h> >>> >>> extern void abort (void); >>> >>> typedef long long V >>> __attribute__ ((vector_size (2 * sizeof (long long)), may_alias)); >>> >>> typedef struct S { V a; V b[0]; } P __attribute__((aligned (1))); >>> >>> struct __attribute__((packed)) T { char c; P s; }; >>> >>> void __attribute__((noinline, noclone)) >>> check (P *p) >>> { >>> if (p->b[0][0] != 3 || p->b[0][1] != 4) >>> abort (); >>> } >>> >>> void __attribute__((noinline, noclone)) >>> foo (struct T *t) >>> { >>> V a = { 3, 4 }; >>> t->s.b[0] = a; >>> } >>> >>> int >>> main () >>> { >>> struct T *t = (struct T *) calloc (128, 1); >>> >>> foo (t); >>> check (&t->s); >>> >>> free (t); >>> return 0; >>> } >>> >>> >>> and the other one is >>> pr57748-4.c: >>> /* PR middle-end/57748 */ >>> /* { dg-do run } */ >>> /* wrong code in expand_expr_real_1. */ >>> >>> #include <stdlib.h> >>> >>> extern void abort (void); >>> >>> typedef long long V >>> __attribute__ ((vector_size (2 * sizeof (long long)), may_alias)); >>> >>> typedef struct S { V b[1]; } P __attribute__((aligned (1))); >>> >>> struct __attribute__((packed)) T { char c; P s; }; >>> >>> void __attribute__((noinline, noclone)) >>> check (P *p) >>> { >>> if (p->b[1][0] != 3 || p->b[1][1] != 4) >>> abort (); >>> } >>> >>> void __attribute__((noinline, noclone)) >>> foo (struct T *t) >>> { >>> V a = { 3, 4 }; >>> t->s.b[1] = a; >>> } >>> >>> int >>> main () >>> { >>> struct T *t = (struct T *) calloc (128, 1); >>> >>> foo (t); >>> check (&t->s); >>> >>> free (t); >>> return 0; >>> } >>> >>> >>> The patch does add a boolean "expand_reference" parameter to >>> expand_expr_real and >>> expand_expr_real_1. I pass true when I intend to use the returned memory >>> context >>> as an array reference, instead of a value. At places where mis-aligned >>> values are extracted, >>> I do not return a register with the extracted mis-aligned value if >>> expand_reference is true. >>> When I have a VIEW_CONVERT_EXPR I pay attention to pass down the outer >>> "expand_reference" >>> to the inner expand_expr_real call. Expand_reference, is pretty much >>> similar to the >>> expand_modifier "EXPAND_MEMORY". >>> >>> Boot-strapped and regression-tested on X86_64-pc-linux-gnu (many times). >>> >>> Ok for trunk? >> >> It still feels like papering over the underlying issue. Let me have a >> second (or third?) look. > > So I believe the issue is that we are asking the target for an optab > for movmisaling with the wrong mode and then create a movmisalign > with a wrong mode. > > Index: gcc/expr.c > =================================================================== > --- gcc/expr.c (revision 205623) > +++ gcc/expr.c (working copy) > @@ -9737,7 +9737,7 @@ expand_expr_real_1 (tree exp, rtx target > && mode != BLKmode > && align < GET_MODE_ALIGNMENT (mode)) > { > - if ((icode = optab_handler (movmisalign_optab, mode)) > + if ((icode = optab_handler (movmisalign_optab, tmode)) > != CODE_FOR_nothing) > { > struct expand_operand ops[2]; > @@ -9745,7 +9745,7 @@ expand_expr_real_1 (tree exp, rtx target > /* We've already validated the memory, and we're creating a > new pseudo destination. The predicates really can't fail, > nor can the generator. */ > - create_output_operand (&ops[0], NULL_RTX, mode); > + create_output_operand (&ops[0], NULL_RTX, tmode); > create_fixed_operand (&ops[1], temp); > expand_insn (icode, 2, ops); > temp = ops[0].value; > @@ -9966,7 +9966,7 @@ expand_expr_real_1 (tree exp, rtx target > != INTEGER_CST) > && modifier != EXPAND_STACK_PARM > ? target : NULL_RTX), > - VOIDmode, > + mode1, > modifier == EXPAND_SUM ? EXPAND_NORMAL : modifier); > > /* If the bitfield is volatile, we want to access it in the > > fixes the testcases (the VIEW_CONVERT_EXPR path would need a > similar fix I think, both for the recursion and the movmisaling handling). > Note that the alignment checks guarding the movmisalign handling > use the wrong mode as well, they are also somewhat non-sensical > as they apply to the non-offsetted object. > > That said, the "get me a 'base' MEM for this" via recursing is misdesigned. > I still think that splitting it out is the correct fix in the end (doing the > movmisaling handling only there, but with correct info from the parent). > > I will try if I can come up with something that feels reasonably safe > for stage3. >
Fine. What is tmode and mode in the test example? Note: we can run in the same problem if this is executed: else if (SLOW_UNALIGNED_ACCESS (mode, align)) temp = extract_bit_field (temp, GET_MODE_BITSIZE (mode), 0, TYPE_UNSIGNED (TREE_TYPE (exp)), (modifier == EXPAND_STACK_PARM ? NULL_RTX : target), mode, mode); maybe with %s/mode/tmode/g Maybe just #define SLOW_UNALIGNED_ACCESS 1 on X86-64 to see what might happen on a different target.... Thanks Bernd.