On Tue, Jun 28, 2016 at 4:19 PM, Jakub Jelinek <ja...@redhat.com> wrote: > On Tue, Jun 28, 2016 at 12:26:55PM +0200, Jakub Jelinek wrote: >> On Tue, Jun 28, 2016 at 12:09:51PM +0200, Uros Bizjak wrote: >> > > So, this patch instead changes ix86_expand_vector_move, so that >> > > for SUBREGs it forces the SUBREG_REG into memory (or register if >> > > that fails, though I don't have a testcase for when that would happen), >> > > and just re-creates a SUBREG on the forced MEM (for whole size >> > > SUBREGs that is in the end just using different mode for the MEM). >> > >> > There can be issue with paradoxical subregs, since subreg mode can be >> > wider than original mode. >> >> For paradoxical subregs, the extra bits are undefined, whether it is SUBREG >> of a constant, REG or MEM, isn't that the case? Though I guess with MEM >> there is a risk that reading the undefined bits from mem will be beyond end >> of the data segment. It would really surprise me if something created a >> paradoxical SUBREG of a CONSTANT_P. >> Anyway, I can just always force_reg in that case, like: >> >> 2016-06-28 Jakub Jelinek <ja...@redhat.com> >> >> PR middle-end/71626 >> * config/i386/i386.c (ix86_expand_vector_move): For SUBREG of >> a constant, force its SUBREG_REG into memory or register instead >> of whole op1. >> >> * gcc.c-torture/execute/pr71626-1.c: New test. >> * gcc.c-torture/execute/pr71626-2.c: New test. > > This version passed bootstrap/regtest on x86_64-linux and i686-linux, is > this one ok, or should I test the previous one?
Uh, just missed this mail... Previous patch is OK, no need for beeing too paranoid here. Uros. >> --- gcc/config/i386/i386.c.jj 2016-06-27 14:50:51.000000000 +0200 >> +++ gcc/config/i386/i386.c 2016-06-28 10:51:18.444624190 +0200 >> @@ -19610,12 +19610,30 @@ ix86_expand_vector_move (machine_mode mo >> of the register, once we have that information we may be able >> to handle some of them more efficiently. */ >> if (can_create_pseudo_p () >> - && register_operand (op0, mode) >> && (CONSTANT_P (op1) >> || (SUBREG_P (op1) >> && CONSTANT_P (SUBREG_REG (op1)))) >> - && !standard_sse_constant_p (op1, mode)) >> - op1 = validize_mem (force_const_mem (mode, op1)); >> + && ((register_operand (op0, mode) >> + && !standard_sse_constant_p (op1, mode)) >> + /* ix86_expand_vector_move_misalign() does not like constants. */ >> + || (SSE_REG_MODE_P (mode) >> + && MEM_P (op0) >> + && MEM_ALIGN (op0) < align))) >> + { >> + if (SUBREG_P (op1)) >> + { >> + machine_mode imode = GET_MODE (SUBREG_REG (op1)); >> + rtx r = (paradoxical_subreg_p (op1) >> + ? NULL_RTX : force_const_mem (imode, SUBREG_REG (op1))); >> + if (r) >> + r = validize_mem (r); >> + else >> + r = force_reg (imode, SUBREG_REG (op1)); >> + op1 = simplify_gen_subreg (mode, r, imode, SUBREG_BYTE (op1)); >> + } >> + else >> + op1 = validize_mem (force_const_mem (mode, op1)); >> + } >> >> /* We need to check memory alignment for SSE mode since attribute >> can make operands unaligned. */ >> @@ -19626,13 +19643,8 @@ ix86_expand_vector_move (machine_mode mo >> { >> rtx tmp[2]; >> >> - /* ix86_expand_vector_move_misalign() does not like constants ... */ >> - if (CONSTANT_P (op1) >> - || (SUBREG_P (op1) >> - && CONSTANT_P (SUBREG_REG (op1)))) >> - op1 = validize_mem (force_const_mem (mode, op1)); >> - >> - /* ... nor both arguments in memory. */ >> + /* ix86_expand_vector_move_misalign() does not like both >> + arguments in memory. */ >> if (!register_operand (op0, mode) >> && !register_operand (op1, mode)) >> op1 = force_reg (mode, op1); > > Jakub