On Fri, Oct 02, 2015 at 02:04:48PM -0500, Peter Bergner wrote:
> PR67808 exposes a problem with the constraints in the *extenddftf2_internal
> pattern, in that it allows TFmode operands to occupy Altivec registers
> which they are not allowed to do. Reload was able to work around the
> problem, but LRA is more pedantic and it caused it to go into an infinite
> spill loop until it ICEd. The following patch from Mike changes the TFmode
> output operand to use the "d" constraint instead of "ws". It also allows
> using the "ws" constraint for the two input operands, since that is allowed
> for DFmode operands.
>
> This passed bootstraps (with reload on by default and lra on by default)
> and shows no testsuite regressions. Is this ok for trunk?
>
> The bug is also present in the FSF 5 branch (4.9 is ok), is this ok for
> that too, assuming my bootstrap/regtesting there are clean?
>
> Peter
>
>
> gcc/
>         PR target/67808
>         * config/rs6000/rs6000.md (*extenddftf2_internal): Fix constraints.
>
> gcc/testsuite/
>
>         * gcc.target/powerpc/pr67808.c: New test.
>

In looking at the constraints in more detail, after the patch we have the
following alternatives:

    #1: op0 = m,  op1 = ws, op2 = ws
    #2: op0 = Y,  op1 = r,  op2 = r
    #3: op0 = d,  op1 = md, op2 = j
    #4: op0 = d,  op1 = md, op2 = m
    #5: op0 = &d, op1 = md, op2 = ws

I.e.

    #1: Store result, input in VSX register, 0.0 in VSX register (VSX only)
    #2: Store result, input in GPR register, 0.0 in GPR register
    #3: Result in FPR register, input in FPR or memory, 0.0 direct (VSX only)
    #4: Result in FPR register, input in FPR or memory, 0.0 in memory
    #5: Result in FPR reg (no overlap), input in FPR/memory, 0.0 in VSX reg

So, the non-VSX case (where ws is NO_REGS) only deals with alternatives #2
and #4.

I think (but I don't have a test case) that alternative #1 is potentially a
problem if the input register is ever allocated to an Altivec register and
the address mode is reg+offset (in which case we would not be able to form
the address after the insn is split post-reload).

I have attached a better version of the patch.  This gives the constraints:

    #1: op0 = m,  op1 = d,  op2 = d
    #2: op0 = Y,  op1 = r,  op2 = r
    #3: op0 = d,  op1 = ws, op2 = j
    #4: op0 = d,  op1 = md, op2 = m
    #5: op0 = &d, op1 = md, op2 = md

I.e.

    #1: Store result, input in FPR register, 0.0 in FPR register
    #2: Store result, input in GPR register, 0.0 in GPR register
    #3: Result in FPR reg, input in VSX reg, 0.0 direct (VSX only)
    #4: Result in FPR reg, input in FPR/memory, 0.0 in memory
    #5: Result in FPR reg, input in FPR/memory, 0.0 in FPR/memory (no overlap)

[gcc]
2015-10-05  Peter Bergner <berg...@vnet.ibm.com>
            Michael Meissner <meiss...@linux.vnet.ibm.com>

        PR target/67808
        * config/rs6000/rs6000.md (extenddftf2_internal): Fix up
        constraints.

[gcc/testsuite]
2015-10-05  Peter Bergner <berg...@vnet.ibm.com>

        PR target/67808
        * gcc.target/powerpc/pr67808.c: New test.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 228495)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -6505,9 +6505,9 @@ (define_expand "extenddftf2_fprs"
 })
 
 (define_insn_and_split "*extenddftf2_internal"
-  [(set (match_operand:TF 0 "nonimmediate_operand" "=m,Y,ws,d,&d")
-        (float_extend:TF (match_operand:DF 1 "input_operand" "d,r,md,md,md")))
-   (use (match_operand:DF 2 "zero_reg_mem_operand" "d,r,j,m,d"))]
+  [(set (match_operand:TF 0 "nonimmediate_operand" "=m,Y,d,d,&d")
+        (float_extend:TF (match_operand:DF 1 "input_operand" "d,r,ws,md,md")))
+   (use (match_operand:DF 2 "zero_reg_mem_operand" "d,r,j,m,md"))]
   "!TARGET_IEEEQUAD
    && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
    && TARGET_LONG_DOUBLE_128"
Index: gcc/testsuite/gcc.target/powerpc/pr67808.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr67808.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr67808.c	(working copy)
@@ -0,0 +1,12 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O1 -mvsx -mlra" } */
+
+/* PR 67808: LRA ICEs on simple double to long double conversion test case */
+
+void
+foo (long double *ldb1, double *db1)
+{
+  *ldb1 = *db1;
+}
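
As an aside for anyone less used to multi-alternative constraints: the
three constraint strings in the rs6000.md hunk are read column-wise, so
alternative #N is made up of the Nth comma-separated entry of each operand.
The C sketch below only illustrates that reading.  The helper name
get_alternative is hypothetical and is not part of GCC; it simply splits a
constraint string on commas and skips the leading '=' output modifier.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a GCC function.  Copy alternative number ALT
   (0-based) of the comma-separated constraint string CONSTRAINTS into OUT,
   skipping a leading '=' output modifier.  A per-alternative '&' is kept.
   Return 1 on success, 0 if the alternative does not exist or does not
   fit in OUTLEN bytes.  */
int
get_alternative (const char *constraints, int alt, char *out, size_t outlen)
{
  const char *p = constraints;
  size_t len;

  assert (constraints != NULL && out != NULL);

  if (*p == '=')
    p++;

  /* Step over ALT commas to reach the start of the wanted entry.  */
  while (alt-- > 0)
    {
      p = strchr (p, ',');
      if (p == NULL)
        return 0;
      p++;
    }

  /* The entry runs up to the next comma or the end of the string.  */
  len = strcspn (p, ",");
  if (len + 1 > outlen)
    return 0;

  memcpy (out, p, len);
  out[len] = '\0';
  return 1;
}
```

For example, calling it with alt set to 4 on "=m,Y,d,d,&d", "d,r,ws,md,md"
and "d,r,j,m,md" picks out &d, md and md, which is the fifth alternative of
the patched pattern.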