On Mon, Oct 5, 2015 at 6:36 PM, Michael Meissner <meiss...@linux.vnet.ibm.com> wrote:
> Ok, after spending the day on going down the rabbit hole of trying to
> optimize just about every, here are my patches.
>
> Note, I simplified the constraints to eliminate some rare possibilities,
> like optimizing converting from double to long double if the double
> happened to be in a GPR register and the long double value was stored in
> memory (but there never was an optimization for having the double value be
> in a GPR and the long double value to also be a GPR).
>
> I also separated the VSX case from the non-VSX case. This is to simplify
> things at the RTL level (non-VSX must load up 0.0 to put into the lower
> word, while VSX can use the XXLXOR instruction to clear the register).
>
> I dropped support in the insns for extending the DFmode value to TFmode
> that is located in memory directly. Now, the compiler builds the whole
> value in FPRs and then does the store. This simplifies the code somewhat,
> and SPE/ieeequad paths require the value to be in registers, which might
> lead to other lra bugs. In the case of just doing one conversion in
> straight line code, it just changes the register allocation somewhat
> (allocate 1 TFmode pseudo instead of 1-2 DFmode pseudos).
>
> I have bootstrapped the compiler on little endian power8 with no
> regressions. I have built the test case with various options (-mlra vs. no
> -mlra, 32-bit, 64-bit, power5/power6/power7/power8), and it all builds
> correctly.
>
> Is this patch ok to apply to the trunk?
>
> I would like to apply this patch to GCC 5.x as well. However, in doing the
> patch, this patch touches areas that I've been working on for IEEE 128-bit
> floating point support, and so the patch will need to be reworked for GCC
> 5.x. Is it ok to install in the trunk?
>
> In addition, I will need to modify this area again with the next IEEE
> 128-bit floating point support patch, but I wanted to separate this patch
> out so that it could be considered by itself, and back ported to GCC 5.x.
>
> [gcc]
> 2015-10-05  Michael Meissner  <meiss...@linux.vnet.ibm.com>
>             Peter Bergner  <berg...@vnet.ibm.com>
>
>         PR target/67808
>         * config/rs6000/rs6000.md (extenddftf2): In the expander, only
>         allow registers, but provide insns for the combiner to create for
>         loads from memory. Separate VSX code from non-VSX code. For
>         non-VSX code, combine extenddftf2_fprs into extenddftf2 and rename
>         externaldftf2_internal to externaldftf2_fprs. Reorder constraints
>         so that registers come before memory operations. Drop support from
>         converting DFmode to TFmode, if the DFmode value is in a GPR
>         register, and the TFmode is in memory.
>         (extenddftf2_fprs): Likewise.
>         (extenddftf2_internal): Likewise.
>         (extenddftf2_vsx): Likewise.
>         (extendsftf2): In the expander, only allow registers, but provide
>         insns for the combiner to create for stores and loads.
>
> [gcc/testsuite]
> 2015-10-05  Michael Meissner  <meiss...@linux.vnet.ibm.com>
>             Peter Bergner  <berg...@vnet.ibm.com>
>
>         PR target/67808
>         * gcc.target/powerpc/pr67808.c: New test.

This is okay for trunk, but can we hold off for GCC 5 and allow things to
settle?

Thanks, David
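
For readers following the thread without the patch at hand, the kind of source
code that exercises the conversion discussed above can be sketched in C. This
is only an illustration and is not the pr67808.c test from the patch; the
function names widen_double and store_widened are made up here. On a PowerPC
target where long double is TFmode, the casts below should be the
double-to-long-double extension that the extenddftf2 patterns handle, and the
second function is the "result goes to memory" case that, per the description
above, is now built in registers before the store.

```c
/* Hypothetical helpers, not taken from the patch or its test case. */

/* Widen a double to long double with the result returned in registers. */
long double widen_double(double d)
{
    return (long double) d;
}

/* Widen a double and store the long double result through a pointer,
   i.e. the destination of the conversion is in memory. */
void store_widened(double d, long double *out)
{
    *out = (long double) d;
}
```

Compiling something like this with and without -mlra, and for the different
-mcpu settings listed in the quoted message, would be one way to look at the
code generated for each path.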