Hi Mike, Thanks for doing this!
On Sun, Jan 21, 2018 at 07:03:58PM -0500, Michael Meissner wrote: > I've reworked the patch for PR83864. This bug is due to the compiler issuing > an internal error for code of the form on a little endian system: > > int sb (_Float128 *p, unsigned long n) { return __builtin_signbit (p[n]); > } > > The problem is that the memory optimization wants to load the high double word > from memory to isolate the signbit. In little endian, the high word is at > offset 8 instead of 0. > > The bug will also show up if you use the long double type, and you use the > -mabi=ieeelongdouble option. > > Previously the code did one UNSPEC (signbit<mode>2_dm) that was done before > register allocation, and combined the value being in vector register, memory, > and GPR register along with the shift to isolate the signbit. Then after the > split after register allocation, it created a second UNSPEC > (signbit<mode>2_dm2) that was just the direct move, or it did the memory and > GPR code path separately. > > With these patches, the generator code issues an UNSPEC (signbit<mode>2_dm) > just to get the high double word, and the shift right is then emitted. The > UNSPEC only handles the value being in either vector or GPR register. There > is > a second UNSPEC that is created by the combiner if the value is in memory. On > little endian systems, the first split pass (before register allocation) will > allocate a pseudo register to hold the initial ADD of the base and index > registers for indexed loads, and then forms a LD reg,8(tmp) to load the high > word. Just in case, some code after register allocation reforms the UNSPEC, > it > uses a base register for the load, and it can use the base register as needed > for the temporary. But does it have to do an unspec at all? Can't it just immediately take the relevant 64-bit half (during expand), no unspec in sight, and that is that? > I have run this code on both little endian and big endian power8 systems, > doing > bootstrap and regression test. There were no regressions. Can I install this > bug on the trunk? > > I don't believe the bug shows up in gcc 6/7 branches, but I will build these > and test to see if the code is needed to be backported. If it is needed on the branches, okay for there too (once 7 reopens). > [gcc] > 2018-01-21 Michael Meissner <meiss...@linux.vnet.ibm.com> > > PR target/83862 > * config/rs6000/rs6000-protos.h (rs6000_split_signbit): Delete, > no longer used. > * config/rs6000/rs6000.c (rs6000_split_signbit): Likewise. > * config/rs6000/rs6000.md (signbit<mode>2): Change code for IEEE > 128-bit to produce an UNSPEC move to get the double word with the > signbit and then a shift directly to do signbit. > (signbit<mode>2_dm): Replace old IEEE 128-bit signbit > implementation with a new version that just does either a direct > move or a regular move. Move memory interface to separate insns. > Move insns so they are next to the expander. > (signbit<mode>2_dm_mem_be): New combiner insns to combine load > with signbit move. Split big and little endian case. > (signbit<mode>2_dm_mem_le): Likewise. > (signbit<mode>2_dm_<su>ext): Delete, no longer used. > (signbit<mode>2_dm2): Likewise. > > [gcc/testsuite] > 2018-01-22 Michael Meissner <meiss...@linux.vnet.ibm.com> > > PR target/83862 > * gcc.target/powerpc/pr83862.c: New test. The patch is okay for trunk, but I still think this can be a lot simpler. > +;; We can't use SUBREG:DI of the IEEE 128-bit value before register > +;; allocation, because KF/TFmode aren't tieable with DImode. Also, having > the > +;; 64-bit scalar part in the high end of the register instead of the low end > +;; also complicates things. So instead, we use an UNSPEC to do the move of > the > +;; high double word to isolate the signbit. I'll think about this. Either way, thank you for your patience! And the patch :-) Segher