Hi All, I'm looking for a backport of this patch to GCC8.
Ok for backport? Thanks, Tamar > -----Original Message----- > From: Kyrill Tkachov <kyrylo.tkac...@foss.arm.com> > Sent: Wednesday, August 15, 2018 15:25 > To: Tamar Christina <tamar.christ...@arm.com>; Thomas Preudhomme > <thomas.preudho...@linaro.org> > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; Ramana Radhakrishnan > <ramana.radhakrish...@arm.com>; Richard Earnshaw > <richard.earns...@arm.com>; ni...@redhat.com > Subject: Re: [PATCH][GCC][Arm] Fix subreg crash in different way by > enabling the FP16 pattern unconditionally. > > Hi Tamar, > > On 26/07/18 12:01, Tamar Christina wrote: > > Hi Thomas, > > > > > -----Original Message----- > > > From: Thomas Preudhomme <thomas.preudho...@linaro.org> > > > Sent: Thursday, July 26, 2018 09:29 > > > To: Tamar Christina <tamar.christ...@arm.com> > > > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; Ramana > Radhakrishnan > > > <ramana.radhakrish...@arm.com>; Richard Earnshaw > > > <richard.earns...@arm.com>; ni...@redhat.com; Kyrylo Tkachov > > > <kyrylo.tkac...@arm.com> > > > Subject: Re: [PATCH][GCC][Arm] Fix subreg crash in different way by > > > enabling the FP16 pattern unconditionally. > > > > > > Hi Tamar, > > > > > > On Wed, 25 Jul 2018 at 16:28, Tamar Christina > > > <tamar.christ...@arm.com> > > > wrote: > > > > > > > > Hi Thomas, > > > > > > > > Thanks for the review! > > > > > > > > > > > > > > > > I don't believe the TARGET_FP16 guard to be needed, because > > > > > > the pattern doesn't actually generate code and requires > > > > > > another pattern for that, and a reg to reg move should always > > > > > > be possible anyway. So allowing the force to register here is > > > > > > safe and it allows the compiler to generate a correct error > > > > > > instead of ICEing in an > > > infinite loop. > > > > > > > > > > How about subreg to subreg move? Doesn't that expand to more > > > > > insns (subreg to reg and reg to subreg)? Couldn't you improve > > > > > the logic to check that there is actually a mode change so that > > > > > if there isn't (like moving from one subreg to another) just expand to > a single move? > > > > > > > > > > > > > Yes, but that is not a new issue. My patch is simply removing the > > > > TARGET_FP16 restrictions and merging two patterns that should be > > > > one > > > using an iterator and nothing more. > > > > > > > > The redundant mov is already there and a different issue than the > > > > ICE I'm > > > trying to fix. > > > > > > It's there for movv4hf and movv6hf but your patch extends this > > > problem to movv2sf and movv4sf as well. > > > > I don't understand how it can. My patch just replaces one pattern for > > V4HF and one for V8HF with one pattern operating on VH. > > > > ;; Vector modes for 16-bit floating-point support. > > (define_mode_iterator VH [V8HF V4HF]) > > > > My pattern has absolutely no effect on V2SF and V4SF or any of the other > modes. > > > > > > > > > > > > > None of the code inside the expander is needed at all, the code > > > > really only has an effect on subreg to subreg moves, as > > > > `force_reg` doesn't do > > > anything when it's argument is already a reg. > > > > > > > > The comment in the expander (which was already there) is wrong. > > > > The > > > > *reason* the ICE is fixed isn't because of the `force_reg`. It's > > > > because of the mere presence of the expander itself. The expander > > > > matches the standard mov$a optab and so this prevents > > > emit_move_insn_1 from doing the move by subwords as it finds a > > > pattern that's able to do the move. > > > > > > Could you then fix the comment in your patch as well? I hadn't > > > understood the force_reg was not key here. You might want to update > > > the following sentence from your patch description if you are going > > > to include it in your commit message: > > > > I'll update the comment in the patch. The cover letter won't be > > included in the commit, But it does accurately reflect the current > > state of affairs. The patch will do the force_reg, It's just not the reason > > it > works. > > > > > > > > The way this is worked around in the back-end is that we have move > > > patterns in neon.md that usually just force the register instead of > > > checking with the back-end. > > > > > > "The way this is worked around (..) that just force the register" is > > > what led me to believe the force_reg was important. > > > > > > > > > > > The expander however always falls through and doesn’t stop RTL > > > > generation. You could remove all the code in there and have it > > > > properly match the *neon_mov instructions which will do the right > > > > thing later at code generation time and avoid the redundant moves. > > > > My > > > guess is the original `force_reg` was copied from the other patterns > > > like `movti` and the existing `mov<mode>`. There It makes sense > > > because the operands can be MEM or anything general_operand. > > > > > > > > However the redundant moves are a different problem than what I'm > > > > trying to solve here. So I think that's another patch which > > > > requires further > > > testing. > > > > > > I was just thinking of restricting when does the force_reg happens > > > but if it can be removed completely I agree it should probably be > > > done in a separate patch. > > > > > > Oh by the way, is there something that prevent those expander to > > > ever be used with a memory operand? Because the GCC internals > > > contains the following piece for mov standard pattern (bold marks added > by me): > > > > > > "Second, these patterns are not used solely in the RTL generation > > > pass. Even the reload pass can generate move insns to copy values > > > from stack slots into temporary registers. When it does so, one of > > > the operands is a hard register and the other is an operand that can need > to be reloaded into a register. > > > Therefore, when given such a pair of operands, the pattern must > > > generate RTL which needs no reloading and needs no temporary > > > registers—no registers other than the operands. For example, if you > > > support the pattern with a define_ expand, then in such a case the > > > define_expand *mustn’t call > > > force_reg* or any other such function which might generate new > > > pseudo registers." > > > > When used during expand the operand predicates are checked, this > prevents that case. > > When used during reload can_create_pseudo_p () guards the actions in > the patterns. > > > > gcc/rtl.h:#define can_create_pseudo_p() (!reload_in_progress && > > !reload_completed) > > > > So during or after reload it would just be an empty fall through > > pattern, so it won't do anything to MEM or anything else. > > > > This is ok for trunk. > Thanks for looking at it as well Thomas, > > Kyrill > > > Regards, > > Tamar > > > > > > > > Best regards, > > > > > > Thomas > > > > > > > > > > > Regards, > > > > Tamar > > > > > > > > > Best regards, > > > > > > > > > > Thomas > > > > > > > > > > > > > > > > > This patch ensures gcc.target/arm/big-endian-subreg.c is fixed > > > > > > without introducing any regressions while fixing > > > > > > > > > > > > gcc.dg/vect/vect-nop-move.c execution test > > > > > > g++.dg/torture/vshuf-v2si.C -O3 -g execution test > > > > > > g++.dg/torture/vshuf-v4si.C -O3 -g execution test > > > > > > g++.dg/torture/vshuf-v8hi.C -O3 -g execution test > > > > > > > > > > > > Regtested on armeb-none-eabi and no regressions. > > > > > > Bootstrapped on arm-none-linux-gnueabihf and no issues. > > > > > > > > > > > > > > > > > > Ok for trunk? > > > > > > > > > > > > Thanks, > > > > > > Tamar > > > > > > > > > > > > gcc/ > > > > > > 2018-07-23 Tamar Christina <tamar.christ...@arm.com> > > > > > > > > > > > > PR target/84711 > > > > > > * config/arm/arm.c (arm_can_change_mode_class): > > > > > > Disallow > > > subreg. > > > > > > * config/arm/neon.md (movv4hf, movv8hf): Refactored to.. > > > > > > (mov<mov>): ..this and enable unconditionally. > > > > > > > > > > > > --