On Thu, 26 Jul 2018 at 12:01, Tamar Christina <tamar.christ...@arm.com> wrote: > > Hi Thomas, > > > -----Original Message----- > > From: Thomas Preudhomme <thomas.preudho...@linaro.org> > > Sent: Thursday, July 26, 2018 09:29 > > To: Tamar Christina <tamar.christ...@arm.com> > > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; Ramana Radhakrishnan > > <ramana.radhakrish...@arm.com>; Richard Earnshaw > > <richard.earns...@arm.com>; ni...@redhat.com; Kyrylo Tkachov > > <kyrylo.tkac...@arm.com> > > Subject: Re: [PATCH][GCC][Arm] Fix subreg crash in different way by > > enabling the FP16 pattern unconditionally. > > > > Hi Tamar, > > > > On Wed, 25 Jul 2018 at 16:28, Tamar Christina <tamar.christ...@arm.com> > > wrote: > > > > > > Hi Thomas, > > > > > > Thanks for the review! > > > > > > > > > > > > > I don't believe the TARGET_FP16 guard to be needed, because the > > > > > pattern doesn't actually generate code and requires another > > > > > pattern for that, and a reg to reg move should always be possible > > > > > anyway. So allowing the force to register here is safe and it > > > > > allows the compiler to generate a correct error instead of ICEing in > > > > > an > > infinite loop. > > > > > > > > How about subreg to subreg move? Doesn't that expand to more insns > > > > (subreg to reg and reg to subreg)? Couldn't you improve the logic to > > > > check that there is actually a mode change so that if there isn't > > > > (like moving from one subreg to another) just expand to a single move? > > > > > > > > > > Yes, but that is not a new issue. My patch is simply removing the > > > TARGET_FP16 restrictions and merging two patterns that should be one > > using an iterator and nothing more. > > > > > > The redundant mov is already there and a different issue than the ICE I'm > > trying to fix. > > > > It's there for movv4hf and movv6hf but your patch extends this problem to > > movv2sf and movv4sf as well. > > I don't understand how it can. My patch just replaces one pattern for V4HF and > one for V8HF with one pattern operating on VH. > > ;; Vector modes for 16-bit floating-point support. > (define_mode_iterator VH [V8HF V4HF]) > > My pattern has absolutely no effect on V2SF and V4SF or any of the other > modes.
My bad, I was looking at VF. > > > > > > > > > None of the code inside the expander is needed at all, the code really > > > only has an effect on subreg to subreg moves, as `force_reg` doesn't do > > anything when it's argument is already a reg. > > > > > > The comment in the expander (which was already there) is wrong. The > > > *reason* the ICE is fixed isn't because of the `force_reg`. It's > > > because of the mere presence of the expander itself. The expander > > > matches the standard mov$a optab and so this prevents > > emit_move_insn_1 from doing the move by subwords as it finds a pattern > > that's able to do the move. > > > > Could you then fix the comment in your patch as well? I hadn't understood > > the force_reg was not key here. You might want to update the following > > sentence from your patch description if you are going to include it in your > > commit message: > > I'll update the comment in the patch. The cover letter won't be included in > the commit, > But it does accurately reflect the current state of affairs. The patch will > do the force_reg, > It's just not the reason it works. Understood. > > > > > The way this is worked around in the back-end is that we have move > > patterns in neon.md that usually just force the register instead of checking > > with the back-end. > > > > "The way this is worked around (..) that just force the register" is what > > led > > me to believe the force_reg was important. > > > > > > > > The expander however always falls through and doesn’t stop RTL > > > generation. You could remove all the code in there and have it > > > properly match the *neon_mov instructions which will do the right > > > thing later at code generation time and avoid the redundant moves. My > > guess is the original `force_reg` was copied from the other patterns like > > `movti` and the existing `mov<mode>`. There It makes sense because the > > operands can be MEM or anything general_operand. > > > > > > However the redundant moves are a different problem than what I'm > > > trying to solve here. So I think that's another patch which requires > > > further > > testing. > > > > I was just thinking of restricting when does the force_reg happens but if it > > can be removed completely I agree it should probably be done in a separate > > patch. > > > > Oh by the way, is there something that prevent those expander to ever be > > used with a memory operand? Because the GCC internals contains the > > following piece for mov standard pattern (bold marks added by me): > > > > "Second, these patterns are not used solely in the RTL generation pass. Even > > the reload pass can generate move insns to copy values from stack slots into > > temporary registers. When it does so, one of the operands is a hard register > > and the other is an operand that can need to be reloaded into a register. > > Therefore, when given such a pair of operands, the pattern must generate > > RTL which needs no reloading and needs no temporary registers—no > > registers other than the operands. For example, if you support the pattern > > with a define_ expand, then in such a case the define_expand *mustn’t call > > force_reg* or any other such function which might generate new pseudo > > registers." > > When used during expand the operand predicates are checked, this prevents > that case. > When used during reload can_create_pseudo_p () guards the actions in the > patterns. > > gcc/rtl.h:#define can_create_pseudo_p() (!reload_in_progress && > !reload_completed) > > So during or after reload it would just be an empty fall through pattern, so > it won't do > anything to MEM or anything else. Of course. LGTM with the mentioned changes to the comment. This is not an approval though as I'm not a maintainer. Best regards, Thomas > > Regards, > Tamar > > > > > Best regards, > > > > Thomas > > > > > > > > Regards, > > > Tamar > > > > > > > Best regards, > > > > > > > > Thomas > > > > > > > > > > > > > > This patch ensures gcc.target/arm/big-endian-subreg.c is fixed > > > > > without introducing any regressions while fixing > > > > > > > > > > gcc.dg/vect/vect-nop-move.c execution test > > > > > g++.dg/torture/vshuf-v2si.C -O3 -g execution test > > > > > g++.dg/torture/vshuf-v4si.C -O3 -g execution test > > > > > g++.dg/torture/vshuf-v8hi.C -O3 -g execution test > > > > > > > > > > Regtested on armeb-none-eabi and no regressions. > > > > > Bootstrapped on arm-none-linux-gnueabihf and no issues. > > > > > > > > > > > > > > > Ok for trunk? > > > > > > > > > > Thanks, > > > > > Tamar > > > > > > > > > > gcc/ > > > > > 2018-07-23 Tamar Christina <tamar.christ...@arm.com> > > > > > > > > > > PR target/84711 > > > > > * config/arm/arm.c (arm_can_change_mode_class): Disallow > > subreg. > > > > > * config/arm/neon.md (movv4hf, movv8hf): Refactored to.. > > > > > (mov<mov>): ..this and enable unconditionally. > > > > > > > > > > --