RE: [PATCH][GCC][Arm] Fix subreg crash in different way by enabling the FP16 pattern unconditionally.

Tamar Christina Mon, 13 Aug 2018 02:31:43 -0700

Ping


> -----Original Message-----
> From: gcc-patches-ow...@gcc.gnu.org <gcc-patches-ow...@gcc.gnu.org>
> On Behalf Of Tamar Christina
> Sent: Tuesday, July 31, 2018 10:47
> To: gcc-patches@gcc.gnu.org
> Cc: nd <n...@arm.com>; Ramana Radhakrishnan
> <ramana.radhakrish...@arm.com>; Richard Earnshaw
> <richard.earns...@arm.com>; ni...@redhat.com; Kyrylo Tkachov
> <kyrylo.tkac...@arm.com>; Thomas Preudhomme
> <thomas.preudho...@linaro.org>
> Subject: RE: [PATCH][GCC][Arm] Fix subreg crash in different way by
> enabling the FP16 pattern unconditionally.
> 
> Ping 😊
> 
> > -----Original Message-----
> > From: Thomas Preudhomme <thomas.preudho...@linaro.org>
> > Sent: Thursday, July 26, 2018 12:29
> > To: Tamar Christina <tamar.christ...@arm.com>
> > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; Ramana Radhakrishnan
> > <ramana.radhakrish...@arm.com>; Richard Earnshaw
> > <richard.earns...@arm.com>; ni...@redhat.com; Kyrylo Tkachov
> > <kyrylo.tkac...@arm.com>
> > Subject: Re: [PATCH][GCC][Arm] Fix subreg crash in different way by
> > enabling the FP16 pattern unconditionally.
> >
> > On Thu, 26 Jul 2018 at 12:01, Tamar Christina
> > <tamar.christ...@arm.com>
> > wrote:
> > >
> > > Hi Thomas,
> > >
> > > > -----Original Message-----
> > > > From: Thomas Preudhomme <thomas.preudho...@linaro.org>
> > > > Sent: Thursday, July 26, 2018 09:29
> > > > To: Tamar Christina <tamar.christ...@arm.com>
> > > > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; Ramana
> > Radhakrishnan
> > > > <ramana.radhakrish...@arm.com>; Richard Earnshaw
> > > > <richard.earns...@arm.com>; ni...@redhat.com; Kyrylo Tkachov
> > > > <kyrylo.tkac...@arm.com>
> > > > Subject: Re: [PATCH][GCC][Arm] Fix subreg crash in different way
> > > > by enabling the FP16 pattern unconditionally.
> > > >
> > > > Hi Tamar,
> > > >
> > > > On Wed, 25 Jul 2018 at 16:28, Tamar Christina
> > > > <tamar.christ...@arm.com>
> > > > wrote:
> > > > >
> > > > > Hi Thomas,
> > > > >
> > > > > Thanks for the review!
> > > > >
> > > > > > >
> > > > > > > I don't believe the TARGET_FP16 guard to be needed, because
> > > > > > > the pattern doesn't actually generate code and requires
> > > > > > > another pattern for that, and a reg to reg move should
> > > > > > > always be possible anyway. So allowing the force to register
> > > > > > > here is safe and it allows the compiler to generate a
> > > > > > > correct error instead of ICEing in an
> > > > infinite loop.
> > > > > >
> > > > > > How about subreg to subreg move? Doesn't that expand to more
> > > > > > insns (subreg to reg and reg to subreg)? Couldn't you improve
> > > > > > the logic to check that there is actually a mode change so
> > > > > > that if there isn't (like moving from one subreg to another)
> > > > > > just expand to
> > a single move?
> > > > > >
> > > > >
> > > > > Yes, but that is not a new issue. My patch is simply removing
> > > > > the
> > > > > TARGET_FP16 restrictions and merging two patterns that should be
> > > > > one
> > > > using an iterator and nothing more.
> > > > >
> > > > > The redundant mov is already there and a different issue than
> > > > > the ICE I'm
> > > > trying to fix.
> > > >
> > > > It's there for movv4hf and movv6hf but your patch extends this
> > > > problem to movv2sf and movv4sf as well.
> > >
> > > I don't understand how it can. My patch just replaces one pattern
> > > for V4HF and one for V8HF with one pattern operating on VH.
> > >
> > > ;; Vector modes for 16-bit floating-point support.
> > > (define_mode_iterator VH [V8HF V4HF])
> > >
> > > My pattern has absolutely no effect on V2SF and V4SF or any of the
> > > other
> > modes.
> >
> > My bad, I was looking at VF.
> >
> > >
> > > >
> > > > >
> > > > > None of the code inside the expander is needed at all, the code
> > > > > really only has an effect on subreg to subreg moves, as
> > > > > `force_reg` doesn't do
> > > > anything when it's argument is already a reg.
> > > > >
> > > > > The comment in the expander (which was already there) is wrong.
> > > > > The
> > > > > *reason* the ICE is fixed isn't because of the `force_reg`. It's
> > > > > because of the mere presence of the expander itself. The
> > > > > expander matches the standard mov$a optab and so this prevents
> > > > emit_move_insn_1 from doing the move by subwords as it finds a
> > > > pattern that's able to do the move.
> > > >
> > > > Could you then fix the comment in your patch as well? I hadn't
> > > > understood the force_reg was not key here. You might want to
> > > > update the following sentence from your patch description if you
> > > > are going to include it in your commit message:
> > >
> > > I'll update the comment in the patch. The cover letter won't be
> > > included in the commit, But it does accurately reflect the current
> > > state of affairs. The patch will do the force_reg, It's just not the
> > > reason it
> > works.
> >
> > Understood.
> >
> > >
> > > >
> > > > The way this is worked around in the back-end is that we have move
> > > > patterns in neon.md that usually just force the register instead
> > > > of checking with the back-end.
> > > >
> > > > "The way this is worked around (..) that just force the register"
> > > > is what led me to believe the force_reg was important.
> > > >
> > > > >
> > > > > The expander however always falls through and doesn’t stop RTL
> > > > > generation. You could remove all the code in there and have it
> > > > > properly match the *neon_mov instructions which will do the
> > > > > right thing later at code generation time and avoid the redundant
> moves.
> > > > > My
> > > > guess is the original `force_reg` was copied from the other
> > > > patterns like `movti` and the existing `mov<mode>`. There It makes
> > > > sense because the operands can be MEM or anything general_operand.
> > > > >
> > > > > However the redundant moves are a different problem than what
> > > > > I'm trying to solve here. So I think that's another patch which
> > > > > requires further
> > > > testing.
> > > >
> > > > I was just thinking of restricting when does the force_reg happens
> > > > but if it can be removed completely I agree it should probably be
> > > > done in a separate patch.
> > > >
> > > > Oh by the way, is there something that prevent those expander to
> > > > ever be used with a memory operand? Because the GCC internals
> > > > contains the following piece for mov standard pattern (bold marks
> > > > added
> > by me):
> > > >
> > > > "Second, these patterns are not used solely in the RTL generation
> > > > pass. Even the reload pass can generate move insns to copy values
> > > > from stack slots into temporary registers. When it does so, one of
> > > > the operands is a hard register and the other is an operand that
> > > > can need
> > to be reloaded into a register.
> > > > Therefore, when given such a pair of operands, the pattern must
> > > > generate RTL which needs no reloading and needs no temporary
> > > > registers—no registers other than the operands. For example, if
> > > > you support the pattern with a define_ expand, then in such a case
> > > > the define_expand *mustn’t call
> > > > force_reg* or any other such function which might generate new
> > > > pseudo registers."
> > >
> > > When used during expand the operand predicates are checked, this
> > prevents that case.
> > > When used during reload can_create_pseudo_p () guards the actions in
> > the patterns.
> > >
> > > gcc/rtl.h:#define can_create_pseudo_p() (!reload_in_progress &&
> > > !reload_completed)
> > >
> > > So during or after reload it would just be an empty fall through
> > > pattern, so it won't do anything to MEM or anything else.
> >
> > Of course. LGTM with the mentioned changes to the comment. This is not
> > an approval though as I'm not a maintainer.
> >
> > Best regards,
> >
> > Thomas
> >
> > >
> > > Regards,
> > > Tamar
> > >
> > > >
> > > > Best regards,
> > > >
> > > > Thomas
> > > >
> > > > >
> > > > > Regards,
> > > > > Tamar
> > > > >
> > > > > > Best regards,
> > > > > >
> > > > > > Thomas
> > > > > >
> > > > > > >
> > > > > > > This patch ensures gcc.target/arm/big-endian-subreg.c is
> > > > > > > fixed without introducing any regressions while fixing
> > > > > > >
> > > > > > > gcc.dg/vect/vect-nop-move.c execution test
> > > > > > > g++.dg/torture/vshuf-v2si.C   -O3 -g  execution test
> > > > > > > g++.dg/torture/vshuf-v4si.C   -O3 -g  execution test
> > > > > > > g++.dg/torture/vshuf-v8hi.C   -O3 -g  execution test
> > > > > > >
> > > > > > > Regtested on armeb-none-eabi and no regressions.
> > > > > > > Bootstrapped on arm-none-linux-gnueabihf and no issues.
> > > > > > >
> > > > > > >
> > > > > > > Ok for trunk?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Tamar
> > > > > > >
> > > > > > > gcc/
> > > > > > > 2018-07-23  Tamar Christina  <tamar.christ...@arm.com>
> > > > > > >
> > > > > > >         PR target/84711
> > > > > > >         * config/arm/arm.c (arm_can_change_mode_class):
> > > > > > > Disallow
> > > > subreg.
> > > > > > >         * config/arm/neon.md (movv4hf, movv8hf): Refactored to..
> > > > > > >         (mov<mov>): ..this and enable unconditionally.
> > > > > > >
> > > > > > > --

RE: [PATCH][GCC][Arm] Fix subreg crash in different way by enabling the FP16 pattern unconditionally.

Reply via email to