Re: [PATCH][GCC][Arm] Fix subreg crash in different way by enabling the FP16 pattern unconditionally.

Thomas Preudhomme Thu, 26 Jul 2018 04:34:13 -0700

On Thu, 26 Jul 2018 at 12:01, Tamar Christina <tamar.christ...@arm.com> wrote:
>
> Hi Thomas,
>
> > -----Original Message-----
> > From: Thomas Preudhomme <thomas.preudho...@linaro.org>
> > Sent: Thursday, July 26, 2018 09:29
> > To: Tamar Christina <tamar.christ...@arm.com>
> > Cc: gcc-patches@gcc.gnu.org; nd <n...@arm.com>; Ramana Radhakrishnan
> > <ramana.radhakrish...@arm.com>; Richard Earnshaw
> > <richard.earns...@arm.com>; ni...@redhat.com; Kyrylo Tkachov
> > <kyrylo.tkac...@arm.com>
> > Subject: Re: [PATCH][GCC][Arm] Fix subreg crash in different way by
> > enabling the FP16 pattern unconditionally.
> >
> > Hi Tamar,
> >
> > On Wed, 25 Jul 2018 at 16:28, Tamar Christina <tamar.christ...@arm.com>
> > wrote:
> > >
> > > Hi Thomas,
> > >
> > > Thanks for the review!
> > >
> > > > >
> > > > > I don't believe the TARGET_FP16 guard to be needed, because the
> > > > > pattern doesn't actually generate code and requires another
> > > > > pattern for that, and a reg to reg move should always be possible
> > > > > anyway. So allowing the force to register here is safe and it
> > > > > allows the compiler to generate a correct error instead of ICEing in 
> > > > > an
> > infinite loop.
> > > >
> > > > How about subreg to subreg move? Doesn't that expand to more insns
> > > > (subreg to reg and reg to subreg)? Couldn't you improve the logic to
> > > > check that there is actually a mode change so that if there isn't
> > > > (like moving from one subreg to another) just expand to a single move?
> > > >
> > >
> > > Yes, but that is not a new issue. My patch is simply removing the
> > > TARGET_FP16 restrictions and merging two patterns that should be one
> > using an iterator and nothing more.
> > >
> > > The redundant mov is already there and a different issue than the ICE I'm
> > trying to fix.
> >
> > It's there for movv4hf and movv6hf but your patch extends this problem to
> > movv2sf and movv4sf as well.
>
> I don't understand how it can. My patch just replaces one pattern for V4HF and
> one for V8HF with one pattern operating on VH.
>
> ;; Vector modes for 16-bit floating-point support.
> (define_mode_iterator VH [V8HF V4HF])
>
> My pattern has absolutely no effect on V2SF and V4SF or any of the other 
> modes.


My bad, I was looking at VF.

>
> >
> > >
> > > None of the code inside the expander is needed at all, the code really
> > > only has an effect on subreg to subreg moves, as `force_reg` doesn't do
> > anything when it's argument is already a reg.
> > >
> > > The comment in the expander (which was already there) is wrong. The
> > > *reason* the ICE is fixed isn't because of the `force_reg`. It's
> > > because of the mere presence of the expander itself. The expander
> > > matches the standard mov$a optab and so this prevents
> > emit_move_insn_1 from doing the move by subwords as it finds a pattern
> > that's able to do the move.
> >
> > Could you then fix the comment in your patch as well? I hadn't understood
> > the force_reg was not key here. You might want to update the following
> > sentence from your patch description if you are going to include it in your
> > commit message:
>
> I'll update the comment in the patch. The cover letter won't be included in 
> the commit,
> But it does accurately reflect the current state of affairs. The patch will 
> do the force_reg,
> It's just not the reason it works.

Understood.

>
> >
> > The way this is worked around in the back-end is that we have move
> > patterns in neon.md that usually just force the register instead of checking
> > with the back-end.
> >
> > "The way this is worked around (..) that just force the register" is what 
> > led
> > me to believe the force_reg was important.
> >
> > >
> > > The expander however always falls through and doesn’t stop RTL
> > > generation. You could remove all the code in there and have it
> > > properly match the *neon_mov instructions which will do the right
> > > thing later at code generation time and avoid the redundant moves.  My
> > guess is the original `force_reg` was copied from the other patterns like
> > `movti` and the existing `mov<mode>`. There It makes sense because the
> > operands can be MEM or anything general_operand.
> > >
> > > However the redundant moves are a different problem than what I'm
> > > trying to solve here. So I think that's another patch which requires 
> > > further
> > testing.
> >
> > I was just thinking of restricting when does the force_reg happens but if it
> > can be removed completely I agree it should probably be done in a separate
> > patch.
> >
> > Oh by the way, is there something that prevent those expander to ever be
> > used with a memory operand? Because the GCC internals contains the
> > following piece for mov standard pattern (bold marks added by me):
> >
> > "Second, these patterns are not used solely in the RTL generation pass. Even
> > the reload pass can generate move insns to copy values from stack slots into
> > temporary registers. When it does so, one of the operands is a hard register
> > and the other is an operand that can need to be reloaded into a register.
> > Therefore, when given such a pair of operands, the pattern must generate
> > RTL which needs no reloading and needs no temporary registers—no
> > registers other than the operands. For example, if you support the pattern
> > with a define_ expand, then in such a case the define_expand *mustn’t call
> > force_reg* or any other such function which might generate new pseudo
> > registers."
>
> When used during expand the operand predicates are checked, this prevents 
> that case.
> When used during reload can_create_pseudo_p () guards the actions in the 
> patterns.
>
> gcc/rtl.h:#define can_create_pseudo_p() (!reload_in_progress && 
> !reload_completed)
>
> So during or after reload it would just be an empty fall through pattern, so 
> it won't do
> anything to MEM or anything else.

Of course. LGTM with the mentioned changes to the comment. This is not
an approval though as I'm not a maintainer.

Best regards,

Thomas

>
> Regards,
> Tamar
>
> >
> > Best regards,
> >
> > Thomas
> >
> > >
> > > Regards,
> > > Tamar
> > >
> > > > Best regards,
> > > >
> > > > Thomas
> > > >
> > > > >
> > > > > This patch ensures gcc.target/arm/big-endian-subreg.c is fixed
> > > > > without introducing any regressions while fixing
> > > > >
> > > > > gcc.dg/vect/vect-nop-move.c execution test
> > > > > g++.dg/torture/vshuf-v2si.C   -O3 -g  execution test
> > > > > g++.dg/torture/vshuf-v4si.C   -O3 -g  execution test
> > > > > g++.dg/torture/vshuf-v8hi.C   -O3 -g  execution test
> > > > >
> > > > > Regtested on armeb-none-eabi and no regressions.
> > > > > Bootstrapped on arm-none-linux-gnueabihf and no issues.
> > > > >
> > > > >
> > > > > Ok for trunk?
> > > > >
> > > > > Thanks,
> > > > > Tamar
> > > > >
> > > > > gcc/
> > > > > 2018-07-23  Tamar Christina  <tamar.christ...@arm.com>
> > > > >
> > > > >         PR target/84711
> > > > >         * config/arm/arm.c (arm_can_change_mode_class): Disallow
> > subreg.
> > > > >         * config/arm/neon.md (movv4hf, movv8hf): Refactored to..
> > > > >         (mov<mov>): ..this and enable unconditionally.
> > > > >
> > > > > --

Re: [PATCH][GCC][Arm] Fix subreg crash in different way by enabling the FP16 pattern unconditionally.

Reply via email to