> -----Original Message-----
> From: Gcc-patches <gcc-patches-boun...@gcc.gnu.org> On Behalf Of Richard
> Biener
> Sent: 07 May 2020 14:40
> To: Richard Sandiford <richard.sandif...@arm.com>
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH][RFC] extend DECL_GIMPLE_REG_P to all types
> 
> On Thu, 23 Apr 2020, Richard Biener wrote:
> 
> > On Wed, 22 Apr 2020, Richard Sandiford wrote:
> >
> > > Richard Biener <rguent...@suse.de> writes:
> > > > On Wed, 22 Apr 2020, Richard Biener wrote:
> > > >
> > > >>
> > > >> This extends DECL_GIMPLE_REG_P to all types so we can clear
> > > >> TREE_ADDRESSABLE even for integers with partial defs, not just
> > > >> complex and vector variables.  To make that transition easier
> > > >> the patch inverts DECL_GIMPLE_REG_P to DECL_NOT_GIMPLE_REG_P
> > > >> since that makes the default the current state for all other
> > > >> types besides complex and vectors.  That also nicely simplifies
> > > >> code throughout the compiler.
> > > >>
> > > >> TREE_ADDRESSABLE and DECL_NOT_GIMPLE_REG_P are now truly
> > > >> independent, either set prevents a decl from being rewritten
> > > >> into SSA form.
> > > >>
> > > >> For the testcase in PR94703 we're able to expand the partial
> > > >> def'ed local integer to a register then, producing a single
> > > >> movl rather than going through the stack.
> > > >>
> > > >> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> > > >>
> > > >> If there are no objections I'm going to install this once
> > > >> stage1 opens.
> > > >
> > > > Of course there was some fallout.  On 32bit x86
> gcc.dg/torture/pr71522.c
> > > > fails execution because while the GIMPLE is unchanged at RTL
> expansion
> > > > time:
> > > >
> > > > main ()
> > > > {
> > > >   char s[12];
> > > >   long double d;
> > > >
> > > >   MEM <unsigned char[12]> [(char * {ref-all})&d] = MEM <unsigned
> char[12]>
> > > > [(char * {ref-all})"AAAAAAAAAAA"];
> > > >   MEM <unsigned char[12]> [(char * {ref-all})&s] = MEM <unsigned
> char[12]>
> > > > [(char * {ref-all})&d];
> > > >   _1 = __builtin_strcmp (&s, "AAAAAAAAAAA");
> > > >   if (_1 != 0)
> > > > ...
> > > >
> > > > we now assign 'd' an XFmode register (TREE_ADDRESSABLE is cleared
> > > > now since we can set DECL_NOT_GIMPLE_REG_P).  The case is lost
> > > > then, impossible to fix up AFAICS.  On x86 all moves to/from
> > > > XFmode are normalizing, specifically we end up with
> > > >
> > > >         fldt    .LC0
> > > >         fstpt   (%esp)
> > > >
> > > > now the most appealing solution - and totally in the opposite
> > > > direction of this patch - is to simply stop expanding non-SSA names
> > > > as pseudos.  I do not remember the history as why we do this
> > > > but it's likely remanents we preserved from either pre-SSA, times
> > > > we did not go into SSA for -O0 or times we really gone out-of-SSA.
> > > >
> > > > There is _some_ good reason to expand a non-SSA "register" into
> > > > a pseudo though - namely that RTL is not SSA and thus can accept
> > > > partial defs.  And of course that RTL cannot get rid of a stack
> > > > slot assigned to a variable.  Today we have somewhat robust
> > > > infrastructure to deal with partial defs on GIMPLE, namely
> > > > BIT_INSERT_EXPR, but it's not fully exercised.
> > >
> > > Yeah, not being able to get rid of the stack slot seems
> > > worrying here.
> > >
> > > > It's of course possible to fixup the above problematical
> > > > cases (there's precenent with discover_nonconstant_array_refs,
> > > > which could be "easily" extended to handle "weird" accesses
> > > > of non-integral-mode variables) but with the recent discussion
> > > > on making RTL expansion more straight-forward I'd bring up
> > > > the above idea ... it would get rid of quite some special
> > > > code dealing with tcc_reference trees (and MEM_REFs) ending
> > > > up operating on registers.
> > >
> > > It might be nice to do it eventually, but I think at least
> > > is_gimple_reg_type would need to be "return true" first,
> > > otherwise we'll lose too much on aggregates.
> > >
> > > There's also the problem that things passed in registers do need
> > > to be RTL registers at function boundaries, so I'm not sure all
> > > the expand code would necessarily go away.
> > >
> > > Wouldn't want to see all targets suffer for XFmode oddities :-)
> >
> > OK, so here's the patch amemded with some heuristics to catch
> > this.  The heuristic triggers exactly on the previously
> > failing testcase and nothing else on a x86_64 bootstrap and regtest.
> > Citing the code:
> >
> > /* If there's a chance to get a pseudo for t then if it would be of
> float
> > mode
> >    and the actual access is via an integer mode (lowered memcpy or
> similar
> >    access) then avoid the register expansion if the mode likely is not
> > storage
> >    suitable for raw bits processing (like XFmode on i?86).  */
> >
> > static void
> > avoid_type_punning_on_regs (tree t)
> > {
> >   machine_mode access_mode = TYPE_MODE (TREE_TYPE (t));
> >   if (access_mode != BLKmode
> >       && !SCALAR_INT_MODE_P (access_mode))
> >     return;
> >   tree base = get_base_address (t);
> >   if (DECL_P (base)
> >       && !TREE_ADDRESSABLE (base)
> >       && FLOAT_MODE_P (DECL_MODE (base))
> >       && maybe_lt (GET_MODE_PRECISION (DECL_MODE (base)),
> >                    GET_MODE_BITSIZE (GET_MODE_INNER (DECL_MODE
> (base))))
> >       /* Double check in the expensive way we really would get a
> pseudo.
> > */
> >       && use_register_for_decl (base))
> >     TREE_ADDRESSABLE (base) = 1;
> > }
> >
> > invoked on stores like
> >
> >             if (gimple_vdef (stmt))
> >               {
> >                 tree t = gimple_get_lhs (stmt);
> >                 if (t && REFERENCE_CLASS_P (t))
> >                   avoid_type_punning_on_regs (t);
> >               }
> >
> > loads are not an issue on their own.  So the basic idea is to rule
> > out float-mode pseudos which have less precision than their mode
> > but where we store to with integer-like modes.  A but hand-wavy
> > since similar issues might exist with partial integer modes as
> storage(?),
> > but we shouldn't really need this at all in an ideal world.
> >
> > You could check whether this triggers on arm with the respective
> > long double testcase.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, queued for stage1.
> 
> I have now pushed this.
> 
> Richard.

Hi Richard,

This introduces a wrong code bug on AArch64:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95526

Thanks,
Alex

Reply via email to