> -----Original Message----- > From: Gcc-patches <gcc-patches-boun...@gcc.gnu.org> On Behalf Of Richard > Biener > Sent: 07 May 2020 14:40 > To: Richard Sandiford <richard.sandif...@arm.com> > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH][RFC] extend DECL_GIMPLE_REG_P to all types > > On Thu, 23 Apr 2020, Richard Biener wrote: > > > On Wed, 22 Apr 2020, Richard Sandiford wrote: > > > > > Richard Biener <rguent...@suse.de> writes: > > > > On Wed, 22 Apr 2020, Richard Biener wrote: > > > > > > > >> > > > >> This extends DECL_GIMPLE_REG_P to all types so we can clear > > > >> TREE_ADDRESSABLE even for integers with partial defs, not just > > > >> complex and vector variables. To make that transition easier > > > >> the patch inverts DECL_GIMPLE_REG_P to DECL_NOT_GIMPLE_REG_P > > > >> since that makes the default the current state for all other > > > >> types besides complex and vectors. That also nicely simplifies > > > >> code throughout the compiler. > > > >> > > > >> TREE_ADDRESSABLE and DECL_NOT_GIMPLE_REG_P are now truly > > > >> independent, either set prevents a decl from being rewritten > > > >> into SSA form. > > > >> > > > >> For the testcase in PR94703 we're able to expand the partial > > > >> def'ed local integer to a register then, producing a single > > > >> movl rather than going through the stack. > > > >> > > > >> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. > > > >> > > > >> If there are no objections I'm going to install this once > > > >> stage1 opens. > > > > > > > > Of course there was some fallout. On 32bit x86 > gcc.dg/torture/pr71522.c > > > > fails execution because while the GIMPLE is unchanged at RTL > expansion > > > > time: > > > > > > > > main () > > > > { > > > > char s[12]; > > > > long double d; > > > > > > > > MEM <unsigned char[12]> [(char * {ref-all})&d] = MEM <unsigned > char[12]> > > > > [(char * {ref-all})"AAAAAAAAAAA"]; > > > > MEM <unsigned char[12]> [(char * {ref-all})&s] = MEM <unsigned > char[12]> > > > > [(char * {ref-all})&d]; > > > > _1 = __builtin_strcmp (&s, "AAAAAAAAAAA"); > > > > if (_1 != 0) > > > > ... > > > > > > > > we now assign 'd' an XFmode register (TREE_ADDRESSABLE is cleared > > > > now since we can set DECL_NOT_GIMPLE_REG_P). The case is lost > > > > then, impossible to fix up AFAICS. On x86 all moves to/from > > > > XFmode are normalizing, specifically we end up with > > > > > > > > fldt .LC0 > > > > fstpt (%esp) > > > > > > > > now the most appealing solution - and totally in the opposite > > > > direction of this patch - is to simply stop expanding non-SSA names > > > > as pseudos. I do not remember the history as why we do this > > > > but it's likely remanents we preserved from either pre-SSA, times > > > > we did not go into SSA for -O0 or times we really gone out-of-SSA. > > > > > > > > There is _some_ good reason to expand a non-SSA "register" into > > > > a pseudo though - namely that RTL is not SSA and thus can accept > > > > partial defs. And of course that RTL cannot get rid of a stack > > > > slot assigned to a variable. Today we have somewhat robust > > > > infrastructure to deal with partial defs on GIMPLE, namely > > > > BIT_INSERT_EXPR, but it's not fully exercised. > > > > > > Yeah, not being able to get rid of the stack slot seems > > > worrying here. > > > > > > > It's of course possible to fixup the above problematical > > > > cases (there's precenent with discover_nonconstant_array_refs, > > > > which could be "easily" extended to handle "weird" accesses > > > > of non-integral-mode variables) but with the recent discussion > > > > on making RTL expansion more straight-forward I'd bring up > > > > the above idea ... it would get rid of quite some special > > > > code dealing with tcc_reference trees (and MEM_REFs) ending > > > > up operating on registers. > > > > > > It might be nice to do it eventually, but I think at least > > > is_gimple_reg_type would need to be "return true" first, > > > otherwise we'll lose too much on aggregates. > > > > > > There's also the problem that things passed in registers do need > > > to be RTL registers at function boundaries, so I'm not sure all > > > the expand code would necessarily go away. > > > > > > Wouldn't want to see all targets suffer for XFmode oddities :-) > > > > OK, so here's the patch amemded with some heuristics to catch > > this. The heuristic triggers exactly on the previously > > failing testcase and nothing else on a x86_64 bootstrap and regtest. > > Citing the code: > > > > /* If there's a chance to get a pseudo for t then if it would be of > float > > mode > > and the actual access is via an integer mode (lowered memcpy or > similar > > access) then avoid the register expansion if the mode likely is not > > storage > > suitable for raw bits processing (like XFmode on i?86). */ > > > > static void > > avoid_type_punning_on_regs (tree t) > > { > > machine_mode access_mode = TYPE_MODE (TREE_TYPE (t)); > > if (access_mode != BLKmode > > && !SCALAR_INT_MODE_P (access_mode)) > > return; > > tree base = get_base_address (t); > > if (DECL_P (base) > > && !TREE_ADDRESSABLE (base) > > && FLOAT_MODE_P (DECL_MODE (base)) > > && maybe_lt (GET_MODE_PRECISION (DECL_MODE (base)), > > GET_MODE_BITSIZE (GET_MODE_INNER (DECL_MODE > (base)))) > > /* Double check in the expensive way we really would get a > pseudo. > > */ > > && use_register_for_decl (base)) > > TREE_ADDRESSABLE (base) = 1; > > } > > > > invoked on stores like > > > > if (gimple_vdef (stmt)) > > { > > tree t = gimple_get_lhs (stmt); > > if (t && REFERENCE_CLASS_P (t)) > > avoid_type_punning_on_regs (t); > > } > > > > loads are not an issue on their own. So the basic idea is to rule > > out float-mode pseudos which have less precision than their mode > > but where we store to with integer-like modes. A but hand-wavy > > since similar issues might exist with partial integer modes as > storage(?), > > but we shouldn't really need this at all in an ideal world. > > > > You could check whether this triggers on arm with the respective > > long double testcase. > > > > Bootstrapped and tested on x86_64-unknown-linux-gnu, queued for stage1. > > I have now pushed this. > > Richard.
Hi Richard, This introduces a wrong code bug on AArch64: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95526 Thanks, Alex