On Wed, Aug 25, 2021 at 5:14 AM Segher Boessenkool
<seg...@kernel.crashing.org> wrote:
>
> Hi!
>
> On Tue, Aug 24, 2021 at 04:55:30PM +0800, liuhongt wrote:
> >   This patch extend change_zero_ext to change illegitimate constant
> > into constant pool, this will enable simplification of below:
>
> It should be in a separate function.  recog_for_combine will call both.
> But not both for the same RTL!  This is important.  Originally, combine
> tried only one thing for every combination of input instructions it got.
> Every extra attempt causes quite a bit more garbage (not to mention it
> takes time as well, recog isn't super cheap), so we should try to not
> make recog_for_combine exponential in the variants it tries..
>
> And of course the function name should always be descriptive of what a
> function does :-)
>
> >  change_zero_ext (rtx pat)
> > @@ -11417,6 +11418,23 @@ change_zero_ext (rtx pat)
> >      {
> >        rtx x = **iter;
> >        scalar_int_mode mode, inner_mode;
> > +      machine_mode const_mode = GET_MODE (x);
> > +
> > +      /* Change illegitimate constant into memref of constant pool.  */
> > +      if (CONSTANT_P (x)
> > +       && !const_vec_duplicate_p (x)
>
> This is x86-specific?  It makes no sense in generic code, anyway.  Or if
> it does it needs a big fat comment :-)
>
> > +       && const_mode != BLKmode
> > +       && GET_CODE (x) != HIGH
> > +       && GET_MODE_SIZE (const_mode).is_constant ()
> > +       && !targetm.legitimate_constant_p (const_mode, x)
> > +       && !targetm.cannot_force_const_mem (const_mode, x))
>
> You should only test that it did not recog, and then force a constant
> to memory.  You do not want to do this for every constant (rotate by 42
> won't likely match better if you force 42 to memory) so some
> sophistication will help here, but please do not make it target-
> specific.
>
> > +     {
> > +       x = force_const_mem (GET_MODE (x), x);
>
> That mode is const_mode.
>
>
> Ideally you will have some example where it benefits some other target,
> too.  Running recog twice for a big fraction of all combine attempts,
> for no benefit at all, is not a good idea.  The zext* thing is there
> because combine *itself* creates a lot of extra zext*, whether those
> exist for the target or not.  So this isn't obvious precedent (and that
> wouldn't mean it is a good idea anyway ;-) )
When trying to construct a testcase for another backend, I realized
that the issue can also be solved by folding _mm_shuffle_ps into
gimple vec_perm_expr.
Let me try that way.

>
>
> Segher



-- 
BR,
Hongtao

Reply via email to