On Wed, Aug 7, 2024 at 11:08 AM Alexander Monakov <amona...@ispras.ru> wrote:
>
>
> On Wed, 7 Aug 2024, Richard Biener wrote:
>
> > > > +      data = *(const v16qi_u *)s;
> > > > +      /* Prevent propagation into pshufb and pcmp as memory operand.  */
> > > > +      __asm__ ("" : "+x" (data));
> > >
> > > It would probably make sense to file a PR on this separately,
> > > to eventually fix the compiler to not need such workarounds.
> > > Not sure how much difference it makes however.
> >
> > This is probably to work around bugs in older compiler versions?  If
> > not I agree.
>
> This is deliberate hand-tuning to avoid a subtle issue: pshufb is not
> micro-fused on Intel, so with propagation it is two uops early in the
> CPU front-end.
>
> The "propagation" actually falls out of IRA/LRA decisions, and stopped
> happening in gcc-14. I'm not sure if there were relevant RA changes.
> In any case, this can potentially flip-flop in the future again.
>
> Considering the trunk gets this right, I think the next move is to
> add a testcase for this, not a PR, correct?

Well, merging the memory operand into the pshufb would be wrong - embedded
memory ops are always considered aligned, no?

> > Otherwise the patch is OK.
>
> Still OK with the asms, or would you prefer them be taken out?

I think it's OK with the asms.
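
For readers following the thread, the workaround under discussion can be
sketched roughly as below. This is a minimal standalone illustration, not
the code from the patch: match_mask is a hypothetical helper, the typedefs
are only assumed to resemble the patch's v16qi_u, and a plain vector compare
stands in for the pshufb/pcmp sequence. The empty asm with a "+x" operand
makes the compiler keep the loaded value in an SSE register, so the
unaligned load stays a separate instruction rather than being folded into
the following operation as a memory operand.

```c
#include <assert.h>
#include <stddef.h>

/* 16 lanes of 8 bits.  The _u variant may alias any type and has
   alignment 1, so dereferencing it is an unaligned 16-byte load.
   (Assumed to resemble the typedefs used in the patch.)  */
typedef char v16qi __attribute__ ((vector_size (16)));
typedef char v16qi_u __attribute__ ((vector_size (16), may_alias, aligned (1)));

/* Hypothetical helper: return a bitmask with bit I set when S[I] == C,
   for the 16 bytes starting at S.  */
static unsigned
match_mask (const unsigned char *s, char c)
{
  assert (s != NULL);

  v16qi data = *(const v16qi_u *) s;
#if defined (__SSE2__)
  /* Empty asm with a read-write SSE register operand: the value must
     live in a register here, so the load cannot be merged into the
     compare below as a memory operand.  */
  __asm__ ("" : "+x" (data));
#endif

  v16qi needle = { c, c, c, c, c, c, c, c, c, c, c, c, c, c, c, c };
  /* Lane-wise compare: matching lanes become all-ones, others zero.  */
  v16qi eq = (v16qi) (data == needle);

  unsigned mask = 0;
  for (int i = 0; i < 16; i++)
    mask |= (unsigned) (eq[i] != 0) << i;
  return mask;
}
```

Comparing the generated assembly with and without the asm line (for
example with gcc -O2 -S) is one way to check whether a given compiler
version folds the load.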

Richard.

> Thanks.
>
> Alexander
