On Wed, Aug 7, 2024 at 11:08 AM Alexander Monakov <amona...@ispras.ru> wrote:
>
>
> On Wed, 7 Aug 2024, Richard Biener wrote:
>
> > > > +  data = *(const v16qi_u *)s;
> > > > +  /* Prevent propagation into pshufb and pcmp as memory operand.  */
> > > > +  __asm__ ("" : "+x" (data));
> > >
> > > It would probably make sense to file a PR on this separately,
> > > to eventually fix the compiler to not need such workarounds.
> > > Not sure how much difference it makes however.
> >
> > This is probably to work around bugs in older compiler versions?  If
> > not I agree.
>
> This is deliberate hand-tuning to avoid a subtle issue: pshufb is not
> macro-fused on Intel, so with propagation it is two uops early in the
> CPU front-end.
>
> The "propagation" actually falls out of IRA/LRA decisions, and stopped
> happening in gcc-14. I'm not sure if there were relevant RA changes.
> In any case, this can potentially flip-flop in the future again.
>
> Considering the trunk gets this right, I think the next move is to
> add a testcase for this, not a PR, correct?

Well, merging the memory operand into the pshufb would be wrong -
embedded memory ops are always considered aligned, no?

> > Otherwise the patch is OK.
>
> Still OK with the asms, or would you prefer them be taken out?

I think it's OK with the asms.

Richard.

> Thanks.
>
> Alexander
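
For readers following the thread, here is a minimal sketch of the workaround being discussed, assuming GCC-style vector extensions. Only the unaligned load through v16qi_u and the empty asm with the "+x" constraint come from the quoted patch; the typedefs, the function name count_byte16, and the byte-compare loop are hypothetical and are not the code under review. The empty asm claims to read and modify data in an SSE register, so the compiler has to materialize the load in a register and cannot fold *s into a later instruction as a memory operand.

```c
/* Illustrative typedefs (assumed, not taken from the patch).  */
typedef signed char v16qi __attribute__ ((vector_size (16)));
typedef signed char v16qi_u
  __attribute__ ((vector_size (16), aligned (1), may_alias));

/* Hypothetical helper: count how many of the 16 bytes at S equal C.
   S may have any alignment.  */
static int
count_byte16 (const char *s, char c)
{
  signed char b = (signed char) c;

  /* Unaligned 16-byte load, as in the quoted patch.  */
  v16qi data = *(const v16qi_u *) s;

#if defined (__SSE2__)
  /* Empty asm acting as a barrier: DATA must now live in an SSE
     register ("x"), so the load cannot be propagated into a later
     instruction as a memory operand.  */
  __asm__ ("" : "+x" (data));
#endif

  v16qi needle = { b, b, b, b, b, b, b, b, b, b, b, b, b, b, b, b };

  /* Element-wise compare; each lane is -1 on a match and 0 otherwise.  */
  v16qi eq = (v16qi) (data == needle);

  int n = 0;
  for (int i = 0; i < 16; i++)
    n += eq[i] != 0;
  return n;
}
```

A call such as count_byte16 (buf + 1, 'a') reads 16 bytes starting at an odd address. This also illustrates the alignment point above: with legacy (non-VEX) SSE encodings, a memory operand embedded in an instruction like pshufb must be 16-byte aligned, so folding an unaligned load into it would not be a valid transformation.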