On Fri, Sep 16, 2022 at 9:38 PM Alexander Monakov via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Fri, 16 Sep 2022, Uros Bizjak via Gcc-patches wrote:
>
> > On Fri, Sep 16, 2022 at 3:32 AM Jeff Law via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> > >
> > >
> > > On 9/15/22 19:06, liuhongt via Gcc-patches wrote:
> > > > There's peephole2 submit in 1990s which split cmp mem, 0 to load mem,
> > > > reg + test reg, reg. I don't know exact reason why gcc do this.
> > > >
> > > > For latest x86 processors, ciscization should help processor frontend
> > > > also codesize, for processor backend, they should be the same(has same
> > > > uops).
> > > >
> > > > So the patch deleted the peephole2, and also modify another splitter to
> > > > generate more cmp mem, 0 for 32-bit target.
> > > >
> > > > It will help instruction fetch.
> > > >
> > > > for minmax-1.c minmax-2.c minmax-10, pr96891.c, it's supposed to scan
> > > > there's no comparison to 1 or -1, so adjust the testcase since under
> > > > 32-bit target, we now generate cmp mem, 0 instead of load + test.
> > > >
> > > > Similar for pr78035.c.
> > > >
> > > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}
> > > > No performance impact for SPEC2017 on ICX/Znver3.
> > > >
> > > It was almost certainly for PPro/P2 given it was rth's work from
> > > 1999.    Probably should have been conditionalized on PPro/P2 at the
> > > time.   No worries losing it now...
> >
> > Please add a tune flag in x86-tune.def under "Historical relics" and
> > use it in the relevant peephole2 instead of deleting it.
>
> When the next instruction after 'load mem; test reg, reg' is a conditional
> branch, this disables macro-op fusion because Intel CPUs do not macro-fuse
> 'cmp mem, imm; jcc'.
>
Oh, I didn't realize that, thanks for your reply.
I'll hold off on the patch pending further investigation.
> It would be nice to rephrase the commit message to acknowledge this (the
> statement 'has same uops' is not always true with this considered).
>
> AMD CPUs can fuse some 'cmp mem, imm; jcc' under some conditions, so this
> should be beneficial for AMD.
>
> Alexander
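
For reference, a minimal C sketch of the kind of source that exercises the
pattern discussed in this thread. The function name bump_if_zero is
hypothetical and not taken from the patch or the testsuite. The assembly in
the comment is only illustrative: what GCC actually emits depends on its
version, -m32/-m64, the optimization level and -mtune, and the branch may be
if-converted away entirely.

```c
/* Hypothetical example: test a value in memory against zero, then branch. */
void bump_if_zero(const int *flag, int *counter)
{
    /*
     * The comparison can be emitted in two ways (AT&T syntax, illustrative):
     *
     *   cmpl  $0, (%rdi)        one instruction with a memory operand
     *   jne   .Lskip
     *
     *   movl  (%rdi), %eax      separate load, then register test
     *   testl %eax, %eax
     *   jne   .Lskip
     *
     * Per the discussion above, the second form leaves a test+jcc pair
     * that Intel CPUs can macro-fuse, while they do not fuse
     * 'cmp mem, imm; jcc'.
     */
    if (*flag == 0)
        (*counter)++;
}
```

Compiling this with -O2 -S for both -m32 and -m64, with and without the
patch, should show which of the two forms is chosen.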



-- 
BR,
Hongtao
