On Fri, Sep 16, 2022 at 9:38 PM Alexander Monakov via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On Fri, 16 Sep 2022, Uros Bizjak via Gcc-patches wrote:
>
> > On Fri, Sep 16, 2022 at 3:32 AM Jeff Law via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> > >
> > >
> > > On 9/15/22 19:06, liuhongt via Gcc-patches wrote:
> > > > There's peephole2 submit in 1990s which split cmp mem, 0 to load mem,
> > > > reg + test reg, reg. I don't know exact reason why gcc do this.
> > > >
> > > > For latest x86 processors, ciscization should help processor frontend
> > > > also codesize, for processor backend, they should be the same(has same
> > > > uops).
> > > >
> > > > So the patch deleted the peephole2, and also modify another splitter to
> > > > generate more cmp mem, 0 for 32-bit target.
> > > >
> > > > It will help instruction fetch.
> > > >
> > > > for minmax-1.c minmax-2.c minmax-10, pr96891.c, it's supposed to scan
> > > > there's no
> > > > comparison to 1 or -1, so adjust the testcase since under 32-bit
> > > > target, we now generate cmp mem, 0 instead of load + test.
> > > >
> > > > Similar for pr78035.c.
> > > >
> > > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}
> > > > No performance impact for SPEC2017 on ICX/Znver3.
> > > >
> > > It was almost certainly for PPro/P2 given it was rth's work from
> > > 1999.  Probably should have been conditionalized on PPro/P2 at the
> > > time.  No worries losing it now...
> >
> > Please add a tune flag in x86-tune.def under "Historical relics" and
> > use it in the relevant peephole2 instead of deleting it.
>
> When the next instruction after 'load mem; test reg, reg' is a conditional
> branch, this disables macro-op fusion because Intel CPUs do not macro-fuse
> 'cmp mem, imm; jcc'.
>
Oh, I didn't realize that, thanks for your reply.
I'll hold off on the patch until I have investigated further.

> It would be nice to rephrase the commit message to acknowledge this (the
> statement 'has same uops' is not always true with this considered).
>
> AMD CPUs can fuse some 'cmp mem, imm; jcc' under some conditions, so this
> should be beneficial for AMD.
>
> Alexander
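
For reference, a minimal sketch of the kind of source this discussion is
about: a compare of a memory operand against zero that directly feeds a
conditional branch. The function name is hypothetical, and the assembly
in the comment is only illustrative (register choices are assumed), not
actual GCC output.

```c
/* Hypothetical example: compare a value in memory against zero and
   branch on the result.  */
int
is_zero (const int *p)
{
  /* With the peephole2 (load + test), roughly:
         movl  (%eax), %edx
         testl %edx, %edx
         jne   .L2
     Without it (cmp mem, 0), roughly:
         cmpl  $0, (%eax)
         jne   .L2
     Per the comments quoted above, Intel CPUs do not macro-fuse the
     second form with the jcc, while some AMD CPUs can under some
     conditions.  */
  if (*p == 0)
    return 1;
  return 0;
}
```

Comparing the output of `gcc -m32 -O2 -S` on a function like this with
and without the patch should show which of the two forms is emitted.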

--
BR,
Hongtao