Re: [PATCH] [x86] reenable dword MOVE_MAX for better memmove inlining

Alexandre Oliva via Gcc-patches Thu, 25 May 2023 06:25:46 -0700

On May 25, 2023, Richard Biener <richard.guent...@gmail.com> wrote:

> On Thu, May 25, 2023 at 1:10 PM Alexandre Oliva <ol...@adacore.com> wrote:
>> 
>> On May 25, 2023, Richard Biener <richard.guent...@gmail.com> wrote:
>> 
>> > I mean we could do what RTL expansion would do later and do
>> > by-pieces, thus emit multiple loads/stores but not n loads and then
>> > n stores but interleaved.
>> 
>> That wouldn't help e.g. gcc.dg/memcpy-6.c's fold_move_8, because
>> MOVE_MAX and MOVE_MAX_PIECES currently limit inline expansion to 4
>> bytes on x86 without SSE, both in gimple and RTL, and interleaved loads
>> and stores wouldn't help with memmove.  We can't fix that by changing
>> code that uses MOVE_MAX and/or MOVE_MAX_PIECES, when these limits are
>> set too low.


> Btw, there was a short period where the MOVE_MAX limit was restricted
> but that had fallout and we've reverted since then.

Erhm...  Are we even talking about the same issue?

A change to i386/i386.h reduced the 32-bit non-SSE MOVE_MAX from 16 to 4, which
broke this test; I'm proposing to bounce it back up to 8, so that we get
a little more memmove inlining, enough for tests that expect that much
to pass.

You may be focusing on the gimple-fold bit, because I mentioned it, but
even the rtl expander is failing to expand the memmove because of the
setting, as evidenced by the test's failure in the scan for memmove in
the final dump.
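
To make the memmove point concrete, here is a rough sketch in plain C.
The function names are mine and hypothetical, not taken from the
testsuite, and the hand-written variants only approximate what an
inline expansion might look like; they are not what GCC emits.  The
idea is that an 8-byte memmove can be open-coded safely if all 8 bytes
are loaded before any are stored, whereas interleaving 4-byte loads and
stores in a fixed direction breaks when the regions overlap:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical example: a fixed-size 8-byte move, the kind of call
   one would like to see expanded inline.  */
void move_8 (char *dst, const char *src)
{
  memmove (dst, src, 8);
}

/* Load all 8 bytes first, then store them: correct even when the
   source and destination overlap.  */
void move_8_load_then_store (char *dst, const char *src)
{
  uint32_t lo, hi;
  memcpy (&lo, src, 4);
  memcpy (&hi, src + 4, 4);
  memcpy (dst, &lo, 4);
  memcpy (dst + 4, &hi, 4);
}

/* Interleave 4-byte loads and stores, low half first: fine for
   disjoint regions, wrong when dst overlaps src from above.  */
void move_8_interleaved (char *dst, const char *src)
{
  uint32_t tmp;
  memcpy (&tmp, src, 4);
  memcpy (dst, &tmp, 4);
  memcpy (&tmp, src + 4, 4);
  memcpy (dst + 4, &tmp, 4);
}

/* Move 8 bytes forward by 4 within the same buffer using each variant.
   Returns 1 if the interleaved variant disagrees with memmove.  */
int interleaved_differs_on_overlap (void)
{
  char a[] = "abcdefghijklmnop";
  char b[] = "abcdefghijklmnop";
  char c[] = "abcdefghijklmnop";

  move_8 (a + 4, a);
  move_8_load_then_store (b + 4, b);
  move_8_interleaved (c + 4, c);

  /* memmove semantics: bytes 4..11 receive the original bytes 0..7.  */
  assert (memcmp (a, "abcdabcdefghmnop", 16) == 0);
  /* Loading everything before storing matches memmove.  */
  assert (memcmp (b, a, 16) == 0);
  /* The interleaved copy re-reads bytes it has already overwritten.  */
  return memcmp (c, a, 16) != 0;
}
```

With a 4-byte limit on the pieces, the load-then-store form is not
available for an 8-byte move, which is why raising the limit matters
for memmove specifically.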

That MOVE_MAX change was a significant regression in codegen for 32-bit
non-SSE x86, and I'm proposing to fix that.  Compensating for that
regression elsewhere doesn't seem desirable to me: MOVE_MAX can be much
higher on other x86 variants, so such attempts may quite significantly
harm more modern CPUs.

Conversely, I don't expect the reduction of MOVE_MAX on SSE-less x86 a
couple of years ago to have been measured for performance effects, given
the little overall relevance of such CPUs, and the very visible and
undesirable effects on codegen that change brought onto them.  And yet,
I'm being very conservative in the proposed reversion, because
benchmarking such targets in any meaningful way would be somewhat
challenging for me as well.

So, could we please have this narrow fix of this limited regression at
the spot where it was introduced accepted, rather than debating
tangents?

-- 
Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
   Free Software Activist                       GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>
