On Wed, Nov 5, 2025 at 3:12 PM H.J. Lu <[email protected]> wrote:
>
> On Tue, Oct 28, 2025 at 10:31 AM H.J. Lu <[email protected]> wrote:
> >
> > On Tue, Oct 28, 2025 at 5:53 AM H.J. Lu <[email protected]> wrote:
> > >
> > > On Tue, Oct 28, 2025 at 12:04 AM Jeff Law <[email protected]> wrote:
> > > >
> > > >
> > > >
> > > > On 10/27/25 2:37 AM, Richard Biener wrote:
> > > > ?
> > > > >>
> > > > >> There are 2 reasons:
> > > > >>
> > > > >> 1.  RISC processors only allow moves with memory.   Since this 
> > > > >> optimization
> > > > >> only applies to other load operations, it is nop for most RISC 
> > > > >> processors.
> > > > >> 2.  Some GCC tests scan compiler assembly outputs which don't expect 
> > > > >> other
> > > > >> load operations with volatile memory.   I have to disable this 
> > > > >> optimization
> > > > >> in 100+ x86 tests in my patch.  If this is enabled for other 
> > > > >> targets, it may
> > > > >> cause many GCC tests failures.
> > > > >>
> > > > >> If your target supports this, you can give it a try.
> > > > >>
> > > > >>> Can it break volatile correctness or so?
> > > > >>
> > > > >> No.   My patch has some tests to verify the volatile correctness.
> > > > >
> > > > > IMO the -fcombine-op-with-volatile-memory-load is quite a bad name.
> > > > > Options should have names that mean something to users.  As this 
> > > > > seems to
> > > > > supposed to be a target opt-in I'd suggest to not expose this to users
> > > > > (or if targets are concerned, via some -m...) but instead make this a
> > > > > target hook/macro.  Of course I think this should be enabled by 
> > > > > default.
> > > >
> > > > If we're going to go with this patch or a variant thereof; I'd go a step
> > > > further and not bother with a target hook.  If the semantics are wrong
> > > > for some case, we're more likely to find it the more targets use the
> > > > option.  And it sounds like the main purpose behind the hook is to avoid
> > > > working through any testsuite issues on those other targets, which isn't
> > > > a great reason to introduce a new target hook.
> > >
> > > One advantage of a target hook is that if a target doesn't opt in, there
> > > no need to do additional check:
> > >
> > > diff --git a/gcc/recog.cc b/gcc/recog.cc
> > > index 67d7fa63069..fc411842859 100644
> > > --- a/gcc/recog.cc
> > > +++ b/gcc/recog.cc
> > > @@ -1596,7 +1596,8 @@ general_operand (rtx op, machine_mode mode)
> > >      {
> > >        rtx y = XEXP (op, 0);
> > >
> > > -      if (! volatile_ok && MEM_VOLATILE_P (op))
> > > +      if (!targetm.volatile_mem_ok_in_insn (op,
> > > +                  recog_data.try_combine_insn))
> > >    return false;
> > >
> > >        /* Use the mem's mode, since it will be reloaded thus.  LRA can
> > > diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
> > > index 1873d572ba3..ece1f496571 100644
> > > --- a/gcc/targhooks.cc
> > > +++ b/gcc/targhooks.cc
> > > @@ -1839,6 +1839,14 @@ default_new_address_profitable_p (rtx memref
> > > ATTRIBUTE_UNUSED,
> > >    return true;
> > >  }
> > >
> > > +/* The default implementation of TARGET_VOLATILE_MEM_OK_IN_INSN.  */
> > > +
> > > +bool
> > > +default_volatile_mem_ok_in_insn (rtx memref, rtx_insn *)
> > > +{
> > > +  return volatile_ok || !MEM_VOLATILE_P (memref);
> > > +}
> > > +
> > >
> > > > As HJ indicated, I wouldn't expect significant fallout on load/store
> > > > architectures.  So the scope of fallout isn't likely to be too terrible.
> > > >
> > > > I was going to throw it in my tester, but it doesn't apply cleanly on
> > > > the trunk.
> > > >
> > > > > patching file gcc/recog.cc
> > > > > Hunk #1 FAILED at 116.
> > > > > 1 out of 2 hunks FAILED -- saving rejects to file gcc/recog.cc.rej
> > > >
> > >
> > > I put my branch at
> > >
> > > https://gitlab.com/x86-gcc/gcc/-/commits/users/hjl/pr122343/master
> > >
> > > on top of
> > >
> > > commit 76943639ddd861dce3886d1def2a353ccfcdd585
> > > Author: Eric Botcazou <[email protected]>
> > > Date:   Mon Oct 27 19:51:11 2025 +0100
> > >
> > >     Ada: Fix assertion failure on child generic package
> > >
> > > Does it work for you?
> > >
> >
> > Here is the patch to add TARGET_VOLATILE_MEM_OK_IN_INSN.
> >
> > For x86, volatile memory can be used as source operand for loads.  Add
> > TARGET_VOLATILE_MEM_OK_IN_INSN to allow a target to support operations
> > with volatile memory load:
> >
> > 1. Add a field, try_combine_insn, to recog_data_d for the combine pass to
> >    pass the instruction information to general_operand.
> > 2. Update recog_for_combine_1 to set recog_data.try_combine_insn before
> >    calling recog and clear it afterward.
> > 3. Add a target hook, TARGET_VOLATILE_MEM_OK_IN_INSN which returns true
> >    if a volatile memory reference is OK in an instruction.  The default
> >    returns true only if volatile memory reference is OK or the memory
> >    reference isn't volatile.
> > 4. Update general_operand to call targetm.volatile_mem_ok_in_insn to
> >    determine if a memory operand is OK.
> > 5. Add an x86 TARGET_VOLATILE_MEM_OK_IN_INSN hook which returns true when
> >    volatile memory is used in load and -mfuse-ops-with-volatile-access is
> >    enabled.
> >
> > This optimizes
> >
> > extern volatile int bar;
> >
> > int
> > foo (int z)
> > {
> >   z *= 123;
> >   return bar + z;
> > }
> >
> > into
> >
> > foo:
> > imull $123, %edi, %eax
> > addl bar(%rip), %eax
> > ret
> >
> > instead of
> >
> > foo:
> > imull $123, %edi, %eax
> > movl bar(%rip), %edx
> > addl %edx, %eax
> > ret
> >
> > Tested with Linux kernel 6.17.4 on Intel Core i7-1195G7.
> >
> > TODO: Investigate if adding the LOCK prefix to allow volatile memory store
> > will improve performance since the LOCK prefix may be expensive.
> >
> > gcc/
> >
> > PR target/122343
> > * combine.cc (recog_for_combine_1): Set and clear
> > recog_data.try_combine_insn when calling recog.
> > * recog.cc (general_operand): Call targetm.volatile_mem_ok_in_insn.
> > * recog.h (recog_data_d): Add try_combine_insn.
> > * target.def (volatile_mem_ok_in_insn): New hook.
> > * targhooks.cc (default_volatile_mem_ok_in_insn): New.
> > * targhooks.h (default_volatile_mem_ok_in_insn): Likewise.
> > * config/i386/i386.cc (ix86_volatile_mem_ok_in_insn): Likewise.
> > (TARGET_VOLATILE_MEM_OK_IN_INSN): Likewise.
> > * doc/invoke.texi: Document -mfuse-ops-with-volatile-access.
> > * doc/tm.texi: Regenerated.
> > * doc/tm.texi.in: Add TARGET_VOLATILE_MEM_OK_IN_INSN.
>
> Hi Richard, Jeff,
>
> Does this patch look OK?
>
> Thanks.
>
> > gcc/testsuite/
> >
> > PR target/122343
> > * gcc.target/i386/avx-ne-convert-1.c: Compile with
> > -mno-fuse-ops-with-volatile-access.
> > * gcc.target/i386/avx10_2-bf16-1.c: Likewise.
> > * gcc.target/i386/avx10_2-convert-1.c: Likewise.
> > * gcc.target/i386/avx10_2-satcvt-1.c: Likewise.
> > * gcc.target/i386/avx512bf16-vcvtneps2bf16-1.c: Likewise.
> > * gcc.target/i386/avx512bf16vl-vcvtneps2bf16-1a.c: Likewise.
> > * gcc.target/i386/avx512bf16vl-vcvtneps2bf16-1b.c: Likewise.
> > * gcc.target/i386/avx512dq-vcvtps2qq-1.c: Likewise.
> > * gcc.target/i386/avx512dq-vcvtps2uqq-1.c: Likewise.
> > * gcc.target/i386/avx512dq-vcvtqq2pd-1.c: Likewise.
> > * gcc.target/i386/avx512dq-vcvtqq2ps-1.c: Likewise.
> > * gcc.target/i386/avx512dq-vcvttps2qq-1.c: Likewise.
> > * gcc.target/i386/avx512dq-vcvttps2uqq-1.c: Likewise.
> > * gcc.target/i386/avx512dq-vcvtuqq2pd-1.c: Likewise.
> > * gcc.target/i386/avx512dq-vcvtuqq2ps-1.c: Likewise.
> > * gcc.target/i386/avx512dq-vfpclasspd-1.c: Likewise.
> > * gcc.target/i386/avx512dq-vfpclassps-1.c: Likewise.
> > * gcc.target/i386/avx512dq-vfpclasssd-1.c: Likewise.
> > * gcc.target/i386/avx512dq-vfpclassss-1.c: Likewise.
> > * gcc.target/i386/avx512dq-vpmullq-1.c: Likewise.
> > * gcc.target/i386/avx512dq-vpmullq-3.c: Likewise.
> > * gcc.target/i386/avx512f-pr100267-1.c: Likewise.
> > * gcc.target/i386/avx512f-vcvtps2pd-1.c: Likewise.
> > * gcc.target/i386/avx512f-vcvtsd2si-1.c: Likewise.
> > * gcc.target/i386/avx512f-vcvtsd2si64-1.c: Likewise.
> > * gcc.target/i386/avx512f-vcvtsd2usi-1.c: Likewise.
> > * gcc.target/i386/avx512f-vcvtsd2usi64-1.c: Likewise.
> > * gcc.target/i386/avx512f-vcvtss2si-1.c: Likewise.
> > * gcc.target/i386/avx512f-vcvtss2si64-1.c: Likewise.
> > * gcc.target/i386/avx512f-vcvtss2usi-1.c: Likewise.
> > * gcc.target/i386/avx512f-vcvtss2usi64-1.c: Likewise.
> > * gcc.target/i386/avx512f-vcvttsd2si-1.c: Likewise.
> > * gcc.target/i386/avx512f-vcvttsd2si64-1.c: Likewise.
> > * gcc.target/i386/avx512f-vcvttsd2usi-1.c: Likewise.
> > * gcc.target/i386/avx512f-vcvttsd2usi64-1.c: Likewise.
> > * gcc.target/i386/avx512f-vcvttss2si-1.c: Likewise.
> > * gcc.target/i386/avx512f-vcvttss2si64-1.c: Likewise.
> > * gcc.target/i386/avx512f-vcvttss2usi-1.c: Likewise.
> > * gcc.target/i386/avx512f-vcvttss2usi64-1.c: Likewise.
> > * gcc.target/i386/avx512f-vmovapd-1.c: Likewise.
> > * gcc.target/i386/avx512f-vmovaps-1.c: Likewise.
> > * gcc.target/i386/avx512f-vmovdqa64-1.c: Likewise.
> > * gcc.target/i386/avx512f-vpandnq-1.c: Likewise.
> > * gcc.target/i386/avx512f-vpbroadcastd-1.c: Likewise.
> > * gcc.target/i386/avx512f-vpbroadcastq-1.c: Likewise.
> > * gcc.target/i386/avx512f-vpcmpeqq-1.c: Likewise.
> > * gcc.target/i386/avx512f-vpcmpequq-1.c: Likewise.
> > * gcc.target/i386/avx512f-vrndscalepd-1.c: Likewise.
> > * gcc.target/i386/avx512f-vrndscaleps-1.c: Likewise.
> > * gcc.target/i386/avx512fp16-complex-fma.c: Likewise.
> > * gcc.target/i386/avx512fp16-vaddph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vcvtpd2ph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vcvtph2dq-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vcvtph2pd-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vcvtph2psx-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vcvtph2qq-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vcvtph2udq-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vcvtph2uqq-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vcvtph2uw-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vcvtph2w-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vcvtps2ph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vcvtqq2ph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vcvttph2dq-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vcvttph2qq-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vcvttph2udq-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vcvttph2uqq-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vcvttph2uw-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vcvttph2w-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vcvtuqq2ph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vfcmaddcph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vfcmulcph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vfmaddcph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vfmulcph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vfpclassph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vfpclasssh-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vmulph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vrcpph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vrsqrtph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16-vsqrtph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vaddph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vcvtpd2ph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vcvtph2dq-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vcvtph2psx-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vcvtph2qq-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vcvtph2udq-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vcvtph2uqq-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vcvtph2uw-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vcvtph2w-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vcvtps2ph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vcvtqq2ph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vcvttph2dq-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vcvttph2udq-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vcvttph2uw-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vcvttph2w-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vcvtuqq2ph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vfcmaddcph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vfcmulcph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vfmaddcph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vfmulcph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vfpclassph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vmulph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vrcpph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vrsqrtph-1a.c: Likewise.
> > * gcc.target/i386/avx512fp16vl-vsqrtph-1a.c: Likewise.
> > * gcc.target/i386/avx512vl-pr100267-1.c: Likewise.
> > * gcc.target/i386/avx512vl-vcvtpd2ps-1.c: Likewise.
> > * gcc.target/i386/avx512vl-vcvtpd2udq-1.c: Likewise.
> > * gcc.target/i386/avx512vl-vcvttpd2udq-1.c: Likewise.
> > * gcc.target/i386/avx512vl-vcvttps2udq-1.c: Likewise.
> > * gcc.target/i386/avx512vl-vmovapd-1.c: Likewise.
> > * gcc.target/i386/avx512vl-vmovaps-1.c: Likewise.
> > * gcc.target/i386/avx512vl-vmovdqa64-1.c: Likewise.
> > * gcc.target/i386/avx512vl-vpcmpeqq-1.c: Likewise.
> > * gcc.target/i386/avx512vl-vpcmpequq-1.c: Likewise.
> > * gcc.target/i386/avx512vl-vpcmpuq-1.c: Likewise.
> > * gcc.target/i386/pr122343-1a.c: New test.
> > * gcc.target/i386/pr122343-1b.c: Likewise.
> > * gcc.target/i386/pr122343-2a.c: Likewise.
> > * gcc.target/i386/pr122343-2b.c: Likewise.
> > * gcc.target/i386/pr122343-3.c: Likewise.
> > * gcc.target/i386/pr122343-4a.c: Likewise.
> > * gcc.target/i386/pr122343-4b.c: Likewise.
> >
> > --
> > H.J.
>

PING.

-- 
H.J.

Reply via email to