On Tue, Oct 28, 2025 at 10:31 AM H.J. Lu <[email protected]> wrote:
>
> On Tue, Oct 28, 2025 at 5:53 AM H.J. Lu <[email protected]> wrote:
> >
> > On Tue, Oct 28, 2025 at 12:04 AM Jeff Law <[email protected]> wrote:
> > >
> > >
> > >
> > > On 10/27/25 2:37 AM, Richard Biener wrote:
> > > ?
> > > >>
> > > >> There are 2 reasons:
> > > >>
> > > >> 1.  RISC processors only allow moves with memory.   Since this 
> > > >> optimization
> > > >> only applies to other load operations, it is nop for most RISC 
> > > >> processors.
> > > >> 2.  Some GCC tests scan compiler assembly outputs which don't expect 
> > > >> other
> > > >> load operations with volatile memory.   I have to disable this 
> > > >> optimization
> > > >> in 100+ x86 tests in my patch.  If this is enabled for other targets, 
> > > >> it may
> > > >> cause many GCC tests failures.
> > > >>
> > > >> If your target supports this, you can give it a try.
> > > >>
> > > >>> Can it break volatile correctness or so?
> > > >>
> > > >> No.   My patch has some tests to verify the volatile correctness.
> > > >
> > > > IMO the -fcombine-op-with-volatile-memory-load is quite a bad name.
> > > > Options should have names that mean something to users.  As this seems 
> > > > to
> > > > supposed to be a target opt-in I'd suggest to not expose this to users
> > > > (or if targets are concerned, via some -m...) but instead make this a
> > > > target hook/macro.  Of course I think this should be enabled by default.
> > >
> > > If we're going to go with this patch or a variant thereof; I'd go a step
> > > further and not bother with a target hook.  If the semantics are wrong
> > > for some case, we're more likely to find it the more targets use the
> > > option.  And it sounds like the main purpose behind the hook is to avoid
> > > working through any testsuite issues on those other targets, which isn't
> > > a great reason to introduce a new target hook.
> >
> > One advantage of a target hook is that if a target doesn't opt in, there
> > no need to do additional check:
> >
> > diff --git a/gcc/recog.cc b/gcc/recog.cc
> > index 67d7fa63069..fc411842859 100644
> > --- a/gcc/recog.cc
> > +++ b/gcc/recog.cc
> > @@ -1596,7 +1596,8 @@ general_operand (rtx op, machine_mode mode)
> >      {
> >        rtx y = XEXP (op, 0);
> >
> > -      if (! volatile_ok && MEM_VOLATILE_P (op))
> > +      if (!targetm.volatile_mem_ok_in_insn (op,
> > +                  recog_data.try_combine_insn))
> >    return false;
> >
> >        /* Use the mem's mode, since it will be reloaded thus.  LRA can
> > diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
> > index 1873d572ba3..ece1f496571 100644
> > --- a/gcc/targhooks.cc
> > +++ b/gcc/targhooks.cc
> > @@ -1839,6 +1839,14 @@ default_new_address_profitable_p (rtx memref
> > ATTRIBUTE_UNUSED,
> >    return true;
> >  }
> >
> > +/* The default implementation of TARGET_VOLATILE_MEM_OK_IN_INSN.  */
> > +
> > +bool
> > +default_volatile_mem_ok_in_insn (rtx memref, rtx_insn *)
> > +{
> > +  return volatile_ok || !MEM_VOLATILE_P (memref);
> > +}
> > +
> >
> > > As HJ indicated, I wouldn't expect significant fallout on load/store
> > > architectures.  So the scope of fallout isn't likely to be too terrible.
> > >
> > > I was going to throw it in my tester, but it doesn't apply cleanly on
> > > the trunk.
> > >
> > > > patching file gcc/recog.cc
> > > > Hunk #1 FAILED at 116.
> > > > 1 out of 2 hunks FAILED -- saving rejects to file gcc/recog.cc.rej
> > >
> >
> > I put my branch at
> >
> > https://gitlab.com/x86-gcc/gcc/-/commits/users/hjl/pr122343/master
> >
> > on top of
> >
> > commit 76943639ddd861dce3886d1def2a353ccfcdd585
> > Author: Eric Botcazou <[email protected]>
> > Date:   Mon Oct 27 19:51:11 2025 +0100
> >
> >     Ada: Fix assertion failure on child generic package
> >
> > Does it work for you?
> >
>
> Here is the patch to add TARGET_VOLATILE_MEM_OK_IN_INSN.
>
> For x86, volatile memory can be used as source operand for loads.  Add
> TARGET_VOLATILE_MEM_OK_IN_INSN to allow a target to support operations
> with volatile memory load:
>
> 1. Add a field, try_combine_insn, to recog_data_d for the combine pass to
>    pass the instruction information to general_operand.
> 2. Update recog_for_combine_1 to set recog_data.try_combine_insn before
>    calling recog and clear it afterward.
> 3. Add a target hook, TARGET_VOLATILE_MEM_OK_IN_INSN which returns true
>    if a volatile memory reference is OK in an instruction.  The default
>    returns true only if volatile memory reference is OK or the memory
>    reference isn't volatile.
> 4. Update general_operand to call targetm.volatile_mem_ok_in_insn to
>    determine if a memory operand is OK.
> 5. Add an x86 TARGET_VOLATILE_MEM_OK_IN_INSN hook which returns true when
>    volatile memory is used in load and -mfuse-ops-with-volatile-access is
>    enabled.
>
> This optimizes
>
> extern volatile int bar;
>
> int
> foo (int z)
> {
>   z *= 123;
>   return bar + z;
> }
>
> into
>
> foo:
> imull $123, %edi, %eax
> addl bar(%rip), %eax
> ret
>
> instead of
>
> foo:
> imull $123, %edi, %eax
> movl bar(%rip), %edx
> addl %edx, %eax
> ret
>
> Tested with Linux kernel 6.17.4 on Intel Core i7-1195G7.
>
> TODO: Investigate if adding the LOCK prefix to allow volatile memory store
> will improve performance since the LOCK prefix may be expensive.
>
> gcc/
>
> PR target/122343
> * combine.cc (recog_for_combine_1): Set and clear
> recog_data.try_combine_insn when calling recog.
> * recog.cc (general_operand): Call targetm.volatile_mem_ok_in_insn.
> * recog.h (recog_data_d): Add try_combine_insn.
> * target.def (volatile_mem_ok_in_insn): New hook.
> * targhooks.cc (default_volatile_mem_ok_in_insn): New.
> * targhooks.h (default_volatile_mem_ok_in_insn): Likewise.
> * config/i386/i386.cc (ix86_volatile_mem_ok_in_insn): Likewise.
> (TARGET_VOLATILE_MEM_OK_IN_INSN): Likewise.
> * doc/invoke.texi: Document -mfuse-ops-with-volatile-access.
> * doc/tm.texi: Regenerated.
> * doc/tm.texi.in: Add TARGET_VOLATILE_MEM_OK_IN_INSN.

Hi Richard, Jeff,

Does this patch look OK?

Thanks.

> gcc/testsuite/
>
> PR target/122343
> * gcc.target/i386/avx-ne-convert-1.c: Compile with
> -mno-fuse-ops-with-volatile-access.
> * gcc.target/i386/avx10_2-bf16-1.c: Likewise.
> * gcc.target/i386/avx10_2-convert-1.c: Likewise.
> * gcc.target/i386/avx10_2-satcvt-1.c: Likewise.
> * gcc.target/i386/avx512bf16-vcvtneps2bf16-1.c: Likewise.
> * gcc.target/i386/avx512bf16vl-vcvtneps2bf16-1a.c: Likewise.
> * gcc.target/i386/avx512bf16vl-vcvtneps2bf16-1b.c: Likewise.
> * gcc.target/i386/avx512dq-vcvtps2qq-1.c: Likewise.
> * gcc.target/i386/avx512dq-vcvtps2uqq-1.c: Likewise.
> * gcc.target/i386/avx512dq-vcvtqq2pd-1.c: Likewise.
> * gcc.target/i386/avx512dq-vcvtqq2ps-1.c: Likewise.
> * gcc.target/i386/avx512dq-vcvttps2qq-1.c: Likewise.
> * gcc.target/i386/avx512dq-vcvttps2uqq-1.c: Likewise.
> * gcc.target/i386/avx512dq-vcvtuqq2pd-1.c: Likewise.
> * gcc.target/i386/avx512dq-vcvtuqq2ps-1.c: Likewise.
> * gcc.target/i386/avx512dq-vfpclasspd-1.c: Likewise.
> * gcc.target/i386/avx512dq-vfpclassps-1.c: Likewise.
> * gcc.target/i386/avx512dq-vfpclasssd-1.c: Likewise.
> * gcc.target/i386/avx512dq-vfpclassss-1.c: Likewise.
> * gcc.target/i386/avx512dq-vpmullq-1.c: Likewise.
> * gcc.target/i386/avx512dq-vpmullq-3.c: Likewise.
> * gcc.target/i386/avx512f-pr100267-1.c: Likewise.
> * gcc.target/i386/avx512f-vcvtps2pd-1.c: Likewise.
> * gcc.target/i386/avx512f-vcvtsd2si-1.c: Likewise.
> * gcc.target/i386/avx512f-vcvtsd2si64-1.c: Likewise.
> * gcc.target/i386/avx512f-vcvtsd2usi-1.c: Likewise.
> * gcc.target/i386/avx512f-vcvtsd2usi64-1.c: Likewise.
> * gcc.target/i386/avx512f-vcvtss2si-1.c: Likewise.
> * gcc.target/i386/avx512f-vcvtss2si64-1.c: Likewise.
> * gcc.target/i386/avx512f-vcvtss2usi-1.c: Likewise.
> * gcc.target/i386/avx512f-vcvtss2usi64-1.c: Likewise.
> * gcc.target/i386/avx512f-vcvttsd2si-1.c: Likewise.
> * gcc.target/i386/avx512f-vcvttsd2si64-1.c: Likewise.
> * gcc.target/i386/avx512f-vcvttsd2usi-1.c: Likewise.
> * gcc.target/i386/avx512f-vcvttsd2usi64-1.c: Likewise.
> * gcc.target/i386/avx512f-vcvttss2si-1.c: Likewise.
> * gcc.target/i386/avx512f-vcvttss2si64-1.c: Likewise.
> * gcc.target/i386/avx512f-vcvttss2usi-1.c: Likewise.
> * gcc.target/i386/avx512f-vcvttss2usi64-1.c: Likewise.
> * gcc.target/i386/avx512f-vmovapd-1.c: Likewise.
> * gcc.target/i386/avx512f-vmovaps-1.c: Likewise.
> * gcc.target/i386/avx512f-vmovdqa64-1.c: Likewise.
> * gcc.target/i386/avx512f-vpandnq-1.c: Likewise.
> * gcc.target/i386/avx512f-vpbroadcastd-1.c: Likewise.
> * gcc.target/i386/avx512f-vpbroadcastq-1.c: Likewise.
> * gcc.target/i386/avx512f-vpcmpeqq-1.c: Likewise.
> * gcc.target/i386/avx512f-vpcmpequq-1.c: Likewise.
> * gcc.target/i386/avx512f-vrndscalepd-1.c: Likewise.
> * gcc.target/i386/avx512f-vrndscaleps-1.c: Likewise.
> * gcc.target/i386/avx512fp16-complex-fma.c: Likewise.
> * gcc.target/i386/avx512fp16-vaddph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vcvtpd2ph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vcvtph2dq-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vcvtph2pd-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vcvtph2psx-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vcvtph2qq-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vcvtph2udq-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vcvtph2uqq-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vcvtph2uw-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vcvtph2w-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vcvtps2ph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vcvtqq2ph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vcvttph2dq-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vcvttph2qq-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vcvttph2udq-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vcvttph2uqq-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vcvttph2uw-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vcvttph2w-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vcvtuqq2ph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vfcmaddcph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vfcmulcph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vfmaddcph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vfmulcph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vfpclassph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vfpclasssh-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vmulph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vrcpph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vrsqrtph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16-vsqrtph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vaddph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vcvtpd2ph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vcvtph2dq-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vcvtph2psx-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vcvtph2qq-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vcvtph2udq-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vcvtph2uqq-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vcvtph2uw-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vcvtph2w-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vcvtps2ph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vcvtqq2ph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vcvttph2dq-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vcvttph2udq-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vcvttph2uw-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vcvttph2w-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vcvtuqq2ph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vfcmaddcph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vfcmulcph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vfmaddcph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vfmulcph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vfpclassph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vmulph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vrcpph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vrsqrtph-1a.c: Likewise.
> * gcc.target/i386/avx512fp16vl-vsqrtph-1a.c: Likewise.
> * gcc.target/i386/avx512vl-pr100267-1.c: Likewise.
> * gcc.target/i386/avx512vl-vcvtpd2ps-1.c: Likewise.
> * gcc.target/i386/avx512vl-vcvtpd2udq-1.c: Likewise.
> * gcc.target/i386/avx512vl-vcvttpd2udq-1.c: Likewise.
> * gcc.target/i386/avx512vl-vcvttps2udq-1.c: Likewise.
> * gcc.target/i386/avx512vl-vmovapd-1.c: Likewise.
> * gcc.target/i386/avx512vl-vmovaps-1.c: Likewise.
> * gcc.target/i386/avx512vl-vmovdqa64-1.c: Likewise.
> * gcc.target/i386/avx512vl-vpcmpeqq-1.c: Likewise.
> * gcc.target/i386/avx512vl-vpcmpequq-1.c: Likewise.
> * gcc.target/i386/avx512vl-vpcmpuq-1.c: Likewise.
> * gcc.target/i386/pr122343-1a.c: New test.
> * gcc.target/i386/pr122343-1b.c: Likewise.
> * gcc.target/i386/pr122343-2a.c: Likewise.
> * gcc.target/i386/pr122343-2b.c: Likewise.
> * gcc.target/i386/pr122343-3.c: Likewise.
> * gcc.target/i386/pr122343-4a.c: Likewise.
> * gcc.target/i386/pr122343-4b.c: Likewise.
>
> --
> H.J.



-- 
H.J.

Reply via email to