On Wed, Nov 5, 2025 at 3:12 PM H.J. Lu <[email protected]> wrote: > > On Tue, Oct 28, 2025 at 10:31 AM H.J. Lu <[email protected]> wrote: > > > > On Tue, Oct 28, 2025 at 5:53 AM H.J. Lu <[email protected]> wrote: > > > > > > On Tue, Oct 28, 2025 at 12:04 AM Jeff Law <[email protected]> wrote: > > > > > > > > > > > > > > > > On 10/27/25 2:37 AM, Richard Biener wrote: > > > > ? > > > > >> > > > > >> There are 2 reasons: > > > > >> > > > > >> 1. RISC processors only allow moves with memory. Since this > > > > >> optimization > > > > >> only applies to other load operations, it is nop for most RISC > > > > >> processors. > > > > >> 2. Some GCC tests scan compiler assembly outputs which don't expect > > > > >> other > > > > >> load operations with volatile memory. I have to disable this > > > > >> optimization > > > > >> in 100+ x86 tests in my patch. If this is enabled for other > > > > >> targets, it may > > > > >> cause many GCC tests failures. > > > > >> > > > > >> If your target supports this, you can give it a try. > > > > >> > > > > >>> Can it break volatile correctness or so? > > > > >> > > > > >> No. My patch has some tests to verify the volatile correctness. > > > > > > > > > > IMO the -fcombine-op-with-volatile-memory-load is quite a bad name. > > > > > Options should have names that mean something to users. As this > > > > > seems to > > > > > supposed to be a target opt-in I'd suggest to not expose this to users > > > > > (or if targets are concerned, via some -m...) but instead make this a > > > > > target hook/macro. Of course I think this should be enabled by > > > > > default. > > > > > > > > If we're going to go with this patch or a variant thereof; I'd go a step > > > > further and not bother with a target hook. If the semantics are wrong > > > > for some case, we're more likely to find it the more targets use the > > > > option. And it sounds like the main purpose behind the hook is to avoid > > > > working through any testsuite issues on those other targets, which isn't > > > > a great reason to introduce a new target hook. > > > > > > One advantage of a target hook is that if a target doesn't opt in, there > > > no need to do additional check: > > > > > > diff --git a/gcc/recog.cc b/gcc/recog.cc > > > index 67d7fa63069..fc411842859 100644 > > > --- a/gcc/recog.cc > > > +++ b/gcc/recog.cc > > > @@ -1596,7 +1596,8 @@ general_operand (rtx op, machine_mode mode) > > > { > > > rtx y = XEXP (op, 0); > > > > > > - if (! volatile_ok && MEM_VOLATILE_P (op)) > > > + if (!targetm.volatile_mem_ok_in_insn (op, > > > + recog_data.try_combine_insn)) > > > return false; > > > > > > /* Use the mem's mode, since it will be reloaded thus. LRA can > > > diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc > > > index 1873d572ba3..ece1f496571 100644 > > > --- a/gcc/targhooks.cc > > > +++ b/gcc/targhooks.cc > > > @@ -1839,6 +1839,14 @@ default_new_address_profitable_p (rtx memref > > > ATTRIBUTE_UNUSED, > > > return true; > > > } > > > > > > +/* The default implementation of TARGET_VOLATILE_MEM_OK_IN_INSN. */ > > > + > > > +bool > > > +default_volatile_mem_ok_in_insn (rtx memref, rtx_insn *) > > > +{ > > > + return volatile_ok || !MEM_VOLATILE_P (memref); > > > +} > > > + > > > > > > > As HJ indicated, I wouldn't expect significant fallout on load/store > > > > architectures. So the scope of fallout isn't likely to be too terrible. > > > > > > > > I was going to throw it in my tester, but it doesn't apply cleanly on > > > > the trunk. > > > > > > > > > patching file gcc/recog.cc > > > > > Hunk #1 FAILED at 116. > > > > > 1 out of 2 hunks FAILED -- saving rejects to file gcc/recog.cc.rej > > > > > > > > > > I put my branch at > > > > > > https://gitlab.com/x86-gcc/gcc/-/commits/users/hjl/pr122343/master > > > > > > on top of > > > > > > commit 76943639ddd861dce3886d1def2a353ccfcdd585 > > > Author: Eric Botcazou <[email protected]> > > > Date: Mon Oct 27 19:51:11 2025 +0100 > > > > > > Ada: Fix assertion failure on child generic package > > > > > > Does it work for you? > > > > > > > Here is the patch to add TARGET_VOLATILE_MEM_OK_IN_INSN. > > > > For x86, volatile memory can be used as source operand for loads. Add > > TARGET_VOLATILE_MEM_OK_IN_INSN to allow a target to support operations > > with volatile memory load: > > > > 1. Add a field, try_combine_insn, to recog_data_d for the combine pass to > > pass the instruction information to general_operand. > > 2. Update recog_for_combine_1 to set recog_data.try_combine_insn before > > calling recog and clear it afterward. > > 3. Add a target hook, TARGET_VOLATILE_MEM_OK_IN_INSN which returns true > > if a volatile memory reference is OK in an instruction. The default > > returns true only if volatile memory reference is OK or the memory > > reference isn't volatile. > > 4. Update general_operand to call targetm.volatile_mem_ok_in_insn to > > determine if a memory operand is OK. > > 5. Add an x86 TARGET_VOLATILE_MEM_OK_IN_INSN hook which returns true when > > volatile memory is used in load and -mfuse-ops-with-volatile-access is > > enabled. > > > > This optimizes > > > > extern volatile int bar; > > > > int > > foo (int z) > > { > > z *= 123; > > return bar + z; > > } > > > > into > > > > foo: > > imull $123, %edi, %eax > > addl bar(%rip), %eax > > ret > > > > instead of > > > > foo: > > imull $123, %edi, %eax > > movl bar(%rip), %edx > > addl %edx, %eax > > ret > > > > Tested with Linux kernel 6.17.4 on Intel Core i7-1195G7. > > > > TODO: Investigate if adding the LOCK prefix to allow volatile memory store > > will improve performance since the LOCK prefix may be expensive. > > > > gcc/ > > > > PR target/122343 > > * combine.cc (recog_for_combine_1): Set and clear > > recog_data.try_combine_insn when calling recog. > > * recog.cc (general_operand): Call targetm.volatile_mem_ok_in_insn. > > * recog.h (recog_data_d): Add try_combine_insn. > > * target.def (volatile_mem_ok_in_insn): New hook. > > * targhooks.cc (default_volatile_mem_ok_in_insn): New. > > * targhooks.h (default_volatile_mem_ok_in_insn): Likewise. > > * config/i386/i386.cc (ix86_volatile_mem_ok_in_insn): Likewise. > > (TARGET_VOLATILE_MEM_OK_IN_INSN): Likewise. > > * doc/invoke.texi: Document -mfuse-ops-with-volatile-access. > > * doc/tm.texi: Regenerated. > > * doc/tm.texi.in: Add TARGET_VOLATILE_MEM_OK_IN_INSN. > > Hi Richard, Jeff, > > Does this patch look OK? > > Thanks. > > > gcc/testsuite/ > > > > PR target/122343 > > * gcc.target/i386/avx-ne-convert-1.c: Compile with > > -mno-fuse-ops-with-volatile-access. > > * gcc.target/i386/avx10_2-bf16-1.c: Likewise. > > * gcc.target/i386/avx10_2-convert-1.c: Likewise. > > * gcc.target/i386/avx10_2-satcvt-1.c: Likewise. > > * gcc.target/i386/avx512bf16-vcvtneps2bf16-1.c: Likewise. > > * gcc.target/i386/avx512bf16vl-vcvtneps2bf16-1a.c: Likewise. > > * gcc.target/i386/avx512bf16vl-vcvtneps2bf16-1b.c: Likewise. > > * gcc.target/i386/avx512dq-vcvtps2qq-1.c: Likewise. > > * gcc.target/i386/avx512dq-vcvtps2uqq-1.c: Likewise. > > * gcc.target/i386/avx512dq-vcvtqq2pd-1.c: Likewise. > > * gcc.target/i386/avx512dq-vcvtqq2ps-1.c: Likewise. > > * gcc.target/i386/avx512dq-vcvttps2qq-1.c: Likewise. > > * gcc.target/i386/avx512dq-vcvttps2uqq-1.c: Likewise. > > * gcc.target/i386/avx512dq-vcvtuqq2pd-1.c: Likewise. > > * gcc.target/i386/avx512dq-vcvtuqq2ps-1.c: Likewise. > > * gcc.target/i386/avx512dq-vfpclasspd-1.c: Likewise. > > * gcc.target/i386/avx512dq-vfpclassps-1.c: Likewise. > > * gcc.target/i386/avx512dq-vfpclasssd-1.c: Likewise. > > * gcc.target/i386/avx512dq-vfpclassss-1.c: Likewise. > > * gcc.target/i386/avx512dq-vpmullq-1.c: Likewise. > > * gcc.target/i386/avx512dq-vpmullq-3.c: Likewise. > > * gcc.target/i386/avx512f-pr100267-1.c: Likewise. > > * gcc.target/i386/avx512f-vcvtps2pd-1.c: Likewise. > > * gcc.target/i386/avx512f-vcvtsd2si-1.c: Likewise. > > * gcc.target/i386/avx512f-vcvtsd2si64-1.c: Likewise. > > * gcc.target/i386/avx512f-vcvtsd2usi-1.c: Likewise. > > * gcc.target/i386/avx512f-vcvtsd2usi64-1.c: Likewise. > > * gcc.target/i386/avx512f-vcvtss2si-1.c: Likewise. > > * gcc.target/i386/avx512f-vcvtss2si64-1.c: Likewise. > > * gcc.target/i386/avx512f-vcvtss2usi-1.c: Likewise. > > * gcc.target/i386/avx512f-vcvtss2usi64-1.c: Likewise. > > * gcc.target/i386/avx512f-vcvttsd2si-1.c: Likewise. > > * gcc.target/i386/avx512f-vcvttsd2si64-1.c: Likewise. > > * gcc.target/i386/avx512f-vcvttsd2usi-1.c: Likewise. > > * gcc.target/i386/avx512f-vcvttsd2usi64-1.c: Likewise. > > * gcc.target/i386/avx512f-vcvttss2si-1.c: Likewise. > > * gcc.target/i386/avx512f-vcvttss2si64-1.c: Likewise. > > * gcc.target/i386/avx512f-vcvttss2usi-1.c: Likewise. > > * gcc.target/i386/avx512f-vcvttss2usi64-1.c: Likewise. > > * gcc.target/i386/avx512f-vmovapd-1.c: Likewise. > > * gcc.target/i386/avx512f-vmovaps-1.c: Likewise. > > * gcc.target/i386/avx512f-vmovdqa64-1.c: Likewise. > > * gcc.target/i386/avx512f-vpandnq-1.c: Likewise. > > * gcc.target/i386/avx512f-vpbroadcastd-1.c: Likewise. > > * gcc.target/i386/avx512f-vpbroadcastq-1.c: Likewise. > > * gcc.target/i386/avx512f-vpcmpeqq-1.c: Likewise. > > * gcc.target/i386/avx512f-vpcmpequq-1.c: Likewise. > > * gcc.target/i386/avx512f-vrndscalepd-1.c: Likewise. > > * gcc.target/i386/avx512f-vrndscaleps-1.c: Likewise. > > * gcc.target/i386/avx512fp16-complex-fma.c: Likewise. > > * gcc.target/i386/avx512fp16-vaddph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vcvtpd2ph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vcvtph2dq-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vcvtph2pd-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vcvtph2psx-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vcvtph2qq-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vcvtph2udq-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vcvtph2uqq-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vcvtph2uw-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vcvtph2w-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vcvtps2ph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vcvtqq2ph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vcvttph2dq-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vcvttph2qq-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vcvttph2udq-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vcvttph2uqq-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vcvttph2uw-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vcvttph2w-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vcvtuqq2ph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vfcmaddcph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vfcmulcph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vfmaddcph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vfmulcph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vfpclassph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vfpclasssh-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vmulph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vrcpph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vrsqrtph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16-vsqrtph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vaddph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vcvtpd2ph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vcvtph2dq-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vcvtph2psx-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vcvtph2qq-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vcvtph2udq-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vcvtph2uqq-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vcvtph2uw-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vcvtph2w-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vcvtps2ph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vcvtqq2ph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vcvttph2dq-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vcvttph2udq-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vcvttph2uw-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vcvttph2w-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vcvtuqq2ph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vfcmaddcph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vfcmulcph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vfmaddcph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vfmulcph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vfpclassph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vmulph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vrcpph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vrsqrtph-1a.c: Likewise. > > * gcc.target/i386/avx512fp16vl-vsqrtph-1a.c: Likewise. > > * gcc.target/i386/avx512vl-pr100267-1.c: Likewise. > > * gcc.target/i386/avx512vl-vcvtpd2ps-1.c: Likewise. > > * gcc.target/i386/avx512vl-vcvtpd2udq-1.c: Likewise. > > * gcc.target/i386/avx512vl-vcvttpd2udq-1.c: Likewise. > > * gcc.target/i386/avx512vl-vcvttps2udq-1.c: Likewise. > > * gcc.target/i386/avx512vl-vmovapd-1.c: Likewise. > > * gcc.target/i386/avx512vl-vmovaps-1.c: Likewise. > > * gcc.target/i386/avx512vl-vmovdqa64-1.c: Likewise. > > * gcc.target/i386/avx512vl-vpcmpeqq-1.c: Likewise. > > * gcc.target/i386/avx512vl-vpcmpequq-1.c: Likewise. > > * gcc.target/i386/avx512vl-vpcmpuq-1.c: Likewise. > > * gcc.target/i386/pr122343-1a.c: New test. > > * gcc.target/i386/pr122343-1b.c: Likewise. > > * gcc.target/i386/pr122343-2a.c: Likewise. > > * gcc.target/i386/pr122343-2b.c: Likewise. > > * gcc.target/i386/pr122343-3.c: Likewise. > > * gcc.target/i386/pr122343-4a.c: Likewise. > > * gcc.target/i386/pr122343-4b.c: Likewise. > > > > -- > > H.J. >
PING. -- H.J.
