On Mon, Oct 27, 2025 at 8:43 PM Richard Biener
<[email protected]> wrote:
>
> On Mon, Oct 27, 2025 at 12:16 PM H.J. Lu <[email protected]> wrote:
> >
> > On Mon, Oct 27, 2025 at 4:37 PM Richard Biener
> > <[email protected]> wrote:
> > >
> > > On Fri, Oct 24, 2025 at 11:25 PM H.J. Lu <[email protected]> wrote:
> > > >
> > > > On Fri, Oct 24, 2025 at 10:23 PM Georg-Johann Lay <[email protected]> wrote:
> > > > >
> > > > > Am 24.10.25 um 15:45 schrieb H.J. Lu:
> > > > > > For x86, volatile memory can be used as source operand for loads.  
> > > > > > Add
> > > > > > -fcombine-op-with-volatile-memory-load to allow operations with 
> > > > > > volatile
> > > > > > memory load:
> > > > > >
> > > > > > 1. Add a field, try_combine_insn, to recog_data_d for the combine 
> > > > > > pass to
> > > > > >     pass the instruction information to general_operand.
> > > > > > 2. Update recog_for_combine_1 to set recog_data.try_combine_insn 
> > > > > > and clear
> > > > > >     when calling recog.
> > > > > > 3. Add a function, volatile_memory_ok, which returns true if 
> > > > > > volatile
> > > > > >     memory is used for load and 
> > > > > > -fcombine-op-with-volatile-memory-load is
> > > > > >     enabled.
> > > > > > 4. Update general_operand to call volatile_memory_ok to determine if
> > > > > >     a memory operand is OK.
> > > > > >
> > > > > > Enabled at -O1 and higher for x86.  It optimizes
> > > > >
> > > > > Hi, what's the reason why this is not enabled for all targets?
> > > >
> > > > There are 2 reasons:
> > > >
> > > > 1.  RISC processors only allow moves with memory.   Since this 
> > > > optimization
> > > > only applies to other load operations, it is nop for most RISC 
> > > > processors.
> > > > 2.  Some GCC tests scan compiler assembly outputs which don't expect 
> > > > other
> > > > load operations with volatile memory.   I have to disable this 
> > > > optimization
> > > > in 100+ x86 tests in my patch.  If this is enabled for other targets, 
> > > > it may
> > > > cause many GCC tests failures.
> > > >
> > > > If your target supports this, you can give it a try.
> > > >
> > > > > Can it break volatile correctness or so?
> > > >
> > > > No.   My patch has some tests to verify the volatile correctness.
> > >
> > > IMO the -fcombine-op-with-volatile-memory-load is quite a bad name.
> > > Options should have names that mean something to users.  As this seems to
> > > supposed to be a target opt-in I'd suggest to not expose this to users
> > > (or if targets are concerned, via some -m...) but instead make this a
> >
> > -mvolatile-memory-load?
>
> -mfuse-ops-with-volatile-access?
>
> So as to eventually also allow RMW?

Yes.  We can add the LOCK prefix for RMW if it is allowed.  But will it
improve performance?  The LOCK prefix may be expensive.

> > > target hook/macro.  Of course I think this should be enabled by default.
> >
> > targetm.volatile_memory_ok (rtx mem)?
>
> If we want to make this 'mem' specific then maybe even
>
> targetm.volatile_mem_ok_in_insn (rtx_insn insn, rtx *mem)
>
> but I think this is overkill until we run into a specific sub-case we want to
> disable.  So
>
> bool targetm.fuse_ops_with_volatile_mem ()
>
> I'd say.
>
> Richard.
>
> >
> > > I have not looked at the implementation, so no comments on that
> > > (I'd have chosen late-combine or fwprop as "easier to understand" targets
> > > I guess).
> > >
> > > Richard.
> > >
> > > > > Johann
> > > > >
> > > > >
> > > > > > extern volatile int bar;
> > > > > >
> > > > > > int
> > > > > > foo (int z)
> > > > > > {
> > > > > >    z *= 123;
> > > > > >    return bar + z;
> > > > > > }
> > > > > >
> > > > > > into
> > > > > >
> > > > > > foo:
> > > > > > imull $123, %edi, %eax
> > > > > > addl bar(%rip), %eax
> > > > > > ret
> > > > > >
> > > > > > instead of
> > > > > >
> > > > > > foo:
> > > > > > imull $123, %edi, %eax
> > > > > > movl bar(%rip), %edx
> > > > > > addl %edx, %eax
> > > > > > ret
> > > > > >
> > > > > > Tested with Linux kernel 6.17.4 on Intel Core i7-1195G7.
> > > > > >
> > > > > > gcc/
> > > > > >
> > > > > > PR target/122343
> > > > > > * combine.cc (recog_for_combine_1): Set and clear
> > > > > > recog_data.try_combine_insn when calling recog.
> > > > > > * common.opt (-fcombine-op-with-volatile-memory-load): New option.
> > > > > > * recog.cc (volatile_memory_ok): New.
> > > > > > (general_operand): Call volatile_memory_ok.
> > > > > > * recog.h (recog_data_d): Add try_combine_insn.
> > > > > > * config/i386/i386-options.cc (ix86_recompute_optlev_based_flags):
> > > > > > Set -fcombine-op-with-volatile-memory-load for -O1 and above if
> > > > > > unset.
> > > > > > * doc/invoke.texi: Document -fcombine-op-with-volatile-memory-load.
> > > > > >
> > > > > > gcc/testsuite/
> > > > > >
> > > > > > PR target/122343
> > > > > > * gcc.target/i386/avx10_2-bf16-1.c: Compile with
> > > > > > -fno-combine-op-with-volatile-memory-load.
> > > > > > * gcc.target/i386/avx10_2-convert-1.c: Likewise.
> > > > > > * gcc.target/i386/avx10_2-satcvt-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512bf16-vcvtneps2bf16-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512bf16vl-vcvtneps2bf16-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512bf16vl-vcvtneps2bf16-1b.c: Likewise.
> > > > > > * gcc.target/i386/avx512dq-vcvtps2qq-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512dq-vcvtps2uqq-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512dq-vcvtqq2pd-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512dq-vcvtqq2ps-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512dq-vcvttps2qq-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512dq-vcvttps2uqq-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512dq-vcvtuqq2pd-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512dq-vcvtuqq2ps-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512dq-vfpclasspd-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512dq-vfpclassps-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512dq-vfpclasssd-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512dq-vfpclassss-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512dq-vpmullq-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512dq-vpmullq-3.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-complex-fma.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vaddph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vcvtpd2ph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vcvtph2dq-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vcvtph2pd-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vcvtph2psx-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vcvtph2qq-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vcvtph2udq-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vcvtph2uqq-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vcvtph2uw-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vcvtph2w-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vcvtps2ph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vcvtqq2ph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vcvttph2dq-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vcvttph2qq-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vcvttph2udq-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vcvttph2uqq-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vcvttph2uw-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vcvttph2w-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vcvtuqq2ph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vfcmaddcph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vfcmulcph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vfmaddcph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vfmulcph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vfpclassph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vfpclasssh-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vaddph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vcvtpd2ph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vcvtph2dq-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vcvtph2psx-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vcvtph2qq-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vcvtph2udq-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vcvtph2uqq-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vcvtph2uw-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vcvtph2w-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vcvtps2ph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vcvtqq2ph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vcvttph2dq-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vcvttph2udq-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vcvttph2uw-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vcvttph2w-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vcvtuqq2ph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vfcmaddcph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vfcmulcph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vfmaddcph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vfmulcph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vfpclassph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vmulph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vrcpph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vrsqrtph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16vl-vsqrtph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vmulph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vrcpph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vrsqrtph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512fp16-vsqrtph-1a.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-pr100267-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vcvtps2pd-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vcvtsd2si-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vcvtsd2si64-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vcvtsd2usi-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vcvtsd2usi64-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vcvtss2si-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vcvtss2si64-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vcvtss2usi-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vcvtss2usi64-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vcvttsd2si-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vcvttsd2si64-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vcvttsd2usi-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vcvttsd2usi64-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vcvttss2si-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vcvttss2si64-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vcvttss2usi-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vcvttss2usi64-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vmovapd-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vmovaps-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vmovdqa64-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vpandnq-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vpbroadcastd-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vpbroadcastq-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vpcmpeqq-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vpcmpequq-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vrndscalepd-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512f-vrndscaleps-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512vl-pr100267-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512vl-vcvtpd2ps-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512vl-vcvtpd2udq-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512vl-vcvttpd2udq-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512vl-vcvttps2udq-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512vl-vmovapd-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512vl-vmovaps-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512vl-vmovdqa64-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512vl-vpcmpeqq-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512vl-vpcmpequq-1.c: Likewise.
> > > > > > * gcc.target/i386/avx512vl-vpcmpuq-1.c: Likewise.
> > > > > > * gcc.target/i386/avx-ne-convert-1.c: Likewise.
> > > > > > * gcc.target/i386/pr122343-1a.c: New test.
> > > > > > * gcc.target/i386/pr122343-1b.c: Likewise.
> > > > > > * gcc.target/i386/pr122343-2a.c: Likewise.
> > > > > > * gcc.target/i386/pr122343-2b.c: Likewise.
> > > > > > * gcc.target/i386/pr122343-3.c: Likewise.
> > > > > > * gcc.target/i386/pr122343-4a.c: Likewise.
> > > > > > * gcc.target/i386/pr122343-4b.c: Likewise.
> > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > H.J.
> >
> >
> >
> > --
> > H.J.



-- 
H.J.

Reply via email to