On Mon, Jul 4, 2022 at 7:27 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> Hi Uros,
> Thanks for the review. This patch implements all of your suggestions, both
> removing ix86_pre_reload_split from the combine splitter(s), and dividing
> the original splitter up into four simpler variants, that use match_dup to
> handle the variants/permutations caused by operator commutativity.
>
> This revised patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32} with no
> new failures. Ok for mainline?
>
> 2022-07-04  Roger Sayle  <ro...@nextmovesoftware.com>
>             Uroš Bizjak  <ubiz...@gmail.com>
>
> gcc/ChangeLog
>         PR rtl-optimization/96692
>         * config/i386/i386.md (define_split): Split ((A | B) ^ C) ^ D
>         as (X & ~Y) ^ Z on target BMI when either C or D is A or B.
>
> gcc/testsuite/ChangeLog
>         PR rtl-optimization/96692
>         * gcc.target/i386/bmi-andn-4.c: New test case.

OK.

Thanks,
Uros.

>
> Thanks again,
> Roger
> --
>
> > -----Original Message-----
> > From: Uros Bizjak <ubiz...@gmail.com>
> > Sent: 26 June 2022 18:08
> > To: Roger Sayle <ro...@nextmovesoftware.com>
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [x86 PATCH] PR rtl-optimization/96692: ((A|B)^C)^A using andn
> > with -mbmi.
> >
> > On Sun, Jun 26, 2022 at 2:04 PM Roger Sayle <ro...@nextmovesoftware.com>
> > wrote:
> > >
> > >
> > > This patch addresses PR rtl-optimization/96692 on x86_64, by providing
> > > a define_split for combine to convert the three operation ((A|B)^C)^D
> > > into a two operation sequence using andn when either A or B is the
> > > same register as C or D. This is essentially a reassociation problem
> > > that's only a win if the target supports an and-not instruction (as with
> > > -mbmi).
> > >
> > > Hence for the new test case:
> > >
> > > int f(int a, int b, int c)
> > > {
> > >   return (a ^ b) ^ (a | c);
> > > }
> > >
> > > GCC on x86_64-pc-linux-gnu with -O2 -mbmi would previously generate:
> > >
> > >         xorl    %edi, %esi
> > >         orl     %edx, %edi
> > >         movl    %esi, %eax
> > >         xorl    %edi, %eax
> > >         ret
> > >
> > > but with this patch now generates:
> > >
> > >         andn    %edx, %edi, %eax
> > >         xorl    %esi, %eax
> > >         ret
> > >
> > > I'll investigate whether this optimization can also be implemented
> > > more generically in simplify_rtx when the backend provides accurate
> > > rtx_costs for "(and (not ..." (as there's no optab for andn).
> > >
> > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > > and make -k check, both with and without --target_board=unix{-m32},
> > > with no new failures. Ok for mainline?
> > >
> > >
> > > 2022-06-26  Roger Sayle  <ro...@nextmovesoftware.com>
> > >
> > > gcc/ChangeLog
> > >         PR rtl-optimization/96692
> > >         * config/i386/i386.md (define_split): Split ((A | B) ^ C) ^ D
> > >         as (X & ~Y) ^ Z on target BMI when either C or D is A or B.
> > >
> > > gcc/testsuite/ChangeLog
> > >         PR rtl-optimization/96692
> > >         * gcc.target/i386/bmi-andn-4.c: New test case.
> >
> > +  "TARGET_BMI
> > +   && ix86_pre_reload_split ()
> > +   && (rtx_equal_p (operands[1], operands[3])
> > +       || rtx_equal_p (operands[1], operands[4])
> > +       || (REG_P (operands[2])
> > +           && (rtx_equal_p (operands[2], operands[3])
> > +               || rtx_equal_p (operands[2], operands[4]))))"
> >
> > You don't need an ix86_pre_reload_split for combine splitter*
> >
> > OTOH, please split the pattern into two for each commutative operand and
> > use (match_dup x) instead. Something similar to [1].
> >
> > *combine splitter is described in the documentation as the splitter
> > pattern that does *not* match any existing insn pattern.
> >
> > [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596804.html
> >
> > Uros.
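
For readers following the thread, the reassociation behind the split can be checked at the C level. The sketch below is not part of the patch or of bmi-andn-4.c. It only illustrates why the test case's (a ^ b) ^ (a | c) equals (c & ~a) ^ b, which is the and-not plus xor sequence shown in the generated code. The names f_original, f_andn and count_mismatches are made up for this illustration, and unsigned operands are used to keep the bitwise behaviour unambiguous.

```c
#include <stdint.h>

/* Form used in the thread's test case: xor, or, xor. */
uint32_t f_original(uint32_t a, uint32_t b, uint32_t c)
{
    return (a ^ b) ^ (a | c);
}

/* Reassociated form: a | c == a ^ (c & ~a), because a and (c & ~a)
   share no set bits, so the two occurrences of a cancel under xor.
   This leaves one and-not and one xor. */
uint32_t f_andn(uint32_t a, uint32_t b, uint32_t c)
{
    return (c & ~a) ^ b;
}

/* Hypothetical helper: count inputs below `limit` where the two forms
   disagree. The identity is bitwise, so a small range already covers
   every combination of input bits. */
unsigned count_mismatches(uint32_t limit)
{
    unsigned mismatches = 0;
    for (uint32_t a = 0; a < limit; a++)
        for (uint32_t b = 0; b < limit; b++)
            for (uint32_t c = 0; c < limit; c++)
                if (f_original(a, b, c) != f_andn(a, b, c))
                    mismatches++;
    return mismatches;
}
```

Calling count_mismatches(8) from a small driver should report zero mismatches. Whether a compiler actually emits andn for f_andn depends on the target flags (for example -mbmi) and on the GCC version, so the assembly is best inspected rather than assumed.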