On Fri, Jun 16, 2023 at 3:27 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> Hi Uros,
> Here's an updated version of this patch incorporating your comments.
> It uses emit_move_insn (target, const1_rtx), uses bt_comparison_operator
> to combine the sete/setne to setc/setnc, and je/jne to jc/jnc patterns,
> uses scan-assembler-times in the test cases, and cleans up the silly
> cut'n'paste issue that mangled strict_low_part/subreg of a register
> that was already QImode. I tried dropping it, but the strict_low_part
> variant really is required (some of the new test cases fail without it).
> Still, things are much neater now, with fewer patterns than the original.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures. Ok for mainline?
>
>
> 2023-06-16  Roger Sayle  <ro...@nextmovesoftware.com>
>             Uros Bizjak  <ubiz...@gmail.com>
>
> gcc/ChangeLog
>         * config/i386/i386-expand.cc (ix86_expand_sse_ptest): Recognize
>         expansion of ptestc with equal operands as producing const1_rtx.
>         * config/i386/i386.cc (ix86_rtx_costs): Provide accurate cost
>         estimates of UNSPEC_PTEST, where the ptest performs the PAND
>         or PANDN of its operands.
>         * config/i386/sse.md (define_split): Transform CCCmode UNSPEC_PTEST
>         of reg_equal_p operands into an x86_stc instruction.
>         (define_split): Split pandn/ptestz/set{n?}e into ptestc/set{n?}c.
>         (define_split): Similar to above for strict_low_part destinations.
>         (define_split): Split pandn/ptestz/j{n?}e into ptestc/j{n?}c.
>
> gcc/testsuite/ChangeLog
>         * gcc.target/i386/avx-vptest-4.c: New test case.
>         * gcc.target/i386/avx-vptest-5.c: Likewise.
>         * gcc.target/i386/avx-vptest-6.c: Likewise.
>         * gcc.target/i386/pr109973-1.c: Update test case.
>         * gcc.target/i386/pr109973-2.c: Likewise.
>         * gcc.target/i386/sse4_1-ptest-4.c: New test case.
>         * gcc.target/i386/sse4_1-ptest-5.c: Likewise.
>         * gcc.target/i386/sse4_1-ptest-6.c: Likewise.

+(define_split
+  [(set (strict_low_part (subreg:QI (match_operand:SI 0 "register_operand") 0))

I think you should use

(set (strict_low_part (match_operand:QI 0 "register_operand"))
...

here and ...

+   (set (strict_low_part (subreg:QI (match_dup 0) 0))

the corresponding

(set (strict_low_part (match_dup 0))...

without an explicit SUBREG here. This will handle all subregs
automatically, as they are also matched by the "register_operand"
predicate.

OK with the above change.

Thanks,
Uros.
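
For readers following the ChangeLog, here is a minimal sketch in portable C of the flag identities that the new splitters appear to rely on. It is only a model, not code from the patch: the v128 type and the model_* and check_* names are hypothetical, and the flag definitions are the documented PTEST semantics (ZF set when a AND b is all zeros, CF set when (NOT a) AND b is all zeros).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 128-bit vector model: two 64-bit halves.  */
typedef struct { uint64_t lo, hi; } v128;

/* pand: a & b.  */
v128 model_pand (v128 a, v128 b)
{
  return (v128) { a.lo & b.lo, a.hi & b.hi };
}

/* pandn: (~a) & b, matching the operand order of _mm_andnot_si128.  */
v128 model_pandn (v128 a, v128 b)
{
  return (v128) { ~a.lo & b.lo, ~a.hi & b.hi };
}

bool model_is_zero (v128 a)
{
  return (a.lo | a.hi) == 0;
}

/* ptestz returns ZF: true when (a & b) is all zeros.  */
bool model_ptestz (v128 a, v128 b)
{
  return model_is_zero (model_pand (a, b));
}

/* ptestc returns CF: true when (~a & b) is all zeros.  */
bool model_ptestc (v128 a, v128 b)
{
  return model_is_zero (model_pandn (a, b));
}

/* Check two identities over a small grid of sample values:
   1. ptestz (pandn (a, b), pandn (a, b)) == ptestc (a, b)
   2. ptestc (a, a) is always true (CF is always set).  */
void check_ptest_identities (void)
{
  static const uint64_t samples[] = { 0, 1, 0xff00ff00ff00ff00ULL, UINT64_MAX };
  const size_t n = sizeof samples / sizeof samples[0];

  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n; j++)
      {
        v128 a = { samples[i], samples[j] };
        v128 b = { samples[j], samples[i] };
        v128 t = model_pandn (a, b);

        assert (model_ptestz (t, t) == model_ptestc (a, b));
        assert (model_ptestc (a, a));
      }
}
```

Calling check_ptest_identities () from a test harness exercises both identities. The first corresponds to replacing pandn/ptestz/sete with ptestc/setc, and the second to treating ptestc of equal operands as the constant 1 (or an stc instruction when only the carry flag is needed).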