On Fri, Jun 16, 2023 at 3:27 PM Roger Sayle <ro...@nextmovesoftware.com> wrote:
>
>
> Hi Uros,
> Here's an updated version of this patch incorporating your comments.
> It uses emit_move_insn (target, const1_rtx), bt_comparison operator to
> combine the sete/setne to setc/setnc, and je/jne to jc/jnc patterns,
> uses scan-assembler-times in the test cases, and cleans up the silly
> cut'n'paste issue that mangled strict_low_part/subreg of a register
> that was already QImode.  I tried, but the strict_low_part variant
> really is required (some of the new test cases fail without it), but
> things are much neater now, and have fewer patterns than the original.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures.  Ok for mainline?
>
>
> 2023-06-16  Roger Sayle  <ro...@nextmovesoftware.com>
>             Uros Bizjak  <ubiz...@gmail.com>
>
> gcc/ChangeLog
>         * config/i386/i386-expand.cc (ix86_expand_sse_ptest): Recognize
>         expansion of ptestc with equal operands as producing const1_rtx.
>         * config/i386/i386.cc (ix86_rtx_costs): Provide accurate cost
>         estimates of UNSPEC_PTEST, where the ptest performs the PAND
>         or PANDN of its operands.
>         * config/i386/sse.md (define_split): Transform CCCmode UNSPEC_PTEST
>         of reg_equal_p operands into an x86_stc instruction.
>         (define_split): Split pandn/ptestz/set{n?}e into ptestc/set{n?}c.
>         (define_split): Similar to above for strict_low_part destinations.
>         (define_split): Split pandn/ptestz/j{n?}e into ptestc/j{n?}c.
>
> gcc/testsuite/ChangeLog
>         * gcc.target/i386/avx-vptest-4.c: New test case.
>         * gcc.target/i386/avx-vptest-5.c: Likewise.
>         * gcc.target/i386/avx-vptest-6.c: Likewise.
>         * gcc.target/i386/pr109973-1.c: Update test case.
>         * gcc.target/i386/pr109973-2.c: Likewise.
>         * gcc.target/i386/sse4_1-ptest-4.c: New test case.
>         * gcc.target/i386/sse4_1-ptest-5.c: Likewise.
>         * gcc.target/i386/sse4_1-ptest-6.c: Likewise.

+(define_split
+  [(set (strict_low_part (subreg:QI (match_operand:SI 0 "register_operand") 0))

I think you should use

(set (strict_low_part (match_operand:QI 0 "register_operand")) ... here and ...

+   (set (strict_low_part (subreg:QI (match_dup 0) 0))

corresponding

(set (strict_low_part (match_dup 0))...

without an explicit SUBREG here. This will handle all subregs
automatically, as they are also matched by the "register_operand"
predicate.
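
For illustration only, a sketch of the two destination forms side by side.
The source operand and the rest of the split are not shown in this thread,
so they are left elided as "..." rather than reconstructed; the fragment is
not a complete pattern.

```
;; Form in the posted patch: the SUBREG is spelled out, so the pattern
;; matches only the QImode low part of an SImode register.
(set (strict_low_part (subreg:QI (match_operand:SI 0 "register_operand") 0))
     ...)

;; Suggested form: a QImode operand.  "register_operand" accepts a plain
;; QImode register as well as a QImode SUBREG of a wider register, so the
;; case above is still covered.
(set (strict_low_part (match_operand:QI 0 "register_operand"))
     ...)
```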

OK with the above change.

Thanks,
Uros.
