This is the mail system at host fx306.security-mail.net.

I'm sorry to have to inform you that your message could not
be delivered to one or more recipients. It's attached below.

For further assistance, please send mail to postmaster.

If you do so, please include this problem report. You can
delete your own text from the attached returned message.

                   The mail system

<marc.poulh...@kalray.eu>: host zimbra2.kalray.eu[195.135.97.26] said: 550
    5.1.1 <marc.poulh...@kalray.eu>: Recipient address rejected: User unknown
    in virtual mailbox table (in reply to RCPT TO command)
Reporting-MTA: dns; fx306.security-mail.net
X-Postfix-Queue-ID: A4AD8399578
X-Postfix-Sender: rfc822; gcc-patches@gcc.gnu.org
Arrival-Date: Tue, 10 Aug 2021 11:12:06 +0200 (CEST)

Final-Recipient: rfc822; marc.poulhies@kalray.eu
Original-Recipient: rfc822;marc.poulhies@kalray.eu
Action: failed
Status: 5.1.1
Remote-MTA: dns; zimbra2.kalray.eu
Diagnostic-Code: smtp; 550 5.1.1 <marc.poulhies@kalray.eu>: Recipient address
    rejected: User unknown in virtual mailbox table
--- Begin Message ---
On Tue, Aug 10, 2021 at 4:44 PM Jakub Jelinek <ja...@redhat.com> wrote:
>
> Hi!
>
> On the following testcase we emit
>         vmovdqa32       .LC0(%rip), %zmm1
>         vpermd  %zmm0, %zmm1, %zmm0
> and
>         vmovdqa64       .LC1(%rip), %zmm1
>         vpermq  %zmm0, %zmm1, %zmm0
> instead of
>         vshufi32x4      $78, %zmm0, %zmm0, %zmm0
> and
>         vshufi64x2      $78, %zmm0, %zmm0, %zmm0
> we can emit with the patch.  We have patterns that match two argument
> permutations for vshuf[if]*, but for one argument it doesn't trigger.
> Either we can add two patterns for that, or we would need to add another
> routine to i386-expand.c that would transform under certain condition
> these cases to the two argument vshuf*, doing it in sse.md looked simpler.
Yes, we already have
<mask_codefor>avx512dq_shuf_<shuffletype>64x2_1<mask_name> and
avx512f_shuf_<shuffletype>32x4_1<mask_name>, so it's simpler to add
two more patterns with the same logic that select from a single vector
instead of two.
Patch LGTM.
> We don't need this for 32-byte vectors, we already emit single insn
> permutation that doesn't need memory op there.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2021-08-10  Jakub Jelinek  <ja...@redhat.com>
>
>         PR target/80355
>         * config/i386/sse.md (*avx512f_shuf_<shuffletype>64x2_1<mask_name>_1,
>         *avx512f_shuf_<shuffletype>32x4_1<mask_name>_1): New define_insn
>         patterns.
>
>         * gcc.target/i386/avx512f-pr80355-1.c: New test.
>
> --- gcc/config/i386/sse.md.jj   2021-08-05 10:26:15.592554985 +0200
> +++ gcc/config/i386/sse.md      2021-08-09 13:31:49.025479889 +0200
> @@ -15320,6 +15320,42 @@ (define_insn "avx512f_shuf_<shuffletype>
>     (set_attr "prefix" "evex")
>     (set_attr "mode" "<sseinsnmode>")])
>
> +(define_insn "*avx512f_shuf_<shuffletype>64x2_1<mask_name>_1"
> +  [(set (match_operand:V8FI 0 "register_operand" "=v")
> +       (vec_select:V8FI
> +         (match_operand:V8FI 1 "register_operand" "v")
> +         (parallel [(match_operand 2 "const_0_to_7_operand")
> +                    (match_operand 3 "const_0_to_7_operand")
> +                    (match_operand 4 "const_0_to_7_operand")
> +                    (match_operand 5 "const_0_to_7_operand")
> +                    (match_operand 6 "const_0_to_7_operand")
> +                    (match_operand 7 "const_0_to_7_operand")
> +                    (match_operand 8 "const_0_to_7_operand")
> +                    (match_operand 9 "const_0_to_7_operand")])))]
> +  "TARGET_AVX512F
> +   && (INTVAL (operands[2]) & 1) == 0
> +   && INTVAL (operands[2]) == INTVAL (operands[3]) - 1
> +   && (INTVAL (operands[4]) & 1) == 0
> +   && INTVAL (operands[4]) == INTVAL (operands[5]) - 1
> +   && (INTVAL (operands[6]) & 1) == 0
> +   && INTVAL (operands[6]) == INTVAL (operands[7]) - 1
> +   && (INTVAL (operands[8]) & 1) == 0
> +   && INTVAL (operands[8]) == INTVAL (operands[9]) - 1"
> +{
> +  int mask;
> +  mask = INTVAL (operands[2]) / 2;
> +  mask |= INTVAL (operands[4]) / 2 << 2;
> +  mask |= INTVAL (operands[6]) / 2 << 4;
> +  mask |= INTVAL (operands[8]) / 2 << 6;
> +  operands[2] = GEN_INT (mask);
> +
> +  return "vshuf<shuffletype>64x2\t{%2, %1, %1, %0<mask_operand10>|%0<mask_operand10>, %1, %1, %2}";
> +}
> +  [(set_attr "type" "sselog")
> +   (set_attr "length_immediate" "1")
> +   (set_attr "prefix" "evex")
> +   (set_attr "mode" "<sseinsnmode>")])
> +
>  (define_expand "avx512vl_shuf_<shuffletype>32x4_mask"
>    [(match_operand:VI4F_256 0 "register_operand")
>     (match_operand:VI4F_256 1 "register_operand")
> @@ -15463,6 +15499,58 @@ (define_insn "avx512f_shuf_<shuffletype>
>  }
>    [(set_attr "type" "sselog")
>     (set_attr "length_immediate" "1")
> +   (set_attr "prefix" "evex")
> +   (set_attr "mode" "<sseinsnmode>")])
> +
> +(define_insn "*avx512f_shuf_<shuffletype>32x4_1<mask_name>_1"
> +  [(set (match_operand:V16FI 0 "register_operand" "=v")
> +       (vec_select:V16FI
> +         (match_operand:V16FI 1 "register_operand" "v")
> +         (parallel [(match_operand 2 "const_0_to_15_operand")
> +                    (match_operand 3 "const_0_to_15_operand")
> +                    (match_operand 4 "const_0_to_15_operand")
> +                    (match_operand 5 "const_0_to_15_operand")
> +                    (match_operand 6 "const_0_to_15_operand")
> +                    (match_operand 7 "const_0_to_15_operand")
> +                    (match_operand 8 "const_0_to_15_operand")
> +                    (match_operand 9 "const_0_to_15_operand")
> +                    (match_operand 10 "const_0_to_15_operand")
> +                    (match_operand 11 "const_0_to_15_operand")
> +                    (match_operand 12 "const_0_to_15_operand")
> +                    (match_operand 13 "const_0_to_15_operand")
> +                    (match_operand 14 "const_0_to_15_operand")
> +                    (match_operand 15 "const_0_to_15_operand")
> +                    (match_operand 16 "const_0_to_15_operand")
> +                    (match_operand 17 "const_0_to_15_operand")])))]
> +  "TARGET_AVX512F
> +   && (INTVAL (operands[2]) & 3) == 0
> +   && INTVAL (operands[2]) == INTVAL (operands[3]) - 1
> +   && INTVAL (operands[2]) == INTVAL (operands[4]) - 2
> +   && INTVAL (operands[2]) == INTVAL (operands[5]) - 3
> +   && (INTVAL (operands[6]) & 3) == 0
> +   && INTVAL (operands[6]) == INTVAL (operands[7]) - 1
> +   && INTVAL (operands[6]) == INTVAL (operands[8]) - 2
> +   && INTVAL (operands[6]) == INTVAL (operands[9]) - 3
> +   && (INTVAL (operands[10]) & 3) == 0
> +   && INTVAL (operands[10]) == INTVAL (operands[11]) - 1
> +   && INTVAL (operands[10]) == INTVAL (operands[12]) - 2
> +   && INTVAL (operands[10]) == INTVAL (operands[13]) - 3
> +   && (INTVAL (operands[14]) & 3) == 0
> +   && INTVAL (operands[14]) == INTVAL (operands[15]) - 1
> +   && INTVAL (operands[14]) == INTVAL (operands[16]) - 2
> +   && INTVAL (operands[14]) == INTVAL (operands[17]) - 3"
> +{
> +  int mask;
> +  mask = INTVAL (operands[2]) / 4;
> +  mask |= INTVAL (operands[6]) / 4 << 2;
> +  mask |= INTVAL (operands[10]) / 4 << 4;
> +  mask |= INTVAL (operands[14]) / 4 << 6;
> +  operands[2] = GEN_INT (mask);
> +
> +  return "vshuf<shuffletype>32x4\t{%2, %1, %1, %0<mask_operand18>|%0<mask_operand18>, %1, %1, %2}";
> +}
> +  [(set_attr "type" "sselog")
> +   (set_attr "length_immediate" "1")
>     (set_attr "prefix" "evex")
>     (set_attr "mode" "<sseinsnmode>")])
>
> --- gcc/testsuite/gcc.target/i386/avx512f-pr80355-1.c.jj        2021-08-09 13:42:14.621904142 +0200
> +++ gcc/testsuite/gcc.target/i386/avx512f-pr80355-1.c   2021-08-09 13:04:10.070249292 +0200
> @@ -0,0 +1,19 @@
> +/* PR target/80355 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx512f -mno-avx512vl -mno-avx512dq" } */
> +/* { dg-final { scan-assembler "\tvshufi32x4\t" } } */
> +/* { dg-final { scan-assembler "\tvshufi64x2\t" } } */
> +
> +typedef long long V __attribute__((vector_size (64)));
> +typedef int W __attribute__((vector_size (64)));
> +
> +W
> +f0 (W x)
> +{
> +  return __builtin_shuffle (x, (W) { 8, 9, 10, 11, 12, 13, 14, 15, 0, 1, 2, 3, 4, 5, 6, 7 });
> +}
> +V
> +f1 (V x)
> +{
> +  return __builtin_shuffle (x, (V) { 4, 5, 6, 7, 0, 1, 2, 3 });
> +}
>
>         Jakub
>
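
For anyone following along, here is a rough standalone sketch of how the
new patterns get from the vec_select indices to the $78 immediate in the
testcase. It only mirrors the insn conditions and the mask computation
from the patch in plain C. The helper names shuf64x2_imm and shuf32x4_imm
are made up for illustration and are not part of GCC.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the patch.  Mirrors the condition and
   mask computation of the new *64x2 pattern: each pair of selector
   elements must name one whole, aligned 128-bit lane (an even index
   followed by index + 1).  Returns the 8-bit immediate, or -1 if the
   permutation does not have that shape.  */
static int
shuf64x2_imm (const int sel[8])
{
  int mask = 0;
  for (size_t lane = 0; lane < 4; lane++)
    {
      int lo = sel[2 * lane];
      if (lo < 0 || lo > 7 || (lo & 1) != 0 || sel[2 * lane + 1] != lo + 1)
        return -1;
      /* Two bits per destination lane select the source lane.  */
      mask |= (lo / 2) << (2 * lane);
    }
  return mask;
}

/* Same idea for the *32x4 pattern: four consecutive 32-bit elements per
   lane, starting at a multiple of 4.  Also hypothetical.  */
static int
shuf32x4_imm (const int sel[16])
{
  int mask = 0;
  for (size_t lane = 0; lane < 4; lane++)
    {
      int lo = sel[4 * lane];
      if (lo < 0 || lo > 15 || (lo & 3) != 0)
        return -1;
      for (int k = 1; k < 4; k++)
        if (sel[4 * lane + k] != lo + k)
          return -1;
      mask |= (lo / 4) << (2 * lane);
    }
  return mask;
}
```

With the selectors from f1 and f0 in the new test, { 4, 5, 6, 7, 0, 1, 2, 3 }
and { 8, ..., 15, 0, ..., 7 }, both helpers give 78, which matches the
immediate in the expected vshufi64x2 and vshufi32x4 output.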


-- 
BR,
Hongtao



--- End Message ---
