On Thu, 30 Apr 2015, Jeff Law wrote:

> On 04/30/2015 01:17 AM, Marc Glisse wrote:
>> +/* This is another case of narrowing, specifically when there's an outer
>> +   BIT_AND_EXPR which masks off bits outside the type of the innermost
>> +   operands. Like the previous case we have to convert the operands
>> +   to unsigned types to avoid introducing undefined behaviour for the
>> +   arithmetic operation. */
>> +(for op (minus plus)
>>
>> No mult? or widen_mult with a different pattern? (maybe that's already
>> done elsewhere)
>
> No mult. When I worked on the pattern for 47477, supporting mult clearly
> regressed the generated code -- presumably because we can often widen the
> operands for free.
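
(For context, the kind of source expression this +- narrowing targets looks
roughly like the function below; this is my own illustration, not something
taken from the patch.)

/* Illustration only, not from the patch: a and b are promoted to int by the
   usual arithmetic conversions, and the outer mask keeps exactly the bits
   that fit in the 16-bit type, so the subtraction can be narrowed to
   unsigned short arithmetic without changing the result.  */
unsigned short
narrow_sub (unsigned short a, unsigned short b)
{
  /* Before: ((int) a - (int) b) & 0xffff, computed in int.
     After:  a - b, done directly in unsigned short.  */
  return ((int) a - (int) b) & 0xffff;
}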

Supporting mult would help with the testcase below, but I am willing to
accept that for mult the cases where narrowing hurts are more common (and
guessing in advance whether it will help or hurt may be hard), while for +-
the cases where it helps are more common.

void f(short*a) {
  a = __builtin_assume_aligned(a,128);
  for (int i = 0; i < (1<<22); ++i) {
#ifdef EASY
    a[i] *= a[i];
#else
    int x = a[i];
    x *= x;
    a[i] = x;
#endif
  }
}
With EASY, a nice little loop:
.L2:
movdqa (%rdi), %xmm0
addq $16, %rdi
pmullw %xmm0, %xmm0
movaps %xmm0, -16(%rdi)
cmpq %rdi, %rax
jne .L2

while without EASY, we get this much uglier loop:

.L2:
movdqa (%rdi), %xmm0
addq $16, %rdi
movdqa %xmm0, %xmm2
movdqa %xmm0, %xmm1
pmullw %xmm0, %xmm2
pmulhw %xmm0, %xmm1
movdqa %xmm2, %xmm0
punpckhwd %xmm1, %xmm2
punpcklwd %xmm1, %xmm0
movdqa %xmm2, %xmm1
movdqa %xmm0, %xmm2
punpcklwd %xmm1, %xmm0
punpckhwd %xmm1, %xmm2
movdqa %xmm0, %xmm1
punpcklwd %xmm2, %xmm0
punpckhwd %xmm2, %xmm1
punpcklwd %xmm1, %xmm0
movaps %xmm0, -16(%rdi)
cmpq %rdi, %rax
jne .L2

A small pattern like

(simplify
 (vec_pack_trunc (widen_mult_lo @0 @1) (widen_mult_hi:c @0 @1))
 (mult @0 @1))

probably with some tweaks (convert to unsigned? only do it before vector
lowering?) would fix this particular case, but not as well as narrowing
before vectorization.
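
For what it's worth, the reason such a fold is safe can be checked with a
small scalar model (my own sketch with made-up function names, not GCC code):
a widening multiply followed by truncation yields the same low 16 bits as a
multiply done directly in the narrow element type, and doing that narrow
multiply in unsigned only avoids signed overflow; it does not change the bits.

#include <stdint.h>
#include <stdio.h>

/* Scalar model of widen_mult followed by vec_pack_trunc on one lane:
   compute the full 32-bit product, then keep only the low 16 bits.  */
static uint16_t
widen_then_truncate (int16_t a, int16_t b)
{
  int32_t wide = (int32_t) a * (int32_t) b;
  return (uint16_t) wide;
}

/* Scalar model of the narrowed multiply; the unsigned intermediate keeps the
   C-level arithmetic free of signed overflow, which is what the "convert to
   unsigned?" tweak would take care of at the GIMPLE level.  */
static uint16_t
narrow_mult (int16_t a, int16_t b)
{
  return (uint16_t) ((unsigned) a * (unsigned) b);
}

int
main (void)
{
  for (int a = -32768; a < 32768; a += 97)
    for (int b = -32768; b < 32768; b += 101)
      if (widen_then_truncate ((int16_t) a, (int16_t) b)
          != narrow_mult ((int16_t) a, (int16_t) b))
        {
          printf ("mismatch at %d * %d\n", a, b);
          return 1;
        }
  puts ("widen_mult + pack_trunc matches the narrow multiply");
  return 0;
}
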
--
Marc Glisse