https://bugs.llvm.org/show_bug.cgi?id=39432
Bug ID: 39432
Summary: [SLPVectorizer] Investigate using poor throughput
instructions as seed points
Product: libraries
Version: trunk
Hardware: PC
OS: Windows NT
Status: NEW
Severity: enhancement
Priority: P
Component: Scalar Optimizations
Assignee: unassignedb...@nondot.org
Reporter: llvm-...@redking.me.uk
CC: a.bat...@hotmail.com, andrea.dibia...@gmail.com,
dtemirbula...@gmail.com, llvm-bugs@lists.llvm.org
Even for otherwise very different code paths, it's often very beneficial to
vectorize poor-throughput instructions (FDIV + FSQRT in particular), as they are
usually the bottleneck:
Codegen: https://godbolt.org/z/bWkftx
LLVM MCA Analysis: https://godbolt.org/z/eYmVHk
void prim(double x, double y, double z, double w, double *p0, double *p1) {
x -= z;
y += w;
x /= z;
y /= w;
x -= z;
y += w;
*p0++ = x;
*p1++ = y;
}
_Z4primddddPdS_: # @_Z4primddddPdS_
vsubsd %xmm2, %xmm0, %xmm0
vaddsd %xmm1, %xmm3, %xmm1
vdivsd %xmm2, %xmm0, %xmm0
vdivsd %xmm3, %xmm1, %xmm1
vsubsd %xmm2, %xmm0, %xmm0
vaddsd %xmm3, %xmm1, %xmm1
vmovsd %xmm0, (%rdi)
vmovsd %xmm1, (%rsi)
retq
block throughput: 38cy
#include <immintrin.h> // for __m128d, _mm_div_pd, _mm_setr_pd
void prim2(double x, double y, double z, double w, double *p0, double *p1) {
x -= z;
y += w;
__m128d xy = _mm_div_pd(_mm_setr_pd(x, y), _mm_setr_pd(z, w));
x = xy[0];
y = xy[1];
x -= z;
y += w;
*p0++ = x;
*p1++ = y;
}
_Z5prim2ddddPdS_: # @_Z5prim2ddddPdS_
vsubsd %xmm2, %xmm0, %xmm0
vaddsd %xmm1, %xmm3, %xmm1
vunpcklpd %xmm1, %xmm0, %xmm0 # xmm0 = xmm0[0],xmm1[0]
vunpcklpd %xmm3, %xmm2, %xmm1 # xmm1 = xmm2[0],xmm3[0]
vdivpd %xmm1, %xmm0, %xmm0
vpermilpd $1, %xmm0, %xmm1 # xmm1 = xmm0[1,0]
vsubsd %xmm2, %xmm0, %xmm0
vaddsd %xmm3, %xmm1, %xmm1
vmovsd %xmm0, (%rdi)
vmovsd %xmm1, (%rsi)
retq
block throughput: 19cy
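The same idea should apply to FSQRT, which the description mentions but the examples above do not cover. The following is only a sketch of what an equivalent test case might look like; prim_sqrt and prim_sqrt2 are hypothetical names that are not part of the linked godbolt examples, and no llvm-mca numbers have been collected for them, so any benefit would still need to be measured on the target CPU.

```cpp
#include <cmath>
#include <immintrin.h>

// Hypothetical scalar test case: two independent square roots sitting on
// otherwise different code paths (sub vs. add), mirroring prim() above.
void prim_sqrt(double x, double y, double z, double w, double *p0, double *p1) {
  x -= z;
  y += w;
  x = std::sqrt(x);
  y = std::sqrt(y);
  x -= z;
  y += w;
  *p0 = x;
  *p1 = y;
}

// Hypothetical hand-vectorized form, mirroring prim2() above: pack the two
// operands, issue a single packed sqrt, then unpack and continue in scalar.
void prim_sqrt2(double x, double y, double z, double w, double *p0, double *p1) {
  x -= z;
  y += w;
  __m128d xy = _mm_sqrt_pd(_mm_setr_pd(x, y));
  x = _mm_cvtsd_f64(xy);                      // low lane
  y = _mm_cvtsd_f64(_mm_unpackhi_pd(xy, xy)); // high lane
  x -= z;
  y += w;
  *p0 = x;
  *p1 = y;
}
```

Both functions should produce identical results; the only intended difference is that the second uses one packed sqrt in place of two scalar ones, at the cost of the extra pack/unpack shuffles.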