[Bug tree-optimization/88398] vectorization failure for a small loop to do byte comparison

wilco at gcc dot gnu.org Mon, 07 Jan 2019 04:57:28 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398


--- Comment #13 from Wilco <wilco at gcc dot gnu.org> ---
So to add some real numbers to the discussion, the average number of iterations
is 4.31. Frequency stats, as a percentage of loop executions per iteration
count (the 16 bucket also includes all counts > 16):

1: 29.0
2: 4.2
3: 1.0
4: 36.7
5: 8.7
6: 3.4
7: 3.0
8: 2.6
9: 2.1
10: 1.9
11: 1.6
12: 1.2
13: 0.9
14: 0.8
15: 0.7
16: 2.1
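
As a cross-check of the table, here is a minimal sketch in C that sums the
buckets. The names freq_pct, share_within and mean_lower_bound are
hypothetical and used only for illustration. Since the last bucket lumps
together every count above 16, the mean computed this way is only a lower
bound and comes out slightly under the measured 4.31.

```c
#include <stddef.h>

/* Percentages copied from the table above; index i holds the share of
   loop executions that ran i + 1 iterations (last entry: 16 or more). */
static const double freq_pct[16] = {
    29.0, 4.2, 1.0, 36.7, 8.7, 3.4, 3.0, 2.6,
    2.1, 1.9, 1.6, 1.2, 0.9, 0.8, 0.7, 2.1
};

/* Share (in percent) of executions that finish within n iterations. */
static double share_within(size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n && i < 16; i++)
        sum += freq_pct[i];
    return sum;
}

/* Lower bound on the mean iteration count: every execution in the
   last bucket is counted as exactly 16 iterations. */
static double mean_lower_bound(void)
{
    double weighted = 0.0, total = 0.0;
    for (size_t i = 0; i < 16; i++) {
        weighted += (double)(i + 1) * freq_pct[i];
        total += freq_pct[i];
    }
    return weighted / total;
}
```

The first four buckets add up to roughly 71% of all executions.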

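So unrolling 4x is a very good fit for this loop: 4 iterations is the most
common count (36.7%), and about 71% of executions finish within 4 iterations.
Note the official xz version has had an optimized version of this loop since
2014(!), using unaligned accesses: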
https://git.tukaani.org/?p=xz.git;a=blob;f=src/liblzma/common/memcmplen.h
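
To illustrate the general idea of comparing several bytes per step with
unaligned loads, here is a minimal sketch in C. It is not the xz code:
match_len and its parameters are hypothetical names, and it assumes both
buffers are readable for at least limit bytes. The memcpy calls are the
portable way to express an unaligned 8-byte load; compilers typically turn
them into a single load on targets that allow it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not taken from xz): returns how many leading
   bytes of a and b are equal, starting at offset len and never going
   past limit. Assumes a and b are each readable for limit bytes. */
static size_t match_len(const uint8_t *a, const uint8_t *b,
                        size_t len, size_t limit)
{
    assert(len <= limit);

    /* Compare 8 bytes at a time while a full word still fits. */
    while (limit - len >= 8) {
        uint64_t x, y;
        memcpy(&x, a + len, sizeof x);  /* unaligned load */
        memcpy(&y, b + len, sizeof y);
        if (x != y)
            break;  /* mismatch is inside this word */
        len += 8;
    }

    /* Locate the exact mismatch, or finish the tail, byte by byte. */
    while (len < limit && a[len] == b[len])
        len++;

    return len;
}
```

Finishing with a byte loop keeps the sketch independent of endianness. An
implementation tuned for speed could instead locate the first differing byte
from the XOR of the two words with a count-zeros instruction.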
