On Wed, 9 Sep 2020, Andrea Corallo wrote:
> Hi all,
>
> this patch is meant not to generate predication in loop when the
> loop is operating only on full vectors.
>
> Ex:
>
> #+BEGIN_SRC C
> /* Vector length is 256. */
> void
> f (int *restrict x, int *restrict y, unsigned int n) {
> for (unsigned int i = 0; i < n * 8; ++i)
> x[i] += y[i];
> }
> #+END_SRC
>
> Compiling on aarch64 with -O3 -msve-vector-bits=256 current trunk
> gives:
>
> #+BEGIN_SRC asm
> f:
> .LFB0:
> .cfi_startproc
> lsl w2, w2, 3
> cbz w2, .L1
> mov x3, 0
> whilelo p0.s, xzr, x2
> .p2align 3,,7
> .L3:
> ld1w z0.s, p0/z, [x0, x3, lsl 2]
> ld1w z1.s, p0/z, [x1, x3, lsl 2]
> add z0.s, z0.s, z1.s
> st1w z0.s, p0, [x0, x3, lsl 2]
> add x3, x3, 8
> whilelo p0.s, x3, x2
> b.any .L3
> .L1:
> ret
> .cfi_endproc
> #+END_SRC
>
> With the patch applied:
>
> #+BEGIN_SRC asm
> f:
> .LFB0:
> .cfi_startproc
> lsl w3, w2, 3
> cbz w3, .L1
> mov x2, 0
> ptrue p0.b, vl32
> .p2align 3,,7
> .L3:
> ld1w z0.s, p0/z, [x0, x2, lsl 2]
> ld1w z1.s, p0/z, [x1, x2, lsl 2]
> add z0.s, z0.s, z1.s
> st1w z0.s, p0, [x0, x2, lsl 2]
> add x2, x2, 8
> cmp x2, x3
> bne .L3
> .L1:
> ret
> .cfi_endproc
> #+END_SRC
>
> To achieve this we check earlier if the loop needs peeling and if is
> not the case we do not set LOOP_VINFO_USING_PARTIAL_VECTORS_P to true.
>
> I moved some logic from 'determine_peel_for_niter' to
> 'vect_need_peeling_or_part_vects_p' so it can be used for this purpose.
>
> Bootstrapped and regtested on aarch64-linux-gnu.
Looks OK to me, the comment
@@ -2267,7 +2278,10 @@ start_over:
{
if (param_vect_partial_vector_usage == 0)
LOOP_VINFO_USING_PARTIAL_VECTORS_P (loop_vinfo) = false;
- else if (vect_verify_full_masking (loop_vinfo)
+ else if ((vect_verify_full_masking (loop_vinfo)
+ && vect_need_peeling_or_part_vects_p (loop_vinfo))
+ /* Don't use partial vectors if we don't need to peel the
+ loop. */
|| vect_verify_loop_lens (loop_vinfo))
seems to be oddly misplaced (I'd put it before the call).
Richard.
> Feedback is welcome, thanks.
>
> Andrea
>
>
--
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)