Hi, Richard.

Status of the RVV infrastructure in the RISC-V backend:
1. All RVV instruction patterns related to intrinsics are finished (they
will be used not only by intrinsics but also by autovec in the future).
2. For autovec, we have finished:
   len_load/len_store patterns (they are used temporarily and will be
   removed after I support len_mask_load/len_mask_store in the middle-end).
   binary integer autovec patterns.
   vec_init pattern.
   That's all we have so far.

As for testing this patch, I have multiple-rgroup testcases locally.
Do you mean you want me to post them together with this patch?
Since I am going to put them in the RISC-V backend testsuite, I was planning
to post them after this patch is finished and merged into trunk.
What do you suggest?

Thanks.


juzhe.zh...@rivai.ai
 
From: Richard Sandiford
Date: 2023-05-16 16:16
To: juzhe.zhong@rivai.ai
CC: gcc-patches; rguenther
Subject: Re: [PATCH V7] VECT: Add decrement IV support in Loop Vectorizer
"juzhe.zh...@rivai.ai" <juzhe.zh...@rivai.ai> writes:
> Oh,
> I am sorry for the typos in the last email. Here is the corrected version:
>
> Hi, Richard.
> For case 2, I came up with this idea:
> +      Case 2 (SLP multiple rgroup):
> + ...
> + _38 = (unsigned long) n_12(D);
> + _39 = _38 * 2;
> + _40 = MAX_EXPR <_39, 16>;   ----------------->remove
> + _41 = _40 - 16; ----------------->remove
>
> + ...
> + # ivtmp_42 = PHI <ivtmp_43(4), _41(3)>  ----------------->remove
>
> + # ivtmp_45 = PHI <ivtmp_46(4), _39(3)>
> + ...
> + _44 = MIN_EXPR <ivtmp_42, 32>;  ----------------->remove
>
> + _47 = MIN_EXPR <ivtmp_45, 32>;
> + _47_2 = MIN_EXPR <_47, 16>;  ----------------->add
> + _47_3 = _47 - _47_2;  ----------------->add
> + ...
> + .LEN_STORE (_6, 8B, _47_2, ...);
> + ...
> + .LEN_STORE (_25, 8B, _47_3, ...);
> + _33 = _47_2 / 2;
> + ...
> + .LEN_STORE (_8, 16B, _33, ...);
> + _36 = _47_3 / 2;
> + ...
> + .LEN_STORE (_15, 16B, _36, ...);
> + ivtmp_46 = ivtmp_45 - _47;
> + ivtmp_43 = ivtmp_42 - _44;  ----------------->remove
>
> + ...
> + if (ivtmp_46 != 0)
> +   goto <bb 4>; [83.33%]
> + else
> +   goto <bb 5>; [16.67%]
> Is it reasonable? Or do you have a better idea for it?
 
Yeah, this makes sense, and I think it makes case 2 very similar
(equivalent?) to case 3.  If so, it would be nice if they could be
combined.
 
Of course, this loses the nice property that the original had: that each
IV was independent, and so the dependency chains were shorter.  With the
above approach, the second length parameter instead depends on a
three-instruction chain.  But that might be OK (up to you).
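
To make the length arithmetic in the quoted GIMPLE concrete, here is a
minimal Python sketch of the single-IV scheme. It is only a model of the
dump above, not GCC code: the function name single_iv_lengths and the tuple
layout are hypothetical, the constants 32 and 16 are taken from the dump,
and it assumes n > 0 (i.e. that the loop is only entered when there is work
to do).

```python
def single_iv_lengths(n):
    """Model the proposed single-IV length computation (hypothetical helper).

    Returns one tuple per loop iteration:
    (_47_2, _47_3, _33, _36) as named in the quoted GIMPLE.
    """
    assert n > 0, "assumption: the loop is guarded and only runs for n > 0"
    remaining = n * 2                    # _39 = _38 * 2
    steps = []
    while True:
        total = min(remaining, 32)       # _47   = MIN_EXPR <ivtmp_45, 32>
        first = min(total, 16)           # _47_2 = MIN_EXPR <_47, 16>
        # The second length sits at the end of a three-operation chain:
        # MIN, MIN, then this subtraction.
        second = total - first           # _47_3 = _47 - _47_2
        steps.append((first, second, first // 2, second // 2))  # _33, _36
        remaining -= total               # ivtmp_46 = ivtmp_45 - _47
        if remaining == 0:               # if (ivtmp_46 != 0) goto <bb 4>
            break
    return steps


if __name__ == "__main__":
    for row in single_iv_lengths(20):
        print(row)
```

In this model every iteration's two 8B lengths sum to _47, so the lengths
over the whole loop add up to n * 2 while only one IV is carried around
the loop.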
 
How much of the riscv backend infrastructure is in place now?  The reason
I ask is that it would be good if the patch had some tests.  AIUI, the
patch is an optimisation on top of what the current len_load/store code does,
rather than something that is needed for correctness.  So it seems like
the necessary patterns could be added and tested using the current approach,
then this patch could be applied on top, with its own tests for the new
approach.
 
Thanks,
Richard
 
