On Tue, Sep 29, 2020 at 7:22 PM 夏 晋 <ilyply2...@hotmail.com> wrote:
> vint16m1_t foo3(vint16m1_t a, vint16m1_t b){
>   vint16m1_t add = a+b;
>   vint16m1_t mul = a*b;
>   vsetvl_e8m1(32);
>   return add + mul;
> }

Taking another look at your example, you have type confusion.  Using
vsetvl to specify an element width of 8 does not magically convert
types into 8-bit vector types.  They are still 16-bit vector types and
will still result in 16-bit vector operations.  So your explicit
vsetvl_e8m1 is completely useless.

In the RISC-V V scheme, every vector operation emits an implicit
vsetvl instruction, and then we optimize away the redundant ones.  So
the add and mul at the start each emit an implicit vsetvl instruction.
Then you have an explicit vsetvl.  Then another add, which emits
another implicit vsetvl.  The compiler reordered the arithmetic in
such a way that two of the implicit vsetvl instructions can be
optimized away.  That probably happened by accident.  But we don't
have support for optimizing away the useless explicit vsetvl, so it
remains.
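
To make the counting concrete, here is a toy model in plain C.  It is
only a sketch and not the compiler's actual pass: the names (struct
insn, eliminate_redundant, and the two surviving_vsetvls_* helpers) are
made up for illustration, it tracks only the element width and ignores
vl and LMUL, and the "reordered" sequence is my assumption about one
ordering that fits the description above, not the real generated code.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: hypothetical types, not the compiler's data structures. */
enum kind { VSETVL_IMPLICIT, VSETVL_EXPLICIT, VECTOR_OP };

struct insn {
    enum kind kind;
    int sew;   /* element width in bits; ignored for VECTOR_OP */
};

/* Drop every implicit vsetvl that repeats the element width already in
   effect.  Explicit vsetvls are always kept, even when useless.
   Compacts seq in place, updates *n, and returns how many vsetvl
   entries (implicit plus explicit) survive. */
static size_t eliminate_redundant(struct insn *seq, size_t *n)
{
    assert(seq != NULL && n != NULL);
    int current_sew = 0;            /* 0 means unknown on entry */
    size_t out = 0, kept = 0;
    for (size_t i = 0; i < *n; i++) {
        struct insn in = seq[i];
        if (in.kind == VSETVL_IMPLICIT && in.sew == current_sew)
            continue;               /* redundant, optimize away */
        if (in.kind != VECTOR_OP) {
            current_sew = in.sew;   /* this vsetvl changes the state */
            kept++;
        }
        seq[out++] = in;
    }
    *n = out;
    return kept;
}

/* Source order: add, mul, explicit e8, add. */
size_t surviving_vsetvls_source_order(void)
{
    struct insn seq[] = {
        { VSETVL_IMPLICIT, 16 }, { VECTOR_OP, 0 },   /* a + b     */
        { VSETVL_IMPLICIT, 16 }, { VECTOR_OP, 0 },   /* a * b     */
        { VSETVL_EXPLICIT,  8 },                     /* user call */
        { VSETVL_IMPLICIT, 16 }, { VECTOR_OP, 0 },   /* add + mul */
    };
    size_t n = sizeof seq / sizeof seq[0];
    return eliminate_redundant(seq, &n);
}

/* Assumed reordering: the explicit e8 ends up ahead of all arithmetic. */
size_t surviving_vsetvls_reordered(void)
{
    struct insn seq[] = {
        { VSETVL_EXPLICIT,  8 },                     /* user call */
        { VSETVL_IMPLICIT, 16 }, { VECTOR_OP, 0 },   /* a + b     */
        { VSETVL_IMPLICIT, 16 }, { VECTOR_OP, 0 },   /* a * b     */
        { VSETVL_IMPLICIT, 16 }, { VECTOR_OP, 0 },   /* add + mul */
    };
    size_t n = sizeof seq / sizeof seq[0];
    return eliminate_redundant(seq, &n);
}
```

In this model the source order keeps three vsetvls, because the
explicit e8 sits between the arithmetic and forces the last add to
switch back to e16.  With the three operations adjacent, two of the
three implicit vsetvls go away and only one implicit plus the useless
explicit one remain.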

Jim
