Hi Robin:

Your suggested code seems to work fine. Let me run more tests and send a v2; I
guess I just don’t know how to explain why it works in a comment :p
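
One possible way to phrase it for the comment: if vlenb is a power of two,
the smallest power-of-two d that makes factor * d a multiple of vlenb is
2^max(0, ctz(vlenb) - ctz(factor)), which is what the ctz difference computes.
Below is a standalone sketch of that argument, not the actual GCC code. The
helper names (ctz64, div_factor_ctz, div_factor_loop) and the loop form are
made up for illustration, ctz64 is a stand-in for wi::ctz, and the sketch
assumes vlenb is a nonzero power of two and factor is nonzero.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for wi::ctz: count trailing zero bits of a nonzero value.
static int ctz64(uint64_t x) {
  int n = 0;
  while ((x & 1) == 0) {
    x >>= 1;
    ++n;
  }
  return n;
}

// Closed form from the suggestion: a non-negative difference means factor
// already carries at least as many factors of two as vlenb, so no extra
// scaling is needed; otherwise scale by the missing power of two.
static int div_factor_ctz(uint64_t factor, uint64_t vlenb) {
  int shift = ctz64(factor) - ctz64(vlenb);
  return shift >= 0 ? 1 : 1 << -shift;
}

// Hypothetical loop-based reference (not the hunk from the patch): find the
// smallest power of two d such that factor * d is a multiple of vlenb.
// Terminates because vlenb is assumed to be a power of two.
static int div_factor_loop(uint64_t factor, uint64_t vlenb) {
  int d = 1;
  while ((factor * d) % vlenb != 0)
    d <<= 1;
  return d;
}
```

With illustrative values, div_factor_ctz (8, 512) gives 64, i.e. a shift of
log2(64) = 6, and any factor that is already a multiple of 512 gives 1.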

Robin Dapp <rdapp....@gmail.com> wrote on Thu, Oct 5, 2023 at 03:57:

> >> I think the "max poly value" is the LMUL 1 mode coeffs[1]
> >>
> >> See int vlenb = BYTES_PER_RISCV_VECTOR.coeffs[1];
> >>
> >> So I think bumping max_power to exact_log2 (64) is not enough,
> >> since we adjust the LMUL 1 mode size according to TARGET_MIN_VLEN.
> >>
> >> I suspect the testcase you append in this patch will fail with
> >> -march=rv64gcv_zvl4096b.
> >
> >
> > There is no type smaller than [64, 64] in zvl4096b: RVVMF64BI is [64,
> > 64], the smallest type, and RVVFM1BI is [512, 512] (the size of a single
> > vector reg.), which is at most 64x for zvl4096b, so my understanding is
> > that log2(64) is enough :)
> >
> > And of course, I verified that the testcase works with -march=rv64gcv_zvl4096b
>
> I was wondering if the whole hunk couldn't be condensed into something
> like (untested):
>
>       div_factor = wi::ctz (factor) - wi::ctz (vlenb);
>       if (div_factor >= 0)
>         div_factor = 1;
>       else
>         div_factor = 1 << -div_factor;
>
> This would avoid the loop as well.  An assert for the div_factor (not
> exceeding a value) could still be added.
>
> Regards
>  Robin
>
