>> I think the "max poly value" is the LMUL 1 mode coeffs[1]
>>
>> See int vlenb = BYTES_PER_RISCV_VECTOR.coeffs[1];
>>
>> So I think bumping max_power to exact_log2 (64) is not enough,
>> since we adjust the LMUL 1 mode size according to TARGET_MIN_VLEN.
>>
>> I suspect the testcase you append in this patch will fail with 
>> -march=rv64gcv_zvl4096b.
> 
> 
> There is no type smaller than [64, 64] in zvl4096b: RVVMF64BI is [64,
> 64], which is the smallest type, and RVVFM1BI is [512, 512] (the size
> of a single vector reg.), which is at most 64x larger for zvl4096b, so
> my understanding is that log2(64) is enough :)
> 
> And of course, I verified that the testcase works with -march=rv64gcv_zvl4096b

I was wondering if the whole hunk couldn't be condensed into something
like (untested):

      div_factor = wi::ctz (factor) - wi::ctz (vlenb);
      if (div_factor >= 0)
        div_factor = 1;
      else
        div_factor = 1 << -div_factor;

This would avoid the loop as well.  An assert for the div_factor (not
exceeding a value) could still be added.
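
To illustrate the arithmetic only, here is a standalone sketch on plain
unsigned integers.  It is not the wide-int code from the patch: the
helper name div_factor_for, the use of the GCC/Clang builtin
__builtin_ctz in place of wi::ctz, and the bound of 64 (taken from the
log2(64) discussion above) are all assumptions.

```cpp
#include <cassert>

// Assumed upper bound, taken from the log2(64) discussion in this thread.
static const int kMaxDivFactor = 64;

// Hypothetical helper: loop-free div_factor from trailing-zero counts.
// Both inputs must be nonzero, since __builtin_ctz (0) is undefined.
static int
div_factor_for (unsigned factor, unsigned vlenb)
{
  assert (factor != 0 && vlenb != 0);

  // Difference of the power-of-two exponents contained in each value.
  int shift = __builtin_ctz (factor) - __builtin_ctz (vlenb);

  // factor is at least as "aligned" as vlenb: no division needed.
  // Otherwise scale by 2^(ctz (vlenb) - ctz (factor)).
  int div_factor = shift >= 0 ? 1 : 1 << -shift;

  // The optional sanity check mentioned above.
  assert (div_factor <= kMaxDivFactor);
  return div_factor;
}
```

With these assumptions, factor = 8 and vlenb = 512 (zvl4096b) give a
div_factor of 64, and any factor with at least as many trailing zero
bits as vlenb gives 1.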

Regards
 Robin
