>> I think the "max poly value" is the LMUL 1 mode coeffs[1]
>>
>> See int vlenb = BYTES_PER_RISCV_VECTOR.coeffs[1];
>>
>> So I think bumping max_power to exact_log2 (64); is not enough,
>> since we adjust the LMUL 1 mode size according to TARGET_MIN_VLEN.
>>
>> I suspect the testcase you append in this patch will fail with
>> -march=rv64gcv_zvl4096b.
>
>
> There is no type smaller than [64, 64] in zvl4096b, RVVMF64BI is [64,
> 64], its smallest type, and RVVFM1BI is [512, 512] (size of single
> vector reg.) which is at most 64x for zvl4096b, so my understanding is
> log2(64) is enough :)
>
> and of course, verified the testcase works with -march=rv64gcv_zvl4096b

I was wondering if the whole hunk couldn't be condensed into something
like (untested):

  div_factor = wi::ctz (factor) - wi::ctz (vlenb);
  if (div_factor >= 0)
    div_factor = 1;
  else
    div_factor = 1 << -div_factor;

This would avoid the loop as well.  An assert for the div_factor (not
exceeding a value) could still be added.

Regards
 Robin
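
For illustration only, a standalone sketch of the ctz idea outside of GCC.
It assumes vlenb is a power of two and factor is nonzero, uses plain 64-bit
integers instead of wide_int, and all function names are hypothetical.  The
brute-force helper is just a reference for "the smallest power of two d such
that factor * d is a multiple of vlenb"; it is not the loop from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Count trailing zero bits of a nonzero value (stand-in for wi::ctz).
static int ctz_u64(std::uint64_t x) {
    assert(x != 0);
    int n = 0;
    while ((x & 1) == 0) {
        x >>= 1;
        ++n;
    }
    return n;
}

// Closed form: 1 if factor has at least as many trailing zeros as vlenb,
// otherwise 2^(ctz(vlenb) - ctz(factor)).
static std::uint64_t div_factor_closed(std::uint64_t factor,
                                       std::uint64_t vlenb) {
    int diff = ctz_u64(factor) - ctz_u64(vlenb);
    return diff >= 0 ? 1 : (std::uint64_t{1} << -diff);
}

// Reference: smallest power of two d with (factor * d) % vlenb == 0.
// Terminates for nonzero factor and power-of-two vlenb, since d = vlenb
// always satisfies the condition.
static std::uint64_t div_factor_bruteforce(std::uint64_t factor,
                                           std::uint64_t vlenb) {
    std::uint64_t d = 1;
    while ((factor * d) % vlenb != 0)
        d <<= 1;
    return d;
}

// Compare both versions for vlenb = 16 .. 512 bytes and factor = 1 .. 1024.
static bool closed_form_matches_bruteforce() {
    for (std::uint64_t vlenb = 16; vlenb <= 512; vlenb <<= 1)
        for (std::uint64_t factor = 1; factor <= 1024; ++factor)
            if (div_factor_closed(factor, vlenb)
                != div_factor_bruteforce(factor, vlenb))
                return false;
    return true;
}
```

With vlenb = 512 (the zvl4096b case) and factor = 8 the closed form gives
1 << 6, i.e. 64, which is the kind of upper bound an assert on div_factor
could check.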