Thanks Robin. > - IMHO we need to check both series for overflow, if step2 overflows in the > smaller type isn't the result equally wrong?
The series2 will shift right before IOR, thus the overflow bits never effect on the final result. For example, the series2 will be similar as below after shift. v2.b = {0, 17, 0, 33, 0, 49, 0, 65, 0, 81, 0, 97, 0, 113, 0, 129} > - As discussed in V1, isn't is sufficient to just check the last element? > XVECLEN is equal to the number of elements in a fixed-length const vector but > not in a variable-length one. So for a variable-length vector you'd be > getting XVECLEN = 6 and only check up to a step of 2 when there might be > many more steps. For i32 interleave, we will extend to i64 for series1 and series2, I think we can design a series like base + i * step with last element overflow to bit 65 but have bits 32-64 all zero, and then the element before the last one will have overflow bits pollute the result. It seems checking the last 2 elements is good enough here. > We need to either forbid variable-length vectors for this scheme altogether or > determine the maximum runtime number of elements possible for the current > mode. > We don't support more than 65536 bits in a vector which would "naturally" > limit > the number of elements. I suppose variable-length vectors are not too common > for this scheme and falling back to a less optimized one shouldn't be too > costly. That's just a gut feeling, though. Make sense, will fall back to less optimized for VLA. > - Dividing/right shifting poly_ints is tricky and there is can_div_trunc_p to > check if we can do that at all. If we forbade VLA that wouldn't be an issue > and all we needed to check is CONST_VECTOR_NUNITS () * step. I see, that explains why we have poly shift right in previous, will update in v3. Pan -----Original Message----- From: Robin Dapp <rdapp....@gmail.com> Sent: Wednesday, February 26, 2025 12:46 AM To: Li, Pan2 <pan2...@intel.com>; gcc-patches@gcc.gnu.org Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; jeffreya...@gmail.com; Robin Dapp <rdapp....@gmail.com> Subject: Re: [PATCH v2] RISC-V: Fix bug for expand_const_vector interleave [PR118931] > This patch would like to fix one bug when expanding const vector for the > interleave case. For example, we have: > > base1 = 151 > step = 121 > > For vec_series, we will generate vector in format of v[i] = base + i * step. > Then the vec_series will have below result for HImode, and we can find > that the result overflow to the highest 8 bits of HImode. > > v1.b = {151, 255, 7, 0, 119, 0, 231, 0, 87, 1, 199, 1, 55, 2, 167, 2} A few remarks: - IMHO we need to check both series for overflow, if step2 overflows in the smaller type isn't the result equally wrong? - As discussed in V1, isn't is sufficient to just check the last element? XVECLEN is equal to the number of elements in a fixed-length const vector but not in a variable-length one. So for a variable-length vector you'd be getting XVECLEN = 6 and only check up to a step of 2 when there might be many more steps. We need to either forbid variable-length vectors for this scheme altogether or determine the maximum runtime number of elements possible for the current mode. We don't support more than 65536 bits in a vector which would "naturally" limit the number of elements. I suppose variable-length vectors are not too common for this scheme and falling back to a less optimized one shouldn't be too costly. That's just a gut feeling, though. - Dividing/right shifting poly_ints is tricky and there is can_div_trunc_p to check if we can do that at all. If we forbade VLA that wouldn't be an issue and all we needed to check is CONST_VECTOR_NUNITS () * step. -- Regards Robin