On 4/25/20 11:49 AM, Richard Henderson wrote:
> On 4/23/20 6:59 AM, Laurent Desnogues wrote:
>> 2. sve_zip_p
>>
>> This generates extraneous data in the higher part of the result.
>>
>> I hit this when I got a wrong result on an instruction that ends up
>> using sve_cntp which counts all bits set in each 64-bit chunk. There
>> might be some other instructions beyond ZIP that generate extra data
>> that would break sve_cntp. So perhaps it'd be easier to fix sve_cmtp
>> (and hope that it's the only function that uses bits beyond vector
>> length...).
>
> I don't see how sve_zip_p can set high bits. If vl is not a multiple of 512,
> it writes in units of uint16_t. This cannot produce values outside range.

Bah. I was looking at zip2 first. zip1 uses the uint64_t path. I see the
problem now.

r~
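
For illustration only, here is a minimal standalone sketch of the kind of
failure being discussed. It is not the QEMU helper code: the names
spread_bits, zip_lo_unmasked, zip_lo_masked and count_chunk are hypothetical,
and it assumes a predicate of pred_bits bits held in one uint64_t with one
bit per byte element. A low-half zip done in 64-bit units interleaves 32 bits
from each input, so with a shorter predicate the result picks up bits beyond
the predicate length, and a count over the whole 64-bit chunk then sees them.

```c
#include <stdint.h>

/* Spread the low 32 bits of x so that bit i moves to bit 2*i. */
static uint64_t spread_bits(uint64_t x)
{
    x &= 0xffffffffull;
    x = (x | (x << 16)) & 0x0000ffff0000ffffull;
    x = (x | (x << 8))  & 0x00ff00ff00ff00ffull;
    x = (x | (x << 4))  & 0x0f0f0f0f0f0f0f0full;
    x = (x | (x << 2))  & 0x3333333333333333ull;
    x = (x | (x << 1))  & 0x5555555555555555ull;
    return x;
}

/*
 * Hypothetical low-half zip in one 64-bit unit: always interleaves
 * 32 bits from each input, whatever the real predicate length is.
 */
static uint64_t zip_lo_unmasked(uint64_t n, uint64_t m)
{
    return spread_bits(n) | (spread_bits(m) << 1);
}

/* Same operation, but result bits at or above pred_bits are cleared. */
static uint64_t zip_lo_masked(uint64_t n, uint64_t m, unsigned pred_bits)
{
    uint64_t mask = pred_bits >= 64 ? ~0ull : (1ull << pred_bits) - 1;
    return zip_lo_unmasked(n, m) & mask;
}

/* Count every set bit in a 64-bit chunk, valid or not. */
static int count_chunk(uint64_t p)
{
    int count = 0;
    while (p) {
        p &= p - 1;   /* clear the lowest set bit */
        count++;
    }
    return count;
}
```

With pred_bits = 32 and all 32 bits set in both inputs, the unmasked variant
fills the whole 64-bit chunk, so count_chunk reports 64 rather than 32. The
masked variant reports 32. Whether the mask belongs in the zip or in the
count is the question raised in the quoted text above.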