On Wed, 1 Mar 2023, Richard Sandiford wrote:

> Pan Li via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> > I am not very familiar with the memory pattern, maybe juzhe can provide
> > more information or correct me if anything is misleading.
> >
> > The different precision try to resolve the below bugs, the second vlm(with
> > different size of load bytes compared to first one)
> > is eliminated because vbool8 and vbool16 have the same precision size, aka
> > [8, 8].
> >
> > vbool8_t v2 = *(vbool8_t*)in;
> > vbool16_t v5 = *(vbool16_t*)in;
> > *(vbool16_t*)(out + 200) = v5;
> > *(vbool8_t*)(out + 100) = v2;
> >
> > addi a4,a1,100
> > vsetvli a5,zero,e8,m1,ta,ma
> > addi a1,a1,200
> > vlm.v v24,0(a0)
> > vsm.v v24,0(a4)
> > // Need one vsetvli and vlm.v for correctness here.
> > vsm.v v24,0(a1)
>
> But I think it's important to think about the patch as more than a way
> of fixing the bug above. The aim has to be to describe the modes as they
> really are.
>
> I don't think there's a way for GET_MODE_SIZE to be "conservatively wrong".
> A GET_MODE_SIZE that is too small would cause problems. So would a
> GET_MODE_SIZE that is too big.
>
> Like Richard says, I think the question comes down to the amount of padding.
> Is it the case that for 4+4X ([4,4]), the memory representation has 4 bits
> of padding for even X and 0 bits of padding for odd X?
>
> I agree getting rid of GET_MODE_SIZE and representing everything in bits
> would avoid the problem at this point, but I think it would just be pushing
> the difficulty elsewhere. E.g. stack layout will be "interesting" if we
> can't work in byte sizes.
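
To make the even/odd padding question above concrete, here is a small arithmetic sketch in plain C. It is not GCC code: the helper names are made up for illustration, and it assumes the memory representation simply rounds the 4+4X bits up to whole bytes, which is exactly the point being asked about and may not match what the hardware does.

```c
#include <stdio.h>

/* Hypothetical helper: precision in bits of a mode whose poly
   precision is [4, 4], i.e. 4 + 4*X, where X is the runtime
   indeterminate. */
unsigned precision_bits(unsigned x)
{
    return 4u + 4u * x;
}

/* Assumption: the in-memory size is the precision rounded up to a
   whole number of bytes. */
unsigned size_bytes(unsigned x)
{
    return (precision_bits(x) + 7u) / 8u;
}

/* Padding bits left over in the last byte under that assumption. */
unsigned padding_bits(unsigned x)
{
    return size_bytes(x) * 8u - precision_bits(x);
}

/* Print one row per value of X from 0 to max_x inclusive. */
void print_padding_table(unsigned max_x)
{
    for (unsigned x = 0; x <= max_x; x++)
        printf("X=%u bits=%u bytes=%u padding=%u\n",
               x, precision_bits(x), size_bytes(x), padding_bits(x));
}
```

Under that rounding assumption, padding_bits() alternates between 4 for even X and 0 for odd X, so no single byte-sized polynomial in X describes the size exactly for both cases.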

I suppose the backend could ensure, when it performs setvl, that we only
ever end up with either the even or the odd case, and also restrict
vectorization that way? Then we could implement the working half and make
sure the non-working half never happens at runtime ...

Richard.