Tejas Belagod <tbela...@arm.com> writes:
>> The problem is that one reg rtx can span several hard registers.
>> E.g. (reg:V4SI 32) might represent one 64-bit register (no. 32),
>> but it might instead represent two 32-bit registers (nos. 32 and 33).
>> Obviously the latter's not very likely for vectors this small,
>> but more likely for larger ones (including on NEON IIRC).
>>
>> So if we had 2 32-bit registers being treated as a V4HI, it would be:
>>
>>    <--32--><--33-->
>>    msb          lsb
>>    0000111122223333
>>    VVVVVVVV
>>    00001111
>>    msb  lsb
>>    <--32-->
>>
>> for big endian and:
>>
>>    <--33--><--32-->
>>    msb          lsb
>>    3333222211110000
>>            VVVVVVVV
>>            11110000
>>            msb  lsb
>>            <--32-->
>>
>> for little endian.
>
> Ah, ok, that makes things clearer. Thanks for that.
>
> I can't find any helper function that figures out if we're writing partial or
> full result regs. Would something like
>
>    REGNO (src) == REGNO (dst) &&
>    HARD_REGNO_NREGS (src) == HARD_REGNO_NREGS (dst) == 1
>
> be a sane check for partial result regs?

Yeah, that should work. I think a more general alternative would be:

  simplify_subreg_regno (REGNO (src), GET_MODE (src),
                         offset, GET_MODE (dst)) == (int) REGNO (dst)

where:

  offset = GET_MODE_UNIT_SIZE (GET_MODE (src)) * INTVAL (XVECEXP (sel, 0, 0))

That offset is the byte offset of the first selected element from the
start of a vector in memory, which is also the way that SUBREG_BYTEs
are counted. For little-endian it gives the offset of the lsb of the
slice, while for big-endian it gives the offset of the msb (which is
also how SUBREG_BYTEs work).

The simplify_subreg_regno call should cope with both single-register
vectors and multi-register vectors.

Thanks,
Richard
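
To make the offset arithmetic above concrete, here is a small standalone sketch in C. It is a toy model only, not GCC's simplify_subreg_regno: the names slice_byte_offset and toy_subreg_regno are hypothetical, and the model assumes the value lives in consecutive hard registers laid out in memory order and only handles slices that start on a register boundary.

```c
#include <assert.h>

/* Byte offset of the first selected element from the start of the
   vector in memory: unit size times element index.  This mirrors the
   "offset" expression in the message.  */
static int
slice_byte_offset (int unit_size, int first_index)
{
  return unit_size * first_index;
}

/* Toy model only, NOT GCC's simplify_subreg_regno.  Assumes the value
   occupies consecutive hard registers of REG_SIZE bytes each, in
   memory order, starting at FIRST_REGNO.  Returns the hard register
   that holds the byte at OFFSET, or -1 if OFFSET does not fall on a
   register boundary (a case this sketch does not try to model).  */
static int
toy_subreg_regno (int first_regno, int reg_size, int offset)
{
  assert (reg_size > 0 && offset >= 0);
  if (offset % reg_size != 0)
    return -1;
  return first_regno + offset / reg_size;
}
```

For the V4HI example in two 32-bit registers (unit size 2, register size 4, first register 32), selecting elements 0 and 1 gives offset 0 and register 32, which matches both diagrams. Selecting elements 2 and 3 gives offset 4 and register 33. A partial-result check in this model would then compare the returned register number against the destination's.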