On Wed, Dec 11, 2013 at 8:26 AM, Tejas Belagod <tbela...@arm.com> wrote:
> H.J. Lu wrote:
>>
>> On Wed, Dec 11, 2013 at 7:49 AM, Richard Sandiford
>> <rdsandif...@googlemail.com> wrote:
>>>
>>> "H.J. Lu" <hjl.to...@gmail.com> writes:
>>>>
>>>> On Wed, Dec 11, 2013 at 1:13 AM, Richard Sandiford
>>>> <rdsandif...@googlemail.com> wrote:
>>>>>
>>>>> Richard Henderson <r...@redhat.com> writes:
>>>>>>
>>>>>> On 12/10/2013 10:44 AM, Richard Sandiford wrote:
>>>>>>>
>>>>>>> Sorry, I don't understand. I never said it was invalid. I said
>>>>>>> (subreg:SF (reg:V4SF X) 1) was invalid if (reg:V4SF X) represents
>>>>>>> a single register. On a little-endian target, the offset cannot be
>>>>>>> anything other than 0 in that case.
>>>>>>>
>>>>>>> So the CANNOT_CHANGE_MODE_CLASS code above seems to be checking for
>>>>>>> something that is always invalid, regardless of the target. That
>>>>>>> kind of situation should be rejected by target-independent code
>>>>>>> instead.
>>>>>>
>>>>>> But, we want to disable the subreg before we know whether or not
>>>>>> (reg:V4SF X) will be allocated to a single hard register. That is
>>>>>> something that we can't know in target-independent code before
>>>>>> register allocation.
>>>>>
>>>>> I was thinking that if we've got a class, we've also got things like
>>>>> CLASS_MAX_NREGS. Maybe that doesn't cope with padding properly though.
>>>>> But even in the padding cases an offset-based check in C_C_M_C could
>>>>> be derived from other information.
>>>>>
>>>>> subreg_get_info handles padding with:
>>>>>
>>>>>       nregs_xmode = HARD_REGNO_NREGS_WITH_PADDING (xregno, xmode);
>>>>>       if (GET_MODE_INNER (xmode) == VOIDmode)
>>>>>         xmode_unit = xmode;
>>>>>       else
>>>>>         xmode_unit = GET_MODE_INNER (xmode);
>>>>>       gcc_assert (HARD_REGNO_NREGS_HAS_PADDING (xregno, xmode_unit));
>>>>>       gcc_assert (nregs_xmode
>>>>>                   == (GET_MODE_NUNITS (xmode)
>>>>>                       * HARD_REGNO_NREGS_WITH_PADDING (xregno,
>>>>>                                                        xmode_unit)));
>>>>>       gcc_assert (hard_regno_nregs[xregno][xmode]
>>>>>                   == (hard_regno_nregs[xregno][xmode_unit]
>>>>>                       * GET_MODE_NUNITS (xmode)));
>>>>>
>>>>>       /* You can only ask for a SUBREG of a value with holes in the
>>>>>          middle if you don't cross the holes. (Such a SUBREG should
>>>>>          be done by picking a different register class, or doing it
>>>>>          in memory if necessary.) An example of a value with holes
>>>>>          is XCmode on 32-bit x86 with -m128bit-long-double; it's
>>>>>          represented in 6 32-bit registers, 3 for each part, but in
>>>>>          memory it's two 128-bit parts.
>>>>>          Padding is assumed to be at the end (not necessarily the
>>>>>          'high part') of each unit. */
>>>>>       if ((offset / GET_MODE_SIZE (xmode_unit) + 1
>>>>>            < GET_MODE_NUNITS (xmode))
>>>>>           && (offset / GET_MODE_SIZE (xmode_unit)
>>>>>               != ((offset + GET_MODE_SIZE (ymode) - 1)
>>>>>                   / GET_MODE_SIZE (xmode_unit))))
>>>>>         {
>>>>>           info->representable_p = false;
>>>>>           rknown = true;
>>>>>         }
>>>>>
>>>>> and I wouldn't really want to force targets to individually reproduce
>>>>> that kind of logic at the class level. If the worst comes to the worst
>>>>> we could cache the difficult cases.
>>>>>
>>>> My case is x86 CANNOT_CHANGE_MODE_CLASS only needs
>>>> to know if the subreg byte is zero or not. It doesn't care about mode
>>>> padding. You are concerned about information passed to
>>>> CANNOT_CHANGE_MODE_CLASS is too expensive for target
>>>> to process. It isn't the case for x86.
>>>
>>> No, I'm concerned that by going this route, we're forcing every target
>>> (or at least every target with wider-than-word registers, which is most
>>> of the common ones) to implement the same target-independent restriction.
>>> This is not an x86-specific issue.
>>>
>>
>> So you prefer a generic solution which makes
>> CANNOT_CHANGE_MODE_CLASS return true
>> for vector mode subreg if subreg byte != 0. Is this
>> correct?
>
>
> Do you mean a generic solution for C_C_M_C to return true for non-zero
> byte_offset vector subregs in the context of x86?
>
> I want to clarify because in the context of 32-bit ARM little-endian, a
> non-zero byte-offset vector subreg is still a valid full hardreg. eg. for
>
> (subreg:DI (reg:V4SF) 8)
>
> C_C_M_C can return 'false' as this can be resolved to a full D-reg.
>
Does that mean subreg byte interpretation is endian-dependent?
Both little-endian

(subreg:DI (reg:V4SF) 0)

and big-endian

(subreg:DI (reg:V4SF) MAX_BITSIZE_MODE_ANY_MODE / BITS_PER_UNIT)

refer to the same lower 64 bits of reg:V4SF. Is this correct?

-- 
H.J.
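
For reference, a minimal sketch of how the byte offset of the low part can
depend on endianness. It assumes a simplified target where byte and word
endianness agree and the inner value has no padding. The helper name
lowpart_byte_offset is hypothetical; it is only loosely modelled on the idea
behind GCC's subreg_lowpart_offset and is not a copy of it.

```c
#include <stdbool.h>

/* Hypothetical helper (not a GCC function): return the byte offset that
   selects the least significant OUTER_SIZE bytes of an INNER_SIZE-byte
   value.  Assumes byte and word endianness agree and that the inner
   value has no padding.  */
unsigned int
lowpart_byte_offset (unsigned int outer_size, unsigned int inner_size,
                     bool big_endian)
{
  /* Same-size or wider outer modes start at byte 0 in this sketch.  */
  if (outer_size >= inner_size)
    return 0;

  /* The offset counts bytes in memory order, so on a big-endian target
     the least significant bytes sit at the end of the inner value.  */
  return big_endian ? inner_size - outer_size : 0;
}
```

Under those assumptions, lowpart_byte_offset (8, 16, false) gives 0 and
lowpart_byte_offset (8, 16, true) gives 8 for a 16-byte V4SF and an 8-byte DI.
Whether a particular vector register file follows this simple rule is a
target question that the sketch does not settle.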