On Thu, Dec 31, 2015 at 07:29:21AM -0800, H.J. Lu wrote:
> > This patch is just papering over a deeper problem, as Jakub said in the PR
> > [1]:
> >
> > --q--
> > GCC uses the ix86_legitimate_combined_insn target hook to disallow
> > misaligned memory into certain SSE instructions.
> > (subreg:V4SI (reg:TI 245 [ MEM[(const struct bitset &)FeatureEntry_21 + 8] 
> > ]) 0)
> > is not misaligned memory, it is a subreg of a pseudo register, so it is 
> > fine.
> > If the replacement of the pseudo register with memory happens in some
> > other pass, then it probably either should use the
> > legitimate_combined_insn target hook or some other one.  I think we
> > already have a PR where that happens during live range shrinking.
> > --/q--
> >
> > Please figure out where memory replacement happens. There are several
> > other SSE insns (please grep the .md for "ssememalign" attribute) that
> > are affected by this problem, so fixing a couple of patterns won't
> > solve the problem completely.
> 
> LRA turns
> 
> insn 64 63 108 6 (set (reg:V4SI 148 [ vect__28.85 ])
>         (xor:V4SI (reg:V4SI 149)
>             (subreg:V4SI (reg:TI 147 [ MEM[(const struct bitset
> &)FeatureEntry_2(D)] ]) 0))) foo.ii:26 3454 {*xorv4si3}
>      (expr_list:REG_DEAD (reg:V4SI 149)
>         (expr_list:REG_DEAD (reg:TI 147 [ MEM[(const struct bitset
> &)FeatureEntry_2(D)] ])
>             (expr_list:REG_EQUIV (mem/c:V4SI (plus:DI (reg/f:DI 20 frame)
>                         (const_int -16 [0xfffffffffffffff0])) [3
> MEM[(unsigned int *)&D.2851]+0 S16 A128])
>                 (nil)))))
> 
> into
> 
> (insn 64 63 108 6 (set (reg:V4SI 21 xmm0 [orig:148 vect__28.85 ] [148])
>         (xor:V4SI (reg:V4SI 21 xmm0 [149])
>             (mem:V4SI (reg/v/f:DI 4 si [orig:117 FeatureEntry ] [117])
> [6 MEM[(const struct bitset &)FeatureEntry_2(D)]+0 S16 A32])))
> foo.ii:26 3454 {*xorv4si3}
>      (expr_list:REG_EQUIV (mem/c:V4SI (plus:DI (reg/f:DI 20 frame)
>                 (const_int -16 [0xfffffffffffffff0])) [3 MEM[(unsigned
> int *)&D.2851]+0 S16 A128])
>         (nil)))
> 
> since
> 
> (mem:V4SI (reg/v/f:DI 4 si [orig:117 FeatureEntry ] [117]) [6
> MEM[(const struct bitset &)FeatureEntry_2(D)]+0 S16 A32])))
> 
> satisfies the 'm' constraint.  I don't think LRA should call
> ix86_legitimate_combined_insn to validate constraints on an
> instruction.

I think the best solution would be to arrange (through some gen* extension?)
for the backend's *.md files to specify additional code to be run during
recognition (ideally only on operands that are or might have vector mode and
are or might be MEMs, so that it wouldn't be done on every insn), and move the
          /* For pre-AVX disallow unaligned loads/stores where the
             instructions don't support it.  */
          if (!TARGET_AVX
              && VECTOR_MODE_P (mode)
              && misaligned_operand (op, mode))
            {
              unsigned int min_align = get_attr_ssememalign (insn);
              if (min_align == 0
                  || MEM_ALIGN (op) < min_align)
                return false;
            }
from the legitimate_combined_insn target hook into that.  Or run some target
hook during recog?  That way it would be checked in all recog spots, rather
than just the combiner.
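Something like this (completely untested sketch; the hook and the function
name are invented and nothing calls it today; the body is just the test from
ix86_legitimate_combined_insn above, wrapped so it can run on any insn that
recog has just matched):

/* Sketch only: a hypothetical target hook that recog () would consult after
   a pattern has matched, so that every recognition spot (combine, LRA,
   peephole2, ...) performs the same alignment check, not just the combiner.  */

static bool
ix86_validate_recognized_insn (rtx_insn *insn)
{
  extract_insn (insn);

  for (int i = 0; i < recog_data.n_operands; i++)
    {
      rtx op = recog_data.operand[i];
      machine_mode mode = GET_MODE (op);

      /* For pre-AVX disallow unaligned loads/stores where the
         instructions don't support it.  */
      if (!TARGET_AVX
          && VECTOR_MODE_P (mode)
          && misaligned_operand (op, mode))
        {
          unsigned int min_align = get_attr_ssememalign (insn);
          if (min_align == 0 || MEM_ALIGN (op) < min_align)
            return false;
        }
    }

  return true;
}
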
Or indeed switch the affected operands to different predicates and/or
constraints that disallow misaligned memory and perform this check there
(rough sketch below).
Unless something in LRA is the only other spot that could try to "combine"
misaligned memory into vector instructions.
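
The predicate variant would boil down to something like this (again only a
sketch, the name is invented; in practice it would be a define_predicate in
config/i386/predicates.md plus a matching memory constraint used instead of
plain "m" in the affected patterns, and insns that tolerate smaller alignment
via their ssememalign attribute would need a looser variant):

/* Sketch of a predicate (name invented) that refuses under-aligned vector
   MEMs on pre-AVX targets, so reload/LRA could never substitute such memory
   into an SSE pattern in the first place.  */

bool
aligned_vector_memory_operand (rtx op, machine_mode mode)
{
  if (!memory_operand (op, mode))
    return false;

  /* misaligned_operand checks MEM_ALIGN (op) < GET_MODE_ALIGNMENT (mode).  */
  return TARGET_AVX || !misaligned_operand (op, mode);
}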

        Jakub
