On 12/02/13 15:58, Jakub Jelinek wrote:
Hi!

As discussed in the PR, the combiner can combine an unaligned integral
load (e.g. TImode) with an SSE instruction that requires an aligned load,
without actually checking the alignment.  For AVX, most of the
instructions allow unaligned operands, except for a few vmov* instructions
whose patterns typically handle misaligned mems through misaligned_operand
checks, and some non-temporal move insns that have UNSPECs that should
prevent combination.  The following patch attempts to solve this by
rejecting the combination of unaligned memory loads/stores into SSE insns
that don't allow them.  I've added an ssememalign attribute for that, but
only later realized that even for insns which load/store memory values
smaller than 16 bytes, the arguments don't have to be aligned at all if
strict alignment checking isn't enabled in hardware.  So perhaps, instead
of ssememalign in bits, all we need is a boolean attribute saying whether
the insn requires pre-AVX memory operands to be as aligned as their mode
(with the default being that it does).
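
To make the idea concrete, here is a minimal sketch of the kind of check
described above.  The function name and structure are illustrative only
(the real check is in ix86_legitimate_combined_insn, per the ChangeLog
below), and it assumes the ssememalign attribute added by this patch:

/* Illustrative sketch only, not the actual patch hunk.  Pre-AVX, reject
   combining when a vector MEM operand is less aligned than its mode and
   the insn's ssememalign attribute says aligned memory is required.  */
static bool
insn_sse_mem_alignment_ok (rtx insn)
{
  int i;

  /* AVX encodings accept misaligned memory operands for most insns.  */
  if (TARGET_AVX)
    return true;

  extract_insn (insn);
  for (i = 0; i < recog_data.n_operands; i++)
    {
      rtx op = recog_data.operand[i];

      if (MEM_P (op)
          && VECTOR_MODE_P (GET_MODE (op))
          && MEM_ALIGN (op) < GET_MODE_ALIGNMENT (GET_MODE (op))
          && get_attr_ssememalign (insn) == 0)
        return false;
    }
  return true;
}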

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2013-12-02  Jakub Jelinek  <ja...@redhat.com>
            Uros Bizjak  <ubiz...@gmail.com>

        PR target/59163
        * config/i386/i386.c (ix86_legitimate_combined_insn): If for
        !TARGET_AVX there is misaligned MEM operand with vector mode
        and get_attr_ssememalign is 0, return false.
        (ix86_expand_special_args_builtin): Add get_pointer_alignment
        computed alignment and for non-temporal loads/stores also
        at least GET_MODE_ALIGNMENT as MEM_ALIGN.
        * config/i386/sse.md
        (<sse>_loadu<ssemodesuffix><avxsizesuffix><mask_name>,
        <sse>_storeu<ssemodesuffix><avxsizesuffix>,
        <sse2_avx_avx512f>_loaddqu<mode><mask_name>,
        <sse2_avx_avx512f>_storedqu<mode>, <sse3>_lddqu<avxsizesuffix>,
        sse_vmrcpv4sf2, sse_vmrsqrtv4sf2, sse2_cvtdq2pd, sse_movhlps,
        sse_movlhps, sse_storehps, sse_loadhps, *vec_interleave_highv2df,
        *vec_interleave_lowv2df, *vec_extractv2df_1_sse, sse2_movsd,
        sse4_1_<code>v8qiv8hi2, sse4_1_<code>v4qiv4si2,
        sse4_1_<code>v4hiv4si2, sse4_1_<code>v2qiv2di2,
        sse4_1_<code>v2hiv2di2, sse4_1_<code>v2siv2di2, sse4_2_pcmpestr,
        *sse4_2_pcmpestr_unaligned, sse4_2_pcmpestri, sse4_2_pcmpestrm,
        sse4_2_pcmpestr_cconly, sse4_2_pcmpistr, *sse4_2_pcmpistr_unaligned,
        sse4_2_pcmpistri, sse4_2_pcmpistrm, sse4_2_pcmpistr_cconly): Add
        ssememalign attribute.
        * config/i386/i386.md (ssememalign): New define_attr.

        * g++.dg/torture/pr59163.C: New test.
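
As an aside on the ix86_expand_special_args_builtin entry above, the change
amounts to recording a truthful MEM_ALIGN on the MEM generated for the
builtin's memory argument.  A hedged sketch with hypothetical variable names
(arg for the pointer tree, mem for the generated MEM, nontemporal_p for the
non-temporal case); consult the actual patch for the real code:

/* Illustrative sketch, not the actual hunk.  Record how aligned the
   pointer argument is so the combiner check can see a truthful
   MEM_ALIGN.  */
unsigned int align = get_pointer_alignment (arg);

/* Non-temporal loads/stores require mode alignment anyway.  */
if (nontemporal_p)
  align = MAX (align, GET_MODE_ALIGNMENT (GET_MODE (mem)));

if (align > MEM_ALIGN (mem))
  set_mem_align (mem, align);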
OK for the trunk.

I doubt it's worth doing anything special for the case where strict alignment on SSE stuff is turned off, unless someone is screaming for it.

I'm trusting that you and Uros actually have the right alignments in the sse.md changes.  I didn't look at those closely.

jeff
