Re: [PATCH i386][google]With -mtune=core2, avoid generating the slow unaligned vector load/store (issue5488054)

Sriraman Tallam Tue, 13 Dec 2011 10:28:24 -0800

On Mon, Dec 12, 2011 at 11:49 PM, Jakub Jelinek <ja...@redhat.com> wrote:
> On Mon, Dec 12, 2011 at 06:05:57PM -0800, Sriraman Tallam wrote:
>>       Do not vectorize loops on Core2 that need to use unaligned
>>       vector load/stores.
>>       * tree-vect-stmts.c (is_slow_vect_unaligned_load_store): New function.
>>       (vect_analyze_stmt): Check if the vectorizable load/store is slow.
>>       * target.def (TARGET_SLOW_UNALIGNED_VECTOR_MEMOP): New target hook.
>>       * doc/m.texi.in: Document new target hook:
>>       TARGET_SLOW_UNALIGNED_VECTOR_MEMOP
>>       * doc/m.texi: Regenerate.
>>       * config/i386/i386.c (ix86_slow_unaligned_vector_memop): New function.
>>       (TARGET_SLOW_UNALIGNED_VECTOR_MEMOP): New macro.
>
> IMHO it would be better if it didn't prevent vectorization of the loops
> altogether, but lead to using aligned stores with an alignment check
> before the vectorized loop if possible.
>
> Also, are unaligned loads equally expensive to unaligned stores?


Unaligned stores are always expensive, regardless of whether the
data turns out to be aligned at run time. Unaligned loads are that
expensive only when the data is actually unaligned. For unaligned loads
of aligned data, movdqu is still slower, but by ~2x rather than ~6x.
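
To illustrate the alternative suggested above (an alignment check before
the vectorized loop, so the stores can be aligned), here is a minimal
hand-written sketch in C. It is not taken from the patch or from the
vectorizer. The names peel_count and add_arrays are hypothetical, a
16-byte vector width is assumed, and dst is assumed to be at least
4-byte aligned. Only the store pointer is aligned, since stores are the
more expensive case; the loads from a and b may still be unaligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of leading elements to handle one at a
   time so that dst + result is 16-byte aligned (capped at n).
   Assumes dst is already aligned to sizeof(int32_t). */
size_t peel_count(const int32_t *dst, size_t n)
{
    uintptr_t misalign = (uintptr_t)dst % 16;
    size_t peel = misalign ? (16 - misalign) / sizeof(int32_t) : 0;
    return peel < n ? peel : n;
}

/* dst[i] = a[i] + b[i], structured as prologue / main loop / epilogue. */
void add_arrays(int32_t *dst, const int32_t *a, const int32_t *b, size_t n)
{
    size_t i = 0;
    size_t peel = peel_count(dst, n);

    /* Prologue: scalar iterations until the store address is aligned. */
    for (; i < peel; i++)
        dst[i] = a[i] + b[i];

    /* Main loop: four elements (16 bytes) per iteration. The store
       address is 16-byte aligned here, so a compiler could use an
       aligned vector store for dst. */
    for (; i + 4 <= n; i += 4) {
        assert((uintptr_t)(dst + i) % 16 == 0);
        for (size_t k = 0; k < 4; k++)
            dst[i + k] = a[i + k] + b[i + k];
    }

    /* Epilogue: remaining elements that do not fill a full vector. */
    for (; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

Whether the extra prologue pays off depends on the trip count and on
how costly the remaining unaligned loads are, which is the kind of
trade-off a cost model would have to weigh.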

>
> See http://gcc.gnu.org/PR49442 for further info, this really should be done
> using some cost model rather than a boolean hook.
>
>        Jakub
