On Sun, Dec 2, 2012 at 1:30 PM, Marc Glisse <marc.gli...@inria.fr> wrote:
>>> here is a patch. If it is accepted, I'll extend it to other vm patterns >>> (mul, div, min, max are likely candidates, but I need to check the doc). >>> It >>> passed bootstrap+testsuite on x86_64-linux. >>> >>> >>> 2012-12-01 Marc Glisse <marc.gli...@inria.fr> >>> >>> PR target/54855 >>> gcc/ >>> * config/i386/sse.md (<sse>_vm<plusminus_insn><mode>3): Rewrite >>> pattern. >>> * config/i386/i386-builtin-types.def: New function types. >>> * config/i386/i386.c (ix86_expand_args_builtin): Likewise. >>> (bdesc_args) <__builtin_ia32_addss, __builtin_ia32_subss, >>> __builtin_ia32_addsd, __builtin_ia32_subsd>: Change prototype. >>> * config/i386/xmmintrin.h: Adapt to new builtin prototype. >>> * config/i386/emmintrin.h: Likewise. >>> * doc/extend.texi (X86 Built-in Functions): Document changed >>> prototype. >>> >>> testsuite/ >>> * gcc.target/i386/pr54855-1.c: New testcase. >>> * gcc.target/i386/pr54855-2.c: New testcase. >> >> >> Yes, the approach looks correct to me, but I wonder why we have >> different representations for v4sf and v2df cases? I'd say that we >> should canonicalize patterns somewhere in the middle end (probably to >> vec_merge variant, as IMO vec_dup looks like degenerated vec_merge >> variant), otherwise we will have pattern explosion. > > > (I assume s/vec_dup/vec_concat/ above) Ah, yes. However, looking a bit more into the usage cases for these patterns, they are only used through intrinsics with _m128 operands. While your proposed patch makes these patterns more general (they can use 64bit aligned memory), this is not their usual usage, and for their intended usage, your proposed improvement complicates these patterns unnecessarily. Following on these facts, I'd say that we leave these special patters (since they serve their purpose well) and rather introduce new patterns for "other" uses. Uros.