On Wed, Sep 21, 2011 at 1:37 PM, Jakub Jelinek <ja...@redhat.com> wrote:
> For vcond{,u} etc. we currently generate vpandn+vpand+vpor
> sequence but SSE4.1+ has instructions for at least some modes
> to handle those 3 in one instruction (haven't benchmarked how much
> faster/slower it is though).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, tested
> on SandyBridge too, AVX2 just eyeballed.
>
> 2011-09-21  Jakub Jelinek  <ja...@redhat.com>
>
>        * config/i386/i386.c (ix86_expand_sse_movcc): Use
>        blendvps, blendvpd and pblendvb if possible.
>
>        * gcc.dg/vect/vect-cond-7.c: New test.
>        * gcc.target/i386/sse4_1-cond-1.c: New test.
>        * gcc.target/i386/avx-cond-1.c: New test.

OK with a nit below:

> --- gcc/config/i386/i386.c.jj 2011-09-20 22:21:35.000000000 +0200
> +++ gcc/config/i386/i386.c 2011-09-21 10:09:09.000000000 +0200
> @@ -18905,24 +18905,42 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp
>      }
>    else
>      {
> -      op_true = force_reg (mode, op_true);
> +      rtx (*gen) (rtx, rtx, rtx, rtx) = NULL;
> +
>        op_false = force_reg (mode, op_false);
> +      switch (mode)
> +        {
> +        case V4SFmode: if (TARGET_SSE4_1) gen = gen_sse4_1_blendvps; break;
> +        case V2DFmode: if (TARGET_SSE4_1) gen = gen_sse4_1_blendvpd; break;
> +        case V16QImode: if (TARGET_SSE4_1) gen = gen_sse4_1_pblendvb; break;
> +        case V8SFmode: if (TARGET_AVX) gen = gen_avx_blendvps256; break;
> +        case V4DFmode: if (TARGET_AVX) gen = gen_avx_blendvpd256; break;
> +        case V32QImode: if (TARGET_AVX2) gen = gen_avx2_pblendvb; break;
> +        default: break;

gen = NULL; here instead of break.

> +        }

Please also add appropriate line breaks in the above code...

Thanks,
Uros.
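
For readers following the thread, here is a minimal scalar sketch (not part of the patch) of why a single blend can stand in for the vpandn+vpand+vpor sequence. It assumes each mask byte is all-ones or all-zeros, as a vector comparison produces; the function names are hypothetical and only model the per-byte semantics in plain C.

```c
#include <stddef.h>
#include <stdint.h>

/* Three-operation form: (mask & t) | (~mask & f), byte by byte.
   This models the and + andnot + or sequence.  */
static void
select_and_andn_or (uint8_t *dst, const uint8_t *mask,
                    const uint8_t *t, const uint8_t *f, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (uint8_t) ((mask[i] & t[i]) | (~mask[i] & f[i]));
}

/* Blend form: choose t or f from the top bit of each mask byte,
   which is the bit pblendvb looks at.  */
static void
select_blend (uint8_t *dst, const uint8_t *mask,
              const uint8_t *t, const uint8_t *f, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = (mask[i] & 0x80) ? t[i] : f[i];
}
```

The two functions agree only under the all-ones/all-zeros mask assumption. With an arbitrary mask the first mixes bits from both inputs, while the second still picks whole bytes.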