"H. J. Lu" <[EMAIL PROTECTED]> wrote on 23/04/2007 01:34:39:
> On Mon, Apr 23, 2007 at 12:55:26AM +0300, Dorit Nuzman wrote:
> > "H. J. Lu" <[EMAIL PROTECTED]> wrote on 23/04/2007 00:29:16:
> >
> > > On Sun, Apr 22, 2007 at 11:14:20PM +0300, Dorit Nuzman wrote:
> > > > "H. J. Lu" <[EMAIL PROTECTED]> wrote on 20/04/2007 18:02:09:
> > > >
> > > > > Hi Dorit,
> > > > >
> > > > > SSE4 has vector zero/sign-extensions like:
> > > > >
> > > > > (define_insn "sse4_1_zero_extendv2siv2di2"
> > > > >   [(set (match_operand:V2DI 0 "register_operand" "=x")
> > > > >         (zero_extend:V2DI
> > > > >           (vec_select:V2SI
> > > > >             (match_operand:V4SI 1 "nonimmediate_operand" "xm")
> > > > >             (parallel [(const_int 0)
> > > > >                        (const_int 1)]))))]
> > > > >   "TARGET_SSE4_1"
> > > > >   "pmovzxdq\t{%1, %0|%0, %1}"
> > > > >   [(set_attr "type" "ssemov")
> > > > >    (set_attr "mode" "TI")])
> > > > >
> > > > > Does vectorizer support them?
> > > > >
> > > >
> > > > (sorry, I was away from email during Friday-Saturday) -
> > > >
> > > > so this looks like a vec_unpacku_hi_v4si (or _lo?), i.e. what is now
> > > > modeled as follows in sse.md:
> > > >
> > > > (define_expand "vec_unpacku_hi_v4si"
> > > >   [(match_operand:V2DI 0 "register_operand" "")
> > > >    (match_operand:V4SI 1 "register_operand" "")]
> > > >   "TARGET_SSE2"
> > > > {
> > > >   ix86_expand_sse_unpack (operands, true, true);
> > > >   DONE;
> > > > })
> > > >
> > >
> > > I am not sure if they are the same since SSE4.1 instructions
> > > extend the first 2 elements in the vector, not the high/low
> > > parts.
> > >
> >
> > unpack high/low means the high/low elements of the vector
> >
>
> SSE4.1 has
>
> 1. The first 8 elements of V16QI zero/sign extend to V8HI.

This one is equivalent to vec_unpacku/s_hi_v16qi.

> 2. The first 4 elements of V16QI/V8HI zero/sign extend to V4SI.

The second of these two - "first 4 elements of V8HI zero/sign extend to
V4SI" - is equivalent to vec_unpacku/s_hi_v8hi.

> 3. The first 2 elements of V16QI/V8HI/V4SI zero/sign extend to V2DI.

The last of these three - "first 2 elements of V4SI zero/sign extend to
V2DI" - is equivalent to vec_unpacku/s_hi_v4si.

We currently don't have idioms that represent the other forms.

By the way, the vectorizer will not be able to make use of these
vec_unpacku/s_hi_* insns if you don't define the corresponding
vec_unpacku/s_lo_* patterns (although I think these are already defined
in sse.md, though maybe less efficiently than the way sse4 can support
them?).

dorit

> H.J.
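
To illustrate why the vectorizer wants both halves, here is a minimal
scalar sketch in C of a widening loop, not GCC code. The type and
function names (v4si_model, v2di_model, zext_first_half,
zext_second_half, widen_u32_to_u64) are hypothetical and only model the
semantics discussed above. Whether the "first" half corresponds to the
_hi or the _lo pattern on a given target is deliberately left open, as
in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar models of a 4 x 32-bit and a 2 x 64-bit vector (hypothetical names). */
typedef struct { uint32_t e[4]; } v4si_model;
typedef struct { uint64_t e[2]; } v2di_model;

/* Zero-extend elements 0 and 1, i.e. the "first 2 elements" form
   that the thread describes for pmovzxdq. */
v2di_model zext_first_half(v4si_model v)
{
  v2di_model r = { { v.e[0], v.e[1] } };
  return r;
}

/* Zero-extend elements 2 and 3, the complementary half that the
   other unpack pattern has to provide. */
v2di_model zext_second_half(v4si_model v)
{
  v2di_model r = { { v.e[2], v.e[3] } };
  return r;
}

/* Widen n 32-bit values to 64 bits.  Each group of four inputs fills
   two output vectors, so both half-unpacks are needed per iteration. */
void widen_u32_to_u64(uint64_t *dst, const uint32_t *src, size_t n)
{
  size_t i = 0;

  assert(dst != NULL && src != NULL);

  for (; i + 4 <= n; i += 4) {
    v4si_model v = { { src[i], src[i + 1], src[i + 2], src[i + 3] } };
    v2di_model a = zext_first_half(v);
    v2di_model b = zext_second_half(v);

    dst[i]     = a.e[0];
    dst[i + 1] = a.e[1];
    dst[i + 2] = b.e[0];
    dst[i + 3] = b.e[1];
  }

  /* Scalar epilogue for the leftover elements. */
  for (; i < n; i++)
    dst[i] = src[i];
}
```

With only one of the two half-unpacks available, the vector body above
could produce just half of the outputs for each input vector, which is
the reason both patterns have to be defined.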