On Fri, Sep 27, 2013 at 04:19:45PM +0100, Vidya Praveen wrote:
> On Fri, Sep 27, 2013 at 03:50:08PM +0100, Vidya Praveen wrote:
> [...]
> > > > I can't really insist on the single lane load.. something like:
> > > >
> > > > vc:V4SI[0] = c
> > > > vt:V4SI = vec_duplicate:V4SI (vec_select:SI vc:V4SI 0)
> > > > va:V4SI = vb:V4SI <op> vt:V4SI
> > > >
> > > > Or is there any other way to do this?
> > >
> > > Can you elaborate on "I can't really insist on the single lane load"?
> > > What's the single lane load in your example?
> >
> > Loading just one lane of the vector like this:
> >
> > vc:V4SI[0] = c // from the above scalar example
> >
> > or
> >
> > vc:V4SI[0] = c[2]
> >
> > is what I meant by single lane load. In this example:
> >
> > t = c[2]
> > ...
> > vb:v4si = b[0:3]
> > vc:v4si = { t, t, t, t }
> > va:v4si = vb:v4si <op> vc:v4si
> >
> > If we are expanding the CONSTRUCTOR as vec_duplicate at vec_init, I cannot
> > insist 't' to be vector and t = c[2] to be vect_t[0] = c[2] (which could be
> > seen as vec_select:SI (vect_t 0) ).
> >
> > > I'd expect the instruction
> > > pattern as quoted to just work (and I hope we expand an uniform
> > > constructor { a, a, a, a } properly using vec_duplicate).
> >
> > As much as I went through the code, this is only done using vect_init. It is
> > not expanded as vec_duplicate from, for example, store_constructor() of
> > expr.c
>
> Do you see any issues if we expand such constructor as vec_duplicate directly
> instead of going through vect_init way?

Sorry, that was a bad question.

But here's what I would like to propose as a first step. Please tell me if
this is acceptable or if it makes sense:

- Introduce standard pattern names

  "vmulim4" - vector multiply with the second operand as an indexed operand

  Example:

  (define_insn "vmuliv4si4"
    [(set (match_operand:V4SI 0 "register_operand")
          (mult:V4SI (match_operand:V4SI 1 "register_operand")
                     (vec_duplicate:V4SI
                       (vec_select:SI
                         (match_operand:V4SI 2 "register_operand")
                         (match_operand:SI 3 "immediate_operand")))))]
    ...
  )

  "vlmovmn3" - move where one of the operands is a specific lane of a vector
  and the other is a scalar.

  Example:

  (define_insn "vlmovv4sisi3"
    [(set (vec_select:SI (match_operand:V4SI 0 "register_operand")
                         (match_operand:SI 1 "immediate_operand"))
          (match_operand:SI 2 "memory_operand"))]
    ...
  )

- Identify the following idiom and expand it through the above standard
  patterns:

  t = c[m]
  vc[0:n] = { t, t, t, t }
  a[0:n] = b[0:n] * vc[0:n]

  as

  (insn (set (vec_select:SI (reg:V4SI 0) 0)
             (mem:SI ... )))
  (insn (set (reg:V4SI 1)
             (mult:V4SI (reg:V4SI 2)
                        (vec_duplicate:V4SI
                          (vec_select:SI (reg:V4SI 0) 0)))))

If this path is acceptable, then I can extend this to support

  "vmaddim4" - multiply and add (with indexed element as multiplier)
  "vmsubim4" - multiply and subtract (with indexed element as multiplier)

Please let me know your thoughts.

Cheers
VP
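
P.S. In case it helps to see the intended semantics outside of RTL, here is a
rough C model of the idiom and of the two-step expansion proposed above. It is
only a sketch: the function names mul_by_element and mul_by_lane and the
fixed width N = 4 are made up for illustration and are not GCC code, and the
choice of lane 0 simply mirrors the insns shown above.

```c
#include <assert.h>

#define N 4 /* assumed vector width, matching V4SI */

/* Scalar form of the idiom: t = c[m]; a[0:n] = b[0:n] * { t, t, t, t }. */
void mul_by_element(int *a, const int *b, const int *c, int m)
{
    assert(m >= 0 && m < N);
    int t = c[m]; /* the element that gets broadcast */
    for (int i = 0; i < N; i++)
        a[i] = b[i] * t;
}

/* Model of the proposed expansion:
   1. single lane load:   vc[lane] = c[m]             ("vlmov"-style move)
   2. indexed multiply:   a = b * dup(vc[lane])       ("vmuli"-style multiply)
   The other lanes of vc are never read. */
void mul_by_lane(int *a, const int *b, const int *c, int m)
{
    assert(m >= 0 && m < N);
    int vc[N] = {0};
    const int lane = 0;

    vc[lane] = c[m]; /* step 1: load one lane only */
    for (int i = 0; i < N; i++)
        a[i] = b[i] * vc[lane]; /* step 2: multiply by the selected lane */
}
```

Both functions should produce the same result for any valid m; the second one
just makes explicit that only a single lane of vc ever needs to be written.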