On 10/08/2011 08:43 AM, Jakub Jelinek wrote: > (define_expand "avx2_gathersi<mode>" > - [(set (match_operand:VEC_GATHER_MODE 0 "register_operand" "") > - (unspec:VEC_GATHER_MODE > - [(match_operand:VEC_GATHER_MODE 1 "register_operand" "") > - (match_operand:<ssescalarmode> 2 "memory_operand" "") > - (match_operand:<VEC_GATHER_MODE> 3 "register_operand" "") > - (match_operand:VEC_GATHER_MODE 4 "register_operand" "") > - (match_operand:SI 5 "const1248_operand " "")] > - UNSPEC_GATHER))] > + [(parallel [(set (match_operand:VEC_GATHER_MODE 0 "register_operand" "") > + (unspec:VEC_GATHER_MODE > + [(match_operand:VEC_GATHER_MODE 1 "register_operand" "") > + (match_operand:<ssescalarmode> 2 "memory_operand" "") > + (match_operand:<VEC_GATHER_MODE> 3 "register_operand" "") > + (match_operand:VEC_GATHER_MODE 4 "register_operand" "") > + (match_operand:SI 5 "const1248_operand " "")] > + UNSPEC_GATHER)) > + (clobber (match_dup 4))])] > "TARGET_AVX2")
The use of match_dup in the clobber is wrong. We should not be clobbering the user-visible copy of the operand. That does not make sense when dealing with the user-visible builtin. > > (define_insn "*avx2_gathersi<mode>" > - [(set (match_operand:VEC_GATHER_MODE 0 "register_operand" "=x") > + [(set (match_operand:VEC_GATHER_MODE 0 "register_operand" "=&x") > (unspec:VEC_GATHER_MODE > - [(match_operand:VEC_GATHER_MODE 1 "register_operand" "0") > + [(match_operand:VEC_GATHER_MODE 2 "register_operand" "0") > (mem:<ssescalarmode> > - (match_operand:P 2 "register_operand" "r")) > - (match_operand:<VEC_GATHER_MODE> 3 "register_operand" "x") > - (match_operand:VEC_GATHER_MODE 4 "register_operand" "x") > - (match_operand:SI 5 "const1248_operand" "n")] > - UNSPEC_GATHER))] > + (match_operand:P 3 "register_operand" "r")) > + (match_operand:<VEC_GATHER_MODE> 4 "register_operand" "x") > + (match_operand:VEC_GATHER_MODE 5 "register_operand" "1") > + (match_operand:SI 6 "const1248_operand" "n")] > + UNSPEC_GATHER)) > + (clobber (match_operand:VEC_GATHER_MODE 1 "register_operand" "=&x"))] > "TARGET_AVX2" > - "v<gthrfirstp>gatherd<gthrlastp>\t{%4, (%2, %3, %c5), %0|%0, (%2, %3, > %c5), %4}" > + "v<gthrfirstp>gatherd<gthrlastp>\t{%1, (%3, %4, %c6), %0|%0, (%3, %4, > %c6), %1}" > [(set_attr "type" "ssemov") > (set_attr "prefix" "vex") > (set_attr "mode" "<sseinsnmode>")]) Instead, use (clobber (match_scratch)) and matching constraints with operand 4. > Still, the insn description is imprecise, saying that it loads from mem > at the address register is wrong and perhaps some DCE might delete > what shouldn't be deleted. So, either it should (use (mem (scratch))) > or something similar, or in the unspec list all the memory locations > that are being read > (mem:<scalarssemode> (plus:SI (reg:SI) (vec_select:SI (match_operand:V4SI) > (parallel [(const_int N)])))) > for N 0 through something (but it is complicated by Pmode size vs. > the need to do nothing/truncate/sign_extend the vec_select to the right > mode). I think that a (mem (scratch)) as input to the unspec is probably best. The exact memory usage is almost certainly too complex to describe in a useful way. r~