Hello!

> SAD (Sum of Absolute Differences) is a common and important algorithm
> in image processing and other areas. SSE2 even introduced a new
> instruction PSADBW for it. A SAD loop can be greatly accelerated by
> this instruction after being vectorized. This patch introduced a new
> operation SAD_EXPR and a SAD pattern recognizer in vectorizer.
>
> In order to express this new operation, a new expression SAD_EXPR is
> introduced in tree.def, and the corresponding entry in optabs is
> added. The patch also added the "define_expand" for SSE2 and AVX2
> platforms for i386.

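For context, the kind of loop such a pattern recognizer targets looks roughly like the one below. This is only a minimal sketch and is not taken from the patch; the function name sad_u8 and its parameters are made up for illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical example: sum of absolute differences over two byte
   arrays.  The inputs are 8-bit while the accumulator is wider, which
   is the shape a widening instruction such as PSADBW is built for.  */
int
sad_u8 (const uint8_t *a, const uint8_t *b, size_t n)
{
  int sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += abs (a[i] - b[i]);   /* operands promote to int before the subtraction */
  return sum;
}
```
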
+(define_expand "sadv16qi"
+  [(match_operand:V4SI 0 "register_operand")
+   (match_operand:V16QI 1 "register_operand")
+   (match_operand:V16QI 2 "register_operand")
+   (match_operand:V4SI 3 "register_operand")]
+  "TARGET_SSE2"
+{
+  rtx t1 = gen_reg_rtx (V2DImode);
+  rtx t2 = gen_reg_rtx (V4SImode);
+  emit_insn (gen_sse2_psadbw (t1, operands[1], operands[2]));
+  convert_move (t2, t1, 0);
+  emit_insn (gen_rtx_SET (VOIDmode, operands[0],
+                          gen_rtx_PLUS (V4SImode,
+                                        operands[3], t2)));
+  DONE;
+})

Please use generic expanders (expand_simple_binop) to generate the plus
expression. Also, please use the nonimmediate_operand predicate for
operand 2 and operand 3. Please note that nonimmediate operands
should be passed as the second input operand to commutative operators,
to match their insn pattern layout.

Uros.
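
P.S. For readers following the data flow of the quoted expander, here is a rough plain-C model of what it computes. It is only a sketch under stated assumptions: the names model_psadbw and model_sadv16qi are invented for illustration, and the V2DI to V4SI convert_move is assumed to be a reinterpretation of the same 128 bits (modelled here with memcpy, so the lane order of the result depends on host endianness).

```c
#include <stdint.h>
#include <string.h>

/* Model of PSADBW on 16 bytes: each 8-byte half produces one 64-bit
   sum of absolute differences.  */
static void
model_psadbw (uint64_t t1[2], const uint8_t a[16], const uint8_t b[16])
{
  for (int half = 0; half < 2; half++)
    {
      uint64_t s = 0;
      for (int i = 0; i < 8; i++)
        {
          int d = a[half * 8 + i] - b[half * 8 + i];
          s += (uint64_t) (d < 0 ? -d : d);
        }
      t1[half] = s;
    }
}

/* Model of the expander body: dst = acc + (V4SI) psadbw (a, b).
   Assumption: the mode conversion only reinterprets the bits.  */
void
model_sadv16qi (uint32_t dst[4], const uint8_t a[16], const uint8_t b[16],
                const uint32_t acc[4])
{
  uint64_t t1[2];
  uint32_t t2[4];

  model_psadbw (t1, a, b);
  memcpy (t2, t1, sizeof t2);          /* V2DI viewed as V4SI */
  for (int i = 0; i < 4; i++)
    dst[i] = acc[i] + t2[i];           /* the PLUS under discussion */
}
```

Because the result feeds a reduction, what matters is that the four lanes of dst sum to the lanes of acc plus the total SAD; which lanes carry the partial sums is irrelevant.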