On 12.04.19 23:05, Richard Henderson wrote:
> On 4/11/19 12:07 AM, David Hildenbrand wrote:
>> +    static const GVecGen3 g[5] = {
>> +        { .fni8 = gen_acc8_i64, },
>> +        { .fni8 = gen_acc16_i64, },
>> +        { .fni8 = gen_acc32_i64, },
>> +        { .fni8 = gen_acc_i64, },
>> +        { .fno = gen_helper_gvec_vacc128, },
>> +    };
> 
> Vector versions of the first four are fairly simple too.
> 
> static void gen_acc_vec(unsigned vece, TCGv_vec d, TCGv_vec a, TCGv_vec b)
> {
>     TCGv_vec t = tcg_temp_new_vec_matching(d);
> 
>     tcg_gen_add_vec(vece, t, a, b);
>     tcg_gen_cmp_vec(TCG_COND_LTU, vece, d, t, a);  /* produces -1 for carry */
>     tcg_gen_neg_vec(vece, d, d);                   /* convert to +1 for carry */
> }
> 
>     { .fni8 = gen_acc8_i64,
>       .fniv = gen_acc_vec,
>       .opc = INDEX_op_cmp_vec,
>       .vece = MO_8 },
>     ...
Indeed, I haven't really explored vector operations yet. This is more
compact than I expected :)

> I'm surprised that you're expanding the 128-bit addition out-of-line.
> One possible expansion is
> 
>     tcg_gen_add2_i64(tl, th, al, zero, bl, zero);
>     tcg_gen_add2_i64(tl, th, th, zero, ah, zero);
>     tcg_gen_add2_i64(tl, th, tl, th, bl, zero);
>     /* carry out in th */

Nice trick. Just so I get it right, the third line should actually be

    tcg_gen_add2_i64(tl, th, tl, th, bh, zero);

right? (I have spelled out the full sequence as I understand it below my
signature, just to be sure.)

Thanks!

-- 

Thanks,

David / dhildenb
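
For completeness, here is a minimal sketch of the inline expansion as I
read it, with the bh fix applied. This is untested; the function name, the
dl/dh parameters for the two halves of the result element, and the
assumption that only the carry (0 or 1) is returned in the low doubleword
are mine, not from the mail above:

    static void gen_acc128_i64(TCGv_i64 dl, TCGv_i64 dh,
                               TCGv_i64 al, TCGv_i64 ah,
                               TCGv_i64 bl, TCGv_i64 bh)
    {
        TCGv_i64 tl = tcg_temp_new_i64();
        TCGv_i64 th = tcg_temp_new_i64();
        TCGv_i64 zero = tcg_const_i64(0);

        /* low halves: tl = al + bl, th = carry out of the low doubleword */
        tcg_gen_add2_i64(tl, th, al, zero, bl, zero);
        /* high halves, step 1: tl = carry + ah, th = carry of that */
        tcg_gen_add2_i64(tl, th, th, zero, ah, zero);
        /* high halves, step 2: add bh; the 128-bit carry out ends up in th */
        tcg_gen_add2_i64(tl, th, tl, th, bh, zero);

        /* only the carry is returned: 0 or 1 in the low doubleword */
        tcg_gen_mov_i64(dl, th);
        tcg_gen_movi_i64(dh, 0);

        tcg_temp_free_i64(tl);
        tcg_temp_free_i64(th);
        tcg_temp_free_i64(zero);
    }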