Hi!

On Sat, Oct 10, 2020 at 03:08:23AM -0500, Xionghu Luo wrote:
> vec_insert accepts 3 arguments, arg0 is input vector, arg1 is the value
> to be insert, arg2 is the place to insert arg1 to arg0. Current expander
> generates stxv+stwx+lxv if arg2 is variable instead of constant, which
> causes serious store hit load performance issue on Power. This patch tries
> 1) Build VIEW_CONVERT_EXPR for vec_insert (i, v, n) like v[n&3] = i to
> unify the gimple code, then expander could use vec_set_optab to expand.
> 2) Expand the IFN VEC_SET to fast instructions: lvsr+insert+lvsl.
> In this way, "vec_insert (i, v, n)" and "v[n&3] = i" won't be expanded too
> early in gimple stage if arg2 is variable, avoid generating store hit load
> instructions.
>
> For Power9 V4SI:
>     addi 9,1,-16
>     rldic 6,6,2,60
>     stxv 34,-16(1)
>     stwx 5,9,6
>     lxv 34,-16(1)
> =>
>     rlwinm 6,6,2,28,29
>     mtvsrwz 0,5
>     lvsr 1,0,6
>     lvsl 0,0,6
>     xxperm 34,34,33
>     xxinsertw 34,0,12
>     xxperm 34,34,32

It still takes me quite some time to verify this, tricky bit-fiddling!
But the code that generates this is easier to read :-)

> +/* Insert VAL into IDX of TARGET, VAL size is same of the vector element, IDX
> +   is variable and also counts by vector element size. */

"Set vector element IDX of TARGET to VAL. IDX is not a constant integer."?

Okay for trunk (with an improved comment). Thanks!


Segher
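
For readers following along, the source-level operation under discussion ("set vector element IDX of TARGET to VAL", with IDX not a compile-time constant) can be sketched with the GCC generic vector extension. This is only an illustration of the v[n&3] = i form quoted above, not code from the patch: the type name v4si and the helper set_elem are hypothetical, and which instructions the compiler emits for it depends on the target and the GCC version.

```c
/* Four 32-bit ints in one 16-byte vector (GCC/Clang generic vector
   extension); the name v4si is chosen here to mirror the V4SI mode.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Hypothetical helper, not part of the patch: set element N of V to VAL.
   N is a run-time value and is reduced modulo the element count, which
   is the "v[n&3] = i" form from the patch description.  */
v4si
set_elem (v4si v, int val, unsigned int n)
{
  v[n & 3] = val;   /* variable-index store into the vector */
  return v;
}
```

For example, set_elem applied to { 10, 20, 30, 40 } with val 99 and n 6 yields { 10, 20, 99, 40 }, since 6 & 3 is 2.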