On Sat, Aug 24, 2024 at 1:31 PM Li, Pan2 <pan2...@intel.com> wrote: > > Thanks Jakub and Richard for explanation and help, I will double check > saturate matching for the const_int strict check. > > Back to this below case, do we still need some ad-hoc step to unblock the > type check when vectorizable_call? > For example, the const_int 9u may have int type for .SAT_ADD(uint8_t, 9u). > Or we have somewhere else to make the vectorizable_call happy.
I don't see how vectorizable_call itself can handle this since it doesn't have any idea about the type requirements. Instead pattern recognition of .SAT_ADD should promote/demote the invariants - of course there might be correctness issues involved with matching .ADD_OVERFLOW in the first place. What I read is that .ADD_OVERFLOW produces a value that is equal to the twos-complement add of its arguments promoted/demoted to the result type, correct? Richard. > #define DEF_VEC_SAT_U_ADD_IMM_FMT_3(T, IMM) \ > T __attribute__((noinline)) \ > vec_sat_u_add_imm##IMM##_##T##_fmt_3 (T *out, T *in, unsigned limit) \ > { \ > unsigned i; \ > T ret; \ > for (i = 0; i < limit; i++) \ > { \ > out[i] = __builtin_add_overflow (in[i], IMM, &ret) ? -1 : ret; \ > } \ > } > > DEF_VEC_SAT_U_ADD_IMM_FMT_3(uint8_t, 9u) > > Pan > > -----Original Message----- > From: Richard Biener <richard.guent...@gmail.com> > Sent: Friday, August 23, 2024 6:53 PM > To: Jakub Jelinek <ja...@redhat.com> > Cc: Li, Pan2 <pan2...@intel.com>; gcc-patches@gcc.gnu.org > Subject: Re: [PATCH v1] Vect: Promote unsigned .SAT_ADD constant operand for > vectorizable_call > > On Thu, Aug 22, 2024 at 8:36 PM Jakub Jelinek <ja...@redhat.com> wrote: > > > > On Tue, Aug 20, 2024 at 01:52:35PM +0200, Richard Biener wrote: > > > On Sat, Aug 17, 2024 at 11:18 PM Jakub Jelinek <ja...@redhat.com> wrote: > > > > > > > > On Sat, Aug 17, 2024 at 05:03:14AM +0000, Li, Pan2 wrote: > > > > > Please feel free to let me know if there is anything I can do to fix > > > > > this issue. Thanks a lot. > > > > > > > > There is no bug. The operands of .{ADD,SUB,MUL}_OVERFLOW don't have to > > > > have the same type, as described in the > > > > __builtin_{add,sub,mul}_overflow{,_p} documentation, each argument can > > > > have different type and result yet another one, the behavior is then > > > > (as if) to perform the operation in infinite precision and if that > > > > result fits into the result type, there is no overflow, otherwise there > > > > is. > > > > So, there is no need to promote anything. > > > > > > Hmm, it's a bit awkward to have this state in the IL. > > > > Why? These aren't the only internal functions which have different types > > of arguments, from the various widening ifns, conditional ifns, > > scatter/gather, ... Even the WIDEN_*EXPR trees do have type differences > > among arguments. > > And it matches what the user builtin does. > > > > Furthermore, at least without _BitInt (but even with _BitInt at the maximum > > precision too) this might not be even possible. > > E.g. if there is __builtin_add_overflow with unsigned __int128 and __int128 > > arguments and there are no wider types there is simply no type to use for > > both > > arguments, it would need to be a signed type with at least 129 bits... > > > > > I see that > > > expand_arith_overflow eventually applies > > > promotion, namely to the type of the LHS. > > > > The LHS doesn't have to be wider than the operand types, so it can't promote > > always. Yes, in some cases it applies promotion if it is desirable for > > codegen purposes. But without the promotions explicitly in the IL it > > doesn't need to rely on VRP to figure out how to expand it exactly. > > > > > Exposing this earlier could > > > enable optimization even > > > > Which optimizations? > > I was thinking of merging conversions with that implied promotion. > > > We already try to fold the .{ADD,SUB,MUL}_OVERFLOW > > builtins to constants or non-overflowing arithmetics etc. as soon as we > > can e.g. using ranges prove the operation will never overflow or will always > > overflow. Doing unnecessary promotion (see above that it might not be > > always possible at all) would just make the IL larger and risk we during > > expansion actually perform the promotions even when we don't have to. > > We on the other side already have match.pd rules to undo such promotions > > in the operands. See > > /* Demote operands of IFN_{ADD,SUB,MUL}_OVERFLOW. */ > > And the result (well, TREE_TYPE of the lhs type) can be yet another type, > > not related to either of those in any way. > > OK, fair enough. I think this also shows again the lack of documentation > of internal function signatures (hits me all the time with the more complex > ones like MASK_LEN_GATHER_LOAD where I always wonder which > argument is what) as well as IL type checking (which can also serve as > documentation about argument constraints). > > IMO comments in internal-fn.def would suffice for the former (like effectively > tree.h/def provide authority for tree codes); for IL verification a function > in internal-fn.cc would be suitable, we can call that from call verification. > > For the saturating matching this means strictly matching types > (I think that's what was proposed) is probably the least complicated > variant to reason OK. > > Thanks, > Richard. > > > > apart from IL hygiene which is violated with different typed operands > > > of an ADD, SUB or MUL. > > > > Jakub > >