Hello. > > Similarly addtfsf3 that multiplies TFmode and produces an SFmode result, and > so on.
I want to extend this patch for FADDL and DADDL. What operand constraints should I use for TFmode alongside "f"? > In cases where long double and double have the same mode, >the daddl function should use the existing adddf3 pattern. So, should I use adddf3 for DADDL directly? How would I map the add<mode>3 optab with DADDL? Thanks, Tejas On Sat, 24 Aug 2019 at 15:23, Richard Sandiford <richard.sandif...@arm.com> wrote: > > Martin Jambor <mjam...@suse.cz> writes: > > Hello, > > > > On Thu, Aug 22 2019, Segher Boessenkool wrote: > >>> > Hi Tejas, > >>> > > >>> > [ Please do not top-post. ] > >> > >> On Thu, Aug 22, 2019 at 01:27:06PM +0530, Tejas Joshi wrote: > >>> > What happens then? "It does not work" is very very vague. At least it > >>> > seems the compiler does build now? > >>> > >>> Oh, compiler builds but instruction is still "bl fadd". It should be > >>> "fadds" right? > >> > >> Yes, but that means the problem is earlier, before it hits RTL perhaps. > >> > >> Compile with -dap, look at the expand dump (the lowest numbered one, 234 > >> or so), and see what it looked like in the final Gimple, and then in the > >> RTL generated from that. And then drill down. > >> > > > > Tejas sent me his patch and I looked at why it did not work. I found > > two reasons: > > > > 1. associated_internal_fn (in builtins.c) does not handle > > DEF_INTERNAL_OPTAB_FN kind of internal functions, and Tejas > > (sensibly, I'd say) used that macro to define the internal function. > > But when I worked around that by manually adding a case for it in the > > switch statement, I ran into an assert because... > > > > 2. direct_internal_fn_supported_p on which replacement_internal_fn > > depends to expand built-ins as internal functions cannot handle > > conversion optabs... and narrowing is a kind of conversion and the > > optab is added as such with OPTAB_CD. > > > > Actually, the second statement is not entirely true because somehow it > > can handle optab while_ult which is a conversion optab but a) the way it > > is handled, if I can understand it at all, seems to be a big hack and > > would be even worse if we decided to copy that for all narrowing math > > functions > > Think "big hack" is a bit unfair. The way that the internal function > maps argument types to the optab modes, and the way it expands calls > into rtl, depends on the "optab type" argument (the final argument to > DEF_INTERNAL_OPTAB_FN). This is relatively flexible in that it can use > a single-mode "direct" optab or a dual-mode "conversion" optab, with the > modes coming from whichever arguments are appropriate. New optab types > can be added as needed. > > FWIW, several other DEF_INTERNAL_OPTAB_FNs are conversion optabs too > (e.g. IFN_LOAD_LANES, IFN_STORE_LANES, IFN_MASK_LOAD, etc.). > > But... > > > and b) it gets both modes from argument types whereas we need one from > > the result type and so we would have to rewrite > > replacement_internal_fn anyway. > > ...yeah, I agree this breaks the current model. The reason IFN_WHILE_ULT > doesn't rely on the return type is that if you have: > > _2 = .WHILE_ULT (_0, _1) // returning a vector of 4 booleans > _3 = .WHILE_ULT (_0, _1) // returning a vector of 8 booleans > > then the calls look equivalent. So instead we pass an extra argument > indicating the required boolean vector "shape". > > The same "problem" could in principle apply to FADD if we ever needed > to support double+double->_Float16 for example. > > > Therefore, at least for now (GSoC deadline is kind of looming), I > > decided that the best way forward would be to not rely on internal > > functions but plug into expand_builtin() and I wrote the following, > > lightly tested patch - which of course misses testcases and stuff - but > > I'd be curious about any feedback now anyway. When I proposed a very > > similar approach for the roundeven x86_64 expansion, Uros actually then > > opted for a solution based on internal functions, so I am curious > > whether there are simple alternatives I do not see. > > > > Tejas, of course cases for other fadd variants should at least be added > > to expand_builtin. > > > > Thanks, > > > > Martin > > > > > > 2019-08-23 Tejas Joshi <tejasjoshi9...@gmail.com> > > Martin Jambor <mjam...@suse.cz> > > > > * builtins.c (expand_builtin_binary_conversion): New function. > > (expand_builtin): Call it. > > * config/rs6000/rs6000.md (unspec): Add UNSPEC_ADD_NARROWING. > > (add_truncdfsf3): New define_insn. > > * optabs.def (fadd_optab): New. > > > > [...] > > diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def > > index 9461693bcd1..3f56880c23f 100644 > > --- a/gcc/internal-fn.def > > +++ b/gcc/internal-fn.def > > @@ -140,6 +140,8 @@ DEF_INTERNAL_OPTAB_FN (WHILE_ULT, ECF_CONST | > > ECF_NOTHROW, while_ult, while) > > DEF_INTERNAL_OPTAB_FN (VEC_SHL_INSERT, ECF_CONST | ECF_NOTHROW, > > vec_shl_insert, binary) > > > > +DEF_INTERNAL_OPTAB_FN (FADD, ECF_CONST, fadd, binary) > > + > > DEF_INTERNAL_OPTAB_FN (FMS, ECF_CONST, fms, ternary) > > DEF_INTERNAL_OPTAB_FN (FNMA, ECF_CONST, fnma, ternary) > > DEF_INTERNAL_OPTAB_FN (FNMS, ECF_CONST, fnms, ternary) > > Should be dropped now. > > OK with that change and the ones Segher asked for. > > Thanks, > Richard