On Tue, 29 Aug 2017, Tamar Christina wrote: > > > > -----Original Message----- > > From: Richard Biener [mailto:rguent...@suse.de] > > Sent: 24 August 2017 10:16 > > To: Tamar Christina > > Cc: Joseph Myers; Christophe Lyon; Markus Trippelsdorf; Jeff Law; GCC > > Patches; Wilco Dijkstra; Michael Meissner; nd > > Subject: RE: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like > > numbers in GIMPLE. > > > > On Wed, 23 Aug 2017, Tamar Christina wrote: > > > > > Hi Richard, > > > > > > Thanks for the feedback, > > > > > > > > > > > I think you should split up your work a bit. The pieces from > > > > fold_builtin_* you remove and move to lowering that require no > > > > control flow like __builtin_isnan (x) -> x UNORD x should move to > > > > match.pd patterns (aka foldings that will then be applied both on > > GENERIC and GIMPLE). > > > > > > > > > > But emitting them as an UNORD wouldn't solve the performance or > > > correctness issues The patch is trying to address. For correctness, > > > for instance an UNORD would still signal When -fsignaling-nans is used > > (PR/66462). > > > > Well, then address the correctness issues separately. I think lumping > > everything into a single patch just makes your live harder ;) > > But the patch was made to fix the correctness issues. Which unless I'm > mistaken can only be solved using integer operations. To give a short > history on the patch: > > My original patch had this in builtins.c where the current code is and > had none of this problem. The intention of this patch was just address > the correctness issues with isnan etc. > > Since we had to do this using integer operations we figured we'd also > replace the operations with faster ones so the code was tweaked in order > to e.g. make the common paths of fpclassify exit earlier. (I had > provided benchmark results to show that the integer operations are > faster than the floating point ones, also on x86). > > It was then suggested by upstream review that I do this as a gimple lowering > phase. I pushed back against this but ultimately lost and submitted a new > version > that moved from builtins.c to gimple-low.c, which was why late expansion for > C++ > broke, because the expansion is now done very early. > > The rational for this move was that it would give the rest of the pipeline a > chance > to optimize the code further. > > After a few revisions this was ultimately accepted (the v3 version), but was > reverted > because it broke IBM's long double format due to the special case code for it > violating SSA. > > That was trivially fixed (once I finally had access to a PowerPC hardware > months later) and I changed the > patch to leave the builtin as a function call if at lowering time it wasn't > known to be a builtin. > This then "fixes" the C++ testcase, in that the test doesn't fail anymore, > but it doesn't generate > the same code as before. > Though the test is quite contrived and I don't know if actual user code often > has this. > > While checking to see if this behavior is OK, I was suggested to do it > as a gimple-fold instead. I did, but then turns out it won't work for > non-constants as I can't generate new control flow from a fold (which > makes sense in retrospect but I didn't know this before). > > This is why the patch is more complex than what it was before. The > original version only emitted simple tree.
The correctness issue is only about -fsignaling-nans, correct? So a simple fix is to guard the existing builtins.c code with !flag_singalling_nans (and emit a library call for -fsignalling-nans). > > > > I was saying that if I write isunordered (x, y) and use -fno-signaling-nans > > (the > > default) then I would expect the same code to be generated as if I say isnan > > (x) || isnan (y). And I'm not sure if isnan (x) || isnan (y) is more > > efficient if > > both are implemented with integer ops compared to using an unordered FP > > compare. > > > > But isnan (x) || isnan (y) can be rewritten using a match.pd rule to > isunordered (x, y) > If that should really generate the same code. So I don’t think it's an issue > for this patch. If isnan is lowered to integer ops than we'll have a hard time recognizing it. > > A canonical high-level representation on GIMPLE is equally important as > > maybe getting some optimization on a lowered "optimized" form. > > > > And a good canonical representation is short (x UNORD x or x UNORD y > > counts as short here). > > > > > > As fallback the expanders should simply emit a call (as done for > > > > isinf when not folded for example). > > > > > > Yes the patch currently already does this. > > > > Ah, I probably looked at an old version then. > > > > > My question had more to do if > > > the late expansion when you alias one of these functions in C++ was > > > really an issue, since It used to (sometimes, only when you got the > > > type of the argument correct) expand before whereas now it'll always > > > just emit a function call in this edge case. > > > > The answer here is really that we should perform lowering at a point where > > we do not omit possible optimizations. For the particular case this would > > mean lowering after IPA optimizations (also think of accelerator offloading > > which uses the early optimization phase of the host). This also allows > > early > > optimization to more accurately estimate code size (when optimizing for > > size). And I still think whether to expose fp classification as FP or > > integer ops > > should be a target decision and done late. > > > > This still means canonicalizations like isnan -> UNORD should happen early > > and during folding. > > > > This feels to me like it's a different patch, whatever transformations you do > to get isnan or any of the other builtins don't seem relevant. Only that when > you > do generate isnan, you get better and more correct code than before. > > If the integer operations are of a concern though, I could add a target hook > to make this opt-in. The backup code is still there as it gets used when the > floating point > type is not IEEE like. I am concerned about using integer ops without having a way for the targets to intervene. This choice would be naturally made at RTL expansion time (like in your original patch I guess). There are currently no optabs for isnan() and friends given we do lower some of them early (to FP ops, and with bogus sNaN behavior). > > > > > > > > > > > That said, split out the uncontroversical part of moving existing > > > > foldings from builtins.c to match.pd where they generate no control- > > flow. > > > > > > If that's the direction I have take I will, I am just not convinced the patch > will be > simpler nor smaller... It makes the intent more visible, a wrong-code fix (adding flag_* checks) compared to a added optimization. Adding the flags checks is sth we'd want to backport for example. Richard. > Thanks, > Tamar > > > > > Thanks, > > > > Richard. > > > > > > > > > Originally the patch was in expand, which would have covered the > > > > > late expansion case similar to what it's doing now in trunk. I was > > > > > however asked to move this to a GIMPLE lowering pass as to give > > > > > other > > > > optimizations a chance to do further optimizations on the generated > > code. > > > > > > > > > > This of course works fine for C since these math functions are a > > > > > Macro in C > > > > but are functions in C++. > > > > > > > > > > C++ would then allow you to do stuff like take the address of the > > > > > C++ function so > > > > > > > > > > void foo(float a) { > > > > > int (*xx)(...); > > > > > xx = isnan; > > > > > if (xx(a)) > > > > > g++; > > > > > } > > > > > > > > > > is perfectly legal. My current patch leaves "isnan" in as a call > > > > > as by the time we're doing GIMPLE lowering the alias has not been > > > > > resolved yet, whereas the version currently committed is able to > > > > > expand it as it's late > > > > in expand. > > > > > > > > > > Curiously the behaviour is somewhat confusing. > > > > > if xx is called with a non-constant it is expanded as you would > > > > > expect > > > > > > > > > > .LFB0: > > > > > .cfi_startproc > > > > > fcvt d0, s0 > > > > > fcmp d0, d0 > > > > > bvs .L4 > > > > > > > > > > but when xx is called with a constant, say 0 it's not > > > > > > > > > > .LFB0: > > > > > .cfi_startproc > > > > > mov w0, 0 > > > > > bl isnan > > > > > cbz w0, .L1 > > > > > > > > > > because it infers the type of the 0 to be an integer and then > > > > > doesn't > > > > recognize the call. > > > > > using 0.0 works, but the behaviour is really counter intuitive. > > > > > > > > > > The question is what should I do now, clearly it would be useful > > > > > to handle the late expansion as well, however I don't know what > > > > > the best > > > > approach would be. > > > > > > > > > > I can either: > > > > > > > > > > 1) Add two new implementations, one for constant folding and one > > > > > for > > > > expansion, but then one in expand would > > > > > be quite similar to the one in the early lowering phase. The > > > > > constant > > > > folding one could be very simple since > > > > > it's a constant I can just call the buildins and evaluate the > > > > > value > > > > completely. > > > > > > > > > > 2) Use the patch as is but make another one to allow the renaming > > > > > to be applied quite early on. e.g while still in Tree or GIMPLE > > > > > resolve > > > > > > > > > > int (*xx)(...); > > > > > xx = isnan; > > > > > if (xx(a)) > > > > > > > > > > to > > > > > > > > > > int (*xx)(...); > > > > > xx = isnan; > > > > > if (isnan(a)) > > > > > > > > > > This seems like it would be the best approach and the more > > > > > useful one in > > > > general. > > > > > > > > > > Any thoughts? > > > > > > > > > > Thanks, > > > > > Tamar > > > > > > > > > > > > > > > > > Richard. > > > > > > > > > > > > > > > > > > -- > > > > Richard Biener <rguent...@suse.de> > > > > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham > > > > Norton, HRB 21284 (AG Nuernberg) > > > > > > > > > > -- > > Richard Biener <rguent...@suse.de> > > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, > > HRB 21284 (AG Nuernberg) > > -- Richard Biener <rguent...@suse.de> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)