RE: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers in GIMPLE.

Richard Biener Wed, 30 Aug 2017 06:26:46 -0700

On Tue, 29 Aug 2017, Tamar Christina wrote:

> 
> 
> > -----Original Message-----
> > From: Richard Biener [mailto:rguent...@suse.de]
> > Sent: 24 August 2017 10:16
> > To: Tamar Christina
> > Cc: Joseph Myers; Christophe Lyon; Markus Trippelsdorf; Jeff Law; GCC
> > Patches; Wilco Dijkstra; Michael Meissner; nd
> > Subject: RE: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like
> > numbers in GIMPLE.
> > 
> > On Wed, 23 Aug 2017, Tamar Christina wrote:
> > 
> > > Hi Richard,
> > >
> > > Thanks for the feedback,
> > >
> > > >
> > > > I think you should split up your work a bit.  The pieces from
> > > > fold_builtin_* you remove and move to lowering that require no
> > > > control flow like __builtin_isnan (x) -> x UNORD x should move to
> > > > match.pd patterns (aka foldings that will then be applied both on
> > GENERIC and GIMPLE).
> > > >
> > >
> > > But emitting them as an UNORD wouldn't solve the performance or
> > > correctness issues The patch is trying to address. For correctness,
> > > for instance an UNORD would still signal When -fsignaling-nans is used
> > (PR/66462).
> > 
> > Well, then address the correctness issues separately.  I think lumping
> > everything into a single patch just makes your live harder ;)
> 
> But the patch was made to fix the correctness issues. Which unless I'm 
> mistaken can only be solved using integer operations. To give a short 
> history on the patch:
> 
> My original patch had this in builtins.c where the current code is and 
> had none of this problem. The intention of this patch was just address 
> the correctness issues with isnan etc.
> 
> Since we had to do this using integer operations we figured we'd also 
> replace the operations with faster ones so the code was tweaked in order 
> to e.g. make the common paths of fpclassify exit earlier. (I had 
> provided benchmark results to show that the integer operations are 
> faster than the floating point ones, also on x86).
> 
> It was then suggested by upstream review that I do this as a gimple lowering
> phase. I pushed back against this but ultimately lost and submitted a new 
> version
> that moved from builtins.c to gimple-low.c, which was why late expansion for 
> C++
> broke, because the expansion is now done very early.
> 
> The rational for this move was that it would give the rest of the pipeline a 
> chance
> to optimize the code further.
> 
> After a few revisions this was ultimately accepted (the v3 version), but was 
> reverted
> because it broke IBM's long double format due to the special case code for it 
> violating SSA.
> 
> That was trivially fixed (once I finally had access to a PowerPC hardware 
> months later) and I changed the
> patch to leave the builtin as a function call if at lowering time it wasn't 
> known to be a builtin.
> This then "fixes" the C++ testcase, in that the test doesn't fail anymore, 
> but it doesn't generate
> the same code as before.
> Though the test is quite contrived and I don't know if actual user code often 
> has this.
> 
> While checking to see if this behavior is OK, I was suggested to do it 
> as a gimple-fold instead. I did, but then turns out it won't work for 
> non-constants as I can't generate new control flow from a fold (which 
> makes sense in retrospect but I didn't know this before).
> 
> This is why the patch is more complex than what it was before. The 
> original version only emitted simple tree.


The correctness issue is only about -fsignaling-nans, correct?  So a
simple fix is to guard the existing builtins.c code with
!flag_singalling_nans (and emit a library call for -fsignalling-nans).

> > 
> > I was saying that if I write isunordered (x, y) and use -fno-signaling-nans 
> > (the
> > default) then I would expect the same code to be generated as if I say isnan
> > (x) || isnan (y).  And I'm not sure if isnan (x) || isnan (y) is more 
> > efficient if
> > both are implemented with integer ops compared to using an unordered FP
> > compare.
> > 
> 
> But isnan (x) || isnan (y) can be rewritten using a match.pd rule to 
> isunordered (x, y)
> If that should really generate the same code. So I don’t think it's an issue 
> for this patch.

If isnan is lowered to integer ops than we'll have a hard time recognizing
it.

> > A canonical high-level representation on GIMPLE is equally important as
> > maybe getting some optimization on a lowered "optimized" form.
> > 
> > And a good canonical representation is short (x UNORD x or x UNORD y
> > counts as short here).
> > 
> > > > As fallback the expanders should simply emit a call (as done for
> > > > isinf when not folded for example).
> > >
> > > Yes the patch currently already does this.
> > 
> > Ah, I probably looked at an old version then.
> > 
> > > My question had more to do if
> > > the late expansion when you alias one of these functions in C++ was
> > > really an issue, since It used to (sometimes, only when you got the
> > > type of the argument correct) expand before whereas now it'll always
> > > just emit a function call in this edge case.
> > 
> > The answer here is really that we should perform lowering at a point where
> > we do not omit possible optimizations.  For the particular case this would
> > mean lowering after IPA optimizations (also think of accelerator offloading
> > which uses the early optimization phase of the host).  This also allows 
> > early
> > optimization to more accurately estimate code size (when optimizing for
> > size).  And I still think whether to expose fp classification as FP or 
> > integer ops
> > should be a target decision and done late.
> > 
> > This still means canonicalizations like isnan -> UNORD should happen early
> > and during folding.
> > 
> 
> This feels to me like it's a different patch, whatever transformations you do
> to get isnan or any of the other builtins don't seem relevant. Only that when 
> you
> do generate isnan, you get better and more correct code than before.
> 
> If the integer operations are of a concern though, I could add a target hook
> to make this opt-in. The backup code is still there as it gets used when the 
> floating point
> type is not IEEE like.

I am concerned about using integer ops without having a way for the
targets to intervene.  This choice would be naturally made at
RTL expansion time (like in your original patch I guess).  There
are currently no optabs for isnan() and friends given we do lower
some of them early (to FP ops, and with bogus sNaN behavior).

> > >
> > > >
> > > > That said, split out the uncontroversical part of moving existing
> > > > foldings from builtins.c to match.pd where they generate no control-
> > flow.
> > > >
> 
> If that's the direction I have take I will, I am just not convinced the patch 
> will be
> simpler nor smaller...

It makes the intent more visible, a wrong-code fix (adding flag_* checks)
compared to a added optimization.  Adding the flags checks is sth
we'd want to backport for example.

Richard.

> Thanks,
> Tamar
> 
> > > > Thanks,
> > > > Richard.
> > > >
> > > > > Originally the patch was in expand, which would have covered the
> > > > > late expansion case similar to what it's doing now in trunk. I was
> > > > > however asked to move this to a GIMPLE lowering pass as to give
> > > > > other
> > > > optimizations a chance to do further optimizations on the generated
> > code.
> > > > >
> > > > > This of course works fine for C since these math functions are a
> > > > > Macro in C
> > > > but are functions in C++.
> > > > >
> > > > > C++ would then allow you to do stuff like take the address of the
> > > > > C++ function so
> > > > >
> > > > > void foo(float a) {
> > > > >   int (*xx)(...);
> > > > >   xx = isnan;
> > > > >   if (xx(a))
> > > > >     g++;
> > > > > }
> > > > >
> > > > > is perfectly legal. My current patch leaves "isnan" in as a call
> > > > > as by the time we're doing GIMPLE lowering the alias has not been
> > > > > resolved yet, whereas the version currently committed is able to
> > > > > expand it as it's late
> > > > in expand.
> > > > >
> > > > > Curiously the behaviour is somewhat confusing.
> > > > > if xx is called with a non-constant it is expanded as you would
> > > > > expect
> > > > >
> > > > > .LFB0:
> > > > >         .cfi_startproc
> > > > >         fcvt    d0, s0
> > > > >         fcmp    d0, d0
> > > > >         bvs     .L4
> > > > >
> > > > > but when xx is called with a constant, say 0 it's not
> > > > >
> > > > > .LFB0:
> > > > >         .cfi_startproc
> > > > >         mov     w0, 0
> > > > >         bl      isnan
> > > > >         cbz     w0, .L1
> > > > >
> > > > > because it infers the type of the 0 to be an integer and then
> > > > > doesn't
> > > > recognize the call.
> > > > > using 0.0 works, but the behaviour is really counter intuitive.
> > > > >
> > > > > The question is what should I do now, clearly it would be useful
> > > > > to handle the late expansion as well, however I don't know what
> > > > > the best
> > > > approach would be.
> > > > >
> > > > > I can either:
> > > > >
> > > > > 1) Add two new implementations, one for constant folding and one
> > > > > for
> > > > expansion, but then one in expand would
> > > > >    be quite similar to the one in the early lowering phase. The
> > > > > constant
> > > > folding one could be very simple since
> > > > >    it's a constant I can just call the buildins and evaluate the
> > > > > value
> > > > completely.
> > > > >
> > > > > 2) Use the patch as is but make another one to allow the renaming
> > > > > to be applied quite early on. e.g while still in Tree or GIMPLE
> > > > > resolve
> > > > >
> > > > >         int (*xx)(...);
> > > > >         xx = isnan;
> > > > >         if (xx(a))
> > > > >
> > > > >         to
> > > > >
> > > > >         int (*xx)(...);
> > > > >         xx = isnan;
> > > > >         if (isnan(a))
> > > > >
> > > > >    This seems like it would be the best approach and the more
> > > > > useful one in
> > > > general.
> > > > >
> > > > > Any thoughts?
> > > > >
> > > > > Thanks,
> > > > > Tamar
> > > > >
> > > > > >
> > > > > > Richard.
> > > > >
> > > > >
> > > >
> > > > --
> > > > Richard Biener <rguent...@suse.de>
> > > > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham
> > > > Norton, HRB 21284 (AG Nuernberg)
> > >
> > >
> > 
> > --
> > Richard Biener <rguent...@suse.de>
> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> > HRB 21284 (AG Nuernberg)
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

RE: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers in GIMPLE.

Reply via email to