builtins for FPU rounding mode changes

Sylvain Pion Fri, 07 Apr 2006 06:52:14 -0700

Dear Richard and GCC list,

I am working on a proposal to standardize interval arithmetic in C++.
(http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1843.pdf)


In order to implement it efficiently, one typically needs to use the
FPU rounding modes (e.g. using fegetround and fesetround from C99).

Typically, this boils down to something like:

struct interval {
  double lower;
  double upper;
};

and functions like:

interval operator+(interval a, interval b)
{
  int backup = fegetround(); // save current rounding mode
  fesetround(FE_UPWARD);     // set rounding mode to +infinity

  double upper = a.upper + b.upper;
  double lower = - (-a.lower - b.lower); // trick to avoid round to -inf

  fesetround(backup);        // restore default rounding mode
  return interval(lower, upper);
}


Of course, there is some C++ syntactic sugar on top of this (templates,
plus you isolate the rounding mode restoration as the destructor of a
local object...), but this is not my point here as I assume proper
inlining.


My question now is related to the capacity of the compilers in general,
and GCC in particular, to optimize away some of useless rounding mode
function calls.

Consider for example a sequence of operations like:

  interval a, b, c;
  ...
  interval d = a+b+c;


You can also imagine loops, which are quite common to sum arrays of
intervals.

Once operator+() calls are inlined, the code will look like:

  fegetround  ; // save current rounding mode
  fesetround  ; // set rounding mode to +infinity

  double tmp_upper = a.upper + b.upper;
  double tmp_lower = -(-a.lower - b.lower);

  fesetround  ; // restore default rounding mode

  fegetround  ; // save current rounding mode
  fesetround  ; // set rounding mode to +infinity

  double upper = tmp_upper + c.upper;
  double lower = -(-tmp_lower - c.lower);

  fesetround  ; // restore default rounding mode
  return interval(lower, upper);


This can safely be optimized to (squeezing the 3 middle lines):

  fegetround  ; // save current rounding mode
  fesetround  ; // set rounding mode to +infinity

  double tmp_upper = a.upper + b.upper;
  double tmp_lower = -(-a.lower - b.lower);

  double upper = tmp_upper + c.upper;
  double lower = -(-tmp_lower - c.lower);

  fesetround  ; // restore default rounding mode
  return interval(lower, upper);



As far as the std::interval C++ proposal is concerned, there are 2
choices to obtain this optimization (which is fundamental):
1- hide the rounding mode function calls from the API, and let the
   compiler perform the optimizations
2- make it appear in the interface, in terms of a policy-based design
   or something similar, which forces to have a separate type and in
   general complexifies the standard proposal.


I would very much prefer to go for the option #1.  My question is
to know if you think this is a reasonnable assumption to make in
general, for GCC, and in practice concretely how could this be
implemented in GCC ?

For GCC, would this just boild down to defining builtins mimicing
fesetround and fegetround, have them expose their properties to the
optimizers, and let the generic optimizers remove the calls ?

The properties needed on the builtins to perform the optimizations
I think are :
- two calls to fesetround which are not separated by floating point
  computations can be simplified by removing the first call.
- if a call to fesetround is followed by a call fegetround with no
  intermediate call to fesetround, then the call to fegetround
  can be replaced by taking the argument passed to fesetround.
- if two calls to fesetround with the same argument are not
  separated by some other call to fesetround (or other operation
  manipulating the FPU rounding mode register), the second call
  can be removed.

--
Sylvain

builtins for FPU rounding mode changes

Reply via email to