Re: [PATCH V2] Practical Improvement to Double Precision Complex Divide

Patrick McGehearty via Gcc-patches Tue, 09 Jun 2020 16:03:03 -0700

I see your point about other floating point formats.
According to the C standard, DOUBLE precision must
have a DBL_MAX of at least 1E+37. That is (slightly)
greater than 0x1.0p+63.


Would
#define RMIN2 (0x1.0p-53)
#define RMINSCAL (0x1.0p+51)
be acceptable?
Those will be in range for any architecture that supports the C standard.

I estimate using those values will cause a typical complex divide to run
somewhat slower for the full exponent dataset because the test for
whether fabs(d) < RMIN2 [or fabs(c) < RMIN2 ] will be taken more often,
adding four scaling multiplies to the overall operation.

On the positive side, using the new scaling factor significantly reduces
the frequency of 2 ulp or larger inaccuracies for the full range dataset.
The alternative would be to remove the RMIN2 test altogether.
That provides a modest reduction in overhead but increases
the inaccuracy for the 8 ulp case to 0.04147%.

Errors   Full Dataset
gtr eq     current   RMIN2=-512 RMIN2=-53   no RMIN2
======    ========   ========   ========   ========
 1 ulp      1.87%    0.19755%   0.16788%   0.22169%
 2 ulp      1.70%    0.03857%   0.01007%   0.06182%
 8 ulp      1.61%    0.02394%   0.00199%   0.04147%
16 ulp      1.51%    0.01548%   0.00087%   0.02727%
24 ulp      1.41%    0.00945%   0.00042%   0.01694%
52 ulp      1.13%    0.00001%   0.00001%   0.00001%
===================================================

The RMIN2=-53% option greatly reduces 8 ulp or larger errors.
The "No RMIN2" option has about double the error rate
of the initial proposal of RMIN2=-512, but still has a error
rate at the 8 ulp level 40 times lower than the current code.

I welcome any thoughts from the community about the relative importance
of accuracy vs performance tradeoffs at these levels.

- Patrick McGehearty


On 6/4/2020 6:45 PM, Joseph Myers wrote:

On Fri, 5 Jun 2020, Patrick McGehearty wrote:

diff --git a/libgcc/libgcc2.c b/libgcc/libgcc2.c
index e0a9fd7..2a1d3dc 100644
--- a/libgcc/libgcc2.c
+++ b/libgcc/libgcc2.c
@@ -2036,26 +2036,77 @@ CONCAT3(__mul,MODE,3) (MTYPE a, MTYPE b, MTYPE c, MTYPE 
d)
  CTYPE
  CONCAT3(__div,MODE,3) (MTYPE a, MTYPE b, MTYPE c, MTYPE d)
  {
+#define RBIG   ((DBL_MAX)/2.0)
+#define RMIN   (DBL_MIN)
+#define RMIN2  (0x1.0p-512)
+#define RMINSCAL (0x1.0p+510)

This code is used for many different machine modes and floating-point
formats (the type and format corresponding to a particular machine mode
may depend on the target for which GCC is configured).  You can't hardcode
particular values specific to DFmode (and to DFmode being IEEE binary64)
here.

Re: [PATCH V2] Practical Improvement to Double Precision Complex Divide

Reply via email to