Re: Numeric x^y for negative x

Dean Rasheed Thu, 29 Jul 2021 11:14:39 -0700

On Thu, 22 Jul 2021 at 16:19, Dean Rasheed <dean.a.rash...@gmail.com> wrote:
>
> On Thu, 22 Jul 2021 at 06:13, Yugo NAGATA <nag...@sraoss.co.jp> wrote:
> >
> > Thank you for updating the patch. I am fine with the additional comments.
> > I don't think there is any other problem left, so I marked it 
> > Ready-for-Committers.
>
> Thanks for looking. Barring any further comments, I'll push this in a few 
> days.
>


So I have been testing this a lot over the last couple of days, and I
have concluded that the patch works well as far as it goes, but I did
manage to construct another case where numeric_power() loses
precision. I think, though, that it would be better to tackle that as
a separate patch.

In writing the commit message for this patch, I realised that it was
possible to tidy up the local_rscale calculation part of it a little,
to make it more obvious what was going wrong, so attached is a
slightly tweaked version. I'll hold off on committing it for a few
more days in case anyone else wants to have a look. Tom?

The other issue I found is related to the first part of power_var(),
where it does a low-precision calculation to get an estimate for the
weight of the result. It occurred to me that for certain input bases,
that calculation could be made to be quite inaccurate, and therefore
lead to choosing the wrong rscale later. This is the test case I
constructed:

  (1-1.5123456789012345678e-1000) ^ 1.15e1003

Here, the base is a sliver under 1, and so ln(base) is approximately
-1.5e-1000, and ln_dweight is -1000 (the decimal weight of ln(base)).
The problem is that the local_rscale used for the first low-precision
calculation is limited to NUMERIC_MAX_DISPLAY_SCALE (which is 1000),
so we only compute ln_base to a scale of 1000 at that stage, and the
result is rounded to exactly 2e-1000, which is off by a factor of
around 1.333333.

That makes it think the result weight will be -998, when actually it's
-755, so it then chooses a local_rscale for the full calculation
that's far too small, and the result is very inaccurate.

To fix this, I think it's necessary to remove the line that limits the
initial local_rscale. I tried that in a debugger, and managed to get a
result that agreed with the result from "bc -l" with a scale of 2000.

The reason I think it will be OK to remove that line is that it only
ever comes into play when ln_dweight is a large negative number (and
the smallest it can be is -16383). But that can only happen in
instances like this, where the base is very very close to 1. In such
cases, the ln(base) calculation is very fast, because it basically
only has to do a couple of Taylor series terms, and it's done. This
will still only be a low-precision estimate of the result (about 8
significant digits, shifted down a long way).

It might also be necessary to re-think the choice of local_rscale for
the mul_var() that follows. If the weight of exp is much larger than
the weight of ln_base (or vice versa), it doesn't really make sense to
compute the product to the same local_rscale. That might be a source
of other inaccuracies. I'll try to investigate some more.

Anyway, I don't think any of that should get in the way of committing
the current patch.

Regards,
Dean

From 071d877af3ffcb8e2abd445cb5963d90a2e766bc Mon Sep 17 00:00:00 2001
From: Dean Rasheed <dean.a.rash...@gmail.com>
Date: Thu, 29 Jul 2021 17:26:08 +0100
Subject: [PATCH] Fix corner-case errors and loss of precision in
 numeric_power().

This fixes a couple of related problems that arise when raising
numbers to very large powers.

Firstly, when raising a negative number to a very large integer power,
the result should be well-defined, but the previous code would only
cope if the exponent was small enough to go through power_var_int().
Otherwise it would throw an internal error, attempting to take the
logarithm of a negative number. Fix this by adding suitable handling
to the general case in power_var() to cope with negative bases,
checking for integer powers there.

Next, when raising a (positive or negative) number whose absolute
value is slightly less than 1 to a very large power, the result should
approach zero as the power is increased. However, in some cases, for
sufficiently large powers, this would lose all precision and return 1
instead. This was due to the way that the local_rscale was being
calculated for the final full-precision calculation:

  local_rscale = rscale + (int) val - ln_dweight + 8

The first two terms on the right hand side are meant to give the
number of significant digits required in the result ("val" being the
estimated result weight). However, this failed to account for the fact
that rscale is clipped to a maximum of NUMERIC_MAX_DISPLAY_SCALE
(1000), and the result weight might be less then -1000, causing their
sum to be negative, leading to a loss of precision. Fix this by
forcing the number of significant digits to be nonnegative. It's OK
for it to be zero (when the result weight is less than -1000), since
the local_rscale value then includes a few extra digits to ensure an
accurate result.

Finally, add additional underflow checks to exp_var() and power_var(),
so that they consistently return zero for cases like this where the
result is indistinguishable from zero. Some paths through this code
already returned zero in such cases, but others were throwing overflow
errors.

Dean Rasheed, reviewed by Yugo Nagata

Discussion: http://postgr.es/m/CAEZATCW6Dvq7+3wN3tt5jLj-FyOcUgT5xNoOqce5=6su0bc...@mail.gmail.com
---
 src/backend/utils/adt/numeric.c       | 80 ++++++++++++++++++++++-----
 src/test/regress/expected/numeric.out | 55 ++++++++++++++++++
 src/test/regress/sql/numeric.sql      | 11 ++++
 3 files changed, 131 insertions(+), 15 deletions(-)

diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index faff09f5d5..1b077b8469 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -3993,7 +3993,9 @@ numeric_power(PG_FUNCTION_ARGS)
 	/*
 	 * The SQL spec requires that we emit a particular SQLSTATE error code for
 	 * certain error conditions.  Specifically, we don't return a
-	 * divide-by-zero error code for 0 ^ -1.
+	 * divide-by-zero error code for 0 ^ -1.  Raising a negative number to a
+	 * non-integer power must produce the same error code, but that case is
+	 * handled in power_var().
 	 */
 	sign1 = numeric_sign_internal(num1);
 	sign2 = numeric_sign_internal(num2);
@@ -4003,11 +4005,6 @@ numeric_power(PG_FUNCTION_ARGS)
 				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_POWER_FUNCTION),
 				 errmsg("zero raised to a negative power is undefined")));
 
-	if (sign1 < 0 && !numeric_is_integral(num2))
-		ereport(ERROR,
-				(errcode(ERRCODE_INVALID_ARGUMENT_FOR_POWER_FUNCTION),
-				 errmsg("a negative number raised to a non-integer power yields a complex result")));
-
 	/*
 	 * Initialize things
 	 */
@@ -9822,12 +9819,18 @@ exp_var(const NumericVar *arg, NumericVar *result, int rscale)
 	 */
 	val = numericvar_to_double_no_overflow(&x);
 
-	/* Guard against overflow */
+	/* Guard against overflow/underflow */
 	/* If you change this limit, see also power_var()'s limit */
 	if (Abs(val) >= NUMERIC_MAX_RESULT_SCALE * 3)
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value overflows numeric format")));
+	{
+		if (val > 0)
+			ereport(ERROR,
+					(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+					 errmsg("value overflows numeric format")));
+		zero_var(result);
+		result->dscale = rscale;
+		return;
+	}
 
 	/* decimal weight = log10(e^x) = x * log10(e) */
 	dweight = (int) (val * 0.434294481903252);
@@ -10185,10 +10188,13 @@ log_var(const NumericVar *base, const NumericVar *num, NumericVar *result)
 static void
 power_var(const NumericVar *base, const NumericVar *exp, NumericVar *result)
 {
+	int			res_sign;
+	NumericVar	abs_base;
 	NumericVar	ln_base;
 	NumericVar	ln_num;
 	int			ln_dweight;
 	int			rscale;
+	int			sig_digits;
 	int			local_rscale;
 	double		val;
 
@@ -10228,9 +10234,40 @@ power_var(const NumericVar *base, const NumericVar *exp, NumericVar *result)
 		return;
 	}
 
+	init_var(&abs_base);
 	init_var(&ln_base);
 	init_var(&ln_num);
 
+	/*
+	 * If base is negative, insist that exp be an integer.  The result is then
+	 * positive if exp is even and negative if exp is odd.
+	 */
+	if (base->sign == NUMERIC_NEG)
+	{
+		/*
+		 * Check that exp is an integer.  This error code is defined by the
+		 * SQL standard, and matches other errors in numeric_power().
+		 */
+		if (exp->ndigits > 0 && exp->ndigits > exp->weight + 1)
+			ereport(ERROR,
+					(errcode(ERRCODE_INVALID_ARGUMENT_FOR_POWER_FUNCTION),
+					 errmsg("a negative number raised to a non-integer power yields a complex result")));
+
+		/* Test if exp is odd or even */
+		if (exp->ndigits > 0 && exp->ndigits == exp->weight + 1 &&
+			(exp->digits[exp->ndigits - 1] & 1))
+			res_sign = NUMERIC_NEG;
+		else
+			res_sign = NUMERIC_POS;
+
+		/* Then work with abs(base) below */
+		set_var_from_var(base, &abs_base);
+		abs_base.sign = NUMERIC_POS;
+		base = &abs_base;
+	}
+	else
+		res_sign = NUMERIC_POS;
+
 	/*----------
 	 * Decide on the scale for the ln() calculation.  For this we need an
 	 * estimate of the weight of the result, which we obtain by doing an
@@ -10261,11 +10298,17 @@ power_var(const NumericVar *base, const NumericVar *exp, NumericVar *result)
 
 	val = numericvar_to_double_no_overflow(&ln_num);
 
-	/* initial overflow test with fuzz factor */
+	/* initial overflow/underflow test with fuzz factor */
 	if (Abs(val) > NUMERIC_MAX_RESULT_SCALE * 3.01)
-		ereport(ERROR,
-				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
-				 errmsg("value overflows numeric format")));
+	{
+		if (val > 0)
+			ereport(ERROR,
+					(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+					 errmsg("value overflows numeric format")));
+		zero_var(result);
+		result->dscale = NUMERIC_MAX_DISPLAY_SCALE;
+		return;
+	}
 
 	val *= 0.434294481903252;	/* approximate decimal result weight */
 
@@ -10276,8 +10319,11 @@ power_var(const NumericVar *base, const NumericVar *exp, NumericVar *result)
 	rscale = Max(rscale, NUMERIC_MIN_DISPLAY_SCALE);
 	rscale = Min(rscale, NUMERIC_MAX_DISPLAY_SCALE);
 
+	/* significant digits required in the result */
+	sig_digits = Max(rscale + (int) val, 0);
+
 	/* set the scale for the real exp * ln(base) calculation */
-	local_rscale = rscale + (int) val - ln_dweight + 8;
+	local_rscale = sig_digits - ln_dweight + 8;
 	local_rscale = Max(local_rscale, NUMERIC_MIN_DISPLAY_SCALE);
 
 	/* and do the real calculation */
@@ -10288,8 +10334,12 @@ power_var(const NumericVar *base, const NumericVar *exp, NumericVar *result)
 
 	exp_var(&ln_num, result, rscale);
 
+	if (res_sign == NUMERIC_NEG && result->ndigits > 0)
+		result->sign = NUMERIC_NEG;
+
 	free_var(&ln_num);
 	free_var(&ln_base);
+	free_var(&abs_base);
 }
 
 /*
diff --git a/src/test/regress/expected/numeric.out b/src/test/regress/expected/numeric.out
index cc11995398..ddcf5dbb11 100644
--- a/src/test/regress/expected/numeric.out
+++ b/src/test/regress/expected/numeric.out
@@ -2396,6 +2396,12 @@ select 1.000000000123 ^ (-2147483648);
  0.7678656556403084
 (1 row)
 
+select 0.9999999999 ^ 23300000000000 = 0 as rounds_to_zero;
+ rounds_to_zero 
+----------------
+ t
+(1 row)
+
 -- cases that used to error out
 select 0.12 ^ (-25);
                  ?column?                  
@@ -2409,6 +2415,43 @@ select 0.5678 ^ (-85);
  782333637740774446257.7719390061997396
 (1 row)
 
+select 0.9999999999 ^ 70000000000000 = 0 as underflows;
+ underflows 
+------------
+ t
+(1 row)
+
+-- negative base to integer powers
+select (-1.0) ^ 2147483646;
+      ?column?      
+--------------------
+ 1.0000000000000000
+(1 row)
+
+select (-1.0) ^ 2147483647;
+      ?column?       
+---------------------
+ -1.0000000000000000
+(1 row)
+
+select (-1.0) ^ 2147483648;
+      ?column?      
+--------------------
+ 1.0000000000000000
+(1 row)
+
+select (-1.0) ^ 1000000000000000;
+      ?column?      
+--------------------
+ 1.0000000000000000
+(1 row)
+
+select (-1.0) ^ 1000000000000001;
+      ?column?       
+---------------------
+ -1.0000000000000000
+(1 row)
+
 --
 -- Tests for raising to non-integer powers
 --
@@ -2545,6 +2588,18 @@ select exp('-inf'::numeric);
    0
 (1 row)
 
+select exp(-5000::numeric) = 0 as rounds_to_zero;
+ rounds_to_zero 
+----------------
+ t
+(1 row)
+
+select exp(-10000::numeric) = 0 as underflows;
+ underflows 
+------------
+ t
+(1 row)
+
 -- cases that used to generate inaccurate results
 select exp(32.999);
          exp         
diff --git a/src/test/regress/sql/numeric.sql b/src/test/regress/sql/numeric.sql
index 14b4acfe12..4e91abae20 100644
--- a/src/test/regress/sql/numeric.sql
+++ b/src/test/regress/sql/numeric.sql
@@ -1126,10 +1126,19 @@ select 3.789 ^ 35;
 select 1.2 ^ 345;
 select 0.12 ^ (-20);
 select 1.000000000123 ^ (-2147483648);
+select 0.9999999999 ^ 23300000000000 = 0 as rounds_to_zero;
 
 -- cases that used to error out
 select 0.12 ^ (-25);
 select 0.5678 ^ (-85);
+select 0.9999999999 ^ 70000000000000 = 0 as underflows;
+
+-- negative base to integer powers
+select (-1.0) ^ 2147483646;
+select (-1.0) ^ 2147483647;
+select (-1.0) ^ 2147483648;
+select (-1.0) ^ 1000000000000000;
+select (-1.0) ^ 1000000000000001;
 
 --
 -- Tests for raising to non-integer powers
@@ -1172,6 +1181,8 @@ select exp(1.0::numeric(71,70));
 select exp('nan'::numeric);
 select exp('inf'::numeric);
 select exp('-inf'::numeric);
+select exp(-5000::numeric) = 0 as rounds_to_zero;
+select exp(-10000::numeric) = 0 as underflows;
 
 -- cases that used to generate inaccurate results
 select exp(32.999);
-- 
2.26.2

Re: Numeric x^y for negative x

Reply via email to