On Thu, 22 Jul 2021 at 16:19, Dean Rasheed <dean.a.rash...@gmail.com> wrote: > > On Thu, 22 Jul 2021 at 06:13, Yugo NAGATA <nag...@sraoss.co.jp> wrote: > > > > Thank you for updating the patch. I am fine with the additional comments. > > I don't think there is any other problem left, so I marked it > > Ready-for-Committers. > > Thanks for looking. Barring any further comments, I'll push this in a few > days. >
So I have been testing this a lot over the last couple of days, and I have concluded that the patch works well as far as it goes, but I did manage to construct another case where numeric_power() loses precision. I think, though, that it would be better to tackle that as a separate patch. In writing the commit message for this patch, I realised that it was possible to tidy up the local_rscale calculation part of it a little, to make it more obvious what was going wrong, so attached is a slightly tweaked version. I'll hold off on committing it for a few more days in case anyone else wants to have a look. Tom? The other issue I found is related to the first part of power_var(), where it does a low-precision calculation to get an estimate for the weight of the result. It occurred to me that for certain input bases, that calculation could be made to be quite inaccurate, and therefore lead to choosing the wrong rscale later. This is the test case I constructed: (1-1.5123456789012345678e-1000) ^ 1.15e1003 Here, the base is a sliver under 1, and so ln(base) is approximately -1.5e-1000, and ln_dweight is -1000 (the decimal weight of ln(base)). The problem is that the local_rscale used for the first low-precision calculation is limited to NUMERIC_MAX_DISPLAY_SCALE (which is 1000), so we only compute ln_base to a scale of 1000 at that stage, and the result is rounded to exactly 2e-1000, which is off by a factor of around 1.333333. That makes it think the result weight will be -998, when actually it's -755, so it then chooses a local_rscale for the full calculation that's far too small, and the result is very inaccurate. To fix this, I think it's necessary to remove the line that limits the initial local_rscale. I tried that in a debugger, and managed to get a result that agreed with the result from "bc -l" with a scale of 2000. The reason I think it will be OK to remove that line is that it only ever comes into play when ln_dweight is a large negative number (and the smallest it can be is -16383). But that can only happen in instances like this, where the base is very very close to 1. In such cases, the ln(base) calculation is very fast, because it basically only has to do a couple of Taylor series terms, and it's done. This will still only be a low-precision estimate of the result (about 8 significant digits, shifted down a long way). It might also be necessary to re-think the choice of local_rscale for the mul_var() that follows. If the weight of exp is much larger than the weight of ln_base (or vice versa), it doesn't really make sense to compute the product to the same local_rscale. That might be a source of other inaccuracies. I'll try to investigate some more. Anyway, I don't think any of that should get in the way of committing the current patch. Regards, Dean
From 071d877af3ffcb8e2abd445cb5963d90a2e766bc Mon Sep 17 00:00:00 2001 From: Dean Rasheed <dean.a.rash...@gmail.com> Date: Thu, 29 Jul 2021 17:26:08 +0100 Subject: [PATCH] Fix corner-case errors and loss of precision in numeric_power(). This fixes a couple of related problems that arise when raising numbers to very large powers. Firstly, when raising a negative number to a very large integer power, the result should be well-defined, but the previous code would only cope if the exponent was small enough to go through power_var_int(). Otherwise it would throw an internal error, attempting to take the logarithm of a negative number. Fix this by adding suitable handling to the general case in power_var() to cope with negative bases, checking for integer powers there. Next, when raising a (positive or negative) number whose absolute value is slightly less than 1 to a very large power, the result should approach zero as the power is increased. However, in some cases, for sufficiently large powers, this would lose all precision and return 1 instead. This was due to the way that the local_rscale was being calculated for the final full-precision calculation: local_rscale = rscale + (int) val - ln_dweight + 8 The first two terms on the right hand side are meant to give the number of significant digits required in the result ("val" being the estimated result weight). However, this failed to account for the fact that rscale is clipped to a maximum of NUMERIC_MAX_DISPLAY_SCALE (1000), and the result weight might be less then -1000, causing their sum to be negative, leading to a loss of precision. Fix this by forcing the number of significant digits to be nonnegative. It's OK for it to be zero (when the result weight is less than -1000), since the local_rscale value then includes a few extra digits to ensure an accurate result. Finally, add additional underflow checks to exp_var() and power_var(), so that they consistently return zero for cases like this where the result is indistinguishable from zero. Some paths through this code already returned zero in such cases, but others were throwing overflow errors. Dean Rasheed, reviewed by Yugo Nagata Discussion: http://postgr.es/m/CAEZATCW6Dvq7+3wN3tt5jLj-FyOcUgT5xNoOqce5=6su0bc...@mail.gmail.com --- src/backend/utils/adt/numeric.c | 80 ++++++++++++++++++++++----- src/test/regress/expected/numeric.out | 55 ++++++++++++++++++ src/test/regress/sql/numeric.sql | 11 ++++ 3 files changed, 131 insertions(+), 15 deletions(-) diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c index faff09f5d5..1b077b8469 100644 --- a/src/backend/utils/adt/numeric.c +++ b/src/backend/utils/adt/numeric.c @@ -3993,7 +3993,9 @@ numeric_power(PG_FUNCTION_ARGS) /* * The SQL spec requires that we emit a particular SQLSTATE error code for * certain error conditions. Specifically, we don't return a - * divide-by-zero error code for 0 ^ -1. + * divide-by-zero error code for 0 ^ -1. Raising a negative number to a + * non-integer power must produce the same error code, but that case is + * handled in power_var(). */ sign1 = numeric_sign_internal(num1); sign2 = numeric_sign_internal(num2); @@ -4003,11 +4005,6 @@ numeric_power(PG_FUNCTION_ARGS) (errcode(ERRCODE_INVALID_ARGUMENT_FOR_POWER_FUNCTION), errmsg("zero raised to a negative power is undefined"))); - if (sign1 < 0 && !numeric_is_integral(num2)) - ereport(ERROR, - (errcode(ERRCODE_INVALID_ARGUMENT_FOR_POWER_FUNCTION), - errmsg("a negative number raised to a non-integer power yields a complex result"))); - /* * Initialize things */ @@ -9822,12 +9819,18 @@ exp_var(const NumericVar *arg, NumericVar *result, int rscale) */ val = numericvar_to_double_no_overflow(&x); - /* Guard against overflow */ + /* Guard against overflow/underflow */ /* If you change this limit, see also power_var()'s limit */ if (Abs(val) >= NUMERIC_MAX_RESULT_SCALE * 3) - ereport(ERROR, - (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), - errmsg("value overflows numeric format"))); + { + if (val > 0) + ereport(ERROR, + (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), + errmsg("value overflows numeric format"))); + zero_var(result); + result->dscale = rscale; + return; + } /* decimal weight = log10(e^x) = x * log10(e) */ dweight = (int) (val * 0.434294481903252); @@ -10185,10 +10188,13 @@ log_var(const NumericVar *base, const NumericVar *num, NumericVar *result) static void power_var(const NumericVar *base, const NumericVar *exp, NumericVar *result) { + int res_sign; + NumericVar abs_base; NumericVar ln_base; NumericVar ln_num; int ln_dweight; int rscale; + int sig_digits; int local_rscale; double val; @@ -10228,9 +10234,40 @@ power_var(const NumericVar *base, const NumericVar *exp, NumericVar *result) return; } + init_var(&abs_base); init_var(&ln_base); init_var(&ln_num); + /* + * If base is negative, insist that exp be an integer. The result is then + * positive if exp is even and negative if exp is odd. + */ + if (base->sign == NUMERIC_NEG) + { + /* + * Check that exp is an integer. This error code is defined by the + * SQL standard, and matches other errors in numeric_power(). + */ + if (exp->ndigits > 0 && exp->ndigits > exp->weight + 1) + ereport(ERROR, + (errcode(ERRCODE_INVALID_ARGUMENT_FOR_POWER_FUNCTION), + errmsg("a negative number raised to a non-integer power yields a complex result"))); + + /* Test if exp is odd or even */ + if (exp->ndigits > 0 && exp->ndigits == exp->weight + 1 && + (exp->digits[exp->ndigits - 1] & 1)) + res_sign = NUMERIC_NEG; + else + res_sign = NUMERIC_POS; + + /* Then work with abs(base) below */ + set_var_from_var(base, &abs_base); + abs_base.sign = NUMERIC_POS; + base = &abs_base; + } + else + res_sign = NUMERIC_POS; + /*---------- * Decide on the scale for the ln() calculation. For this we need an * estimate of the weight of the result, which we obtain by doing an @@ -10261,11 +10298,17 @@ power_var(const NumericVar *base, const NumericVar *exp, NumericVar *result) val = numericvar_to_double_no_overflow(&ln_num); - /* initial overflow test with fuzz factor */ + /* initial overflow/underflow test with fuzz factor */ if (Abs(val) > NUMERIC_MAX_RESULT_SCALE * 3.01) - ereport(ERROR, - (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), - errmsg("value overflows numeric format"))); + { + if (val > 0) + ereport(ERROR, + (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), + errmsg("value overflows numeric format"))); + zero_var(result); + result->dscale = NUMERIC_MAX_DISPLAY_SCALE; + return; + } val *= 0.434294481903252; /* approximate decimal result weight */ @@ -10276,8 +10319,11 @@ power_var(const NumericVar *base, const NumericVar *exp, NumericVar *result) rscale = Max(rscale, NUMERIC_MIN_DISPLAY_SCALE); rscale = Min(rscale, NUMERIC_MAX_DISPLAY_SCALE); + /* significant digits required in the result */ + sig_digits = Max(rscale + (int) val, 0); + /* set the scale for the real exp * ln(base) calculation */ - local_rscale = rscale + (int) val - ln_dweight + 8; + local_rscale = sig_digits - ln_dweight + 8; local_rscale = Max(local_rscale, NUMERIC_MIN_DISPLAY_SCALE); /* and do the real calculation */ @@ -10288,8 +10334,12 @@ power_var(const NumericVar *base, const NumericVar *exp, NumericVar *result) exp_var(&ln_num, result, rscale); + if (res_sign == NUMERIC_NEG && result->ndigits > 0) + result->sign = NUMERIC_NEG; + free_var(&ln_num); free_var(&ln_base); + free_var(&abs_base); } /* diff --git a/src/test/regress/expected/numeric.out b/src/test/regress/expected/numeric.out index cc11995398..ddcf5dbb11 100644 --- a/src/test/regress/expected/numeric.out +++ b/src/test/regress/expected/numeric.out @@ -2396,6 +2396,12 @@ select 1.000000000123 ^ (-2147483648); 0.7678656556403084 (1 row) +select 0.9999999999 ^ 23300000000000 = 0 as rounds_to_zero; + rounds_to_zero +---------------- + t +(1 row) + -- cases that used to error out select 0.12 ^ (-25); ?column? @@ -2409,6 +2415,43 @@ select 0.5678 ^ (-85); 782333637740774446257.7719390061997396 (1 row) +select 0.9999999999 ^ 70000000000000 = 0 as underflows; + underflows +------------ + t +(1 row) + +-- negative base to integer powers +select (-1.0) ^ 2147483646; + ?column? +-------------------- + 1.0000000000000000 +(1 row) + +select (-1.0) ^ 2147483647; + ?column? +--------------------- + -1.0000000000000000 +(1 row) + +select (-1.0) ^ 2147483648; + ?column? +-------------------- + 1.0000000000000000 +(1 row) + +select (-1.0) ^ 1000000000000000; + ?column? +-------------------- + 1.0000000000000000 +(1 row) + +select (-1.0) ^ 1000000000000001; + ?column? +--------------------- + -1.0000000000000000 +(1 row) + -- -- Tests for raising to non-integer powers -- @@ -2545,6 +2588,18 @@ select exp('-inf'::numeric); 0 (1 row) +select exp(-5000::numeric) = 0 as rounds_to_zero; + rounds_to_zero +---------------- + t +(1 row) + +select exp(-10000::numeric) = 0 as underflows; + underflows +------------ + t +(1 row) + -- cases that used to generate inaccurate results select exp(32.999); exp diff --git a/src/test/regress/sql/numeric.sql b/src/test/regress/sql/numeric.sql index 14b4acfe12..4e91abae20 100644 --- a/src/test/regress/sql/numeric.sql +++ b/src/test/regress/sql/numeric.sql @@ -1126,10 +1126,19 @@ select 3.789 ^ 35; select 1.2 ^ 345; select 0.12 ^ (-20); select 1.000000000123 ^ (-2147483648); +select 0.9999999999 ^ 23300000000000 = 0 as rounds_to_zero; -- cases that used to error out select 0.12 ^ (-25); select 0.5678 ^ (-85); +select 0.9999999999 ^ 70000000000000 = 0 as underflows; + +-- negative base to integer powers +select (-1.0) ^ 2147483646; +select (-1.0) ^ 2147483647; +select (-1.0) ^ 2147483648; +select (-1.0) ^ 1000000000000000; +select (-1.0) ^ 1000000000000001; -- -- Tests for raising to non-integer powers @@ -1172,6 +1181,8 @@ select exp(1.0::numeric(71,70)); select exp('nan'::numeric); select exp('inf'::numeric); select exp('-inf'::numeric); +select exp(-5000::numeric) = 0 as rounds_to_zero; +select exp(-10000::numeric) = 0 as underflows; -- cases that used to generate inaccurate results select exp(32.999); -- 2.26.2