On Fri, 23 Aug 2024 at 13:40, Peter Eisentraut <pe...@eisentraut.org> wrote: > > What are examples of where this would be useful in a database context?
gamma() and lgamma() are the kinds of functions that are generally useful for a variety of tasks like statistical analysis and combinatorial computations, and having them in the database allows those sort of computations to be performed without the need to export the data to an external tool. We discussed adding them in a thread last year [1], and there has been at least a little prior interest [2]. [1] https://www.postgresql.org/message-id/CA%2BhUKGKJAcB8Q5qziKTTSnkA4Mnv_6f%2B7-_XUgbh9jFjSdEFQg%40mail.gmail.com [2] https://stackoverflow.com/questions/58884066/how-can-i-run-the-equivalent-of-excels-gammaln-function-in-postgres Of course, there's a somewhat fuzzy line between what is generally useful enough, and what is too specialised for core Postgres, but I would argue that these qualify, since they are part of C99, and commonly appear in other general-purpose math libraries like the Python math module. > > The error-handling for these functions seems to be a little trickier > > than most, so that might need further discussion. > > Yeah, this is quite something. > > I'm not sure why you are doing the testing for special values (NaN etc.) > yourself when the C library function already does it. For example, if I > remove the isnan(arg1) check in your dlgamma(), then it still behaves > the same in all tests. It's useful to do that so that we don't need to assume that every platform conforms to the POSIX standard, and because it simplifies the later overflow checks. This is consistent with the approach taken in other functions, such as dexp(), dsin(), dcos(), etc. > The overflow checks after the C library call are written > differently for the two functions. dgamma() does not check errno for > ERANGE for example. It might also be good if dgamma() checked errno for > EDOM, because other the result of gamma(-1) being "overflow" seems a bit > wrong. They're intentionally different because the functions themselves are different. In this case: select gamma(-1); ERROR: value out of range: overflow it is correct to throw an error, because gamma(-1) is undefined (it goes to -Inf as x goes to -1 from above, and +Inf as x goes to -1 from below, so there is no well-defined limit). I've updated the patch to give a more specific error message for negative integer inputs, as opposed to other overflow cases. Relying on errno being ERANGE or EDOM doesn't seem possible though, because the docs indicate that, while its behaviour is one thing today, per POSIX, that will change in the future. By contrast, lgamma() does not raise an error for such inputs: select lgamma(-1); lgamma ---------- Infinity This is correct because lgamma() is the log of the absolute value of the gamma function, so the limit is +Inf as x goes to -1 from both sides. > I'm also not clear why you are turning a genuine result overflow into > infinity in lgamma(). OK, I've changed that to only return infinity if the input is infinite, zero, or a negative integer. Otherwise, it now throws an overflow error. > Btw., I'm reading that lgamma() in the C library is not necessarily > thread-safe. Is that a possible problem? It's not quite clear what to do about that. Some platforms may define the lgamma_r() function, but not all. Do we have a standard pattern for dealing with functions for which there is no thread-safe alternative? Regards, Dean
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml new file mode 100644 index 461fc3f..b2a3368 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -1399,6 +1399,27 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FRO <row> <entry role="func_table_entry"><para role="func_signature"> <indexterm> + <primary>gamma</primary> + </indexterm> + <function>gamma</function> ( <type>double precision</type> ) + <returnvalue>double precision</returnvalue> + </para> + <para> + Gamma function + </para> + <para> + <literal>gamma(0.5)</literal> + <returnvalue>1.772453850905516</returnvalue> + </para> + <para> + <literal>gamma(6)</literal> + <returnvalue>120</returnvalue> + </para></entry> + </row> + + <row> + <entry role="func_table_entry"><para role="func_signature"> + <indexterm> <primary>gcd</primary> </indexterm> <function>gcd</function> ( <replaceable>numeric_type</replaceable>, <replaceable>numeric_type</replaceable> ) @@ -1436,6 +1457,23 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FRO </para></entry> </row> + <row> + <entry role="func_table_entry"><para role="func_signature"> + <indexterm> + <primary>lgamma</primary> + </indexterm> + <function>lgamma</function> ( <type>double precision</type> ) + <returnvalue>double precision</returnvalue> + </para> + <para> + Natural logarithm of the absolute value of the gamma function + </para> + <para> + <literal>lgamma(1000)</literal> + <returnvalue>5905.220423209181</returnvalue> + </para></entry> + </row> + <row> <entry role="func_table_entry"><para role="func_signature"> <indexterm> diff --git a/src/backend/utils/adt/float.c b/src/backend/utils/adt/float.c new file mode 100644 index f709c21..4694ea2 --- a/src/backend/utils/adt/float.c +++ b/src/backend/utils/adt/float.c @@ -2787,6 +2787,97 @@ derfc(PG_FUNCTION_ARGS) } +/* ========== GAMMA FUNCTIONS ========== */ + + +/* + * dgamma - returns the gamma function of arg1 + */ +Datum +dgamma(PG_FUNCTION_ARGS) +{ + float8 arg1 = PG_GETARG_FLOAT8(0); + float8 result; + + /* + * Per the POSIX spec, return NaN if the input is NaN, +/-infinity if the + * input is +/-0, NaN if the input is -infinity, and infinity if the input + * is infinity. + */ + if (isnan(arg1)) + PG_RETURN_FLOAT8(get_float8_nan()); + if (arg1 == 0) + { + if (signbit(arg1)) + PG_RETURN_FLOAT8(-get_float8_infinity()); + else + PG_RETURN_FLOAT8(get_float8_infinity()); + } + if (isinf(arg1)) + { + if (arg1 < 0) + PG_RETURN_FLOAT8(get_float8_nan()); + else + PG_RETURN_FLOAT8(get_float8_infinity()); + } + + /* + * Note that the POSIX/C99 gamma function is called "tgamma", not "gamma". + * + * For negative integer inputs, it may raise a domain error or a pole + * error, or return NaN or +/-infinity (the POSIX spec indicates that this + * may change in future implementations). Try to distinguish these cases + * from other overflow cases, and give a more specific error message. + */ + errno = 0; + result = tgamma(arg1); + if (errno != 0 || isinf(result) || isnan(result)) + { + if (arg1 < 0 && floor(arg1) == arg1) + ereport(ERROR, + errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), + errmsg("the gamma function is undefined for negative integer inputs")); + float_overflow_error(); + } + + PG_RETURN_FLOAT8(result); +} + + +/* + * dlgamma - natural logarithm of absolute value of gamma of arg1 + */ +Datum +dlgamma(PG_FUNCTION_ARGS) +{ + float8 arg1 = PG_GETARG_FLOAT8(0); + float8 result; + + /* Per the POSIX spec, return NaN if the input is NaN */ + if (isnan(arg1)) + PG_RETURN_FLOAT8(get_float8_nan()); + + errno = 0; + result = lgamma(arg1); + + /* + * If an ERANGE error occurs, it means there is an overflow. This can + * happen if the input is 0, a negative integer, -infinity, or infinity. + * In each of those cases, the correct result is infinity. Otherwise, it + * is a genuine overflow error. + */ + if (unlikely(errno == ERANGE)) + { + if (isinf(arg1) || (arg1 <= 0 && floor(arg1) == arg1)) + result = get_float8_infinity(); + else + float_overflow_error(); + } + + PG_RETURN_FLOAT8(result); +} + + /* * ========================= diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat new file mode 100644 index ff5436a..cee8398 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -3490,6 +3490,13 @@ proname => 'erfc', prorettype => 'float8', proargtypes => 'float8', prosrc => 'derfc' }, +{ oid => '8702', descr => 'gamma function', + proname => 'gamma', prorettype => 'float8', proargtypes => 'float8', + prosrc => 'dgamma' }, +{ oid => '8703', descr => 'natural logarithm of absolute value of gamma function', + proname => 'lgamma', prorettype => 'float8', proargtypes => 'float8', + prosrc => 'dlgamma' }, + { oid => '1618', proname => 'interval_mul', prorettype => 'interval', proargtypes => 'interval float8', prosrc => 'interval_mul' }, diff --git a/src/test/regress/expected/float8.out b/src/test/regress/expected/float8.out new file mode 100644 index 344d6b7..a31624b --- a/src/test/regress/expected/float8.out +++ b/src/test/regress/expected/float8.out @@ -830,6 +830,53 @@ FROM (VALUES (float8 '-infinity'), (22 rows) RESET extra_float_digits; +-- gamma functions +-- we run these with extra_float_digits = -1, to get consistently rounded +-- results on all platforms. +SET extra_float_digits = -1; +SELECT x, + gamma(x), + lgamma(x) +FROM (VALUES (float8 '-infinity'), + ('-0'), (0), (0.5), (1), (2), (3), (4), (5), + (float8 'infinity'), (float8 'nan')) AS t(x); + x | gamma | lgamma +-----------+-----------------+------------------ + -Infinity | NaN | Infinity + -0 | -Infinity | Infinity + 0 | Infinity | Infinity + 0.5 | 1.7724538509055 | 0.5723649429247 + 1 | 1 | 0 + 2 | 1 | 0 + 3 | 2 | 0.69314718055995 + 4 | 6 | 1.7917594692281 + 5 | 24 | 3.1780538303479 + Infinity | Infinity | Infinity + NaN | NaN | NaN +(11 rows) + +-- invalid inputs for gamma() +SELECT gamma(-1); +ERROR: the gamma function is undefined for negative integer inputs +SELECT gamma(1000000); +ERROR: value out of range: overflow +-- valid for lgamma() +SELECT lgamma(-1); + lgamma +---------- + Infinity +(1 row) + +SELECT lgamma(1000000); + lgamma +----------------- + 12815504.569148 +(1 row) + +-- inavlid input for lgamma() +SELECT lgamma(1e308); +ERROR: value out of range: overflow +RESET extra_float_digits; -- test for over- and underflow INSERT INTO FLOAT8_TBL(f1) VALUES ('10e400'); ERROR: "10e400" is out of range for type double precision diff --git a/src/test/regress/sql/float8.sql b/src/test/regress/sql/float8.sql new file mode 100644 index 98e9926..de2656a --- a/src/test/regress/sql/float8.sql +++ b/src/test/regress/sql/float8.sql @@ -245,6 +245,27 @@ FROM (VALUES (float8 '-infinity'), RESET extra_float_digits; +-- gamma functions +-- we run these with extra_float_digits = -1, to get consistently rounded +-- results on all platforms. +SET extra_float_digits = -1; +SELECT x, + gamma(x), + lgamma(x) +FROM (VALUES (float8 '-infinity'), + ('-0'), (0), (0.5), (1), (2), (3), (4), (5), + (float8 'infinity'), (float8 'nan')) AS t(x); +-- invalid inputs for gamma() +SELECT gamma(-1); +SELECT gamma(1000000); +-- valid for lgamma() +SELECT lgamma(-1); +SELECT lgamma(1000000); +-- inavlid input for lgamma() +SELECT lgamma(1e308); + +RESET extra_float_digits; + -- test for over- and underflow INSERT INTO FLOAT8_TBL(f1) VALUES ('10e400');