On Thu, 29 Aug 2024 at 05:39, Andy Fan <zhihuifan1...@163.com> wrote: > > Yes, that's a valid usage. the new vesion is attached. I have changed > the the commit entry [1] from "Waiting on Author" to "Needs review". >
Note that this needs a rebase, following commit 4681ad4b2f. Here are a few more review comments: 1). In the tests: +select setseed(0.8); +select rand_array(10, 0, 3, 50::int, 80::int); +select setseed(0.8); +select rand_array(10, 0, 3, 50::bigint, 80::bigint); +select setseed(0.8); +select rand_array(10, 0, 3, 50::float8, 80::float8); +select setseed(0.8); +select rand_array(10, 0, 3, 50::float4, 80::float4); +select setseed(0.8); +select rand_array(10, 0, 3, 50::numeric, 80::numeric); this should really have a comment block to distinguish these new tests from the preceeding normal_rand() tests. 2). It's worth also having tests for the error cases. 3). It's only necessary to call setseed() once to get a reproducible set of results from then on. 4). I'd use setseed(0) or setseed(0.5), since those values are exactly representable as double precision values, unlike 0.8, ensuring that it works the same on all platforms. It might not matter, but why take the risk? 5). The float8 case still has minimum and maximum value arguments that it just ignores. It should either not take those arguments, or it should respect them, and try to return float8 values in the requested range. I suspect that trying to return float8 values in the requested range would be hard, if not impossible, due to the inexact nature of double precision arithmetic. That's why I suggested earlier having the float8 function not take minval/maxval arguments, and just return values in the range 0 to 1. 6). This new function: +static Datum +rand_array_internal(FunctionCallInfo fcinfo, Oid datatype) +{ should have a comment block. In particular, it should document what its inputs and outputs are. 7). This code: + int num_tuples = PG_GETARG_INT32(0); + int minlen = PG_GETARG_INT32(1); + int maxlen = PG_GETARG_INT32(2); + Datum minval = PG_GETARG_DATUM(3), + maxval = PG_GETARG_DATUM(4); + rand_array_fctx *fctx; + + if (datatype == INT4OID) + random_fn_oid = F_RANDOM_INT4_INT4; + else if (datatype == INT8OID) + random_fn_oid = F_RANDOM_INT8_INT8; + else if (datatype == FLOAT8OID) + random_fn_oid = F_RANDOM_; + else if (datatype == NUMERICOID) + random_fn_oid = F_RANDOM_NUMERIC_NUMERIC; + else + elog(ERROR, "unsupported type %d for rand_array function.", + datatype); + + if (num_tuples < 0) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("number of rows cannot be negative"))); + + if (minlen > maxlen) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("minlen must not be greater than maxlen."))); + + if (minlen < 0) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("minlen and maxlen must be greater than or equal to zero."))); should be inside the "if (SRF_IS_FIRSTCALL())" block. There's no point repeating all those checks for every call. 8). I think it would be neater to code the "if (datatype == INT4OID) ... else if ..." as a switch statement. 9). I would swap the last 2 bound checks round, and simplify the error messages as follows: if (minlen < 0) ereport(ERROR, errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("minlen must be greater than or equal to zero")); if (maxlen < minlen) ereport(ERROR, errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("maxlen must be greater than or equal to minlen")); Also note that primary error messages should not end with a period. See https://www.postgresql.org/docs/current/error-style-guide.html 10). In this error: + elog(ERROR, "unsupported type %d for rand_array function.", + datatype); "datatype" is of type Oid, which is unsigned, and so it should use "%u" not "%d". Also, as above, it should not end with a period, so it should be: + elog(ERROR, "unsupported type %u for rand_array function", + datatype); 11). If the float8 case is made to not have minval/maxval arguments, this code: + Datum minval = PG_GETARG_DATUM(3), + maxval = PG_GETARG_DATUM(4); and the FunctionCallInfo setup code needs to be different for the float8 case, in order to avoid reading and setting undefined arguments. Perhaps introduce a "need_val_bounds" boolean variable, based on the datatype. 12). This code: + random_len_fcinfo->args[0].value = minlen; + random_len_fcinfo->args[1].value = maxlen; should really be: + random_len_fcinfo->args[0].value = Int32GetDatum(minlen); + random_len_fcinfo->args[1].value = Int32GetDatum(maxlen); It amounts to the same thing, but documents the fact that it's converting an integer to a Datum. 13). These new functions are significantly under-documented, especially when compared to all the other functions on https://www.postgresql.org/docs/current/tablefunc.html They really should have their own subsection, along the same lines as "F.41.1.1. Normal_rand", explaining what the functions do, what their arguments mean, and giving a couple of usage examples. Regards, Dean