Re: Exposing the lock manager's WaitForLockers() to SQL
On Sun, Sep 3, 2023 at 11:16 PM Will Mortensen wrote:
> I realized that for our use case, we'd ideally wait for holders of
> RowExclusiveLock only, and not e.g. VACUUM holding
> ShareUpdateExclusiveLock. Waiting for lockers in a specific mode seems
> possible by generalizing/duplicating WaitForLockersMultiple() and
> GetLockConflicts(), but I'd love to have a sanity check before
> attempting that. Also, I imagine those semantics might be too
> different to make sense as part of the LOCK command.

Well I attempted it. :-) Here is a new series that refactors GetLockConflicts(), generalizes WaitForLockersMultiple(), and adds a new WAIT FOR LOCKERS command.

I first tried extending LOCK further, but the code became somewhat unwieldy and the syntax became very confusing. I also thought again about making new pg_foo() functions, but that would seemingly make it harder to share code with LOCK, and sharing syntax (to the extent it makes sense) feels very natural. Also, a new SQL command provides plenty of doc space. :-) (I'll see about adding more examples later.)

I'll try to edit the title of the CF entry accordingly. Still looking forward to any feedback. :-)

Attachments:
- v4-0002-Allow-specifying-single-lockmode-in-WaitForLocker.patch
- v4-0003-Add-WAIT-FOR-LOCKERS-command.patch
- v4-0001-Refactor-GetLockConflicts-into-more-general-GetLo.patch
Re: Exposing the lock manager's WaitForLockers() to SQL
I meant to add that the example in the doc is adapted from Marco Slot's blog post linked earlier: https://www.citusdata.com/blog/2018/06/14/scalable-incremental-data-aggregation/
Re: date_trunc function in interval version
Hi

On Fri, Dec 22, 2023 at 11:25 PM Przemysław Sztoch wrote:
> In my opinion date_trunc is very good name.
> Truncated data is timestamp type, not interval.
> First parameter has same meaning in original date_trunc and in my new version.
> New version provides only more granularity.

ok, I miss it.

Regards

Pavel

> Pavel Stehule wrote on 12/22/2023 8:43 PM:
>> Hi
>>
>> On Fri, Dec 22, 2023 at 8:26 PM Przemysław Sztoch wrote:
>>> Hello.
>>> There is date_trunc(interval, timestamptz, timezone) function.
>>> First parameter can be '5 year', '2 month', '6 hour', '3 hour', '15 minute', '10 second' etc.
>>
>> should not be named interval_trunc instead? In this case the good name can be hard to choose, but with the name date_trunc it can be hard to find it.
>>
>> Regards
>>
>> Pavel
>>
>>> --
>>> Przemysław Sztoch | Mobile +48 509 99 00 66
>
> --
> Przemysław Sztoch | Mobile +48 509 99 00 66
Re: date_trunc function in interval version
On Sat, Dec 23, 2023 at 1:33 PM Pavel Stehule wrote:
> Hi
>
> On Fri, Dec 22, 2023 at 11:25 PM Przemysław Sztoch wrote:
>> In my opinion date_trunc is very good name.
>> Truncated data is timestamp type, not interval.
>> First parameter has same meaning in original date_trunc and in my new version.
>> New version provides only more granularity.
>
> ok, I miss it.

I was confused - I am sorry, I imagined something different. Then the name is correct.

Regards

Pavel

> Pavel Stehule wrote on 12/22/2023 8:43 PM:
>> Hi
>>
>> On Fri, Dec 22, 2023 at 8:26 PM Przemysław Sztoch wrote:
>>> Hello.
>>> There is date_trunc(interval, timestamptz, timezone) function.
>>> First parameter can be '5 year', '2 month', '6 hour', '3 hour', '15 minute', '10 second' etc.
>>
>> should not be named interval_trunc instead? In this case the good name can be hard to choose, but with the name date_trunc it can be hard to find it.
>>
>> Regards
>>
>> Pavel
>>
>>> --
>>> Przemysław Sztoch | Mobile +48 509 99 00 66
>
> --
> Przemysław Sztoch | Mobile +48 509 99 00 66
Re: Make attstattarget nullable
Here is an updated patch rebased over 3e2e0d5ad7.

The 0001 patch stands on its own, but I also tacked on two additional WIP patches that simplify some pg_attribute handling and make these kinds of refactorings simpler in the future. See description in the patches.

On 05.12.23 13:52, Peter Eisentraut wrote:
> In [0] it was discussed that we could make attstattarget a nullable column, instead of always storing an explicit -1 default value for most columns. This patch implements this.
>
> This changes the pg_attribute field attstattarget into a nullable field in the variable-length part of the row. If no value is set by the user for attstattarget, it is now null instead of previously -1. This saves space in pg_attribute and tuple descriptors for most practical scenarios. (ATTRIBUTE_FIXED_PART_SIZE is reduced from 108 to 104.) Also, null is the semantically more correct value.
>
> The ANALYZE code internally continues to represent the default statistics target by -1, so that that code can avoid having to deal with null values. But that is now contained to ANALYZE code. The DDL code deals with attstattarget possibly null. For system columns, the field is now always null but the effective value 0 (don't analyze) is assumed.
>
> To set a column's statistics target to the default value, the new command form ALTER TABLE ... SET STATISTICS DEFAULT can be used. (SET STATISTICS -1 still works.)
>
> [0]: https://www.postgresql.org/message-id/flat/d07ffc2b-e0e8-77f7-38fb-be921dff71af%40enterprisedb.com

From f370eceec0cbb9b6bf76d3394e56a5df4280c906 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut
Date: Sat, 23 Dec 2023 10:47:19 +0100
Subject: [PATCH v2 1/3] Make attstattarget nullable

This changes the pg_attribute field attstattarget into a nullable field in the variable-length part of the row. If no value is set by the user for attstattarget, it is now null instead of previously -1. This saves space in pg_attribute and tuple descriptors for most practical scenarios.
(ATTRIBUTE_FIXED_PART_SIZE is reduced from 108 to 104.) Also, null is the semantically more correct value. The ANALYZE code internally continues to represent the default statistics target by -1, so that that code can avoid having to deal with null values. But that is now contained to ANALYZE code. The DDL code deals with attstattarget possibly null. For system columns, the field is now always null but the effective value 0 (don't analyze) is assumed. To set a column's statistics target to the default value, the new command form ALTER TABLE ... SET STATISTICS DEFAULT can be used. (SET STATISTICS -1 still works.) Discussion: https://www.postgresql.org/message-id/flat/4da8d211-d54d-44b9-9847-f2a9f1184...@eisentraut.org TODO: move get_attstattarget() into analyze.c? TODO: catversion --- doc/src/sgml/ref/alter_table.sgml | 4 +- src/backend/access/common/tupdesc.c| 4 -- src/backend/bootstrap/bootstrap.c | 1 - src/backend/catalog/genbki.pl | 1 - src/backend/catalog/heap.c | 14 +++ src/backend/catalog/index.c| 21 --- src/backend/commands/analyze.c | 7 +++- src/backend/commands/tablecmds.c | 44 +- src/backend/parser/gram.y | 18 ++--- src/backend/utils/cache/lsyscache.c| 22 +-- src/bin/pg_dump/pg_dump.c | 7 +++- src/include/catalog/pg_attribute.h | 16 src/include/commands/vacuum.h | 2 +- src/test/regress/expected/create_index.out | 4 +- 14 files changed, 109 insertions(+), 56 deletions(-) diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml index e1d207bc60..9d637157eb 100644 --- a/doc/src/sgml/ref/alter_table.sgml +++ b/doc/src/sgml/ref/alter_table.sgml @@ -50,7 +50,7 @@ ALTER [ COLUMN ] column_name ADD GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY [ ( sequence_options ) ] ALTER [ COLUMN ] column_name { SET GENERATED { ALWAYS | BY DEFAULT } | SET sequence_option | RESTART [ [ WITH ] restart ] } [...] 
ALTER [ COLUMN ] column_name DROP IDENTITY [ IF EXISTS ] -ALTER [ COLUMN ] column_name SET STATISTICS integer +ALTER [ COLUMN ] column_name SET STATISTICS { integer | DEFAULT } ALTER [ COLUMN ] column_name SET ( attribute_option = value [, ... ] ) ALTER [ COLUMN ] column_name RESET ( attribute_option [, ... ] ) ALTER [ COLUMN ] column_name SET STORAGE { PLAIN | EXTERNAL | EXTENDED | MAIN | DEFAULT } @@ -317,7 +317,7 @@ Description sets the per-column statistics-gathering target for subsequent ANALYZE operations. The target can be set in the range 0 to 1; alternatively, set it - to -1 to revert to using the system default statistics + to DEFAULT to revert to using the system default statistics target (). For more information on the use of statistics by the PostgreSQL query
Re: authentication/t/001_password.pl trashes ~/.psql_history
On 2023-12-22 Fr 17:11, Tom Lane wrote:
> I wrote:
>> I happened to notice this stuff getting added to my .psql_history:
>>
>> \echo background_psql: ready
>> SET password_encryption='scram-sha-256';
>> ;
>> \echo background_psql: QUERY_SEPARATOR
>> SET scram_iterations=42;
>> ;
>> \echo background_psql: QUERY_SEPARATOR
>> \password scram_role_iter
>> \q
>>
>> After grepping for these strings, this is evidently the fault of src/test/authentication/t/001_password.pl by way of BackgroundPsql.pm, which fires up an interactive psql run that is not given the -n switch. Currently the only other user of interactive_psql() seems to be psql/t/010_tab_completion.pl, which avoids this problem by explicitly redirecting the history file. We could have 001_password.pl do likewise, or we could have it pass the -n switch, but I think we're going to have this problem resurface repeatedly if we leave it to the outer test script to remember to do it.
>
> After studying this some more, my conclusion is that BackgroundPsql.pm failed to borrow as much as it should have from 010_tab_completion.pl. Specifically, we want all the environment-variable changes that that script performed to be applied in any test using an interactive psql. Maybe ~/.inputrc and so forth would never affect any other test scripts, but that doesn't seem like a great bet. So that leads me to the attached proposed patch.

Looks fine, after reading your original post I was thinking along the same lines.

You could shorten this

+ my $history_file = $params{history_file};
+ $history_file ||= '/dev/null';
+ $ENV{PSQL_HISTORY} = $history_file;

to just

 $ENV{PSQL_HISTORY} = $params{history_file} || '/dev/null';

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
pg_stat_statements: more test coverage
pg_stat_statements has some significant gaps in test coverage, especially around the serialization of data around server restarts, so I wrote a test for that and also made some other smaller tweaks to increase the coverage a bit. These patches are all independent of each other. After that, the only major function that isn't tested is gc_qtexts(). Maybe a future project.From 32b51698b581b816f7ca2ff16c92be8d25e7fd66 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Sat, 23 Dec 2023 14:25:26 +0100 Subject: [PATCH v1 1/5] pg_stat_statements: Exclude from lcov functions that are impossible to test The current code only supports installing version 1.4 of pg_stat_statements and upgrading from there. So the C code entry points for older versions are not reachable from the tests. To prevent this from ruining the test coverage figures, mark those functions are excluded from lcov. --- contrib/pg_stat_statements/pg_stat_statements.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c index 6f62a531f7..8ac73bf55e 100644 --- a/contrib/pg_stat_statements/pg_stat_statements.c +++ b/contrib/pg_stat_statements/pg_stat_statements.c @@ -311,13 +311,15 @@ static bool pgss_save = true; /* whether to save stats across shutdown */ PG_FUNCTION_INFO_V1(pg_stat_statements_reset); PG_FUNCTION_INFO_V1(pg_stat_statements_reset_1_7); PG_FUNCTION_INFO_V1(pg_stat_statements_reset_1_11); +/* LCOV_EXCL_START */ +PG_FUNCTION_INFO_V1(pg_stat_statements); PG_FUNCTION_INFO_V1(pg_stat_statements_1_2); +/* LCOV_EXCL_STOP */ PG_FUNCTION_INFO_V1(pg_stat_statements_1_3); PG_FUNCTION_INFO_V1(pg_stat_statements_1_8); PG_FUNCTION_INFO_V1(pg_stat_statements_1_9); PG_FUNCTION_INFO_V1(pg_stat_statements_1_10); PG_FUNCTION_INFO_V1(pg_stat_statements_1_11); -PG_FUNCTION_INFO_V1(pg_stat_statements); PG_FUNCTION_INFO_V1(pg_stat_statements_info); static void pgss_shmem_request(void); @@ -1610,6 +1612,8 
@@ pg_stat_statements_1_3(PG_FUNCTION_ARGS) return (Datum) 0; } +/* LCOV_EXCL_START */ + Datum pg_stat_statements_1_2(PG_FUNCTION_ARGS) { @@ -1633,6 +1637,8 @@ pg_stat_statements(PG_FUNCTION_ARGS) return (Datum) 0; } +/* LCOV_EXCL_STOP */ + /* Common code for all versions of pg_stat_statements() */ static void pg_stat_statements_internal(FunctionCallInfo fcinfo, base-commit: 3e2e0d5ad7fcb89d18a71cbfc885ef184e1b6f2e -- 2.43.0 From 1033183cea71c6bd37934cedb972d35a4e251134 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Sat, 23 Dec 2023 14:25:26 +0100 Subject: [PATCH v1 2/5] pg_stat_statements: Add coverage for pg_stat_statements_1_8() This requires reading pg_stat_statements at least once while the 1.8 version of the extension is installed. --- .../expected/oldextversions.out | 24 --- .../pg_stat_statements/sql/oldextversions.sql | 3 ++- 2 files changed, 17 insertions(+), 10 deletions(-) diff --git a/contrib/pg_stat_statements/expected/oldextversions.out b/contrib/pg_stat_statements/expected/oldextversions.out index ec317b0d6b..f3a90cac0a 100644 --- a/contrib/pg_stat_statements/expected/oldextversions.out +++ b/contrib/pg_stat_statements/expected/oldextversions.out @@ -88,6 +88,17 @@ SELECT count(*) > 0 AS has_data FROM pg_stat_statements; -- New functions and views for pg_stat_statements in 1.8 AlTER EXTENSION pg_stat_statements UPDATE TO '1.8'; +SELECT pg_get_functiondef('pg_stat_statements_reset'::regproc); + pg_get_functiondef + + CREATE OR REPLACE FUNCTION public.pg_stat_statements_reset(userid oid DEFAULT 0, dbid oid DEFAULT 0, queryid bigint DEFAULT 0)+ + RETURNS void + + LANGUAGE c + + PARALLEL SAFE STRICT + + AS '$libdir/pg_stat_statements', $function$pg_stat_statements_reset_1_7$function$ + + +(1 row) + \d pg_stat_statements View "public.pg_stat_statements" Column| Type | Collation | Nullable | Default @@ -125,15 +136,10 @@ AlTER EXTENSION pg_stat_statements UPDATE TO '1.8'; wal_fpi | bigint | | | wal_bytes | numeric | | | -SELECT 
pg_get_functiondef('pg_stat_statements_reset'::regproc); - pg_get_functiondef
Password leakage avoidance
I have recently, once again for the umpteenth time, been involved in discussions around (paraphrasing) "why does Postgres leak the passwords into the logs when they are changed". I know well that the canonical advice is something like "use psql with \password if you care about that".

And while that works, it is a deeply unsatisfying answer for me to give and for the OP to receive.

The alternative is something like "...well if you don't like that, use PQencryptPasswordConn() to roll your own solution that meets your security needs". Again, not a spectacular answer IMHO. It amounts to "here is a do-it-yourself kit, go put it together".

It occurred to me that we can, and really should, do better.

The attached patch set moves the guts of \password from psql into the libpq client side -- PQchangePassword() (patch 0001). The usage in psql serves as a ready built-in test for the libpq function (patch 0002). Docs included too (patch 0003).

One thing I have not done, but considered, is adding an additional optional parameter to allow "VALID UNTIL" to be set. Seems like it would be useful to be able to set an expiration when setting a new password.

I will register this in the upcoming commitfest, but meantime thoughts/comments/etc. would be gratefully received.
Thanks, -- Joe Conway PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.comdiff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt index 850734a..28b861f 100644 *** a/src/interfaces/libpq/exports.txt --- b/src/interfaces/libpq/exports.txt *** PQclosePrepared 188 *** 191,193 --- 191,194 PQclosePortal 189 PQsendClosePrepared 190 PQsendClosePortal 191 + PQchangePassword 192 diff --git a/src/interfaces/libpq/fe-auth.c b/src/interfaces/libpq/fe-auth.c index 912aa14..3ee67ba 100644 *** a/src/interfaces/libpq/fe-auth.c --- b/src/interfaces/libpq/fe-auth.c *** PQencryptPasswordConn(PGconn *conn, cons *** 1372,1374 --- 1372,1463 return crypt_pwd; } + + /* + * PQchangePassword -- exported routine to change a password + * + * This is intended to be used by client applications that wish to + * change the password for a user. The password is not sent in + * cleartext because it is encrypted on the client side. This is + * good because it ensures the cleartext password is never known by + * the server, and therefore won't end up in logs, pg_stat displays, + * etc. We export the function so that clients won't be dependent + * on the implementation specific details with respect to how the + * server changes passwords. + * + * Arguments are a connection object, the cleartext password, the SQL + * name of the user it is for, and a string indicating the algorithm to + * use for encrypting the password. If algorithm is NULL, + * PQencryptPasswordConn() queries the server for the current + * 'password_encryption' value. If you wish to avoid that, e.g. to avoid + * blocking, you can execute 'show password_encryption' yourself before + * calling this function, and pass it as the algorithm. + * + * Return value is a boolean, true for success. On error, an error message + * is stored in the connection object, and returns false. 
+ */ + bool + PQchangePassword(PGconn *conn, const char *passwd, const char *user, + const char *algorithm) + { + char *encrypted_password = NULL; + PQExpBufferData buf; + bool success = true; + + encrypted_password = PQencryptPasswordConn(conn, passwd, user, algorithm); + + if (!encrypted_password) + { + /* PQencryptPasswordConn() already registered the error */ + success = false; + } + else + { + char *fmtpw = NULL; + + fmtpw = PQescapeLiteral(conn, encrypted_password, + strlen(encrypted_password)); + + /* no longer needed, so clean up now */ + PQfreemem(encrypted_password); + + if (!fmtpw) + { + /* PQescapeLiteral() already registered the error */ + success = false; + } + else + { + char *fmtuser = NULL; + + fmtuser = PQescapeIdentifier(conn, user, strlen(user)); + if (!fmtuser) + { + /* PQescapeIdentifier() already registered the error */ + success = false; + PQfreemem(fmtpw); + } + else + { + PGresult *res; + + initPQExpBuffer(&buf); + printfPQExpBuffer(&buf, "ALTER USER %s PASSWORD %s", + fmtuser, fmtpw); + + res = PQexec(conn, buf.data); + if (!res) + success = false; + else + PQclear(res); + + /* clean up */ + termPQExpBuffer(&buf); + PQfreemem(fmtuser); + PQfreemem(fmtpw); + } + } + } + + return success; + } diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h index 97762d5..f6e7207 100644 *** a/src/interfaces/libpq/libpq-fe.h --- b/src/interfaces/libpq/libpq-fe.h *** extern int PQenv2encoding(void); *** 659,664 ---
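The point of doing the encryption client-side, as PQencryptPasswordConn() does before PQchangePassword() ships the result in an ALTER USER statement, is that only a salted verifier ever crosses the wire or reaches the server logs. As a rough illustration of what such a verifier looks like, here is a Python sketch that derives a SCRAM-SHA-256 verifier in the format PostgreSQL stores (RFC 5802 key derivation); the function name and the omission of SASLprep normalization are simplifications for illustration, not part of the patch:

```python
import base64
import hashlib
import hmac
import os


def scram_sha_256_verifier(password, salt=None, iterations=4096):
    """Build a SCRAM-SHA-256 verifier of the shape PostgreSQL stores:
    SCRAM-SHA-256$<iterations>:<salt>$<StoredKey>:<ServerKey>

    Note: a real implementation must SASLprep-normalize the password first;
    that step is skipped here for brevity.
    """
    if salt is None:
        salt = os.urandom(16)
    # SaltedPassword := Hi(password, salt, i), i.e. PBKDF2-HMAC-SHA-256
    salted = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                 salt, iterations)
    client_key = hmac.new(salted, b"Client Key", hashlib.sha256).digest()
    stored_key = hashlib.sha256(client_key).digest()
    server_key = hmac.new(salted, b"Server Key", hashlib.sha256).digest()

    def b64(data):
        return base64.b64encode(data).decode("ascii")

    return "SCRAM-SHA-256$%d:%s$%s:%s" % (
        iterations, b64(salt), b64(stored_key), b64(server_key))
```

The cleartext password never needs to leave the client; only this derived string appears in the ALTER USER command.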
Re: Password leakage avoidance
Joe Conway writes:
> The attached patch set moves the guts of \password from psql into the
> libpq client side -- PQchangePassword() (patch 0001).

Haven't really read the patch, just looked at the docs, but here's a bit of bikeshedding:

* This seems way too eager to promote the use of md5. Surely the default ought to be SCRAM, full stop. I question whether we even need an algorithm parameter. Perhaps it's a good idea for future-proofing, but we could also plan that the function would make its own decisions based on noting the server's version. (libpq is far more likely to be au courant about what to do than the calling application, IMO.)

* Parameter order seems a bit odd: to me it'd be natural to write user before password.

* Docs claim return type is char *, but then say bool (which is also what the code says). We do not use bool in libpq's API; the admittedly-hoary convention is to use int with 1/0 values. Rather than quibble about that, though, I wonder if we should make the result be the PGresult from the command, which the caller would be responsible to free. That would document error conditions much more flexibly. On the downside, client-side errors would be harder to report, particularly OOM, but I think we have solutions for that in existing functions.

* The whole business of nonblock mode seems fairly messy here, and I wonder whether it's worth the trouble to support. If we do want to claim it works then it ought to be tested.

> One thing I have not done but, considered, is adding an additional
> optional parameter to allow "VALID UNTIL" to be set. Seems like it would
> be useful to be able to set an expiration when setting a new password.

No strong opinion about that.

regards, tom lane
Re: authentication/t/001_password.pl trashes ~/.psql_history
Andrew Dunstan writes:
> On 2023-12-22 Fr 17:11, Tom Lane wrote:
>> After studying this some more, my conclusion is that BackgroundPsql.pm
>> failed to borrow as much as it should have from 010_tab_completion.pl.
>> Specifically, we want all the environment-variable changes that that
>> script performed to be applied in any test using an interactive psql.
>> Maybe ~/.inputrc and so forth would never affect any other test scripts,
>> but that doesn't seem like a great bet.

> Looks fine, after reading your original post I was thinking along the
> same lines.

Thanks for reviewing.

> You could shorten this
> + my $history_file = $params{history_file};
> + $history_file ||= '/dev/null';
> + $ENV{PSQL_HISTORY} = $history_file;
> to just
> $ENV{PSQL_HISTORY} = $params{history_file} || '/dev/null';

OK. I was unsure which way would be considered more readable, but based on your advice I shortened it.

regards, tom lane
Re: Transaction timeout
> On 22 Dec 2023, at 10:39, Japin Li wrote:
>
> I try to split the test for transaction timeout, and all passed on my CI [1].

I like the refactoring you did in timeout.spec. I thought it was impossible, because permutations would try to reinitialize FATALed sessions. But, obviously, the tests work the way you refactored them.

However, I don't think ignoring test failures on Windows without understanding the root cause is a good idea. Let's get back to the v13 version of the tests, understand why it failed, and apply your test refactorings afterwards. BTW, are you sure that the v14 refactorings are functionally equivalent to the v13 tests?

To go with this plan I attach a slightly modified version of the v13 tests in the v16 patchset. The only change is the timing in the "sleep_there" step. I suspect that the failure was induced by more coarse timer granularity on Windows. The tests were giving only 9 milliseconds for a timeout to entirely wipe a backend from pg_stat_activity. This saves testing time, but might induce false-positive test flaps. So I've raised the wait times to 100ms. This seems like too much, but I do not have other ideas for how to ensure test stability. Maybe 50ms would be enough, I do not know. Isolation runs in ~50 seconds now. I'm tempted to say that 200ms for timeouts is worth it.

As to the 2nd step, "Try to enable transaction_timeout during transaction", I think this makes sense. But if we are doing so, shouldn't we also allow enabling idle_in_transaction timeout in the same manner? Currently we only allow disabling other timeouts... Also, if we are already in a transaction, shouldn't we also subtract the current transaction's span from the timeout? I think making this functionality another step of the patchset was a good idea.

Thanks!

Best regards, Andrey Borodin.

Attachments:
- v17-0001-Introduce-transaction_timeout.patch
- v17-0002-Try-to-enable-transaction_timeout-before-next-co.patch
Re: authentication/t/001_password.pl trashes ~/.psql_history
> On 23 Dec 2023, at 17:52, Tom Lane wrote:
>
> Andrew Dunstan writes:
>> On 2023-12-22 Fr 17:11, Tom Lane wrote:
>>> After studying this some more, my conclusion is that BackgroundPsql.pm
>>> failed to borrow as much as it should have from 010_tab_completion.pl.
>>> Specifically, we want all the environment-variable changes that that
>>> script performed to be applied in any test using an interactive psql.
>>> Maybe ~/.inputrc and so forth would never affect any other test scripts,
>>> but that doesn't seem like a great bet.
>
>> Looks fine, after reading your original post I was thinking along the
>> same lines.
>
> Thanks for reviewing.
>
>> You could shorten this
>> + my $history_file = $params{history_file};
>> + $history_file ||= '/dev/null';
>> + $ENV{PSQL_HISTORY} = $history_file;
>> to just
>> $ENV{PSQL_HISTORY} = $params{history_file} || '/dev/null';
>
> OK. I was unsure which way would be considered more readable,
> but based on your advice I shortened it.

Sorry for jumping in late, only saw this now (ETOOMUCHCHRISTMASPREP) but the committed patch looks good to me too. Thanks!

--
Daniel Gustafsson
Improving information_schema._pg_expandarray()
I happened to notice that information_schema._pg_expandarray(), which has the nigh-unreadable definition

    AS 'select $1[s],
        s operator(pg_catalog.-) pg_catalog.array_lower($1,1) operator(pg_catalog.+) 1
        from pg_catalog.generate_series(pg_catalog.array_lower($1,1),
                                        pg_catalog.array_upper($1,1),
                                        1) as g(s)';

can now be implemented using unnest():

    AS 'SELECT * FROM pg_catalog.unnest($1) WITH ORDINALITY';

It seems to be slightly more efficient this way, but the main point is to make it more readable.

I then realized that we could also borrow unnest's infrastructure for rowcount estimation:

    ROWS 100 SUPPORT pg_catalog.array_unnest_support

because array_unnest_support just looks at the array argument and doesn't have any hard dependency on the function being specifically unnest(). I'm not sure that any of its uses in information_schema can benefit from that right now, but surely it can't hurt.

One minor annoyance is that psql.sql is using _pg_expandarray as a test case for \sf[+]. While we could keep doing so, I think the main point of that test case is to exercise \sf+'s line numbering ability, so the new one-line body is not a great test. I changed that test to use _pg_index_position instead.
regards, tom lane diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql index 10b34c3c5b..893f73ecb5 100644 --- a/src/backend/catalog/information_schema.sql +++ b/src/backend/catalog/information_schema.sql @@ -43,11 +43,8 @@ SET search_path TO information_schema; CREATE FUNCTION _pg_expandarray(IN anyarray, OUT x anyelement, OUT n int) RETURNS SETOF RECORD LANGUAGE sql STRICT IMMUTABLE PARALLEL SAFE -AS 'select $1[s], -s operator(pg_catalog.-) pg_catalog.array_lower($1,1) operator(pg_catalog.+) 1 -from pg_catalog.generate_series(pg_catalog.array_lower($1,1), -pg_catalog.array_upper($1,1), -1) as g(s)'; +ROWS 100 SUPPORT pg_catalog.array_unnest_support +AS 'SELECT * FROM pg_catalog.unnest($1) WITH ORDINALITY'; /* Given an index's OID and an underlying-table column number, return the * column's position in the index (NULL if not there) */ diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c index 631012a0f2..e783a24519 100644 --- a/src/backend/utils/adt/arrayfuncs.c +++ b/src/backend/utils/adt/arrayfuncs.c @@ -6317,6 +6317,9 @@ array_unnest(PG_FUNCTION_ARGS) /* * Planner support function for array_unnest(anyarray) + * + * Note: this is now also used for information_schema._pg_expandarray(), + * which is simply a wrapper around array_unnest(). 
*/ Datum array_unnest_support(PG_FUNCTION_ARGS) diff --git a/src/test/regress/expected/psql.out b/src/test/regress/expected/psql.out index 13e4f6db7b..5d61e4c7bb 100644 --- a/src/test/regress/expected/psql.out +++ b/src/test/regress/expected/psql.out @@ -5293,26 +5293,30 @@ comment on function psql_df_plpgsql () is 'some comment'; rollback; drop role regress_psql_user; -- check \sf -\sf information_schema._pg_expandarray -CREATE OR REPLACE FUNCTION information_schema._pg_expandarray(anyarray, OUT x anyelement, OUT n integer) - RETURNS SETOF record +\sf information_schema._pg_index_position +CREATE OR REPLACE FUNCTION information_schema._pg_index_position(oid, smallint) + RETURNS integer LANGUAGE sql - IMMUTABLE PARALLEL SAFE STRICT -AS $function$select $1[s], -s operator(pg_catalog.-) pg_catalog.array_lower($1,1) operator(pg_catalog.+) 1 -from pg_catalog.generate_series(pg_catalog.array_lower($1,1), -pg_catalog.array_upper($1,1), -1) as g(s)$function$ -\sf+ information_schema._pg_expandarray -CREATE OR REPLACE FUNCTION information_schema._pg_expandarray(anyarray, OUT x anyelement, OUT n integer) - RETURNS SETOF record + STABLE STRICT +BEGIN ATOMIC + SELECT (ss.a).n AS n +FROM ( SELECT information_schema._pg_expandarray(pg_index.indkey) AS a +FROM pg_index + WHERE (pg_index.indexrelid = $1)) ss + WHERE ((ss.a).x = $2); +END +\sf+ information_schema._pg_index_position +CREATE OR REPLACE FUNCTION information_schema._pg_index_position(oid, smallint) + RETURNS integer LANGUAGE sql - IMMUTABLE PARALLEL SAFE STRICT -1 AS $function$select $1[s], -2 s operator(pg_catalog.-) pg_catalog.array_lower($1,1) operator(pg_catalog.+) 1 -3 from pg_catalog.generate_series(pg_catalog.array_lower($1,1), -4 pg_catalog.array_upper($1,1), -5 1) as g(s)$function$ + STABLE STRICT +1 BEGIN ATOMIC +2SELECT (ss.a).n AS n +3 FROM ( SELE
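The key property the rewrite above relies on is that both definitions return a 1-based position regardless of the array's lower bound: the old body computes `s - array_lower + 1`, while WITH ORDINALITY numbers elements from 1 in storage order. That equivalence can be sanity-checked outside SQL; here is a sketch modeling a Postgres array with an arbitrary lower bound as a Python list plus an offset (function names are illustrative, not from the patch):

```python
def expand_old(elements, lower_bound=1):
    """Old _pg_expandarray: for each subscript s from array_lower..array_upper,
    emit (value, s - array_lower + 1)."""
    upper_bound = lower_bound + len(elements) - 1
    return [(elements[s - lower_bound], s - lower_bound + 1)
            for s in range(lower_bound, upper_bound + 1)]


def expand_new(elements):
    """New _pg_expandarray: unnest(...) WITH ORDINALITY, i.e. elements in
    storage order paired with a 1-based ordinality."""
    return [(x, n) for n, x in enumerate(elements, start=1)]


# The rewrite preserves results even for arrays whose lower bound is not 1:
for lb in (-3, 0, 1, 5):
    assert expand_old(["a", "b", "c"], lb) == expand_new(["a", "b", "c"])
```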
Re: Improving information_schema._pg_expandarray()
so 23. 12. 2023 v 19:18 odesílatel Tom Lane napsal: > I happened to notice that information_schema._pg_expandarray(), > which has the nigh-unreadable definition > > AS 'select $1[s], > s operator(pg_catalog.-) pg_catalog.array_lower($1,1) > operator(pg_catalog.+) 1 > from pg_catalog.generate_series(pg_catalog.array_lower($1,1), > pg_catalog.array_upper($1,1), > 1) as g(s)'; > > can now be implemented using unnest(): > > AS 'SELECT * FROM pg_catalog.unnest($1) WITH ORDINALITY'; > > It seems to be slightly more efficient this way, but the main point > is to make it more readable. > > I then realized that we could also borrow unnest's infrastructure > for rowcount estimation: > > ROWS 100 SUPPORT pg_catalog.array_unnest_support > > because array_unnest_support just looks at the array argument and > doesn't have any hard dependency on the function being specifically > unnest(). I'm not sure that any of its uses in information_schema > can benefit from that right now, but surely it can't hurt. > > One minor annoyance is that psql.sql is using _pg_expandarray > as a test case for \sf[+]. While we could keep doing so, I think > the main point of that test case is to exercise \sf+'s line > numbering ability, so the new one-line body is not a great test. > I changed that test to use _pg_index_position instead. > +1 regards Pavel > regards, tom lane > >
Are operations on real values IMMUTABLE or STABLE?
I've got a small question about marking functions working with decimal number types as either IMMUTABLE or STABLE. Below are a pair of trivial functions that show what I'm guessing. An int8/int8[] seems like it's going to be immutable forever. However, decimal types aren't quite so crisp and consistent. Does this mean that I need to mark such a function as STABLE instead of IMMUTABLE, like below?

I'm a bit hazy on exactly when some operations shift from IMMUTABLE to STABLE. For example, it seems fair that many time/date operations are not IMMUTABLE because they vary based on the current time zone. Likewise, I think that text operations are generally not IMMUTABLE since collations vary across versions and platforms. Any clarification would be appreciated. I've been googling around and checking the archives, but haven't found these specific details addressed, so far.

Ah, and I have no clue how much difference it even makes to mark a function as IMMUTABLE instead of STABLE. If the difference is more theoretical than practical, I can feel comfortable using STABLE, when unclear.

Thank you!

---
-- array_sum(int8[]) : int8
---
CREATE OR REPLACE FUNCTION tools.array_sum(array_in int8[])
RETURNS int8
AS $BODY$
    SELECT SUM(element) AS result
      FROM UNNEST(array_in) AS element;
$BODY$
LANGUAGE sql
IMMUTABLE;

-- Add a comment to describe the function
COMMENT ON FUNCTION tools.array_sum(int8[]) IS 'Sum an int8[] array.';

-- Set the function's owner to USER_BENDER
ALTER FUNCTION tools.array_sum(int8[]) OWNER TO user_bender;

---
-- array_sum(real[]) : real
---
CREATE OR REPLACE FUNCTION tools.array_sum(array_in real[])
RETURNS real
AS $BODY$
    SELECT SUM(element) AS result
      FROM UNNEST(array_in) AS element;
$BODY$
LANGUAGE sql
STABLE; -- Decimal number types seem to change across versions and chips?

-- Add a comment to describe the function
COMMENT ON FUNCTION tools.array_sum(real[]) IS 'Sum a real[] array.';

-- Set the function's owner to USER_BENDER
ALTER FUNCTION tools.array_sum(real[]) OWNER TO user_bender;
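In case it helps anyone reading along, a quick sanity check of the two functions above (the input values are arbitrary):

```sql
-- int8 version: exact integer arithmetic, same result everywhere.
SELECT tools.array_sum(ARRAY[1, 2, 3]::int8[]);   -- 6

-- real version: IEEE float arithmetic; these particular values are
-- exactly representable, so the sum is exact as well.
SELECT tools.array_sum(ARRAY[1.5, 2.5]::real[]);  -- 4
```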
Re: broken master regress tests
Hi

On Fri, Dec 22, 2023 at 0:17, Jeff Davis wrote:

> On Wed, 2023-12-20 at 17:48 -0800, Jeff Davis wrote:
> > Attached.
>
> It appears to increase the coverage. I committed it and I'll see how
> the buildfarm reacts.

I tested it locally (LANG=cs_CZ.UTF-8) without problems.

Regards

Pavel

> Regards,
> Jeff Davis
Re: trying again to get incremental backup
My compiler has the following complaint:

../postgresql/src/backend/postmaster/walsummarizer.c: In function ‘GetOldestUnsummarizedLSN’:
../postgresql/src/backend/postmaster/walsummarizer.c:540:32: error: ‘unsummarized_lsn’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
  540 |         WalSummarizerCtl->pending_lsn = unsummarized_lsn;
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~

I haven't looked closely to see whether there is actually a problem here, but the attached patch at least resolves the warning.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

diff --git a/src/backend/postmaster/walsummarizer.c b/src/backend/postmaster/walsummarizer.c
index 9b5d3cdeb0..0cf6bbe59d 100644
--- a/src/backend/postmaster/walsummarizer.c
+++ b/src/backend/postmaster/walsummarizer.c
@@ -438,7 +438,7 @@ GetOldestUnsummarizedLSN(TimeLineID *tli, bool *lsn_is_exact,
 	LWLockMode	mode = reset_pending_lsn ? LW_EXCLUSIVE : LW_SHARED;
 	int			n;
 	List	   *tles;
-	XLogRecPtr	unsummarized_lsn;
+	XLogRecPtr	unsummarized_lsn = InvalidXLogRecPtr;
 	TimeLineID	unsummarized_tli = 0;
 	bool		should_make_exact = false;
 	List	   *existing_summaries;
Re: date_trunc function in interval version
date_bin has a big problem with DST. For example, if you put the origin in winter time, then the generated bin will be incorrect for a summer input date. date_trunc is resistant to this problem.

My version of date_trunc is additionally more flexible: you can select a more granular interval, 12h, 8h, 6h, 15min, 10min, etc.

John Naylor wrote on 23.12.2023 01:32:

> On Sat, Dec 23, 2023 at 5:26 AM Przemysław Sztoch wrote:
>
>> In my opinion date_trunc is very good name.
>> Truncated data is timestamp type, not interval.
>> First parameter has same meaning in original date_trunc and in my new version.
>> New version provides only more granularity.
>
> I haven't looked at the patch, but your description sounds awfully close
> to date_bin(), which already takes an arbitrary interval.

-- 
Przemysław Sztoch | Mobile +48 509 99 00 66
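To make the DST hazard concrete, a minimal sketch (the timezone, origin, and input values are illustrative):

```sql
SET timezone = 'Europe/Warsaw';  -- UTC+1 in winter, UTC+2 in summer

-- Origin chosen in winter time; date_bin()'s daily bins are 24 hours
-- of elapsed time, so after the spring-forward transition they land
-- on 01:00 local time instead of midnight:
SELECT date_bin('1 day',
                timestamptz '2023-07-01 00:30',
                timestamptz '2023-01-01 00:00');
-- 2023-06-30 01:00:00+02

-- date_trunc() follows local (wall-clock) time and always truncates
-- to local midnight:
SELECT date_trunc('day', timestamptz '2023-07-01 00:30');
-- 2023-07-01 00:00:00+02
```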
Re: Are operations on real values IMMUTABLE or STABLE?
Morris de Oryx writes:

> I've got a small question about marking functions working with decimal
> number types as either IMMUTABLE or STABLE. Below are a pair of trivial
> functions that show what I'm guessing. An int8/int8[] seems like it's going
> to be immutable forever. However, decimal types aren't quite so crisp and
> consistent. Does this mean that I need to mark such a function as STABLE
> instead of IMMUTABLE, like below?

I think you're overthinking it. We have no hesitation about marking built-in floating-point functions as immutable, so if you're worried about some other machine hypothetically delivering different results, you're in trouble anyway. (In practice, the whole world is supposedly compliant with IEEE float arithmetic, so such cases shouldn't arise.)

> Ah, and I have no clue how much difference it even makes to mark a function
> as IMMUTABLE instead of STABLE. If the difference is more theoretical than
> practical, I can feel comfortable using STABLE, when unclear.

It's entirely not theoretical. The system won't let you use a non-IMMUTABLE function in an index definition or generated column, and there are significant query-optimization implications as well. So generally people tend to err on the side of marking things IMMUTABLE if it's at all plausible to do so. In the worst case you might end up having to reindex, or rebuild generated columns, should the function's behavior actually change. Frequently that risk is well worth taking.

			regards, tom lane
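For what it's worth, the index restriction is easy to see directly; a minimal sketch using the array_sum() function from upthread (table and index names are arbitrary):

```sql
CREATE TABLE measurements (vals int8[]);

-- With tools.array_sum(int8[]) marked IMMUTABLE, this is accepted:
CREATE INDEX measurements_sum_idx
    ON measurements ((tools.array_sum(vals)));

-- Had the function been marked STABLE instead, the same statement
-- would be rejected with:
--   ERROR:  functions in index expression must be marked IMMUTABLE
```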
Re: Fixing pgbench init overflow
On Sat, Dec 23, 2023 at 03:37:33PM +0800, Japin Li wrote:

> Thanks for your confirmation! Please consider the v2 patch to review.

This oversight is from me via commit e35cc3b3f2d0. I'll take care of it. Thanks for the report!

-- 
Michael
Re: Remove MSVC scripts from the tree
On Fri, Dec 22, 2023 at 09:07:21AM -0500, Andrew Dunstan wrote:

> I've done it a bit differently, but the same idea. I have tested that what I
> committed passes checks on Unix and works on Windows.

Sounds fine by me. Thanks for the quick turnaround!

-- 
Michael
Re: Commitfest manager January 2024
On Sat, Dec 23, 2023 at 08:52:38AM +0530, vignesh C wrote:

> I didn't see anyone volunteering for the January Commitfest, so I'll
> volunteer to be CF manager for January 2024 Commitfest.

(Adding Magnus in CC.)

That would be really helpful. Thanks for helping! Do you have admin rights on the CF app? You are going to require them in order to mark the CF as in-process, and you would also need to switch the CF after that from "Future" to "Open" so that people can still post patches once the January one begins.

-- 
Michael
Re: pg_stat_statements: more test coverage
On Sat, Dec 23, 2023 at 03:18:01PM +0100, Peter Eisentraut wrote:

> +/* LCOV_EXCL_START */
> +PG_FUNCTION_INFO_V1(pg_stat_statements);
>  PG_FUNCTION_INFO_V1(pg_stat_statements_1_2);
> +/* LCOV_EXCL_STOP */

The only reason why I've seen this used at the C level was to bump up the coverage requirements because of some internal company projects. I'm pretty sure I have proposed in the past at least one patch that would make use of that, but it got rejected. It is not the only code area that has a similar pattern. So why introduce that now?

> Subject: [PATCH v1 2/5] pg_stat_statements: Add coverage for
>  pg_stat_statements_1_8()
>
> [...]
>
> Subject: [PATCH v1 3/5] pg_stat_statements: Add coverage for
>  pg_stat_statements_reset_1_7

Yep, why not.

> +SELECT format('create table t%s (a int)', lpad(i::text, 3, '0')) FROM generate_series(1, 101) AS x(i) \gexec
> +create table t001 (a int)
> [...]
> +create table t101 (a int)

That's a lot of bloat. This relies on pg_stat_statements.max's default being 100. Based on the regression tests, the maximum number of rows we have reported from the view pg_stat_statements is 39, in utility.c. I think that you should just:

- Use a DO block or a PL function to ensure an amount of N queries, say with something like this after tweaking pg_stat_statements.track:

CREATE OR REPLACE FUNCTION create_tables(num_tables int)
  RETURNS VOID AS
$func$
BEGIN
  FOR i IN 1..num_tables LOOP
    EXECUTE format('CREATE TABLE IF NOT EXISTS %I (id int)', 't_' || i);
  END LOOP;
END
$func$ LANGUAGE plpgsql;

- Switch the minimum to be around 40~50 in the local pg_stat_statements.conf used for the regression tests.

> +SELECT count(*) <= 100 AND count(*) > 0 FROM pg_stat_statements;

You could fetch the max value with \gset and reuse it here, as well.

> +is( $node->safe_psql(
> +        'postgres',
> +        "SELECT count(*) FROM pg_stat_statements WHERE query LIKE '%t1%'"),
> +    '2',
> +    'pg_stat_statements data kept across restart');

Checking that the contents match would be a bit more verbose than just a count. One trick I've used in the patch is in 027_stream_regress.pl, where there is a query grouping the stats depending on the beginning of the queries. Not exact, but a bit more verbose.

-- 
Michael
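For the \gset suggestion, something along these lines could work (the variable name is arbitrary):

```sql
-- Capture the current setting into a psql variable once...
SELECT current_setting('pg_stat_statements.max') AS max_entries \gset

-- ...then reuse it instead of hard-coding 100:
SELECT count(*) <= :max_entries AND count(*) > 0
  FROM pg_stat_statements;
```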
Re: Are operations on real values IMMUTABLE or STABLE?
> I think you're overthinking it.

*Moi*? Never happens ;-)

Fantastic answer, thanks very much for giving me all of these details. Coming from you, I'll take it as authoritative and run with it.
Re: Emit fewer vacuum records by reaping removable tuples during pruning
On Mon, Nov 13, 2023 at 07:06:15PM -0500, Melanie Plageman wrote:

> As best I can tell, our best case scenario is Thomas' streaming read API
> goes in, we add vacuum as a user, and we can likely remove the skip
> range logic.

This does not prevent the work you've been doing in 0001 and 0002 posted upthread, right? Some progress is always better than no progress, and I can see the appeal behind doing 0001, actually, to keep the updates of the block numbers closer to where we determine if relation truncation is safe or not, rather than use an intermediate state in LVPagePruneState. 0002 is much, much, much trickier..

-- 
Michael
Re: pgsql: Prevent tuples to be marked as dead in subtransactions on standb
On Tue, Dec 19, 2023 at 03:15:42PM -0500, Robert Haas wrote:

> I don't think this is a good commit message. It's totally unclear what
> it means, and when I opened up the diff to try to see what was
> changed, it looked nothing like what I expected.

(Not my intention to ignore you here, my head has been underwater for some time.)

> I think a better message would have been something like
> "startedInRecovery flag must be propagated to subtransactions".

You are right, sorry about that. Something like "Propagate startedInRecovery properly in subtransactions started during recovery" would have been better than what I used.

> And I think there should have been some analysis in the commit message or
> the comments within the commit itself of whether it was intended that
> this be propagated to subtransactions or not. It's hard to understand
> why the flag would have been placed in the TransactionState if it
> applied globally to the transaction and all subtransactions, but maybe
> that's what happened.

Alvaro has mentioned something like that on the original thread, where we could use comments when a transaction state is pushed to a subtransaction to track better the fields used and/or not used. Documenting all that more at the top of TransactionStateData is something we should do.

> Instead of discussing that issue, your commit message focuses in the
> user-visible consequences, but in a sort of baffling way. The
> statement that "Dead tuples are ignored and are not marked as dead
> during recovery," for example, is clearly false on its face. If
> recovery didn't mark dead tuples as dead, it would be completely
> broken.

Rather than "dead" tuples, I meant "killed" tuples in this sentence. Killed tuples are ignored during recovery.

-- 
Michael
Re: pg_upgrade --copy-file-range
On Sat, Dec 23, 2023 at 09:52:59AM +1300, Thomas Munro wrote:

> As it happens I was just thinking about this particular patch because
> I suddenly had a strong urge to teach pg_combinebackup to use
> copy_file_range. I wonder if you had the same idea...

Yeah, +1. That would make copy_file_blocks() more efficient, where the code is copying 50 blocks in batches because it needs to reassign checksums to the blocks copied.

-- 
Michael