Re: Add contrib/pg_logicalsnapinspect
Hi, On Mon, Sep 16, 2024 at 10:16:16PM -0700, David G. Johnston wrote: > On Monday, September 16, 2024, shveta malik wrote: > > > On Tue, Sep 17, 2024 at 10:18 AM shveta malik > > wrote: > > > > > > Thanks for addressing the comments. I have not started reviewing v4 > > > yet, but here are few more comments on v3: > > > > > > > I just noticed that when we pass NULL input, both the new functions > > give 1 row as output, all cols as NULL: > > > > newdb1=# SELECT * FROM pg_get_logical_snapshot_meta(NULL); > > magic | checksum | version > > ---+--+- > >| | > > > > (1 row) > > > > Similar behavior with pg_get_logical_snapshot_info(). While the > > existing 'pg_ls_logicalsnapdir' function gives this error, which looks > > more meaningful: > > > > newdb1=# select * from pg_ls_logicalsnapdir(NULL); > > ERROR: function pg_ls_logicalsnapdir(unknown) does not exist > > LINE 1: select * from pg_ls_logicalsnapdir(NULL); > > HINT: No function matches the given name and argument types. You > > might need to add explicit type casts. > > > > > > Shouldn't the new functions have same behavior? > > > > No. Since the name pg_ls_logicalsnapdir has zero single-argument > implementations passing a null value as an argument is indeed attempt to > invoke a function signature that doesn’t exist. Agree. > I can see an argument that they should produce an empty set instead of a > single all-null row, Yeah, it's outside the scope of this patch but I've seen different behavior in this area. For example: postgres=# select * from pg_ls_replslotdir(NULL); name | size | modification --+--+-- (0 rows) as compared to: postgres=# select * from pg_walfile_name_offset(NULL); file_name | file_offset ---+- | (1 row) I thought that it might be linked to the volatility but it is not: postgres=# select * from pg_stat_get_xact_blocks_fetched(NULL); pg_stat_get_xact_blocks_fetched - (1 row) postgres=# select * from pg_get_multixact_members(NULL); xid | mode -+-- (0 rows) while both are volatile. I think both make sense: It's "empty" or we "don't know the values of the fields". I don't know if there is any reason about this "inconsistency". Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
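For illustration, returning an empty set rather than a single all-NULL row is straightforward if the function is declared to return SETOF and is not marked STRICT. A minimal sketch (the function name is made up and this is not code from the patch):

    #include "postgres.h"
    #include "fmgr.h"
    #include "funcapi.h"

    PG_FUNCTION_INFO_V1(demo_get_logical_snapshot_info);

    Datum
    demo_get_logical_snapshot_info(PG_FUNCTION_ARGS)
    {
        /* materialize an (initially empty) result set */
        InitMaterializedSRF(fcinfo, 0);

        /* NULL input: emit no rows at all */
        if (PG_ARGISNULL(0))
            return (Datum) 0;

        /* ... otherwise read the snapshot file and add rows ... */

        return (Datum) 0;
    }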
Re: Add contrib/pg_logicalsnapinspect
Hi, On Tue, Sep 17, 2024 at 10:18:35AM +0530, shveta malik wrote: > Thanks for addressing the comments. I have not started reviewing v4 > yet, but here are few more comments on v3: > > 1) > +#include "port/pg_crc32c.h" > > It is not needed in pg_logicalinspect.c as it is already included in > internal_snapbuild.h Yeap, forgot to remove that one when creating the new "internal".h file, done in v5 attached, thanks! > > 2) > + values[0] = Int16GetDatum(ondisk.builder.state); > > + values[8] = LSNGetDatum(ondisk.builder.last_serialized_snapshot); > + values[9] = TransactionIdGetDatum(ondisk.builder.next_phase_at); > + values[10] = Int64GetDatum(ondisk.builder.committed.xcnt); > > We can have values[i++] in all the places and later we can check : > Assert(i == PG_GET_LOGICAL_SNAPSHOT_INFO_COLS); > Then we need not to keep track of number even in later part of code, > as it goes till 14. Right, let's do it that way (as it is done in pg_walinspect for example). > 4) > Most of the output columns in pg_get_logical_snapshot_info() look > self-explanatory except 'state'. Should we have meaningful 'text' here > corresponding to SnapBuildState? Similar to what we do for > 'invalidation_reason' in pg_replication_slots. (SlotInvalidationCauses > for ReplicationSlotInvalidationCause) Yeah we could. I was not sure about that (and that was my first remark in [1]) , as the module is mainly for debugging purpose, I was thinking that the one using it could refer to "snapbuild.h". Let's see what others think. [1]: https://www.postgresql.org/message-id/ZscuZ92uGh3wm4tW%40ip-10-97-1-34.eu-west-3.compute.internal Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com >From ef6ce23b7145d3a1639dc776c56f0b94ae33bff0 Mon Sep 17 00:00:00 2001 From: Bertrand Drouvot Date: Wed, 14 Aug 2024 08:46:05 + Subject: [PATCH v5] Add contrib/pg_logicalinspect Provides SQL functions that allow to inspect logical decoding components. It currently allows to inspect the contents of serialized logical snapshots of a running database cluster, which is useful for debugging or educational purposes. 
--- contrib/Makefile | 1 + contrib/meson.build | 1 + contrib/pg_logicalinspect/.gitignore | 4 + contrib/pg_logicalinspect/Makefile| 31 +++ .../expected/logical_inspect.out | 52 contrib/pg_logicalinspect/logicalinspect.conf | 1 + contrib/pg_logicalinspect/meson.build | 39 +++ .../pg_logicalinspect--1.0.sql| 43 +++ contrib/pg_logicalinspect/pg_logicalinspect.c | 253 ++ .../pg_logicalinspect.control | 5 + .../specs/logical_inspect.spec| 34 +++ doc/src/sgml/contrib.sgml | 1 + doc/src/sgml/filelist.sgml| 1 + doc/src/sgml/pglogicalinspect.sgml| 145 ++ src/backend/replication/logical/snapbuild.c | 190 + src/include/port/pg_crc32c.h | 16 +- src/include/replication/snapbuild.h | 2 +- src/include/replication/snapbuild_internal.h | 204 ++ 18 files changed, 830 insertions(+), 193 deletions(-) 7.6% contrib/pg_logicalinspect/expected/ 5.7% contrib/pg_logicalinspect/specs/ 32.6% contrib/pg_logicalinspect/ 13.2% doc/src/sgml/ 17.4% src/backend/replication/logical/ 4.1% src/include/port/ 18.9% src/include/replication/ diff --git a/contrib/Makefile b/contrib/Makefile index abd780f277..952855d9b6 100644 --- a/contrib/Makefile +++ b/contrib/Makefile @@ -32,6 +32,7 @@ SUBDIRS = \ passwordcheck \ pg_buffercache \ pg_freespacemap \ + pg_logicalinspect \ pg_prewarm \ pg_stat_statements \ pg_surgery \ diff --git a/contrib/meson.build b/contrib/meson.build index 14a8906865..159ff41555 100644 --- a/contrib/meson.build +++ b/contrib/meson.build @@ -46,6 +46,7 @@ subdir('passwordcheck') subdir('pg_buffercache') subdir('pgcrypto') subdir('pg_freespacemap') +subdir('pg_logicalinspect') subdir('pg_prewarm') subdir('pgrowlocks') subdir('pg_stat_statements') diff --git a/contrib/pg_logicalinspect/.gitignore b/contrib/pg_logicalinspect/.gitignore new file mode 100644 index 00..5dcb3ff972 --- /dev/null +++ b/contrib/pg_logicalinspect/.gitignore @@ -0,0 +1,4 @@ +# Generated subdirectories +/log/ +/results/ +/tmp_check/ diff --git a/contrib/pg_logicalinspect/Makefile b/contrib/pg_logicalinspect/Makefile new file mode 100644 index 00..55124514d4 --- /dev/null +++ b/contrib/pg_logicalinspect/Makefile @@ -0,0 +1,31 @@ +# contrib/pg_logicalinspect/Makefile + +MODULE_big = pg_logicalinspect +OBJS = \ + $(WIN32RES) \ + pg_logicalinspect.o +PGFILEDESC = "pg_logicalinspect - functions to inspect logical decoding components" + +EXTENSION = pg_logicalinspect +DATA = pg_logicalinspect--1.0.sql + +EXTRA_INSTALL = contrib/test_decoding
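As a side note, the values[i++] style suggested in the review above would look roughly like this (an abridged sketch, not taken from the patch; only a few of the columns are shown):

    Datum   values[PG_GET_LOGICAL_SNAPSHOT_INFO_COLS] = {0};
    bool    nulls[PG_GET_LOGICAL_SNAPSHOT_INFO_COLS] = {0};
    int     i = 0;

    values[i++] = Int16GetDatum(ondisk.builder.state);
    values[i++] = LSNGetDatum(ondisk.builder.last_serialized_snapshot);
    values[i++] = TransactionIdGetDatum(ondisk.builder.next_phase_at);
    values[i++] = Int64GetDatum(ondisk.builder.committed.xcnt);
    /* ... remaining columns assigned the same way ... */

    /* catches any mismatch between assignments and the declared column count */
    Assert(i == PG_GET_LOGICAL_SNAPSHOT_INFO_COLS);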
Re: Pgoutput not capturing the generated columns
Here are some review comments for v31-0001 (for the docs only) There may be some overlap here with some comments already made for v30-0001 which are not yet addressed in v31-0001. == Commit message 1. When introducing the 'publish_generated_columns' parameter, you must also say this is a PUBLICATION parameter. ~~~ 2. With this enhancement, users can now include the 'include_generated_columns' option when querying logical replication slots using either the pgoutput plugin or the test_decoding plugin. This option, when set to 'true' or '1', instructs the replication system to include generated column information and data in the replication stream. ~ The above is stale information because it still refers to the old name 'include_generated_columns', and to test_decoding which was already removed in this patch. == doc/src/sgml/ddl.sgml 3. + Generated columns may be skipped during logical replication according to the + CREATE PUBLICATION option + + publish_generated_columns. 3a. nit - The linkend is based on the old name instead of the new name. 3b. nit - Better to call this a parameter instead of an option because that is what the CREATE PUBLICATION docs call it. == doc/src/sgml/protocol.sgml 4. + + publish_generated_columns + + +Boolean option to enable generated columns. This option controls +whether generated columns should be included in the string +representation of tuples during logical decoding in PostgreSQL. + + + + Is this even needed anymore? Now that the implementation is using a PUBLICATION parameter, isn't everything determined just by that parameter? I don't see the reason why a protocol change is needed anymore. And, if there is no protocol change needed, then this documentation change is also not needed. 5. - Next, the following message part appears for each column included in - the publication (except generated columns): + Next, the following message parts appear for each column included in + the publication (generated columns are excluded unless the parameter + + publish_generated_columns specifies otherwise): Like the previous comment above, I think everything is now determined by the PUBLICATION parameter. So maybe this should just be referring to that instead. == doc/src/sgml/ref/create_publication.sgml 6. + +publish_generated_columns (boolean) + nit - the ID is based on the old parameter name. ~ 7. + + This option is only available for replicating generated column data from the publisher + to a regular, non-generated column in the subscriber. + IMO remove this paragraph. I really don't think you should be mentioning the subscriber here at all. AFAIK this parameter is only for determining if the generated column will be published or not. What happens at the other end (e.g. logic whether it gets ignored or not by the subscriber) is more like a matrix of behaviours that could be documented in the "Logical Replication" section. But not here. (I removed this in my nitpicks attachment) ~~~ 8. + + This parameter can only be set true if copy_data is + set to false. + IMO remove this paragraph too. The user can create a PUBLICATION before a SUBSCRIPTION even exists so to say it "can only be set..." is not correct. Sure, your patch 0001 does not support the COPY of generated columns but if you want to document that then it should be documented in the CREATE SUBSCRIBER docs. But not here. (I removed this in my nitpicks attachment) TBH, it would be better if patches 0001 and 0002 were merged then you can avoid all this. 
IIUC they were only separate in the first place because 2 different people wrote them. It is not making reviews easier with them split. == Please see the attachment which implements some of the nits above. == Kind Regards, Peter Smith. Fujitsu Australia diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml index 2e7804e..cca54bc 100644 --- a/doc/src/sgml/ddl.sgml +++ b/doc/src/sgml/ddl.sgml @@ -515,8 +515,8 @@ CREATE TABLE people ( Generated columns may be skipped during logical replication according to the - CREATE PUBLICATION option - + CREATE PUBLICATION parameter + publish_generated_columns. diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml index e133dc3..cd20bd4 100644 --- a/doc/src/sgml/ref/create_publication.sgml +++ b/doc/src/sgml/ref/create_publication.sgml @@ -223,7 +223,7 @@ CREATE PUBLICATION name - + publish_generated_columns (boolean) @@ -231,14 +231,6 @@ CREATE PUBLICATION name associated with the publication should be replicated. The default is false. - - This opti
Re: [HACKERS] make async slave to wait for lsn to be replayed
On Tue, Sep 17, 2024 at 9:00 AM Alexander Lakhin wrote: > 16.09.2024 21:55, Alexander Korotkov wrote: > > Please find two patches attached. The first one does minor cleanup > > including misuse of words you've pointed. The second one adds missing > > wait_for_catchup(). That should fix the test failure you've spotted. > > Please, check if it fixes an issue for you. > > Thank you for looking at that! > > Unfortunately, the issue is still here — the test failed for me 6 out of > 10 runs, as below: > [05:14:02.807](0.135s) ok 8 - got error after standby promote > error running SQL: 'psql::1: ERROR: recovery is not in progress > HINT: Waiting for LSN can only be executed during recovery.' > while running 'psql -XAtq -d port=12734 host=/tmp/04hQ75NuXf > dbname='postgres' -f - -v ON_ERROR_STOP=1' with sql 'CALL > pg_wal_replay_wait('0/300F248');' at > .../src/test/recovery/../../../src/test/perl/PostgreSQL/Test/Cluster.pm line > 2140. > > 043_wal_replay_wait_standby.log: > 2024-09-17 05:14:02.714 UTC [1817258] 043_wal_replay_wait.pl ERROR: recovery > is not in progress > 2024-09-17 05:14:02.714 UTC [1817258] 043_wal_replay_wait.pl DETAIL: > Recovery ended before replaying target LSN > 2/570CD648; last replay LSN 0/300F210. > 2024-09-17 05:14:02.714 UTC [1817258] 043_wal_replay_wait.pl STATEMENT: CALL > pg_wal_replay_wait('2/570CD648'); > 2024-09-17 05:14:02.714 UTC [1817155] LOG: checkpoint starting: force > 2024-09-17 05:14:02.714 UTC [1817154] LOG: database system is ready to > accept connections > 2024-09-17 05:14:02.811 UTC [1817270] 043_wal_replay_wait.pl LOG: statement: > CALL pg_wal_replay_wait('0/300F248'); > 2024-09-17 05:14:02.811 UTC [1817270] 043_wal_replay_wait.pl ERROR: recovery > is not in progress > > pg_waldump -p .../t_043_wal_replay_wait_primary_data/pgdata/pg_wal/ > 00010003 > rmgr: Transaction len (rec/tot): 34/34, tx:748, lsn: > 0/0300F1E8, prev 0/0300F1A8, desc: COMMIT > 2024-09-17 05:14:01.654874 UTC > rmgr: Standby len (rec/tot): 50/50, tx: 0, lsn: > 0/0300F210, prev 0/0300F1E8, desc: RUNNING_XACTS > nextXid 749 latestCompletedXid 748 oldestRunningXid 749 > > I wonder, can you reproduce this with that bgwriter's modification? Yes, now I did reproduce. I got that the problem could be that insert LSN is not yet written at primary, thus wait_for_catchup doesn't wait for it. I've workarounded that using pg_switch_wal(). The revised patchset is attached. > I've also found two more "achievements" coined by 3c5db1d6b: > doc/src/sgml/func.sgml: It may also happen that target > lsn is not achieved > src/backend/access/transam/xlog.c- * recovery was ended before > achieving the target LSN. Fixed this at well in 0001. -- Regards, Alexander Korotkov Supabase v2-0001-Minor-cleanup-related-to-pg_wal_replay_wait-proce.patch Description: Binary data v2-0002-Ensure-standby-promotion-point-in-043_wal_replay_.patch Description: Binary data
Re: First draft of PG 17 release notes
On Wed, 11 Sept 2024 at 16:10, Bruce Momjian wrote: > You are right that I do mention changes specifically designed for the > use of extensions, but there is no mention in the commit message of its > use for extensions. In fact, I thought this was too low-level to be of > use for extensions. However, if people feel it should be added, we have > enough time to add it. Another new API that is useful for extension authors is the following one (I'm obviously biased since I'm the author, and I don't know if there's still time): commit 14dd0f27d7cd56ffae9ecdbe324965073d01a9ff Author: Nathan Bossart Date: Thu Jan 4 16:09:34 2024 -0600 Add macros for looping through a List without a ListCell. Many foreach loops only use the ListCell pointer to retrieve the content of the cell, like so: ListCell *lc; foreach(lc, mylist) { int myint = lfirst_int(lc); ... } This commit adds a few convenience macros that automatically declare the loop variable and retrieve the current cell's contents. This allows us to rewrite the previous loop like this: foreach_int(myint, mylist) { ... } > An interesting idea would be > to report all function signature changes in each major release in some > way. I think that might be useful, but it very much depends how long that list gets. If it gets too long I think authors will just try to compile and only look at the ones that break for them.
Re: Detailed release notes
On 2024-Sep-16, Bruce Momjian wrote: > We could try: > >Add backend support for injection points (Michael Paquier) §1 §2 §3 §4 > > and maybe only add numbers if there is more than one commit. Well, this reads like references to sections 1, 2, 3 and 4, but they aren't that, and probably such sections don't even exist. I think the use of the section symbol is misleading and typographically wrong. https://www.monotype.com/resources/punctuation-series-section-sign Maybe it would work to use numbers in square brackets instead: Add backend support for injection points (Michael Paquier) [1] [2] [3] [4] Maybe use the word "commit" for the first one, which would make it clearer what the links are about: Add backend support for injection points (Michael Paquier) [commit 1] [2] [3] [4] and if there's only one, say just "[commit]". -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
Re: First draft of PG 17 release notes
On 2024-Sep-11, Bruce Momjian wrote: > An interesting idea would be to report all function signature changes > in each major release in some way. Hmm, extension authors are going to realize this as soon as they try to compile, so it doesn't seem necessary. Having useful APIs _added_ is a different matter, because those might help them realize that they can remove parts (or #ifdef-out for newer PG versions) of their code, or add new features; there's no Clippit saying "it looks like you're compiling for Postgres 18, would you like to ...?". -- Álvaro HerreraBreisgau, Deutschland — https://www.EnterpriseDB.com/ "The important things in the world are problems with society that we don't understand at all. The machines will become more complicated but they won't be more complicated than the societies that run them."(Freeman Dyson)
Re: Detailed release notes
On Tue, 17 Sept 2024 at 10:12, Alvaro Herrera wrote: > Maybe it would work to use numbers in square brackets instead: > > Add backend support for injection points (Michael Paquier) [1] [2] [3] [4] Another option would be to add them in superscript using the html <sup> tag (or even using some unicode stuff): Add backend support for injection points (Michael Paquier)1, 2, 3, 4 So similar to footnotes in a sense, i.e. if you want details click on the small numbers.
Re: Can we rely on the ordering of paths in pathlist?
Hi Richard Guo I tried to changed the comment, can you help me to check if this is correct?Many thanks. - /* - * get_cheapest_parallel_safe_total_inner - * Find the unparameterized parallel-safe path with the least total cost. - */ + /* get_cheapest_parallel_safe_total_inner + * Skip paths that do not meet the criteria,find the unparameterized parallel-safe path with the least total cost and return NULL if it does not exist. + * + */ Thanks wenhui qiu 于2024年7月31日周三 09:21写道: > Hi Richard Guo > Today is the last day of the commitfest cycle.I think this patch > should be commented ,Except for the comments, I tested it good to me > > > Thanks > > wenhui qiu 于2024年7月25日周四 16:18写道: > >> Hi Richard Guo >> Is it necessary to add some comments here? >> >> >> + if (!innerpath->parallel_safe || >> + !bms_is_empty(PATH_REQ_OUTER(innerpath))) >> + continue; >> + >> + if (matched_path != NULL && >> + compare_path_costs(matched_path, innerpath, TOTAL_COST) <= 0) >> + continue; >> + >> + matched_path = innerpath; >> >> Richard Guo 于2024年1月10日周三 15:08写道: >> >>> >>> On Tue, Apr 11, 2023 at 11:43 AM Andy Fan >>> wrote: >>> On Tue, Apr 11, 2023 at 11:03 AM Richard Guo wrote: > As the comment above add_path() says, 'The pathlist is kept sorted by > total_cost, with cheaper paths at the front.' And it seems that > get_cheapest_parallel_safe_total_inner() relies on this ordering > (without being mentioned in the comments, which I think we should do). > I think the answer for ${subject} should be yes. Per the comments in add_partial_path, we have * add_partial_path * * As in add_path, the partial_pathlist is kept sorted with the cheapest * total path in front. This is depended on by multiple places, which * just take the front entry as the cheapest path without searching. * >>> >>> I'm not sure about this conclusion. Surely we can depend on that the >>> partial_pathlist is kept sorted by total_cost ASC. This is emphasized >>> in the comment of add_partial_path, and also leveraged in practice, such >>> as in many places we just use linitial(rel->partial_pathlist) as the >>> cheapest partial path. >>> >>> But get_cheapest_parallel_safe_total_inner works on pathlist not >>> partial_pathlist. And for pathlist, I'm not sure if it's a good >>> practice to depend on its ordering. Because >>> >>> 1) the comment of add_path only mentions that add_path_precheck relies >>> on this ordering, but it does not mention that other functions also do; >>> >>> 2) other than add_path_precheck, I haven't observed any other functions >>> that rely on this ordering. The only exception as far as I notice is >>> get_cheapest_parallel_safe_total_inner. >>> >>> On the other hand, if we declare that we can rely on the pathlist being >>> sorted in ascending order by total_cost, we should update the comment >>> for add_path to highlight this aspect. We should also include a comment >>> for get_cheapest_parallel_safe_total_inner to clarify why an early exit >>> is possible, similar to what we do for add_path_precheck. Additionally, >>> in several places, we can optimize our code by taking advantage of this >>> fact and terminate the iteration through the pathlist early when looking >>> for the cheapest path of a certain kind. >>> >>> Thanks >>> Richard >>> >>
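For context, the reliance being discussed has essentially the following shape: return the first qualifying path, on the assumption that add_path() keeps pathlist sorted by total_cost ascending, so the first match is also the cheapest one. (A simplified sketch, not the exact in-tree code.)

    static Path *
    cheapest_parallel_safe_total_inner_sketch(List *paths)
    {
        ListCell   *lc;

        foreach(lc, paths)
        {
            Path       *innerpath = (Path *) lfirst(lc);

            if (innerpath->parallel_safe &&
                bms_is_empty(PATH_REQ_OUTER(innerpath)))
                return innerpath;   /* cheapest by construction */
        }

        return NULL;
    }

The patch quoted above removes that assumption by scanning the whole list and keeping track of the cheapest match with compare_path_costs() instead.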
Re: [HACKERS] make async slave to wait for lsn to be replayed
17.09.2024 10:47, Alexander Korotkov wrote: Yes, now I did reproduce. I got that the problem could be that insert LSN is not yet written at primary, thus wait_for_catchup doesn't wait for it. I've workarounded that using pg_switch_wal(). The revised patchset is attached. Thank you for the revised patch! The improved test works reliably for me (100 out of 100 runs passed). Best regards, Alexander
Re: Using per-transaction memory contexts for storing decoded tuples
On Mon, Sep 16, 2024 at 10:43 PM Masahiko Sawada wrote: > > On Fri, Sep 13, 2024 at 3:58 AM Amit Kapila wrote: > > > > On Thu, Sep 12, 2024 at 4:03 AM Masahiko Sawada > > wrote: > > > > > > We have several reports that logical decoding uses memory much more > > > than logical_decoding_work_mem[1][2][3]. For instance in one of the > > > reports[1], even though users set logical_decoding_work_mem to > > > '256MB', a walsender process was killed by OOM because of using more > > > than 4GB memory. > > > > > > As per the discussion in these threads so far, what happened was that > > > there was huge memory fragmentation in rb->tup_context. > > > rb->tup_context uses GenerationContext with 8MB memory blocks. We > > > cannot free memory blocks until all memory chunks in the block are > > > freed. If there is a long-running transaction making changes, its > > > changes could be spread across many memory blocks and we end up not > > > being able to free memory blocks unless the long-running transaction > > > is evicted or completed. Since we don't account fragmentation, block > > > header size, nor chunk header size into per-transaction memory usage > > > (i.e. txn->size), rb->size could be less than > > > logical_decoding_work_mem but the actual allocated memory in the > > > context is much higher than logical_decoding_work_mem. > > > > > > > It is not clear to me how the fragmentation happens. Is it because of > > some interleaving transactions which are even ended but the memory > > corresponding to them is not released? > > In a generation context, we can free a memory block only when all > memory chunks there are freed. Therefore, individual tuple buffers are > already pfree()'ed but the underlying memory blocks are not freed. > I understood this part but didn't understand the cases leading to this problem. For example, if there is a large (and only) transaction in the system that allocates many buffers for change records during decoding, in the end, it should free memory for all the buffers allocated in the transaction. So, wouldn't that free all the memory chunks corresponding to the memory blocks allocated? My guess was that we couldn't free all the chunks because there were small interleaving transactions that allocated memory but didn't free it before the large transaction ended. > > Can we try reducing the size of > > 8MB memory blocks? The comment atop allocation says: "XXX the > > allocation sizes used below pre-date generation context's block > > growing code. These values should likely be benchmarked and set to > > more suitable values.", so do we need some tuning here? > > Reducing the size of the 8MB memory block would be one solution and > could be better as it could be back-patchable. It would mitigate the > problem but would not resolve it. I agree to try reducing it and do > some benchmark tests. If it reasonably makes the problem less likely > to happen, it would be a good solution. > makes sense. -- With Regards, Amit Kapila.
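To make the block-size option concrete: rb->tup_context is created in ReorderBufferAllocate() with 8MB (SLAB_LARGE_BLOCK_SIZE) blocks. One possible experiment, sketched below with purely illustrative sizes that would need the benchmarking mentioned above, is to let the blocks start small and grow:

    /* sketch only; the chosen sizes are assumptions, not a proposal */
    buffer->tup_context = GenerationContextCreate(new_ctx,
                                                  "Tuples",
                                                  SLAB_DEFAULT_BLOCK_SIZE,  /* min: 8kB */
                                                  SLAB_DEFAULT_BLOCK_SIZE,  /* init: 8kB */
                                                  SLAB_LARGE_BLOCK_SIZE);   /* max: 8MB */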
[PATCH] Mention service key word more prominently in pg_service.conf docs
Hi hackers, I just noticed that the docs for pg_service.conf (https://www.postgresql.org/docs/current/libpq-pgservice.html) don't mention the actual key word to use in the libpq connection string until the example in the last sentence, but it mentions the env var in the first paragraph. Here's a patch to make the key word equally prominent. - ilmari >From 380f9a717ff8216201e44c4f2caa34b596b13e7a Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= Date: Tue, 17 Sep 2024 10:17:46 +0100 Subject: [PATCH] Mention the `service` key word more prominently in the pg_service.conf docs The env var was mentioned in the first paragraph, but the key word wasn't mentioned until the last sentence. --- doc/src/sgml/libpq.sgml | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml index 783e8e750b..ef7e3076ce 100644 --- a/doc/src/sgml/libpq.sgml +++ b/doc/src/sgml/libpq.sgml @@ -9317,7 +9317,8 @@ The connection service file allows libpq connection parameters to be associated with a single service name. That service name can then be - specified in a libpq connection string, and the associated settings will be + specified using the service key word + in a libpq connection string, and the associated settings will be used. This allows connection parameters to be modified without requiring a recompile of the libpq-using application. The service name can also be specified using the PGSERVICE environment variable. -- 2.39.5
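For completeness, this is how the key word ends up being used from libpq (the "mydb" service name is made up and assumes a matching [mydb] section in the service file):

    #include <stdio.h>
    #include "libpq-fe.h"

    int
    main(void)
    {
        /* "service=mydb" in the conninfo string is the key word in question;
         * the same selection can be made via the PGSERVICE environment variable. */
        PGconn     *conn = PQconnectdb("service=mydb");

        if (PQstatus(conn) != CONNECTION_OK)
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));

        PQfinish(conn);
        return 0;
    }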
jsonb_strip_nulls with arrays?
Currently: jsonb_strip_nulls ( jsonb ) → jsonb Deletes all object fields that have null values from the given JSON value, recursively. Null values that are not object fields are untouched. > Null values that are not object fields are untouched. Can we revisit this and make it work with arrays, too? Tbh, at first sight that looked like the expected behavior to me. That is, strip nulls from arrays as well. This has been available since 9.5 and iiuc predates lots of the jsonb array work. In practice, though, whenever jsonb_build_array is used (especially with jsonpath), a few nulls do appear in the resulting array most of the time. Currently, there’s no expressive way to remove them. We could also have a jsonb_array_strip_nulls(jsonb) function.
Re: [PATCH] Mention service key word more prominently in pg_service.conf docs
> On 17 Sep 2024, at 11:24, Dagfinn Ilmari Mannsåker wrote: > I just noticed that the docs for pg_service.conf > (https://www.postgresql.org/docs/current/libpq-pgservice.html) don't > mention the actual key word to use in the libpq connection string until > the example in the last sentence, but it mentions the env var in the > first paragraph. Here's a patch to make the key word equally prominent. Fair point, that seems like a good change, will apply. -- Daniel Gustafsson
Re: Pgoutput not capturing the generated columns
Review comments for v31-0001. (I tried to give only new comments, but there might be some overlap with comments I previously made for v30-0001) == src/backend/catalog/pg_publication.c 1. + + if (publish_generated_columns_given) + { + values[Anum_pg_publication_pubgencolumns - 1] = BoolGetDatum(publish_generated_columns); + replaces[Anum_pg_publication_pubgencolumns - 1] = true; + } nit - unnecessary whitespace above here. == src/backend/replication/pgoutput/pgoutput.c 2. prepare_all_columns_bms + /* Iterate the cols until generated columns are found. */ + cols = bms_add_member(cols, i + 1); How does the comment relate to the statement that follows it? ~~~ 3. + * Skip generated column if pubgencolumns option was not + * specified. nit - /pubgencolumns option/publish_generated_columns parameter/ == src/bin/pg_dump/pg_dump.c 4. getPublications: nit - /i_pub_gencolumns/i_pubgencols/ (it's the same information but simpler) == src/bin/pg_dump/pg_dump.h 5. + bool pubgencolumns; } PublicationInfo; nit - /pubgencolumns/pubgencols/ (it's the same information but simpler) == vsrc/bin/psql/describe.c 6. bool has_pubviaroot; + bool has_pubgencol; nit - /has_pubgencol/has_pubgencols/ (plural consistency) == src/include/catalog/pg_publication.h 7. + /* true if generated columns data should be published */ + bool pubgencolumns; } FormData_pg_publication; nit - /pubgencolumns/pubgencols/ (it's the same information but simpler) ~~~ 8. + bool pubgencolumns; PublicationActions pubactions; } Publication; nit - /pubgencolumns/pubgencols/ (it's the same information but simpler) == src/test/regress/sql/publication.sql 9. +-- Test the publication with or without 'PUBLISH_GENERATED_COLUMNS' parameter +SET client_min_messages = 'ERROR'; +CREATE PUBLICATION pub1 FOR ALL TABLES WITH (PUBLISH_GENERATED_COLUMNS=1); +\dRp+ pub1 + +CREATE PUBLICATION pub2 FOR ALL TABLES WITH (PUBLISH_GENERATED_COLUMNS=0); +\dRp+ pub2 9a. nit - Use lowercase for the parameters. ~ 9b. nit - Fix the comment to say what the test is actually doing: "Test the publication 'publish_generated_columns' parameter enabled or disabled" == src/test/subscription/t/031_column_list.pl 10. Later I think you should add another test here to cover the scenario that I was discussing with Sawada-San -- e.g. when there are 2 publications for the same table subscribed by just 1 subscription but having different values of the 'publish_generated_columns' for the publications. == Kind Regards, Peter Smith. Fujitsu Australia diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml index 2e7804e..cca54bc 100644 --- a/doc/src/sgml/ddl.sgml +++ b/doc/src/sgml/ddl.sgml @@ -515,8 +515,8 @@ CREATE TABLE people ( Generated columns may be skipped during logical replication according to the - CREATE PUBLICATION option - + CREATE PUBLICATION parameter + publish_generated_columns. diff --git a/doc/src/sgml/ref/create_publication.sgml b/doc/src/sgml/ref/create_publication.sgml index e133dc3..cd20bd4 100644 --- a/doc/src/sgml/ref/create_publication.sgml +++ b/doc/src/sgml/ref/create_publication.sgml @@ -223,7 +223,7 @@ CREATE PUBLICATION name - + publish_generated_columns (boolean) @@ -231,14 +231,6 @@ CREATE PUBLICATION name associated with the publication should be replicated. The default is false. - - This option is only available for replicating generated column data from the publisher - to a regular, non-generated column in the subscriber. - - - This parameter can only be set true if copy_data is - set to false. 
- diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c index 272b6a1..7ebb851 100644 --- a/src/backend/catalog/pg_publication.c +++ b/src/backend/catalog/pg_publication.c @@ -999,7 +999,7 @@ GetPublication(Oid pubid) pub->pubactions.pubdelete = pubform->pubdelete; pub->pubactions.pubtruncate = pubform->pubtruncate; pub->pubviaroot = pubform->pubviaroot; - pub->pubgencolumns = pubform->pubgencolumns; + pub->pubgencols = pubform->pubgencols; ReleaseSysCache(tup); @@ -1211,7 +1211,7 @@ pg_get_publication_tables(PG_FUNCTION_ARGS) if (att->attisdropped) continue; - if (att->attgenerated && !pub->pubgencolumns) + if (att->attgenerated && !pub->pubgencols) continue; attnums[nattnums++] = att->attnum; diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c index 6242a09..0129db1 100644 --- a/src/backend/commands/publicationcmds.c +++ b/sr
Re: SQL:2011 application time
On 05.09.24 14:09, Peter Eisentraut wrote: On 07.08.24 22:54, Paul Jungwirth wrote: Here are some fixes based on outstanding feedback (some old some new). I have studied your patches v39-0001 through v39-0004, which correspond to what had been reverted plus the new empty range check plus various minor fixes. This looks good to me now, so I propose to go ahead with that. Btw., in your 0003 you point out that this prevents using the WITHOUT OVERLAPS functionality for non-range types. But I think this could be accomplished by adding an "is empty" callback as a support function or something like that. I'm not suggesting to do that here, but it might be worth leaving a comment about that possibility. I have committed these, as explained here. I look forward to an updated patch set from you to review the "FOR PORTION OF" patches next.
Re: Add contrib/pg_logicalsnapinspect
On Tue, Sep 17, 2024 at 10:46 AM David G. Johnston wrote: > > > > On Monday, September 16, 2024, shveta malik wrote: >> >> On Tue, Sep 17, 2024 at 10:18 AM shveta malik wrote: >> > >> > Thanks for addressing the comments. I have not started reviewing v4 >> > yet, but here are few more comments on v3: >> > >> >> I just noticed that when we pass NULL input, both the new functions >> give 1 row as output, all cols as NULL: >> >> newdb1=# SELECT * FROM pg_get_logical_snapshot_meta(NULL); >> magic | checksum | version >> ---+--+- >>| | >> >> (1 row) >> >> Similar behavior with pg_get_logical_snapshot_info(). While the >> existing 'pg_ls_logicalsnapdir' function gives this error, which looks >> more meaningful: >> >> newdb1=# select * from pg_ls_logicalsnapdir(NULL); >> ERROR: function pg_ls_logicalsnapdir(unknown) does not exist >> LINE 1: select * from pg_ls_logicalsnapdir(NULL); >> HINT: No function matches the given name and argument types. You >> might need to add explicit type casts. >> >> >> Shouldn't the new functions have same behavior? > > > No. Since the name pg_ls_logicalsnapdir has zero single-argument > implementations passing a null value as an argument is indeed attempt to > invoke a function signature that doesn’t exist. > > If there is exactly one single input argument function of the given name the > parser is going to cast the null literal to the data type of the single > argument and invoke the function. It will not and cannot be convinced to > fail to find a matching function. Okay, understood. Thanks for explaining. > I can see an argument that they should produce an empty set instead of a > single all-null row, but the idea that they wouldn’t even be found is > contrary to a core design of the system. Okay, a single row can be investigated if it comes under this scope. But I see why 'ERROR' is not a possibility here. thanks Shveta
Re: per backend I/O statistics
Hi, On Fri, 13 Sept 2024 at 19:09, Bertrand Drouvot wrote: > On Fri, Sep 13, 2024 at 04:45:08PM +0300, Nazir Bilal Yavuz wrote: > > - The pgstat_reset_io_counter_internal() is called in the > > pgstat_shutdown_hook(). This causes the stats_reset column showing the > > termination time of the old backend when its proc num is reassigned to > > a new backend. > > doh! Nice catch, thanks! > > And also new backends that are not re-using a previous "existing" process slot > are getting the last reset time (if any). So the right place to fix this is in > pgstat_io_init_backend_cb(): done in v4 attached. v4 also sets the reset time > to > 0 in pgstat_shutdown_hook() (but that's not necessary though, that's just to > be > right "conceptually" speaking). Thanks, I confirm that it is fixed. You mentioned some refactoring upthread: On Fri, 6 Sept 2024 at 18:03, Bertrand Drouvot wrote: > > - If we agree on the current proposal then I'll do some refactoring in > pg_stat_get_backend_io(), pg_stat_get_my_io() and pg_stat_get_io() to avoid > duplicated code (it's not done yet to ease the review). Could we remove pg_stat_get_my_io() completely and use pg_stat_get_backend_io() with the current backend's pid to get the current backend's stats? If you meant the same thing, please don't mind it. -- Regards, Nazir Bilal Yavuz Microsoft
Re: Psql meta-command conninfo+
Hi, On Mon, Sep 16, 2024 at 8:31 PM Tom Lane wrote: > Alvaro Herrera writes: > > On 2024-Sep-16, Jim Jones wrote: > >> * The value of "Current User" does not match the function current_user() > >> --- as one might expcect. It is a little confusing, as there is no > >> mention of "Current User" in the docs. In case this is the intended > >> behaviour, could you please add it to the docs? > > > It is intended. As Peter said[1], what we wanted was to display > > client-side info, so PQuser() is the right thing to do. Now maybe > > "Current User" is not the perfect column header, but at least the > > definition seems consistent with the desired end result. > > Seems like "Session User" would be closer to being accurate, since > PQuser()'s result does not change when you do SET ROLE etc. > > > Now, I think > > the current docs saying to look at session_user() are wrong, they should > > point to the libpq docs for the function instead; something like "The > > name of the current user, as returned by PQuser()" and so on. > > Sure, but this does not excuse choosing a misleading column name > when there are better choices readily available. > Maybe we can rename "Current User" to "Authenticated User" just like the previous author because it is a user returned by PQuser(). For the "Session User", I believe it is working as expected, since session_user can be changed with SET SESSION AUTHORIZATION. ``` $ bin/psql "port=5430 sslmode=disable dbname=postgres" -x -h localhost postgres=# \conninfo+ Connection Information -[ RECORD 1 ]+-- Database | postgres Current User | hunaid Session User | hunaid Host | localhost Host Address | 127.0.0.1 Port | 5430 Protocol Version | 3 SSL Connection | false GSSAPI Authenticated | false Client Encoding | UTF8 Server Encoding | UTF8 Backend PID | 1337 postgres=# set SESSION AUTHORIZATION postgres; SET postgres=# \conninfo+ Connection Information -[ RECORD 1 ]+-- Database | postgres Current User | hunaid Session User | postgres Host | localhost Host Address | 127.0.0.1 Port | 5430 Protocol Version | 3 SSL Connection | false GSSAPI Authenticated | false Client Encoding | UTF8 Server Encoding | UTF8 Backend PID | 1337 ``` We can update the docs as follows: Authenticated User: The name of the user returned by PQuser(). Session User: The session user's name. Opinions? Regards, Hunaid Sohail
Re: Parallel workers stats in pg_stat_database
Here is an updated patch fixing the aforementionned problems with tests and vacuum stats. -- Benoit Lobréau Consultant http://dalibo.comFrom 1b52f5fb4e977599b8925c69193d31148042ca7d Mon Sep 17 00:00:00 2001 From: benoit Date: Wed, 28 Aug 2024 02:27:13 +0200 Subject: [PATCH] Adds four parallel workers stat columns to pg_stat_database * parallel_workers_planned * parallel_workers_launched * parallel_maint_workers_planned * parallel_maint_workers_launched --- doc/src/sgml/monitoring.sgml | 36 +++ src/backend/access/brin/brin.c| 4 +++ src/backend/access/nbtree/nbtsort.c | 4 +++ src/backend/catalog/system_views.sql | 4 +++ src/backend/commands/vacuumparallel.c | 5 +++ src/backend/executor/execMain.c | 7 src/backend/executor/execUtils.c | 3 ++ src/backend/executor/nodeGather.c | 3 ++ src/backend/executor/nodeGatherMerge.c| 3 ++ src/backend/utils/activity/pgstat_database.c | 36 +++ src/backend/utils/adt/pgstatfuncs.c | 12 +++ src/include/catalog/pg_proc.dat | 20 +++ src/include/nodes/execnodes.h | 3 ++ src/include/pgstat.h | 7 src/test/regress/expected/rules.out | 4 +++ src/test/regress/expected/select_parallel.out | 27 ++ src/test/regress/expected/vacuum_parallel.out | 19 ++ src/test/regress/sql/select_parallel.sql | 15 src/test/regress/sql/vacuum_parallel.sql | 11 ++ 19 files changed, 223 insertions(+) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 55417a6fa9..8c4b11c11d 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -3508,6 +3508,42 @@ description | Waiting for a newly initialized WAL file to reach durable storage + + + parallel_workers_planned bigint + + + Number of parallel workers planned by queries on this database + + + + + + parallel_workers_launched bigint + + + Number of parallel workers obtained by queries on this database + + + + + + parallel_maint_workers_planned bigint + + + Number of parallel workers planned by utilities on this database + + + + + + parallel_maint_workers_launched bigint + + + Number of parallel workers obtained by utilities on this database + + + stats_reset timestamp with time zone diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c index 6467bed604..9eceb87b52 100644 --- a/src/backend/access/brin/brin.c +++ b/src/backend/access/brin/brin.c @@ -2540,6 +2540,10 @@ _brin_end_parallel(BrinLeader *brinleader, BrinBuildState *state) /* Shutdown worker processes */ WaitForParallelWorkersToFinish(brinleader->pcxt); + pgstat_update_parallel_maint_workers_stats( + (PgStat_Counter) brinleader->pcxt->nworkers_to_launch, + (PgStat_Counter) brinleader->pcxt->nworkers_launched); + /* * Next, accumulate WAL usage. (This must wait for the workers to finish, * or we might get incomplete data.) diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c index f5d7b3b0c3..232e1a0942 100644 --- a/src/backend/access/nbtree/nbtsort.c +++ b/src/backend/access/nbtree/nbtsort.c @@ -1611,6 +1611,10 @@ _bt_end_parallel(BTLeader *btleader) /* Shutdown worker processes */ WaitForParallelWorkersToFinish(btleader->pcxt); + pgstat_update_parallel_maint_workers_stats( + (PgStat_Counter) btleader->pcxt->nworkers_to_launch, + (PgStat_Counter) btleader->pcxt->nworkers_launched); + /* * Next, accumulate WAL usage. (This must wait for the workers to finish, * or we might get incomplete data.) 
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index 19cabc9a47..48bf9e5535 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -1073,6 +1073,10 @@ CREATE VIEW pg_stat_database AS pg_stat_get_db_sessions_abandoned(D.oid) AS sessions_abandoned, pg_stat_get_db_sessions_fatal(D.oid) AS sessions_fatal, pg_stat_get_db_sessions_killed(D.oid) AS sessions_killed, +pg_stat_get_db_parallel_workers_planned(D.oid) as parallel_workers_planned, +pg_stat_get_db_parallel_workers_launched(D.oid) as parallel_workers_launched, +pg_stat_get_db_parallel_maint_workers_planned(D.oid) as parallel_maint_workers_planned, +pg_stat_get_db_parallel_maint_workers_launched(D.oid) as parallel_maint_workers_launched, pg_stat_get_db_stat_reset_time(D.oid) AS stats_re
Re: Exit walsender before confirming remote flush in logical replication
Thanks for everyone's work on this; I am very interested in it getting into a release. What is the status of this? My use case is Patroni - when it needs to do a failover, it shuts down the primary. However, large transactions can cause it to stay in the "shutting down" state for a long time, which means your entire HA system is now non-functional. I like the idea of a new flag. I'll test this out soon if the original authors want to make a rebased patch. This thread is old, so if I don't hear back in a bit, I'll create and test a new one myself. :) Cheers, Greg
Re: per backend I/O statistics
Hi, On Tue, Sep 17, 2024 at 02:52:01PM +0300, Nazir Bilal Yavuz wrote: > Hi, > > On Fri, 13 Sept 2024 at 19:09, Bertrand Drouvot > wrote: > > On Fri, Sep 13, 2024 at 04:45:08PM +0300, Nazir Bilal Yavuz wrote: > > > - The pgstat_reset_io_counter_internal() is called in the > > > pgstat_shutdown_hook(). This causes the stats_reset column showing the > > > termination time of the old backend when its proc num is reassigned to > > > a new backend. > > > > doh! Nice catch, thanks! > > > > And also new backends that are not re-using a previous "existing" process > > slot > > are getting the last reset time (if any). So the right place to fix this is > > in > > pgstat_io_init_backend_cb(): done in v4 attached. v4 also sets the reset > > time to > > 0 in pgstat_shutdown_hook() (but that's not necessary though, that's just > > to be > > right "conceptually" speaking). > > Thanks, I confirm that it is fixed. Thanks! > You mentioned some refactoring upthread: > > On Fri, 6 Sept 2024 at 18:03, Bertrand Drouvot > wrote: > > > > - If we agree on the current proposal then I'll do some refactoring in > > pg_stat_get_backend_io(), pg_stat_get_my_io() and pg_stat_get_io() to avoid > > duplicated code (it's not done yet to ease the review). > > Could we remove pg_stat_get_my_io() completely and use > pg_stat_get_backend_io() with the current backend's pid to get the > current backend's stats? The reason why I keep pg_stat_get_my_io() is because (as mentioned in [1]), the statistics snapshot is build for "my backend stats" (means it depends of the stats_fetch_consistency value). I can see use cases for that. OTOH, pg_stat_get_backend_io() behaves as if stats_fetch_consistency is set to none (each execution re-fetches counters from shared memory) (see [2]). Indeed, the snapshot is not build in each backend to copy all the others backends stats, as 1/ I think that there is no use case (there is no need to get others backends I/O statistics while taking care of the stats_fetch_consistency) and 2/ that could be memory expensive depending of the number of max connections. So I think it's better to keep both functions as they behave differently. Thoughts? > If you meant the same thing, please don't > mind it. What I meant to say is to try to remove the duplicate code that we can find in those 3 functions (say creating a new function that would contain the duplicate code and make use of this new function in the 3 others). Did not look at it in details yet. [1]: https://www.postgresql.org/message-id/ZtXR%2BCtkEVVE/LHF%40ip-10-97-1-34.eu-west-3.compute.internal [2]: https://www.postgresql.org/message-id/ZtsZtaRza9bFFeF8%40ip-10-97-1-34.eu-west-3.compute.internal Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: per backend I/O statistics
Hi, On Tue, 17 Sept 2024 at 16:07, Bertrand Drouvot wrote: > On Tue, Sep 17, 2024 at 02:52:01PM +0300, Nazir Bilal Yavuz wrote: > > Could we remove pg_stat_get_my_io() completely and use > > pg_stat_get_backend_io() with the current backend's pid to get the > > current backend's stats? > > The reason why I keep pg_stat_get_my_io() is because (as mentioned in [1]), > the > statistics snapshot is build for "my backend stats" (means it depends of the > stats_fetch_consistency value). I can see use cases for that. > > OTOH, pg_stat_get_backend_io() behaves as if stats_fetch_consistency is set to > none (each execution re-fetches counters from shared memory) (see [2]). > Indeed, > the snapshot is not build in each backend to copy all the others backends > stats, > as 1/ I think that there is no use case (there is no need to get others > backends > I/O statistics while taking care of the stats_fetch_consistency) and 2/ that > could be memory expensive depending of the number of max connections. > > So I think it's better to keep both functions as they behave differently. > > Thoughts? Yes, that is correct. Sorry, you already had explained it and I had agreed with it but I forgot. > > If you meant the same thing, please don't > > mind it. > > What I meant to say is to try to remove the duplicate code that we can find in > those 3 functions (say creating a new function that would contain the > duplicate > code and make use of this new function in the 3 others). Did not look at it in > details yet. I got it, thanks for the explanation. -- Regards, Nazir Bilal Yavuz Microsoft
Re: per backend I/O statistics
Hi, On Tue, Sep 17, 2024 at 04:47:51PM +0300, Nazir Bilal Yavuz wrote: > Hi, > > On Tue, 17 Sept 2024 at 16:07, Bertrand Drouvot > wrote: > > So I think it's better to keep both functions as they behave differently. > > > > Thoughts? > > Yes, that is correct. Sorry, you already had explained it and I had > agreed with it but I forgot. No problem at all! (I re-explained because I'm not always 100% sure that my explanations are crystal clear ;-) ) Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: jsonb_strip_nulls with arrays?
On 2024-09-17 Tu 5:26 AM, Florents Tselai wrote: Currently: |jsonb_strip_nulls| ( |jsonb| ) → |jsonb| Deletes all object fields that have null values from the given JSON value, recursively. Null values that are not object fields are untouched. > Null values that are not object fields are untouched. Can we revisit this and make it work with arrays, too? Tbh, at first sight that looked like the expected behavior for me. That is strip nulls from arrays as well. This has been available since 9.5 and iiuc predates lots of the jsonb array work. I don't think that's a great idea. Removing an object field which has a null value shouldn't have any effect on the surrounding data, nor really any on other operations (If you try to get the value of the missing field it should give you back null). But removing a null array member isn't like that at all - unless it's the trailing member of the array it will renumber all the succeeding array members. And I don't think we should be changing the behaviour of a function, that people might have been relying on for the better part of a decade. In practice, though, whenever jsonb_build_array is used (especially with jsonpath), a few nulls do appear in the resulting array most of the times, Currently, there’s no expressive way to remove this. We could also have jsonb_array_strip_nulls(jsonb) as well We could, if we're going to do anything at all in this area. Another possibility would be to provide a second optional parameter for json{b}_strip_nulls. That's probably a better way to go. cheers andrew -- Andrew Dunstan EDB:https://www.enterprisedb.com
Re: Regression tests fail with tzdata 2024b
Wolfgang Walther writes: > The core regression tests need to be run with a timezone that tests > special cases in the timezone handling code. But that might not be true > for extensions - all they want could be a stable output across major and > minor versions of postgres and versions of tzdata. It could be helpful > to set pg_regress' timezone to UTC, for example? I would not recommend that choice. It would mask simple errors such as failing to apply the correct conversion between timestamptz and timestamp values. Also, if you have test cases that are affected by this issue at all, you probably have a need/desire to test things like DST transitions. regards, tom lane
Re: A starter task
On 2024-09-15 Su 6:17 PM, sia kc wrote: About inlining not sure how it is done with gmail. Maybe should use another email client. Click the three dots with the tooltip "Show trimmed content". Then you can scroll down and put your reply inline. (Personally I detest the Gmail web interface, and use a different MUA, but you can do this even with the Gmail web app.) cheers andrew -- Andrew Dunstan EDB:https://www.enterprisedb.com
Re: [PATCH] Add additional extended protocol commands to psql: \parse and \bind
Hello Michael and Anthonin, 22.08.2024 10:33, Michael Paquier wrote: Looks OK to me. I have spent more time double-checking the whole, and it looks like we're there, so applied. Now let's play with it in more regression tests. Note that the refactoring patch has been merged with the original one in a single commit. Please look at an assertion failure, caused by \bind_named: regression=# SELECT $1 \parse s \bind_named s regression=# \bind_named \bind_named: missing required argument regression=# 1 \g psql: common.c:1501: ExecQueryAndProcessResults: Assertion `pset.stmtName != ((void *)0)' failed. Best regards, Alexander
Re: Adding skip scan (including MDAM style range skip scan) to nbtree
On Mon, Sep 16, 2024 at 3:13 PM Peter Geoghegan wrote: > I agree with your approach, but I'm concerned about it causing > confusion inside _bt_parallel_done. And so I attach a v2 revision of > your bug fix. v2 adds a check that nails that down, too. Pushed this just now. Thanks -- Peter Geoghegan
RE: AIX support
> Do you still need mkldexport.sh? Surely there's a better way to do that > in year 2024. Some quick googling says there's a '-bexpall' option to > 'ld', which kind of sounds like what we want. Will that work? How do > other programs do this? We have noticed couple of caveats with these flags -bexpall/-bexpfull in other opensource tools on AIX. This option would export too many symbols causing problems because a shared library may re-export symbols from another library causing confused dependencies, duplicate symbols. We have similar discussion wrt to these flag in Cmake https://gitlab.kitware.com/cmake/cmake/-/issues/19163 Also, I tried some sample program to verify the same as below >> cat foo.c #include #include int func1() { char str1[] = "Hello ", str2[] = "world! "; strcat(str1,str2); puts(str1); return 0; } >> gcc -c foo.c -o foo.o >> gcc -shared -Wl,-bexpall -o foo.so foo.o >> dump -Tov foo.so foo.so: ***Object Module Header*** # Sections Symbol Ptr # Symbols Opt Hdr Len Flags 4 0x0d8812072 0x3002 Flags=( EXEC DYNLOAD SHROBJ DEP_SYSTEM ) Timestamp = "Sep 17 10:17:35 2024" Magic = 0x1df (32-bit XCOFF) ***Optional Header*** Tsize Dsize Bsize Tstart Dstart 0x0548 0x010c 0x0004 0x1128 0x2670 SNloaderSNentry SNtext SNtoc SNdata 0x0004 0x 0x0001 0x0002 0x0002 TXTalignDATAalign TOC vstamp entry 0x0005 0x0004 0x2750 0x0001 0x maxSTACKmaxDATA SNbss magic modtype 0x 0x 0x0003 0x010bRE ***Loader Section*** ***Loader Symbol Table Information*** [Index] Value Scn IMEX Sclass Type IMPid Name [0] 0x268c.data RW SECdef[noIMid] __rtinit [1] 0xundef IMP DS EXTref libgcc_s.a(shr.o) __cxa_finalize [2] 0xundef IMP DS EXTref libgcc_s.a(shr.o) _GLOBAL__AIXI_shr_o [3] 0xundef IMP DS EXTref libgcc_s.a(shr.o) _GLOBAL__AIXD_shr_o [4] 0xundef IMP DS EXTref libc.a(shr.o) __strtollmax [5] 0xundef IMP DS EXTref libc.a(shr.o) puts [6] 0x26f4.data EXP DS Ldef[noIMid] __init_aix_libgcc_cxa_atexit [7] 0x2724.data EXP DS Ldef[noIMid] _GLOBAL__AIXI_foo_so [8] 0x2730.data EXP DS Ldef[noIMid] _GLOBAL__AIXD_foo_so >>[9] 0x273c.data EXP DS SECdef[noIMid] strcat [10]0x2744.data EXP DS Ldef[noIMid] func1 The code makes use of strcat from libc but re-exports the symbol (because of -bexpall). As of now due to the limitation with these flags (-bexpall / -bexpfull ? ), the solution here is to continue to extract the symbols from the object files and use that export file as part of building the shared library. (Continue to use the mkldexport.sh script to generate the export symbols). Thanks, Sriram.
Re: Conflict detection for update_deleted in logical replication
On Mon, Sep 16, 2024 at 11:53 PM Amit Kapila wrote: > > On Tue, Sep 17, 2024 at 6:08 AM Masahiko Sawada wrote: > > > > On Fri, Sep 13, 2024 at 12:56 AM shveta malik > > wrote: > > > > > > On Fri, Sep 13, 2024 at 11:38 AM Amit Kapila > > > wrote: > > > > > > > > > > > > > > > > So in brief, this solution is only for bidrectional setup? For > > > > > > non-bidirectional, > > > > > > feedback_slots is non-configurable and thus irrelevant. > > > > > > > > > > Right. > > > > > > > > > > > > > One possible idea to address the non-bidirectional case raised by > > > > Shveta is to use a time-based cut-off to remove dead tuples. As > > > > mentioned earlier in my email [1], we can define a new GUC parameter > > > > say vacuum_committs_age which would indicate that we will allow rows > > > > to be removed only if the modified time of the tuple as indicated by > > > > committs module is greater than the vacuum_committs_age. We could keep > > > > this parameter a table-level option without introducing a GUC as this > > > > may not apply to all tables. I checked and found that some other > > > > replication solutions like GoldenGate also allowed similar parameters > > > > (tombstone_deletes) to be specified at table level [2]. The other > > > > advantage of allowing it at table level is that it won't hamper the > > > > performance of hot-pruning or vacuum in general. Note, I am careful > > > > here because to decide whether to remove a dead tuple or not we need > > > > to compare its committs_time both during hot-pruning and vacuum. > > > > > > +1 on the idea, > > > > I agree that this idea is much simpler than the idea originally > > proposed in this thread. > > > > IIUC vacuum_committs_age specifies a time rather than an XID age. > > > > Your understanding is correct that vacuum_committs_age specifies a time. > > > > > But > > how can we implement it? If it ends up affecting the vacuum cutoff, we > > should be careful not to end up with the same result of > > vacuum_defer_cleanup_age that was discussed before[1]. Also, I think > > the implementation needs not to affect the performance of > > ComputeXidHorizons(). > > > > I haven't thought about the implementation details yet but I think > during pruning (for example in heap_prune_satisfies_vacuum()), apart > from checking if the tuple satisfies > HeapTupleSatisfiesVacuumHorizon(), we should also check if the tuple's > committs is greater than configured vacuum_committs_age (for the > table) to decide whether tuple can be removed. Sounds very costly. I think we need to do performance tests. Even if the vacuum gets slower only on the particular table having the vacuum_committs_age setting, it would affect overall autovacuum performance. Also, it would affect HOT pruning performance. > > > > but IIUC this value doesn’t need to be significant; it > > > can be limited to just a few minutes. The one which is sufficient to > > > handle replication delays caused by network lag or other factors, > > > assuming clock skew has already been addressed. > > > > I think that in a non-bidirectional case the value could need to be a > > large number. Is that right? > > > > As per my understanding, even for non-bidirectional cases, the value > should be small. For example, in the case, pointed out by Shveta [1], > where the updates from 2 nodes are received by a third node, this > setting is expected to be small. This setting primarily deals with > concurrent transactions on multiple nodes, so it should be small but I > could be missing something. 
> I might be missing something but the scenario I was thinking of is something below. Suppose that we set up uni-directional logical replication between Node A and Node B (e.g., Node A -> Node B) and both nodes have the same row with key = 1: Node A: T1: UPDATE t SET val = 2 WHERE key = 1; (10:00 AM) -> This change is applied on Node B at 10:01 AM. Node B: T2: DELETE FROM t WHERE key = 1; (05:00 AM) If a vacuum runs on Node B at 06:00 AM, the change of T1 coming from Node A would raise an "update_missing" conflict. On the other hand, if a vacuum runs on Node B at 11:00 AM, the change would raise an "update_deleted" conflict. It looks like whether we detect an "update_deleted" or an "update_missing" conflict depends on the timing of the vacuum, and to avoid such a situation, we would need to set vacuum_committs_age to more than 5 hours. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
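A minimal sketch of the pruning-time check described in this thread, assuming a per-table vacuum_committs_age expressed in seconds; the helper name and its wiring into heap_prune_satisfies_vacuum() are hypothetical, while TransactionIdGetCommitTsData(), HeapTupleHeaderGetUpdateXid(), GetCurrentTimestamp() and TimestampDifferenceExceeds() are existing backend APIs:

#include "postgres.h"
#include "access/commit_ts.h"
#include "access/htup_details.h"
#include "utils/timestamp.h"

/*
 * Hypothetical helper for heap_prune_satisfies_vacuum(): allow removal of a
 * dead tuple only once the commit timestamp of its deleting transaction is
 * older than the table's vacuum_committs_age (assumed to be in seconds,
 * ignoring int overflow for very large settings).
 */
static bool
committs_age_allows_removal(HeapTuple tuple, int vacuum_committs_age)
{
    TransactionId xmax = HeapTupleHeaderGetUpdateXid(tuple->t_data);
    TimestampTz   committs;

    /* No commit timestamp available (e.g. frozen xid): fall back to normal rules. */
    if (!TransactionIdGetCommitTsData(xmax, &committs, NULL))
        return true;

    /* Keep the dead tuple while it is younger than the configured age. */
    return TimestampDifferenceExceeds(committs, GetCurrentTimestamp(),
                                      vacuum_committs_age * 1000);
}

As the follow-up messages note, the cost of doing this extra commit-timestamp lookup for every dead tuple during hot-pruning and vacuum is the main open question.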
Re: AIO v2.0
On Mon, Sep 16, 2024 at 01:51:42PM -0400, Andres Freund wrote: > On 2024-09-16 07:43:49 -0700, Noah Misch wrote: > > On Fri, Sep 06, 2024 at 03:38:16PM -0400, Andres Freund wrote: > > Reattaching descriptors and memory in each child may work, or one could just > > block io_method=io_uring under EXEC_BACKEND. > > I think the latter option is saner Works for me. > > > +pgaio_uring_submit(uint16 num_staged_ios, PgAioHandle **staged_ios) > > > +{ > > > > + if (ret == -EINTR) > > > + { > > > + elog(DEBUG3, "submit EINTR, nios: %d", num_staged_ios); > > > + continue; > > > + } > > > > Since io_uring_submit() is a wrapper around io_uring_enter(), this should > > also > > retry on EAGAIN. "man io_uring_enter" has: > > > > EAGAIN The kernel was unable to allocate memory for the request, or > > otherwise ran out of resources to handle it. The application should wait > > for some completions and try again. > > Hm. I'm not sure that makes sense. We only allow a limited number of IOs to be > in flight for each uring instance. That's different to a use of uring to > e.g. wait for incoming network data on thousands of sockets, where you could > have essentially unbounded amount of requests outstanding. > > What would we wait for? What if we were holding a critical lock in that > moment? Would it be safe to just block for some completions? What if there's > actually no IO in progress? I'd try the following. First, scan for all IOs of all processes at AHS_DEFINED and later, advancing them to AHS_COMPLETED_SHARED. This might be unsafe today, but discovering why it's unsafe likely will inform design beyond EAGAIN returns. I don't specifically know of a way it's unsafe. Do just one pass of that; there may be newer IOs in progress afterward. If submit still gets EAGAIN, sleep a bit and retry. Like we do in pgwin32_open_handle(), fail after a fixed number of iterations. This isn't great if we hold a critical lock, but it beats the alternative of PANIC on the first EAGAIN. > > > +FileStartWriteV(struct PgAioHandle *ioh, File file, > > > + int iovcnt, off_t offset, > > > + uint32 wait_event_info) > > > +{ > > > ... > > > > FileStartWriteV() gets to state AHS_PREPARED, so let's align with the state > > name by calling it FilePrepareWriteV (or FileWriteVPrepare or whatever). > > Hm - that doesn't necessarily seem right to me. I don't think the caller > should assume that the IO will just be prepared and not already completed by > the time FileStartWriteV() returns - we might actually do the IO > synchronously. Yes. Even if it doesn't become synchronous IO, some other process may advance the IO to AHS_COMPLETED_SHARED by the next wake-up of the process that defined the IO. Still, I think this shouldn't use the term "Start" while no state name uses that term. What else could remove that mismatch? > > Is there any specific decision you'd like to settle before patch 6 exits > > WIP? > > Patch 6 specifically? That I really mainly kept separate for review - it No. I'll rephrase as "Is there any specific decision you'd like to settle before the next cohort of patches exits WIP?" > doesn't seem particularly interesting to commit it earlier than 7, or do you > think differently? No, I agree a lone commit of 6 isn't a win. Roughly, the eight patches 6-9,12-15 could be a minimal attractive unit. I've not thought through that grouping much. > In case you mean 6+7 or 6 to ~11, I can think of the following: > > - I am worried about the need for bounce buffers for writes of checksummed > buffers. 
That quickly ends up being a significant chunk of memory, > particularly when using a small shared_buffers with a higher than default > number of connection. I'm currently hacking up a prototype that'd prevent us > from setting hint bits with just a share lock. I'm planning to start a > separate thread about that. AioChooseBounceBuffers() limits usage to 256 blocks (2MB) per MaxBackends. Doing better is nice, but I don't consider this a blocker. I recommend dealing with the worry by reducing the limit initially (128 blocks?). Can always raise it later. > - The header split doesn't yet quite seem right yet I won't have a strong opinion on that one. The aio.c/aio_io.c split did catch my attention. I made a note to check it again once those files get header comments. > - I'd like to implement retries in the later patches, to make sure that it > doesn't have design implications Yes, that's a blocker to me. > - Worker mode needs to be able to automatically adjust the number of running > workers, I think - otherwise it's going to be too hard to tune. Changing that later wouldn't affect much else, so I'd not consider it a blocker. (The worst case is that we think the initial AIO release will be a loss for most users, so we wrap it in debug_ terminology like we did for debug_io_direct. I'm
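A rough sketch of the bounded EAGAIN retry suggested earlier in this message for pgaio_uring_submit(); io_uring_submit(), pg_usleep() and elog() are real APIs, while the drained-completions helper, the retry count and the backoff interval are placeholders, not code from the AIO patch set:

#include "postgres.h"
#include <liburing.h>

/*
 * Sketch only: retry io_uring_submit() a bounded number of times on EAGAIN,
 * reaping already-completed IOs between attempts, in the spirit of
 * pgwin32_open_handle()'s fixed-iteration retry loop.
 */
static int
pgaio_uring_submit_with_retry(struct io_uring *ring)
{
    for (int attempt = 0; attempt < 10; attempt++)
    {
        int         ret = io_uring_submit(ring);

        if (ret != -EAGAIN)
            return ret;         /* submitted, or an unrelated error */

        /*
         * EAGAIN: the kernel ran out of resources.  Reap whatever has
         * already completed, then back off briefly before retrying.
         */
        /* pgaio_drain_local_completions(ring);     hypothetical helper */
        pg_usleep(1000L);       /* 1 ms */
    }

    elog(ERROR, "io_uring_submit() still failing with EAGAIN after retries");
    return 0;                   /* keep compiler quiet */
}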
Re: Using per-transaction memory contexts for storing decoded tuples
On Tue, Sep 17, 2024 at 2:06 AM Amit Kapila wrote: > > On Mon, Sep 16, 2024 at 10:43 PM Masahiko Sawada > wrote: > > > > On Fri, Sep 13, 2024 at 3:58 AM Amit Kapila wrote: > > > > > > On Thu, Sep 12, 2024 at 4:03 AM Masahiko Sawada > > > wrote: > > > > > > > > We have several reports that logical decoding uses memory much more > > > > than logical_decoding_work_mem[1][2][3]. For instance in one of the > > > > reports[1], even though users set logical_decoding_work_mem to > > > > '256MB', a walsender process was killed by OOM because of using more > > > > than 4GB memory. > > > > > > > > As per the discussion in these threads so far, what happened was that > > > > there was huge memory fragmentation in rb->tup_context. > > > > rb->tup_context uses GenerationContext with 8MB memory blocks. We > > > > cannot free memory blocks until all memory chunks in the block are > > > > freed. If there is a long-running transaction making changes, its > > > > changes could be spread across many memory blocks and we end up not > > > > being able to free memory blocks unless the long-running transaction > > > > is evicted or completed. Since we don't account fragmentation, block > > > > header size, nor chunk header size into per-transaction memory usage > > > > (i.e. txn->size), rb->size could be less than > > > > logical_decoding_work_mem but the actual allocated memory in the > > > > context is much higher than logical_decoding_work_mem. > > > > > > > > > > It is not clear to me how the fragmentation happens. Is it because of > > > some interleaving transactions which are even ended but the memory > > > corresponding to them is not released? > > > > In a generation context, we can free a memory block only when all > > memory chunks there are freed. Therefore, individual tuple buffers are > > already pfree()'ed but the underlying memory blocks are not freed. > > > > I understood this part but didn't understand the cases leading to this > problem. For example, if there is a large (and only) transaction in > the system that allocates many buffers for change records during > decoding, in the end, it should free memory for all the buffers > allocated in the transaction. So, wouldn't that free all the memory > chunks corresponding to the memory blocks allocated? My guess was that > we couldn't free all the chunks because there were small interleaving > transactions that allocated memory but didn't free it before the large > transaction ended. We haven't actually checked with the person who reported the problem, so this is just a guess, but I think there were concurrent transactions, including long-running INSERT transactions. For example, suppose a transaction that inserts 10 million rows and many OLTP-like (short) transactions are running at the same time. The scenario I thought of was that one 8MB Generation Context Block contains 1MB of the large insert transaction changes, and the other 7MB contains short OLTP transaction changes. If there are just 256 such blocks, even after all short-transactions have completed, the Generation Context will have allocated 2GB of memory until we decode the commit record of the large transaction, but rb->size will say 256MB. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: Detailed release notes
Em ter., 17 de set. de 2024 às 05:12, Alvaro Herrera < alvhe...@alvh.no-ip.org> escreveu: > Add backend support for injection points (Michael Paquier) [commit 1] [2] > [3] [4] > I think this way would be fine. And it would be good to have a target="_blank" on commit links so a new window or tab would be opened instead of going back and forth. This way you can open these 4 links and then navigate on them to see exactly what was done on every commit. regards Marcos
Re: [PATCH] WIP: replace method for jsonpath
On Sep 16, 2024, at 18:39, Florents Tselai wrote: > Here’s an updated version of this patch. Oh, nice function. But a broader question for hackers: Is replace() specified in the SQL/JSON spec? If not, what’s the process for evaluating whether or not to add features not specified by the spec? > As a future note: > It’s worth noting that both this newly added jspItem and other ones like > (jpiDecimal, jpiString) > use jspGetRightArg and jspGetLeftArg. > left and right can be confusing if more complex methods are added in the > future. > i.e. jsonpath methods with nargs>=3 . > I was wondering if we’d like something like JSP_GETARG(n) So far I think we have only functions defined by the spec, which tend to be unary or binary, so left and right haven’t been an issue. Best, David
Re: [PATCH] WIP: replace method for jsonpath
> On 17 Sep 2024, at 9:40 PM, David E. Wheeler wrote: > > On Sep 16, 2024, at 18:39, Florents Tselai wrote: > >> Here’s an updated version of this patch. > > Oh, nice function. > > But a broader question for hackers: Is replace() specified in the SQL/JSON > spec? If not, what’s the process for evaluating whether or not to add > features not specified by the spec? That’s the first argument I was expecting, and it’s a valid one. From a user’s perspective the answer is: Why not? The more text-processing facilities I have in jsonb, the less back-and-forth-parentheses-fu I do, the easier my life is. From a PG gatekeeping perspective, it’s a more complicated issue. It’s not part of the spec, but I think the jsonb infrastructure in PG is really powerful already and we can build on it, and it can evolve into a superset DSL of jsonpath. For example, apache/age have lift-and-shifted this infra and built another DSL (cypher) on top of it. > >> As a future note: >> It’s worth noting that both this newly added jspItem and other ones like >> (jpiDecimal, jpiString) >> use jspGetRightArg and jspGetLeftArg. >> left and right can be confusing if more complex methods are added in the >> future. >> i.e. jsonpath methods with nargs>=3 . >> I was wondering if we’d like something like JSP_GETARG(n) > > So far I think we have only functions defined by the spec, which tend to be > unary or binary, so left and right haven’t been an issue. If the answer to the Q above is: “we stick to the spec” then there’s no thinking about this. But tbh, I’ve already started experimenting with other text methods in text $.strip() / trim() / upper() / lower() etc. Fallback scenario: make this an extension, but in a first pass I didn’t find any convenient hooks. One has to create a whole new scanner, grammar etc. > > Best, > > David >
Re: query_id, pg_stat_activity, extended query protocol
On Wed, Sep 18, 2024 at 07:50:27AM +0900, Michael Paquier wrote: > On Tue, Sep 17, 2024 at 05:01:18PM -0500, Sami Imseih wrote: > > > Then, please see attached two lightly-updated patches. 0001 is for a > > > backpatch down to v14. This is yours to force things in the exec and > > > bind messages for all portal types, with the test (placed elsewhere in > > > 14~15 branches). 0002 is for HEAD to add some sanity checks, blowing > > > up the tests of pg_stat_statements if one is not careful with the > > > query ID reporting. > > > > These 2 patches look good to me; except for the slight typo > > In the commit message of 0002. "backpatch" instead of "backpatck". > > Yes, I've noticed this one last Friday and fixed the typo in the > commit log after sending the previous patch series. So, I have applied 0001 down to 14, followed by 0002 on HEAD. For the sake of completeness, I have tested all the five PortalStrategys with the extended query protocol and with the sanity checks of 0002 in place and compute_query_id=regress to force the computations, so I'd like to think that we are pretty good now. 0002 is going to be interesting to see moving forward. I am wondering how existing out-of-core extensions will react on that and if it will help catching up any issues. So, let's see how the experiment goes with HEAD on this side. Perhaps we'll have to revert 0002 at the end, or perhaps not... -- Michael signature.asc Description: PGP signature
Re: Conflict detection for update_deleted in logical replication
On Tue, Sep 17, 2024 at 11:24 PM Masahiko Sawada wrote: > > On Mon, Sep 16, 2024 at 11:53 PM Amit Kapila wrote: > > > > On Tue, Sep 17, 2024 at 6:08 AM Masahiko Sawada > > wrote: > > > > I haven't thought about the implementation details yet but I think > > during pruning (for example in heap_prune_satisfies_vacuum()), apart > > from checking if the tuple satisfies > > HeapTupleSatisfiesVacuumHorizon(), we should also check if the tuple's > > committs is greater than configured vacuum_committs_age (for the > > table) to decide whether tuple can be removed. > > Sounds very costly. I think we need to do performance tests. Even if > the vacuum gets slower only on the particular table having the > vacuum_committs_age setting, it would affect overall autovacuum > performance. Also, it would affect HOT pruning performance. > Agreed that we should do some performance testing and additionally think of any better way to implement. I think the cost won't be much if the tuples to be removed are from a single transaction because the required commit_ts information would be cached but when the tuples are from different transactions, we could see a noticeable impact. We need to test to say anything concrete on this. > > > > > > but IIUC this value doesn’t need to be significant; it > > > > can be limited to just a few minutes. The one which is sufficient to > > > > handle replication delays caused by network lag or other factors, > > > > assuming clock skew has already been addressed. > > > > > > I think that in a non-bidirectional case the value could need to be a > > > large number. Is that right? > > > > > > > As per my understanding, even for non-bidirectional cases, the value > > should be small. For example, in the case, pointed out by Shveta [1], > > where the updates from 2 nodes are received by a third node, this > > setting is expected to be small. This setting primarily deals with > > concurrent transactions on multiple nodes, so it should be small but I > > could be missing something. > > > > I might be missing something but the scenario I was thinking of is > something below. > > Suppose that we setup uni-directional logical replication between Node > A and Node B (e.g., Node A -> Node B) and both nodes have the same row > with key = 1: > > Node A: > T1: UPDATE t SET val = 2 WHERE key = 1; (10:00 AM) > -> This change is applied on Node B at 10:01 AM. > > Node B: > T2: DELETE FROM t WHERE key = 1; (05:00 AM) > > If a vacuum runs on Node B at 06:00 AM, the change of T1 coming from > Node A would raise an "update_missing" conflict. On the other hand, if > a vacuum runs on Node B at 11:00 AM, the change would raise an > "update_deleted" conflict. It looks whether we detect an > "update_deleted" or an "updated_missing" depends on the timing of > vacuum, and to avoid such a situation, we would need to set > vacuum_committs_age to more than 5 hours. > Yeah, in this case, it would detect a different conflict (if we don't set vacuum_committs_age to greater than 5 hours) but as per my understanding, the primary purpose of conflict detection and resolution is to avoid data inconsistency in a bi-directional setup. Assume, in the above case it is a bi-directional setup, then we want to have the same data in both nodes. Now, if there are other cases like the one you mentioned that require to detect the conflict reliably than I agree this value could be large and probably not the best way to achieve it. 
I think we can mention in the docs that the primary purpose of this is to achieve data consistency among bi-directional kind of setups. Having said that even in the above case, the result should be the same whether the vacuum has removed the row or not. Say, if the vacuum has not yet removed the row (due to vacuum_committs_age or otherwise) then also because the incoming update has a later timestamp, we will convert the update to insert as per last_update_wins resolution method, so the conflict will be considered as update_missing. And, say, the vacuum has removed the row and the conflict detected is update_missing, then also we will convert the update to insert. In short, if UPDATE has lower commit-ts, DELETE should win and if UPDATE has higher commit-ts, UPDATE should win. So, we can expect data consistency in bidirectional cases and expect a deterministic behavior in other cases (e.g. the final data in a table does not depend on the order of applying the transactions from other nodes). -- With Regards, Amit Kapila.
Re: DROP OWNED BY fails to clean out pg_init_privs grants
=?UTF-8?B?0JXQs9C+0YAg0KfQuNC90LTRj9GB0LrQuNC9?= writes: > This query does not expect that test database may already contain some > information about custom user that ran test_pg_dump-running. I'm perfectly content to reject this as being an abuse of the test case. Our TAP tests are built on the assumption that they use databases created within the test case. Apparently, you've found a way to use the meson test infrastructure to execute a TAP test in the equivalent of "make installcheck" rather than "make check" mode. I am unwilling to buy into the proposition that our TAP tests should be proof against doing that after making arbitrary changes to the database's initial state. If anything, the fact that this is possible is a bug in our meson scripts. regards, tom lane
Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables.
Hi All, I hope you're doing well. I'm writing to kindly ask whether there is a bug tracker ID or any reference number associated with this issue; if so, I would appreciate it if you could share it with me. Thank you for your time and assistance. Please let me know if there's any additional information you need from me. Best regards, Nikhil On Fri, 30 Aug, 2024, 2:23 am Tom Lane, wrote: > Richard Guo writes: > > On Thu, Aug 29, 2024 at 4:47 AM Tom Lane wrote: > >> In the meantime, I think this test case is mighty artificial, > >> and it wouldn't bother me any to just take it out again for the > >> time being. > > > Yeah, I think we can remove the 't1.two+t2.two' test case if we go > > with your proposed patch to address this performance regression. > > Here's a polished-up patchset for that. I made the memoize test > removal a separate patch because (a) it only applies to master > and (b) it seems worth calling out as something we might be able > to revert later. > > I found one bug in the draft patch: add_nulling_relids only processes > Vars of level zero, so we have to apply it before not after adjusting > the Vars' levelsup. An alternative could be to add a levelsup > parameter to add_nulling_relids, but that seemed like unnecessary > complication. > > regards, tom lane > >
Re: Introduce XID age and inactive timeout based replication slot invalidation
On Mon, Sep 16, 2024 at 3:31 PM Bharath Rupireddy wrote: > > Hi, > > > Please find the attached v46 patch having changes for the above review > comments and your test review comments and Shveta's review comments. > Thanks for addressing comments. Is there a reason that we don't support this invalidation on hot standby for non-synced slots? Shouldn't we support this time-based invalidation there too just like other invalidations? thanks Shveta
Re: Conflict Detection and Resolution
On Thu, 12 Sept 2024 at 14:03, Ajin Cherian wrote: > > On Tue, Sep 3, 2024 at 7:42 PM vignesh C wrote: > > On Fri, 30 Aug 2024 at 11:01, Nisha Moond > wrote: > > > > Here is the v11 patch-set. Changes are: > > 1) This command crashes: > ALTER SUBSCRIPTION name RESET CONFLICT RESOLVER FOR NULL; > #0 __strcmp_avx2 () at ../sysdeps/x86_64/multiarch/strcmp-avx2.S:116 > #1 0x55c67270600a in ResetConflictResolver (subid=16404, > conflict_type=0x0) at conflict.c:744 > #2 0x55c67247e0c3 in AlterSubscription (pstate=0x55c6748ff9d0, > stmt=0x55c67497dfe0, isTopLevel=true) at subscriptioncmds.c:1664 > > + | ALTER SUBSCRIPTION name RESET CONFLICT > RESOLVER FOR conflict_type > + { > + AlterSubscriptionStmt *n = > + > makeNode(AlterSubscriptionStmt); > + > + n->kind = > ALTER_SUBSCRIPTION_RESET_CONFLICT_RESOLVER; > + n->subname = $3; > + n->conflict_type = $8; > + $$ = (Node *) n; > + } > + ; > +conflict_type: > + Sconst > { $$ = $1; } > + | NULL_P > { $$ = NULL; } > ; > > May be conflict_type should be changed to: > +conflict_type: > + Sconst > { $$ = $1; } > ; > > > Fixed. > Few comments: 1) This should be in (fout->remoteVersion >= 18) check to support dumping backward compatible server objects, else dump with older version will fail: + /* Populate conflict type fields using the new query */ + confQuery = createPQExpBuffer(); + appendPQExpBuffer(confQuery, + "SELECT confrtype, confrres FROM pg_catalog.pg_subscription_conflict " + "WHERE confsubid = %u;", subinfo[i].dobj.catId.oid); + confRes = ExecuteSqlQuery(fout, confQuery->data, PGRES_TUPLES_OK); + + ntuples = PQntuples(confRes); + for (j = 0; j < ntuples; j++) 2) Can we check and throw an error before the warning is logged in this case as it seems strange to throw a warning first and then an error for the same track_commit_timestamp configuration: postgres=# create subscription sub1 connection ... publication pub1 conflict resolver (insert_exists = 'last_update_wins'); WARNING: conflict detection and resolution could be incomplete due to disabled track_commit_timestamp DETAIL: Conflicts update_origin_differs and delete_origin_differs cannot be detected, and the origin and commit timestamp for the local row will not be logged. ERROR: resolver last_update_wins requires "track_commit_timestamp" to be enabled HINT: Make sure the configuration parameter "track_commit_timestamp" is set. Regards, Vignesh
define pg_structiszero(addr, s, r)
Hi hackers, There is some places where we check that a struct is full of zeroes: pgstat_report_bgwriter() pgstat_report_checkpointer() pgstat_relation_flush_cb() Indeed that's the way we check if there is pending statistics to flush/report. The current code is like (taking pgstat_relation_flush_cb() as an example): " static const PgStat_TableCounts all_zeroes; . . if (memcmp(&lstats->counts, &all_zeroes, sizeof(PgStat_TableCounts)) == 0) . . " The static declaration is not "really" related to the purpose of the function it is declared in. It's there "only" to initialize a memory area with zeroes and to use it in the memcmp. I think it would make sense to "hide" all of this in a new macro, so please find attached a patch proposal doing so (Andres suggested something along those lines in [1] IIUC). The macro is created in pgstat_internal.h as it looks like that "only" the statistics related code would benefit of it currently (could be moved to other header file later on if needed). [1]: https://www.postgresql.org/message-id/20230105002733.ealhzubjaiqis6ua%40awork3.anarazel.de Looking forward to your feedback, Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com >From 2e9071b54f7fd28027759de871b662a3eaa8e4a3 Mon Sep 17 00:00:00 2001 From: Bertrand Drouvot Date: Tue, 10 Sep 2024 01:22:02 + Subject: [PATCH v1] define pg_structiszero(addr, s, r) This new macro allows to test if a memory region (representing a struct "s") starting at addr and of size sizeof(s) is full of zeroes. --- src/backend/utils/activity/pgstat_bgwriter.c | 6 -- src/backend/utils/activity/pgstat_checkpointer.c | 9 + src/backend/utils/activity/pgstat_relation.c | 9 - src/include/utils/pgstat_internal.h | 11 +++ 4 files changed, 24 insertions(+), 11 deletions(-) 69.1% src/backend/utils/activity/ 30.8% src/include/utils/ diff --git a/src/backend/utils/activity/pgstat_bgwriter.c b/src/backend/utils/activity/pgstat_bgwriter.c index 364a7a2024..61cbc27760 100644 --- a/src/backend/utils/activity/pgstat_bgwriter.c +++ b/src/backend/utils/activity/pgstat_bgwriter.c @@ -30,7 +30,7 @@ void pgstat_report_bgwriter(void) { PgStatShared_BgWriter *stats_shmem = &pgStatLocal.shmem->bgwriter; - static const PgStat_BgWriterStats all_zeroes; + bool is_all_zeroes; Assert(!pgStatLocal.shmem->is_shutdown); pgstat_assert_is_up(); @@ -39,7 +39,9 @@ pgstat_report_bgwriter(void) * This function can be called even if nothing at all has happened. In * this case, avoid unnecessarily modifying the stats entry. */ - if (memcmp(&PendingBgWriterStats, &all_zeroes, sizeof(all_zeroes)) == 0) + pg_structiszero(&PendingBgWriterStats, PgStat_BgWriterStats, is_all_zeroes); + + if (is_all_zeroes) return; pgstat_begin_changecount_write(&stats_shmem->changecount); diff --git a/src/backend/utils/activity/pgstat_checkpointer.c b/src/backend/utils/activity/pgstat_checkpointer.c index bbfc9c7e18..3b9b688ae8 100644 --- a/src/backend/utils/activity/pgstat_checkpointer.c +++ b/src/backend/utils/activity/pgstat_checkpointer.c @@ -29,8 +29,7 @@ PgStat_CheckpointerStats PendingCheckpointerStats = {0}; void pgstat_report_checkpointer(void) { - /* We assume this initializes to zeroes */ - static const PgStat_CheckpointerStats all_zeroes; + bool is_all_zeroes; PgStatShared_Checkpointer *stats_shmem = &pgStatLocal.shmem->checkpointer; Assert(!pgStatLocal.shmem->is_shutdown); @@ -40,8 +39,10 @@ pgstat_report_checkpointer(void) * This function can be called even if nothing at all has happened. 
In * this case, avoid unnecessarily modifying the stats entry. */ - if (memcmp(&PendingCheckpointerStats, &all_zeroes, - sizeof(all_zeroes)) == 0) + pg_structiszero(&PendingCheckpointerStats, PgStat_CheckpointerStats, + is_all_zeroes); + + if (is_all_zeroes) return; pgstat_begin_changecount_write(&stats_shmem->changecount); diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c index 8a3f7d434c..09e519a6ff 100644 --- a/src/backend/utils/activity/pgstat_relation.c +++ b/src/backend/utils/activity/pgstat_relation.c @@ -801,7 +801,7 @@ pgstat_twophase_postabort(TransactionId xid, uint16 info, bool pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait) { - static const PgStat_TableCounts all_zeroes; + bool is_all_zeroes; Oid dboid; PgStat_TableStatus *lstats; /* pending stats entry */ PgStatShared_Relation *shtabstats; @@ -816,11 +816,10 @@ pgstat_relation_flush_cb(PgStat_EntryRef *entry_ref, bool nowait) * Ignore entries that didn't accumulate any actual counts, such as * indexes that were opened by the planner but not used. */ - if (memcmp(&lstats->counts, &all_zeroes, - sizeof(PgStat_TableCounts)) == 0) - { + pg_structiszero(&lstats->counts, PgStat_TableCounts, is_all_zeroes); + + if (is_all_zeroes
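The hunk for src/include/utils/pgstat_internal.h is cut off above; judging from the call sites shown in the patch, the macro it adds presumably looks roughly like the following sketch (a reconstruction, not the definition from the attached patch):

#include <string.h>             /* memcmp(); already pulled in by c.h in the backend */

/*
 * pg_structiszero(addr, s, r): set r to true if the memory region starting
 * at addr, of size sizeof(s), contains only zero bytes.  The local static
 * const variable plays the same role as the per-function "all_zeroes"
 * declarations that the patch removes.
 */
#define pg_structiszero(addr, s, r)                                 \
    do {                                                            \
        static const s all_zeroes;                                  \
        (r) = (memcmp((addr), &all_zeroes, sizeof(s)) == 0);        \
    } while (0)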
Re: pg_trgm comparison bug on cross-architecture replication due to different char implementation
On Mon, Sep 16, 2024 at 9:24 AM Noah Misch wrote: > > On Thu, Sep 12, 2024 at 03:42:48PM -0700, Masahiko Sawada wrote: > > On Tue, Sep 10, 2024 at 3:05 PM Noah Misch wrote: > > > On Tue, Sep 10, 2024 at 05:56:47PM -0400, Tom Lane wrote: > > > > Got it. So now I'm wondering if we need all the complexity of storing > > > > stuff in the GIN metapages. Could we simply read the (primary's) > > > > signedness out of pg_control and use that? > > > I've attached a PoC patch for this idea. We write the default char > > signedness to the control file at initdb time. Then when comparing two > > trgms, pg_trgm opclasses use a comparison function based on the char > > signedness of the cluster. I've confirmed that the patch fixes the > > reported case at least. > > I agree that proves the concept. Thanks. I like the simplicity of this approach. If we agree with this approach, I'd like to proceed with it. Regardless of what approach we take, I wanted to provide some regression tests for these changes, but I could not come up with a reasonable idea. It would be great if we could do something like 027_stream_regress.pl on cross-architecture replication. But just executing 027_stream_regress.pl on cross-architecture replication could not be sufficient since we would like to confirm query results as well. If we could have a reasonable tool or way, it would also help find other similar bugs related architectural differences. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
Re: [PATCH] WIP: replace method for jsonpath
On Sep 17, 2024, at 15:03, Florents Tselai wrote: > Fallback scenario: make this an extension, but in a first pass I didn’t find > any convenient hooks. > One has to create a whole new scanner, grammar etc. Yeah, it got me thinking about the RFC-9535 JSONPath "Function Extension" feature[1], which allows users to add functions. Would be cool to have a way to register jsonpath functions somehow, but I would imagine it’d need quite a bit of specification similar to RFC-9535. Wouldn’t surprise me to see something like that appear in a future version of the spec, with an interface something like CREATE OPERATOR. I don’t have a strong feeling about what should be added that’s not in the spec; my main interest is not having to constantly sync my port[2]. I’m already behind, and it’s just been a couple months! 😂 Best, David [1]: https://www.rfc-editor.org/rfc/rfc9535.html#name-function-extensions [2]: https://github.com/theory/sqljson
miscellaneous pg_upgrade cleanup
Here are a few miscellaneous cleanup patches for pg_upgrade. I don't think there's anything controversial here. 0001 removes some extra whitespace in the status message for failed data type checks. I noticed that when the check fails, this status message is indented beyond all the other output. This appears to have been introduced in commit 347758b, so I'd back-patch this one to v17. 0002 improves the coding style in many of the new upgrade task callback functions. I refrained from adjusting this code too much when converting these tasks to use the new pg_upgrade task framework (see commit 40e2e5e), but now I think we should. This decreases the amount of indentation in some places and removes a few dozen lines of code. 0003 adds names to the UpgradeTaskSlotState enum and the UpgradeTaskSlot struct. I'm not aware of any established project policy in this area, but I figured it'd be good to at least be consistent within the same file. Thoughts? -- nathan >From c958311e5fd6df460e0c1289646fe99b9dd5f7f7 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Tue, 17 Sep 2024 11:30:50 -0500 Subject: [PATCH v1 1/3] remove extra whitespace in pg_upgrade report --- src/bin/pg_upgrade/check.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c index 1c277ce17b..c9a3f06b80 100644 --- a/src/bin/pg_upgrade/check.c +++ b/src/bin/pg_upgrade/check.c @@ -419,7 +419,7 @@ process_data_type_check(DbInfo *dbinfo, PGresult *res, void *arg) */ if (!(*state->result)) { - pg_log(PG_REPORT, "failed check: %s", _(state->check->status)); + pg_log(PG_REPORT, "failed check: %s", _(state->check->status)); appendPQExpBuffer(*state->report, "\n%s\n%s%s\n", _(state->check->report_text), _("A list of the problem columns is in the file:"), -- 2.39.3 (Apple Git-146) >From 6f3cd97974eb6f73c1513c4e6093c168c42c Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Tue, 17 Sep 2024 11:57:52 -0500 Subject: [PATCH v1 2/3] improve style of upgrade task callback functions --- src/bin/pg_upgrade/check.c | 205 +++ src/bin/pg_upgrade/version.c | 29 +++-- 2 files changed, 99 insertions(+), 135 deletions(-) diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c index c9a3f06b80..bc4c9b11a0 100644 --- a/src/bin/pg_upgrade/check.c +++ b/src/bin/pg_upgrade/check.c @@ -390,69 +390,55 @@ process_data_type_check(DbInfo *dbinfo, PGresult *res, void *arg) { struct data_type_check_state *state = (struct data_type_check_state *) arg; int ntups = PQntuples(res); + charoutput_path[MAXPGPATH]; + int i_nspname = PQfnumber(res, "nspname"); + int i_relname = PQfnumber(res, "relname"); + int i_attname = PQfnumber(res, "attname"); + FILE *script = NULL; AssertVariableIsOfType(&process_data_type_check, UpgradeTaskProcessCB); - if (ntups) - { - charoutput_path[MAXPGPATH]; - int i_nspname; - int i_relname; - int i_attname; - FILE *script = NULL; - booldb_used = false; + if (ntups == 0) + return; - snprintf(output_path, sizeof(output_path), "%s/%s", -log_opts.basedir, -state->check->report_filename); + snprintf(output_path, sizeof(output_path), "%s/%s", +log_opts.basedir, +state->check->report_filename); - /* -* Make sure we have a buffer to save reports to now that we found a -* first failing check. -*/ - if (*state->report == NULL) - *state->report = createPQExpBuffer(); + /* +* Make sure we have a buffer to save reports to now that we found a first +* failing check. 
+*/ + if (*state->report == NULL) + *state->report = createPQExpBuffer(); - /* -* If this is the first time we see an error for the check in question -* then print a status message of the failure. -*/ - if (!(*state->result)) - { - pg_log(PG_REPORT, "failed check: %s", _(state->check->status)); - appendPQExpBuffer(*state->report, "\n%s\n%s%s\n", - _(state->check->report_text), -
Re: jsonb_strip_nulls with arrays?
On Tue, Sep 17, 2024 at 5:11 PM Andrew Dunstan wrote: > > On 2024-09-17 Tu 5:26 AM, Florents Tselai wrote: > > Currently: > > > jsonb_strip_nulls ( jsonb ) → jsonb > > Deletes all object fields that have null values from the given JSON value, > recursively. Null values that are not object fields are untouched. > > > > Null values that are not object fields are untouched. > > > Can we revisit this and make it work with arrays, too? > > Tbh, at first sight that looked like the expected behavior for me. > > That is strip nulls from arrays as well. > > > This has been available since 9.5 and iiuc predates lots of the jsonb > array work. > > > I don't think that's a great idea. Removing an object field which has a > null value shouldn't have any effect on the surrounding data, nor really > any on other operations (If you try to get the value of the missing field > it should give you back null). But removing a null array member isn't like > that at all - unless it's the trailing member of the array it will renumber > all the succeeding array members. > > And I don't think we should be changing the behaviour of a function, that > people might have been relying on for the better part of a decade. > > > > In practice, though, whenever jsonb_build_array is used (especially with > jsonpath), > > a few nulls do appear in the resulting array most of the times, > > Currently, there’s no expressive way to remove this. > > > We could also have jsonb_array_strip_nulls(jsonb) as well > > > We could, if we're going to do anything at all in this area. Another > possibility would be to provide a second optional parameter for > json{b}_strip_nulls. That's probably a better way to go. > Here's a patch that adds that argument (only for jsonb; no json implementation yet) That's how I imagined & implemented it, but there may be non-obvious pitfalls in the semantics. as-is version select jsonb_strip_nulls('[1,2,null,3,4]'); jsonb_strip_nulls [1, 2, null, 3, 4] (1 row) select jsonb_strip_nulls('{"a":1,"b":null,"c":[2,null,3],"d":{"e":4,"f":null}}'); jsonb_strip_nulls {"a": 1, "c": [2, null, 3], "d": {"e": 4}} (1 row) with the additional boolean flag added select jsonb_strip_nulls('[1,2,null,3,4]', *true*); jsonb_strip_nulls --- [1, 2, 3, 4] (1 row) select jsonb_strip_nulls('{"a":1,"b":null,"c":[2,null,3],"d":{"e":4,"f":null}}', *true*); jsonb_strip_nulls -- {"a": 1, "c": [2, 3], "d": {"e": 4}} (1 row) GH PR view: https://github.com/Florents-Tselai/postgres/pull/6/files > cheers > > > andrew > > > -- > Andrew Dunstan > EDB: https://www.enterprisedb.com > > v1-0002-Add-docs-for-strip_in_arrays-argument.patch Description: Binary data v1-0001-jsonb_strip_nulls-jsonb-bool-wip.patch Description: Binary data
Re: Test improvements and minor code fixes for formatting.c.
On Sun, Sep 08, 2024 at 05:32:16PM -0400, Tom Lane wrote: > In looking at this, I found that there's also no test coverage > for the , V, or PL format codes. Also, the possibility of > overflow while converting an input value to int in order to > pass it to int_to_roman was ignored. Attached is a patch that > adds more test coverage and cleans up the Roman-numeral code > a little bit. I stared at the patch for a while, and it looks good to me. > BTW, I also discovered that there is a little bit of support > for a "B" format code: we can parse it, but then we ignore it. > And it's not documented. Oracle defines this code as a flag > that: > > Returns blanks for the integer part of a fixed-point number > when the integer part is zero (regardless of zeros in the > format model). > > It doesn't seem super hard to implement that, but given the > complete lack of complaints about it being missing, maybe we > should just rip out the incomplete support instead? AFAICT it's been like that since it was introduced [0]. I searched the archives and couldn't find any discussion about this format code. Given that, I don't have any concerns about removing it unless it causes ERRORs for calls that currently succeed, but even then, it's probably fine. This strikes me as something that might be fun for an aspiring hacker, though. [0] https://postgr.es/c/b866d2e -- nathan
Re: query_id, pg_stat_activity, extended query protocol
> Then, please see attached two lightly-updated patches. 0001 is for a > backpatch down to v14. This is yours to force things in the exec and > bind messages for all portal types, with the test (placed elsewhere in > 14~15 branches). 0002 is for HEAD to add some sanity checks, blowing > up the tests of pg_stat_statements if one is not careful with the > query ID reporting. These 2 patches look good to me; except for the slight typo In the commit message of 0002. "backpatch" instead of "backpatck". That leaves us with considering v5-0002 [1]. I do think this is good for overall correctness of the queryId being advertised after a cache revalidation, even if users of pg_stat_activity will hardly notice this. [1] https://www.postgresql.org/message-id/DB325894-3EE3-4B2E-A18C-4B34E7B2F5EC%40gmail.com Regards, Sami
Custom connstr in background_psql()
Hi, hackers! I've noticed that there is no way to specify a custom connection string when calling the PostgreSQL::Test::Cluster->background_psql() method compared to the PostgreSQL::Test:Cluster->psql(). It seems useful to have this feature while testing with BackgroundPsql, for example, when the default host value needs to be overridden to establish different types of connections. What do you think?diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm index 143dc8c101..ff0bda0c8d 100644 --- a/src/test/perl/PostgreSQL/Test/Cluster.pm +++ b/src/test/perl/PostgreSQL/Test/Cluster.pm @@ -2181,6 +2181,11 @@ returned. Set B to 0 to ignore errors instead. Set a timeout for a background psql session. By default, timeout of $PostgreSQL::Test::Utils::timeout_default is set up. +=item connstr => B + +If set, use this as the connection string for the connection to the +backend. + =item replication => B If set, add B to the conninfo string. @@ -2204,12 +2209,23 @@ sub background_psql my $replication = $params{replication}; my $timeout = undef; + # Build the connection string. + my $psql_connstr; + if (defined $params{connstr}) + { + $psql_connstr = $params{connstr}; + } + else + { + $psql_connstr = $self->connstr($dbname); + } + $psql_connstr .= defined $replication ? " replication=$replication" : ""; + my @psql_params = ( $self->installed_command('psql'), '-XAtq', '-d', - $self->connstr($dbname) - . (defined $replication ? " replication=$replication" : ""), + $psql_connstr, '-f', '-');
Re: Adding skip scan (including MDAM style range skip scan) to nbtree
On Mon, Sep 16, 2024 at 6:05 PM Tomas Vondra wrote: > I've been looking at this patch over the couple last days, mostly doing > some stress testing / benchmarking (hence the earlier report) and basic > review. Thanks for taking a look! Very helpful. > I do have some initial review comments, and the testing produced > some interesting regressions (not sure if those are the cases where > skipscan can't really help, that Peter mentioned he needs to look into). The one type of query that's clearly regressed in a way that's just not acceptable are queries where we waste CPU cycles during scans where it's truly hopeless. For example, I see a big regression on one of the best cases for the Postgres 17 work, described here: https://pganalyze.com/blog/5mins-postgres-17-faster-btree-index-scans#a-practical-example-3x-performance-improvement Notably, these cases access exactly the same buffers/pages as before, so this really isn't a matter of "doing too much skipping". The number of buffers hit exactly matches what you'll see on Postgres 17. It's just that we waste too many CPU cycles in code such as _bt_advance_array_keys, to uselessly maintain skip arrays. I'm not suggesting that there won't be any gray area with these regressions -- nothing like this will ever be that simple. But it seems to me like I should go fix these obviously-not-okay cases next, and then see where that leaves everything else, regressions-wise. That seems likely to be the most efficient way of dealing with the regressions. So I'll start there. That said, I *would* be surprised if you found a regression in any query that simply didn't receive any new scan key transformations in new preprocessing code in places like _bt_decide_skipatts and _bt_skip_preproc_shrink. I see that many of the queries that you're using for your stress-tests "aren't really testing skip scan", in this sense. But I'm hardly about to tell you that you shouldn't spend time on such queries -- that approach just discovered a bug affecting Postgres 17 (that was also surprising, but it still happened!). My point is that it's worth being aware of which test queries actually use skip arrays in the first place -- it might help you with your testing. There are essentially no changes to _bt_advance_array_keys that'll affect traditional SAOP arrays (with the sole exception of changes made by v6-0003-Refactor-handling-of-nbtree-array-redundancies.patch, which affect every kind of array in the same way). > 1) v6-0001-Show-index-search-count-in-EXPLAIN-ANALYZE.patch > > - I find the places that increment "nsearches" a bit random. Each AM > does it in entirely different place (at least it seems like that to me). > Is there a way make this a bit more consistent? >From a mechanical perspective there is nothing at all random about it: we do this at precisely the same point that we currently call pgstat_count_index_scan, which in each index AM maps to one descent of the index. It is at least consistent. Whenever a B-Tree index scan shows "Index Scans: N", you'll see precisely the same number by swapping it with an equivalent contrib/btree_gist-based GiST index and running the same query again (assuming that index tuples that match the array keys are spread apart in both the B-Tree and GiST indexes). (Though I see problems with the precise place that nbtree calls pgstat_count_index_scan right now, at least in certain edge-cases, which I discuss below in response to your questions about that.) > uint64 btps_nsearches; /* instrumentation */ > > Instrumentation what? 
What's the counter for? Will fix. In case you missed it, there is another thread + CF Entry dedicated to discussing this instrumentation patch: https://commitfest.postgresql.org/49/5183/ https://www.postgresql.org/message-id/flat/cah2-wzkrqvaqr2ctnqtzp0z6ful4-3ed6eqb0yx38xbnj1v...@mail.gmail.com > - I see _bt_first moved the pgstat_count_index_scan, but doesn't that > mean we skip it if the earlier code does "goto readcomplete"? Shouldn't > that still count as an index scan? In my opinion, no, it should not. We're counting the number of times we'll have descended the tree using _bt_search (or using _bt_endpoint, perhaps), which is a precisely defined physical cost. A little like counting the number of buffers accessed. I actually think that this aspect of how we call pgstat_count_index_scan is a bug that should be fixed, with the fix backpatched to Postgres 17. Right now, we see completely different counts for a parallel index scan, compared to an equivalent serial index scan -- differences that cannot be explained as minor differences caused by parallel scan implementation details. I think that it's just wrong right now, on master, since we're simply not counting the thing that we're supposed to be counting (not reliably, not if it's a parallel index scan). > - show_indexscan_nsearches does this: > > if (scanDesc && scanDesc->nsearches > 0) > ExplainPropertyUInteger("Index Searches", NULL, >
Re: query_id, pg_stat_activity, extended query protocol
On Tue, Sep 17, 2024 at 05:01:18PM -0500, Sami Imseih wrote: > > Then, please see attached two lightly-updated patches. 0001 is for a > > backpatch down to v14. This is yours to force things in the exec and > > bind messages for all portal types, with the test (placed elsewhere in > > 14~15 branches). 0002 is for HEAD to add some sanity checks, blowing > > up the tests of pg_stat_statements if one is not careful with the > > query ID reporting. > > These 2 patches look good to me; except for the slight typo > In the commit message of 0002. "backpatch" instead of "backpatck". Yes, I've noticed this one last Friday and fixed the typo in the commit log after sending the previous patch series. > That leaves us with considering v5-0002 [1]. I do think this is good > for overall correctness of the queryId being advertised after a cache > revalidation, even if users of pg_stat_activity will hardly notice this. > > [1] > https://www.postgresql.org/message-id/DB325894-3EE3-4B2E-A18C-4B34E7B2F5EC%40gmail.com > Yeah. I need more time to evaluate this one. Also, please find one of the scripts I have used for the execute/fetch case, that simply does an INSERT RETURNING with a small fetch size to create a larger window in pg_stat_activity where we don't report the query ID. One can run it like that, crude still on point: # Download a JDBC driver # Create the table to use. psql -c 'create table aa (a int);' postgres CLASSPATH=postgresql-42.7.4.jar java TestReturning.java Then, while running the script, you would notice that pg_stat_activity reports NULL for the query ID with the query text while the batch fetches are processing. I've taken and expanded one of the scripts you have sent for 1d477a907e63. I'd like to get to the point where we are able to test that in core reliably. The sanity checks in the executor paths are a good step forward but they do nothing for the fetch cases. Perhaps Andrew Dunstan work to expose libpq's APIs with the perl TAP tests would help at some point to control the extended protocol queries, but we are going to need more for the fetch case as there are no hooks that would help to grab a query ID. A second option I have in mind would be to set up an injection point that produces a NOTICE if a query ID is set when we end processing an execute message, then check the number of NOTICE messages produced as these can be predictible depending on the number of rows and the fetch size.. This won't fly far unless we can control the fetch size. -- Michael import java.sql.*; public class Test { static final String DB_URL = "jdbc:postgresql://localhost/postgres"; public static void main(String[] args) { Connection conn = null; // Open a connection try { conn = DriverManager.getConnection(DB_URL); conn.setAutoCommit(false); PreparedStatement statement = conn.prepareStatement("INSERT into aa SELECT generate_series(1,1000) RETURNING a"); statement.setFetchSize(100); ResultSet rs = statement.executeQuery(); while (rs.next()) { } conn.commit(); statement.close(); } catch (SQLException e) { e.printStackTrace(); try { conn.rollback(); } catch (SQLException ee) { ee.printStackTrace(); } } } } signature.asc Description: PGP signature
Re: Custom connstr in background_psql()
On Wed, Sep 18, 2024 at 01:08:26AM +0300, a.ima...@postgrespro.ru wrote: > I've noticed that there is no way to specify a custom connection string when > calling the PostgreSQL::Test::Cluster->background_psql() method compared to > the > PostgreSQL::Test:Cluster->psql(). It seems useful to have this feature while > testing with BackgroundPsql, for example, when the default host value needs > to be overridden to establish different types of connections. > > What do you think? I think that it makes sense to extend the routine as you are suggesting. At least I can see myself using it depending on the test suite I am dealing with. So count me in. -- Michael signature.asc Description: PGP signature
Re: query_id, pg_stat_activity, extended query protocol
> would help to grab a query ID. A second option I have in mind would > be to set up an injection point that produces a NOTICE if a query ID > is set when we end processing an execute message, then check the > number of NOTICE messages produced as these can be predictible > depending on the number of rows and the fetch size.. This won't fly > far unless we can control the fetch size. FWIW, I do like the INJECTION_POINT idea and actually mentioned something similar up the thread [1] for the revalidate cache case, but I can see it being applied to all the other places we expect the queryId to be set. [1] https://www.postgresql.org/message-id/465EECA3-D98C-4E46-BBDB-4D057617DD89%40gmail.com -- Sami
Re: BUG #18545: \dt breaks transaction, calling error when executed in SET SESSION AUTHORIZATION
This commit seems to trigger elog(), not reproducible in the parent commit. 6e086fa2e77 Allow parallel workers to cope with a newly-created session user ID. postgres=# SET min_parallel_table_scan_size=0; CLUSTER pg_attribute USING pg_attribute_relid_attnum_index; ERROR: pg_attribute catalog is missing 26 attribute(s) for relation OID 70321 postgres=# \errverbose ERROR: XX000: pg_attribute catalog is missing 26 attribute(s) for relation OID 70321 LOCATION: RelationBuildTupleDesc, relcache.c:658 This is not completely deterministic: postgres=# CLUSTER pg_attribute USING pg_attribute_relid_attnum_index; CLUSTER postgres=# CLUSTER pg_attribute USING pg_attribute_relid_attnum_index; CLUSTER postgres=# CLUSTER pg_attribute USING pg_attribute_relid_attnum_index; CLUSTER postgres=# CLUSTER pg_attribute USING pg_attribute_relid_attnum_index; CLUSTER postgres=# CLUSTER pg_attribute USING pg_attribute_relid_attnum_index; ERROR: pg_attribute catalog is missing 26 attribute(s) for relation OID 70391 But I think this will be reproducible in any database with a nontrivial number of attributes.
Re: Detailed release notes
On Tue, Sep 17, 2024 at 03:29:54PM -0300, Marcos Pegoraro wrote: > Em ter., 17 de set. de 2024 às 05:12, Alvaro Herrera > escreveu: > > Add backend support for injection points (Michael Paquier) [commit 1] [2] > [3] [4] > > I think this way would be fine. > > And it would be good to have a > target="_blank" > on commit links so a new window or tab would be opened instead of going back > and forth. > This way you can open these 4 links and then navigate on them to see exactly > what was done on every commit. I think trying to add text to each item is both redundant and confusing, because I am guessing that many people will not even know what a commit is, and will be confused by clicking on the link. What I have done is to add text to the top of "Appendix E. Release Notes" explaining the symbol and what it links to. Patch attached. We can still consider a different symbol or number labeling, but I think this patch is in the right direction. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com When a patient asks the doctor, "Am I going to die?", he means "Am I going to die soon?" diff --git a/doc/src/sgml/release.sgml b/doc/src/sgml/release.sgml index cf6ba540876..8d408a50eb1 100644 --- a/doc/src/sgml/release.sgml +++ b/doc/src/sgml/release.sgml @@ -69,6 +69,15 @@ For new features, add links to the documentation sections. review, so each item is truly a community effort. + + Section markers (§) in the release notes link to https://git.postgresql.org/gitweb/?p=postgresql.git";>gitweb + pages which show the primary git commit + messages and source tree changes responsible for the release note item. + There might be additional git commits which + are not shown. + +
Re: Detailed release notes
Bruce Momjian writes: > On Tue, Sep 17, 2024 at 03:29:54PM -0300, Marcos Pegoraro wrote: >> Em ter., 17 de set. de 2024 às 05:12, Alvaro Herrera >> >>> Add backend support for injection points (Michael Paquier) [commit 1] [2] > I think trying to add text to each item is both redundant and confusing, Also very clutter-y. I'm not convinced that any of this is a good idea that will stand the test of time: I estimate that approximately 0.01% of people who read the release notes will want these links. But if we keep it I think the annotations have to be very unobtrusive. (Has anyone looked at the PDF rendering of this? It seems rather unfortunate to me.) regards, tom lane
Re: BUG #18545: \dt breaks transaction, calling error when executed in SET SESSION AUTHORIZATION
Justin Pryzby writes: > This commit seems to trigger elog(), not reproducible in the > parent commit. Yeah, I can reproduce that. Will take a look tomorrow. regards, tom lane
Re: query_id, pg_stat_activity, extended query protocol
On Tue, Sep 17, 2024 at 06:39:17PM -0500, Sami Imseih wrote: > FWIW, I do like the INJECTION_POINT idea and actually mentioned something > similar up the thread [1] for the revalidate cache case, but I can see it > being applied > to all the other places we expect the queryId to be set. > > [1] > https://www.postgresql.org/message-id/465EECA3-D98C-4E46-BBDB-4D057617DD89%40gmail.com FWIW, I was thinking about something like what has been done in indexcmds.c for 5bbdfa8a18dc as the query ID value is not predictible across releases, but we could see whether it is set or not. -- Michael signature.asc Description: PGP signature
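A minimal sketch of the injection-point idea being discussed, assuming a check at the end of the execute-message path; INJECTION_POINT() and pgstat_get_my_query_id() exist in core, while the point name, the helper and its call site are made up for illustration:

#include "postgres.h"
#include "utils/backend_status.h"      /* pgstat_get_my_query_id() */
#include "utils/injection_point.h"

/*
 * Sketch: fire an injection point only when a query ID has been advertised
 * by the time an Execute message finishes, so a test that attaches to the
 * point (for example with the injection_points module's "notice" callback)
 * can count how often that happens.
 */
static void
report_queryid_set_for_tests(void)
{
#ifdef USE_INJECTION_POINTS
    if (pgstat_get_my_query_id() != 0)
        INJECTION_POINT("exec-execute-message-queryid-set");
#endif
}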
Re: Detailed release notes
On Tue, Sep 17, 2024 at 08:22:41PM -0400, Tom Lane wrote: > Bruce Momjian writes: > > On Tue, Sep 17, 2024 at 03:29:54PM -0300, Marcos Pegoraro wrote: > >> Em ter., 17 de set. de 2024 às 05:12, Alvaro Herrera > >> > >>> Add backend support for injection points (Michael Paquier) [commit 1] [2] > > > I think trying to add text to each item is both redundant and confusing, > > Also very clutter-y. I'm not convinced that any of this is a good > idea that will stand the test of time: I estimate that approximately > 0.01% of people who read the release notes will want these links. Yes, I think 0.01% is accurate. > But if we keep it I think the annotations have to be very unobtrusive. I very much agree. > (Has anyone looked at the PDF rendering of this? It seems rather > unfortunate to me.) Oh, not good. See my build: https://momjian.us/expire/postgres-US.pdf#page=2890 Those numbers are there because all external links get numbers that start at 1 at the top of the section and increment for every link --- I have no idea why. You can see that acronyms that have external links have the same numbering: https://momjian.us/expire/postgres-US.pdf#page=3150 I wonder if we should remove the numbers in both cases. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com When a patient asks the doctor, "Am I going to die?", he means "Am I going to die soon?"
Re: Virtual generated columns
On Mon, Sep 16, 2024 at 5:22 PM jian he wrote: > > in v7. > seems I am confused with the version number. Here, I attached another minor change in the tests: make ERROR: invalid ON DELETE action for foreign key constraint containing generated column become ERROR: foreign key constraints on virtual generated columns are not supported. Change contrib/pageinspect/sql/page.sql to expand the information on t_infomask and t_bits. Change RelationBuildLocalRelation to make the three bool flags of the transient TupleDesc->TupleConstr more accurate. v6-0001-virtual-generated-columns-misc-changes.no-cfbot Description: Binary data
Re: Trim the heap free memory
Thank you very much for your response and suggestions. As you mentioned, the patch here is actually designed for glibc's ptmalloc2 and is not applicable to other platforms. I will consider supporting it only on the Linux platform in the future. In the memory management strategy of ptmalloc2, there is a certain amount of non-garbage-collected memory, which is closely related to the order and method of memory allocation and release. To reduce the performance overhead caused by frequent allocation and release of small blocks of memory, ptmalloc2 intentionally retains this part of the memory. The malloc_trim function locks, traverses memory blocks, and uses madvise to release this part of the memory, but this process may also have a negative impact on performance. In the process of exploring solutions, I also considered a variety of strategies, including scheduling malloc_trim to be executed at regular intervals or triggering malloc_trim after a specific number of free operations. However, we found that these methods are not optimal solutions. > We can see that out of about 43K test queries, 32K saved nothing > whatever, and in only four was more than a couple of meg saved. > That's pretty discouraging IMO. It might be useful to look closer > at the behavior of those top four though. I see them as I have previously encountered situations where the non-garbage-collected memory of wal_sender amounted to hundreds of megabytes or even exceeded 1GB, but I was unable to reproduce this situation using simple SQL. Therefore, I introduced an asynchronous processing function, hoping to manage memory more efficiently without affecting performance. In addition, I have considered the following optimization strategies: 1. Adjust the configuration of ptmalloc2 through the mallopt function to use mmap rather than sbrk for memory allocation. This can immediately return the memory to the operating system when it is released, but it may affect performance due to the higher overhead of mmap. 2. Use other memory allocators such as jemalloc or tcmalloc, and adjust relevant parameters to reduce the generation of non-garbage-collected memory. However, these allocators are designed for multi-threaded use and may lead to increased memory usage per process. 3. Build a set of memory context allocation functions based on mmap, delegating the responsibility of memory management entirely to the database level. Although this solution can effectively control memory allocation, it requires a large-scale engineering implementation. I look forward to further discussing these solutions with you and exploring the best memory management practices together. Best regards, Shawn On Mon, Sep 16, 2024 at 03:16, Tom Lane wrote: > I wrote: > > The single test case you showed suggested that maybe we could > > usefully prod glibc to free memory at query completion, but we > > don't need all this interrupt infrastructure to do that. I think > > we could likely get 95% of the benefit with about a five-line > > patch. > > To try to quantify that a little, I wrote a very quick-n-dirty > patch to apply malloc_trim during finish_xact_command and log > the effects. (I am not asserting this is the best place to > call malloc_trim; it's just one plausible possibility.)
Patch > attached, as well as statistics collected from a run of the > core regression tests followed by > > grep malloc_trim postmaster.log | sed 's/.*LOG:/LOG:/' | sort -k4n | uniq > -c >trim_savings.txt > > We can see that out of about 43K test queries, 32K saved nothing > whatever, and in only four was more than a couple of meg saved. > That's pretty discouraging IMO. It might be useful to look closer > at the behavior of those top four though. I see them as > > 2024-09-15 14:58:06.146 EDT [960138] LOG: malloc_trim saved 7228 kB > 2024-09-15 14:58:06.146 EDT [960138] STATEMENT: ALTER TABLE > delete_test_table ADD PRIMARY KEY (a,b,c,d); > > 2024-09-15 14:58:09.861 EDT [960949] LOG: malloc_trim saved 12488 kB > 2024-09-15 14:58:09.861 EDT [960949] STATEMENT: with recursive > search_graph(f, t, label, is_cycle, path) as ( > select *, false, array[row(g.f, g.t)] from graph g > union distinct > select g.*, row(g.f, g.t) = any(path), path || row(g.f, > g.t) > from graph g, search_graph sg > where g.f = sg.t and not is_cycle > ) > select * from search_graph; > > 2024-09-15 14:58:09.866 EDT [960949] LOG: malloc_trim saved 12488 kB > 2024-09-15 14:58:09.866 EDT [960949] STATEMENT: with recursive > search_graph(f, t, label) as ( > select * from graph g > union distinct > select g.* > from graph g, search_graph sg > where g.f = sg.t > ) cycle f, t set is_cycle to 'Y' default 'N' using path > select * from search_graph; > > 2024-09-15 14:58:09.853
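For readers who want to poke at the glibc knobs discussed in this thread, here is a minimal standalone sketch (Linux/glibc only; mallinfo2() needs glibc 2.33 or later, and the chunk sizes and counts are arbitrary illustration values, not a claim about backend behavior):

#include <malloc.h>
#include <stdio.h>
#include <stdlib.h>

#define NCHUNKS 10000
#define CHUNKSZ 8192            /* below the default M_MMAP_THRESHOLD */

int
main(void)
{
    static char *chunks[NCHUNKS];
    struct mallinfo2 before, after;

    /* Grow the sbrk heap with many small blocks, then free them all. */
    for (int i = 0; i < NCHUNKS; i++)
        chunks[i] = malloc(CHUNKSZ);
    for (int i = 0; i < NCHUNKS; i++)
        free(chunks[i]);

    /* ptmalloc2 retains the freed space until we explicitly trim. */
    before = mallinfo2();
    malloc_trim(0);             /* pad = 0: keep no extra slack */
    after = mallinfo2();

    printf("releasable top-of-heap: %zu -> %zu bytes\n",
           before.keepcost, after.keepcost);

    /*
     * The alternative knob mentioned above: lower the mmap threshold so
     * that "large" requests bypass the heap and go back to the kernel
     * immediately on free(), at extra per-allocation cost.
     */
    mallopt(M_MMAP_THRESHOLD, 64 * 1024);
    return 0;
}

As the statistics above suggest, how much a trim actually saves depends heavily on allocation order and fragmentation, so the numbers printed by a toy like this are only indicative.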
Re: Use streaming read API in ANALYZE
On Sun, Sep 15, 2024 at 12:14 AM Mats Kindahl wrote: > I used the combination of your patch and making the computation of > vacattrstats for a relation available through the API and managed to > implement something that I think does the right thing. (I just sampled a few > different statistics to check if they seem reasonable, like most common vals > and most common freqs.) See attached patch. Cool. I went ahead and committed that small new function and will mark the open item closed. > I need the vacattrstats to set up the two streams for the internal relations. > I can just re-implement them in the same way as is already done, but this > seems like a small change that avoids unnecessary code duplication. Unfortunately we're not in a phase where we can make non-essential changes, we're right about to release and we're only committing fixes, and it seems like you have a way forward (albeit with some duplication). We can keep talking about that for v18. From your earlier email: > I'll take a look at the thread. I really think the ReadStream abstraction is > a good step in the right direction. Here's something you or your colleagues might be interested in: I was looking around for a fun extension to streamify as a demo of the technology, and I finished up writing a quick patch to streamify pgvector's HNSW index scan, which worked well enough to share[1] (I think it should in principle be able to scale with the number of graph connections, at least 16x), but then people told me that it's of limited interest because everybody knows that HNSW indexes have to fit in memory (I think there may also be memory prefetch streaming opportunities, unexamined for now). But that made me wonder what the people with the REALLY big indexes do for hyperdimensional graph search on a scale required to build Skynet, and that led me back to Timescale pgvectorscale[2]. I see two obvious signs that this thing is eminently and profitably streamifiable: (1) The stated aim is optimising for indexes that don't fit in memory, hence "Disk" in the name of the research project it is inspired by, (2) I see that DiskANN[3] is aggressively using libaio (Linux) and overlapped/IOCP (Windows). So now I am waiting patiently for a Rustacean to show up with patches for pgvectorscale to use ReadStream, which would already get read-ahead advice and vectored I/O (Linux, macOS, FreeBSD soon hopefully), and hopefully also provide a nice test case for the AIO patch set which redirects buffer reads through io_uring (Linux, basically the newer better libaio) or background I/O workers (other OSes, which works surprisingly competitively). Just BTW for comparison with DiskANN we have also had early POC-quality patches that drive AIO with overlapped/IOCP (Windows) which will eventually be rebased and proposed (Windows isn't really a primary target but we wanted to validate that the stuff we're working on has abstractions that will map to the obvious system APIs found in the systems PostgreSQL targets). For completeness, I've also had it mostly working on the POSIX AIO of FreeBSD, HP-UX and AIX (though we dropped support for those last two so that was a bit of a dead end). [1] https://www.postgresql.org/message-id/flat/CA%2BhUKGJ_7NKd46nx1wbyXWriuZSNzsTfm%2BrhEuvU6nxZi3-KVw%40mail.gmail.com [2] https://github.com/timescale/pgvectorscale [3] https://github.com/microsoft/DiskANN
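For anyone considering streamifying an index AM along these lines, the consumer side of the ReadStream API is essentially a block-number callback plus a pull loop. A rough sketch of the pattern follows (signatures paraphrased from PostgreSQL 17's src/include/storage/read_stream.h; the sequential block selection here is a placeholder for whatever order a graph index would actually visit pages in):

#include "postgres.h"
#include "storage/bufmgr.h"
#include "storage/read_stream.h"
#include "utils/rel.h"

typedef struct SketchStreamState
{
    BlockNumber next;
    BlockNumber nblocks;
} SketchStreamState;

/* Tell the stream which block to read next; InvalidBlockNumber ends it. */
static BlockNumber
sketch_next_block_cb(ReadStream *stream, void *callback_private_data,
                     void *per_buffer_data)
{
    SketchStreamState *state = (SketchStreamState *) callback_private_data;

    if (state->next >= state->nblocks)
        return InvalidBlockNumber;
    return state->next++;
}

static void
sketch_scan_relation(Relation rel)
{
    SketchStreamState state;
    ReadStream *stream;
    Buffer      buf;

    state.next = 0;
    state.nblocks = RelationGetNumberOfBlocks(rel);

    stream = read_stream_begin_relation(READ_STREAM_DEFAULT, NULL, rel,
                                        MAIN_FORKNUM, sketch_next_block_cb,
                                        &state, 0);

    while ((buf = read_stream_next_buffer(stream, NULL)) != InvalidBuffer)
    {
        /* Inspect the pinned page here; look-ahead happens behind us. */
        ReleaseBuffer(buf);
    }

    read_stream_end(stream);
}

The interesting part for a graph index such as HNSW or DiskANN is the callback: instead of counting up sequentially it would hand back the neighbor blocks it intends to visit next, which is where the potential fan-out mentioned above comes from.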
Re: Pgoutput not capturing the generated columns
Hi, here are my review comments for patch v31-0002. == 1. General. IMO patches 0001 and 0002 should be merged when next posted. IIUC the reason for the split was only because there were 2 different authors but that no longer seems relevant. == Commit message 2. When 'copy_data' is true, during the initial sync, the data is replicated from the publisher to the subscriber using the COPY command. The normal COPY command does not copy generated columns, so when 'publish_generated_columns' is true, we need to copy using the syntax: 'COPY (SELECT column_name FROM table_name) TO STDOUT'. ~ 2a. Should clarify that 'copy_data' is a SUBSCRIPTION parameter. 2b. Should clarify that 'publish_generated_columns' is a PUBLICATION parameter. == src/backend/replication/logical/tablesync.c make_copy_attnamelist: 3. - for (i = 0; i < rel->remoterel.natts; i++) + desc = RelationGetDescr(rel->localrel); + localgenlist = palloc0(rel->remoterel.natts * sizeof(bool)); Each time I review this code I am tricked into thinking it is wrong to use rel->remoterel.natts here for the localgenlist. AFAICT the code is actually fine because you do not store *all* the subscriber gencols in 'localgenlist' -- you only store those with matching names on the publisher table. It might be good if you could add an explanatory comment about that to prevent any future doubts. ~~~ 4. + if (!remotegenlist[remote_attnum]) + ereport(ERROR, + (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE), + errmsg("logical replication target relation \"%s.%s\" has a generated column \"%s\" " + "but corresponding column on source relation is not a generated column", + rel->remoterel.nspname, rel->remoterel.relname, NameStr(attr->attname)))); This error message has lots of good information. OTOH, I think when copy_data=false the error would report the subscriber column just as "missing", which is maybe less helpful. Perhaps that other copy_data=false "missing" case can be improved to share the same error message that you have here. ~~~ fetch_remote_table_info: 5. IIUC, this logic needs to be more sophisticated to handle the case that was being discussed earlier with Sawada-san [1]. e.g. when the same table has gencols but there are multiple subscribed publications where the 'publish_generated_columns' parameter differs. Also, you'll need test cases for this scenario, because it is too difficult to judge correctness just by visual inspection of the code. 6. nit - Change 'hasgencolpub' to 'has_pub_with_pubgencols' for readability, and initialize it to 'false' to make it easy to use later. ~~~ 7. - * Get column lists for each relation. + * Get column lists for each relation and check if any of the publication + * has generated column option. and + /* Check if any of the publication has generated column option */ + if (server_version >= 18) nit - tweak the comments to name the publication parameter properly. ~~~ 8. foreach(lc, MySubscription->publications) { if (foreach_current_index(lc) > 0) appendStringInfoString(&pub_names, ", "); appendStringInfoString(&pub_names, quote_literal_cstr(strVal(lfirst(lc)))); } I know this is existing code, but shouldn't all this be done by using the purpose-built function 'get_publications_str'? ~~~ 9.
+ ereport(ERROR, + errcode(ERRCODE_CONNECTION_FAILURE), + errmsg("could not fetch gencolumns information from publication list: %s", + pub_names.data)); and + errcode(ERRCODE_UNDEFINED_OBJECT), + errmsg("failed to fetch tuple for gencols from publication list: %s", + pub_names.data)); nit - /gencolumns information/generated column publication information/ to make the errmsg more human-readable ~~~ 10. + bool gencols_allowed = server_version >= 18 && hasgencolpub; + + if (!gencols_allowed) + appendStringInfo(&cmd, " AND a.attgenerated = ''"); Can the 'gencols_allowed' var be removed, and the condition just be replaced with if (!has_pub_with_pubgencols)? It seems equivalent unless I am mistaken. == Please refer to the attachment which implements some of the nits mentioned above. == [1] https://www.postgresql.org/message-id/CAD21AoBun9crSWaxteMqyu8A_zme2ppa2uJvLJSJC2E3DJxQVA%40gmail.com Kind Regards, Peter Smith. Fujitsu Australia diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c index 723c44c..6d17984 100644 --- a/src/backend/replication/logical/tablesync.c +++ b/src/backend/replication/logical/tablesync.c @@ -850,7 +850,7 @@ fetch_remote_table_info(char *nspname, char *relname, bool **remotegenlist_res, Oid qualRow[] = {TEXTOID}; bool isnull; bool *remotegenlist; - bool hasgencolpub; + bool has_pub_with_pubgencols = false; int natt; ListCell *lc; Bitmapset *included_cols = NULL; @@ -897,8 +897,8 @@ fetch_remote_table_info(char *nspname, char *relname, bool **
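As a side note on the 'COPY (SELECT column_name FROM table_name) TO STDOUT' syntax discussed in the commit-message comments above, here is a minimal sketch of how such a command string could be assembled on the tablesync side (the variable names column_names, nspname and relname are assumptions for illustration, not taken from the patch):

StringInfoData cmd;
ListCell   *lc;
bool        first = true;

initStringInfo(&cmd);
appendStringInfoString(&cmd, "COPY (SELECT ");

/* column_names is assumed to be a List of char *, one per published column */
foreach(lc, column_names)
{
    if (!first)
        appendStringInfoString(&cmd, ", ");
    appendStringInfoString(&cmd, quote_identifier((const char *) lfirst(lc)));
    first = false;
}

appendStringInfo(&cmd, " FROM %s) TO STDOUT",
                 quote_qualified_identifier(nspname, relname));
/* cmd.data now holds the text to send to the publisher's walsender */

Whether the real patch builds the statement exactly this way is for the authors to confirm; the point is only that the column list has to be spelled out explicitly because, as noted above, the normal COPY command does not copy generated columns.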
Re: Switch PgStat_HashKey.objoid from Oid to uint64
On Fri, Sep 13, 2024 at 04:03:13AM +, Bertrand Drouvot wrote: > Overall, the patch LGTM. Thanks for the review, I've applied that, then, detailing in the commit log what this changes and the three format bumps required. -- Michael signature.asc Description: PGP signature
Re: DROP OWNED BY fails to clean out pg_init_privs grants
>On Thu, Jun 20, 2024 at 3:43PM Hannu Krosing < han...@google.com > wrote: >> Still it would be nice to have some public support for users of >> non-managed PostgreSQL databases as well >+1. > >-- >Robert Haas >EDB: http://www.enterprisedb.com Hello! I have recently been researching the postgres build using meson. And I came across a failure of the src/test/modules/test_pg_dump-running test. You can see regression.diffs in the attached file. It turned out that the test fails if the cluster is initialized by a custom user, in my case the user postgres. Script to reproduce the test_pg_dump failure: --- meson setup -Dprefix=$PGPREFIX build ninja -j64 -C build install >/dev/null ninja -j64 -C build install-test-files >/dev/null initdb -U postgres -k -D $PGPREFIX/data pg_ctl -D $PGPREFIX/data -l logfile start psql -U postgres -c "CREATE USER test SUPERUSER" psql -U postgres -c "CREATE DATABASE test" meson test --setup running --suite postgresql:test_pg_dump-running -C build/ --- You can catch the same failure if you build using make and run make installcheck -C src/test/modules/test_pg_dump. I have looked at the test and found queries like below: --- SELECT pg_describe_object(classid,objid,objsubid) COLLATE "C" AS obj, pg_describe_object(refclassid,refobjid,0) AS refobj, deptype FROM pg_shdepend JOIN pg_database d ON dbid = d.oid WHERE d.datname = current_database() ORDER BY 1, 3; --- This query does not expect that the test database may already contain some information about the custom user that ran test_pg_dump-running. -- Egor Chindyaskin Postgres Professional: https://postgrespro.com regression.diffs Description: Binary data
Re: define pg_structiszero(addr, s, r)
On Wed, Sep 18, 2024 at 04:16:12AM +, Bertrand Drouvot wrote: > The macro is created in pgstat_internal.h as it looks like that "only" the > statistics related code would benefit of it currently (could be moved to other > header file later on if needed). I'm OK to add a helper macro in pgstat_internal.h as this is a pattern used only for some stats kinds (the other one I'm aware of is the allzero check for pages around bufmgr.c), cleaning up all these static declarations to make the memcpy() calls cheaper. That can also be useful for anybody doing a custom pgstats kind, fixed or variable-numbered. #define pg_structiszero(addr, s, r) \ Locating that at the top of pgstat_internal.h seems a bit out of order to me. Perhaps it would be better to move it closer to the inline functions? Also, is this the best name to use here? Right, this is something that may be quite generic. However, if we limit its scope in the stats, perhaps this should be named pgstat_entry_all_zeros() or something like that? -- Michael signature.asc Description: PGP signature
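For context, one plausible shape for such a helper, following the existing compare-against-a-static-all-zeroes pattern used elsewhere in the stats code (this is only a sketch of the idea, not the definition from the proposed patch):

/*
 * Set "r" to true when the struct pointed to by "addr" is all zeroes;
 * "s" is the struct type.  Relies on static storage being zero-initialized.
 */
#define pg_structiszero(addr, s, r) \
    do { \
        static const s all_zeroes; \
        (r) = (memcmp((addr), &all_zeroes, sizeof(all_zeroes)) == 0); \
    } while (0)

/* hypothetical usage, with made-up variable names: */
bool        is_empty;

pg_structiszero(&pendingent->stats, PgStat_StatDBEntry, is_empty);
if (is_empty)
    return;                     /* nothing to flush for this entry */

Whatever the final name (pg_structiszero, pgstat_entry_all_zeros, or something else), the win is dropping the per-call-site static all-zeroes declarations while keeping a single, obviously correct comparison.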
Re: query_id, pg_stat_activity, extended query protocol
On Wed, Sep 18, 2024 at 09:38:32AM +0900, Michael Paquier wrote: > FWIW, I was thinking about something like what has been done in > indexcmds.c for 5bbdfa8a18dc as the query ID value is not predictable > across releases, but we could see whether it is set or not. By the way, with the main issue fixed as of 933848d16dc9, could it be possible to deal with the plan cache part in a separate thread? To me this has been a separate topic from the start, and we've done quite a bit here already. -- Michael signature.asc Description: PGP signature
Re: Add contrib/pg_logicalsnapinspect
On Tue, Sep 17, 2024 at 12:44 PM Bertrand Drouvot wrote: > > Hi, > > On Tue, Sep 17, 2024 at 10:18:35AM +0530, shveta malik wrote: > > Thanks for addressing the comments. I have not started reviewing v4 > > yet, but here are few more comments on v3: > > > > 1) > > +#include "port/pg_crc32c.h" > > > > It is not needed in pg_logicalinspect.c as it is already included in > > internal_snapbuild.h > > Yeap, forgot to remove that one when creating the new "internal".h file, done > in v5 attached, thanks! > > > > > 2) > > + values[0] = Int16GetDatum(ondisk.builder.state); > > > > + values[8] = LSNGetDatum(ondisk.builder.last_serialized_snapshot); > > + values[9] = TransactionIdGetDatum(ondisk.builder.next_phase_at); > > + values[10] = Int64GetDatum(ondisk.builder.committed.xcnt); > > > > We can have values[i++] in all the places and later we can check : > > Assert(i == PG_GET_LOGICAL_SNAPSHOT_INFO_COLS); > > Then we need not to keep track of number even in later part of code, > > as it goes till 14. > > Right, let's do it that way (as it is done in pg_walinspect for example). > > > 4) > > Most of the output columns in pg_get_logical_snapshot_info() look > > self-explanatory except 'state'. Should we have meaningful 'text' here > > corresponding to SnapBuildState? Similar to what we do for > > 'invalidation_reason' in pg_replication_slots. (SlotInvalidationCauses > > for ReplicationSlotInvalidationCause) > > Yeah we could. I was not sure about that (and that was my first remark in [1]) > , as the module is mainly for debugging purpose, I was thinking that the one > using it could refer to "snapbuild.h". Let's see what others think. > Okay, makes sense. Let's wait to see what others have to say. Thanks for the patch. A few trivial things: 1) Maybe we should change 'INTERNAL_SNAPBUILD_H' in snapbuild_internal.h to 'SNAPBUILD_INTERNAL_H'? 2) ValidateSnapshotFile() It is not only validating, but loading the content as well. So maybe we can rename it to ValidateAndRestoreSnapshotFile? 3) sgml: a) + The pg_logicalinspect functions are called using an LSN argument that can be extracted from the output name of the pg_ls_logicalsnapdir() function. Is it possible to give a link to the pg_ls_logicalsnapdir function here? b) + Gets logical snapshot metadata about a snapshot file that is located in the pg_logical/snapshots directory. located in the server's pg_logical/snapshots directory (i.e. use the server keyword, similar to how pg_ls_logicalsnapdir and pg_ls_logicalmapdir explain it) thanks Shveta
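To illustrate the values[i++] suggestion quoted above (the same pattern pg_walinspect uses), the assignment block could end up looking roughly like this; only a few of the fourteen columns are shown and the surrounding SRF boilerplate is omitted:

Datum       values[PG_GET_LOGICAL_SNAPSHOT_INFO_COLS] = {0};
bool        nulls[PG_GET_LOGICAL_SNAPSHOT_INFO_COLS] = {0};
int         i = 0;

values[i++] = Int16GetDatum(ondisk.builder.state);
/* ... one assignment per intermediate output column ... */
values[i++] = LSNGetDatum(ondisk.builder.last_serialized_snapshot);
values[i++] = TransactionIdGetDatum(ondisk.builder.next_phase_at);
values[i++] = Int64GetDatum(ondisk.builder.committed.xcnt);
/* ... remaining columns ... */

/* catches any mismatch between assignments and the declared column count */
Assert(i == PG_GET_LOGICAL_SNAPSHOT_INFO_COLS);

The running counter removes the need to renumber every line when a column is added or removed, and the final Assert() makes an off-by-one hard to miss.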
Re: Regression tests fail with tzdata 2024b
On Tue, Sep 17, 2024 at 4:15 PM Tom Lane wrote: > Wolfgang Walther writes: > > The core regression tests need to be run with a timezone that tests > > special cases in the timezone handling code. But that might not be true > > for extensions - all they want could be a stable output across major and > > minor versions of postgres and versions of tzdata. It could be helpful > > to set pg_regress' timezone to UTC, for example? > > I would not recommend that choice. It would mask simple errors such > as failing to apply the correct conversion between timestamptz and > timestamp values. Also, if you have test cases that are affected by > this issue at all, you probably have a need/desire to test things like > DST transitions. > As far as I'm aware timescaledb does not rely on specifics of the tzdata version but just needs a stable timezone setting. I guess I'll adjust our tests to not depend on the upstream pg_regress timezone. -- Regards, Sven Klemm