Re: Schema variables - new implementation for Postgres 15
On 7/21/22 08:16, Pavel Stehule wrote:
> Hi
>
> new update of session variable's implementation
>
> - fresh rebase
> - new possibility to trace execution with DEBUG1 notification
> - new SRF function pg_debug_show_used_session_variables that returns
>   content of sessionvars hashtab
> - redesign of work with list of variables for reset, recheck, on commit
>   drop, on commit reset

Hi Pavel,

I don't know exactly what failed but the docs (html/pdf) don't build:

cd ~/pg_stuff/pg_sandbox/pgsql.schema_variables/doc/src/sgml

$ make html
/usr/bin/xmllint --path . --noout --valid postgres.sgml
postgres.sgml:374: element link: validity error : IDREF attribute linkend
references an unknown ID "catalog-pg-variable"
make: *** [Makefile:135: html-stamp] Error 4

Erik Rijkers

> Regards
>
> Pavel
Re: Improve description of XLOG_RUNNING_XACTS
At Thu, 21 Jul 2022 11:21:09 +0900, Fujii Masao wrote in
>
> On 2022/07/21 10:13, Masahiko Sawada wrote:
> > Hi,
> >
> > I realized that standby_desc_running_xacts() in standbydesc.c doesn't
> > describe subtransaction XIDs. I've attached the patch to improve the
> > description.
>
> +1
>
> The patch looks good to me.

The subxids can reach TOTAL_MAX_CACHED_SUBXIDS =
PGPROC_MAX_CACHED_SUBXIDS (= 64) * PROCARRAY_MAXPROCS. xact_desc_commit
also shows subtransactions, but they are at maximum 64. I feel like
-0.3 if there's no obvious advantage to showing them.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
Re: Schema variables - new implementation for Postgres 15
Hi,

On Thu, Jul 21, 2022 at 09:09:47AM +0200, Erik Rijkers wrote:
> On 7/21/22 08:16, Pavel Stehule wrote:
> > Hi
> >
> > new update of session variable's implementation
> >
> > - fresh rebase
> > - new possibility to trace execution with DEBUG1 notification
> > - new SRF function pg_debug_show_used_session_variables that returns
> >   content of sessionvars hashtab
> > - redesign of work with list of variables for reset, recheck, on commit
> >   drop, on commit reset

Thanks for working on those! I will keep reviewing the patchset.

> I don't know exactly what failed but the docs (html/pdf) don't build:
>
> cd ~/pg_stuff/pg_sandbox/pgsql.schema_variables/doc/src/sgml
>
> $ make html
> /usr/bin/xmllint --path . --noout --valid postgres.sgml
> postgres.sgml:374: element link: validity error : IDREF attribute linkend
> references an unknown ID "catalog-pg-variable"
> make: *** [Makefile:135: html-stamp] Error 4

Apparently most of the changes in catalogs.sgml didn't survive the last
rebase. I do see the needed section in v20220709-0012-documentation.patch:

> +
> + pg_variable
[...]
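(For context, the unresolved linkend means the catalogs.sgml hunk declaring
that ID went missing in the rebase; the section the rebase needs to restore
would follow the file's usual pattern, roughly like this sketch -- standard
catalogs.sgml structure, not the actual patch text:

 <sect1 id="catalog-pg-variable">
  <title><structname>pg_variable</structname></title>
  ...
 </sect1>
)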
Re: Pulling up direct-correlated ANY_SUBLINK
On Tue, Sep 10, 2019 at 9:49 PM Tom Lane wrote:
> > Can we try to pull up direct-correlated ANY SubLink with the help of
> > LATERAL?
>
> Perhaps. But what's the argument that you'd end up with a better
> plan? LATERAL pretty much constrains things to use a nestloop,
> so I'm not sure there's anything fundamentally different.

Sorry for the noise on replying to such an old thread, but recently I
realized that pulling up direct-correlated ANY SubLink with LATERAL may
cause another problem: we cannot find any legal join order due to the
constraints imposed by LATERAL references. Below is an example:

select * from A where exists
  (select * from B where A.i in (select C.i from C where C.j = B.j));

For this query, after we convert the ANY SubLink to a LATERAL subquery,
the subquery cannot be pulled up further into the parent query because
its qual contains a lateral reference to 'B', which is outside a higher
semi join. When considering the join of 'A' and the subquery, we decide
it's not legal due to the LATERAL reference. As a result, we end up not
finding any legal join order for level 2.

Thanks
Richard
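(A self-contained way to see the failure mode described above; the table
definitions are assumed for illustration, and the error text is what the
join-order search reports when it finds no legal join order:

-- hypothetical minimal schema
create table A (i int);
create table B (j int);
create table C (i int, j int);

-- with the LATERAL pull-up applied, join-order search would fail with
-- something like: ERROR:  failed to build any 2-way joins
select * from A where exists
  (select * from B where A.i in (select C.i from C where C.j = B.j));
)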
Re: Improve description of XLOG_RUNNING_XACTS
On Thu, Jul 21, 2022 at 4:29 PM Kyotaro Horiguchi wrote:
>
> At Thu, 21 Jul 2022 11:21:09 +0900, Fujii Masao wrote in
> >
> > On 2022/07/21 10:13, Masahiko Sawada wrote:
> > > Hi,
> > >
> > > I realized that standby_desc_running_xacts() in standbydesc.c doesn't
> > > describe subtransaction XIDs. I've attached the patch to improve the
> > > description.
> >
> > +1
> >
> > The patch looks good to me.
>
> The subxids can reach TOTAL_MAX_CACHED_SUBXIDS =
> PGPROC_MAX_CACHED_SUBXIDS(=64) * PROCARRAY_MAXPROCS. xact_desc_commit
> also shows subtransactions but they are at maximum 64. I feel like
> -0.3 if there's no obvious advantage showing them.

xxx_desc() functions are debugging-purpose functions, as they are used
by pg_waldump and pg_walinspect etc. I think such functions should show
all contents unless there is a reason to hide them. Particularly,
standby_desc_running_xacts() currently shows subtransaction information
only when subtransactions are overflowed, which got me confused when
inspecting WAL records.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
Re: Remove fls(), use pg_bitutils.h facilities instead?
On Thu, Jul 21, 2022 at 1:34 AM Tom Lane wrote:
> How is it sane to ask for a segment bin for zero pages? Seems like
> something should have short-circuited such a case well before here.

It's intended. There are two ways you can arrive here with n == 0:

* There's a special case in execParallel.c that creates a DSA segment
"in-place" with initial size dsa_minimum_size(). That's because we don't
know yet if we have any executor nodes that need a DSA segment (Parallel
Hash, Parallel Bitmap Heap Scan), so we create one with the minimum
amount of space other than the DSA control meta-data, so you get an
in-place segment 0 with 0 usable pages. As soon as someone tries to
allocate one byte, the first external DSM segment will be created.

* A full segment can be re-binned into slot 0.

On Thu, Jul 21, 2022 at 1:48 AM Tom Lane wrote:
> Seems like passing a size_t to pg_leftmost_one_pos32 isn't great.
> It was just as wrong before (if the caller-supplied argument is
> indeed a size_t), but no time like the present to fix it.
>
> We could have pg_bitutils.h #define pg_leftmost_one_pos_size_t
> as the appropriate one of pg_leftmost_one_pos32/64, perhaps.

Yeah.

From e2b8448932ec7606c61dd847c3370ae9c31d23e9 Mon Sep 17 00:00:00 2001
From: Thomas Munro
Date: Thu, 21 Jul 2022 18:07:45 +1200
Subject: [PATCH v3 1/2] Extend size_t support in pg_bitutils.h.

Use a more compact notation that allows us to add more size_t variants
as required. This will be used by a later commit.

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/CA%2BhUKG%2B7dSX1XF8yFGmYk-%3D48dbjH2kmzZj16XvhbrWP-9BzRg%40mail.gmail.com
---
 src/include/port/pg_bitutils.h | 32 ++++++++++++--------------------
 1 file changed, 12 insertions(+), 20 deletions(-)

diff --git a/src/include/port/pg_bitutils.h b/src/include/port/pg_bitutils.h
index 04e58cd1c4..814e0b2dba 100644
--- a/src/include/port/pg_bitutils.h
+++ b/src/include/port/pg_bitutils.h
@@ -175,16 +175,6 @@ pg_nextpower2_64(uint64 num)
 	return ((uint64) 1) << (pg_leftmost_one_pos64(num) + 1);
 }

-/*
- * pg_nextpower2_size_t
- *		Returns the next higher power of 2 above 'num', for a size_t input.
- */
-#if SIZEOF_SIZE_T == 4
-#define pg_nextpower2_size_t(num) pg_nextpower2_32(num)
-#else
-#define pg_nextpower2_size_t(num) pg_nextpower2_64(num)
-#endif
-
 /*
  * pg_prevpower2_32
  *		Returns the next lower power of 2 below 'num', or 'num' if it's
@@ -211,16 +201,6 @@ pg_prevpower2_64(uint64 num)
 	return ((uint64) 1) << pg_leftmost_one_pos64(num);
 }

-/*
- * pg_prevpower2_size_t
- *		Returns the next lower power of 2 below 'num', for a size_t input.
- */
-#if SIZEOF_SIZE_T == 4
-#define pg_prevpower2_size_t(num) pg_prevpower2_32(num)
-#else
-#define pg_prevpower2_size_t(num) pg_prevpower2_64(num)
-#endif
-
 /*
  * pg_ceil_log2_32
  *		Returns equivalent of ceil(log2(num))
@@ -299,4 +279,16 @@ pg_rotate_left32(uint32 word, int n)
 	return (word << n) | (word >> (32 - n));
 }

+/* size_t variants of the above, as required */
+
+#if SIZEOF_SIZE_T == 4
+#define pg_leftmost_one_pos_size_t pg_leftmost_one_pos32
+#define pg_nextpower2_size_t pg_nextpower2_32
+#define pg_prevpower2_size_t pg_prevpower2_32
+#else
+#define pg_leftmost_one_pos_size_t pg_leftmost_one_pos64
+#define pg_nextpower2_size_t pg_nextpower2_64
+#define pg_prevpower2_size_t pg_prevpower2_64
+#endif
+
 #endif							/* PG_BITUTILS_H */
--
2.36.1

From ad854c0b7fa88a12ec957f9b5bef9b0510e424ac Mon Sep 17 00:00:00 2001
From: Thomas Munro
Date: Tue, 19 Jul 2022 17:47:10 +1200
Subject: [PATCH v3 2/2] Remove fls function.

Commit 4f658dc8 provided the traditional BSD fls() function so it could
be used in several places. Later we added a bunch of similar sorts of
things in pg_bitutils.h. Let's drop fls.c and the configure probe, and
reuse the newer code.

Reviewed-by: David Rowley
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/CA%2BhUKG%2B7dSX1XF8yFGmYk-%3D48dbjH2kmzZj16XvhbrWP-9BzRg%40mail.gmail.com
---
 configure                              | 13 --
 configure.ac                           |  1 -
 src/backend/access/hash/hashutil.c     |  2 +-
 src/backend/optimizer/path/allpaths.c  |  5 +-
 src/backend/optimizer/prep/prepunion.c |  2 +-
 src/backend/utils/mmgr/dsa.c           | 14 +-
 src/include/pg_config.h.in             |  3 --
 src/include/port.h                     |  4 --
 src/port/fls.c                         | 64 --
 src/tools/msvc/Mkvcbuild.pm            |  2 +-
 src/tools/msvc/Solution.pm             |  1 -
 11 files changed, 19 insertions(+), 92 deletions(-)
 delete mode 100644 src/port/fls.c

diff --git a/configure b/configure
index 59fa82b8d7..33a425562f 100755
--- a/configure
+++ b/configure
@@ -16771,19 +16771,6 @@ esac
 fi

-ac_fn_c_check_func "$LINENO" "fls" "ac_cv_func_fls"
-if test "x$ac_cv_func_fls" = xyes; then :
-  $as_echo "#define HAVE_FLS 1" >>confdefs.h
-
-else
-  case " $LIBOBJS " in
-  *" fls.$ac_objext "* ) ;;
-  *) LIBOBJS="$LIBOBJS fls.$ac_objext"
 ;
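(The dsa.c hunk is cut off above; for reference, the shape of the rewritten
fls() call site, sketched from the discussion in this thread rather than
quoted from the patch, would be:

/* in src/backend/utils/mmgr/dsa.c, approximately: */
static inline size_t
contiguous_pages_to_segment_bin(size_t n)
{
	size_t		bin;

	if (n == 0)
		bin = 0;				/* the two n == 0 cases explained above */
	else
		bin = pg_leftmost_one_pos_size_t(n) + 1;	/* previously: fls(n) */

	return Min(bin, DSA_NUM_SEGMENT_BINS - 1);
}
)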
[PATCH v1] eliminate duplicate code in table.c
There is some duplicate code in table.c; add a static inline function
to eliminate the duplicates.

--
Regards
Junwang Zhao

0001-eliminate-duplicate-code-in-table.c.patch
Description: Binary data
Re: Improve description of XLOG_RUNNING_XACTS
At Thu, 21 Jul 2022 16:58:45 +0900, Masahiko Sawada wrote in
> > > The patch looks good to me.

By the way, +1 to this.

> > The subxids can reach TOTAL_MAX_CACHED_SUBXIDS =
> > PGPROC_MAX_CACHED_SUBXIDS(=64) * PROCARRAY_MAXPROCS. xact_desc_commit
> > also shows subtransactions but they are at maximum 64. I feel like
> > -0.3 if there's no obvious advantage showing them.
>
> xxx_desc() functions are debugging purpose functions as they are used
> by pg_waldump and pg_walinspect etc. I think such functions should
> show all contents unless there is reason to hide them. Particularly,
> standby_desc_running_xacts() currently shows subtransaction
> information only when subtransactions are overflowed, which got me
> confused when inspecting WAL records.

I'm not sure confusion alone justifies that, but after finding that
logicalmsg_desc dumps the whole content, I no longer feel opposed to
showing subxacts.

Just for information: as far as I saw, relmap_desc shows only the
length of "data" but doesn't dump all of it. generic_desc behaves the
same way. Thus we could just show "%d subxacts" instead of dumping out
all the subxact ids, just to avoid that confusion. However, again, I no
longer object to showing all subxids.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
Re: Handle infinite recursion in logical replication setup
On Wed, Jul 20, 2022 at 2:33 PM vignesh C wrote:
>
> Modified. Apart from this I have run pgperltidy on the perl file and
> renamed 032_origin.pl to 030_origin.pl as currently there is
> 029_on_error.pl, 031_column_list.pl and there is no 030_*.pl file.
> Thanks for the comment, the attached patch has the changes for the same.

Pushed. Kindly rebase the remaining patches.

--
With Regards,
Amit Kapila.
add a missing space
This is a minor fix that adds a missing space in file lockdefs.h.

--
Regards
Junwang Zhao

0001-add-a-missing-space.patch
Description: Binary data
Re: [PATCH] Introduce array_shuffle() and array_sample()
On Tue, 19 Jul 2022 at 21:21, Martin Kalcher wrote:
>
> Here is a patch with dimension aware array_shuffle() and array_sample().

+1 for this feature, and this way of handling multi-dimensional arrays.

> If you think array_flatten() is desirable, i can take a look at it.

That's not something I've ever wanted -- personally, I rarely use
multi-dimensional arrays in Postgres.

A couple of quick comments on the current patch:

It's important to mark these new functions as VOLATILE, not IMMUTABLE,
otherwise they won't work as expected in queries. See
https://www.postgresql.org/docs/current/xfunc-volatility.html

It would be better to use pg_prng_uint64_range() rather than rand() to
pick elements. Partly, that's because it uses a higher quality PRNG,
with a larger internal state, and it ensures that the results are
unbiased across the range. But more importantly, it interoperates with
setseed(), allowing predictable sequences of "random" numbers to be
generated -- something that's useful in writing repeatable regression
tests.

Assuming these new functions are made to interoperate with setseed(),
which I think they should be, then they also need to be marked as
PARALLEL RESTRICTED, rather than PARALLEL SAFE. See
https://www.postgresql.org/docs/current/parallel-safety.html, which
explains why setseed() and random() are parallel restricted.

In my experience, the requirement for sampling with replacement is
about as common as the requirement for sampling without replacement, so
it seems odd to provide one but not the other. Of course, we can always
add a with-replacement function later, and give it a different name.
But maybe array_sample() could be used for both, with a
"with_replacement" boolean parameter?

Regards,
Dean
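(To illustrate the pg_prng_uint64_range() suggestion, an unbiased
Fisher-Yates shuffle over the array elements might look like the sketch
below; it uses the global PRNG state for simplicity -- the seeding question
is picked up later in the thread -- and is not the patch's actual code:

#include "postgres.h"
#include "common/pg_prng.h"

/*
 * Sketch: shuffle an array of Datums in place. Drawing j uniformly from
 * [0, i] with pg_prng_uint64_range() avoids the modulo bias of rand() % n.
 */
static void
shuffle_datums(Datum *elems, int nelems)
{
	for (int i = nelems - 1; i > 0; i--)
	{
		int			j = (int) pg_prng_uint64_range(&pg_global_prng_state, 0, i);
		Datum		tmp = elems[i];

		elems[i] = elems[j];
		elems[j] = tmp;
	}
}
)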
Re: i.e. and e.g.
At Thu, 21 Jul 2022 12:30:04 +0700, John Naylor wrote in
> On Thu, Jul 21, 2022 at 11:22 AM Kyotaro Horiguchi wrote:
> >
> > At Thu, 21 Jul 2022 10:20:30 +0700, John Naylor
> > <john.nay...@enterprisedb.com> wrote in
> > > need to change back branches. It does make sense for v15, though, and I
> > > just forgot to consider that.
> >
> > Ok, I'm fine with that. Thanks!
>
> This is done.

Thanks!

--
Kyotaro Horiguchi
NTT Open Source Software Center
Re: make -C libpq check fails obscurely if tap tests are disabled
On 2022-Jul-20, Andrew Dunstan wrote:

> On 2022-07-20 We 13:23, Justin Pryzby wrote:
> > PATH=... && @echo "TAP tests not enabled. Try configuring with
> > --enable-tap-tests"
> > /bin/sh: 1: @echo: not found
> >
> > make is telling the shell to run "@echo", rather than running "echo"
> > silently.
>
> Yeah. It's a bit ugly but I think the attached would fix it.

Here's a different take. Just assign the variable separately.

--
Álvaro Herrera        PostgreSQL Developer — https://www.EnterpriseDB.com/
"No renuncies a nada. No te aferres a nada."

>From 9dee1cfa2fd819d9242e725dcd35e3951924647d Mon Sep 17 00:00:00 2001
From: Alvaro Herrera
Date: Thu, 21 Jul 2022 10:53:06 +0200
Subject: [PATCH] Fix interfaces/libpq makefile

---
 src/interfaces/libpq/Makefile | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/interfaces/libpq/Makefile b/src/interfaces/libpq/Makefile
index b5fd72a4ac..8abdb092c2 100644
--- a/src/interfaces/libpq/Makefile
+++ b/src/interfaces/libpq/Makefile
@@ -143,11 +143,13 @@ install: all installdirs install-lib
 test-build:
 	$(MAKE) -C test all

+check installcheck: export PATH := $(CURDIR)/test:$(PATH)
+
 check: test-build all
-	PATH="$(CURDIR)/test:$$PATH" && $(prove_check)
+	$(prove_check)

 installcheck: test-build all
-	PATH="$(CURDIR)/test:$$PATH" && $(prove_installcheck)
+	$(prove_installcheck)

 installdirs: installdirs-lib
 	$(MKDIR_P) '$(DESTDIR)$(includedir)' '$(DESTDIR)$(includedir_internal)' '$(DESTDIR)$(datadir)'
--
2.30.2
Re: Is it correct to say, "invalid data in file \"%s\"", BACKUP_LABEL_FILE in do_pg_backup_stop?
At Wed, 20 Jul 2022 17:09:09 +0530, Bharath Rupireddy wrote in
> Hi,
>
> After the commit [1], is it correct to say errmsg("invalid data in file
> \"%s\"", BACKUP_LABEL_FILE))); in do_pg_backup_stop() when we hold the
> contents in backend global memory, not actually reading from backup_label
> file? However, it is correct to say that in read_backup_label.
>
> IMO, we can either say "invalid backup_label contents found" or we can be
> more descriptive and say "invalid "START WAL LOCATION" line found in
> backup_label content" and "invalid "BACKUP FROM" line found in
> backup_label content" and so on.
>
> Thoughts?

Previously there was the case where the "char *labelfile" was loaded
from a file, but currently it is always a string built by the process.
In that sense, nowadays it is a kind of internal error, which I think
is not supposed to be exposed to users. So I think we can leave the
code alone to avoid back-patching obstacles.

But if we decided to change the code around, I'd like to change the
string into a C struct, so that we don't need to parse it.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
RE: Data is copied twice when specifying both child and parent table in publication
On Thu, Jul 14, 2022 at 12:46 PM Peter Smith wrote:
> Here are some review comments for the v6 patch (HEAD only):

Thanks for your comments.

> 1. Commit message
>
> If there are two publications that publish the parent table and the child table
> separately, and both specify the option PUBLISH_VIA_PARTITION_ROOT, subscribing
> to both publications from one subscription causes initial copy twice. What we
> expect is to be copied only once.
>
> ~
>
> I don't think the parameter even works in uppercase, so maybe better to say:
> PUBLISH_VIA_PARTITION_ROOT -> 'publish_via_partition_root'

It seems that there are more places that use lowercase than uppercase,
so I improved it as suggested.

> 2.
>
> What we expect is to be copied only once.
>
> SUGGESTION
> It should only be copied once.
>
> ~~~
>
> 3.
>
> To fix this, we extend the API of the function pg_get_publication_tables.
> Now, the function pg_get_publication_tables could receive the publication list.
> And then, if we specify option viaroot, we could exclude the partitioned table
> whose ancestor belongs to the publication list when getting the table
> information.
>
> ~
>
> Don't you mean "partition table" instead of "partitioned table"?
>
> SUGGESTION (also reworded)
> To fix this, the API function pg_get_publication_tables has been
> extended to take a publication list. Now, when getting the table
> information, if the publish_via_partition_root is true, the function
> can exclude a partition table whose ancestor is also published by the
> same publication list.

Improved and fixed as suggested.

> 4. src/backend/catalog/pg_publication.c - pg_get_publication_tables
>
> - publication = GetPublicationByName(pubname, false);
> + arr = PG_GETARG_ARRAYTYPE_P(0);
> + deconstruct_array(arr, TEXTOID, -1, false, TYPALIGN_INT,
> +                   &elems, NULL, &nelems);
>
> Maybe should have some comment to describe that this function
> parameter is now an array of publications names.

Added the following comment: `/* Deconstruct the parameter into elements. */`.
Also improved the comment above the function pg_get_publication_tables:
`Returns information of tables in one or more publications.` -->
`Returns information of the tables in the given publication array.`

> 5.
>
> + /* get Oids of tables from each publication */
>
> Uppercase comment

Improved as suggested.

> 6.
>
> + ArrayType  *arr;
> + Datum      *elems;
> + int         nelems,
> +             i;
> + Publication *publication;
> + bool        viaroot = false;
> + List       *pub_infos = NIL;
> + ListCell   *lc1,
> +            *lc2;
>
> The 'publication' should be declared only in the loop that uses it.
> It's also not good that this is shadowing the same variable name in a
> later declaration.

Reverted changes to variable "publication" declarations.

> 7.
>
> + * Publications support partitioned tables, although all changes
> + * are replicated using leaf partition identity and schema, so we
> + * only need those.
>   */
> + if (publication->alltables)
> +     current_tables = GetAllTablesPublicationRelations(publication->pubviaroot);
> + else
> + {
> +     List       *relids,
> +                *schemarelids;
> +
> +     relids = GetPublicationRelations(publication->oid,
> +                                      publication->pubviaroot ?
> +                                      PUBLICATION_PART_ROOT :
> +                                      PUBLICATION_PART_LEAF);
> +     schemarelids = GetAllSchemaPublicationRelations(publication->oid,
> +                                                     publication->pubviaroot ?
> +                                                     PUBLICATION_PART_ROOT :
> +                                                     PUBLICATION_PART_LEAF);
> +     current_tables = list_concat(relids, schemarelids);
> + }
>
> Somehow I was confused by this comment because it says you only need
> the LEAF tables but then the subsequent code is getting ROOT relations
> anyway... Can you clarify the comment some more?

I think this is a slight mistake from when the publication parameter
"publish_via_partition_root" was introduced before. I improved the
comment to the following:
```
Publications support partitioned tables. If publish_via_partition_root
is false, all changes are replicated using leaf partition identity and
schema, so we only need those. Otherwise, if publish_via_partition_root
is true, get the partitioned table itself.
```

> 8.
>
> + bool viaroot = false;
>
> I think that should have a comment something like:
> /* At least one publication is using publish_via_partition_root */

Improved as suggested.

> 9.
>
> + /*
> +  * Record the published table and the corresponding publication so
> +  * that we can get row filters and column list later.
> +  */
> + foreach(lc1, tables)
> + {
> +     Oid relid = lfirst_oid(lc1);
> +
> +     foreach(lc2, pub_infos)
> +     {
> +         pub_info *pubinfo = (pub_info *) lfirst(lc2);
> +
> +         if (list_member_oid(pubinfo->table_list, relid))
> +         {
> +             Oid *result = (Oid *) malloc(sizeof(Oid) * 2);
> +
> +             result[0] = relid;
> +             result[1] = pubinfo->pubid;
> +
> +             results = lappend(results, result);
> +         }
> +     }
> + }
>
> I felt a bit uneasy about the double-looping here. I wonder if these
> 'results' could have been accumulated within the existing loop over
> all publications. Th
Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
On Wed, Jul 20, 2022 at 8:19 PM Önder Kalacı wrote:
>
>> > - Add a new class of invalidation callbacks: Today, if we do ALTER TABLE
>> > or CREATE INDEX on a table, the CacheRegisterRelcacheCallback helps us to
>> > re-create the cache entries. In this case, as far as I can see, we need a
>> > callback that is called when table "ANALYZE"d, because that is when the
>> > statistics change. That is the time picking a new index makes sense.
>> > However, that seems like adding another dimension to this patch, which I
>> > can try but also see that committing becomes even harder.
>>
>> This idea sounds worth investigating. I see that this will require
>> more work but OTOH, we can't allow the existing system to regress
>> especially because depending on workload it might regress badly.
>
> Just to note if that is not clear: This patch avoids (or at least aims to
> avoid, assuming no bugs) changing the behavior of the existing systems with
> PRIMARY KEY or UNIQUE index. In that case, we still use the relevant indexes.
>
>> We can create a patch for this atop the base patch for easier review/test
>> but I feel we need some way to address this point.
>
> Another idea could be to re-calculate the index, say after N
> updates/deletes on the table. We may consider using a subscription
> parameter for getting N -- with a good default, or even hard-coding it.
> I think the cost of re-calculating should really be pretty small compared
> to the other things happening during logical replication. So, a sane
> default might work?

One difficulty in deciding the value of N for the user, or choosing a
default, would be that we probably need to consider the local DML
operations on the table as well.

> If you think the above doesn't work, I can try to work on a separate patch
> which adds something like "analyze invalidation callback".

I suggest we should give this a try, and if this turns out to be
problematic or complex then we can think of using some heuristic as you
are suggesting above.

>> Now, your point related to planner code in the patch bothers me as
>> well but I haven't studied the patch in detail to provide any
>> alternatives at this stage. Do you have any other ideas to make it
>> simpler or solve this problem in some other way?
>
> One idea I tried earlier was to go over the existing indexes on the
> table, then get the IndexInfo via BuildIndexInfo(). And then, try to
> find a good heuristic to pick an index. In the end, I felt like that is
> doing a sub-optimal job, requiring a similar amount of code as the
> current patch, and still using similar infrastructure.
>
> My conclusion for that route was I should either use a very simple
> heuristic (like pick the index with the most columns) and have a
> suboptimal index pick,

Not only that, but say all indexes have the same number of columns;
then we probably need to either pick the first such index or use some
other heuristic.

> OR use a complex heuristic with a reasonable index pick. And, the latter
> approach converged to the planner code in the patch. Do you think the
> former approach is acceptable?

In this regard, I was thinking about the cases in which a sequential
scan can be better than an index scan (considering one is available). I
think if a certain column has a lot of duplicates (for example, a
column holding a boolean value) then probably doing a sequential scan
is better. Now, considering this, even though your other approach
sounds simpler, it could lead to unpredictable results. So, I think the
latter approach is preferable.

BTW, do we want to consider partial indexes for the scan in this
context? I mean, a partial index may not have data for all rows, so how
would that be usable?

Few comments:
===
1.
static List *
+CreateReplicaIdentityFullPaths(Relation localrel)
{
...
+ /*
+  * Rather than doing all the pushups that would be needed to use
+  * set_baserel_size_estimates, just do a quick hack for rows and width.
+  */
+ rel->rows = rel->tuples;

Is it a good idea to set rows without any selectivity estimation? Won't
this always set the entire rows in a relation? Also, if we don't want
to use set_baserel_size_estimates(), how will we compute
baserestrictcost, which will later be used in the costing of paths (for
example, the costing of the seqscan path (cost_seqscan) uses it)? In
general, I think it will be better to consider calling some top-level
planner functions even for paths. Can we consider using make_one_rel()
instead of building individual paths? On similar lines, in function
PickCheapestIndexPathIfExists(), can we use set_cheapest()?

2.
@@ -57,9 +60,6 @@ build_replindex_scan_key(ScanKey skey, Relation rel, Relation idxrel,
 	int2vector *indkey = &idxrel->rd_index->indkey;
 	bool		hasnulls = false;

-	Assert(RelationGetReplicaIndex(rel) == RelationGetRelid(idxrel) ||
-		   RelationGetPrimaryKeyIndex(rel) == RelationGetRelid(idxrel));

You have removed this assertion, but there is a comment ("This i
Re: Refactor to make use of a common function for GetSubscriptionRelations and GetSubscriptionNotReadyRelations.
On Thu, Jul 21, 2022 at 10:03 AM Michael Paquier wrote:
>
> On Thu, Jul 21, 2022 at 09:54:05AM +0530, Amit Kapila wrote:
> > Yeah, it is not very clear to me either. I think this won't be
> > difficult to change one way or another depending on future needs. At
> > this stage, it appeared to me that a bitmask is a better way to
> > represent this information, but if you and others feel using an enum
> > is a better idea then I am fine with that as well.
>
> Please don't get me wrong :)
>
> I favor a bitmask over an enum here, as you do, with a clean
> layer for those flags.

Okay, let's see what Peter Smith has to say about this, as he was in
favor of using an enum here.

--
With Regards,
Amit Kapila.
Re: add a missing space
On Thu, Jul 21, 2022 at 2:08 PM Junwang Zhao wrote:
>
> This is a minor fix that adds a missing space in file lockdefs.h.

LGTM. I'll push this in some time.

--
With Regards,
Amit Kapila.
Re: add a missing space
Great, thanks!

On Thu, Jul 21, 2022 at 6:13 PM Amit Kapila wrote:
>
> On Thu, Jul 21, 2022 at 2:08 PM Junwang Zhao wrote:
> >
> > This is a minor fix that adds a missing space in file lockdefs.h.
>
> LGTM. I'll push this in some time.
>
> --
> With Regards,
> Amit Kapila.

--
Regards
Junwang Zhao
Re: Handle infinite recursion in logical replication setup
On Thu, Jul 21, 2022 at 2:06 PM Amit Kapila wrote:
>
> On Wed, Jul 20, 2022 at 2:33 PM vignesh C wrote:
> >
> > Modified. Apart from this I have run pgperltidy on the perl file and
> > renamed 032_origin.pl to 030_origin.pl as currently there is
> > 029_on_error.pl, 031_column_list.pl and there is no 030_*.pl file.
> > Thanks for the comment, the attached patch has the changes for the same.
>
> Pushed. Kindly rebase the remaining patches.

Thanks for pushing the patch. The attached v37 version contains the
rebased patch for the remaining patches.

Regards,
Vignesh

From c162bfe2367f71b6e9a2e13ee00b68b61d08840e Mon Sep 17 00:00:00 2001
From: Vigneshwaran C
Date: Mon, 27 Jun 2022 18:44:18 +0530
Subject: [PATCH v37 2/2] Document bidirectional logical replication steps in
 various scenarios.

Document the steps for the following:
a) Setting bidirectional replication between two nodes.
b) Adding a new node when there is no table data on any of the nodes.
c) Adding a new node when table data is present on the existing nodes.
d) Generic steps for adding a new node to an existing set of nodes.

Author: Vignesh C
Reviewed-By: Peter Smith, Amit Kapila, Shi yu
Discussion: https://www.postgresql.org/message-id/CALDaNm0gwjY_4HFxvvty01BOT01q_fJLKQ3pWP9=9orqubh...@mail.gmail.com
---
 doc/src/sgml/logical-replication.sgml     | 301 ++
 doc/src/sgml/ref/create_subscription.sgml |   5 +-
 2 files changed, 305 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index bdf1e7b727..c94b3bfd27 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -1479,4 +1479,305 @@ CREATE SUBSCRIPTION mysub CONNECTION 'dbname=foo host=bar user=repuser' PUBLICAT
    incremental changes to those tables.
+
+
+ Bidirectional logical replication
+
+
+Bidirectional replication is useful for creating a multi-master database
+environment for replicating read/write operations performed by any of the
+member nodes. The steps to create a bidirectional replication in various
+scenarios are given below.
+
+
+
+
+ Setting up bidirectional logical replication requires multiple steps to be
+ performed on various nodes. Because not all operations are transactional,
+ the user is advised to take backups.
+
+
+
+ Setting bidirectional replication between two nodes
+
+The following steps demonstrate how to create a two-node bidirectional
+replication when there is no table data present on both nodes
+node1 and node2:
+
+
+
+Create a publication on node1:
+
+node1=# CREATE PUBLICATION pub_node1 FOR TABLE t1;
+CREATE PUBLICATION
+
+
+
+Create a publication on node2:
+
+node2=# CREATE PUBLICATION pub_node2 FOR TABLE t1;
+CREATE PUBLICATION
+
+
+
+Lock the table t1 on node1 and
+node2 in EXCLUSIVE mode until the
+setup is completed.
+
+
+
+Create a subscription on node2 to subscribe to
+node1:
+
+node2=# CREATE SUBSCRIPTION sub_node2_node1
+node2-# CONNECTION 'dbname=foo host=node1 user=repuser'
+node2-# PUBLICATION pub_node1
+node2-# WITH (copy_data = off, origin = none);
+CREATE SUBSCRIPTION
+
+
+
+Create a subscription on node1 to subscribe to
+node2:
+
+node1=# CREATE SUBSCRIPTION sub_node1_node2
+node1-# CONNECTION 'dbname=foo host=node2 user=repuser'
+node1-# PUBLICATION pub_node2
+node1-# WITH (copy_data = off, origin = none);
+CREATE SUBSCRIPTION
+
+
+
+Now the bidirectional logical replication setup is complete between
+node1 and node2. Any incremental
+changes from node1 will be replicated to
+node2, and any incremental changes from
+node2 will be replicated to node1.
+
+
+
+
+ Adding a new node when there is no table data on any of the nodes
+
+The following steps demonstrate adding a new node node3
+to the existing node1 and node2 when
+there is no t1 data on any of the nodes. This requires
+creating subscriptions on node1 and
+node2 to replicate the data from
+node3 and creating subscriptions on
+node3 to replicate data from node1
+and node2. Note: These steps assume that the
+bidirectional logical replication between node1 and
+node2 is already completed.
+
+
+
+Create a publication on node3:
+
+node3=# CREATE PUBLICATION pub_node3 FOR TABLE t1;
+CREATE PUBLICATION
+
+
+
+Lock table t1 on all the nodes node1,
+node2 and node3 in
+EXCLUSIVE mode until the setup is completed.
+
+
+
+Create a subscription on node1 to subscribe to
+node3:
+
+node1=# CREATE SUBSCRIPTION sub_node1_node3
+node1-# CONNECTION 'dbname=foo host=node3 user=repuser'
+node1-# PUBLICATION pub_node3
+node1-# WITH (copy_data = off, origin = none);
+CREATE SUBSCRIPTION
+
+
+
+Create a subscription on node2 to subscribe to
+node3:
+
+node2=# CREATE SUBSCRIPTION sub_node2_node3
+node2-# CONNECT
Re: standby recovery fails (tablespace related) (tentative patch and discussion)
On 2022-Jul-20, Alvaro Herrera wrote:

> I see the following alternatives:
>
> 1. not backpatch this fix to 14 and older
> 2. use a different GUC; either allow_invalid_pages as previously
>    suggested, or create a new one just for this purpose
> 3. not provide any overriding mechanism in versions 14 and older

I've got no opinions on this. I don't like either 1 or 3, so I'm going
to add and backpatch a new GUC allow_recovery_tablespaces as the
override mechanism.

If others disagree with this choice, please speak up.

--
Álvaro Herrera        Breisgau, Deutschland — https://www.EnterpriseDB.com/
Re: [PATCH v1] eliminate duplicate code in table.c
On Thu, Jul 21, 2022 at 1:56 PM Junwang Zhao wrote:
>
> There is some duplicate code in table.c; add a static inline function
> to eliminate the duplicates.

Can we name the function validate_object_type, or check_object_type?

Otherwise, the patch looks fine to me. Let's see if others have
something to say.

--
With Regards,
Amit Kapila.
Re: standby recovery fails (tablespace related) (tentative patch and discussion)
On Wed, Jul 20, 2022 at 10:51 PM Alvaro Herrera wrote:
> v26 here. I spent some time fighting the readdir() stuff for
> Windows (so that get_dirent_type returns LNK for junction points)
> but couldn't make it to work and was unable to figure out why.

Was it because of this?

https://www.postgresql.org/message-id/CA%2BhUKGKv%2B736Pc8kSj3%3DDijDGd1eC79-uT3Vi16n7jYkcc_raw%40mail.gmail.com
Re: standby recovery fails (tablespace related) (tentative patch and discussion)
On Thu, Jul 21, 2022 at 11:01 PM Alvaro Herrera wrote:
> On 2022-Jul-20, Alvaro Herrera wrote:
> > I see the following alternatives:
> >
> > 1. not backpatch this fix to 14 and older
> > 2. use a different GUC; either allow_invalid_pages as previously
> >    suggested, or create a new one just for this purpose
> > 3. not provide any overriding mechanism in versions 14 and older
>
> I've got no opinions on this. I don't like either 1 or 3, so I'm going
> to add and backpatch a new GUC allow_recovery_tablespaces as the
> override mechanism.
>
> If others disagree with this choice, please speak up.

Would it help if we back-patched the allow_in_place_tablespaces stuff?
I'm not sure how hard/destabilising that would be, but I could take a
look tomorrow.
Re: [PATCH] Introduce array_shuffle() and array_sample()
On 21.07.22 10:41, Dean Rasheed wrote:
> A couple of quick comments on the current patch:

Thank you for your feedback!

> It's important to mark these new functions as VOLATILE, not IMMUTABLE,
> otherwise they won't work as expected in queries. See
> https://www.postgresql.org/docs/current/xfunc-volatility.html

CREATE FUNCTION marks functions as VOLATILE by default if not
explicitly marked otherwise. I assumed function definitions in
pg_proc.dat have the same behavior. I will fix that.

https://www.postgresql.org/docs/current/sql-createfunction.html

> It would be better to use pg_prng_uint64_range() rather than rand() to
> pick elements. Partly, that's because it uses a higher quality PRNG,
> with a larger internal state, and it ensures that the results are
> unbiased across the range. But more importantly, it interoperates with
> setseed(), allowing predictable sequences of "random" numbers to be
> generated -- something that's useful in writing repeatable regression
> tests.

I agree that we should use pg_prng_uint64_range(). However, in order to
achieve interoperability with setseed() we would have to use
drandom_seed (rather than pg_global_prng_state) as the rng state, which
is declared statically in float.c and exclusively used by random(). Do
we want to expose drandom_seed to other functions?

> Assuming these new functions are made to interoperate with setseed(),
> which I think they should be, then they also need to be marked as
> PARALLEL RESTRICTED, rather than PARALLEL SAFE. See
> https://www.postgresql.org/docs/current/parallel-safety.html, which
> explains why setseed() and random() are parallel restricted.

As mentioned above, I assumed the default here is PARALLEL UNSAFE. I'll
fix that.

> In my experience, the requirement for sampling with replacement is
> about as common as the requirement for sampling without replacement,
> so it seems odd to provide one but not the other. Of course, we can
> always add a with-replacement function later, and give it a different
> name. But maybe array_sample() could be used for both, with a
> "with_replacement" boolean parameter?

We can also add a "with_replacement" boolean parameter, which is false
by default, to array_sample() later. I do not have a strong opinion
about that and will implement it, if desired. Any opinions about
default/no-default?

Martin
Re: standby recovery fails (tablespace related) (tentative patch and discussion)
On 2022-Jul-21, Thomas Munro wrote:

> On Wed, Jul 20, 2022 at 10:51 PM Alvaro Herrera wrote:
> > v26 here. I spent some time fighting the readdir() stuff for
> > Windows (so that get_dirent_type returns LNK for junction points)
> > but couldn't make it to work and was unable to figure out why.
>
> Was it because of this?
>
> https://www.postgresql.org/message-id/CA%2BhUKGKv%2B736Pc8kSj3%3DDijDGd1eC79-uT3Vi16n7jYkcc_raw%40mail.gmail.com

Oh, that sounds very likely, yeah. I didn't think of testing the
FILE_ATTRIBUTE_DIRECTORY bit for junction points.

I +1 pushing both of these patches to 14. Then this patch becomes a
couple of lines shorter.

--
Álvaro Herrera        PostgreSQL Developer — https://www.EnterpriseDB.com/
"Before you were born your parents weren't as boring as they are now.
They got that way paying your bills, cleaning up your room and
listening to you tell them how idealistic you are."
-- Charles J. Sykes' advice to teenagers
Re: standby recovery fails (tablespace related) (tentative patch and discussion)
On 2022-Jul-21, Thomas Munro wrote:

> On Thu, Jul 21, 2022 at 11:01 PM Alvaro Herrera wrote:
> > I've got no opinions on this. I don't like either 1 or 3, so I'm going
> > to add and backpatch a new GUC allow_recovery_tablespaces as the
> > override mechanism.
> >
> > If others disagree with this choice, please speak up.
>
> Would it help if we back-patched the allow_in_place_tablespaces stuff?
> I'm not sure how hard/destabilising that would be, but I could take a
> look tomorrow.

Yeah, I think that would reduce cruft. I'm not sure this is more
against backpatching policy or less, compared to adding a separate GUC
just for this bugfix.

--
Álvaro Herrera        Breisgau, Deutschland — https://www.EnterpriseDB.com/
"The problem with the facetime model is not just that it's demoralizing,
but that the people pretending to work interrupt the ones actually
working." (Paul Graham)
Re: standby recovery fails (tablespace related) (tentative patch and discussion)
On 2022-Jul-21, Alvaro Herrera wrote:

> Yeah, I think that would reduce cruft. I'm not sure this is more
> against backpatching policy or less, compared to adding a separate
> GUC just for this bugfix.

cruft:

	{
		{"allow_recovery_tablespaces", PG_POSTMASTER, WAL_RECOVERY,
			gettext_noop("Continues recovery after finding invalid database directories."),
			gettext_noop("It is possible for tablespace drop to interfere with database creation "
						 "so that WAL replay is forced to create fake database directories. "
						 "These should have been dropped by the time recovery ends; "
						 "but in case they aren't, this option lets recovery continue if they "
						 "are present. Note that these directories must be removed manually afterwards."),
			GUC_NOT_IN_SAMPLE
		},
		&allow_recovery_tablespaces,
		false,
		NULL, NULL, NULL
	},

This is not a very good explanation, but I don't know how to make it
better.

--
Álvaro Herrera        PostgreSQL Developer — https://www.EnterpriseDB.com/
"I think my standards have lowered enough that now I think 'good design'
is when the page doesn't irritate the living f*ck out of me." (JWZ)
Re: [PoC] Reducing planning time when tables have many partitions
On 7/5/22 13:57, Yuya Watari wrote:
> On Mon, Jul 4, 2022 at 6:28 AM Tom Lane wrote:
> > Perhaps the bms_is_subset class could be handled in a similar way,
> > ie do a one-time pass to make a List of all EquivalenceMembers that
> > use a RelOptInfo.
>
> Thank you for giving your idea. I will try to polish up my algorithm
> based on your suggestion.

This work is of significant interest for highly partitioned
configurations. Are you still working on this patch?

According to the current state of the thread, changing the status to
'Waiting on author' may be better until the next version. Feel free to
reverse the status if you need more feedback.

--
Regards
Andrey Lepikhov
Postgres Professional
Re: [PATCH v1] eliminate duplicate code in table.c
Hi hackers,

> > There is some duplicate code in table.c; add a static inline function
> > to eliminate the duplicates.
>
> Can we name the function validate_object_type, or check_object_type?
>
> Otherwise, the patch looks fine to me. Let's see if others have
> something to say.

LGTM

--
Best regards,
Aleksander Alekseev

v2-0001-Eliminate-duplicate-code-in-table.c.patch
Description: Binary data
Re: [PATCH v1] eliminate duplicate code in table.c
On Thu, Jul 21, 2022 at 5:09 PM Aleksander Alekseev wrote:
>
> > > There is some duplicate code in table.c; add a static inline function
> > > to eliminate the duplicates.
> >
> > Can we name the function validate_object_type, or check_object_type?
> >
> > Otherwise, the patch looks fine to me. Let's see if others have
> > something to say.
>
> LGTM

@@ -161,10 +121,32 @@ table_openrv_extended(const RangeVar *relation, LOCKMODE lockmode,
  *
  * Note that it is often sensible to hold a lock beyond relation_close;
  * in that case, the lock is released automatically at xact end.
- *
+ *
  */
 void
 table_close(Relation relation, LOCKMODE lockmode)

I don't think this change should be part of this patch. Do you see a
reason for doing this?

--
With Regards,
Amit Kapila.
Re: [PATCH v1] eliminate duplicate code in table.c
Hi Amit,

> I don't think this change should be part of this patch. Do you see a
> reason for doing this?

My bad. I thought this was done by pgindent.

--
Best regards,
Aleksander Alekseev

v3-0001-Eliminate-duplicate-code-in-table.c.patch
Description: Binary data
Re: [RFC] building postgres with meson - v10
Hi,

On 2022-07-18 23:23:27 +0300, Andres Freund wrote:
> Bilal, Peter previously commented on the pg_regress change for ecpg, perhaps
> you can comment on that?
>
> In https://postgr.es/m/0e81e45c-c9a5-e95b-2782-ab2dfec8bf57%40enterprisedb.com
> On 2022-07-06 11:03:31 +0200, Peter Eisentraut wrote:
> > dff7b5a960 meson: prereq: regress: allow to specify director containing
> > expected files.
> >
> > This could use a bit more explanation, but it doesn't look
> > controversial so far

While testing ECPG, the C and exe files are generated by meson, so
these files are in meson's build directory, but the expected files are
in the source directory. However, there was no way to set different
paths for the input (C and exe) files' and expected files' directories.
So I added `--expecteddir` to separately set the expected files'
directory.

Greetings,

Nazir Bilal Yavuz

On Mon, 18 Jul 2022 at 23:23, Andres Freund wrote:
> Hi,
>
> On 2022-07-18 11:33:09 +0200, Peter Eisentraut wrote:
> > The following patches are ok to commit IMO:
> >
> > a1c5542929 prereq: Deal with paths containing \ and spaces in basebackup_to_shell tests
> > e37951875d meson: prereq: psql: Output dir and dependency generation for sql_help
> > 18cc9fbd02 meson: prereq: ecpg: Add and use output directory argument for preproc/*.pl
> > 58a32694e9 meson: prereq: Move snowball_create.sql creation into perl file
> > 59b8bffdaf meson: prereq: Add output path arg in generate-lwlocknames.pl
> > 2db97b00d5 meson: prereq: generate-errcodes.pl: Accept output file
> > fb8f52f21d meson: prereq: unicode: Allow to specify output directory
> > 8f1e4410d6 meson: prereq: Refactor dtrace postprocessing make rules
> > 3d18a20b11 meson: prereq: Add --outdir option to gen_node_support.pl
>
> I pushed these. Thanks for the reviews and patches!
>
> The symbol export stuff has also been pushed (discussed in a separate
> thread).
>
> It's nice to see the meson patchset length reduced by this much.
>
> I pushed a rebased version of the remaining branches to git. I'll be on
> vacation for a bit, I'm not sure I can get a new version with further
> cleanups out before.
>
> Given that we can't use src/tools/gen_versioning_script.pl for the make
> build, due to not depending on perl for tarball builds, I'm inclined to
> rewrite it in python (which we depend on via meson anyway) and consider
> it a meson specific wrapper?
>
> Bilal, Peter previously commented on the pg_regress change for ecpg, perhaps
> you can comment on that?
>
> In https://postgr.es/m/0e81e45c-c9a5-e95b-2782-ab2dfec8bf57%40enterprisedb.com
> On 2022-07-06 11:03:31 +0200, Peter Eisentraut wrote:
> > dff7b5a960 meson: prereq: regress: allow to specify director containing
> > expected files.
> >
> > This could use a bit more explanation, but it doesn't look
> > controversial so far.
>
> Greetings,
>
> Andres Freund
Re: [PATCH v1] eliminate duplicate code in table.c
On 2022-Jul-21, Junwang Zhao wrote:

> There is some duplicate code in table.c; add a static inline function
> to eliminate the duplicates.

Hmm, but see commit 2ed532ee8c47 about this kind of check. Perhaps we
should change these error messages to conform to the same message
style.

--
Álvaro Herrera        48°01'N 7°57'E — https://www.EnterpriseDB.com/
"I love the Postgres community. It's all about doing things _properly_. :-)"
(David Garamond)
Re: [PATCH] Introduce array_shuffle() and array_sample()
On Thu, 21 Jul 2022 at 12:15, Martin Kalcher wrote:
>
> I agree that we should use pg_prng_uint64_range(). However, in order to
> achieve interoperability with setseed() we would have to use
> drandom_seed (rather than pg_global_prng_state) as the rng state, which
> is declared statically in float.c and exclusively used by random(). Do
> we want to expose drandom_seed to other functions?

Ah, I didn't realise that setseed() and random() were bound up so
tightly. It does feel as though, if we're adding more user-facing
functions that return random sequences, there ought to be a way to seed
them, and I wouldn't want to have separate setseed functions for each
one.

I'm inclined to say that we want a new pg_global_prng_user_state that
is updated by setseed(), and used by random(), array_shuffle(),
array_sample(), and any other user-facing random functions we add
later.

I can also see that others might not like expanding the scope of
setseed() in this way.

Regards,
Dean
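(A rough sketch of that idea -- all names below are hypothetical and simply
mirror the existing drandom_seed/pg_prng_fseed() handling in float.c;
nothing here is committed code:

#include "common/pg_prng.h"

/* hypothetical: user-facing PRNG state shared by random(),
 * array_shuffle(), array_sample(), ... */
pg_prng_state pg_global_prng_user_state;

/* setseed() would then seed it from its float8 argument, e.g.: */
static void
seed_user_prng(double seed)
{
	pg_prng_fseed(&pg_global_prng_user_state, seed);
}
)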
Re: [RFC] building postgres with meson - v10
Hi,

Sorry for the first email.

On Mon, 18 Jul 2022 at 23:23, Andres Freund wrote:
>
> In https://postgr.es/m/0e81e45c-c9a5-e95b-2782-ab2dfec8bf57%40enterprisedb.com
> On 2022-07-06 11:03:31 +0200, Peter Eisentraut wrote:
> > dff7b5a960 meson: prereq: regress: allow to specify director containing
> > expected files.
> >
> > This could use a bit more explanation, but it doesn't look
> > controversial so far.

While testing ECPG, the C and exe files are generated by meson, so
these files are in meson's build directory, but the expected files are
in the source directory. However, there was no way to set different
paths for the input (C and exe) files' and expected files' directories.
So I added `--expecteddir` to separately set the expected files'
directory.

Greetings,

Nazir Bilal Yavuz
Re: [PATCH v1] eliminate duplicate code in table.c
Hi Alvaro,

> Hmm, but see commit 2ed532ee8c47 about this kind of check. Perhaps we
> should change these error messages to conform to the same message style.

Good point! Done.

--
Best regards,
Aleksander Alekseev

v4-0001-Eliminate-duplicate-code-in-table.c.patch
Description: Binary data
Re: MultiXact\SLRU buffers configuration
Good day, all.

I did a benchmark of the patch on a 2-socket Xeon 5220 CPU @ 2.20GHz.
I used the "benchmark" we use to reproduce problems with SLRU on our
customers' setups. In contrast to Shawn's tests, I concentrated on the
bad case: a lot of contention.

slru-funcs.sql - function definitions
  - the functions create a lot of subtransactions to stress subtrans
  - and select random rows "for share" to stress multixacts

slru-call.sql - function call for benchmark

slru-ballast.sql - randomly select 1000 consequent rows
  "for update skip locked" to stress multixacts

patch1 - make SLRU buffers configurable
patch2 - make "8-associative banks"

Benchmark done by pgbench. Inited with scale 1 to induce contention:

  pgbench -i -s 1 testdb

Benchmark 1:
- low number of connections (50), 60% slru-call, 40% slru-ballast

  pgbench -f slru-call.sql@60 -f slru-ballast.sql@40 -c 50 -j 75 -P 1 -T 30 testdb

version | subtrans | multixact | tps
        | buffers  | offs/memb | func+ballast
--------+----------+-----------+--------------
master  |       32 |      8/16 | 184+119
patch1  |       32 |      8/16 | 184+119
patch1  |     1024 |      8/16 | 121+77
patch1  |     1024 |  512/1024 | 118+75
patch2  |       32 |      8/16 | 190+122
patch2  |     1024 |      8/16 | 190+125
patch2  |     1024 |  512/1024 | 190+127

As you see, this test case degrades with a dumb increase of SLRU
buffers. But using a "hash table" in the form of "associative buckets"
makes performance stable.

Benchmark 2:
- high connection number (600), 98% slru-call, 2% slru-ballast

  pgbench -f slru-call.sql@98 -f slru-ballast.sql@2 -c 600 -j 75 -P 1 -T 30 testdb

I don't paste "ballast" tps here since 2% makes them too small, and
they're very noisy.

version | subtrans | multixact | tps
        | buffers  | offs/memb | func
--------+----------+-----------+------
master  |       32 |      8/16 | 13
patch1  |       32 |      8/16 | 13
patch1  |     1024 |      8/16 | 31
patch1  |     1024 |  512/1024 | 53
patch2  |       32 |      8/16 | 12
patch2  |     1024 |      8/16 | 34
patch2  |     1024 |  512/1024 | 67

In this case a simple buffer increase does help, but the "buckets"
increase the performance gain.

I didn't paste here the results of the third part of the patch ("Pack
SLRU...") because I didn't see any major performance gain from it, and
it consumes a large part of the patch diff.

Rebased versions of the first two patch parts are attached.

regards,
Yura Sokolov

slru-ballast.sql
Description: application/sql

slru-call.sql
Description: application/sql

slru-func.sql
Description: application/sql

From 41ec9d1c54184c515d53ecc8021c4a998813f2a9 Mon Sep 17 00:00:00 2001
From: Andrey Borodin
Date: Sun, 11 Apr 2021 21:18:10 +0300
Subject: [PATCH v21 2/2] Divide SLRU buffers into 8-associative banks

We want to eliminate linear search within SLRU buffers. To do so we
divide SLRU buffers into banks. Each bank holds approximately 8
buffers. Each SLRU pageno may reside only in one bank. Adjacent pagenos
reside in different banks.
---
 src/backend/access/transam/slru.c | 43 ---
 src/include/access/slru.h         |  2 ++
 2 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/transam/slru.c b/src/backend/access/transam/slru.c
index b65cb49d7ff..abc534bbd06 100644
--- a/src/backend/access/transam/slru.c
+++ b/src/backend/access/transam/slru.c
@@ -134,7 +134,7 @@ typedef enum
 static SlruErrorCause slru_errcause;
 static int	slru_errno;

-
+static void SlruAdjustNSlots(int* nslots, int* banksize, int* bankoffset);
 static void SimpleLruZeroLSNs(SlruCtl ctl, int slotno);
 static void SimpleLruWaitIO(SlruCtl ctl, int slotno);
 static void SlruInternalWritePage(SlruCtl ctl, int slotno, SlruWriteAll fdata);
@@ -148,6 +148,30 @@ static bool SlruScanDirCbDeleteCutoff(SlruCtl ctl, char *filename, int segpage,
 			   void *data);
 static void SlruInternalDeleteSegment(SlruCtl ctl, int segno);

+/*
+ * Pick bank size optimal for N-associative SLRU buffers.
+ * We expect bank number to be picked from lowest bits of requested pageno.
+ * Thus we want number of banks to be power of 2. This routine computes number
+ * of banks aiming to make each bank of size 8. So we can pack page number and
+ * statuses of each bank on one cacheline.
+ */
+static void SlruAdjustNSlots(int* nslots, int* banksize, int* bankoffset)
+{
+	int nbanks = 1;
+	*banksize = *nslots;
+	*bankoffset = 0;
+	while (*banksize > 15)
+	{
+		if ((*banksize & 1) != 0)
+			*banksize += 1;
+		*banksize /= 2;
+		nbanks *= 2;
+		*bankoffset += 1;
+	}
+	elog(DEBUG5, "nslots %d banksize %d nbanks %d ", *nslots, *banksize, nbanks);
+	*nslots = *banksize * nbanks;
+}
+
 /*
  * Initialization of shared memory
  */
@@ -156,6 +180,8 @@ Size
 SimpleLruShmemSize(int nslots, int nlsns)
 {
 	Size		sz;
+	int			bankoffset, banksize;
+	SlruAdjustNSlots(&nslots, &banksize, &bankoffset);

 	/* we assume nslots isn't so large as to risk overflow */
 	sz = MAXALIGN(sizeof(SlruSharedData));
@@ -190,6 +216,8 @@ SimpleLruInit(
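(The lookup side of the patch is truncated above; conceptually, the
bank-limited search it enables looks like this sketch -- the field names
bank_size and bank_offset are illustrative stand-ins for whatever the patch
actually adds to SlruSharedData:

	/* low bits of pageno pick the bank; scan only that bank's slots */
	int			bankno = pageno & ((1 << shared->bank_offset) - 1);

	for (int slotno = bankno * shared->bank_size;
		 slotno < (bankno + 1) * shared->bank_size;
		 slotno++)
	{
		if (shared->page_number[slotno] == pageno &&
			shared->page_status[slotno] != SLRU_PAGE_EMPTY)
			return slotno;		/* cache hit */
	}
)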
Re: Improve description of XLOG_RUNNING_XACTS
Hi,

On Thu, Jul 21, 2022 at 6:44 AM Masahiko Sawada wrote:
>
> Hi,
>
> I realized that standby_desc_running_xacts() in standbydesc.c doesn't
> describe subtransaction XIDs. I've attached the patch to improve the
> description. Here is an example by pg_waldump:
>
> * HEAD
> rmgr: Standby len (rec/tot): 58/58, tx: 0, lsn: 0/01D0C608,
> prev 0/01D0C5D8, desc: RUNNING_XACTS nextXid 1050
> latestCompletedXid 1047 oldestRunningXid 1048; 1 xacts: 1048
>
> * w/ patch
> rmgr: Standby len (rec/tot): 58/58, tx: 0, lsn: 0/01D0C608,
> prev 0/01D0C5D8, desc: RUNNING_XACTS nextXid 1050
> latestCompletedXid 1047 oldestRunningXid 1048; 1 xacts: 1048; 1
> subxacts: 1049

I think this is a good addition to debugging info. +1

If we are going to add 64 subxid numbers then it would help if we could
be more verbose and print "subxid overflowed" instead of "subxid ovf".

--
Best Wishes,
Ashutosh Bapat
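(Concretely, that wording suggestion is a one-line change in
standby_desc_running_xacts(), where the current code appends "; subxid ovf"
-- a sketch:

	if (xlrec->subxid_overflow)
		appendStringInfoString(buf, "; subxid overflowed");
)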
Re: SLRUs in the main buffer pool, redux
Good day, Thomas.

On Fri, 27 May 2022 at 23:24 +1200, Thomas Munro wrote:
> Rebased, debugged and fleshed out a tiny bit more, but still with
> plenty of TODO notes and questions. I will talk about this idea at
> PGCon, so I figured it'd help to have a patch that actually applies,
> even if it doesn't work quite right yet. It's quite a large patch but
> that's partly because it removes a lot of lines...

Looks like it has to be rebased again.
Re: Remove fls(), use pg_bitutils.h facilities instead?
Thomas Munro writes:
> On Thu, Jul 21, 2022 at 1:34 AM Tom Lane wrote:
>> How is it sane to ask for a segment bin for zero pages? Seems like
>> something should have short-circuited such a case well before here.

> It's intended. There are two ways you can arrive here with n == 0:

OK.

>> We could have pg_bitutils.h #define pg_leftmost_one_pos_size_t
>> as the appropriate one of pg_leftmost_one_pos32/64, perhaps.

> Yeah.

Patches look good to me.

			regards, tom lane
Re: [PATCH v1] eliminate duplicate code in table.c
On Thu, Jul 21, 2022 at 6:12 PM Aleksander Alekseev wrote:
>
> Hi Alvaro,
>
> > Hmm, but see commit 2ed532ee8c47 about this kind of check. Perhaps we
> > should change these error messages to conform to the same message style.
>
> Good point! Done.

Yeah, that's better. On again thinking about the function name, I
wonder if validate_relation_type() suits here, as there is no generic
object being passed?

--
With Regards,
Amit Kapila.
Re: System catalog documentation chapter
On Wed, Jul 20, 2022 at 09:19:17PM -0400, Bruce Momjian wrote:
> On Wed, Jul 20, 2022 at 04:32:46PM -0400, Bruce Momjian wrote:
> > On Wed, Jul 20, 2022 at 04:23:21PM -0400, Isaac Morland wrote:
> > > Will there be a comprehensive list somewhere? Even if it just lists
> > > the views, gives maybe a one-sentence description, and links to the
> > > relevant chapter, I would find that helpful sometimes.
> >
> > I was not planning on that since we don't do that in any other cases I
> > can think of.
> >
> > > I ask because I occasionally find myself wanting a comprehensive list
> > > of functions, and as far as I can tell it doesn't exist. I'm hoping
> > > to avoid that situation for views.
> >
> > Well, then we just leave them where they are and link to them from
> > other parts of the documentation, which I assume/hope we already do.
>
> People have mentioned the view documentation doesn't belong in the
> Internals part. Maybe we can just move it to the Server
> Administration part and leave it together.

Thinking some more about this, I wonder if we should distinguish system
views that are needed for a task vs. those used for reporting. For
example, pg_stat_activity is a dynamic view and is needed for
monitoring. pg_prepared_statements just reports the prepared
statements.

Could it be that, over time, we have moved the "needed for a task"
views into the relevant sections, and the reporting views have just
stayed as a group, and that is okay --- maybe they just need to be
moved to Server Administration?

--
Bruce Momjian  https://momjian.us
EDB            https://enterprisedb.com

Indecision is a decision. Inaction is an action. Mark Batterson
Re: [PATCH v1] eliminate duplicate code in table.c
Hi Amit, > Yeah, that's better. On again thinking about the function name, I > wonder if validate_relation_type() suits here as there is no generic > object being passed? Yep, validate_relation_type() sounds better. -- Best regards, Aleksander Alekseev v5-0001-Eliminate-duplicate-code-in-table.c.patch Description: Binary data
Re: Make name optional in CREATE STATISTICS
On Wed, 20 Jul 2022 at 12:01, Matthias van de Meent wrote: > > On Wed, 13 Jul 2022 at 08:07, Simon Riggs > wrote: > > > > > + *CREATE STATISTICS [ [IF NOT EXISTS] stats_name ] > > I think this is ready for a committer, so I've marked it as such. > Picking this up... I tend to agree with Matthias' earlier point about avoiding code duplication in the grammar. Without going off and refactoring other parts of the grammar not related to this patch, it's still a slightly smaller, simpler change, and less code duplication, to do this using a new opt_stats_name production in the grammar, as in the attached. I also noticed a comment in CreateStatistics() that needed updating. Barring any further comments, I'll push this shortly. Regards, Dean From 8963355b2d8451be8f71a3bd2890e99e31f7d3ff Mon Sep 17 00:00:00 2001 From: Dean Rasheed Date: Thu, 21 Jul 2022 14:48:28 +0100 Subject: [PATCH] Make the name optional in CREATE STATISTICS. This allows users to omit the statistics name in a CREATE STATISTICS command, letting the system auto-generate a sensible, unique name, putting the statistics object in the same schema as the table. Simon Riggs, reviewed by Matthias van de Meent. Discussion: https://postgr.es/m/canbhv-fgd2d_c3zftft2arfx_tapsgoekes58rlzx5xzqp5...@mail.gmail.com --- doc/src/sgml/ref/create_statistics.sgml | 12 ++-- src/backend/commands/statscmds.c| 7 +- src/backend/parser/gram.y | 13 +++- src/test/regress/expected/stats_ext.out | 87 +++-- src/test/regress/sql/stats_ext.sql | 20 -- 5 files changed, 86 insertions(+), 53 deletions(-) diff --git a/doc/src/sgml/ref/create_statistics.sgml b/doc/src/sgml/ref/create_statistics.sgml index 9a8c904c08..b847944f37 100644 --- a/doc/src/sgml/ref/create_statistics.sgml +++ b/doc/src/sgml/ref/create_statistics.sgml @@ -21,11 +21,11 @@ PostgreSQL documentation -CREATE STATISTICS [ IF NOT EXISTS ] statistics_name +CREATE STATISTICS [ [ IF NOT EXISTS ] statistics_name ] ON ( expression ) FROM table_name -CREATE STATISTICS [ IF NOT EXISTS ] statistics_name +CREATE STATISTICS [ [ IF NOT EXISTS ] statistics_name ] [ ( statistics_kind [, ... ] ) ] ON { column_name | ( expression ) }, { column_name | ( expression ) } [, ...] FROM table_name @@ -60,8 +60,8 @@ CREATE STATISTICS [ IF NOT EXISTS ] statistics_na If a schema name is given (for example, CREATE STATISTICS myschema.mystat ...) then the statistics object is created in the specified schema. Otherwise it is created in the current schema. - The name of the statistics object must be distinct from the name of any - other statistics object in the same schema. + If given, the name of the statistics object must be distinct from the name + of any other statistics object in the same schema. @@ -78,6 +78,7 @@ CREATE STATISTICS [ IF NOT EXISTS ] statistics_na exists. A notice is issued in this case. Note that only the name of the statistics object is considered here, not the details of its definition. + Statistics name is required when IF NOT EXISTS is specified. @@ -88,6 +89,9 @@ CREATE STATISTICS [ IF NOT EXISTS ] statistics_na The name (optionally schema-qualified) of the statistics object to be created. + If the name is omitted, PostgreSQL chooses a + suitable name based on the parent table's name and the defined column + name(s) and/or expression(s). 
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c index cd5e2f2b6b..415016969d 100644 --- a/src/backend/commands/statscmds.c +++ b/src/backend/commands/statscmds.c @@ -155,10 +155,9 @@ CreateStatistics(CreateStatsStmt *stmt) /* * If the node has a name, split it up and determine creation namespace. - * If not (a possibility not considered by the grammar, but one which can - * occur via the "CREATE TABLE ... (LIKE)" command), then we put the - * object in the same namespace as the relation, and cons up a name for - * it. + * If not, put the object in the same namespace as the relation, and cons + * up a name for it. (This can happen either via "CREATE STATISTICS ..." + * or via "CREATE TABLE ... (LIKE)".) */ if (stmt->defnames) namespaceId = QualifiedNameGetCreationNamespace(stmt->defnames, diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index d649a1b8d1..0a874a04aa 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -434,7 +434,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); old_aggr_definition old_aggr_list oper_argtypes RuleActionList RuleActionMulti opt_column_list columnList opt_name_list -sort_clause opt_sort_clause sortby_list index_params stats_params +sort_clause opt_sort_clause sortby_list index_params +opt_stats_name stats_params opt_include opt_c_include index_includi
Re: [PATCH v1] eliminate duplicate code in table.c
Hi Amit, > Yep, validate_relation_type() sounds better. Or maybe validate_relation_kind() after all? -- Best regards, Aleksander Alekseev
postgres_fdw: Fix bug in checking of return value of PQsendQuery().
Hi, I found that fetch_more_data_begin() in postgres_fdw reports an error when PQsendQuery() returns a value less than 0, as follows, though PQsendQuery() can only return 1 or 0. I think this is a bug. Attached is the patch that fixes this bug. This needs to be back-ported to v14, where async execution was added to postgres_fdw. if (PQsendQuery(fsstate->conn, sql) < 0) pgfdw_report_error(ERROR, NULL, fsstate->conn, false, fsstate->query); Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
From 538f33a17f3623cec54768a7325d64f8e97abe06 Mon Sep 17 00:00:00 2001 From: Fujii Masao Date: Thu, 21 Jul 2022 22:52:50 +0900 Subject: [PATCH] postgres_fdw: Fix bug in checking of return value of PQsendQuery(). When postgres_fdw begins an asynchronous data fetch, it submits a FETCH query using PQsendQuery(). If PQsendQuery() fails and returns 0, postgres_fdw should report an error. But, previously, postgres_fdw reported an error only when the return value was less than 0, though PQsendQuery() never returns values other than 0 and 1. Therefore postgres_fdw could not handle a failure to send the FETCH query in an asynchronous data fetch. This commit fixes postgres_fdw so that it reports an error when PQsendQuery() returns 0. Back-patch to v14 where asynchronous execution was supported in postgres_fdw. Author: Fujii Masao Reviewed-by: Discussion: https://postgr.es/m/ --- contrib/postgres_fdw/postgres_fdw.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c index f3b93954ee..048db542d3 100644 --- a/contrib/postgres_fdw/postgres_fdw.c +++ b/contrib/postgres_fdw/postgres_fdw.c @@ -7070,7 +7070,7 @@ fetch_more_data_begin(AsyncRequest *areq) snprintf(sql, sizeof(sql), "FETCH %d FROM c%u", fsstate->fetch_size, fsstate->cursor_number); - if (PQsendQuery(fsstate->conn, sql) < 0) + if (!PQsendQuery(fsstate->conn, sql)) pgfdw_report_error(ERROR, NULL, fsstate->conn, false, fsstate->query); /* Remember that the request is in process */ -- 2.36.0
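As a standalone illustration of the contract being relied on here -- PQsendQuery() returns 1 if the query was dispatched successfully and 0 if not, never a negative value -- a minimal libpq client might check it like this (the connection string is a placeholder):

#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

int
main(void)
{
	PGconn	   *conn = PQconnectdb("dbname=postgres");
	PGresult   *res;

	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
		exit(1);
	}

	/* PQsendQuery() returns 1 on successful dispatch, 0 on failure */
	if (!PQsendQuery(conn, "SELECT 1"))
		fprintf(stderr, "could not send query: %s", PQerrorMessage(conn));

	/* drain the results of the asynchronously sent query */
	while ((res = PQgetResult(conn)) != NULL)
		PQclear(res);

	PQfinish(conn);
	return 0;
}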
Re: Custom tuplesorts for extensions
Hi, John! On Thu, Jul 21, 2022 at 6:44 AM John Naylor wrote: > On Tue, Jul 12, 2022 at 3:23 PM Alexander Korotkov > wrote: > > There are some places, which potentially could cause a slowdown. I'm > > going to make some experiments with that. > > I haven't looked at the patches, so I don't know of a specific place to look > for a slowdown, but I thought it might help to perform the same query tests > as my most recent test for evaluating qsort variants (some description in > [1]), and here is the spreadsheet. Overall, the differences look like noise. > A few cases with unabbreviatable text look a bit faster with the patch. I'm > not sure if that's a real difference, but in any case I don't see a slowdown > anywhere. > > [1] > https://www.postgresql.org/message-id/CAFBsxsHeTACMP1JVQ%2Bm35-v2NkmEqsJMHLhEfWk4sTB5aw_jkQ%40mail.gmail.com Great, thank you very much for the feedback! -- Regards, Alexander Korotkov
Re: postgres_fdw: Fix bug in checking of return value of PQsendQuery().
On Thu, 21 Jul 2022 at 22:22, Fujii Masao wrote: > Hi, > > I found that fetch_more_data_begin() in postgres_fdw reports an error when > PQsendQuery() returns a value less than 0, as follows, though PQsendQuery() > can only return 1 or 0. I think this is a bug. Attached is the patch that > fixes this bug. This needs to be back-ported to v14, where async execution was > added to postgres_fdw. > > if (PQsendQuery(fsstate->conn, sql) < 0) > pgfdw_report_error(ERROR, NULL, fsstate->conn, false, > fsstate->query); > > Regards, +1. However, I think checking whether the result equals 0 or 1 might be better. Anyway, the patch works correctly. $ grep 'PQsendQuery(' -rn . --include '*.c' ./contrib/postgres_fdw/postgres_fdw.c:7073: if (PQsendQuery(fsstate->conn, sql) < 0) ./contrib/postgres_fdw/connection.c:647:if (!PQsendQuery(conn, sql)) ./contrib/postgres_fdw/connection.c:782:if (!PQsendQuery(conn, query)) ./contrib/postgres_fdw/connection.c:1347: if (!PQsendQuery(conn, query)) ./contrib/postgres_fdw/connection.c:1575: if (PQsendQuery(entry->conn, "DEALLOCATE ALL")) ./contrib/dblink/dblink.c:720: retval = PQsendQuery(conn, sql); ./contrib/dblink/dblink.c:1146: if (!PQsendQuery(conn, sql)) ./src/test/isolation/isolationtester.c:669: if (!PQsendQuery(conn, step->sql)) ./src/test/modules/libpq_pipeline/libpq_pipeline.c:500: if (PQsendQuery(conn, "SELECT 1; SELECT 2") != 1) ./src/test/modules/libpq_pipeline/libpq_pipeline.c:532: if (PQsendQuery(conn, "SELECT 1.0/g FROM generate_series(3, -1, -1) g") != 1) ./src/test/modules/libpq_pipeline/libpq_pipeline.c:1000:if (PQsendQuery(conn, "SELECT 1") != 1) ./src/test/modules/libpq_pipeline/libpq_pipeline.c:1046:if (PQsendQuery(conn, "SELECT 1") != 1) ./src/test/modules/libpq_pipeline/libpq_pipeline.c:1084:if (PQsendQuery(conn, "SELECT 1") != 1) ./src/test/modules/libpq_pipeline/libpq_pipeline.c:1094:if (PQsendQuery(conn, "SELECT 2") != 1) ./src/test/modules/libpq_pipeline/libpq_pipeline.c:1118:if (PQsendQuery(conn, "SELECT 1") != 1) ./src/test/modules/libpq_pipeline/libpq_pipeline.c:1132:if (PQsendQuery(conn, "SELECT 2") != 1) ./src/test/modules/libpq_pipeline/libpq_pipeline.c:1159:if (PQsendQuery(conn, "SELECT pg_catalog.pg_advisory_unlock(1,1)") != 1) ./src/bin/pg_basebackup/pg_basebackup.c:1921: if (PQsendQuery(conn, basebkp) == 0) ./src/bin/pg_amcheck/pg_amcheck.c:891: if (PQsendQuery(slot->connection, sql) == 0) ./src/bin/psql/common.c:1451: success = PQsendQuery(pset.db, query); ./src/bin/scripts/reindexdb.c:551: status = PQsendQuery(conn, sql.data) == 1; ./src/bin/scripts/vacuumdb.c:947: status = PQsendQuery(conn, sql) == 1; ./src/bin/pgbench/pgbench.c:3089: r = PQsendQuery(st->con, sql); ./src/bin/pgbench/pgbench.c:4012: if (!PQsendQuery(st->con, "ROLLBACK")) ./src/backend/replication/libpqwalreceiver/libpqwalreceiver.c:663: if (!PQsendQuery(streamConn, query)) ./src/interfaces/libpq/fe-exec.c:1421:PQsendQuery(PGconn *conn, const char *query) ./src/interfaces/libpq/fe-exec.c:2319: if (!PQsendQuery(conn, query)) -- Regards, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size
On Wed, Jul 20, 2022 at 11:09 PM Amit Langote wrote: > On Wed, Jul 20, 2022 at 12:37 AM Andres Freund wrote: > > On 2022-07-19 20:40:11 +0900, Amit Langote wrote: > > > About that, I was wondering if the blocks in llvm_compile_expr() need > > > to be hand-coded to match what's added in ExecInterpExpr() or if I've > > > missed some tool that can be used instead? > > > > The easiest way is to just call an external function for the implementation > > of > > the step. But yes, otherwise you need to handcraft it. > > Ok, thanks. > > So I started updating llvm_compile_expr() for handling the new > ExprEvalSteps that the patch adds to ExecExprInterp(), but quickly > realized that code could have been consolidated into less code, or > IOW, into fewer new ExprEvalSteps. So, I refactored things that way > and am now retrying adding the code to llvm_compile_expr() based on > new, better consolidated, code. > > Here's the updated version, without the llvm pieces, in case you'd > like to look at it even in this state. I'll post a version with llvm > pieces filled in tomorrow. (I have merged the different patches into > one for convenience.) And here's a version with llvm pieces filled in. Because I wrote all of it while not really understanding how the LLVM constructs like blocks and branches work, the only reason I think those llvm_compile_expr() additions may be correct is that all the tests in jsonb_sqljson.sql pass even if I add the following line at the top: set jit_above_cost to 0; -- Thanks, Amit Langote EDB: http://www.enterprisedb.com v4-0001-Some-improvements-to-JsonExpr-evaluation.patch Description: Binary data
Re: [PATCH v1] eliminate duplicate code in table.c
yeah, IMHO validate_relation_kind() is better ;) On Thu, Jul 21, 2022 at 10:21 PM Aleksander Alekseev wrote: > > Hi Amit, > > > Yep, validate_relation_type() sounds better. > > Or maybe validate_relation_kind() after all? > > -- > Best regards, > Aleksander Alekseev -- Regards Junwang Zhao
Re: shared-memory based stats collector - v70
On Wed, 20 Jul 2022 at 15:09, Andres Freund wrote: > > Each backend only had stats for things it touched. But the stats collector > read all files at startup into hash tables and absorbed all generated stats > into those as well. Fascinating. I'm surprised this didn't raise issues previously for people with millions of tables. I wonder if it wasn't causing issues and we just didn't hear about them because there were other bigger issues :) > >All that said -- having all objects loaded in shared memory makes my > >work way easier. > > What are you trying to do? I'm trying to implement an exporter for prometheus/openmetrics/etc that dumps directly from shared memory without going through the SQL backend layer. I believe this will be much more reliable, lower overhead, safer, and more consistent than writing SQL queries. Ideally I would want to dump out the stats without connecting to each database. I suspect that would run into problems where the schema really adds a lot of information (such as which table each index is on or which table a toast relation is for). There are also some things people think of as stats that are maintained in the catalog, such as reltuples and relpages. So I'm imagining this won't strictly stay true in the end. It seems like just having an interface to iterate over the shared hash table and return entries one by one, without filtering by database, would be fairly straightforward, and I would be able to do most of what I want just with that. There's actually enough meta information in the stats entries to be able to handle them as they come instead of trying to look up specific stats one by one. -- greg
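To make that last idea concrete, here is a rough sketch of such an iteration, assuming the patch-series naming (pgStatLocal.shared_hash, PgStatShared_HashEntry, PgStat_HashKey are taken from the v70 patches and may still change) and the existing dshash sequential-scan API:

	dshash_seq_status hstat;
	PgStatShared_HashEntry *entry;

	dshash_seq_init(&hstat, pgStatLocal.shared_hash, false);
	while ((entry = dshash_seq_next(&hstat)) != NULL)
	{
		/* the key carries kind/dboid/objoid, enough to interpret the entry */
		PgStat_HashKey key = entry->key;

		/* hand (key, entry->body) to the exporter; no per-database connection needed */
	}
	dshash_seq_term(&hstat);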
Re: [PATCH v1] eliminate duplicate code in table.c
btw, there are some typos in Patch v5, %s/ralation/relation/g On Thu, Jul 21, 2022 at 10:05 PM Aleksander Alekseev wrote: > > Hi Amit, > > > Yeah, that's better. On again thinking about the function name, I > > wonder if validate_relation_type() suits here as there is no generic > > object being passed? > > Yep, validate_relation_type() sounds better. > > -- > Best regards, > Aleksander Alekseev -- Regards Junwang Zhao
Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size
On Thu, Jul 21, 2022 at 11:55 PM Amit Langote wrote: > On Wed, Jul 20, 2022 at 11:09 PM Amit Langote wrote: > > On Wed, Jul 20, 2022 at 12:37 AM Andres Freund wrote: > > > On 2022-07-19 20:40:11 +0900, Amit Langote wrote: > > > > About that, I was wondering if the blocks in llvm_compile_expr() need > > > > to be hand-coded to match what's added in ExecInterpExpr() or if I've > > > > missed some tool that can be used instead? > > > > > > The easiest way is to just call an external function for the implementation of > > > the step. But yes, otherwise you need to handcraft it. > > > > Ok, thanks. > > > > So I started updating llvm_compile_expr() for handling the new > > ExprEvalSteps that the patch adds to ExecExprInterp(), but quickly > > realized that code could have been consolidated into less code, or > > IOW, into fewer new ExprEvalSteps. So, I refactored things that way > > and am now retrying adding the code to llvm_compile_expr() based on > > new, better consolidated, code. > > > > Here's the updated version, without the llvm pieces, in case you'd > > like to look at it even in this state. I'll post a version with llvm > > pieces filled in tomorrow. (I have merged the different patches into > > one for convenience.) > > And here's a version with llvm pieces filled in. > > Because I wrote all of it while not really understanding how the LLVM > constructs like blocks and branches work, the only reason I think > those llvm_compile_expr() additions may be correct is that all the > tests in jsonb_sqljson.sql pass even if I add the following line at > the top: > > set jit_above_cost to 0; Oh and I did build --with-llvm. :-) -- Thanks, Amit Langote EDB: http://www.enterprisedb.com
Re: postgres_fdw: Fix bug in checking of return value of PQsendQuery().
Fujii Masao writes: > I found that fetch_more_data_begin() in postgres_fdw reports an error when > PQsendQuery() returns the value less than 0 as follows though PQsendQuery() > can return only 1 or 0. I think this is a bug. Attached is the patch that > fixes this bug. This needs to be back-ported to v14 where async execution was > supported in postgres_fdw. Yup, clearly a thinko. regards, tom lane
Re: [PATCH] Introduce array_shuffle() and array_sample()
On 21.07.22 at 14:25, Dean Rasheed wrote: I'm inclined to say that we want a new pg_global_prng_user_state that is updated by setseed(), and used by random(), array_shuffle(), array_sample(), and any other user-facing random functions we add later. I like the idea. How would you organize the code? I imagine some sort of user PRNG that is encapsulated in something like utils/adt/random.c and is guaranteed to always be seeded. Martin
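One way that encapsulation could look -- a rough sketch only, with hypothetical names (user_prng_state, ensure_user_prng_seeded) layered on the existing common/pg_prng.h primitives; the time-based fallback seed is purely illustrative:

#include "common/pg_prng.h"
#include "utils/timestamp.h"

static pg_prng_state user_prng_state;
static bool user_prng_seeded = false;

static void
ensure_user_prng_seeded(void)
{
	if (!user_prng_seeded)
	{
		/* prefer strong entropy; fall back to a time-based seed */
		if (!pg_prng_strong_seed(&user_prng_state))
			pg_prng_seed(&user_prng_state, (uint64) GetCurrentTimestamp());
		user_prng_seeded = true;
	}
}

setseed() would overwrite user_prng_state and set the flag, while random(), array_shuffle(), array_sample(), etc. would call ensure_user_prng_seeded() before drawing from it.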
Re: [PATCH v1] eliminate duplicate code in table.c
Hi Junwang, > btw, there are some typos in Patch v5, %s/ralation/relation/g D'oh! > yeah, IMHO validate_relation_kind() is better ;) Cool. Here is the corrected patch. Thanks! -- Best regards, Aleksander Alekseev v6-0001-Eliminate-duplicate-code-in-table.c.patch Description: Binary data
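Since the v6 patch is attached as binary data, here is a sketch of roughly what the consolidated helper could look like (the exact body is an assumption, not the patch contents), following the message style of commit 2ed532ee8c47 mentioned upthread:

static void
validate_relation_kind(Relation rel)
{
	/* table.c refuses to open indexes and composite types */
	if (rel->rd_rel->relkind == RELKIND_INDEX ||
		rel->rd_rel->relkind == RELKIND_PARTITIONED_INDEX ||
		rel->rd_rel->relkind == RELKIND_COMPOSITE_TYPE)
		ereport(ERROR,
				(errcode(ERRCODE_WRONG_OBJECT_TYPE),
				 errmsg("cannot open relation \"%s\"",
						RelationGetRelationName(rel)),
				 errdetail_relkind_not_supported(rel->rd_rel->relkind)));
}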
Re: [PATCH v1] eliminate duplicate code in table.c
LGTM On Thu, Jul 21, 2022 at 11:52 PM Aleksander Alekseev wrote: > > Hi Junwang, > > > btw, there are some typos in Patch v5, %s/ralation/relation/g > > D'oh! > > > yeah, IMHO validate_relation_kind() is better ;) > > Cool. Here is the corrected patch. Thanks! > > -- > Best regards, > Aleksander Alekseev -- Regards Junwang Zhao
Re: Make name optional in CREATE STATISTICS
On Thu, 21 Jul 2022 at 15:12, Dean Rasheed wrote: > > On Wed, 20 Jul 2022 at 12:01, Matthias van de Meent > wrote: > > > > On Wed, 13 Jul 2022 at 08:07, Simon Riggs > > wrote: > > > > > > > + *CREATE STATISTICS [ [IF NOT EXISTS] stats_name ] > > > > I think this is ready for a committer, so I've marked it as such. > > > > Picking this up... > > I tend to agree with Matthias' earlier point about avoiding code > duplication in the grammar. Without going off and refactoring other > parts of the grammar not related to this patch, it's still a slightly > smaller, simpler change, and less code duplication, to do this using a > new opt_stats_name production in the grammar, as in the attached. > > I also noticed a comment in CreateStatistics() that needed updating. > > Barring any further comments, I'll push this shortly. Nice change, please proceed. -- Simon Riggshttp://www.EnterpriseDB.com/
let's disallow ALTER ROLE bootstrap_superuser NOSUPERUSER
Hi, Currently, it's possible to remove the rolissuper bit from the bootstrap superuser, but this leaves that user - and the system in general - in an odd state. The bootstrap user continues to own all of the objects it owned before, e.g. all of the system catalogs. Direct DML on system catalogs is blocked by pg_class_aclmask_ext(), but it's possible to do things like rename a system function out of the way and create a new function with the same signature. Therefore, creating a new superuser and making the original one a non-superuser is probably not viable from a security perspective, because anyone who gained access to that role would likely have little difficulty mounting a Trojan horse attack against the current superusers. There are other problems, too. (1) pg_parameter_acl entries are considered to be owned by the bootstrap superuser, so while the bootstrap user loses the ability to directly ALTER SYSTEM SET archive_command, they can still grant that ability to some other user (possibly one they've just created, if they still have CREATEROLE) which pretty much gives the whole show away. (2) When a trusted extension is created, the extension objects are documented as ending up owned by the bootstrap superuser, and the bootstrap user will end up owning them even if they are no longer super. (3) Range constructors end up getting owned by the bootstrap user, too. I haven't really tried to verify whether ownership of trusted extension objects or range constructors would allow the bootstrap not-a-superuser to escalate back to superuser, but it seems fairly likely. I believe these object ownership assignments were made with the idea that the bootstrap user would always be a superuser. pg_upgrade refers to the "install user" rather than the bootstrap superuser, but it's talking about the same thing. If you've made the bootstrap user non-super, pg_upgrade will fail. It is only able to connect as the bootstrap user, and it must connect as superuser or it can't do the things it needs to do. All in all, it seems to me that various parts of the system are built around the assumption that you will not try to execute ALTER ROLE bootstrap_superuser NOSUPERUSER. I suggest that we formally prohibit that, as per the attached patch. Otherwise, I suppose we need to prevent privilege escalation attacks from a bootstrap ex-superuser, which seems fairly impractical and a poor use of engineering resources. Or I suppose we could continue with the present state of affairs where our code and documentation assume you won't do that but nothing actually stops you from doing it, but that doesn't seem to have much to recommend it. -- Robert Haas EDB: http://www.enterprisedb.com v1-0001-Do-not-allow-removal-of-superuser-privileges-from.patch Description: Binary data
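The attached patch isn't quoted inline, but the guard it adds is presumably of this shape (a sketch only; should_be_super and roleid stand in for whatever local variables the real patch uses in AlterRole()):

	if (!should_be_super && roleid == BOOTSTRAP_SUPERUSERID)
		ereport(ERROR,
				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
				 errmsg("permission denied: bootstrap user must be superuser")));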
Re: Make name optional in CREATE STATISTICS
On 7/21/22 16:12, Dean Rasheed wrote: > On Wed, 20 Jul 2022 at 12:01, Matthias van de Meent > wrote: >> >> On Wed, 13 Jul 2022 at 08:07, Simon Riggs >> wrote: >>> + *CREATE STATISTICS [ [IF NOT EXISTS] stats_name ] >> >> I think this is ready for a committer, so I've marked it as such. >> > > Picking this up... > > I tend to agree with Matthias' earlier point about avoiding code > duplication in the grammar. Without going off and refactoring other > parts of the grammar not related to this patch, it's still a slightly > smaller, simpler change, and less code duplication, to do this using a > new opt_stats_name production in the grammar, as in the attached. > > I also noticed a comment in CreateStatistics() that needed updating. > > Barring any further comments, I'll push this shortly. > +1 -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: Add connection active, idle time to pg_stat_activity
Hello, I have addressed the reviews. @Aleksander Alekseev thanks for reporting the issue. I have altered the patch to respect the behavior of pg_stat_activity, specifically [1] > Another important point is that when a server process is asked to display any > of these statistics, > it first fetches the most recent report emitted by the collector process and > then continues to use this snapshot > for all statistical views and functions until the end of its current > transaction. > So the statistics will show static information as long as you continue the > current transaction. For the patch it means no computing of real-time values of total_*_time. Here is an example to illustrate the new behavior: =# begin; =*# select total_active_time, total_idle_in_transaction_time from pg_stat_activity where pid = pg_backend_pid(); total_active_time | total_idle_in_transaction_time ---+ 0.124 | 10505.098 postgres=*# select pg_sleep(10); postgres=*# select total_active_time, total_idle_in_transaction_time from pg_stat_activity where pid = pg_backend_pid(); total_active_time | total_idle_in_transaction_time ---+ 0.124 | 10505.098 postgres=*# commit; postgres=# select total_active_time, total_idle_in_transaction_time from pg_stat_activity where pid = pg_backend_pid(); total_active_time | total_idle_in_transaction_time ---+ 10015.796 | 29322.831 [1] https://www.postgresql.org/docs/14/monitoring-stats.html#MONITORING-STATS-VIEWS Regards, Sergey From b5298301a3f5223bd78c519ddcddbd1bec9cf000 Mon Sep 17 00:00:00 2001 From: Sergey Dudoladov Date: Wed, 20 Apr 2022 23:47:37 +0200 Subject: [PATCH] pg_stat_activity: add 'total_active_time' and 'total_idle_in_transaction_time' catversion bump because of the change in the contents of pg_stat_activity Author: Sergey Dudoladov, based on the initial version by Rafia Sabih Reviewed-by: Aleksander Alekseev, Bertrand Drouvot, and Atsushi Torikoshi Discussion: https://www.postgresql.org/message-id/flat/CA%2BFpmFcJF0vwi-SWW0wYO-c-FbhyawLq4tCpRDCJJ8Bq%3Dja-gA%40mail.gmail.com --- doc/src/sgml/monitoring.sgml| 20 + src/backend/catalog/system_views.sql| 4 ++- src/backend/utils/activity/backend_status.c | 33 - src/backend/utils/adt/pgstatfuncs.c | 8 - src/include/catalog/pg_proc.dat | 6 ++-- src/include/pgstat.h| 10 --- src/include/utils/backend_status.h | 10 +++ src/test/regress/expected/rules.out | 12 8 files changed, 75 insertions(+), 28 deletions(-) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 7dbbab6f5c..943927fe34 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -979,6 +979,26 @@ postgres 27093 0.0 0.0 30096 2752 ?Ss 11:34 0:00 postgres: ser additional types. + + + + total_active_time double precision + + + Time in milliseconds this backend spent in active and + fastpath function call states. + + + + + + total_idle_in_transaction_time double precision + + + Time in milliseconds this backend spent in idle in transaction + and idle in transaction (aborted) states. 
+ + diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index f369b1fc14..2ec6ea2304 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -864,7 +864,9 @@ CREATE VIEW pg_stat_activity AS s.backend_xmin, S.query_id, S.query, -S.backend_type +S.backend_type, +S.total_active_time, +S.total_idle_in_transaction_time FROM pg_stat_get_activity(NULL) AS S LEFT JOIN pg_database AS D ON (S.datid = D.oid) LEFT JOIN pg_authid AS U ON (S.usesysid = U.oid); diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c index c7ed1e6d7a..8fe2929fba 100644 --- a/src/backend/utils/activity/backend_status.c +++ b/src/backend/utils/activity/backend_status.c @@ -336,6 +336,8 @@ pgstat_bestart(void) lbeentry.st_activity_start_timestamp = 0; lbeentry.st_state_start_timestamp = 0; lbeentry.st_xact_start_timestamp = 0; + lbeentry.st_total_active_time = 0; + lbeentry.st_total_transaction_idle_time = 0; lbeentry.st_databaseid = MyDatabaseId; /* We have userid for client-backends, wal-sender and bgworker processes */ @@ -524,6 +526,8 @@ pgstat_report_activity(BackendState state, const char *cmd_str) TimestampTz start_timestamp; TimestampTz current_timestamp; int len = 0; + int64 a
Re: let's disallow ALTER ROLE bootstrap_superuser NOSUPERUSER
Robert Haas writes: > Currently, it's possible to remove the rolissuper bit from the > bootstrap superuser, but this leaves that user - and the system in > general - in an odd state. The bootstrap user continues to own all of > the objects it owned before, e.g. all of the system catalogs. Direct > DML on system catalogs is blocked by pg_class_aclmask_ext(), but it's > possible to do things like rename a system function out of the way and > create a new function with the same signature. Therefore, creating a > new superuser and making the original one a non-superuser is probably > not viable from a security perspective, because anyone who gained > access to that role would likely have little difficulty mounting a > Trojan horse attack against the current superusers. True, but what if the idea is to have *no* superusers? I seem to recall people being interested in setups like that. On the whole I don't have any objection to your proposal, I just worry that somebody else will. Of course there's always "UPDATE pg_authid SET rolsuper = false", which makes it absolutely clear that you're breaking the glass cover. regards, tom lane
Re: let's disallow ALTER ROLE bootstrap_superuser NOSUPERUSER
On Thu, Jul 21, 2022 at 9:28 AM Tom Lane wrote: > Robert Haas writes: > > Currently, it's possible to remove the rolissuper bit from the > > bootstrap superuser, but this leaves that user - and the system in > > general - in an odd state. The bootstrap user continues to own all of > > the objects it owned before, e.g. all of the system catalogs. Direct > > DML on system catalogs is blocked by pg_class_aclmask_ext(), but it's > > possible to do things like rename a system function out of the way and > > create a new function with the same signature. Therefore, creating a > > new superuser and making the original one a non-superuser is probably > > not viable from a security perspective, because anyone who gained > > access to that role would likely have little difficulty mounting a > > Trojan horse attack against the current superusers. > > True, but what if the idea is to have *no* superusers? I seem > to recall people being interested in setups like that. > > On the whole I don't have any objection to your proposal, I just > worry that somebody else will. > > Of course there's always "UPDATE pg_authid SET rolsuper = false", > which makes it absolutely clear that you're breaking the glass cover. > > I would expect an initdb option (once this is possible) to specify this desire and we just never set one up in the first place. It seems impractical to remove one after it already exists. Though we could enable the option (or a function) tied to the specific predefined role that, say, permits catalog changes, when that day comes. David J.
Re: let's disallow ALTER ROLE bootstrap_superuser NOSUPERUSER
"David G. Johnston" writes: > On Thu, Jul 21, 2022 at 9:28 AM Tom Lane wrote: >> True, but what if the idea is to have *no* superusers? I seem >> to recall people being interested in setups like that. > I would expect an initdb option (once this is possible) to specify this > desire and we just never set one up in the first place. It seems > impractical to remove one after it already exists. There has to be a role that owns the built-in objects. Robert's point is that pretending that that role isn't high-privilege is silly. regards, tom lane
Re: let's disallow ALTER ROLE bootstrap_superuser NOSUPERUSER
On Thu, Jul 21, 2022 at 12:28 PM Tom Lane wrote: > True, but what if the idea is to have *no* superusers? I seem > to recall people being interested in setups like that. Hmm, right. There's nothing that stops you from de-super-ing all of your superusers today, and then if you ever need to do anything as superuser again, you have to start up in single-user mode, which will treat your session as super regardless. But considering how much power the bootstrap user still has, I'm not sure that's really buying you very much. In particular, the new GRANT ALTER SYSTEM stuff looks sufficient to allow the bootstrap user to break out to the OS, so if we want to regard no-superusers as a supported configuration, we probably need to tighten that up. I think it's kind of hopeless, though, because of the fact that you can also freely Trojan functions and operators in pg_catalog. Maybe that's insufficient to break out to the OS or assume superuser privileges, but you should be able to at least Trojan every other user on the system. > On the whole I don't have any objection to your proposal, I just > worry that somebody else will. OK, good to know. Thanks. > Of course there's always "UPDATE pg_authid SET rolsuper = false", > which makes it absolutely clear that you're breaking the glass cover. Right. -- Robert Haas EDB: http://www.enterprisedb.com
Re: let's disallow ALTER ROLE bootstrap_superuser NOSUPERUSER
Robert Haas writes: > ... if > we want to regard no-superusers as a supported configuration, we > probably need to tighten that up. I think it's kind of hopeless, Yeah, I agree. At least, I'm uninterested in spending any of my own time trying to make that usefully-more-secure than it is today. If somebody else is interested enough to do the legwork, we can look at what they come up with. regards, tom lane
Re: PG 15 (and to a smaller degree 14) regression due to ExprEvalStep size
On 2022-Jul-21, Amit Langote wrote: > Because I wrote all of it while not really understanding how the LLVM > constructs like blocks and branches work, the only reason I think > those llvm_compile_expr() additions may be correct is that all the > tests in jsonb_sqljson.sql pass even if I add the following line at > the top: I suggest building with --enable-coverage, then running the regression tests and doing "make coverage-html" to see if your code appears covered in the report. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "How can you trust something that you pay for and don't see, yet not trust something that they give you and show you?" (Germán Poo)
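For anyone following along, the documented coverage workflow is roughly this (configure options abbreviated):

$ ./configure --enable-coverage ... OTHER OPTIONS ...
$ make
$ make check            # or just the tests of interest, e.g. jsonb_sqljson
$ make coverage-html

The HTML report ends up in a "coverage" directory; un-hit lines in the new llvm_compile_expr() blocks would show up there directly.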
Re: let's disallow ALTER ROLE bootstrap_superuser NOSUPERUSER
On Thu, Jul 21, 2022 at 01:02:50PM -0400, Tom Lane wrote: > Robert Haas writes: >> ... if >> we want to regard no-superusers as a supported configuration, we >> probably need to tighten that up. I think it's kind of hopeless, > > Yeah, I agree. At least, I'm uninterested in spending any of my > own time trying to make that usefully-more-secure than it is today. > If somebody else is interested enough to do the legwork, we can > look at what they come up with. Given the current assumptions the code makes about the bootstrap superuser, I think it makes sense to disallow removing its superuser attribute (at least via ALTER ROLE NOSUPERUSER). It seems like there is much work to do before a no-superuser configuration could be formally supported. If/when such support materializes, it might be possible to remove the restriction that Robert is proposing. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: [PATCH] Introduce array_shuffle() and array_sample()
On 21.07.22 at 10:41, Dean Rasheed wrote: It's important to mark these new functions as VOLATILE, not IMMUTABLE, otherwise they won't work as expected in queries. See https://www.postgresql.org/docs/current/xfunc-volatility.html It would be better to use pg_prng_uint64_range() rather than rand() to pick elements. Partly, that's because it uses a higher quality PRNG, with a larger internal state, and it ensures that the results are unbiased across the range. But more importantly, it interoperates with setseed(), allowing predictable sequences of "random" numbers to be generated -- something that's useful in writing repeatable regression tests. Assuming these new functions are made to interoperate with setseed(), which I think they should be, then they also need to be marked as PARALLEL RESTRICTED, rather than PARALLEL SAFE. See https://www.postgresql.org/docs/current/parallel-safety.html, which explains why setseed() and random() are parallel restricted. Here is an updated patch that marks the functions VOLATILE PARALLEL RESTRICTED and uses pg_prng_uint64_range() rather than rand().
From 26676802f05d00c31e0b2d5997f61734aa421fca Mon Sep 17 00:00:00 2001 From: Martin Kalcher Date: Sun, 17 Jul 2022 18:06:04 +0200 Subject: [PATCH] Introduce array_shuffle() and array_sample() * array_shuffle() shuffles the elements of an array. * array_sample() chooses n elements from an array at random. Shuffling of arrays can already be accomplished with SQL using unnest() and array_agg(order by random()). But that is very slow compared to the new functions. In addition the new functions are dimension aware. --- doc/src/sgml/func.sgml | 35 + src/backend/utils/adt/arrayfuncs.c | 189 ++- src/include/catalog/pg_proc.dat | 6 + src/test/regress/expected/arrays.out | 60 + src/test/regress/sql/arrays.sql | 17 +++ 5 files changed, 306 insertions(+), 1 deletion(-) diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index b6783b7ad0..6b96897244 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -19395,6 +19395,41 @@ SELECT NULLIF(value, '(none)') ... + + + + array_sample + +array_sample ( array anyarray, n integer ) +anyarray + + +Returns n randomly chosen elements from array. +The order of the elements in the resulting array is unspecified. + + +array_sample(ARRAY[[1,2],[3,4],[5,6]], 2) +{{5,6},{3,4}} + + + + + + + array_shuffle + +array_shuffle ( anyarray ) +anyarray + + +Shuffles the first dimension of the array. + + +array_shuffle(ARRAY[[1,2],[3,4],[5,6]]) +{{5,6},{3,4},{1,2}} + + + diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c index fb167f226a..64da980348 100644 --- a/src/backend/utils/adt/arrayfuncs.c +++ b/src/backend/utils/adt/arrayfuncs.c @@ -34,7 +34,7 @@ #include "utils/memutils.h" #include "utils/selfuncs.h" #include "utils/typcache.h" - +#include "common/pg_prng.h" /* * GUC parameter @@ -6872,3 +6872,190 @@ trim_array(PG_FUNCTION_ARGS) PG_RETURN_DATUM(result); } + +/* + * Produce array with max n random items from given array in random order. + * + * array: array object to examine (must not be NULL) + * n: max number of items + * elmtyp, elmlen, elmbyval, elmalign: info for the datatype of the items + * + * NOTE: it would be cleaner to look up the elmlen/elmbyval/elmalign info + * from the system catalogs, given the elmtype. However, the caller is + * in a better position to cache this info across multiple uses, or even + * to hard-wire values if the element type is hard-wired. 
+ */ +static ArrayType * +array_shuffle_n(ArrayType *array, int n, +Oid elmtyp, int elmlen, bool elmbyval, char elmalign) +{ + ArrayType *result; + int ndim, + *dims, + *lbs, +rdims[MAXDIM], +nelm, +nitem, +i, +j, +k; + Datum elm, + *elms, + *relms; + bool nul, + *nuls, + *rnuls; + + ndim = ARR_NDIM(array); + dims = ARR_DIMS(array); + lbs = ARR_LBOUND(array); + + if (ndim < 1 || dims[0] < 1 || n < 1) + return construct_empty_array(elmtyp); + + deconstruct_array(array, elmtyp, elmlen, elmbyval, elmalign, + &relms, &rnuls, &nelm); + + /* Calculate number of elements per item. */ + nelm = (ndim > 1) ? ArrayGetNItems(ndim - 1, dims + 1) : 1; + elms = relms; + nuls = rnuls; + nitem = dims[0]; + n = Min(n, nitem); + + /* + * Shuffle array using Fisher-Yates algorithm. Swap head with an randomly + * chosen item and increment head. + */ + for (i = 0; i < n; i++) + { + k = (int) pg_prng_uint64_range(&pg_global_prng_state, 0, nitem - i - 1) * nelm; + + /* Swap all elements in head with elements in item k. */ + for (j = 0; j < nelm; j++) + { +
Re: Make name optional in CREATE STATISTICS
On 2022-Jul-21, Dean Rasheed wrote: > I tend to agree with Matthias' earlier point about avoiding code > duplication in the grammar. Without going off and refactoring other > parts of the grammar not related to this patch, it's still a slightly > smaller, simpler change, and less code duplication, to do this using a > new opt_stats_name production in the grammar, as in the attached. > > I also noticed a comment in CreateStatistics() that needed updating. > > Barring any further comments, I'll push this shortly. Thanks. I was looking at the recently modified REINDEX syntax and noticed there another spot for taking an optional name. I ended up reusing OptSchemaName for that, as in the attached patch. I think adding production-specific additional productions is pointless and probably bloats the grammar. So let me +1 your push of the patch you posted, just to keep things moving forward, but in addition I propose to later rename OptSchemaName to something more generic and use it in these three places. -- Álvaro HerreraBreisgau, Deutschland — https://www.EnterpriseDB.com/ >From 143467419c19aea6a79db46da461aae4d9afabac Mon Sep 17 00:00:00 2001 From: Alvaro Herrera Date: Thu, 21 Jul 2022 16:48:51 +0200 Subject: [PATCH] Rework grammar for REINDEX --- src/backend/parser/gram.y | 80 +++--- src/test/regress/expected/create_index.out | 4 +- 2 files changed, 42 insertions(+), 42 deletions(-) diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y index d649a1b8d1..85ab17ca5a 100644 --- a/src/backend/parser/gram.y +++ b/src/backend/parser/gram.y @@ -560,7 +560,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query); %type generic_option_elem alter_generic_option_elem %type generic_option_list alter_generic_option_list -%type reindex_target_type reindex_target_multitable reindex_name_optional +%type reindex_target_type reindex_target_type_multi %type copy_generic_opt_arg copy_generic_opt_arg_list_item %type copy_generic_opt_elem @@ -9085,7 +9085,9 @@ DropTransformStmt: DROP TRANSFORM opt_if_exists FOR Typename LANGUAGE name opt_d * * QUERY: * - * REINDEX [ (options) ] type [CONCURRENTLY] + * REINDEX [ (options) ] {TABLE | INDEX} [CONCURRENTLY] + * REINDEX [ (options) ] SCHEMA [CONCURRENTLY] + * REINDEX [ (options) ] {SYSTEM | DATABASE} [] */ ReindexStmt: @@ -9102,37 +9104,6 @@ ReindexStmt: makeDefElem("concurrently", NULL, @3)); $$ = (Node *) n; } - | REINDEX reindex_target_multitable opt_concurrently name -{ - ReindexStmt *n = makeNode(ReindexStmt); - - n->kind = $2; - n->name = $4; - n->relation = NULL; - n->params = NIL; - if ($3) - n->params = lappend(n->params, - makeDefElem("concurrently", NULL, @3)); - $$ = (Node *) n; -} - | REINDEX reindex_name_optional -{ - ReindexStmt *n = makeNode(ReindexStmt); - n->kind = $2; - n->name = NULL; - n->relation = NULL; - n->params = NIL; - $$ = (Node *)n; -} - | REINDEX '(' utility_option_list ')' reindex_name_optional -{ - ReindexStmt *n = makeNode(ReindexStmt); - n->kind = $5; - n->name = NULL; - n->relation = NULL; - n->params = $3; - $$ = (Node *)n; -} | REINDEX '(' utility_option_list ')' reindex_target_type opt_concurrently qualified_name { ReindexStmt *n = makeNode(ReindexStmt); @@ -9146,11 +9117,25 @@ ReindexStmt: makeDefElem("concurrently", NULL, @6)); $$ = (Node *) n; } - | REINDEX '(' utility_option_list ')' reindex_target_multitable opt_concurrently name + + | REINDEX SCHEMA opt_concurrently name { ReindexStmt *n = makeNode(ReindexStmt); - n->kind = $5; + n->kind = REINDEX_OBJECT_SCHEMA; + n->name = $4; + 
n->relation = NULL; + n->params = NIL; + if ($3) + n->params = lappend(n->params, + makeDefElem("concurrently", NULL, @3)); + $$ = (Node *) n; +} + | REINDEX '(' utility_option_list ')' SCHEMA opt_concurrently name +{ + ReindexStmt *n = makeNode(ReindexStmt); + + n->kind = REINDEX_OBJECT_SCHEMA; n->name = $7; n->relation = NULL; n->params = $3; @@ -9159,18 +9144,31 @@ ReindexStmt: makeDefElem("concurrently", NULL, @6)); $$ = (Node *) n; } + | REINDEX reindex_target_type_multi OptSchemaName +{ + ReindexStmt *n = makeNode(ReindexStmt); + n->kind = $2; + n->name = NULL; + n->relation = NULL; + n->params = NIL; + $$ = (Node *)n; +} + | REINDEX '(' utility_option_list ')' reindex_target_type_multi OptSchemaName +{ + ReindexStmt *n = makeNode(ReindexStmt); + n->kind = $5; + n->name = $6; + n->relation = NULL; + n->params = $3; + $$ = (Node *)n; +} ; reindex_target_type: INDEX { $$ = REINDEX_OBJ
Re: First draft of the PG 15 release notes
On Tue, Jul 12, 2022 at 02:47:07PM -0400, Bruce Momjian wrote: > On Mon, Jul 11, 2022 at 11:31:32PM -0700, Noah Misch wrote: > > On Mon, Jul 11, 2022 at 12:39:57PM -0400, Bruce Momjian wrote: > > > I had trouble reading the sentences in the order you used so I > > > restructured it: > > > > > > The new default is one of the secure schema usage patterns that > > linkend="ddl-schemas-patterns"/> has recommended since the security > > > release for CVE-2018-1058. The change applies to newly-created > > > databases in existing clusters and for new clusters. Upgrading a > > > cluster or restoring a database dump will preserve existing permissions. > > > > I agree with the sentence order change. > > Great. > > > > For existing databases, especially those having multiple users, consider > > > issuing REVOKE to adopt this new default. For new > > > databases having zero need to defend against insider threats, granting > > > USAGE permission on their public > > > schemas will yield the behavior of prior releases. > > > > s/USAGE/CREATE/ in the last sentence. Looks good with that change. > > Ah, yes, of course. Patch applied, I also adjusted the second paragraph to be more symmetric. You can see the results here: https://momjian.us/pgsql_docs/release-15.html -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: Allow placeholders in ALTER ROLE w/o superuser
On Wed, Jul 20, 2022 at 11:50:10AM -0400, Tom Lane wrote: > I think that 13d838815 has completely changed the terms that this > discussion needs to be conducted under. It seems clear to me now > that if you want to relax this only-superusers restriction, what > you have to do is store the OID of the role that issued ALTER ROLE/DB SET, > and then apply the same checks that would be used in the ordinary case > where a placeholder is being filled in after being set intra-session. > That is, we'd no longer assume that a value coming from pg_db_role_setting > was set with superuser privileges, but we'd know exactly who did set it. I was imagining that the permissions checks would apply at ALTER ROLE/DB SET time, not at login time. Otherwise, changing a role's privileges might impact other roles' parameters, and it's not clear (at least to me) what should happen when the role is dropped. Another reason I imagined it this way is because that's basically how it works today. We assume that the pg_db_role_setting entry was added by a superuser, but we don't check that the user that ran ALTER ROLE/DB SET is still superuser every time you log in. > This might also tie into Nathan's question in another thread about > exactly what permissions should be required to issue ALTER ROLE/DB SET. > In particular I'm wondering if different permissions should be needed to > override an existing entry than if there is no existing entry. If not, > we could find ourselves downgrading a superuser-set entry to a > non-superuser-set entry, which might have bad consequences later > (eg, by rendering the entry nonfunctional because when we actually > load the extension we find out the GUC is SUSET). Yeah, this is why I suggested storing something that equates to PGC_SUSET any time a role is superuser or has grantable GUC permissions. > Possibly related to this: I felt while working on 13d838815 that > PGC_SUSET and PGC_SU_BACKEND should be usable as GucContext > values for GUC variables, indicating that the GUC requires special > privileges to be set, but we should no longer use them as passed-in > GucContext values. That is, we should remove privilege tests from > the call sites, like this: > > (void) set_config_option(stmt->name, > ExtractSetVariableArgs(stmt), > -(superuser() ? PGC_SUSET : PGC_USERSET), > +PGC_USERSET, > PGC_S_SESSION, > action, true, 0, false); > > and instead put that behavior inside set_config_option_ext, which > would want to look at superuser_arg(srole) instead, and indeed might > not need to do anything because pg_parameter_aclcheck would subsume > the test. I didn't pursue this further because it wasn't essential > to fixing the bug. But it seems relevant here, because that line of > thought leads to the conclusion that storing PGC_SUSET vs PGC_USERSET > is entirely the wrong approach. Couldn't ProcessGUCArray() use set_config_option_ext() with the context indicated by pg_db_role_setting? Also, instead of using PGC_USERSET in all the set_config_option() call sites, shouldn't we remove the "context" argument altogether? I am likely misunderstanding your proposal, but while I think simplifying set_config_option() is worthwhile, I don't see why it would preclude storing the context in pg_db_role_setting. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Fix gcc warning in sync.c (usr/src/backend/storage/sync/sync.c)
On 2022-Jul-15, Thomas Munro wrote: > On Mon, Jul 11, 2022 at 9:45 PM Kyotaro Horiguchi > wrote: > > So, for starters, I compiled the whole tree with -Wshadow=local. and I > > saw many warnings with it. At a glance all of them are reasonably > > "fixed" but I don't think it is what we want... > > Wow, yeah. Previous threads on this topic: https://postgr.es/m/mn2pr18mb2927f7b5f690065e1194b258e3...@mn2pr18mb2927.namprd18.prod.outlook.com https://postgr.es/m/CAApHDvpqBR7u9yzW4yggjG=QfN=fzsc8wo2ckokpqtif-+i...@mail.gmail.com https://postgr.es/m/877k1psmpf@mailbox.samurai.com -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ Thou shalt study thy libraries and strive not to reinvent them without cause, that thy code may be short and readable and thy days pleasant and productive. (7th Commandment for C Programmers)
Re: Transparent column encryption
On Mon, Jul 18, 2022 at 3:53 AM Peter Eisentraut wrote: > Some other products make use of secure enclaves to do computations on > (otherwise) encrypted values on the server. I don't fully know how that > works, but I suspect that asymmetric keys can play a role in that. (I > don't have any immediate plans for that in my patch. It seems to be a > dying technology at the moment.) > > Asymmetric keys gives you some more options for how you set up the keys > at the beginning. For example, you create the asymmetric key pair on > the host where your client program that wants access to the encrypted > data will run. You put the private key in an appropriate location for > run time. You send the public key to another host. On that other host, > you create the CEK, encrypt it with the CMK, and then upload it into the > server (CREATE COLUMN ENCRYPTION KEY). Then you can wipe that second > host. That way, you can be even more sure that the unencrypted CEK > isn't left anywhere. I'm not sure whether this method is very useful in > practice, but it's interesting. As long as it's clear to people trying this that the "public" key cannot actually be made public, I suppose. That needs to be documented IMO. I like your idea of providing a symmetric option as well. > In any case, as I mentioned above, this particular aspect is up for > discussion. > > Also note that if you use a KMS (cmklookup "run" method), the actual > algorithm doesn't even matter (depending on details of the KMS setup), > since you just tell the KMS "decrypt this", and the KMS knows by itself > what algorithm to use. Maybe there should be a way to specify "unknown" > in the ckdcmkalg field. +1, an officially client-defined method would probably be useful. > The short answer is, these same algorithms are used in equivalent > products (see MS SQL Server, MongoDB). They even reference the same > exact draft document. > > Besides that, here is my analysis for why these are good choices: You > can't use any of the counter modes, because since the encryption happens > on the client, there is no way to coordinate to avoid nonce reuse. So > among mainstream modes, you are basically left with AES-CBC with a > random IV. In that case, even if you happen to reuse an IV, the > possible damage is very contained.) I think both AES-GCM-SIV and XChaCha20-Poly1305 are designed to handle the nonce problem as well. In any case, if I were deploying this, I'd want to know the characteristics/limits of our chosen suites (e.g. how much data can be encrypted per key) so that I could plan rotations correctly. Something like the table in [1]? > > Since we're requiring "canonical" use of text format, and the docs say > > there are no embedded or trailing nulls allowed in text values, could we > > steal the use of a single zero byte to mean NULL? One additional > > complication would be that the client would have to double-check that > > we're not writing a NULL into a NOT NULL column, and complain if it > > reads one during decryption. Another complication would be that the > > client would need to complain if it got a plaintext NULL. > > You're already alluding to some of the complications. Also consider > that null values could arise from, say, outer joins. So you could be in > a situation where encrypted and unencrypted null values coexist. (I realize I'm about to wade into the pool of what NULL means in SQL, the subject of which I've stayed mostly, gleefully, ignorant.) To be honest that sounds pretty useful. 
Any unencrypted null must have come from the server computation; it's a clear claim by the server that no such rows exist. (If the encrypted column is itself NOT NULL then there's no ambiguity to begin with, I think.) That wouldn't be transparent behavior anymore, so it may (understandably) be a non-goal for the patch, but it really does sound useful. And it might be a little extreme, but if I as a user decided that I wanted in-band encrypted null, it wouldn't be particularly surprising to me if such a column couldn't be included in an outer join. Just like I can't join on a randomized encrypted column, or add two encrypted NUMERICs to each other. In fact I might even want the server to enforce NOT NULL transparently on the underlying pg_encrypted_* column, to make sure that I didn't accidentally push an unencrypted NULL by mistake. > And of > course the server doesn't know about the encrypted null values. So how > do you maintain semantics, like for aggregate functions, primary keys, > anything that treats null values specially? Could you elaborate? Any special cases seem like they'd be important to document regardless of whether or not we support in-band null encryption. For example, do you plan to support encrypted primary keys, null or not? That seems like it'd be particularly difficult during CEK rotation. > How do clients deal with a > mix of encrypted and unencrypted null values, how do they know which one > is real. That one seems s
Re: Transparent column encryption
On Mon, Jul 18, 2022 at 9:07 AM Robert Haas wrote: > Even there, what can be accomplished with a feature that only encrypts > individual column values is by nature somewhat limited. If you have a > text column that, for one row, stores the value 'a', and for some > other row, stores the entire text of Don Quixote in the original > Spanish, it is going to be really difficult to keep an adversary who > can read from the disk from distinguishing those rows. If you want to > fix that, you're going to need to do block-level encryption or > something of that sort. A minimum padding option would fix the leak here, right? If every entry is the same length then there's no information to be gained, at least in an offline analysis. I think some work around that is probably going to be needed for serious use of this encryption, in part because of the use of text format as the canonical input. If the encrypted values of 1, 10, 100, and 1000 hypothetically leaked their exact lengths, then an encrypted int wouldn't be very useful. So I'd want to quantify (and possibly configure) exactly how much data you can encrypt in a single message before the length starts being leaked, and then make sure that my encrypted values stay inside that bound. --Jacob
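To make the padding idea concrete, a minimal sketch of client-side bucket padding, with a hypothetical PAD_BUCKET granularity; the padded length depends only on which bucket the plaintext falls into, so ciphertext length reveals at most the bucket index:

#include <stddef.h>

#define PAD_BUCKET 256			/* hypothetical minimum/granularity */

/*
 * Round the plaintext length up to the next bucket boundary, always adding
 * at least one byte so that PKCS#7-style unpadding stays unambiguous.
 */
static size_t
padded_length(size_t plaintext_len)
{
	return (plaintext_len / PAD_BUCKET + 1) * PAD_BUCKET;
}

With this scheme, all four example integers (1, 10, 100, 1000), rendered in text format, pad into the same 256-byte bucket and become indistinguishable by length, while very long values still leak a coarse size.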
Re: Make name optional in CREATE STATISTICS
On Thu, 21 Jul 2022 at 18:42, Alvaro Herrera wrote: > > Thanks. I was looking at the recently modified REINDEX syntax and > noticed there another spot for taking an optional name. I ended up > reusing OptSchemaName for that, as in the attached patch. I think > adding production-specific additional productions is pointless and > probably bloats the grammar. So let me +1 your push of the patch you > posted, just to keep things moving forward, but in addition I propose to > later rename OptSchemaName to something more generic and use it in these > three places. > OK, pushed. Before writing opt_stats_name, I went looking for anything else I could use, but didn't see anything. The difference between this and the index case is that statistics objects can be schema-qualified, so it might be tricky to get something that'll work for all 3 places. Regards, Dean
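For readers catching up, the pushed change means the name can now simply be omitted; a quick illustration (the generated name shown is indicative only -- the system derives it from the table and column names):

CREATE STATISTICS ON a, b FROM t1;
-- creates an extended statistics object with a generated name such as "t1_a_b_stat"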
Re: Transparent column encryption
On Mon, Jul 18, 2022 at 12:53:23PM +0200, Peter Eisentraut wrote: > Asymmetric keys gives you some more options for how you set up the keys at > the beginning. For example, you create the asymmetric key pair on the host > where your client program that wants access to the encrypted data will run. > You put the private key in an appropriate location for run time. You send > the public key to another host. On that other host, you create the CEK, > encrypt it with the CMK, and then upload it into the server (CREATE COLUMN > ENCRYPTION KEY). Then you can wipe that second host. That way, you can be > even more sure that the unencrypted CEK isn't left anywhere. I'm not sure > whether this method is very useful in practice, but it's interesting. > > In any case, as I mentioned above, this particular aspect is up for > discussion. I caution against adding complexity without a good reason, because historically complexity often leads to exploits and bugs, especially with crypto. -- Bruce Momjian https://momjian.us EDB https://enterprisedb.com Indecision is a decision. Inaction is an action. Mark Batterson
Re: make -C libpq check fails obscurely if tap tests are disabled
On 2022-07-21 Th 04:53, Alvaro Herrera wrote: > On 2022-Jul-20, Andrew Dunstan wrote: > >> On 2022-07-20 We 13:23, Justin Pryzby wrote: >>> PATH=... && @echo "TAP tests not enabled. Try configuring with >>> --enable-tap-tests" >>> /bin/sh: 1: @echo: not found >>> >>> make is telling the shell to run "@echo" , rather than running "echo" >>> silently. >> Yeah. It's a bit ugly but I think the attached would fix it. > Here's a different take. Just assign the variable separately. Nice, I didn't know you could do that. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: Handle infinite recursion in logical replication setup
Hi, On 7/21/22 6:34 AM, vignesh C wrote: On Thu, Jul 21, 2022 at 2:06 PM Amit Kapila wrote: On Wed, Jul 20, 2022 at 2:33 PM vignesh C wrote: Modified. Apart from this I have run pgperltidy on the perl file and renamed 032_origin.pl to 030_origin.pl as currently there is 029_on_error.pl, 031_column_list.pl and there is no 030_*.pl file. Thanks for the comment, the attached patch has the changes for the same. Pushed. Kindly rebase the remaining patches. Thanks for pushing the patch. The attached v37 version contains the rebased patch for the remaining patches. Thanks for the work on this feature -- this is definitely very helpful towards supporting more types of use cases with logical replication! I've read through the proposed documentation and did some light testing of the patch. I have two general comments about the docs as they currently read: 1. I'm concerned that by calling this "Bidirectional replication" in the docs we are overstating the current capabilities. I think this is accentuated in the opening paragraph: ==snip== Bidirectional replication is useful for creating a multi-master database environment for replicating read/write operations performed by any of the member nodes. ==snip== For one, we're not replicating reads, we're replicating writes. Amongst the writes, at this point we're only replicating DML. A reader could think that deploying this yields a full bidirectional solution. (And even if we're aspirationally calling this section "Bidirectional replication", that does make it sound like we're limited to two nodes, when we can support more than two.) Perhaps "Logical replication between writers" or "Logical replication between primaries" or "Replicating changes between primaries", or something better. 2. There is no mention of conflicts in the documentation, e.g. referencing the "Conflicts" section of the documentation. It's very easy to create a conflicting transaction that causes a subscriber to be unable to continue to apply transactions:

    -- DB1
    CREATE TABLE abc (id int);
    CREATE PUBLICATION node1 FOR ALL TABLES;

    -- DB2
    CREATE TABLE abc (id int);
    CREATE PUBLICATION node2 FOR ALL TABLES;
    CREATE SUBSCRIPTION node2_node1 CONNECTION 'dbname=logi port=5433'
        PUBLICATION node1 WITH (copy_data = off, origin = none);

    -- DB1
    CREATE SUBSCRIPTION node1_node2 CONNECTION 'dbname=logi port=5434'
        PUBLICATION node2 WITH (copy_data = off, origin = none);
    INSERT INTO abc VALUES (1);

    -- DB2
    INSERT INTO abc VALUES (2);

    -- DB1
    ALTER TABLE abc ADD PRIMARY KEY (id);
    INSERT INTO abc VALUES (3);

    -- DB2
    INSERT INTO abc VALUES (3);

    -- DB1 cannot apply the transactions

At a minimum, I think we should reference the documentation we have in the logical replication section on conflicts. We may also want to advise that a user is responsible for designing their schemas in a way to minimize the risk of conflicts. Thanks, Jonathan OpenPGP_signature Description: OpenPGP digital signature
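[A hedged sketch of how an operator could clear the duplicate-key conflict above so the subscription can resume; the LSN is a placeholder, in practice it comes from the failing apply worker's error in DB1's server log:]

    -- on DB1: either remove the locally inserted conflicting row, letting the
    -- remote INSERT of 3 apply on retry (note the DELETE itself replicates
    -- back to DB2) ...
    DELETE FROM abc WHERE id = 3;

    -- ... or skip the remote transaction outright (PostgreSQL 15+);
    -- '0/14C0378' stands in for the LSN reported in the log
    ALTER SUBSCRIPTION node1_node2 SKIP (lsn = '0/14C0378');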
Undocumented Order By vs Target List Volatile Function Behavior
Hey, This came up today on twitter as a claimed POLA violation:

    postgres=# select random(), random() order by random();
           random        |       random
    ---------------------+---------------------
     0.08176638503720679 | 0.08176638503720679
    (1 row)

Which was explained long ago by Tom as: https://www.postgresql.org/message-id/9570.1193941378%40sss.pgh.pa.us The parser makes it behave equivalent to: SELECT random() AS foo ORDER BY foo; Which apparently extends to any column, even aliased ones, that use the same expression:

    postgres=# select random() as foo, random() as foo2 order by foo;
            foo         |        foo2
    --------------------+--------------------
     0.7334292196943459 | 0.7334292196943459
    (1 row)

The documentation does say: "A query using a volatile function will re-evaluate the function at every row where its value is needed." https://www.postgresql.org/docs/current/xfunc-volatility.html That sentence is insufficient to explain why, without the order by, the system chooses to evaluate random() twice, while with order by it does so only once. I propose extending the existing ORDER BY paragraph in the SELECT Command Reference as follows: "A limitation of this feature is that an ORDER BY clause applying to the result of a UNION, INTERSECT, or EXCEPT clause can only specify an output column name or number, not an expression." Add: A side-effect of this feature is that ORDER BY expressions containing volatile functions will execute the volatile function only once for the entire row; thus any column expressions using the same function will reuse the same function result. By way of example, note the output differences for the following two queries:

    postgres=# select random() as foo, random()*1 as foo2 from generate_series(1,2) order by foo;
            foo         |        foo2
    --------------------+--------------------
     0.2631492904302788 | 0.2631492904302788
     0.9019166692448664 | 0.9019166692448664
    (2 rows)

    postgres=# select random() as foo, random() as foo2 from generate_series(1,2);
            foo         |        foo2
    --------------------+--------------------
     0.7763978178239725 | 0.3569212477832773
     0.7360531822096732 | 0.7028952103643864
    (2 rows)

David J.
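[If independent evaluations are actually wanted together with a sort, one workaround, sketched here and not part of the proposed doc text, is to compute the columns in a subquery so the outer ORDER BY can no longer be unified with the inner expressions:]

    -- foo and foo2 are now evaluated separately in each row, and the
    -- outer query sorts the already-computed values
    select * from (
      select random() as foo, random() as foo2
      from generate_series(1,2)
    ) as s
    order by foo;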
Re: System column support for partitioned tables using heap
On Tue, Jul 19, 2022 at 11:22 PM Morris de Oryx wrote: > It might help if I show a sample insert handling function. The issue is with > the line at the end of the top CTE, insert_rows: > > returning xmax as inserted_transaction_id), > > That's what fails on partitions. Is there an alternative way to test what > happened to the row(s)? Here's the full function. I wrote a code generator, > so I don't have to hand-code all of these bits for each table+version: Oh I see. I didn't realize you were using INSERT .. ON CONFLICT UPDATE, but that makes tons of sense, and I don't see an obvious alternative to the way you wrote this. Hmm. -- Robert Haas EDB: http://www.enterprisedb.com
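[For context, a minimal self-contained sketch of the pattern being discussed; the table and column names are invented. The xmax reference in RETURNING is the usual trick for telling a fresh insert (xmax = 0) from a conflict-update, and it is exactly the reference that is rejected when the target is a partitioned table:]

    CREATE TABLE item (id int PRIMARY KEY, name text);

    -- works on a regular table; on a partitioned table the xmax
    -- reference in RETURNING is rejected instead
    WITH insert_rows AS (
        INSERT INTO item (id, name)
        VALUES (1, 'widget')
        ON CONFLICT (id) DO UPDATE
            SET name = EXCLUDED.name
        RETURNING xmax AS inserted_transaction_id
    )
    SELECT inserted_transaction_id = 0 AS was_plain_insert
    FROM insert_rows;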
Re: Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS
Gurjeet Singh writes: > While poking at plperl's GUC in an internal discussion, I was able to > induce a crash (or an assertion failure in assert-enabled builds) as > an unprivileged user. > My investigation so far has revealed that the code path for the > following condition has never been tested, and because of this, when a > user tries to override an SUSET param via PGOPTIONS, Postgres tries to > perform a table lookup during process initialization. Because there's > no transaction in progress, and because this table is not in the > primed caches, we end up with code trying to dereference an > uninitialized CurrentResourceOwner. Right. So there are basically two things we could do about this: 1. set_config_option could decline to call pg_parameter_aclcheck if not IsTransactionState(), instead failing the assignment. This isn't a great answer because it would result in disallowing GUC assignments that users might expect to work. 2. We could move process_session_preload_libraries() to someplace where a transaction is in progress -- probably, relocate it to inside InitPostgres(). I'm inclined to think that #2 is a better long-term solution, because it'd allow you to session-preload libraries that expect to be able to do database access during _PG_init. (Right now that'd fail with the same sort of symptoms seen here.) But there's no denying that this might have surprising side-effects for extensions that weren't expecting such a change. It could also be reasonable to do both #1 and #2, with the idea that #1 might save us from crashing if there are any other code paths where we can reach that pg_parameter_aclcheck call outside a transaction. Thoughts? regards, tom lane
Re: Refactor to make use of a common function for GetSubscriptionRelations and GetSubscriptionNotReadyRelations.
On Thu, Jul 21, 2022 at 10:11 PM Amit Kapila wrote: > > On Thu, Jul 21, 2022 at 10:03 AM Michael Paquier wrote: > > > > On Thu, Jul 21, 2022 at 09:54:05AM +0530, Amit Kapila wrote: > > > Yeah, it is not very clear to me either. I think this won't be > > > difficult to change one or another way depending on future needs. At > > > this stage, it appeared to me that bitmask is a better way to > > > represent this information but if you and other feels using enum is a > > > better idea then I am fine with that as well. > > > > Please don't get me wrong :) > > > > I favor a bitmask over an enum here, as you do, with a clean > > layer for those flags. > > > > Okay, let's see what Peter Smith has to say about this as he was in > favor of using enum here? > I was in favour of enum mostly because I thought the bitmask of an earlier patch was mis-used; IMO each bit should only be for representing something as "on/set". So a bit for SUBSCRIPTION_REL_STATE_READY makes sense, but a bit for SUBSCRIPTION_REL_STATE_NOT_READY seemed strange/backwards to me. YMMV. So using a bitmask is fine, except I thought it should be implemented so that one of the bits is for a "NOT" modifier (IIUC this is kind of similar to what Michael [1] suggested above?). So "Not READY" would be (SUBSCRIPTION_REL_STATE_MOD_NOT | SUBSCRIPTION_REL_STATE_READY) Also, it may be better to add the bit constants for every one of the current states, even if you are not needing to use all of them just yet. In fact, I thought this patch probably can implement the fully capable common function (i.e. capable of multiple keys etc) right now, so there will be no need to revisit it again in the future. -- [1] https://www.postgresql.org/message-id/Ytiuj4hLykTvBF46%40paquier.xyz Kind Regards, Peter Smith. Fujitsu Australia
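[A tiny C sketch of the NOT-modifier layout described above; the constant names, bit positions, and helper are illustrative, not taken from the actual patch:]

    /* hypothetical flag layout: one bit per concrete state ... */
    #define SUBSCRIPTION_REL_STATE_INIT     (1 << 0)
    #define SUBSCRIPTION_REL_STATE_DATASYNC (1 << 1)
    #define SUBSCRIPTION_REL_STATE_SYNCDONE (1 << 2)
    #define SUBSCRIPTION_REL_STATE_READY    (1 << 3)
    /* ... plus one modifier bit that negates the selection */
    #define SUBSCRIPTION_REL_STATE_MOD_NOT  (1 << 7)

    /* "all relations that are not READY", as described above */
    #define SUBSCRIPTION_REL_STATE_NOT_READY \
        (SUBSCRIPTION_REL_STATE_MOD_NOT | SUBSCRIPTION_REL_STATE_READY)

    /* would a relation whose state is 'state' (one bit set) match 'selector'? */
    static inline bool
    subrel_state_matches(int state, int selector)
    {
        bool hit = (state & selector & ~SUBSCRIPTION_REL_STATE_MOD_NOT) != 0;

        return (selector & SUBSCRIPTION_REL_STATE_MOD_NOT) ? !hit : hit;
    }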
Expose Parallelism counters planned/execute in pg_stat_statements
Hi all: Here's a patch that adds counters for planned/executed parallelism to pg_stat_statements, as a way to follow up on whether queries are being planned and executed with parallelism; this can help you understand whether your configuration is good or bad and whether your hardware is sufficient. We decided to store, per query, the number of times parallelism was planned and the number of times it was executed. Regards Anthony

From 99e02378b04d496698147c858c80477659fb34ad Mon Sep 17 00:00:00 2001
From: asotolongo
Date: Thu, 21 Jul 2022 17:56:59 -0400
Subject: [PATCH v1] Here's a patch to add the counters of parallelism
 planned/executed by statements to pg_stat_statements

---
 contrib/pg_stat_statements/Makefile           |  2 +-
 .../expected/oldextversions.out               | 58 
 .../pg_stat_statements--1.10--1.11.sql        | 69 +++
 .../pg_stat_statements/pg_stat_statements.c   | 58 ++--
 .../pg_stat_statements.control                |  2 +-
 .../pg_stat_statements/sql/oldextversions.sql |  5 ++
 doc/src/sgml/pgstatstatements.sgml            | 18 +
 7 files changed, 203 insertions(+), 9 deletions(-)
 create mode 100644 contrib/pg_stat_statements/pg_stat_statements--1.10--1.11.sql

diff --git a/contrib/pg_stat_statements/Makefile b/contrib/pg_stat_statements/Makefile
index edc40c8bbf..0afb9060fa 100644
--- a/contrib/pg_stat_statements/Makefile
+++ b/contrib/pg_stat_statements/Makefile
@@ -6,7 +6,7 @@
 OBJS = \
 	pg_stat_statements.o
 
 EXTENSION = pg_stat_statements
-DATA = pg_stat_statements--1.4.sql \
+DATA = pg_stat_statements--1.4.sql pg_stat_statements--1.10--1.11.sql \
 	pg_stat_statements--1.9--1.10.sql pg_stat_statements--1.8--1.9.sql \
 	pg_stat_statements--1.7--1.8.sql pg_stat_statements--1.6--1.7.sql \
 	pg_stat_statements--1.5--1.6.sql pg_stat_statements--1.4--1.5.sql \

diff --git a/contrib/pg_stat_statements/expected/oldextversions.out b/contrib/pg_stat_statements/expected/oldextversions.out
index efb2049ecf..f847127347 100644
--- a/contrib/pg_stat_statements/expected/oldextversions.out
+++ b/contrib/pg_stat_statements/expected/oldextversions.out
@@ -250,4 +250,62 @@ SELECT count(*) > 0 AS has_data FROM pg_stat_statements;
  t
 (1 row)
 
+-- New functions and views for pg_stat_statements in 1.11
+AlTER EXTENSION pg_stat_statements UPDATE TO '1.11';
+\d pg_stat_statements
+                     View "public.pg_stat_statements"
+        Column         |       Type       | Collation | Nullable | Default
+-----------------------+------------------+-----------+----------+---------
+ userid                | oid              |           |          |
+ dbid                  | oid              |           |          |
+ toplevel              | boolean          |           |          |
+ queryid               | bigint           |           |          |
+ query                 | text             |           |          |
+ plans                 | bigint           |           |          |
+ total_plan_time       | double precision |           |          |
+ min_plan_time         | double precision |           |          |
+ max_plan_time         | double precision |           |          |
+ mean_plan_time        | double precision |           |          |
+ stddev_plan_time      | double precision |           |          |
+ calls                 | bigint           |           |          |
+ total_exec_time       | double precision |           |          |
+ min_exec_time         | double precision |           |          |
+ max_exec_time         | double precision |           |          |
+ mean_exec_time        | double precision |           |          |
+ stddev_exec_time      | double precision |           |          |
+ rows                  | bigint           |           |          |
+ shared_blks_hit       | bigint           |           |          |
+ shared_blks_read      | bigint           |           |          |
+ shared_blks_dirtied   | bigint           |           |          |
+ shared_blks_written   | bigint           |           |          |
+ local_blks_hit        | bigint           |           |          |
+ local_blks_read       | bigint           |           |          |
+ local_blks_dirtied    | bigint           |           |          |
+ local_blks_written    | bigint           |           |          |
+ temp_blks_read        | bigint           |           |          |
+ temp_blks_written     | bigint           |           |          |
+ blk_read_time         | double precision |           |          |
+ blk_write_time        | double precision |           |          |
+ temp_blk_read_time    | double precision |           |          |
+ temp_blk_write_time   | double precision |           |          |
+ wal_records           | bigint           |           |          |
+ wal_fpi               | bigint           |           |          |
+ wal_bytes
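[A sketch of how one might consume such counters; the two parallelism column names below are hypothetical, since the visible part of the diff is cut off before the new view columns appear:]

    -- hypothetical column names for the new counters
    SELECT query, plans, calls,
           parallelized_queries_planned, parallelized_queries_executed
    FROM pg_stat_statements
    ORDER BY parallelized_queries_executed DESC
    LIMIT 10;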
Re: Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS
On Thu, Jul 21, 2022 at 05:44:11PM -0400, Tom Lane wrote: > Right. So there are basically two things we could do about this: > > 1. set_config_option could decline to call pg_parameter_aclcheck > if not IsTransactionState(), instead failing the assignment. > This isn't a great answer because it would result in disallowing > GUC assignments that users might expect to work. > > 2. We could move process_session_preload_libraries() to someplace > where a transaction is in progress -- probably, relocate it to > inside InitPostgres(). > > I'm inclined to think that #2 is a better long-term solution, > because it'd allow you to session-preload libraries that expect > to be able to do database access during _PG_init. (Right now > that'd fail with the same sort of symptoms seen here.) But > there's no denying that this might have surprising side-effects > for extensions that weren't expecting such a change. > > It could also be reasonable to do both #1 and #2, with the idea > that #1 might save us from crashing if there are any other > code paths where we can reach that pg_parameter_aclcheck call > outside a transaction. > > Thoughts? I wrote up a small patch along the same lines as #2 before seeing this message. It simply ensures that process_session_preload_libraries() is called within a transaction. I don't have a strong opinion about doing it this way versus moving this call somewhere else as you proposed, but I'd agree that #2 is a better long-term solution than #1. AFAICT shared_preload_libraries, even with EXEC_BACKEND, should not have the same problem. I'm not sure whether we should be worried about libraries that are already creating transactions in their _PG_init() functions. Off the top of my head, I don't recall seeing anything like that. Even if it does impact some extensions, it doesn't seem like it'd be too much trouble to fix.

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 8ba1c170f0..fd471d74a3 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -4115,8 +4115,15 @@ PostgresMain(const char *dbname, const char *username)
 	/*
 	 * process any libraries that should be preloaded at backend start (this
 	 * likewise can't be done until GUC settings are complete)
+	 *
+	 * If the user provided a setting at session startup for a custom GUC
+	 * defined by one of these libraries, we might need syscache access when
+	 * evaluating whether she has permission to set it, so do this step within
+	 * a transaction.
 	 */
+	StartTransactionCommand();
 	process_session_preload_libraries();
+	CommitTransactionCommand();
 
 	/*
 	 * Send this backend's cancellation info to the frontend.

-- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: [PATCH] Log details for client certificate failures
On Wed, Jul 20, 2022 at 3:42 PM Tom Lane wrote: > Jacob Champion writes: > > I'm currently hardcoding an elevel of ERROR on the new guc_strdup()s, > > because that seems to be a common case for the check hooks. > > Really? That's almost certainly NOT okay. As an example, if you > have a problem with a new value loaded from postgresql.conf during > SIGHUP processing, throwing ERROR will cause the postmaster to exit. v4 attempts to fix this by letting the check hooks pass MCXT_ALLOC_NO_OOM to pg_clean_ascii(). (It's ignored in the frontend, which just mallocs.) > I wouldn't be too surprised if there are isolated cases where people > didn't understand what they were doing and wrote that, but that > needs to be fixed not emulated. I might be missing something, but in guc.c at least it appears to be the rule and not the exception. Thanks, --Jacob

diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index 5318719acb..1394ecaa2b 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -1116,7 +1116,7 @@ prepare_cert_name(char *name)
 
 #undef MAXLEN
 
-	return pg_clean_ascii(truncated);
+	return pg_clean_ascii(truncated, 0);
 }
 
 /*

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 5e8cd770c0..52fb2e52f1 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -2284,7 +2284,7 @@ retry1:
 			 */
 			if (strcmp(nameptr, "application_name") == 0)
 			{
-				port->application_name = pg_clean_ascii(valptr);
+				port->application_name = pg_clean_ascii(valptr, 0);
 			}
 		}
 		offset = valoffset + strlen(valptr) + 1;

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 60400752e5..2f99fe9a6d 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -12479,9 +12479,18 @@ assign_maintenance_io_concurrency(int newval, void *extra)
 static bool
 check_application_name(char **newval, void **extra, GucSource source)
 {
+	char	   *clean;
+
 	/* Only allow clean ASCII chars in the application name */
-	*newval = guc_strdup(ERROR, pg_clean_ascii(*newval));
+	clean = pg_clean_ascii(*newval, MCXT_ALLOC_NO_OOM);
+	if (!clean)
+		return false;
+
+	clean = guc_strdup(WARNING, clean);
+	if (!clean)
+		return false;
 
+	*newval = clean;
 	return true;
 }
 
@@ -12495,9 +12504,18 @@ assign_application_name(const char *newval, void *extra)
 static bool
 check_cluster_name(char **newval, void **extra, GucSource source)
 {
+	char	   *clean;
+
 	/* Only allow clean ASCII chars in the cluster name */
-	*newval = guc_strdup(ERROR, pg_clean_ascii(*newval));
+	clean = pg_clean_ascii(*newval, MCXT_ALLOC_NO_OOM);
+	if (!clean)
+		return false;
+
+	clean = guc_strdup(WARNING, clean);
+	if (!clean)
+		return false;
 
+	*newval = clean;
 	return true;
 }

diff --git a/src/common/string.c b/src/common/string.c
index db15324c62..97b3d45d36 100644
--- a/src/common/string.c
+++ b/src/common/string.c
@@ -62,7 +62,10 @@ strtoint(const char *pg_restrict str, char **pg_restrict endptr, int base)
 /*
  * pg_clean_ascii -- Replace any non-ASCII chars with a "\xXX" string
  *
- * Makes a palloc'd copy of the string passed in, which must be '\0'-terminated.
+ * Makes a newly allocated copy of the string passed in, which must be
+ * '\0'-terminated. In the backend, additional alloc_flags may be provided and
+ * will be passed as-is to palloc_extended(); in the frontend, alloc_flags is
+ * ignored and the copy is malloc'd.
  *
  * This function exists specifically to deal with filtering out
  * non-ASCII characters in a few places where the client can provide an almost

@@ -80,23 +83,46 @@ strtoint(const char *pg_restrict str, char **pg_restrict endptr, int base)
  * for now.
  */
 char *
-pg_clean_ascii(const char *str)
+pg_clean_ascii(const char *str, int alloc_flags)
 {
-	StringInfoData buf;
-	const char *p;
+	size_t		dstlen;
+	char	   *dst;
+	const char *p;
+	size_t		i = 0;
 
-	initStringInfo(&buf);
+	/* Worst case, each byte can become four bytes, plus a null terminator. */
+	dstlen = strlen(str) * 4 + 1;
+
+#ifdef FRONTEND
+	dst = malloc(dstlen);
+#else
+	dst = palloc_extended(dstlen, alloc_flags);
+#endif
+
+	if (!dst)
+		return NULL;
 
 	for (p = str; *p != '\0'; p++)
 	{
+		/* Only allow clean ASCII chars in the string */
 		if (*p < 32 || *p > 126)
-			appendStringInfo(&buf
Re: Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS
Nathan Bossart writes:
> +	StartTransactionCommand();
> 	process_session_preload_libraries();
> +	CommitTransactionCommand();
Yeah, that way would avoid any questions about changing the order of operations, but it seems like a mighty expensive solution: it's adding a transaction to each backend start on the off chance that (a) session_preload_libraries/local_preload_libraries is nonempty and (b) the loaded libraries are going to do anything where it'd matter. So that's why I thought of moving the call inside a pre-existing transaction. If we had to back-patch this into any released versions, I'd agree with taking the performance hit in order to reduce the chance of side-effects. But I think as long as we only have to do it in v15, it's not too late to possibly cause some compatibility issues for extensions. regards, tom lane
Re: Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS
On Thu, Jul 21, 2022 at 2:44 PM Tom Lane wrote: > > Gurjeet Singh writes: > > While poking at plperl's GUC in an internal discussion, I was able to > > induce a crash (or an assertion failure in assert-enabled builds) as > > an unprivileged user. > > My investigation so far has revealed that the code path for the > > following condition has never been tested, and because of this, when a > > user tries to override an SUSET param via PGOPTIONS, Postgres tries to > > perform a table lookup during process initialization. Because there's > > no transaction in progress, and because this table is not in the > > primed caches, we end up with code trying to dereference an > > uninitialized CurrentResourceOwner. > > Right. So there are basically two things we could do about this: > > 1. set_config_option could decline to call pg_parameter_aclcheck > if not IsTransactionState(), instead failing the assignment. > This isn't a great answer because it would result in disallowing > GUC assignments that users might expect to work. > > 2. We could move process_session_preload_libraries() to someplace > where a transaction is in progress -- probably, relocate it to > inside InitPostgres(). > > I'm inclined to think that #2 is a better long-term solution, > because it'd allow you to session-preload libraries that expect > to be able to do database access during _PG_init. (Right now > that'd fail with the same sort of symptoms seen here.) But > there's no denying that this might have surprising side-effects > for extensions that weren't expecting such a change. > > It could also be reasonable to do both #1 and #2, with the idea > that #1 might save us from crashing if there are any other > code paths where we can reach that pg_parameter_aclcheck call > outside a transaction. > > Thoughts? I had debated just wrapping the process_session_preload_libraries() call with a transaction, like Nathan's patch posted ealier on this thread does. But I hesitated because of the sensitivity around the order of operations/call during process initialization. I like the idea of performing library initialization in InitPostgres(), as it performs the first transaction of the connection, and because of the libraries' ability to gin up new GUC variables that might need special handling, and also if it allows them to do database access. I think anywhere after the 'PostAuthDelay' check in InitPostgres() would be a good place to perform process_session_preload_libraries(). I'm inclined to invoke it as late as possible, before we commit the transaction. As for making set_config_option() throw an error if not in transaction, I'm not a big fan of checks that break the flow, and of unrelated code showing up when reading a function. For a casual reader, such a check for transaction would make for a jarring experience; "why are we checking for active transaction in the guts of guc.c?", they might think. If anything, such an error should be thrown from or below pg_parameter_aclcheck(). But I am not sure if it should be exposed as an error. A user encountering that error is not at fault. Hence I believe an assertion check is more suitable for catching code that invokes set_config_option() outside a transaction. Best regards, Gurjeet http://Gurje.et
Re: Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS
On Thu, Jul 21, 2022 at 3:29 PM Nathan Bossart wrote: > > On Thu, Jul 21, 2022 at 05:44:11PM -0400, Tom Lane wrote: > > Right. So there are basically two things we could do about this: > > > > 1. set_config_option could decline to call pg_parameter_aclcheck > > if not IsTransactionState(), instead failing the assignment. > > This isn't a great answer because it would result in disallowing > > GUC assignments that users might expect to work. > > > > 2. We could move process_session_preload_libraries() to someplace > > where a transaction is in progress -- probably, relocate it to > > inside InitPostgres(). > > > > I'm inclined to think that #2 is a better long-term solution, > > because it'd allow you to session-preload libraries that expect > > to be able to do database access during _PG_init. (Right now > > that'd fail with the same sort of symptoms seen here.) But > > there's no denying that this might have surprising side-effects > > for extensions that weren't expecting such a change. > > > > It could also be reasonable to do both #1 and #2, with the idea > > that #1 might save us from crashing if there are any other > > code paths where we can reach that pg_parameter_aclcheck call > > outside a transaction. > > > > Thoughts? > > I wrote up a small patch along the same lines as #2 before seeing this > message. It simply ensures that process_session_preload_libraries() is > called within a transaction. I don't have a strong opinion about doing it > this way versus moving this call somewhere else as you proposed, but I'd > agree that #2 is a better long-term solution than #1. AFAICT > shared_preload_libraries, even with EXEC_BACKEND, should not have the same > problem. > > I'm not sure whether we should be worried about libraries that are already > creating transactions in their _PG_init() functions. Off the top of my > head, I don't recall seeing anything like that. Even if it does impact > some extensions, it doesn't seem like it'd be too much trouble to fix.
>
> diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
> index 8ba1c170f0..fd471d74a3 100644
> --- a/src/backend/tcop/postgres.c
> +++ b/src/backend/tcop/postgres.c
> @@ -4115,8 +4115,15 @@ PostgresMain(const char *dbname, const char *username)
>  	/*
>  	 * process any libraries that should be preloaded at backend start (this
>  	 * likewise can't be done until GUC settings are complete)
> +	 *
> +	 * If the user provided a setting at session startup for a custom GUC
> +	 * defined by one of these libraries, we might need syscache access when
> +	 * evaluating whether she has permission to set it, so do this step within
> +	 * a transaction.
>  	 */
> +	StartTransactionCommand();
>  	process_session_preload_libraries();
> +	CommitTransactionCommand();
>
>  	/*
>  	 * Send this backend's cancellation info to the frontend.
(none of the following is your patch's fault) I don't think that is a good call site for process_session_preload_libraries(), because a library being loaded can declare its own GUCs, hence I believe this should be called at least before the call to BeginReportingGUCOptions(). If an extension creates a GUC with the GUC_REPORT flag, it is violating expectations. But since the DefineCustomXVariable() stack does not prevent the callers from doing so, we must still honor the protocol followed for all params with GUC_REPORT. Hence I think it'd be a good idea to ban the callers of DefineCustomXVariable() from declaring their variable GUC_REPORT, to ensure that only core code can define such variables.
Best regards, Gurjeet http://Gurje.et
Re: Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS
On Thu, Jul 21, 2022 at 07:30:20PM -0400, Tom Lane wrote: > Nathan Bossart writes: >> +StartTransactionCommand(); >> process_session_preload_libraries(); >> +CommitTransactionCommand(); > > Yeah, that way would avoid any questions about changing the order of > operations, but it seems like a mighty expensive solution: it's > adding a transaction to each backend start on the off chance that > (a) session_preload_libraries/local_preload_libraries is nonempty and > (b) the loaded libraries are going to do anything where it'd matter. > So that's why I thought of moving the call inside a pre-existing > transaction. > > If we had to back-patch this into any released versions, I'd agree with > taking the performance hit in order to reduce the chance of side-effects. > But I think as long as we only have to do it in v15, it's not too late to > possibly cause some compatibility issues for extensions. Yeah, fair point. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: standby recovery fails (tablespace related) (tentative patch and discussion)
At Thu, 21 Jul 2022 23:14:57 +1200, Thomas Munro wrote in > On Thu, Jul 21, 2022 at 11:01 PM Alvaro Herrera > wrote: > > On 2022-Jul-20, Alvaro Herrera wrote: > > > I see the following alternatives: > > > > > > 1. not backpatch this fix to 14 and older > > > 2. use a different GUC; either allow_invalid_pages as previously > > >suggested, or create a new one just for this purpose > > > 3. not provide any overriding mechanism in versions 14 and older > > > > I've got no opinions on this. I don't like either 1 or 3, so I'm going > > to add and backpatch a new GUC allow_recovery_tablespaces as the > > override mechanism. > > > > If others disagree with this choice, please speak up. > > Would it help if we back-patched the allow_in_place_tablespaces stuff? > I'm not sure how hard/destabilising that would be, but I could take a > look tomorrow. +1. An additional reason, for me, is that it is a developer option. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Expose Parallelism counters planned/execute in pg_stat_statements
On Thu, Jul 21, 2022 at 06:26:58PM -0400, Anthony Sotolongo wrote: > Hi all: > Here's a patch that adds counters for planned/executed parallelism to > pg_stat_statements, as a way to follow up on whether queries are being > planned and executed with parallelism; this can help you understand whether > your configuration is good or bad and whether your hardware is sufficient +1, I was missing something like this before, but it didn't occur to me to use PSS: https://www.postgresql.org/message-id/20200310190142.gb29...@telsasoft.com
> My hope is to answer questions like these:
> . is query (ever? usually?) using parallel paths?
> . is query usefully using parallel paths?
> . what queries are my max_parallel_workers(_per_gather) being used for ?
> . Are certain longrunning or frequently running queries which are using parallel paths using all max_parallel_workers and precluding other queries from using parallel query ? Or, are semi-short queries sometimes precluding longrunning queries from using parallelism, when the long queries would better benefit ?
This patch is storing the number of times the query was planned/executed using parallelism, but not the number of workers. Would it make sense to instead store the *number* of workers launched/planned ? Otherwise, it might be that a query is consistently planned to use a large number of workers, but then runs with few. I'm referring to the fields shown in "explain/analyze". (Then, the 2nd field should be renamed to "launched").

    Workers Planned: 2
    Workers Launched: 2

I don't think this is doing the right thing for prepared statements, like PQprepare()/PQexecPrepared(), or SQL: PREPARE p AS SELECT; EXECUTE p; Right now, the docs say that it shows the "number of times the statement was planned to use parallelism", but the planning counter is incremented during each execution. PSS already shows "calls" and "plans" separately. The documentation doesn't mention prepared statements as a reason why they wouldn't match, which seems like a deficiency. This currently doesn't count parallel workers used by utility statements, such as CREATE INDEX and VACUUM (see max_parallel_maintenance_workers). If that's not easy to do, mention that in the docs as a limitation. You should try to add some test to contrib/pg_stat_statements/sql, or add a parallelism test to an existing test. Note that the number of parallel workers launched isn't stable, so you can't test that part.. You modified pgss_store() to take two booleans, but pass "NULL" instead of "false". Curiously, of all the compilers in cirrusci, only MSVC complained .. "planed" is actually spelled "planned", with two enns. The patch has some leading/trailing whitespace (maybe shown by git log depending on your configuration). Please add this patch to the next commitfest. https://commitfest.postgresql.org/39/ -- Justin
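[On the testing point: a sketch of how the patch author could get a deterministic parallel plan in a regression test using standard cost GUCs; the table name is invented, and only the plan shape is stable, not the worker counts:]

    -- force a parallel plan without depending on actual worker startup
    SET parallel_setup_cost = 0;
    SET parallel_tuple_cost = 0;
    SET min_parallel_table_scan_size = 0;
    SET max_parallel_workers_per_gather = 2;
    CREATE TABLE pss_parallel_tab AS
        SELECT g AS id FROM generate_series(1, 10000) g;
    -- this should now be planned with a Gather node, bumping the
    -- patch's planned-parallelism counter exactly once
    SELECT count(*) FROM pss_parallel_tab;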
Re: Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS
On Thu, Jul 21, 2022 at 4:35 PM Gurjeet Singh wrote: > I like the idea of performing library initialization in > InitPostgres(), as it performs the first transaction of the > connection, and because of the libraries' ability to gin up new GUC > variables that might need special handling, and also if it allows them > to do database access. > > I think anywhere after the 'PostAuthDelay' check in InitPostgres() > would be a good place to perform process_session_preload_libraries(). > I'm inclined to invoke it as late as possible, before we commit the > transaction. > > As for making set_config_option() throw an error if not in > transaction, I'm not a big fan of checks that break the flow, and of > unrelated code showing up when reading a function. For a casual > reader, such a check for transaction would make for a jarring > experience; "why are we checking for active transaction in the guts > of guc.c?", they might think. If anything, such an error should be > thrown from or below pg_parameter_aclcheck(). > > But I am not sure if it should be exposed as an error. A user > encountering that error is not at fault. Hence I believe an assertion > check is more suitable for catching code that invokes > set_config_option() outside a transaction. Please see the attached patch that implements the above proposal. The process_session_preload_libraries() call has been moved to the end of InitPostgres(), just before we report the backend to PgBackendStatus and commit the first transaction. One notable side effect of this change is that process_session_preload_libraries() is now called _before_ we SetProcessingMode(NormalProcessing). Which means any database access performed by _PG_init() of an extension will be doing it in InitProcessing mode. I'm not sure if that's problematic. The patch also adds an assertion in pg_parameter_aclcheck() to ensure that a transaction is in progress before it's called. The patch now lets the user connect, throws a warning, and does not crash.

    $ PGOPTIONS="-c plperl.on_plperl_init=" psql -U test
    WARNING:  permission denied to set parameter "plperl.on_plperl_init"
    Expanded display is used automatically.
    psql (15beta1)
    Type "help" for help.
    postgres@B:694512=>

Best regards, Gurjeet http://Gurje.et move_process_session_preload_libraries.patch Description: Binary data
Re: Make name optional in CREATE STATISTICS
On Thu, Jul 21, 2022 at 07:42:12PM +0200, Alvaro Herrera wrote: > Thanks. I was looking at the recently modified REINDEX syntax and > noticed there another spot for taking an optional name. I ended up > reusing OptSchemaName for that, as in the attached patch. I think > adding production-specific additional productions is pointless and > probably bloats the grammar. So let me +1 your push of the patch you > posted, just to keep things moving forward, but in addition I propose to > later rename OptSchemaName to something more generic and use it in these > three places. This slightly changes the behavior of the grammar, as CONCURRENTLY was working on DATABASE as follows:

* On HEAD:
=# reindex database concurrently postgres;
REINDEX
=# reindex (concurrently) database concurrently postgres;
REINDEX
=# reindex (concurrently) database ;
REINDEX
=# reindex (concurrently) database postgres;
REINDEX
=# reindex database concurrently postgres;
REINDEX
=# reindex database concurrently;
ERROR:  42601: syntax error at or near ";"

And actually, even on HEAD, the last case is marked as supported by the docs but we don't allow it in the parser. My mistake on this one. Now, with the patch, I get:

=# reindex (concurrently) database concurrently;
ERROR:  42601: syntax error at or near "concurrently"
LINE 1: reindex (concurrently) database concurrently postgres ;
=# reindex (concurrently) database concurrently postgres;
ERROR:  42601: syntax error at or near "concurrently"
LINE 1: reindex (concurrently) database concurrently postgres;
=# reindex (concurrently) database ;
REINDEX
=# reindex (concurrently) database postgres;
REINDEX
=# reindex database concurrently postgres;
ERROR:  42601: syntax error at or near "concurrently"
LINE 1: reindex database concurrently postgres;
=# reindex database concurrently;
ERROR:  42601: syntax error at or near "concurrently"

So this indeed has the effect of making CONCURRENTLY usable for DATABASE and SYSTEM only within the parenthesized grammar. Seeing the simplifications this creates, I'd agree with dropping this part of the grammar. I think that I would add the following queries to create_index.sql to test this grammar, as we can rely on this code path generating an error:

REINDEX (CONCURRENTLY) SYSTEM postgres;
REINDEX (CONCURRENTLY) SYSTEM;

At least it would validate the parsing for DATABASE. This breaks reindexdb for the database case, because the query generated in run_reindex_command() adds CONCURRENTLY only *after* the database name, and we should be careful to generate something backward-compatible in this tool, as well. The fix is simple: you can add CONCURRENTLY within the section with TABLESPACE and VERBOSE for a connection >= 14, and use it after the object name for <= 13. -- Michael signature.asc Description: PGP signature
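[To make the reindexdb fix concrete, a sketch of the SQL the tool would need to emit for each server generation; the database name is illustrative, and the pre-14 form matches the sessions shown above:]

    -- server version >= 14: CONCURRENTLY goes in the parenthesized option list
    REINDEX (VERBOSE, CONCURRENTLY) DATABASE postgres;
    -- older servers: the keyword form accepted by the pre-14 grammar
    REINDEX DATABASE CONCURRENTLY postgres;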
Re: Fwd: Unprivileged user can induce crash by using an SUSET param in PGOPTIONS
On Thu, Jul 21, 2022 at 05:39:35PM -0700, Gurjeet Singh wrote: > One notable side effect of this change is that > process_session_preload_libraries() is now called _before_ we > SetProcessingMode(NormalProcessing). Which means any database access > performed by _PG_init() of an extension will be doing it in > InitProcessing mode. I'm not sure if that's problematic. I cannot see a reason why, off the top of my head. The restrictions of InitProcessing apply to two code paths of bgworkers connecting to a database, and normal processing is used as a barrier to prevent the creation of some objects. > The patch also adds an assertion in pg_parameter_aclcheck() to ensure > that a transaction is in progress before it's called.

+	/* It's pointless to call this function, unless we're in a transaction. */
+	Assert(IsTransactionState());

This can involve extension code, so I think that this should be at least an elog(ERROR), so that we have a better chance of knowing if something still goes wrong in the wild. > The patch now lets the user connect, throws a warning, and does not crash. > > $ PGOPTIONS="-c plperl.on_plperl_init=" psql -U test > WARNING: permission denied to set parameter "plperl.on_plperl_init" > Expanded display is used automatically. > psql (15beta1) > Type "help" for help. I am wondering whether we'd better have a test on this one with a non-superuser. Except for a few tests in the unsafe section, session_preload_libraries has a limited amount of coverage.

+	/*
+	 * process any libraries that should be preloaded at backend start (this
+	 * can't be done until GUC settings are complete). Note that these
+	 * libraries can declare new GUC variables.
+	 */
+	process_session_preload_libraries();

There is no point in doing that during bootstrap anyway, no? -- Michael signature.asc Description: PGP signature
Re: standby recovery fails (tablespace related) (tentative patch and discussion)
At Thu, 21 Jul 2022 13:25:05 +0200, Alvaro Herrera wrote in > On 2022-Jul-21, Alvaro Herrera wrote: > > > Yeah, I think that would reduce cruft. I'm not sure this is more > > against backpatching policy or less, compared to adding a separate > > GUC just for this bugfix. > > cruft:
>
> 	{
> 		{"allow_recovery_tablespaces", PG_POSTMASTER, WAL_RECOVERY,
> 			gettext_noop("Continues recovery after finding invalid database directories."),
> 			gettext_noop("It is possible for tablespace drop to interfere with database creation "
> 						 "so that WAL replay is forced to create fake database directories. "
> 						 "These should have been dropped by the time recovery ends; "
> 						 "but in case they aren't, this option lets recovery continue if they "
> 						 "are present. Note that these directories must be removed manually afterwards."),
> 			GUC_NOT_IN_SAMPLE
> 		},
> 		&allow_recovery_tablespaces,
> 		false,
> 		NULL, NULL, NULL
> 	},
>
> This is not a very good explanation, but I don't know how to make it > better. It looks a bit too detailed. I crafted the following.. Recovery can create tentative in-place tablespace directories under pg_tblspc/. They are expected to be removed by the time recovery reaches consistency; otherwise PostgreSQL raises a PANIC-level error and aborts recovery. Setting allow_recovery_tablespaces to true causes the system to allow such directories during normal operation. If those directories are still present after reaching consistency, that implies data loss and metadata inconsistency, and may cause future tablespace creation to fail. Though, after writing this, I have come to think that piggy-backing on allow_in_place_tablespaces might be a bit different.. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: Memory leak fix in psql
On Thu, Jul 21, 2022 at 02:43:03PM +0800, Japin Li wrote: > I attached a patch for v14 [1] based on master, if you want to apply it, > please consider reviewing it. We are talking about a few hundred bytes leaked each time, so this does not worry me much in the older branches, honestly. -- Michael signature.asc Description: PGP signature
Re: Reducing logs produced by TAP tests running pg_regress on crash
On Thu, Jun 23, 2022 at 02:30:13PM +0900, Michael Paquier wrote: > One idea I got to limit the useless output generated is to check the > status of the cluster after running the regression test suite as > restart_after_crash is disabled by default in Cluster.pm, and avoid any > follow-up logic in these tests if the cluster is not running anymore, > as of the attached. So, this is still an open item.. Thomas, any objections about this one? Checking for the status of the node after completing pg_regress still sounds like a good idea to me, because as restart_after_crash is disabled we would generate a ton of logs coming from regression.diffs for nothing. On top of that, the parallel connections make it harder to find which query failed, and the logs of the backend already provide enough context on a hard crash. -- Michael signature.asc Description: PGP signature
Re: Reducing logs produced by TAP tests running pg_regress on crash
On Fri, Jul 22, 2022 at 1:09 PM Michael Paquier wrote: > On Thu, Jun 23, 2022 at 02:30:13PM +0900, Michael Paquier wrote: > > One idea I got to limit the useless output generated is to check the > > status of the cluster after running the regression test suite as > > restart_on_crash is disabled by default in Cluster.pm, and avoid any > > follow-up logic in these tests if the cluster is not running anymore, > > as of the attached. > > So, this is still an open item.. > > Thomas, any objections about this one? Checking for the status of the > node after completing pg_regress still sounds like a good idea to me, > because as restart_after_crash is disabled we would generate a ton of > logs coming from regression.diffs for nothing. On top of that the > parallel connections make harder finding which query failed, and the > logs of the backend provide enough context already on a hard crash. What if the clue we need to understand why it crashed was in the regression diffs that we didn't dump? I wonder if we should move the noise suppression check closer to pg_regress, so that it works also for the "main" pg_regress run, not only the one in this new TAP test. As discussed in this thread, inconclusively: https://www.postgresql.org/message-id/flat/CA%2BhUKGL7hxqbadkto7e1FCOLQhuHg%3DwVn_PDZd6fDMbQrrZisA%40mail.gmail.com