Re: Useless field ispartitioned in CreateStmtContext
On 2024-Nov-05, hugo wrote:

> Hi, Kirill
>
> Sorry for the late reply, thanks for your suggestion.
> A simple fix has been added to the attached patch.

Actually, AFAICT my patch at https://commitfest.postgresql.org/50/5224/ adds a use of this field, so if you remove it, I might have to put it back for that.

-- 
Álvaro Herrera         Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"But static content is just dynamic content that isn't moving!"
                http://smylers.hates-software.com/2007/08/15/fe244d0c.html
Re: doc fail about ALTER TABLE ATTACH re. NO INHERIT
On 2024-Nov-05, Alvaro Herrera wrote:

> All in all, I think this text serves no purpose and should be removed
> (from all live branches), as in the attached patch.

Attached here.

-- 
Álvaro Herrera         Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Fairy tales do not give the child his first idea of monsters. What they give
him is his first idea of the possible defeat of the monster." (G. K. Chesterton)

From 718c3e17a76f4adf219f4e1d1aaf79766efaeb02 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=C3=81lvaro=20Herrera?=
Date: Tue, 5 Nov 2024 12:52:47 +0100
Subject: [PATCH] doc: Remove incorrect assertion about ALTER TABLE failing

---
 doc/src/sgml/ref/alter_table.sgml | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
index 36770c012a6..c054530e723 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -1026,10 +1026,6 @@ WITH ( MODULUS numeric_literal, REM
       UNIQUE and PRIMARY KEY constraints from the parent table will be
       created in the partition, if they don't already exist.
-      If any of the CHECK constraints of the table being
-      attached are marked NO INHERIT, the command will fail;
-      such constraints must be recreated without the
-      NO INHERIT clause.
-- 
2.39.5
Re: SQL:2011 application time
Foreign key violation errors are incorrectly raised in a few cases for a temporal foreign key with the default ON UPDATE NO ACTION. The test is based on the committed v39 patches (using a snapshot version of PG18 devel available from PGDG).

Suppose there is a single referencing row for a foreign key (with the default ON UPDATE NO ACTION) whose range is [c,d), and a single row in the referenced table. If the referenced row's range is updated as in one of the following cases:

  ERROR 1: [a,f) updated to [b,f), or [a,f) updated to [a,e)
  ERROR 2: [a,)  updated to [a,e)
  ERROR 3: [a,)  updated to [b,)

(where in each case the new referenced range still covers the referencing range [c,d)), then an error is incorrectly raised. Also, if the referencing range is [c,) instead of [c,d), then the last case also fails. See the SQL code below for how to reproduce the errors.

---

CREATE TABLE temporal_rng (
  id int4range,
  valid_at daterange,
  CONSTRAINT temporal_rng_pk PRIMARY KEY (id, valid_at WITHOUT OVERLAPS)
);

CREATE TABLE temporal_fk_rng2rng (
  id int4range,
  valid_at daterange,
  parent_id int4range,
  CONSTRAINT temporal_fk_rng2rng_pk PRIMARY KEY (id, valid_at WITHOUT OVERLAPS),
  CONSTRAINT temporal_fk_rng2rng_fk FOREIGN KEY (parent_id, PERIOD valid_at)
    REFERENCES temporal_rng
);

-- ERROR 1
INSERT INTO temporal_rng (id, valid_at) VALUES ('[1,2)', daterange('2018-01-01', '2018-03-01'));
INSERT INTO temporal_fk_rng2rng (id, valid_at, parent_id) VALUES ('[2,3)', daterange('2018-01-15', '2018-02-01'), '[1,2)');

-- ERROR:  update or delete on table "temporal_rng" violates foreign key constraint "temporal_fk_rng2rng_fk" on table "temporal_fk_rng2rng"
-- DETAIL:  Key (id, valid_at)=([1,2), [2018-01-01,2018-03-01)) is still referenced from table "temporal_fk_rng2rng".
UPDATE temporal_rng SET valid_at = daterange('2018-01-05', '2018-03-01')
  WHERE id = '[1,2)' AND valid_at = daterange('2018-01-01', '2018-03-01');

-- ERROR:  update or delete on table "temporal_rng" violates foreign key constraint "temporal_fk_rng2rng_fk" on table "temporal_fk_rng2rng"
-- DETAIL:  Key (id, valid_at)=([1,2), [2018-01-01,2018-03-01)) is still referenced from table "temporal_fk_rng2rng".
UPDATE temporal_rng SET valid_at = daterange('2018-01-01', '2018-02-15')
  WHERE id = '[1,2)' AND valid_at = daterange('2018-01-01', '2018-03-01');

-- ERROR 2
TRUNCATE temporal_rng, temporal_fk_rng2rng;
INSERT INTO temporal_rng (id, valid_at) VALUES ('[1,2)', daterange('2018-01-05', NULL));
INSERT INTO temporal_fk_rng2rng (id, valid_at, parent_id) VALUES ('[2,3)', daterange('2018-01-15', '2018-02-01'), '[1,2)');

-- ERROR:  update or delete on table "temporal_rng" violates foreign key constraint "temporal_fk_rng2rng_fk" on table "temporal_fk_rng2rng"
-- DETAIL:  Key (id, valid_at)=([1,2), [2018-01-05,)) is still referenced from table "temporal_fk_rng2rng".
UPDATE temporal_rng SET valid_at = daterange('2018-01-05', '2018-02-15')
  WHERE id = '[1,2)' AND valid_at = daterange('2018-01-05', NULL);

-- ERROR 3
TRUNCATE temporal_rng, temporal_fk_rng2rng;
INSERT INTO temporal_rng (id, valid_at) VALUES ('[1,2)', daterange('2018-01-01', NULL));
INSERT INTO temporal_fk_rng2rng (id, valid_at, parent_id) VALUES ('[2,3)', daterange('2018-01-15', '2018-02-01'), '[1,2)');

-- ERROR:  update or delete on table "temporal_rng" violates foreign key constraint "temporal_fk_rng2rng_fk" on table "temporal_fk_rng2rng"
-- DETAIL:  Key (id, valid_at)=([1,2), [2018-01-01,)) is still referenced from table "temporal_fk_rng2rng".
UPDATE temporal_rng SET valid_at = daterange('2018-01-05', NULL)
  WHERE id = '[1,2)' AND valid_at = daterange('2018-01-01', NULL);

---

I think the problem is the check in ri_restrict:

    SELECT 1 FROM [ONLY] x WHERE $1 = fkatt1 [AND ...] FOR KEY SHARE OF x

It will be performed in the NO ACTION case when ri_Check_Pk_Match returns false, and it'll then incorrectly assume that the presence of a referencing row in the referencing table is an error. However, ri_Check_Pk_Match only tests whether a temporal primary key's old range is contained by the multirange that includes its new updated range. If that's true, then all references are necessarily still valid. However, even if it is not contained, all references can still be valid. So, only testing for the presence of a referencing row is not enough.

For example, for ERROR 1, the range [a,f) is updated to [b,f). Clearly the old range [a,f) is no longer contained by (the multirange returned by range_agg of) the new range [b,f)
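To make that containment test concrete, here is a small illustration for the ERROR 1 case (a sketch only — this is not the query ri_Check_Pk_Match actually builds, just the idea it implements):

    -- old referenced range vs. multirange aggregated from the new referenced range(s)
    SELECT daterange('2018-01-01', '2018-03-01') <@ range_agg(r)
    FROM (VALUES (daterange('2018-01-05', '2018-03-01'))) v(r);
    -- returns false, yet the referencing range [2018-01-15,2018-02-01)
    -- is still fully covered by the new range, so no violation actually exists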
Re: define pg_structiszero(addr, s, r)
Em ter., 5 de nov. de 2024 às 00:23, David Rowley escreveu: > On Tue, 5 Nov 2024 at 06:39, Ranier Vilela wrote: > > I think we can add a small optimization to this last patch [1]. > > The variable *aligned_end* is only needed in the second loop (for). > > So, only before the for loop do we actually declare it. > > > > Result before this change: > > check zeros using BERTRAND 1 0.31s > > > > Result after this change: > > check zeros using BERTRAND 1 0.18s > > > > + const unsigned char *aligned_end; > > > > + /* Multiple bytes comparison(s) at once */ > > + aligned_end = (const unsigned char *) ((uintptr_t) end & > (~(sizeof(size_t) - 1))); > > + for (; p < aligned_end; p += sizeof(size_t)) > > I think we all need to stop using Godbolt's servers to run benchmarks > on. These servers are likely to be running various other workloads in > highly virtualised environments and are not going to be stable servers > that would give consistent benchmark results. > > I tried your optimisation in the attached allzeros.c and here are my > results: > > # My version > $ gcc allzeros.c -O2 -o allzeros && for i in {1..3}; do ./allzeros; done > char: done in 1566400 nanoseconds > size_t: done in 195400 nanoseconds (8.01638 times faster than char) > char: done in 1537500 nanoseconds > size_t: done in 196300 nanoseconds (7.8324 times faster than char) > char: done in 1543600 nanoseconds > size_t: done in 196300 nanoseconds (7.86347 times faster than char) > > # Ranier's optimization > $ gcc allzeros.c -O2 -D RANIERS_OPTIMIZATION -o allzeros && for i in > {1..3}; do ./allzeros; done > char: done in 1943100 nanoseconds > size_t: done in 531700 nanoseconds (3.6545 times faster than char) > char: done in 1957200 nanoseconds > size_t: done in 458400 nanoseconds (4.26963 times faster than char) > char: done in 1949500 nanoseconds > size_t: done in 469000 nanoseconds (4.15672 times faster than char) > > Seems to be about half as fast with gcc on -O2 > Thanks for test coding. I've tried with msvc 2022 32bits Here the results: C:\usr\src\tests\allzeros>allzeros char: done in 71431900 nanoseconds size_t: done in 18010900 nanoseconds (3.96604 times faster than char) C:\usr\src\tests\allzeros>allzeros char: done in 71070100 nanoseconds size_t: done in 19654300 nanoseconds (3.61601 times faster than char) C:\usr\src\tests\allzeros>allzeros char: done in 68682400 nanoseconds size_t: done in 19841100 nanoseconds (3.46162 times faster than char) C:\usr\src\tests\allzeros>allzeros char: done in 63215100 nanoseconds size_t: done in 17920200 nanoseconds (3.52759 times faster than char) C:\usr\src\tests\allzeros>c /DRANIERS_OPTIMIZATION Microsoft (R) Program Maintenance Utility Versão 14.40.33813.0 Direitos autorais da Microsoft Corporation. Todos os direitos reservados. C:\usr\src\tests\allzeros>allzeros char: done in 67213800 nanoseconds size_t: done in 15049200 nanoseconds (4.46627 times faster than char) C:\usr\src\tests\allzeros>allzeros char: done in 51505900 nanoseconds size_t: done in 13645700 nanoseconds (3.77452 times faster than char) C:\usr\src\tests\allzeros>allzeros char: done in 62852600 nanoseconds size_t: done in 17863800 nanoseconds (3.51843 times faster than char) C:\usr\src\tests\allzeros>allzeros char: done in 51877200 nanoseconds size_t: done in 13759900 nanoseconds (3.77017 times faster than char) The function used to replace clock_getime is: timespec_get(ts, TIME_UTC) best regards, Ranier Vilela
Re: define pg_structiszero(addr, s, r)
Em ter., 5 de nov. de 2024 às 01:12, Michael Paquier escreveu: > On Tue, Nov 05, 2024 at 04:23:34PM +1300, David Rowley wrote: > > I tried your optimisation in the attached allzeros.c and here are my > results: > > > > # My version > > $ gcc allzeros.c -O2 -o allzeros && for i in {1..3}; do ./allzeros; done > > char: done in 1543600 nanoseconds > > size_t: done in 196300 nanoseconds (7.86347 times faster than char) > > > > # Ranier's optimization > > $ gcc allzeros.c -O2 -D RANIERS_OPTIMIZATION -o allzeros && for i in > > size_t: done in 531700 nanoseconds (3.6545 times faster than char) > > char: done in 1957200 nanoseconds > > I am not seeing numbers as good as yours, but the winner is clear as > well here: > Thanks for testing. > > $ gcc allzeros.c -O2 -o allzeros && for i in {1..3}; do > ./allzeros; done > char: done in 6578995 nanoseconds > size_t: done in 829916 nanoseconds (7.9273 times faster than char) > char: done in 6581465 nanoseconds > size_t: done in 829948 nanoseconds (7.92997 times faster than char) > char: done in 6585748 nanoseconds > size_t: done in 834929 nanoseconds (7.88779 times faster than char) > > $ gcc allzeros.c -O2 -D RANIERS_OPTIMIZATION -o allzeros && for i in > {1..3}; do ./allzeros; > done char: done in 6591803 nanoseconds > size_t: done in 1236102 nanoseconds (5.33273 times faster than char) > char: done in 6606219 nanoseconds > size_t: done in 1235979 nanoseconds (5.34493 times faster than char) > char: done in 6594413 nanoseconds > size_t: done in 1238770 nanoseconds (5.32336 times faster than char) > > I'm surprised to see that assigning aligned_end at these two different > locations has this much effect once the compiler optimizes the > surroundings, but well. > I think that's a plus point for the benefit of not touching the memory if it's not explicitly necessary. best regards, Ranier Vilela
Re: Proposal to Enable/Disable Index using ALTER INDEX (with patch)
Interesting idea. This patch needs a rebase. On Thu, 17 Oct 2024 at 15:59, Shayon Mukherjee wrote: > > > On Oct 16, 2024, at 6:01 PM, Shayon Mukherjee wrote: > > I'll take some time to think this through and familiarize myself with the > new systable_inplace_update_begin. In the meantime, aside from the in-place > updates on pg_index, I would love to learn any first impressions or > feedback on the patch folks may have. > > > My take away from whether or not an in-place update is needed on pg_index > [1] > > - It’s unclear to me why it’s needed. > - I understand the xmin would get incremented when > using CatalogTupleUpdate to update indisenabled. > - However, I haven’t been able to replicate any odd behavior locally or > CI. > - FWIW - REINDEX CONCURRENTLY (via index_swap), index_constraint_create > and few other places perform CatalogTupleUpdate to update the relevant > attributes as well. > > Based on the above summary and after my testing I would like to propose a > v3 of the patch. The only difference between this and v1 [2] is that the > update of pg_index row happens via CatalogTupleUpdate. > > [1] > https://www.postgresql.org/message-id/20180618215635.m5vrnxdxhxytv...@alap3.anarazel.de > [2] > https://www.postgresql.org/message-id/EF2313B8-A017-4869-9B7F-A24EDD8795DE%40gmail.com > > Thank you for bearing with me on this :D > Shayon > -- Regards, Rafia Sabih CYBERTEC PostgreSQL International GmbH
doc fail about ALTER TABLE ATTACH re. NO INHERIT
Hello,

While doing final review for not-null constraints, I noticed that the ALTER TABLE ATTACH PARTITION documentation has this phrase:

    If any of the CHECK constraints of the table being attached are marked
    NO INHERIT, the command will fail; such constraints must be recreated
    without the NO INHERIT clause.

However, this is not true and apparently has never been true. I tried this in both master and pg10:

  create table parted (a int) partition by list (a);
  create table part1 (a int, check (a > 0) no inherit);
  alter table parted attach partition part1 for values in (1);

In both versions (and I imagine all intermediate ones) that sequence works fine and results in this table:

                Table "public.part1"
 Column │  Type   │ Collation │ Nullable │ Default │ Storage │ Stats target │ Description
────────┼─────────┼───────────┼──────────┼─────────┼─────────┼──────────────┼─────────────
 a      │ integer │           │          │         │ plain   │              │
Partition of: parted FOR VALUES IN (1)
Partition constraint: ((a IS NOT NULL) AND (a = 1))
Check constraints:
    "part1_a_check" CHECK (a > 0) NO INHERIT

On the other hand, if we were to throw an error in the ALTER TABLE as the docs say, it would serve no purpose: the partition cannot have any more descendants, so the fact that the CHECK constraint is NO INHERIT makes no difference. So I think the code is fine and we should just fix the docs.

Note that if you interpret it the other way around, i.e., that the "table being attached" is the parent table, then the first CREATE already fails:

  create table parted2 (a int check (a > 0) no inherit) partition by list (a);
  ERROR:  cannot add NO INHERIT constraint to partitioned table "parted2"

This says that we don't need to worry about this condition in the parent table either.

All in all, I think this text serves no purpose and should be removed (from all live branches), as in the attached patch. This text came in with the original partitioning commit f0e44751d717. CCing Robert and Amit.

-- 
Álvaro Herrera         Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
"Renounce nothing. Cling to nothing."
Re: general purpose array_sort
Hi jian, On Tue, Nov 5, 2024 at 3:13 PM jian he wrote: > > On Mon, Nov 4, 2024 at 7:34 PM Dean Rasheed wrote: > > > > Testing this with an array with non-default lower bounds, it fails to > > preserve the array bounds, which I think it should (note: > > array_reverse() and array_shuffle() do preserve the bounds): > > > > SELECT array_reverse(a), array_shuffle(a), array_sort(a) > > FROM (VALUES ('[10:12][20:21]={{1,2},{10,20},{3,4}}'::int[])) v(a); > > > > -[ RECORD 1 ]-+- > > array_reverse | [10:12][20:21]={{3,4},{10,20},{1,2}} > > array_shuffle | [10:12][20:21]={{10,20},{3,4},{1,2}} > > array_sort| [1:3][20:21]={{1,2},{3,4},{10,20}} > > > > if i understand it correctly, > array_create_iterator cannot cope with top dimension bound information. > since input array arguments already have dims, lbs information. > so at the end of array_sort directly copy > from the input array argument to astate. > > tuplesort_performsort won't need array bounds, we should be safe? > > > > v12-0001 same as v11-0001-general-purpose-array_sort.patch, only > resolve git conflict > v12-0002 preserve array bound information. > v12-0003 cache ArrayMetaState. > > after v12-0003 now > typedef struct ArraySortCachedInfo > { > TypeCacheEntry *typentry; > TypeCacheEntry *array_typentry; > ArrayMetaState array_meta; > } ArraySortCachedInfo; > > function array_create_iterator, get_typlenbyvalalign > will do cache search, we can cache ArrayMetaState. > so multiple array_create_iterator calls won't need to call > get_typlenbyvalalign. > every time. > > > 0002, I also have a 3 dimensional array test. > create table t(a int[]); > insert into t values ('[-1:-0]={7,1}'::int[]), > ('[-2:-0][20:21]={{1,2},{10,20},{1,-4}}'), > ('[-2:-0][20:22]={{-11,2,-1},{-11,2, 1},{-11,-4, 10}}'), > ('[-13:-10][0:1][20:22]={ > {{1,2,112},{1,2,-123}}, > {{10,-20,1},{11,123,3}}, > {{10,-20,1},{11,-123,-9}}, > {{1,2,-11},{1,2,211}}}'::int[]); > SELECT array_sort(t.a) from t; > SELECT array_sort((t.a) [-13:-10][0:1][21:22]) from t where array_ndims(a) = > 3; > SELECT array_sort((t.a) [-13:-11][0:1][21:22]) from t where array_ndims(a) = > 3; > SELECT array_sort((t.a) [-13:-11][0:0][20:21]) from t where array_ndims(a) = > 3; > > The test output is ok to me. Thanks for the bounds preserve solution, I just looked at 0002, + if (astate->arraystate != NULL) + { + memcpy(astate->arraystate->dims, dims, ndim * sizeof(int)); + memcpy(astate->arraystate->lbs, lbs, ndim * sizeof(int)); + Assert(ndim == astate->arraystate->ndims); + } It seems to me we only need to set astate->arraystate->lbs[0] = lbs[0] ? -- Regards Junwang Zhao
Re: Commit Timestamp and LSN Inversion issue
Hi Hackers, On 9/5/24 01:39, Amit Kapila wrote: On Wed, Sep 4, 2024 at 6:35 PM Aleksander Alekseev wrote: > > I don't think you can rely on a system clock for conflict resolution. > > In a corner case a DBA can move the clock forward or backward between > > recordings of Ts1 and Ts2. On top of that there is no guarantee that > > 2+ servers have synchronised clocks. It seems to me that what you are > > proposing will just hide the problem instead of solving it in the > > general case. > > > > It is possible that we can't rely on the system clock for conflict > resolution but that is not the specific point of this thread. As > mentioned in the subject of this thread, the problem is "Commit > Timestamp and LSN Inversion issue". The LSN value and timestamp for a > commit are not generated atomically, so two different transactions can > have them in different order. Hm Then I'm having difficulties understanding why this is a problem This is a potential problem pointed out during discussion of CDR [1] (Please read the point starting from "How is this going to deal .." and response by Shveta). The point of this thread is that though it appears to be a problem but practically there is no scenario where it can impact even when we implement "last_write_wins" startegy as explained in the initial email. If you or someone sees a problem due to LSN<->timestamp inversion then we need to explore the solution for it. and why it was necessary to mention CDR in this context in the first place. OK, let's forget about CDR completely. Who is affected by the current behavior and why would it be beneficial changing it? We can't forget CDR completely as this could only be a potential problem in that context. Right now, we don't have any built-in resolution strategies, so this can't impact but if this is a problem then we need to have a solution for it before considering a solution like "last_write_wins" strategy. I agree that we can't forget about CDR. This is precisely the problem we ran into here at pgEdge and why we came up with a solution (attached). Now, instead of discussing LSN<->timestamp inversion issue, you started to discuss "last_write_wins" strategy itself which we have discussed to some extent in the thread [2]. BTW, we are planning to start a separate thread as well just to discuss the clock skew problem w.r.t resolution strategies like "last_write_wins" strategy. So, we can discuss clock skew in that thread and keep the focus of this thread LSN<->timestamp inversion problem. Fact is that "last_write_wins" together with some implementation of Conflict free Replicated Data Types (CRDT) is good enough for many real world situations. Anything resembling a TPC-B or TPC-C is quite happy with it. The attached solution is minimally invasive because it doesn't move the timestamp generation (clock_gettime() call) into the critical section of ReserveXLogInsertLocation() that is protected by a spinlock. Instead it keeps track of the last commit-ts written to WAL in shared memory and simply bumps that by one microsecond if the next one is below or equal. There is one extra condition in that code section plus a function call by pointer for every WAL record. In the unlikely case of encountering a stalled or backwards running commit-ts, the expensive part of recalculating the CRC of the altered commit WAL-record is done later, after releasing the spinlock. I have not been able to measure any performance impact on a machine with 2x Xeon-Silver (32 HT cores). 
This will work fine until we have systems that can sustain a commit rate of one million transactions per second or higher for more than a few milliseconds. Regards, Jan diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c index 004f7e10e5..c0d2466f41 100644 --- a/src/backend/access/transam/xact.c +++ b/src/backend/access/transam/xact.c @@ -371,6 +371,12 @@ static void ShowTransactionStateRec(const char *str, TransactionState s); static const char *BlockStateAsString(TBlockState blockState); static const char *TransStateAsString(TransState state); +/* + * Hook function to be called while holding the WAL insert spinlock. + * to adjust commit timestamps via Lamport clock if needed. + */ +static void EnsureMonotonicTransactionStopTimestamp(void *data); + /* * transaction state accessors @@ -2215,6 +2221,13 @@ StartTransaction(void) if (TransactionTimeout > 0) enable_timeout_after(TRANSACTION_TIMEOUT, TransactionTimeout); + /* + * Reset XLogReserveInsertHook (function called while holding + * the WAL insert spinlock) + */ + XLogReserveInsertHook = NULL; + XLogReserveInsertHookData = NULL; + ShowTransactionState("StartTransaction"); } @@ -5837,6 +5850,7 @@ XactLogCommitRecord(TimestampTz commit_time, xl_xact_twophase xl_twophase; xl_xact_origin xl_origin; uint8 info; + XLogRecPtr result; Assert(CritSection
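For readers skimming the patch, the core of the idea can be sketched roughly like this (illustrative only — the real patch keeps the last commit-ts in shared memory, adjusts it via a hook called while holding the WAL insert spinlock, and recomputes the commit record's CRC after the spinlock is released):

#include <stdint.h>

/* last commit timestamp written to WAL, in microseconds (shared memory in the patch) */
static int64_t last_commit_ts;

static int64_t
next_monotonic_commit_ts(int64_t ts)
{
    if (ts <= last_commit_ts)
        ts = last_commit_ts + 1;    /* Lamport-style bump by one microsecond */
    last_commit_ts = ts;
    return ts;
}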
Re: Remove an obsolete comment in gistinsert()
Hi Tender,

> While learning the GIST codes, I find an obsolete comment in gistinsert ().
>
> itup = gistFormTuple(giststate, r,
>                      values, isnull, true /* size is currently bogus */ );

Thanks for reporting. I agree that this is an oversight of commit 1f7ef54. The commit changed the signature of gistFormTuple():

```
 IndexTuple
 gistFormTuple(GISTSTATE *giststate, Relation r,
-              Datum attdata[], int datumsize[], bool isnull[])
+              Datum attdata[], bool isnull[], bool newValues)
```

... but left the comment for the `datumsize` argument:

```
     itup = gistFormTuple(&buildstate->giststate, index,
-                         values, NULL /* size is currently bogus */, isnull);
+                         values, isnull, true /* size is currently bogus */);
```

I checked the rest of gistFormTuple() calls and also looked for other comments like this. There seems to be only one call like this to fix.

-- 
Best regards,
Aleksander Alekseev
Re: Pgoutput not capturing the generated columns
On Tue, 5 Nov 2024 at 12:32, Peter Smith wrote: > > Hi Vignesh, > > Here are my review comments for patch v48-0001. > > == > src/backend/catalog/pg_publication.c > > has_column_list_defined: > > 1. > + if (HeapTupleIsValid(cftuple)) > + { > + bool isnull = true; > + > + /* Lookup the column list attribute. */ > + (void) SysCacheGetAttr(PUBLICATIONRELMAP, cftuple, > +Anum_pg_publication_rel_prattrs, > +&isnull); > > AFAIK it is not necessary to assign a default value to 'isnull' here. > e.g. most of the other 100s of calls to SysCacheGetAttr elsewhere in > PostgreSQL source don't bother to do this. This is fixed in the v49 version patch attached at [1]. [1] - https://www.postgresql.org/message-id/CALDaNm3XV5mAeZzZMkOPSPieANMaxOH8xAydLqf8X5PQn%2Ba5EA%40mail.gmail.com Regards, Vignesh
Re: relfilenode statistics
On Tue, Nov 5, 2024 at 1:06 AM Bertrand Drouvot wrote: > Does it sound ok to you to move with the above principal? (I'm +1 on it). Yes, provided we can get a clean implementation of it. -- Robert Haas EDB: http://www.enterprisedb.com
Re: define pg_structiszero(addr, s, r)
Hi,

On Tue, Nov 05, 2024 at 05:08:41PM +1300, David Rowley wrote:
> On Tue, 5 Nov 2024 at 06:39, Ranier Vilela wrote:
> > I think we can add a small optimization to this last patch [1].
>
> I think if you want to make it faster, you could partially unroll the
> inner-most loop, like:
>
> // size_t * 4
> for (; p < aligned_end - (sizeof(size_t) * 3); p += sizeof(size_t) * 4)
> {
>     if (((size_t *) p)[0] != 0 | ((size_t *) p)[1] != 0 |
>         ((size_t *) p)[2] != 0 | ((size_t *) p)[3] != 0)
>         return false;
> }

Another option could be to use SIMD instructions to check whether multiple bytes are zero in a single operation. Maybe just an idea to keep in mind and experiment with if we feel the need later on.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
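For readers following the thread, here is a minimal, self-contained sketch of the word-at-a-time check being benchmarked (illustrative only — the function and variable names are not from the actual pg_structiszero() patch, and the proposal under discussion differs in details such as unrolling):

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch: return true if all 'len' bytes starting at 'ptr' are zero. */
static inline bool
region_is_all_zeros(const void *ptr, size_t len)
{
    const unsigned char *p = (const unsigned char *) ptr;
    const unsigned char *end = p + len;
    const unsigned char *aligned_end;

    /* byte-wise until p is size_t-aligned (or we run out of bytes) */
    while (((uintptr_t) p & (sizeof(size_t) - 1)) != 0 && p < end)
    {
        if (*p++ != 0)
            return false;
    }

    /* word-wise over the aligned middle of the region */
    aligned_end = p + (((size_t) (end - p)) & ~(sizeof(size_t) - 1));
    for (; p < aligned_end; p += sizeof(size_t))
    {
        if (*(const size_t *) p != 0)
            return false;
    }

    /* byte-wise tail */
    for (; p < end; p++)
    {
        if (*p != 0)
            return false;
    }

    return true;
}

The unrolled and SIMD variants discussed above build on the same structure, only widening the amount checked per iteration of the middle loop.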
Re: doc: pgevent.dll location
On Tue, Nov 5, 2024 at 1:00 AM Gurjeet Singh wrote: > I'm unable to confirm the veracity of the claim that the pgevent.dll > file on Windows is now placed in a different directory, and that this > is a result of meson builds. But the patch looks good to me; it > applies cleanly to REL_17_STABLE and main branches. I think we will need to find someone who can confirm whether or not the claim is correct. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Proposal to Enable/Disable Index using ALTER INDEX (with patch)
On Thu, Oct 17, 2024 at 9:59 AM Shayon Mukherjee wrote: > My take away from whether or not an in-place update is needed on pg_index [1] > > - It’s unclear to me why it’s needed. > - I understand the xmin would get incremented when using CatalogTupleUpdate > to update indisenabled. > - However, I haven’t been able to replicate any odd behavior locally or CI. > - FWIW - REINDEX CONCURRENTLY (via index_swap), index_constraint_create and > few other places perform CatalogTupleUpdate to update the relevant attributes > as well. > > Based on the above summary and after my testing I would like to propose a v3 > of the patch. The only difference between this and v1 [2] is that the update > of pg_index row happens via CatalogTupleUpdate. > > [1] > https://www.postgresql.org/message-id/20180618215635.m5vrnxdxhxytv...@alap3.anarazel.de > [2] > https://www.postgresql.org/message-id/EF2313B8-A017-4869-9B7F-A24EDD8795DE%40gmail.com In-place updates are generally bad news, so I think this patch shouldn't use them. However, someone will need to investigate whether that breaks the indcheckxmin thing that Andres mentions in your [1], and if it does, figure out what to do about it. Creating a test case to show the breakage would probably be a good first step, to frame the discussion. -- Robert Haas EDB: http://www.enterprisedb.com
Re: MultiXact\SLRU buffers configuration
On Tue, Oct 29, 2024 at 1:45 PM Thom Brown wrote: > Taking a look at what's happening under the hood, it seems to be > getting stuck here: > > if (nextMXOffset == 0) > { > /* Corner case 2: next multixact is still > being filled in */ > LWLockRelease(MultiXactOffsetSLRULock); > CHECK_FOR_INTERRUPTS(); > pg_usleep(1000L); > goto retry; > } > > It clearly checks for interrupts, but when I saw this issue happen, it > wasn't interruptible. I don't understand the underlying issue here; however, if a process holds an lwlock, it's not interruptible, even if it checks for interrupts. So it could be that at this point in the code we're holding some other LWLock, such as a buffer content lock, and that's why this code fails to achieve its objective. -- Robert Haas EDB: http://www.enterprisedb.com
Re: COPY performance on Windows
Hello Takahashi-san, I am reluctant to draw conclusions about the general performance of this patch from one example. It seems that the performance could depend on many things: table size, column definitions, row width, hardware, OS version, shared_buffers, max_wal_size. I don't think we can say from your test here that performance is always worse on Windows. If it is, then I agree that we should think of what to do about it; but it seems possible to me that the behavior will be different in other circumstances. What I don't understand is why increasing the number of blocks should be worse. The comment before the FileZero() call has a comment explaining why a larger extension is thought to be better. If it's wrong, we should try to figure out why it's wrong. But it seems quite surprising that doing more work at once would be less efficient. That's not usually how things work. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Making error message more user-friendly with spaces in a URI
On 2024/11/01 22:38, Greg Sabino Mullane wrote:
>> $ psql "postgres://localhost:5432/postgres?application_name=a b"
>> psql: error: trailing data found: "a b"
>
> This works fine for me, and sets a space in the application_name string as
> expected. Do you have a different example?

You'll encounter the error message if you run the example command against HEAD.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
Re: Making error message more user-friendly with spaces in a URI
On 2024/10/31 10:22, Yushi Ogiwara wrote:
> Hi,
>
> I made a patch to make the error message more user-friendly when using a URI
> to connect to a database with psql.
>
> Current Error Message:
>
> $ psql "postgres://localhost:5432/postgres?application_name=a b"
> psql: error: trailing data found: "a b"
>
> Currently, if spaces exist in the URI, psql displays this generic error message.
>
> New Error Message (with patch):
>
> psql: error: found unexpected spaces: “a b“. Did you forget to percent-encode spaces?
>
> This revised message is clearer and more concise, guiding users to check for
> encoding issues.

I agree the error message could be improved. The phrasing "Did you forget" feels a bit indirect to me. How about using something clearer and more direct instead?

---
psql: error: unexpected spaces found "a b", use percent-encoded spaces instead
---

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
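For reference, the original example works today if the space is simply percent-encoded, which is what both proposed messages point users toward:

  $ psql "postgres://localhost:5432/postgres?application_name=a%20b"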
Re: Virtual generated columns
On 16.09.24 11:22, jian he wrote:
> in v7.
> doc/src/sgml/ref/alter_table.sgml
> and column_constraint is: section need representation of:
> GENERATED ALWAYS AS ( generation_expr ) [VIRTUAL]

I have addressed this in patch v8.

> in RelationBuildTupleDesc(Relation relation)
> we need to add "constr->has_generated_virtual" for the following code?
> if (constr->has_not_null ||
>     constr->has_generated_stored ||
>     ndef > 0 ||
>     attrmiss ||
>     relation->rd_rel->relchecks > 0)

fixed in v8

> also seems there will be table_rewrite for adding virtual generated
> columns, but we can avoid that.
> The attached patch is the change and the tests.
> i've put the tests in src/test/regress/sql/fast_default.sql, since it
> already has event triggers and trigger functions, we don't want to
> duplicate it.

Also added in v8. Thanks!
Re: New function normal_rand_array function to contrib/tablefunc.
On Tue, 5 Nov 2024 at 15:23, Aleksander Alekseev wrote:
>
> > > =# SELECT array_random(1, 10, random(0, 3)) FROM generate_series( ... )
> > > {5}
> > > {1, 3, 8}
> > > {7, 6}
> > > ...
> >
> > Yeah, that looks like a neater API.
> >
> > Something that bothers me somewhat is that it's completely trivial for
> > the user to write such a function for themselves, so is it really
> > useful enough to include in core?
>
> I think it would be useful. Many users don't bother writing C
> extensions for tasks like this. So at least our implementation is
> going to be faster.

If we are going to add such a function to core, then I think we should make it consistent and at least as flexible as the other array functions, and support multi-dimensional arrays with optional non-default lower-bounds, like array_fill(). I.e., something like:

    random_array(min int, max int, dims int[] [, lbounds int[]]) -> int[]

    Returns an array filled with random values in the range min <= x <= max,
    having dimensions of the lengths specified by dims. The optional lbounds
    argument supplies lower-bound values for each dimension (which default
    to all 1).

Regards,
Dean
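If it were added with that proposed signature, usage might look something like this (illustrative only — the function does not exist yet, and the outputs are of course random):

  SELECT random_array(1, 10, ARRAY[3]);            -- e.g. {7,1,9}
  SELECT random_array(1, 10, ARRAY[2,3]);          -- e.g. {{4,8,1},{10,2,6}}
  SELECT random_array(1, 10, ARRAY[3], ARRAY[0]);  -- e.g. [0:2]={5,3,9}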
Re: Virtual generated columns
On 18.09.24 04:38, jian he wrote:
> On Mon, Sep 16, 2024 at 5:22 PM jian he wrote:
> > in v7.
>
> seems I am confused with the version number.
> here, I attached another minor change in tests.
>
> make
> ERROR: invalid ON DELETE action for foreign key constraint containing generated column
> becomes
> ERROR: foreign key constraints on virtual generated columns are not supported

I think the existing behavior is fine. The first message is about something that is invalid anyway. The second message is just that something is not supported yet. If we end up implementing it, then users will get the first message.

> change contrib/pageinspect/sql/page.sql
> expand information on t_infomask, t_bits information.

added to v8 patch

> change RelationBuildLocalRelation
> make the transient TupleDesc->TupleConstr three bool flags more accurate.

I don't think we need that. At the time this is used, the generation expressions are not added to the table yet. Note that stored generated columns are not dealt with here either. If there is a bug, then we can fix it, but if not, then I'd rather keep the code simpler.
Re: MultiXact\SLRU buffers configuration
> On 1 Nov 2024, at 00:25, Thom Brown wrote:
>
> Unfortunately I didn't gather much information when it was occurring, and
> prioritised getting rid of the process blocking replay. I just attached gdb
> to it, got a backtrace and then "print ProcessInterrupts()".

Currently I do not see how this wait can result in a deadlock. But I did observe a standby in a pathological sequential scan encountering a recent multixact again and again (a new one each time). I hope this situation will be alleviated by the recent changes - now the wait is no longer a full millisecond, but hopefully a smaller amount of time.

In v17 we also added an injection point test which reproduces reading a very recent multixact. If there is a deadlock, I hope the buildfarm will show it to us.

Best regards, Andrey Borodin.
Re: Always have pg_dump write rules in a consistent order
On 11/4/24 7:32 PM, Tom Lane wrote:
> Seems reasonable. Pushed with some trivial cosmetic adjustments to make it
> look more like the identical adjacent cases for policies and triggers.

Thanks!

Andreas
Re: Disallow UPDATE/DELETE on table with unpublished generated column as REPLICA IDENTITY
Hi Shlok,

> Here the generated column 'b' is set as REPLICA IDENTITY for table
> 'testpub_gencol'. When we create publication 'pub_gencol' we do not
> specify any column list, so column 'b' will not be published.
> So, the update message generated by the last UPDATE would have NULL
> for column 'b'.
>
> To avoid the issue, we can disallow UPDATE/DELETE on table with
> unpublished generated column as REPLICA IDENTITY. I have attached a
> patch for the same.

I don't think this would be a correct fix. Let's say I *don't* have any publications:

```
=# CREATE TABLE testpub_gencol (a INT, b INT GENERATED ALWAYS AS (a + 1) STORED NOT NULL);
CREATE TABLE
=# CREATE UNIQUE INDEX testpub_gencol_idx ON testpub_gencol (b);
CREATE INDEX
=# INSERT INTO testpub_gencol (a) VALUES (1);
INSERT 0 1
=# UPDATE testpub_gencol SET a = 100 WHERE a = 1;
UPDATE 1
eax=# SELECT * FROM testpub_gencol ;
  a  |  b
-----+-----
 100 | 101
(1 row)
```

So far everything works fine. You are saying that when one creates a publication, UPDATEs should stop working. That would be rather surprising behavior for a typical user, not to mention that it would break the current behavior. I believe one would expect that both UPDATEs and the publication should continue to work.

Perhaps we should forbid the creation of a publication like this instead. Or alternatively, include a generated column in the publication list if it's used as a replica identity. Or maybe even keep everything as is. Thoughts?

-- 
Best regards,
Aleksander Alekseev
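For completeness, the scenario under discussion adds something like the following on top of the example above (a sketch reconstructed from the report, not the exact commands from the original thread):

  ALTER TABLE testpub_gencol REPLICA IDENTITY USING INDEX testpub_gencol_idx;
  CREATE PUBLICATION pub_gencol FOR TABLE testpub_gencol;
  -- no column list, so the replica identity column b is not published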
Re: COPY performance on Windows
Hi Ryohei, Thanks for the patch. Here are my two cents. > I noticed that the COPY performance on PG 17.0 Windows is worse than PG 16.4. > > [...] > > By applying the attached patch to PG 17.0, the copy result is 401.5s. So we are trading a potential 3.8% speedup in certain environments for the increased code complexity due to a couple of added #ifdef's here. If we really want to do this, firstly the patch should have detailed comments in front of #ifdefs so that in 10+ years from now someone who didn't read this thread would know what they are for. Secondly, more detailed research should be made on how this patch affects the performance on Windows depending on the software version and particular choice of hardware. Perhaps what you found is not the only and/or the most important bottleneck. Your patch may (or may not) cause performance degradation in other setups. Last but not least one should double check that this will not cause performance degradation on *nix systems. To be honest, personally I wouldn't bother because of 3.8% speedup at best (for 10+% - maybe). This being said perhaps you and other people on the mailing list (reviewers, committers) feel otherwise. -- Best regards, Aleksander Alekseev
Remove an obsolete comment in gistinsert()
Hi,

While learning the GiST code, I found an obsolete comment in gistinsert():

    itup = gistFormTuple(giststate, r,
                         values, isnull, true /* size is currently bogus */ );

The comment "/* size is currently bogus */" is weird because it follows a bool argument. I tracked the commit history. The original gistFormTuple() prototype was as below:

    extern IndexTuple gistFormTuple(GISTSTATE *giststate, Relation r,
                                    Datum *attdata, int *datumsize, bool *isnull);

You can check commit 37c8393 for that. After 1f7ef54, the function prototype changed, as below:

    extern IndexTuple gistFormTuple(GISTSTATE *giststate, Relation r,
                                    Datum *attdata, bool *isnull, bool newValues);

As you can see, "int *datumsize" had been removed, but the comments in gistbuildCallback() and gistinsert() were not removed together. By the way, after commit 5edb24a, the comment in gistbuildCallback() was removed. So I think we can now remove the obsolete comment from gistinsert().

-- 
Thanks,
Tender Wang

0001-Remove-an-obsolete-comment-in-gistinsert.patch
Description: Binary data
Re: Useless field ispartitioned in CreateStmtContext
Hi, Kirill

Sorry for the late reply, thanks for your suggestion.
A simple fix has been added to the attached patch.

-- 
hugo

v1-0001-Remove-useless-field-ispartitioned-in-CreateStmtC.patch
Description: Binary data
Re: Add reject_limit option to file_fdw
On 2024-10-18 01:51, Fujii Masao wrote: Thanks for your review and sorry for my late reply. On 2024/10/17 22:45, torikoshia wrote: Hi, 4ac2a9bec introduced reject_limit option to the COPY command, and I was wondering if it might be beneficial to add the same option to file_fdw. Although there may be fewer practical use cases compared to COPY, it could still be useful in situations where the file being read via file_fdw is subject to modifications and there is a need to tolerate a limited number of errors. Agreed. What do you think? I've attached a patch. Thanks for the patch! Could you add it to the next CommitFest? Added an entry for this patch: https://commitfest.postgresql.org/50/5331/ +ALTER FOREIGN TABLE agg_bad OPTIONS (reject_limit '1'); +SELECT * FROM agg_bad; + a | b +-+ + 100 | 99.097 + 42 | 324.78 +(2 rows) Wouldn't it be better to include a test where a SELECT query fails, even with on_error set to "ignore," because the number of errors exceeds reject_limit? Agreed. + if (cstate->opts.reject_limit > 0 && \ The trailing backslash isn't needed here. Agreed. + + reject_limit This entry should be placed right after the on_error option, following the same order as in the COPY command documentation. Based on the synopsis of the CREATE/ALTER FOREIGN TABLE commands, the value for the foreign table's option must be single-quoted. I’m not entirely sure if this is the correct approach, but in order to accommodate this, the patch modifies the value of reject_limit option to accept not only numeric values but also strings. I don't have a better approach for this, so I'm okay with your solution. Just one note: it would be helpful to explain and comment why defGetCopyRejectLimitOption() accepts and parses both int64 and string values. Added a comment for this. -- Regards, -- Atsushi Torikoshi Seconded from NTT DATA GROUP CORPORATION to SRA OSS K.K.From 02b81f376d9b316a06d68c7bf714c6729100162c Mon Sep 17 00:00:00 2001 From: Atsushi Torikoshi Date: Tue, 5 Nov 2024 21:57:16 +0900 Subject: [PATCH v2] file_fdw: Add reject_limit option to file_fdw Commit 4ac2a9bece introduced REJECT_LIMIT option for COPY. This patch extends support for this option to file_fdw. As well as REJECT_LIMIT option for COPY, this option limits the maximum number of erroneous rows that can be skipped. If more rows encounter data type conversion errors than allowed by reject_limit, the access to the file_fdw foreign table will fail with an error, even when on_error = 'ignore'. Based on the synopsis of the CREATE/ALTER FOREIGN TABLE commands, the value for the foreign table's option must be single-quoted. To accommodate this, the patch modifies defGetCopyRejectLimitOption to accept not only int64 but also strings. 
--- contrib/file_fdw/data/agg.bad2 | 5 + contrib/file_fdw/expected/file_fdw.out | 15 ++- contrib/file_fdw/file_fdw.c| 8 contrib/file_fdw/sql/file_fdw.sql | 7 ++- doc/src/sgml/file-fdw.sgml | 12 src/backend/commands/copy.c| 15 ++- 6 files changed, 59 insertions(+), 3 deletions(-) create mode 100644 contrib/file_fdw/data/agg.bad2 diff --git a/contrib/file_fdw/data/agg.bad2 b/contrib/file_fdw/data/agg.bad2 new file mode 100644 index 00..04279ce55b --- /dev/null +++ b/contrib/file_fdw/data/agg.bad2 @@ -0,0 +1,5 @@ +56;@7.8@ +100;@99.097@ +0;@aaa@ +42;@324.78@ +1;@bbb@ diff --git a/contrib/file_fdw/expected/file_fdw.out b/contrib/file_fdw/expected/file_fdw.out index 593fdc782e..dbfae52baf 100644 --- a/contrib/file_fdw/expected/file_fdw.out +++ b/contrib/file_fdw/expected/file_fdw.out @@ -206,7 +206,7 @@ SELECT * FROM agg_csv c JOIN agg_text t ON (t.a = c.a) ORDER BY c.a; SELECT * FROM agg_bad; -- ERROR ERROR: invalid input syntax for type real: "aaa" CONTEXT: COPY agg_bad, line 3, column b: "aaa" --- on_error and log_verbosity tests +-- on_error, log_verbosity and reject_limit tests ALTER FOREIGN TABLE agg_bad OPTIONS (ADD on_error 'ignore'); SELECT * FROM agg_bad; NOTICE: 1 row was skipped due to data type incompatibility @@ -224,6 +224,19 @@ SELECT * FROM agg_bad; 42 | 324.78 (2 rows) +\set filename :abs_srcdir '/data/agg.bad2' +ALTER FOREIGN TABLE agg_bad OPTIONS (SET filename :'filename', ADD reject_limit '1'); +SELECT * FROM agg_bad; +ERROR: skipped more than REJECT_LIMIT (1) rows due to data type incompatibility +CONTEXT: COPY agg_bad, line 5, column b: "bbb" +ALTER FOREIGN TABLE agg_bad OPTIONS (SET reject_limit '2'); +SELECT * FROM agg_bad; + a | b +-+ + 100 | 99.097 + 42 | 324.78 +(2 rows) + ANALYZE agg_bad; -- misc query tests \t on diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c index 043204c3e7..9e2896f32a 100644 --- a/contrib/file_fdw/file_fdw.c +++ b/contrib/file_fdw/file_fdw.c @@ -77,6 +77,7 @@ static const struct FileFdwO
Re: Time to add a Git .mailmap?
> On 5 Nov 2024, at 10:33, Alvaro Herrera wrote: > Therefore I +1 Daniel's original proposal with thanks, and BTW I'm not > sorry for changing my name to use the hard-won ' accent on it :-) Done. -- Daniel Gustafsson
Re: doc: pgevent.dll location
Robert Haas writes: > On Tue, Nov 5, 2024 at 1:00 AM Gurjeet Singh wrote: >> I'm unable to confirm the veracity of the claim that the pgevent.dll >> file on Windows is now placed in a different directory, and that this >> is a result of meson builds. But the patch looks good to me; it >> applies cleanly to REL_17_STABLE and main branches. > I think we will need to find someone who can confirm whether or not > the claim is correct. More to the point: if that does happen, isn't it a bug to be fixed rather than documented? regards, tom lane
Re: New function normal_rand_array function to contrib/tablefunc.
Hi Dean, Thanks for your input. > > Any reason not to have an interface as simple and straightforward as > > this: > > > > =# SELECT array_random(1, 10, random(0, 3)) FROM generate_series( ... ) > > {5} > > {1, 3, 8} > > {7, 6} > > ... > > > > Yeah, that looks like a neater API. > > Something that bothers me somewhat is that it's completely trivial for > the user to write such a function for themselves, so is it really > useful enough to include in core? I think it would be useful. Many users don't bother writing C extensions for tasks like this. So at least our implementation is going to be faster. > The other question is whether it's an array function or a random > function. I.e., should it be listed in "Table 9.55. Array Functions", > in which case the name array_random() makes sense, or should it be > listed in "Table 9.6. Random Functions", in which case it should > probably be called random_array(). I think the latter makes more > sense, since it's a function that generates random values, more > similar to the random(min, max) functions. Also I think it's more > useful if it shares the same PRNG, controlled by setseed(), and it > makes sense to group all such functions together. Good point. Personally I don't have a strong opinion on whether random_array() or array_random() is preferable, for me either option is OK. Perhaps we should cross-reference the function between two sections of the documentation to make it easier to find. -- Best regards, Aleksander Alekseev
Re: doc: pgevent.dll location
On Tue, 5 Nov 2024 at 15:10, Robert Haas wrote: > On Tue, Nov 5, 2024 at 1:00 AM Gurjeet Singh wrote: > > I'm unable to confirm the veracity of the claim that the pgevent.dll > > file on Windows is now placed in a different directory, and that this > > is a result of meson builds. But the patch looks good to me; it > > applies cleanly to REL_17_STABLE and main branches. > > I think we will need to find someone who can confirm whether or not > the claim is correct. > It is correct. In v16, it was in lib/, in v17, bin/. However, on Windows I think it probably makes (marginally) more sense to include it in bin/ - we already have other libraries in there that are linked at compile time. pgevent is something of a special case, but it's probably more similar to those libraries than the extensions etc. that are in lib/. Related sidenote: I got a complaint today that pg_regress.exe is no longer part of the Windows installation footprint - and indeed, it's not installed at all with Meson builds, which is problematic for those doing CI/CD testing of extensions. -- Dave Page pgAdmin: https://www.pgadmin.org PostgreSQL: https://www.postgresql.org
Re: index_delete_sort: Unnecessary variable "low" is used in heapam.c
On 2024/09/24 21:31, Daniel Gustafsson wrote:
>> On 24 Sep 2024, at 10:32, btnakamurakoukil wrote:
>>
>> I noticed unnecessary variable "low" in index_delete_sort()
>> (/postgres/src/backend/access/heap/heapam.c), patch attached. What do you think?
>
> That variable does indeed seem to not be used, and hasn't been used since it
> was committed in d168b666823. The question is if it's a left-over from
> development which can be removed, or if it should be set and we're missing an
> optimization. Having not read the referenced paper I can't tell so adding
> Peter Geoghegan who wrote this for clarification.

It's been about a month without updates. How about removing the unnecessary variable as suggested? We can always add the "missing" logic later if needed.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
Re: Function scan FDW pushdown
Alexander Pyhalov писал(а) 2021-10-04 10:42: Ashutosh Bapat писал 2021-06-15 16:15: Hi Alexander, Hi. The current version of the patch is based on asymetric partition-wise join. Currently it is applied after v19-0001-Asymmetric-partitionwise-join.patch from on https://www.postgresql.org/message-id/792d60f4-37bc-e6ad-68ca-c2af5cbb2...@postgrespro.ru . So far I don't know how to visualize actual function expression used in function RTE, as in postgresExplainForeignScan() es->rtable comes from queryDesc->plannedstmt->rtable, and rte->functions is already 0. The actual function expression will be part of the Remote SQL of ForeignScan node so no need to visualize it separately. We still need to create tuple description for functions in get_tupdesc_for_join_scan_tuples(), so I had to remove setting newrte->functions to NIL in add_rte_to_flat_rtable(). With rte->functions in place, there's no issues for explain. The patch will have problems when there are multiple foreign tables all on different servers or use different FDWs. In such a case the function scan's RelOptInfo will get the fpinfo based on the first foreign table the function scan is paired with during join planning. But that may not be the best foreign table to join. We should be able to plan all the possible joins. Current infra to add one fpinfo per RelOptInfo won't help there. We need something better. I suppose attached version of the patch is more mature. The patch targets only postgres FDW, how do you see this working with other FDWs? Not now. We introduce necessary APIs for other FDWs, but implementing TryShippableJoinPaths() doesn't seem straightforward. If we come up with the right approach we could use it for 1. pushing down queries with IN () clause 2. joining a small local table with a large foreign table by sending the local table rows down to the foreign server. Hi, This is a long-overdue follow-up to the original patch. Note that this patch contains only the changes required for function scan pushdown, examples and code related to asymmetric join are dropped. In this version, we have fixed some of the issues: 1) Functions returning records are now deparsed correctly 2) I have benchmarked this version alongside the current master using HammerDB and it shows a consistent ~6000 number of orders per minute (NOPM) compared to ~100 NOPM on local setup with 1 virtual user and 10 warehouses The benchmarking was concluded in the following way: 1) schema was created and filled up by buildschema on the first instance 2) foreign schema was imported on the second instance 3) functions created by buildschema (all in namespace 'public') were exported from the first instance and created on the second one 4) HammerDB was connected to the second instance, and the benchmark was run there Note that the HammerDB service functions that were imported to the second instance are not the ones being pushed down, only PostgreSQL built-ins are. The issue with setting newrte->functions to NIL still persists. I've attempted to work around it by adding the rte->functions list to fdw_private field. I stored a copy of rte->functions for each RTE in the order they are listed in simple_rte_array and accessed it where the rte->functions were used in the original patch. While this being pretty straightforward, I found out the hard way that the RTEs are flattened and the original sequential order known at postgresGetForeignPlan() and postgresPlanDirectModify() is altered randomly. 
It prevents me from looking up the right functions list for the RTE by its rti in postgresExplainForeignScan() and its var->varno in get_tupdesc_for_join_scan_tuples(). I am aware that the rte->functions will now be copied even on instances that don't utilize a FDW, but I don't see a way to solve it. Any suggestions are welcome. Best regards, Gleb Kashkin, Postgres ProfessionalFrom a59b04e33147a075fdd4c6cef155440689850b83 Mon Sep 17 00:00:00 2001 From: Alexander Pyhalov Date: Mon, 17 May 2021 19:19:31 +0300 Subject: [PATCH] Push join with function scan to remote server The patch allows pushing joins with function RTEs to PostgreSQL data sources in general and postgres_fdw specifically. Co-authored-by: Gleb Kashkin --- contrib/postgres_fdw/deparse.c| 187 ++-- .../postgres_fdw/expected/postgres_fdw.out| 368 +++ contrib/postgres_fdw/postgres_fdw.c | 423 +++--- contrib/postgres_fdw/postgres_fdw.h | 6 + contrib/postgres_fdw/sql/postgres_fdw.sql | 148 ++ src/backend/optimizer/path/joinpath.c | 11 + src/backend/optimizer/plan/setrefs.c | 1 - src/include/foreign/fdwapi.h | 1 + src/test/regress/expected/create_view.out | 6 +- 9 files changed, 1046 insertions(+), 105 deletions(-) diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c index 4680d51733..17228dec47 100644 --- a/contrib/postgres
meson vs. extension builds
Hi, This issue has been bugging me for a while but I haven't gotten around to reporting it until now. Apologies if this is a duplicate report. When I try to build an extension such as pg_catcheck against a meson build, it does not work. If I do the same thing with a configure-based build, it does work. Here's the failure: ccache clang -isysroot /Library/Developer/CommandLineTools/SDKs/MacOSX14.0.sdk -fno-strict-aliasing -fwrapv -Wall -g -O2 -Wmissing-prototypes -Wpointer-arith -Werror=vla -Werror=unguarded-availability-new -Wendif-labels -Wmissing-format-attribute -Wcast-function-type -Wformat-security -Wdeclaration-after-statement -Wmissing-variable-declarations -Wno-unused-command-line-argument -Wno-compound-token-split-by-macro -I/Users/robert.haas/install/dev/include -I. -I./ -I/Users/robert.haas/install/dev/include/postgresql/server -I/Users/robert.haas/install/dev/include/postgresql/internal-c -o pg_catcheck.o pg_catcheck.c -MMD -MP -MF .deps/pg_catcheck.Po In file included from pg_catcheck.c:22: In file included from /Users/robert.haas/install/dev/include/postgresql/server/postgres_fe.h:25: /Users/robert.haas/install/dev/include/postgresql/server/c.h:78:10: fatal error: 'libintl.h' file not found #include ^~~ 1 error generated. make: *** [pg_catcheck.o] Error 1 The problem is that libintl.h is in /opt/local/include. When I run meson setup, I specify -Dextra_include_dirs=/opt/local/include -Dextra_lib_dirs=/opt/local/lib. And correspondingly, when I use configure, I use --with-libraries=/opt/local/lib --with-includes=/opt/local/include. But it seems that the configure settings automatically propagate through to where extension builds also pick them up, and the meson settings don't. Consequently, I am sad. Taking a diff of the generated lib/postgresql/pgxs/src/Makefile.global files, I see, inter alia, this: 240c240 < CPPFLAGS = -isysroot $(PG_SYSROOT) -I/opt/local/include/libxml2 -I/opt/local/include --- > CPPFLAGS = 315c315 < LDFLAGS = $(LDFLAGS_INTERNAL) -isysroot $(PG_SYSROOT) -L/opt/local/lib -L/opt/local/lib -Wl,-dead_strip_dylibs --- > LDFLAGS = $(LDFLAGS_INTERNAL) -isysroot > /Library/Developer/CommandLineTools/SDKs/MacOSX14.0.sdk > -Wl,-no_warn_duplicate_libraries Which is probably related. -- Robert Haas EDB: http://www.enterprisedb.com
Re: New function normal_rand_array function to contrib/tablefunc.
Hi, > If we are going to add such a function to core, then I think we should > make it consistent and at least as flexible as the other array > functions, and support multi-dimensional arrays with optional > non-default lower-bounds, like array_fill(). I.e., something like: > > random_array(min int, max int, dims int[] [, lbounds int[]]) -> int[] > > Returns an array filled with random values in the range min <= x <= max, > having dimensions of the lengths specified by dims. The optional lbounds > argument supplies lower-bound values for each dimension (which default > to all 1). FWIW we have several array_* functions that deal only with the first dimension of an array: array_shuffle(), array_sample() the recently added array_reverse() [1], the currently discussed array_sort() [2]. I suggest implementing the function in several steps. First implement random_array(min, max, len). It's straightforward and IMO will provide the most value to the majority of the users.Then we can either add an optional argument or a random_array_multidim() function. This can be implemented and discussed as a separate patch. This approach would be convenient for the author and the reviewers and also will allow us to deliver value to the users sooner. [1]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=49d6c7d8daba [2]: https://commitfest.postgresql.org/50/5277/ -- Best regards, Aleksander Alekseev
doc: explain pgstatindex fragmentation
Hi, I thought it would be nice to give the user a better idea of what avg_leaf_density and leaf_fragmentation mean. Patch attached. What do you think? From 5c82c207776fb8cde2357081b8579ba6db195c06 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Fr=C3=A9d=C3=A9ric=20Yhuel?= Date: Tue, 5 Nov 2024 17:59:44 +0100 Subject: [PATCH] doc: explain pgstatindex fragmentation It was quite hard to guess what leaf_fragmentation meant without looking at pgstattuple's code. This patch aims to give the user a better idea of what it means. --- doc/src/sgml/pgstattuple.sgml | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/doc/src/sgml/pgstattuple.sgml b/doc/src/sgml/pgstattuple.sgml index 4071da4ed9..8908b56663 100644 --- a/doc/src/sgml/pgstattuple.sgml +++ b/doc/src/sgml/pgstattuple.sgml @@ -277,6 +277,14 @@ leaf_fragmentation | 0 page-by-page, and should not be expected to represent an instantaneous snapshot of the whole index. + + + avg_leaf_density can be seen as the inverse of bloat, + while leaf_fragmentation represents a measure of disorder. + The higher leaf_fragmentation is, the less the physical + order of the index leaf pages corresponds to the logical order we would have + just after a REINDEX. + -- 2.39.5
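As a rough illustration of how these two columns read in practice (pgstatindex() comes from the pgstattuple extension; the index name and the numbers below are made up):

CREATE EXTENSION IF NOT EXISTS pgstattuple;

SELECT avg_leaf_density, leaf_fragmentation
FROM pgstatindex('test_pkey');

 avg_leaf_density | leaf_fragmentation
------------------+--------------------
            90.03 |                  0

-- Right after a REINDEX one typically sees a high avg_leaf_density (little
-- bloat, close to the fillfactor) and leaf_fragmentation near 0 (leaf pages
-- in logical order); both degrade as the index absorbs random inserts.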
Re: per backend I/O statistics
Hi, On Mon, Nov 04, 2024 at 10:01:50AM +, Bertrand Drouvot wrote: > And why not add more per-backend stats in the future? (once the I/O part is > done). > > I think that's one more reason to go with option 2 (and implementing a brand > new > PGSTAT_KIND_BACKEND kind). I'm starting working on option 2, I think it will be easier to discuss with a patch proposal to look at. If in the meantime, one strongly disagree with option 2 (means implement a brand new PGSTAT_KIND_BACKEND and keep PGSTAT_KIND_IO), please let me know. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
pg_stat_statements: Avoid holding excessive lock
Hi hackers, In pg_stat_statements there is entry's mutex spinlock which is used to be able to modify counters without holding pgss->lock exclusively. Here is a comment from the top of pg_stat_statements.c: * Note about locking issues: to create or delete an entry in the shared * hashtable, one must hold pgss->lock exclusively. Modifying any field * in an entry except the counters requires the same. To look up an entry, * one must hold the lock shared. To read or update the counters within * an entry, one must hold the lock shared or exclusive (so the entry doesn't * disappear!) and also take the entry's mutex spinlock. * The shared state variable pgss->extent (the next free spot in the external * query-text file) should be accessed only while holding either the * pgss->mutex spinlock, or exclusive lock on pgss->lock. We use the mutex to * allow reserving file space while holding only shared lock on pgss->lock. * Rewriting the entire external query-text file, eg for garbage collection, * requires holding pgss->lock exclusively; this allows individual entries * in the file to be read or written while holding only shared lock. And a comment from pgssEntry declaration: slock_t mutex; /* protects the counters only */ Both comments say that spinlocking mutex should be used to protect counters only. But in pg_stat_statements_internal we do read entry->stats_since and entry->minmax_stats_since holding the entry's mutex spinlock. This is unnecessary since we only modify stats_since and minmax_stats_since while holding pgss->lock exclusively, and in pg_stat_statements_internal we are holding pgss->lock when reading them. So even without holding the entry's mutex spinlock there should be no race. I suggest eliminating holding the excessive lock. See the attached patch. This would also restore the consistency between the code and the comments about entry's mutex spinlock usage. Best regards, Karina Litskevich Postgres Professional: http://postgrespro.com/ From c817fc9373ca08488088414a052eaf77dbb55bde Mon Sep 17 00:00:00 2001 From: Karina Litskevich Date: Tue, 5 Nov 2024 20:27:25 +0300 Subject: [PATCH v1] pg_stat_statements: Avoid excessive lock holding Entry's mutex spinlock is used to be able to modify counters without holding pgss->lock exclusively (though holding it at least shared so the entry doesn't disappear). We never actually modify stats_since and minmax_stats_since without holding pgss->lock exclusively, so there is no need to hold entry's mutex spinlock when reading stats_since and minmax_stats_since. This also restores the consistency between the code and the comments about entry's mutex spinlock usage. --- contrib/pg_stat_statements/pg_stat_statements.c | 9 - 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c index 1798e1d016..a66ad99a37 100644 --- a/contrib/pg_stat_statements/pg_stat_statements.c +++ b/contrib/pg_stat_statements/pg_stat_statements.c @@ -1869,9 +1869,16 @@ pg_stat_statements_internal(FunctionCallInfo fcinfo, /* copy counters to a local variable to keep locking time short */ SpinLockAcquire(&entry->mutex); tmp = entry->counters; + SpinLockRelease(&entry->mutex); + + /* + * There is no need to hold entry->mutex when reading stats_since and + * minmax_stats_since for (unlike counters) they are always written + * while holding pgss->lock exclusively. We are holding pgss->lock + * shared so there should be no race here. 
+ */ stats_since = entry->stats_since; minmax_stats_since = entry->minmax_stats_since; - SpinLockRelease(&entry->mutex); /* Skip entry if unexecuted (ie, it's a pending "sticky" entry) */ if (IS_STICKY(tmp)) -- 2.34.1
Re: Segfault in jit tuple deforming on arm64 due to LLVM issue
On Thu, Oct 31, 2024 at 9:49 PM Daniel Gustafsson wrote: > > On 31 Oct 2024, at 06:48, Thomas Munro wrote: > > I guess at a minimum a copy of the licence would need to appear somewhere > > That's my interpretation of it as well. > > > perhaps under src/backend/jit/llvm? > > Since SectionMemoryManager.h is in src/backend/jit I wonder if it should be a > placed in a section in src/backend/jit/README with an overview of the what and > why (or maybe a new src/backend/jit/llvm/README would be even better). The > license doesn't have to be in a separate file AFAICT and including a (version > of) your excellent summary in the commit message along with it would probably > help readers. Thank you. I figured that src/backend/jit/llvm/SectionMemoryManager.LICENSE would be a good place, to make it clear which code it covers and to remember to delete it when the time comes. If we ever have to do more of this sort of thing we might want to rename it to something more general, but I hope not! I also updated the comments with a briefer summary of the points from the commit message, at the top of the .cpp file. I added a pgindent exclusion for the new alien header, or else it gets scrambled. I was worried for a while about the C++14 code in here (eg std::make_unique), something we've avoided using in the past, but after a bit of 3D versionography I'm pretty sure there is no issue and we don't have to contort the code to avoid it. Reasoning: Old LLVM required C++11. LLVM 9 switched to C++14. LLVM 14 switched C++17. Pretty soon they'll flip to C++20 or C++23, they don't mess around. The corresponding -std=c++XX flag finishes up in our compile lines, because llvm-config --cxxflags spits it out, to match the features they're using in headers that we include (easy to spot examples being std::make_unique (C++14) and std::string_view (C++17)), so you might say that PostgreSQL indirectly chases C++ standards much faster than it chases C standards. This particular code is a special case because it's guarded for LLVM 12+ only, so it's OK to use C++14 in that limited context even in back branches. We have to be careful that it doesn't contain C++17 code since it came from recent LLVM, but it doesn't seem to by inspection, and you can check on a machine with CXX=g++ and LLVM 14 on Linux, which uses -std=c++14 and fails if you add a use of and std::string_view. (Warning: the system C++ standard library on Macs and other Clang-based systems doesn't have enough version guards so it won't complain, but GCC and its standard library will explicitly tell you not to use C++17 features in a C++14 program.) If there are no further comments, I'm going to push this to all branches tomorrow morning. For master only, I will remove the #if condition and comment about LLVM 12+, as we now require 14+. From f81eb027042a35a8c93778526104a912a642c919 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Tue, 27 Aug 2024 08:58:48 +1200 Subject: [PATCH v8] Monkey-patch LLVM code to fix ARM relocation bug. Supply a new memory manager for RuntimeDyld, to avoid crashes in generated code caused by memory placement that can overflow a 32 bit data type. This is a drop-in replacement for the llvm::SectionMemoryManager class in the LLVM library, with patches from Michael Smith's proposed fix at https://www.github.com/llvm/llvm-project/pull/71968. We hereby slurp it into our own source tree, after moving into a new namespace llvm::backport and making some minor adjustments so that it can be compiled with older LLVM versions as far back as 12. 
It's harder to make it work on even older LLVM versions, but it doesn't seem likely that people are really using them so that is not investigated for now. The problem could also be addressed by switching to JITLink instead of RuntimeDyld, and that is the LLVM project's recommended solution as the latter is about to be deprecated. We'll have to do that soon enough anyway, and then when the LLVM version support window advances far enough in a few years we'll be able to delete this code. Unfortunately that wouldn't be enough for PostgreSQL today: in most relevant versions of LLVM, JITLink is missing or incomplete. Several other projects have already back-ported this fix into their fork of LLVM, which is a vote of confidence despite the lack of commit into LLVM as of today. We don't have our own copy of LLVM so we can't do exactly what they've done; instead we have a copy of the whole patched class so we can pass an instance of it to RuntimeDyld. The LLVM project hasn't chosen to commit the fix yet, and even if it did, it wouldn't be back-ported into the releases of LLVM that most of our users care about, so there is not much point in waiting any longer for that. If they make further changes and commit it to LLVM 19 or 20, we'll still need this for older versions, but we may want to resynchronize our copy and update some comments. The changes that we've had to make to our copy can be seen by diffing o
Disallow UPDATE/DELETE on table with unpublished generated column as REPLICA IDENTITY
Hi, There was an issue reported recently by Sawada-san on a different thread [1]. I have created this thread to discuss the issue separately. Currently, generated columns can be published only when we explicitly specify it in the Publication column list. An issue was found that UPDATE and DELETE are allowed on the table even if its replica identity is set to generated columns that are not published. For example: CREATE TABLE testpub_gencol (a INT, b INT GENERATED ALWAYS AS (a + 1) STORED NOT NULL); CREATE UNIQUE INDEX testpub_gencol_idx ON testpub_gencol (b); ALTER TABLE testpub_gencol REPLICA IDENTITY USING index testpub_gencol_idx; CREATE PUBLICATION pub_gencol FOR TABLE testpub_gencol; UPDATE testpub_gencol SET a = 100 WHERE a = 1; Here the generated column 'b' is set as REPLICA IDENTITY for table 'testpub_gencol'. When we create publication 'pub_gencol' we do not specify any column list, so column 'b' will not be published. So, the update message generated by the last UPDATE would have NULL for column 'b'. To avoid the issue, we can disallow UPDATE/DELETE on table with unpublished generated column as REPLICA IDENTITY. I have attached a patch for the same. [1]: https://www.postgresql.org/message-id/CAD21AoA_RBkMa-6nUpBSoEP9s%3D46r3oq15vQkunVRCsYKXKMnA%40mail.gmail.com Thanks and regards, Shlok Kyal v1-0001-Disallow-UPDATE-DELETE-on-table-with-generated-co.patch Description: Binary data
Re: Doc: typo in config.sgml
On 02.11.24 14:18, Bruce Momjian wrote: On Sat, Nov 2, 2024 at 12:02:12PM +0900, Tatsuo Ishii wrote: Yes, we _allow_ LATIN1 characters in the SGML docs, but I replaced the LATIN1 characters we had with HTML entities, so there are none currently. I think it is too easy for non-Latin1 UTF8 to creep into our SGML docs so I added a cron job on my server to alert me when non-ASCII characters appear. So you convert LATIN1 characters to HTML entities so that it's easier to detect non-LATIN1 characters in the SGML docs? If my understanding is correct, it can also be achieved by using some tools like: iconv -t ISO-8859-1 -f UTF-8 release-17.sgml If there are some non-LATIN1 characters in release-17.sgml, it will complain like: iconv: illegal input sequence at position 175 An advantage of this is that we don't need to convert each LATIN1 character to an HTML entity, which makes the sgml file authors' lives a little bit easier. I might have misread the feedback. I know people didn't want a Makefile rule to prevent it, but I thought converting the few UTF8's we had was acceptable. Let me think some more and come up with a patch. The question of encoding characters as entities is orthogonal to the issue of only allowing Unicode characters that have a mapping to Latin 1. This patch seems to confuse these two issues, and I don't think it actually fixed the second one, which is the one that was complained about. I don't think anyone actually complained about the first one, which is the one that was actually patched. I think the iconv approach is an idea worth checking out. It's also not necessarily true that the set of characters provided by the built-in PDF fonts is exactly the set of characters in Latin 1. It appears to be close enough, but I'm not sure, and I haven't found any authoritative information on that. Another approach for a fix would be to get FOP to produce the required warnings or errors more reliably. I know it has a bunch of logging settings (ultimately via log4j), so there might be some possibilities.
Re: pg_dump --no-comments confusion
On 2024-Nov-04, Erik Wienhold wrote: > I think Bruce's suggestion is pretty clear that it does not mean line or > block comments, but rather the COMMENT command. But I also think that > "SQL" in front of the command name is unnecessary [...] +1 for "Do not dump COMMENT commands", which is what I think you're saying. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "Pensar que el espectro que vemos es ilusorio no lo despoja de espanto, sólo le suma el nuevo terror de la locura" (Perelandra, C.S. Lewis)
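For readers who hit the ambiguity being discussed, the distinction is between SQL comments and the COMMENT command; only the latter is affected by --no-comments (the table name below is just an example):

-- This is an SQL comment; pg_dump output is not affected by it.
COMMENT ON TABLE accounts IS 'customer ledger';   -- this COMMENT command is
                                                  -- what --no-comments suppresses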
Re: Converting contrib SQL functions to new style
Le lundi 4 novembre 2024, 17:10:01 heure normale d’Europe centrale Tom Lane a écrit : > The cfbot says many of these fail regression tests --- lots of > > CREATE EXTENSION citext SCHEMA s; > +ERROR: extension "citext" has no installation script nor update path for > version "1.8" > > and such. I didn't poke into why, but maybe you missed "git add'ing" > some changes? > Sorry you're right I missed one for xml2. But most importantly I forgot to update the meson build files for every one of them, using only the Makefile... Please find attached a new version which should make this right. Regards, -- Ronan Dunklau >From 96e7794f49384107b92649f4314e7e1f27ad4b75 Mon Sep 17 00:00:00 2001 From: Ronan Dunklau Date: Mon, 28 Oct 2024 16:13:35 +0100 Subject: [PATCH v3 1/6] Use "new style" SQL function in citext extension Author: Tom Lane Discussion: https://www.postgresql.org/message-id/3395418.1618352794%40sss.pgh.pa.us --- contrib/citext/Makefile | 1 + contrib/citext/citext--1.7--1.8.sql | 60 + contrib/citext/citext.control | 2 +- contrib/citext/meson.build | 1 + 4 files changed, 63 insertions(+), 1 deletion(-) create mode 100644 contrib/citext/citext--1.7--1.8.sql diff --git a/contrib/citext/Makefile b/contrib/citext/Makefile index b9b3713f537..fc990607bf2 100644 --- a/contrib/citext/Makefile +++ b/contrib/citext/Makefile @@ -4,6 +4,7 @@ MODULES = citext EXTENSION = citext DATA = citext--1.4.sql \ + citext--1.7--1.8.sql \ citext--1.6--1.7.sql \ citext--1.5--1.6.sql \ citext--1.4--1.5.sql \ diff --git a/contrib/citext/citext--1.7--1.8.sql b/contrib/citext/citext--1.7--1.8.sql new file mode 100644 index 000..7c95a9883aa --- /dev/null +++ b/contrib/citext/citext--1.7--1.8.sql @@ -0,0 +1,60 @@ +/* contrib/citext/citext--1.7--1.8.sql */ + +-- complain if script is sourced in psql, rather than via ALTER EXTENSION +\echo Use "ALTER EXTENSION citext UPDATE TO '1.8'" to load this file. 
\quit + +CREATE OR REPLACE FUNCTION regexp_match(string citext, pattern citext) RETURNS TEXT[] +LANGUAGE SQL IMMUTABLE STRICT PARALLEL SAFE +RETURN pg_catalog.regexp_match( $1::pg_catalog.text, $2::pg_catalog.text, 'i' ); + +CREATE OR REPLACE FUNCTION regexp_match(string citext, pattern citext, flags text) RETURNS TEXT[] +LANGUAGE SQL IMMUTABLE STRICT PARALLEL SAFE +RETURN pg_catalog.regexp_match( $1::pg_catalog.text, $2::pg_catalog.text, CASE WHEN pg_catalog.strpos($3, 'c') = 0 THEN $3 || 'i' ELSE $3 END ); + +CREATE OR REPLACE FUNCTION regexp_matches(string citext, pattern citext) RETURNS SETOF TEXT[] +LANGUAGE SQL IMMUTABLE STRICT PARALLEL SAFE ROWS 1 +RETURN pg_catalog.regexp_matches( $1::pg_catalog.text, $2::pg_catalog.text, 'i' ); + +CREATE OR REPLACE FUNCTION regexp_matches(string citext, pattern citext, flags text) RETURNS SETOF TEXT[] +LANGUAGE SQL IMMUTABLE STRICT PARALLEL SAFE ROWS 10 +RETURN pg_catalog.regexp_matches( $1::pg_catalog.text, $2::pg_catalog.text, CASE WHEN pg_catalog.strpos($3, 'c') = 0 THEN $3 || 'i' ELSE $3 END ); + +CREATE OR REPLACE FUNCTION regexp_replace(string citext, pattern citext, replacement text) returns TEXT +LANGUAGE SQL IMMUTABLE STRICT PARALLEL SAFE +RETURN pg_catalog.regexp_replace( $1::pg_catalog.text, $2::pg_catalog.text, $3, 'i'); + +CREATE OR REPLACE FUNCTION regexp_replace(string citext, pattern citext, replacement text, flags text) returns TEXT +LANGUAGE SQL IMMUTABLE STRICT PARALLEL SAFE +RETURN pg_catalog.regexp_replace( $1::pg_catalog.text, $2::pg_catalog.text, $3, CASE WHEN pg_catalog.strpos($4, 'c') = 0 THEN $4 || 'i' ELSE $4 END); + +CREATE OR REPLACE FUNCTION regexp_split_to_array(string citext, pattern citext) RETURNS TEXT[] +LANGUAGE SQL IMMUTABLE STRICT PARALLEL SAFE +RETURN pg_catalog.regexp_split_to_array( $1::pg_catalog.text, $2::pg_catalog.text, 'i' ); + +CREATE OR REPLACE FUNCTION regexp_split_to_array(string citext, pattern citext, flags text) RETURNS TEXT[] +LANGUAGE SQL IMMUTABLE STRICT PARALLEL SAFE +RETURN pg_catalog.regexp_split_to_array( $1::pg_catalog.text, $2::pg_catalog.text, CASE WHEN pg_catalog.strpos($3, 'c') = 0 THEN $3 || 'i' ELSE $3 END ); + +CREATE OR REPLACE FUNCTION regexp_split_to_table(string citext, pattern citext) RETURNS SETOF TEXT +LANGUAGE SQL IMMUTABLE STRICT PARALLEL SAFE +RETURN pg_catalog.regexp_split_to_table( $1::pg_catalog.text, $2::pg_catalog.text, 'i' ); + +CREATE OR REPLACE FUNCTION regexp_split_to_table(string citext, pattern citext, flags text) RETURNS SETOF TEXT +LANGUAGE SQL IMMUTABLE STRICT PARALLEL SAFE +RETURN pg_catalog.regexp_split_to_table( $1::pg_catalog.text, $2::pg_catalog.text, CASE WHEN pg_catalog.strpos($3, 'c') = 0 THEN $3 || 'i' ELSE $3 END ); + +CREATE OR REPLACE FUNCTION strpos( citext, citext ) RETURNS INT +LANGUAGE SQL IMMUTABLE STRICT PARALLEL SAFE +RETURN pg_catalog.strpos( pg_catalog.lower( $1::pg_catalog.text ), pg_catalog.lower( $2::pg_catalog.text ) ); + +CREATE OR REPLACE FUNCTION replace( citext, citext, citext ) R
Re: minor doc issue in 9.16.2.1.1. Boolean Predicate Check Expressions
On Thu, Oct 31, 2024 at 11:51 PM Bruce Momjian wrote: > > On Fri, Oct 18, 2024 at 10:00:54AM +0800, jian he wrote: > > On Fri, Oct 18, 2024 at 2:05 AM Bruce Momjian wrote: > > > Yes, updated patch attached. > > > > > looks good. > > > > in the meantime, do you think it's necessary to slightly rephrase > > jsonb_path_match doc entry: > > > > currently doc entry: > > jsonb_path_match ( target jsonb, path jsonpath [, vars jsonb [, silent > > boolean ]] ) → boolean > > Returns the result of a JSON path predicate check for the specified JSON > > value. > > > > > > "the result of a JSON path predicate check for the specified JSON > > value." is a jsonb boolean. > > but jsonb_path_match returns sql boolean. > > maybe add something to describe case like: "if JSON path predicate > > check return jsonb null, jsonb_path_match will return SQL null". > > Yes, I think that is a good point, updated patch attached. > played with https://www.postgresql.org/docs/current/functions-json.html#FUNCTIONS-SQLJSON-FILTER-EX-TABLE The patch looks good to me.
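A small illustration of the case being discussed, as I understand the current behavior (worth double-checking on a live server):

SELECT jsonb_path_match('{"a": 1}', '$.a == 1');   -- t (predicate yields a JSON boolean)
SELECT jsonb_path_match('{"a": 1}', '$.b == 1');   -- NULL: in lax mode the missing key
                                                   -- makes the predicate result unknown,
                                                   -- which is returned as SQL NULL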
Re: Time to add a Git .mailmap?
On 2024-Nov-01, Peter Eisentraut wrote: > > LGTM. I'd also add this line while at it: > > > > Peter Eisentraut > > > > This takes care of all the duplicate "identities" in the history AFAICT. > > I'm not sure if this is a good use of the mailmap feature. If someone > commits under for a while and then later as > , and the mailmap maps everything to the most recent > one, that seems kind of misleading or unfair? While I would agree with this line of thinking if the situation were as you describe, it should be obvious that it isn't; nobody here uses or has ever used a work email as committer address[1][2]. Nevertheless, since this argument is about _your_ personal identity not mine, I'm not going to stand against you on it. Therefore I +1 Daniel's original proposal with thanks, and BTW I'm not sorry for changing my name to use the hard-won ' accent on it :-) > The examples on the gitmailmap man page all indicate that this feature > is to correct accidental variations or obvious mistakes, but not to > unify everything to the extent that it alters the historical record. I don't think these examples are normative. There's plenty of evidence that people look for ways to attribute contributions to individuals rather than email-based identities. See for example https://gitlab.com/gitlab-org/gitlab/-/issues/14909 https://github.com/microsoft/vscode/blob/main/.mailmap https://github.com/gradle/gradle/blob/master/.mailmap [1] AFAIK gmx.net is a ISP-supplied address, not a work address. [2] scra...@hub.org and si...@2ndquadrant.com might be considered work addresses, but they aren't really -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "E pur si muove" (Galileo Galilei)
Re: on_error table, saving error info to a table
Thanks for the v2 patch! I see v1 review comments got addressed in v2 along with some further improvements. 1) v2 Patch again needs re-base. 2) I think we need not worry whether table name is unique or not, table name can be provided by user and we can check if it does not exists then simply we can create it with appropriate columns, if it exists we use it to check if its correct on_error table and proceed. 3) Using #define in between the code? I don't see that style in copyfromparse.c file. I do see such style in other src file. So, not sure if committer would allow it or not. #define ERROR_TBL_COLUMNS 10 4) Below appears redundant to me, it was not the case in v1 patch set, where it had only one return and one increment of error as new added code was at the end of the block:- + cstate->num_errors++; + return true; + } cstate->num_errors++; I was not able to test the v2 due to conflicts in v2, I did attempt to resolve but I saw many failures in make world. Regards, Nishant. On Tue, Aug 20, 2024 at 5:31 AM jian he wrote: > On Mon, Jul 15, 2024 at 1:42 PM Nishant Sharma > wrote: > > > > Thanks for the patch! > > > > I was not able to understand why we need a special error table for COPY? > > Can't we just log it in a new log error file for COPY in a proper > format? Like > > using table approach, PG would be unnecessarily be utilising its > resources like > > extra CPU to format the data in pages to store in its table format, > writing and > > keeping the table in its store (which may or may not be required), the > user > > would be required to make sure it creates the error table with proper > columns > > to be used in COPY, etc.. > > > > > > Meanwhile, please find some quick review comments:- > > > > 1) Patch needs re-base. > > > > 2) If the columns of the error table are fixed, then why not create it > internally using > > some function or something instead of making the user create the table > correctly > > with all the columns? > > I'll think about it more. > previously, i tried to use SPI to create tables, but at that time, i > thought that's kind of excessive. > you need to create the table, check whether the table name is unique, > check the privilege. > now we quickly error out if the error saving table definition does not > meet. I guess that's less work to do. > > > > 3) I think, error messages can be improved, which looks big to me. > > > > 4) I think no need of below pstrdup, As CStringGetTextDatum creates a > text copy for > > the same:- > > err_code = > pstrdup(unpack_sql_state(cstate->escontext->error_data->sqlerrcode)); > > > > t_values[9] = CStringGetTextDatum(err_code); > > > > 5) Should 'on_error_rel' as not NULL be checked along with below, as I > can see it is > > passed as NULL from two locations? > > + if (cstate->opts.on_error == COPY_ON_ERROR_TABLE) > > + { > > > > 6) Below declarations can be shifted to the if block, where they are > used. Instead of > > keeping them as global in function? > > + HeapTuple on_error_tup; > > + TupleDesc on_error_tupDesc; > > > > + if (cstate->opts.on_error == COPY_ON_ERROR_TABLE) > > + { > > > > 7) Below comment going beyond 80 char width:- > > * if on_error is specified with 'table', then on_error_rel is the error > saving table > > > > 8) Need space after 'false' > > err_detail ? false: true; > > > > 9) Below call can fit in a single line. No need to keep the 2nd and 3rd > parameter in > > nextlines. 
> > + on_error_tup = heap_form_tuple(on_error_tupDesc, > > + t_values, > > + t_isnull); > > > > 10) Variable declarations Tab Spacing issue at multiple places. > > > > 11) Comments in the patch are not matched to PG comment style. > > > > > > Kindly note I have not tested the patch properly yet. Only checked it > with a positive test > > case. As I will wait for a conclusion on my opinion of the proposed > patch. > > > almost all these issues have been addressed. > The error messages are still not so good. I need to polish it. >
Re: Segfault in jit tuple deforming on arm64 due to LLVM issue
On Tue, Nov 5, 2024 at 9:00 AM Thomas Munro wrote: > Reasoning: Old LLVM required C++11. LLVM 9 switched to C++14. LLVM > 14 switched C++17. Pretty soon they'll flip to C++20 or C++23, they > don't mess around. The corresponding -std=c++XX flag finishes up in > our compile lines, because llvm-config --cxxflags spits it out, to > match the features they're using in headers that we include (easy to > spot examples being std::make_unique (C++14) and std::string_view > (C++17)), so you might say that PostgreSQL indirectly chases C++ > standards much faster than it chases C standards. This particular > code is a special case because it's guarded for LLVM 12+ only, so it's > OK to use C++14 in that limited context even in back branches. We > have to be careful that it doesn't contain C++17 code since it came > from recent LLVM, but it doesn't seem to by inspection, and you can > check on a machine with CXX=g++ and LLVM 14 on Linux, which uses > -std=c++14 and fails if you add a use of and > std::string_view. (Warning: the system C++ standard library on Macs > and other Clang-based systems doesn't have enough version guards so it > won't complain, but GCC and its standard library will explicitly tell > you not to use C++17 features in a C++14 program.) I think the switch to C++14 happened with LLVM 10[0] while the C++17 switch happened with LLVM 16[1]. Double checking on an ubuntu jammy (can't install earlier llvm version than 12 on those): VERSIONS=(20 19 18 17 16 15 14 13 12) for version in ${VERSIONS[@]}; do llvm-config-$version --cxxflags done -I/usr/lib/llvm-20/include -std=c++17 -fno-exceptions -funwind-tables -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/usr/lib/llvm-19/include -std=c++17 -fno-exceptions -funwind-tables -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/usr/lib/llvm-18/include -std=c++17 -fno-exceptions -funwind-tables -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/usr/lib/llvm-17/include -std=c++17 -fno-exceptions -funwind-tables -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/usr/lib/llvm-16/include -std=c++17 -fno-exceptions -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/usr/lib/llvm-15/include -std=c++14 -fno-exceptions -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/usr/lib/llvm-14/include -std=c++14 -fno-exceptions -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/usr/lib/llvm-13/include -std=c++14 -fno-exceptions -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/usr/lib/llvm-12/include -std=c++14 -fno-exceptions -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS Which is still fine since, as you said, the code only applied for LLVM12+ which will always have at least c++14. I've tried compiling and running against all those llvm versions and the associated stdc++ version earlier in the thread[2] and had no issues. > If there are no further comments, I'm going to push this to all > branches tomorrow morning. For master only, I will remove the #if > condition and comment about LLVM 12+, as we now require 14+. Patch looks good to me. 
[0]: https://github.com/llvm/llvm-project/blob/release/10.x/llvm/CMakeLists.txt#L53 [1]: https://github.com/llvm/llvm-project/blob/release/16.x/llvm/CMakeLists.txt#L70 [2]: https://www.postgresql.org/message-id/CAO6_XqqxEQ%3DJY%2BtYO-KQn3_pKQ3O-mPojcwG54L5eptiu1cSJQ%40mail.gmail.com
Re: Pgoutput not capturing the generated columns
On Tue, Nov 5, 2024 at 12:32 PM Peter Smith wrote: > > has_column_list_defined: > > 1. > + if (HeapTupleIsValid(cftuple)) > + { > + bool isnull = true; > + > + /* Lookup the column list attribute. */ > + (void) SysCacheGetAttr(PUBLICATIONRELMAP, cftuple, > +Anum_pg_publication_rel_prattrs, > +&isnull); > > AFAIK it is not necessary to assign a default value to 'isnull' here. > e.g. most of the other 100s of calls to SysCacheGetAttr elsewhere in > PostgreSQL source don't bother to do this. > Can we try to reuse this new function has_column_list_defined() in pgoutput_column_list_init() where we are trying to check and form a column list? For that, we need to change this function such that it also returns a column list when requested. Some more comments: 1. extern bool is_schema_publication(Oid pubid); +extern bool has_column_list_defined(Publication *pub, Oid relid); The order of declaration doesn't match with its definition in .c file. It would be better to define this function after is_schema_publication(). 2. + * 'include_gencols' flag indicates whether generated columns should be + * published when there is no column list. Typically, this will have the same + * value as the 'publish_generated_columns' publication parameter. + * + * Note that generated columns can be published only when present in a + * publication column list, or when include_gencols is true. */ bool -logicalrep_should_publish_column(Form_pg_attribute att, Bitmapset *columns) +logicalrep_should_publish_column(Form_pg_attribute att, Bitmapset *columns, + bool include_gencols) { if (att->attisdropped) return false; - /* - * Skip publishing generated columns if they are not included in the - * column list. - */ - if (!columns && att->attgenerated) - return false; + /* If a column list is provided, publish only the cols in that list. */ + if (columns) + return bms_is_member(att->attnum, columns); /* - * Check if a column is covered by a column list. + * If no column list is provided, generated columns will be published only + * if include_gencols is true, while all non-generated columns will always + * be published. The similar information is repeated multiple times in different words. I have adjusted the comments in this part of the code to avoid repetition. I have changed comments at a few other places in the patch. See attached. -- With Regards, Amit Kapila. diff --git a/src/backend/replication/logical/proto.c b/src/backend/replication/logical/proto.c index 2c8fbc593a..2c2085b2f9 100644 --- a/src/backend/replication/logical/proto.c +++ b/src/backend/replication/logical/proto.c @@ -1272,10 +1272,6 @@ logicalrep_should_publish_column(Form_pg_attribute att, Bitmapset *columns, if (columns) return bms_is_member(att->attnum, columns); - /* -* If no column list is provided, generated columns will be published only -* if include_gencols is true, while all non-generated columns will always -* be published. -*/ + /* All non-generated columns are always published. */ return att->attgenerated ? include_gencols : true; } diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c index 386b087f79..014454d67b 100644 --- a/src/backend/replication/pgoutput/pgoutput.c +++ b/src/backend/replication/pgoutput/pgoutput.c @@ -168,9 +168,8 @@ typedef struct RelationSyncEntry Bitmapset *columns; /* -* Include generated columns for publication is set to true if -* 'publish_generated_columns' parameter is true, and the relation -* contains generated columns. 
+* This is set if the 'publish_generated_columns' parameter is true, and +* the relation contains generated columns. */ boolinclude_gencols; @@ -1098,7 +1097,6 @@ pgoutput_column_list_init(PGOutputData *data, List *publications, * fetch_table_list. But one can later change the publication so we still * need to check all the given publication-table mappings and report an * error if any publications have a different column list. -* */ foreach(lc, publications) { @@ -1157,11 +1155,8 @@ pgoutput_column_list_init(PGOutputData *data, List *publications, if (!cols) { /* -* For the first publication with no specified column list, we -* retrieve and cache the table columns. This allows comparison -* with the column lists of other publications to detect any -* differences. Columns are retrieved only when there is more than -* one publication, as differences can only arise in that case. +* Cache the table columns for the first publication with no specified +* column list to detect publicati
Re: Proposals for EXPLAIN: rename ANALYZE to EXECUTE and extend VERBOSE
On Tue, Nov 5, 2024 at 10:19 AM Robert Haas wrote: > On Tue, Nov 5, 2024 at 1:02 PM Nikolay Samokhvalov > wrote: > > hi, I have a proposal, resulted from numerous communications with > various folks, both very experienced and new Postgres users: > > > > 1) EXPLAIN ANALYZE Is sometimes very confusing (because there is > ANALYZE). Let's rename it to EXPLAIN EXECUTE? > > The trouble is that EXPLAIN EXECUTE already means something. > > robert.haas=# explain execute foo; > ERROR: prepared statement "foo" does not exist > > Granted, that would not make it impossible to make EXPLAIN (EXECUTE) a > synonym for EXPLAIN (ANALYZE), but IMHO it would be pretty confusing > if EXPLAIN EXECUTE and EXPLAIN (EXECUTE) did different things. > > > 2) VERBOSE doesn't include BUFFERS, and doesn't include SETTINGS; it > might be also confusing sometimes. Let's include them so VERBOSE would be > really verbose? > > I agree that the naming here isn't great, but I think making the > options non-orthogonal would probably be worse. > > > 3) small thing about grammar: allow omitting parentheses, so EXPLAIN > EXECUTE VERBOSE would work. > > Perhaps surprisingly, it turns out that this is not a small change. As > Tom mentions, this would have a pretty large blast radius. In fact, > the reason I wrote the patch to introduce parenthesized options for > EXPLAIN was precisely because the unparenthesized option syntax does > not scale nicely at all. > I appreciate all yours and Tom's very quick comments here! Item 3 is already solved, as it turned out. Let's focus on item 2. Is it really impossible to make VERBOSE really verbose?
Re: Proposals for EXPLAIN: rename ANALYZE to EXECUTE and extend VERBOSE
On Tue, Nov 5, 2024 at 1:24 PM Nikolay Samokhvalov wrote: > Item 3 is already solved, as it turned out. ANALYZE and VERBOSE are treated specially because those options existed prior to the parenthesized syntax. Scaling that treatment to a large number of options will not work out. > Let's focus on item 2. Is it really impossible to make VERBOSE really verbose? It is, of course, not impossible. But the fact that something is possible does not necessarily mean that it is a good idea. I think it can be quite confusing when the same behavior is controlled in more than one way. If the VERBOSE option turns information about BUFFERS on and off, and the BUFFERS option does the same thing, what happens if I say EXPLAIN (VERBOSE ON, BUFFERS OFF)? Is it different if I say EXPLAIN (BUFFERS OFF, VERBOSE ON)? There's a lot of opportunity for the behavior to be confusing here. Then, too, we can argue about what should be included in VERBOSE. You propose BUFFERS and SETTINGS, but we've also got SERIALIZE (which is not even Boolean-valued), WAL, and MEMORY. One can argue that we ought to include everything when VERBOSE is specified; one can also argue that some of this stuff is too marginal and too high-overhead to justify its inclusion. Both arguments have merit, IMHO. I'm not very happy with the current situation. I agree that EXPLAIN has gotten a bit too complicated. However, I also know that not everyone wants the same things. And I can say from a PostgreSQL support perspective that I do not always want a customer to just "turn on everything", as EXPLAIN output can be extremely long and adding a whole bunch of additional details that make already-long output even longer can easily be actively unhelpful. For me personally, just plain EXPLAIN ANALYZE is usually enough. Sometimes I need VERBOSE to see the target lists at each level, and very occasionally I need BUFFERS to see how much data is being accessed, but at least for me, those are pretty rare cases. So I don't think I really believe the "everybody always wants that" argument. One of the most common things that I have to do with EXPLAIN output is trim the small amounts of relevant material out of the giant pile of things that don't matter to the problem at hand. If you enable an option that adds an extra line of output for every node and there are 100 nodes in the query plan, that is a whole lot of additional clutter. -- Robert Haas EDB: http://www.enterprisedb.com
Re: UUID v7
> On 2 Nov 2024, at 02:23, Masahiko Sawada wrote: > > Most of the timing durations were nanoseconds and fell into either 0 > ns. Others fell into >1023 bins. Indeed. We cannot have these 2 bits from nanoseconds :( I've tried to come up with some clever tricks, but have not succeeded. Let's use only microseconds. Best regards, Andrey Borodin. v30-0001-Implement-UUID-v7.patch Description: Binary data
RE: Rename Function: pg_postmaster_start_time
>> I suggest this change to simplify the terminology and make the function >> name more intuitive, as "postgres" directly refers to the database server. >> This seems more suitable to me. >Seems like an unnecessary change of a publicly facing feature. IMO >stability wins out over any debatable improvement the change may bring. There are several parts of the system where the term 'postmaster' appears and could potentially be changed to 'postgres'. In most cases, I agree with you: keeping the current term is a more cautious approach and ensures stability. However, in the case of this function, the adjustment is quite simple and doesn’t involve significant changes to the files; it’s really just a matter of 'replacing' the term. Regards, Maiquel Grassi.
Proposals for EXPLAIN: rename ANALYZE to EXECUTE and extend VERBOSE
hi, I have a proposal, resulted from numerous communications with various folks, both very experienced and new Postgres users: 1) EXPLAIN ANALYZE Is sometimes very confusing (because there is ANALYZE). Let's rename it to EXPLAIN EXECUTE? 2) VERBOSE doesn't include BUFFERS, and doesn't include SETTINGS; it might be also confusing sometimes. Let's include them so VERBOSE would be really verbose? 3) small thing about grammar: allow omitting parentheses, so EXPLAIN EXECUTE VERBOSE would work. if both changes are done, we could use EXPLAIN (EXECUTE, VERBOSE) to be able to collect data in a great way for analysis. have a really nice week, Nik
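For comparison, this is roughly what the combined proposal is trying to shorten; today a "fully verbose" plan means spelling the options out (pgbench_accounts is just an example table, and the unparenthesized EXPLAIN EXECUTE VERBOSE form is hypothetical):

-- current syntax: each piece of extra detail is a separate option
EXPLAIN (ANALYZE, VERBOSE, BUFFERS, SETTINGS, WAL)
SELECT count(*) FROM pgbench_accounts;

-- under the proposal, roughly equivalent to:
-- EXPLAIN EXECUTE VERBOSE SELECT count(*) FROM pgbench_accounts;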
Rename Function: pg_postmaster_start_time
Hi! I have been working on some servers that haven't been shut down or restarted for many days. As a result, the need arose to use the function 'pg_postmaster_start_time' (which I had completely forgotten existed). While working with it, I realized that it would be more appropriate for its name to be 'pg_postgres_start_time', to align with the name of the main process displayed by the operating system. The process, which was previously known as "postmaster," has long been referred to as "postgres". I suggest this change to simplify the terminology and make the function name more intuitive, as "postgres" directly refers to the database server. This seems more suitable to me. Attached is the patch with the suggest adjustment. Result: [postgres@192 bin]$ pwd /pgsql18devel/18devel_bin/bin [postgres@192 bin]$ ./psql -p 5454 -h ::1 psql (18devel) Type "help" for help. postgres=# select pg_postgres_start_time(); pg_postgres_start_time --- 2024-11-05 14:52:42.463293-03 (1 row) Regards, Maiquel Grassi. diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml index 73979f20ff..44285ab393 100644 --- a/doc/src/sgml/func.sgml +++ b/doc/src/sgml/func.sgml @@ -24816,9 +24816,9 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n); - pg_postmaster_start_time + pg_postgres_start_time -pg_postmaster_start_time () +pg_postgres_start_time () timestamp with time zone diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c index e00cd3c969..7b176a4970 100644 --- a/src/backend/utils/adt/timestamp.c +++ b/src/backend/utils/adt/timestamp.c @@ -49,7 +49,7 @@ #define SAMESIGN(a,b) (((a) < 0) == ((b) < 0)) -/* Set at postmaster start */ +/* Set at postgres start */ TimestampTz PgStartTime; /* Set at configuration reload */ @@ -1623,7 +1623,7 @@ clock_timestamp(PG_FUNCTION_ARGS) } Datum -pg_postmaster_start_time(PG_FUNCTION_ARGS) +pg_postgres_start_time(PG_FUNCTION_ARGS) { PG_RETURN_TIMESTAMPTZ(PgStartTime); } diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat index f23321a41f..052dbe1548 100644 --- a/src/include/catalog/pg_proc.dat +++ b/src/include/catalog/pg_proc.dat @@ -8587,9 +8587,9 @@ # start time function { oid => '2560', descr => 'postmaster start time', - proname => 'pg_postmaster_start_time', provolatile => 's', + proname => 'pg_postgres_start_time', provolatile => 's', prorettype => 'timestamptz', proargtypes => '', - prosrc => 'pg_postmaster_start_time' }, + prosrc => 'pg_postgres_start_time' }, # config reload time function { oid => '2034', descr => 'configuration load time',
Re: Proposals for EXPLAIN: rename ANALYZE to EXECUTE and extend VERBOSE
On Tue, Nov 5, 2024 at 1:02 PM Nikolay Samokhvalov wrote: > hi, I have a proposal, resulted from numerous communications with various > folks, both very experienced and new Postgres users: > > 1) EXPLAIN ANALYZE Is sometimes very confusing (because there is ANALYZE). > Let's rename it to EXPLAIN EXECUTE? The trouble is that EXPLAIN EXECUTE already means something. robert.haas=# explain execute foo; ERROR: prepared statement "foo" does not exist Granted, that would not make it impossible to make EXPLAIN (EXECUTE) a synonym for EXPLAIN (ANALYZE), but IMHO it would be pretty confusing if EXPLAIN EXECUTE and EXPLAIN (EXECUTE) did different things. > 2) VERBOSE doesn't include BUFFERS, and doesn't include SETTINGS; it might be > also confusing sometimes. Let's include them so VERBOSE would be really > verbose? I agree that the naming here isn't great, but I think making the options non-orthogonal would probably be worse. > 3) small thing about grammar: allow omitting parentheses, so EXPLAIN EXECUTE > VERBOSE would work. Perhaps surprisingly, it turns out that this is not a small change. As Tom mentions, this would have a pretty large blast radius. In fact, the reason I wrote the patch to introduce parenthesized options for EXPLAIN was precisely because the unparenthesized option syntax does not scale nicely at all. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Interrupts vs signals
On Mon, 4 Nov 2024 at 20:42, Heikki Linnakangas wrote: > Having spent some time playing with this, I quite like option C: break > compatibility, but provide an out-of-tree header file with > *forward*-compatibility macros. That encourages extension authors to > adapt to new idioms, but avoids having to sprinkle extension code with > #if version checks to support old versions. +1 maintaining a subset of these things for every extension is kind of a pain > My plan is to put this on the Wiki Why the wiki and not as a file in the repo? Seems like it would be nice to update this file together with patches that introduce such breakages. To be clear, I think it shouldn't be possible to #include the file, such a forward compatibility file should always be copy-pasted. But having it in the same place as the code seems useful, just like we update docs together with the code. > We could add helpers for previous > incompatibilities in v17 and v16 too, although at quick glance I'm not > sure what those might be. One thing that I personally add to any extension I maintain is the new foreach macros introduced in PG17, because they are just so much nicer to use. > I tested this approach by adapting pg_cron to build with these patches > and the compatibility header, and compiling it with all supported server > versoins. Works great, see adapt-pg_cron.patch. Looks like a simple change indeed. On Mon, 4 Nov 2024 at 20:42, Heikki Linnakangas wrote: > > On 31/10/2024 02:32, Michael Paquier wrote: > > On Wed, Oct 30, 2024 at 01:23:54PM -0400, Robert Haas wrote: > >> On Wed, Oct 30, 2024 at 12:03 PM Heikki Linnakangas > >> wrote: > >>> C) We could provide "forward-compatibility" macros in a separate header > >>> file, to make the new "SetInterrupt" etc calls work in old PostgreSQL > >>> versions. Many of the extensions already have a header file like this, > >>> see e.g. citusdata/citus/src/include/pg_version_compat.h, > >>> pipelinedb/pipelinedb/include/compat.h. It might actually be a good idea > >>> to provide a semi-official header file like this on the Postgres wiki, > >>> to help extension authors. It would encourage extensions to use the > >>> latest idioms, while still being able to compile for older versions. > >>> > >>> I'm leaning towards option C). Let's rip off the band-aid, but provide > >>> documentation for how to adapt your extension code. And provide a > >>> forwards-compatibility header on the wiki, that extension authors can > >>> use to make the new Interrupt calls work against old server versions. > >> > >> I don't know which of these options is best, but I don't find any of > >> them categorically unacceptable. > > > > Looking at the compatibility macros of 0008 for the latches with > > INTERRUPT_GENERAL_WAKEUP under latch.h, the changes are not that bad > > to adapt to, IMO. It reminds of f25968c49697: hard breakage, no > > complaints I've heard of because I guess that most folks have been > > using an in-house compatibility headers. > > > > A big disadvantage of B is that someone may decide to add new code in > > core that depends on the past routines, and we'd better avoid that for > > this new layer of APIs for interrupt handling. A is a subset of C: do > > a hard switch in the core code, with C mentioning a compatibility > > layer in the wiki that does not exist in the core code. Any of A or C > > is OK, I would not choose B for the core backend. 
> Having spent some time playing with this, I quite like option C: break > compatibility, but provide an out-of-tree header file with > *forward*-compatibility macros. That encourages extension authors to > adapt to new idioms, but avoids having to sprinkle extension code with > #if version checks to support old versions. > > See attached pg_version_compatibility.h header. It allows compiling code > that uses basic SendInterrupt, RaiseInterrupt, WaitInterrupt calls with > older server versions. My plan is to put this on the Wiki, and update it > with similar compatibility helpers for other changes we might make in > v18 or future versions. We could add helpers for previous > incompatibilities in v17 and v16 too, although at quick glance I'm not > sure what those might be. > > I tested this approach by adapting pg_cron to build with these patches > and the compatibility header, and compiling it with all supported server > versoins. Works great, see adapt-pg_cron.patch. > > I pushed the preliminary cleanup patches from this patch set earlier, > only the main patches remain. Attached is a new version of those, with > mostly comment cleanups. > > -- > Heikki Linnakangas > Neon (https://neon.tech)
Re: Proposals for EXPLAIN: rename ANALYZE to EXECUTE and extend VERBOSE
Nikolay Samokhvalov writes: > 1) EXPLAIN ANALYZE Is sometimes very confusing (because there is ANALYZE). > Let's rename it to EXPLAIN EXECUTE? This has got far too many years of history to be renamed now. > 2) VERBOSE doesn't include BUFFERS, and doesn't include SETTINGS; it might > be also confusing sometimes. Let's include them so VERBOSE would be really > verbose? This is not likely to fly for compatibility reasons. > 3) small thing about grammar: allow omitting parentheses, so EXPLAIN > EXECUTE VERBOSE would work. The reason for the parens is that the other way would require reserving all these options as keywords. regards, tom lane
Re: Proposals for EXPLAIN: rename ANALYZE to EXECUTE and extend VERBOSE
On Tue, Nov 5, 2024 at 10:16 AM Tom Lane wrote: > Nikolay Samokhvalov writes: > > 1) EXPLAIN ANALYZE Is sometimes very confusing (because there is ANALYZE). > > Let's rename it to EXPLAIN EXECUTE? > > This has got far too many years of history to be renamed now. This is a really, really strange argument. Postgres keeps receiving new audiences at larger and larger scale. And they are confused. It's better late than never. I didn't believe we would have "quit" working in psql. > > 2) VERBOSE doesn't include BUFFERS, and doesn't include SETTINGS; it might > > be also confusing sometimes. Let's include them so VERBOSE would be really > > verbose? > > This is not likely to fly for compatibility reasons. Can you elaborate? > > 3) small thing about grammar: allow omitting parentheses, so EXPLAIN > > EXECUTE VERBOSE would work. > > The reason for the parens is that the other way would require reserving > all these options as keywords. Turns out, EXPLAIN ANALYZE VERBOSE is already working (it's just not as verbose as one might expect): test=# explain analyze verbose select; QUERY PLAN -- Result (cost=0.00..0.01 rows=1 width=0) (1 row)
Re: Rename Function: pg_postmaster_start_time
On Tue, Nov 5, 2024 at 11:26 AM Maiquel Grassi wrote: > I suggest this change to simplify the terminology and make the function > name more intuitive, as "postgres" directly refers to the database server. > This seems more suitable to me. > Seems like an unnecessary change of a publicly facing feature. IMO stability wins out over any debatable improvement the change may bring. David J.
Re: Proposals for EXPLAIN: rename ANALYZE to EXECUTE and extend VERBOSE
Nikolay Samokhvalov writes: > Let's focus on item 2. Is it really impossible to make VERBOSE really > verbose? It's obviously not "impossible" -- the code changes would likely be trivial. The question is whether it's a good idea. These semantics were (I presume) deliberately chosen when the options were added, so somebody thought not. You would need to go back and review the relevant mail thread and then make arguments why that decision was wrong. In short: we're not working in a green field here, and all these decisions have history. You will not get far by just popping up and saying "I think it should be different". You need to make a case why the decision was wrong, and why it was so wrong that we should risk cross-version-compatibility problems by changing. regards, tom lane
Re: Rename Function: pg_postmaster_start_time
On Tue, Nov 5, 2024 at 11:59 AM Maiquel Grassi wrote: > >> I suggest this change to simplify the terminology and make the function > >> name more intuitive, as "postgres" directly refers to the database > server. > >> This seems more suitable to me. > > >Seems like an unnecessary change of a publicly facing feature. IMO > >stability wins out over any debatable improvement the change may bring. > > There are several parts of the system where the term 'postmaster' appears > and could potentially be changed to 'postgres'. In most cases, I agree with > you: keeping the current term is a more cautious approach and ensures > stability. However, in the case of this function, the adjustment is quite > simple and doesn’t involve significant changes to the files; it’s really > just a matter of 'replacing' the term. > The ease or difficulty of making the change in the server has no meaningful bearing on whether breaking this public API is warranted or not. David J.
Re: Add ExprState hashing for GROUP BY and hashed SubPlans
On Sat, 2 Nov 2024 at 22:13, Andrei Lepikhov wrote: > Okay, I passed through the code. It looks good. Hashing expressions are > too simple to give notably impressive results, but it is a step in the > right direction. > I found only one minor issue: an outdated comment (see attachment). Thanks for looking. I ended up doing a bit more work on that comment. See attached. I was performing some final benchmarks on this and found that there's a small performance regression for hashed subplans. The test case is: create table t1 (a int); insert into t1 select a from generate_series(1,10)a; select * from t1 where a not in(select a from t1); With my Zen4 laptop I get: gcc: master tps = 58.0375255 v5-0001 tps = 56.11840762 clang: master tps = 58.72993378 v5-0001 tps = 53.39965968 Zen2 3990x gcc: master tps = 34.30392818 v5-0001 tps = 33.22067202 clang: master tps = 34.0497471 v5-0001 tps = 33.34801254 That's the average over 50 runs starting and stopping the server on each 10 second run. I'm still thinking about what to do about this. In the meantime, I'm not going to commit it. > Also, to make a hash calculation as fast as possible, should we add the > call of murmurhash32 as an optional step of ExprState in the > ExecBuildHash32FromAttrs? I wrote enough of this patch to test the performance. It does not help. See attached 0002. David From 31cffc52c302e63da6c6f0c20cf8cc7c8162c191 Mon Sep 17 00:00:00 2001 From: David Rowley Date: Wed, 7 Aug 2024 16:56:48 +1200 Subject: [PATCH v5 1/2] Use ExprStates for hashing in GROUP BY and SubPlans This speeds up obtaining hash values for GROUP BY and for hashed SubPlan by using the ExprState support for hashing. This allows JIT compilation for hash value. This hash shown to improve Hash Aggregate performance in some cases by about 3%. In passing, fix a hypothetical bug in ExecBuildHash32Expr() so that the initial value is stored directly in the ExprState's result field if there are no expressions to hash. None of the current users of this function uses an initial value, so the bug is only hypothetical. Reviewed-by: Andrei Lepikhov Discussion: https://postgr.es/m/caaphdvpyso3kc9urymevwqthtbrxgfd9djiajkhmpusqex9...@mail.gmail.com --- src/backend/executor/execExpr.c | 155 src/backend/executor/execGrouping.c | 82 ++- src/backend/executor/nodeSubplan.c | 17 ++- src/include/executor/executor.h | 10 +- src/include/nodes/execnodes.h | 25 +++-- 5 files changed, 223 insertions(+), 66 deletions(-) diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c index 8f7a534005..c7bb13e270 100644 --- a/src/backend/executor/execExpr.c +++ b/src/backend/executor/execExpr.c @@ -3971,6 +3971,161 @@ ExecBuildAggTransCall(ExprState *state, AggState *aggstate, } } +/* + * Build an ExprState that calls the given hash function(s) on the attnums + * given by 'keyColIdx' . When numCols > 1, the hash values returned by each + * hash function are combined to produce a single hash value. + * + * desc: tuple descriptor for the to-be-hashed columns + * ops: TupleTableSlotOps to use for the give TupleDesc + * hashfunctions: FmgrInfos for each hash function to call, one per numCols. + * These are used directly in the returned ExprState so must remain allocated. + * collations: collation to use when calling the hash function. + * numCols: array length of hashfunctions, collations and keyColIdx. + * parent: PlanState node that the resulting ExprState will be evaluated at + * init_value: Normally 0, but can be set to other values to seed the hash + * with. 
Non-zero is marginally slower, so best to only use if it's provably + * worthwhile. + */ +ExprState * +ExecBuildHash32FromAttrs(TupleDesc desc, const TupleTableSlotOps *ops, +FmgrInfo *hashfunctions, Oid *collations, +int numCols, AttrNumber *keyColIdx, +PlanState *parent, uint32 init_value) +{ + ExprState *state = makeNode(ExprState); + ExprEvalStep scratch = {0}; + NullableDatum *iresult = NULL; + intptr_topcode; + AttrNumber last_attnum = 0; + + Assert(numCols >= 0); + + state->parent = parent; + + /* +* Make a place to store intermediate hash values between subsequent +* hashing of individual columns. We only need this if there is more than +* one column to hash or an initial value plus one column. +*/ + if ((int64) numCols + (init_value != 0) > 1) + iresult = palloc(sizeof(NullableDatum)); + + /* find the highest attnum so we deform the tuple to that point */ + for (int i = 0; i < numCols; i++) + last_attnum = Max(last_attnum, keyColIdx[i]); + + scratch.opcode = EEOP_INNER_FETCHSOME; + scratch.d.fetch.last_var = last_attnum; + scra
Re: pg_dump --no-comments confusion
On Mon, 4 Nov 2024 at 21:00, Bruce Momjian wrote: > > On Mon, Nov 4, 2024 at 07:49:45PM +0100, Daniel Gustafsson wrote: > > > On 4 Nov 2024, at 17:24, Erik Wienhold wrote: > > > But I also think that > > > "SQL" in front of the command name is unnecessary because the man page > > > uses the "FOOBAR command" form throughout > > > > Agreed. > > > > >--inserts > > >Dump data as INSERT commands [...] > > > > > > Also, it doesn't really matter whether COMMENT is standard SQL. > > > > AFAIK some flavor of COMMENT is present in MySQL, PostgreSQL and Oracle > > which > > already makes it more "standardized" than many parts of the SQL standard =) Oh, I didn't know that, TIL. > Proposed patch attached. LGTM. -Matthias
Re: Proposals for EXPLAIN: rename ANALYZE to EXECUTE and extend VERBOSE
On Tue, Nov 5, 2024 at 10:30 AM Tom Lane wrote:
> we're not working in a green field here, and all these
> decisions have history.

I hear you and understand. Ready to do the legwork here.

1. VERBOSE first appeared in 1997 in 6.3 in 3a02ccfa, with a different meaning:
> This command [EXPLAIN] outputs details about the supplied query. The default
> output is the computed query cost. \f2verbose\f1 displays the full query
> plan and cost.

2. Support for parentheses was added in d4382c4a (2009, 8.5), with "test" option COSTS, and this opened the gates to extending with many options.

3. BUFFERS was added in d4382c4 (also 2009, 8.5), discussion https://www.postgresql.org/message-id/flat/4AC12A17.5040305%40timbira.com; I didn't see any discussion of including it in VERBOSE. In my opinion, this option is invaluable: most performance optimization is done by reducing IO, so seeing these numbers helps make decisions much faster. I always use them. When you optimize and, for example, want to verify an index idea, it's not good to do it on production – it's better to work with clones. There, we can have weaker hardware, a different buffer state, etc., so timing numbers might be really off. Timing can be different even on the same server, e.g. after a restart, when the buffer pool is not warmed up. But BUFFERS never lie – they are not affected by a saturated CPU if it happens, lock acquisition waits, etc. Not looking at them is missing an essential part of the analysis, I strongly believe. It looks like in 2009, when the BUFFERS option was created, there was not enough understanding of how useful it is, so including it by default – or at least, as we discuss here, in VERBOSE – was not considered. I want to emphasize: BUFFERS is essential in my work, and more and more people are convinced that during the optimization process, when you're inside it, in most cases it's beneficial to focus on BUFFERS. Notice that explain.depesz.com, explain.dalibo.com, pgMustard and many tools recognize it and ask users to include BUFFERS in the analysis. And see the next item:

4. Making BUFFERS the default behavior for EXPLAIN ANALYZE was raised several times, for example https://www.postgresql.org/message-id/flat/CANNMO++=LrJ4upoeydZhbmpd_ZgZjrTLueKSrivn6xmb=yf...@mail.gmail.com (2021) – my understanding is that it received great support and the reasons it's useful were discussed in detail, but several attempts to implement it were not completed because of technical difficulties (as I remember, a problem with broken tests and how to fix them).

5. EXPLAIN ALL was proposed in https://www.postgresql.org/message-id/flat/080fe841-e38d-42a9-ad6d-48cabed16...@endpoint.com (2016) – I think it was actually a good idea originally, but it didn't survive questions about mutually exclusive options and non-binary options, and then the discussion stopped after pivoting in the direction of a GUC.

6. Finally, the fresh SERIALIZE option was discussed in https://www.postgresql.org/message-id/flat/ca0adb0e-fa4e-c37e-1cd7-91170b18cae1%40gmx.de (2023-2024, 17), and unfortunately again.

I might be missing some discussions – please help me find them; I also expect that there are many people who support me in thinking that BUFFERS is very useful and should be the default or at least inside VERBOSE.
Meanwhile:
- to be able to have all the data in hand during analysis, we need to recommend that users collect plans using EXPLAIN (ANALYZE, BUFFERS, VERBOSE, SETTINGS), which looks really long
- independently, I now see pgMustard ended up with a similar recommendation: https://www.pgmustard.com/getting-a-query-plan:
> For better advice, we recommend using at least: explain (analyze, format json, buffers, verbose, settings)

My proposal remains: EXPLAIN ANALYZE VERBOSE -- let's consider this, please.
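To make the comparison concrete, here is a minimal illustration against a hypothetical table t (the table and query are placeholders, not taken from the thread):

-- the "long" recommendation above
EXPLAIN (ANALYZE, BUFFERS, VERBOSE, SETTINGS)
SELECT count(*) FROM t WHERE a < 100;

-- the proposed shorthand, if VERBOSE also implied BUFFERS and SETTINGS
EXPLAIN ANALYZE VERBOSE
SELECT count(*) FROM t WHERE a < 100;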
Re: [BUG] Fix DETACH with FK pointing to a partitioned table fails
I'm trying to write release notes for commits 53af9491a et al, and it seems to me that we need to explain how to get out of the mess that would be left behind by the old DETACH code. There's no hint about that in the commit message :-( Clearly, if you have now-inconsistent data, there's little help for that but to manually fix the inconsistencies. What I am worried about is how to get to a state where you have correct catalog entries for the constraint. Will ALTER TABLE DROP CONSTRAINT on the now stand-alone table work to clean out the old catalog entries for the constraint? I'm worried that it will either fail, or go through but remove triggers on the referenced table that we still need for the original partitioned table. If that doesn't work I think we had better create a recipe for manually removing the detritus. Once the old entries are gone it should be possible to do ALTER TABLE ADD CONSTRAINT (with an updated server), and that would validate your data. It's the DROP CONSTRAINT part that worries me. regards, tom lane
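For concreteness, the recovery sequence under discussion would look roughly like this, with hypothetical table, column, and constraint names; whether the first step behaves as hoped is exactly the open question:

-- on the now stand-alone (formerly attached) table, remove the leftover constraint entries
ALTER TABLE detached_part DROP CONSTRAINT parent_fk;
-- then, on a server carrying the fix, recreate the constraint, which revalidates the data
ALTER TABLE detached_part
  ADD CONSTRAINT parent_fk FOREIGN KEY (ref_id) REFERENCES referenced_parted (id);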
Re: Proposals for EXPLAIN: rename ANALYZE to EXECUTE and extend VERBOSE
On Tue, Nov 5, 2024 at 2:54 PM Nikolay Samokhvalov wrote: > 6. Finally, the fresh SERIALIZE option was discussed in > https://www.postgresql.org/message-id/flat/ca0adb0e-fa4e-c37e-1cd7-91170b18cae1%40gmx.de > (2023-2024, 17), and unfortunately again. > (didn't finish the phrase here and hit Send) ...again, I don't see that including the SERIALIZE behavior in VERBOSE was discussed. I don't use SERIALIZE myself, but during our podcasts, Michael (CCing him) was wondering why that is. Summary: I haven't found explicit discussions of including new options in VERBOSE at the time those new options were created. I used Google, the .org search, and postgres.ai semantic search over the archives involving pgvector/HNSW – I might be missing something, or it was really not discussed when new options were added.
Re: [PoC] Federated Authn/z with OAUTHBEARER
On Sun, Nov 3, 2024 at 9:00 PM jian he wrote: > The above Assert looks very wrong to me. I can switch to Assert(false) if that's preferred, but it makes part of the libc assert() report useless. (I wish we had more fluent ways to say "this shouldn't happen, but if it does, we still need to get out safely.") > we can also use PG_INT32_MAX, instead of INT_MAX > (generally i think PG_INT32_MAX looks more intuitive to me) That's a fixed-width max; we want the maximum for the `int` type here. >expires_in > REQUIRED. The lifetime in seconds of the "device_code" and > "user_code". >interval > OPTIONAL. The minimum amount of time in seconds that the client > SHOULD wait between polling requests to the token endpoint. If no > value is provided, clients MUST use 5 as the default. > " > these two fields seem to differ from struct device_authz. Yeah, Daniel and I had talked about being stricter about REQUIRED fields that are not currently used. There's a comment making note of this in parse_device_authz(). The v1 code will need to make expires_in REQUIRED, so that future developers can develop features that depend on it without worrying about breaking currently-working-but-noncompliant deployments. (And if there are any noncompliant deployments out there now, we need to know about them so we can have that explicit discussion.) Thanks, --Jacob
Re: define pg_structiszero(addr, s, r)
On Wed, 6 Nov 2024 at 04:03, Bertrand Drouvot wrote: > Another option could be to use SIMD instructions to check multiple bytes > is zero in a single operation. Maybe just an idea to keep in mind and > experiment > if we feel the need later on. Could do. I just wrote it that way to give the compiler flexibility to do SIMD implicitly. That seemed easier than messing around with SIMD intrinsics. I guess the compiler won't use SIMD with the single size_t-at-a-time version as it can't be certain it's ok to access the memory beyond the first zero word. Because I wrote the "if" condition using bitwise-OR, there's no boolean short-circuiting, so the compiler sees it must be safe to access all the memory for the loop iteration. If I use -march=native or -march=znver2 on my Zen2 machine, gcc does use SIMD operators. Clang uses some 128-bit registers without specifying -march: drowley@amd3990x:~$ gcc -O2 allzeros.c -march=native -o allzeros && for i in {1..3}; do ./allzeros; done char: done in 1940539 nanoseconds size_t: done in 261731 nanoseconds (7.41425 times faster than char) size_t * 4: done in 130415 nanoseconds (14.8797 times faster than char) size_t * 8: done in 70031 nanoseconds (27.7097 times faster than char) char: done in 3030132 nanoseconds size_t: done in 477044 nanoseconds (6.35189 times faster than char) size_t * 4: done in 123551 nanoseconds (24.5254 times faster than char) size_t * 8: done in 68549 nanoseconds (44.2039 times faster than char) char: done in 3214037 nanoseconds size_t: done in 256901 nanoseconds (12.5108 times faster than char) size_t * 4: done in 126017 nanoseconds (25.5048 times faster than char) size_t * 8: done in 73167 nanoseconds (43.9274 times faster than char) David
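As a rough sketch of the distinction being described (not the actual patch code; the full function appears later in this thread), the only difference between a short-circuiting check and the auto-vectorizable one is the operator combining the word comparisons:

/* boolean OR short-circuits, so the compiler cannot assume the later loads are safe to do eagerly */
if (((const size_t *) p)[0] != 0 || ((const size_t *) p)[1] != 0 ||
    ((const size_t *) p)[2] != 0 || ((const size_t *) p)[3] != 0)
    return false;

/* bitwise OR evaluates every operand, so all loads happen unconditionally and the compiler can use SIMD */
if ((((const size_t *) p)[0] != 0) | (((const size_t *) p)[1] != 0) |
    (((const size_t *) p)[2] != 0) | (((const size_t *) p)[3] != 0))
    return false;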
Re: Proposals for EXPLAIN: rename ANALYZE to EXECUTE and extend VERBOSE
On Tue, Nov 5, 2024 at 3:55 PM Nikolay Samokhvalov wrote: > > 4. Making BUFFERS default behavior for EXPLAIN ANALYZE was raised several > times, for example > https://www.postgresql.org/message-id/flat/CANNMO++=LrJ4upoeydZhbmpd_ZgZjrTLueKSrivn6xmb=yf...@mail.gmail.com > (2021) – and my understanding that it was received great support and it > discussed in detail why it's useful, but then several attempts to implement > it were not accomplished because of tech difficulties (as I remember, > problem with broken tests and how to fix that). > The main premise here is that explain should include buffers by default, and to do so we are willing to inconvenience testers who do not want buffer data in their test plans to have to modify their tests to explicitly exclude buffers. We'll have to eat our own dog food here and go and add "buffers off" throughout our code base to make this happen. I personally feel that we should accept a patch that does so. The benefits to the many outweigh the one-time inconveniencing of the few. Especially if limited to explain analyze. > 5. EXPLAIN ALL proposed in > https://www.postgresql.org/message-id/flat/080fe841-e38d-42a9-ad6d-48cabed16...@endpoint.com > (2016) – I think it's actually a good idea originally, but didn't survive > questions of mutually exclusive options and non-binary options, and then > discussion stopped after pivoting in direction of GUC. > If the desire is to make the current keyword VERBOSE behave like the proposed ALL keyword, then one must first get a version of ALL accepted, then argue for repurposing VERBOSE instead of adding the new keyword. But at this point I really do not see extending verbose to mean more than "add more comments and context labels". Verbose has never meant to include everything, and getting buy-in to change that seems highly unlikely. In short, neither change is deemed unwanted; indeed, both have support. It's a matter of learning from the previous attempt to increase the odds of getting something committed. I wouldn't advise expending effort or political capital on the parentheses topic at this point. David J.
Re: per backend I/O statistics
On Tue, Nov 05, 2024 at 05:37:15PM +, Bertrand Drouvot wrote: > I'm starting working on option 2, I think it will be easier to discuss with > a patch proposal to look at. > > If in the meantime, one strongly disagree with option 2 (means implement a brand > new PGSTAT_KIND_BACKEND and keep PGSTAT_KIND_IO), please let me know. Sorry for the late reply, catching up a bit. As you are quoting in [1], you do not expect the backend-io stats and the more global pg_stat_io to achieve the same level of consistency as the backend stats would be gone at restart, and wiped out when a backend shuts down. So, splitting them with a different stats kind feels more natural because it would be possible to control how each stat kind behaves depending on the shutdown and reset code paths within their own callbacks rather than making the callbacks of PGSTAT_KIND_IO more complex than they already are. And pg_stat_io is a fixed-numbered stats kind because of the way it aggregates its stats with a number of states defined at compile time. Is the structure you have in mind different than PgStat_BktypeIO? Perhaps a split is better anyway with that in mind. The amount of memory required to store the snapshots of backend-IO does not worry me much, TBH, but you are worried about a high turnover of connections that could cause a lot of bloat in the backend-IO snapshots because of the persistency that these stats would have, right? Actually, we already have cases with other stats kinds where it is possible for a backend to hold references to some stats on an object that has been dropped concurrently. At least, that's possible when a backend shuts down. If possible, supporting snapshots would be more consistent with the other stats. Just to be clear, I am not in favor of making PgStat_HashKey larger than it already is. With 8 bytes allocated for the object ID, the chances of conflicts in the dshash for a single variable-numbered stats kind play in our favor already even with a couple of million entries. [1]: https://www.postgresql.org/message-id/zymrjibupnpoc...@ip-10-97-1-34.eu-west-3.compute.internal -- Michael signature.asc Description: PGP signature
Re: [PoC] Federated Authn/z with OAUTHBEARER
On Tue, Nov 5, 2024 at 3:33 PM Jacob Champion wrote: > Done in v36, attached. Forgot to draw attention to this part: > +# XXX libcurl must link after libgssapi_krb5 on FreeBSD to avoid > segfaults > +# during gss_acquire_cred(). This is possibly related to Curl's Heimdal > +# dependency on that platform? Best I can tell, libpq for FreeBSD has a dependency diamond for GSS symbols: libpq links against MIT krb5, libcurl links against Heimdal, libpq links against libcurl. Link order becomes critical to avoid nasty segfaults, but I have not dug deeply into the root cause. --Jacob
Re: index_delete_sort: Unnecessary variable "low" is used in heapam.c
> On 5 Nov 2024, at 17:40, Fujii Masao wrote: > > On 2024/09/24 21:31, Daniel Gustafsson wrote: >>> On 24 Sep 2024, at 10:32, btnakamurakoukil >>> wrote: >>> I noticed unnecessary variable "low" in index_delete_sort() >>> (/postgres/src/backend/access/heap/heapam.c), patch attached. What do you >>> think? >> That variable does indeed seem to not be used, and hasn't been used since it >> was committed in d168b666823. The question is if it's a left-over from >> development which can be removed, or if it should be set and we're missing an >> optimization. Having not read the referenced paper I can't tell so adding >> Peter Geoghegan who wrote this for clarification. > > It's been about a month without updates. How about removing the unnecessary > variable as suggested? We can always add the "missing" logic later if needed. Thanks for reviving this, I had admittedly forgotten about this thread. I don't see any reason to not go ahead with the proposed diff. -- Daniel Gustafsson
Re: Time to add a Git .mailmap?
On Tue, Nov 05, 2024 at 10:33:13AM +0100, Alvaro Herrera wrote: > While I would agree with this line of thinking if the situation were as > you describe, it should be obvious that it isn't; nobody here uses or > has ever used a work email as committer address[1][2]. Nevertheless, > since this argument is about _your_ personal identity not mine, I'm not > going to stand against you on it. > > Therefore I +1 Daniel's original proposal with thanks, and BTW I'm not > sorry for changing my name to use the hard-won ' accent on it :-) And now I'm going to ask how you figured out about the ë in my name, because it's right ;) I've dropped it from the commit logs because of keyboard laziness, mostly. -- Michael signature.asc Description: PGP signature
Re: Proposals for EXPLAIN: rename ANALYZE to EXECUTE and extend VERBOSE
On Tue, Nov 5, 2024 at 1:19 PM Nikolay Samokhvalov wrote: > >> > 2) VERBOSE doesn't include BUFFERS, and doesn't include SETTINGS; it >> might >> > be also confusing sometimes. Let's include them so VERBOSE would be >> really >> > verbose? >> >> This is not likely to fly for compatibility reasons. >> > > Can you elaborate? > I am not sure about the compatibility reasons (other than backtesting, or scripts?). But, personally, as a relatively new person to PG, I was surprised that VERBOSE did not include the buffers. Could we somehow turn this on? (GUC: VERBOSE_INCLUDES_BUFFERS = yes/no)?
Re: define pg_structiszero(addr, s, r)
On Wed, Nov 06, 2024 at 12:16:33PM +1300, David Rowley wrote: > On Wed, 6 Nov 2024 at 04:03, Bertrand Drouvot > wrote: >> Another option could be to use SIMD instructions to check multiple bytes >> is zero in a single operation. Maybe just an idea to keep in mind and >> experiment >> if we feel the need later on. > > Could do. I just wrote it that way to give the compiler flexibility to > do SIMD implicitly. That seemed easier than messing around with SIMD > intrinsics. I guess the compiler won't use SIMD with the single > size_t-at-a-time version as it can't be certain it's ok to access the > memory beyond the first zero word. Because I wrote the "if" condition > using bitwise-OR, there's no boolean short-circuiting, so the compiler > sees it must be safe to access all the memory for the loop iteration. How complex would that be compared to the latest patch proposed if done this way? If you can force SIMD without having to know about these specific gcc switches (aka -march is not set by default in the tree except for some armv8 path), then the performance happens magically. If that makes the code more readable, that's even better. -- Michael signature.asc Description: PGP signature
Re: Converting contrib SQL functions to new style
On Tue, Nov 05, 2024 at 10:14:02AM +0100, Ronan Dunklau wrote: > Sorry you're right I missed one for xml2. But most importantly I forgot to > update the meson build files for every one of them, using only the Makefile... If you want to catch that easily in the future, you can also set up the CI within your own repo. This has saved me many times and the cfbot sometimes takes a few days to report back. This usually reports within 15 minutes.

I was wondering what was going on here, and this patch comes down to switching all these definitions from that:

CREATE FUNCTION lo_oid(lo) RETURNS pg_catalog.oid AS
'SELECT $1::pg_catalog.oid' LANGUAGE SQL STRICT IMMUTABLE PARALLEL SAFE;

To that:

+CREATE OR REPLACE FUNCTION lo_oid(lo) RETURNS pg_catalog.oid
+LANGUAGE SQL STRICT IMMUTABLE PARALLEL SAFE
+RETURN (SELECT $1::pg_catalog.oid);

This makes the execution more robust against run-time search_path changes. Is that something that should be considered for a backpatch, actually? -- Michael signature.asc Description: PGP signature
Re: define pg_structiszero(addr, s, r)
On Wed, Nov 06, 2024 at 01:38:50PM +1300, David Rowley wrote: > We could just write it that way and leave it up to the compiler to > decide whether to use SIMD or not. Going by [1], gcc with -O2 uses > SIMD instructions from 14.1 and clang with -O2 does it from version > 8.0.0. gcc 14.1 was released in May 2024, so still quite new. It'll be > quite a bit older once PG18 is out. Using the bitwise-OR method, more > and more people will benefit as gcc 14.1 and beyond become more > mainstream. > > Clang 8.0.0 is from March 2019, so quite old already. Okay, WFM to keep things the way they are in the patch. -- Michael signature.asc Description: PGP signature
Re: Proposals for EXPLAIN: rename ANALYZE to EXECUTE and extend VERBOSE
On Wed, 6 Nov 2024 at 12:33, David G. Johnston wrote: > The main premise here is that explain should include buffers by default, and > to do so we are willing to inconvenience testers who do not want buffer data > in their test plans to have to modify their tests to explicitly exclude > buffers. We'll have to eat our own dog food here and go and add "buffers > off" throughout our code base to make this happen. I personally feel that we > should accept a patch that does so. The benefits to the many outweigh the > one-time inconveniencing of the few. Especially if limited to explain > analyze. I'm not against analyze = on turning buffers on by default. However, I think it would be quite painful to fix the tests if it were on without analyze. I tried it to see just how extensive the changes would need to be. It's not too bad. partition_prune.sql is the worst hit. 23 files changed, 171 insertions(+), 166 deletions(-) David diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index f2bcd6aa98..bf322198a2 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -11553,7 +11553,7 @@ SELECT * FROM local_tbl, async_pt WHERE local_tbl.a = async_pt.a AND local_tbl.c Filter: (async_pt_3.a = local_tbl.a) (15 rows) -EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF) +EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF, BUFFERS OFF) SELECT * FROM local_tbl, async_pt WHERE local_tbl.a = async_pt.a AND local_tbl.c = 'bar'; QUERY PLAN --- @@ -11799,7 +11799,7 @@ SELECT * FROM local_tbl t1 LEFT JOIN (SELECT *, (SELECT count(*) FROM async_pt W Remote SQL: SELECT a, b, c FROM public.base_tbl2 WHERE ((a < 3000)) (20 rows) -EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF) +EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF, BUFFERS OFF) SELECT * FROM local_tbl t1 LEFT JOIN (SELECT *, (SELECT count(*) FROM async_pt WHERE a < 3000) FROM async_pt WHERE a < 3000) t2 ON t1.a = t2.a; QUERY PLAN - @@ -11843,7 +11843,7 @@ SELECT * FROM async_pt t1 WHERE t1.b === 505 LIMIT 1; Filter: (t1_3.b === 505) (14 rows) -EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF) +EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF, BUFFERS OFF) SELECT * FROM async_pt t1 WHERE t1.b === 505 LIMIT 1; QUERY PLAN - @@ -12003,7 +12003,7 @@ DELETE FROM async_pt WHERE b = 0 RETURNING *; DELETE FROM async_p1; DELETE FROM async_p2; DELETE FROM async_p3; -EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF) +EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF, BUFFERS OFF) SELECT * FROM async_pt; QUERY PLAN - diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql index 372fe6dad1..3900522ccb 100644 --- a/contrib/postgres_fdw/sql/postgres_fdw.sql +++ b/contrib/postgres_fdw/sql/postgres_fdw.sql @@ -3904,7 +3904,7 @@ ALTER FOREIGN TABLE async_p2 OPTIONS (use_remote_estimate 'true'); EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM local_tbl, async_pt WHERE local_tbl.a = async_pt.a AND local_tbl.c = 'bar'; -EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF) +EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF, BUFFERS OFF) SELECT * FROM local_tbl, async_pt WHERE local_tbl.a = async_pt.a AND local_tbl.c = 'bar'; SELECT * FROM local_tbl, async_pt WHERE local_tbl.a = async_pt.a AND local_tbl.c = 'bar'; @@ -3979,13 +3979,13 @@ ANALYZE local_tbl; EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM local_tbl t1 LEFT JOIN (SELECT *, (SELECT count(*) FROM async_pt WHERE a < 3000) FROM async_pt WHERE 
a < 3000) t2 ON t1.a = t2.a; -EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF) +EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF, BUFFERS OFF) SELECT * FROM local_tbl t1 LEFT JOIN (SELECT *, (SELECT count(*) FROM async_pt WHERE a < 3000) FROM async_pt WHERE a < 3000) t2 ON t1.a = t2.a; SELECT * FROM local_tbl t1 LEFT JOIN (SELECT *, (SELECT count(*) FROM async_pt WHERE a < 3000) FROM async_pt WHERE a < 3000) t2 ON t1.a = t2.a; EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM async_pt t1 WHERE t1.b === 505 LIMIT 1; -EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF) +EXPLAIN (ANALYZE, COSTS OFF, SUMMARY OFF, TIMING OFF, BUFFERS OFF) SELECT * FROM async_pt t1 WHERE t1.b === 505 LIMIT 1; SELECT * FROM async_pt t1 WHERE t1.b =
Re: Pgoutput not capturing the generated columns
Hi Vignesh, Here are my review comments for patch v49-0001. == src/backend/catalog/pg_publication.c 1. check_fetch_column_list +bool +check_fetch_column_list(Publication *pub, Oid relid, MemoryContext mcxt, + Bitmapset **cols) +{ + HeapTuple cftuple = NULL; + Datum cfdatum = 0; + bool found = false; + 1a. The 'cftuple' is unconditionally assigned; the default assignment seems unnecessary. ~ 1b. The 'cfdatum' can be declared in a lower scope (in the if-block). The 'cfdatum' is unconditionally assigned; the default assignment seems unnecessary. == Kind Regards, Peter Smith. Fujitsu Australia
Re: define pg_structiszero(addr, s, r)
On Wed, 6 Nov 2024 at 13:52, Michael Paquier wrote: > > On Wed, Nov 06, 2024 at 01:38:50PM +1300, David Rowley wrote: > > We could just write it that way and leave it up to the compiler to > > decide whether to use SIMD or not. Going by [1], gcc with -O2 uses > > SIMD instructions from 14.1 and clang with -O2 does it from version > > 8.0.0. gcc 14.1 was release in May 2024, so still quite new. It'll be > > quite a bit older once PG18 is out. Using the bitwise-OR method, more > > and more people will benefit as gcc14.1 and beyond becomes more > > mainstream. > > > > Clang 8.0.0 is from March 2019, so quite old already. > > Okay, WFM to keep things the way they are in the patch. I'm not sure if I'm clear on what works for you. The latest patch I saw did 1 size_t per iteration. Are you saying we should do just size_t per loop? or we should form the code in a way that allows the compiler to use SIMD instructions? David
Re: Wrong result when enable_partitionwise_join is on if collation of PartitionKey and Column is different.
Looks good to me. I didn't find any issue.

group_by_has_partkey can even cope with:

EXPLAIN (COSTS OFF, settings)
SELECT c collate "C" collate case_insensitive collate "C", count(c)
FROM pagg_tab3
GROUP BY c collate "C" collate case_insensitive collate "C"
ORDER BY 1;

so I guess in group_by_has_partkey

if (IsA(groupexpr, RelabelType))
    groupexpr = ((RelabelType *) groupexpr)->arg;

should be enough; no need for the while loop:

while (IsA(groupexpr, RelabelType))
    groupexpr = (Expr *) (castNode(RelabelType, groupexpr))->arg;
Re: Proposals for EXPLAIN: rename ANALYZE to EXECUTE and extend VERBOSE
On Wed, 6 Nov 2024 at 13:14, Kirk Wolak wrote: > But, personally, as a relatively new person to PG, I was surprised that > VERBOSE did not include the buffers. > Could we somehow turn this on? (GUC: VERBOSE_INCLUDES_BUFFERS = yes/no)? Please read https://postgr.es/m/CA+TgmoYH_p-y=45saj58cu6jsmh6ojgqqzia2aeppvz0j+u...@mail.gmail.com David
Re: Rename Function: pg_postmaster_start_time
On Tue, Nov 05, 2024 at 12:05:11PM -0700, David G. Johnston wrote: > The ease or difficulty of making the change in the server has no meaningful > bearing on whether breaking this public API is warranted or not. This function has had this name since 600da67fbe5e back in 2008. Changing that 16 years later will break things. -- Michael signature.asc Description: PGP signature
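As a trivial illustration of the kind of thing that would break (a hypothetical but typical monitoring query, not taken from the thread):

SELECT pg_postmaster_start_time();
SELECT now() - pg_postmaster_start_time() AS uptime;

Any script or dashboard built around the existing name would have to be updated if the function were renamed.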
Re: define pg_structiszero(addr, s, r)
On Wed, 6 Nov 2024 at 13:20, Michael Paquier wrote: > > On Wed, Nov 06, 2024 at 12:16:33PM +1300, David Rowley wrote: > > On Wed, 6 Nov 2024 at 04:03, Bertrand Drouvot > > wrote: > >> Another option could be to use SIMD instructions to check multiple bytes > >> is zero in a single operation. Maybe just an idea to keep in mind and > >> experiment > >> if we feel the need later on. > > > > Could do. I just wrote it that way to give the compiler flexibility to > > do SIMD implicitly. That seemed easier than messing around with SIMD > > intrinsics. I guess the compiler won't use SIMD with the single > > size_t-at-a-time version as it can't be certain it's ok to access the > > memory beyond the first zero word. Because I wrote the "if" condition > > using bitwise-OR, there's no boolean short-circuiting, so the compiler > > sees it must be safe to access all the memory for the loop iteration. > > How complex would that be compared to the latest patch proposed if > done this way? If you can force SIMD without having to know about > these specific gcc switches (aka -march is not set by default in the > tree except for some armv8 path), then the performance happens > magically. If that makes the code more readable, that's even better. We could just write it that way and leave it up to the compiler to decide whether to use SIMD or not. Going by [1], gcc with -O2 uses SIMD instructions from 14.1 and clang with -O2 does it from version 8.0.0. gcc 14.1 was released in May 2024, so still quite new. It'll be quite a bit older once PG18 is out. Using the bitwise-OR method, more and more people will benefit as gcc 14.1 and beyond become more mainstream. Clang 8.0.0 is from March 2019, so quite old already. David [1] https://godbolt.org/z/MTqao8scW
Re: Converting contrib SQL functions to new style
Michael Paquier writes: > I was wondering what was going on here, and this patch comes down to > switching all these definitions from that: > CREATE FUNCTION lo_oid(lo) RETURNS pg_catalog.oid AS > 'SELECT $1::pg_catalog.oid' LANGUAGE SQL STRICT IMMUTABLE PARALLEL SAFE; > To that: > +CREATE OR REPLACE FUNCTION lo_oid(lo) RETURNS pg_catalog.oid > +LANGUAGE SQL STRICT IMMUTABLE PARALLEL SAFE > +RETURN (SELECT $1::pg_catalog.oid); Right. > This makes the execution more robust against run-time search_path changes. Is > that something that should be considered for a backpatch, actually? No, I don't think so. For one thing, it would not help existing installations unless they issue "ALTER EXTENSION UPDATE", which people are not likely to do in a minor update. But also, we don't know of live attacks against these functions with their current definitions, so I don't think this is urgent. regards, tom lane
Re: doc fail about ALTER TABLE ATTACH re. NO INHERIT
Hi Alvaro, On Tue, Nov 5, 2024 at 9:01 PM Alvaro Herrera wrote: > > Hello, > > While doing final review for not-null constraints, I noticed that the > ALTER TABLE ATTACH PARTITION have this phrase: > > If any of the CHECK constraints of the table being attached are marked NO > INHERIT, the command will fail; such constraints must be recreated > without the > NO INHERIT clause. > > However, this is not true and apparently has never been true. I tried > this in both master and pg10: > > create table parted (a int) partition by list (a); > create table part1 (a int , check (a > 0) no inherit); > alter table parted attach partition part1 for values in (1); > > In both versions (and I imagine all intermediate ones) that sequence > works fine and results in this table: > >Table "public.part1" > Column │ Type │ Collation │ Nullable │ Default │ Storage │ Stats target │ > Description > ┼─┼───┼──┼─┼─┼──┼─ > a │ integer │ │ │ │ plain │ │ > Partition of: parted FOR VALUES IN (1) > Partition constraint: ((a IS NOT NULL) AND (a = 1)) > Check constraints: > "part1_a_check" CHECK (a > 0) NO INHERIT > > On the other hand, if we were to throw an error in the ALTER TABLE as > the docs say, it would serve no purpose: the partition cannot have any > more descendants, so the fact that the CHECK constraint is NO INHERIT > makes no difference. So I think the code is fine and we should just fix > the docs. > > > Note that if you interpret it the other way around, i.e., that the > "table being attached" is the parent table, then the first CREATE > already fails: > > create table parted2 (a int check (a > 0) no inherit) partition by list (a); > ERROR: cannot add NO INHERIT constraint to partitioned table "parted2" > > This says that we don't need to worry about this condition in the parent > table either. > > All in all, I think this text serves no purpose and should be removed > (from all live branches), as in the attached patch. I think that text might be talking about this case: create table parent (a int, constraint check_a check (a > 0)) partition by list (a); create table part1 (a int, constraint check_a check (a > 0) no inherit); alter table parent attach partition part1 for values in (1); ERROR: constraint "check_a" conflicts with non-inherited constraint on child table "part1" which is due to this code in MergeConstraintsIntoExisting(): /* If the child constraint is "no inherit" then cannot merge */ if (child_con->connoinherit) ereport(ERROR, (errcode(ERRCODE_INVALID_OBJECT_DEFINITION), errmsg("constraint \"%s\" conflicts with non-inherited constraint on child table \"%s\"", NameStr(child_con->conname), RelationGetRelationName(child_rel; that came in with the following commit: commit 3b11247aadf857bbcbfc765191273973d9ca9dd7 Author: Alvaro Herrera Date: Mon Jan 16 19:19:42 2012 -0300 Disallow merging ONLY constraints in children tables Perhaps the ATTACH PARTITION text should be changed to make clear which case it's talking about, say, like: If any of the CHECK constraints of the table being attached are marked NO INHERIT, the command will fail if a constraint with the same name exists in the parent table; such constraints must be recreated without the NO INHERIT clause. -- Thanks, Amit Langote
Re: general purpose array_sort
On Tue, Nov 5, 2024 at 8:30 PM Junwang Zhao wrote: > > > Thanks for the bounds preserve solution, I just looked at 0002, > > + if (astate->arraystate != NULL) > + { > + memcpy(astate->arraystate->dims, dims, ndim * sizeof(int)); > + memcpy(astate->arraystate->lbs, lbs, ndim * sizeof(int)); > + Assert(ndim == astate->arraystate->ndims); > + } > > It seems to me we only need to set astate->arraystate->lbs[0] = lbs[0] ? > yes. > + memcpy(astate->arraystate->dims, dims, ndim * sizeof(int)); thinking about it, this is wrong. we should just do Assert for(int i = 0; i < ndim; i++) { Assert(astate->arraystate->dims[i] == dims[i]); } or just remove memcpy(astate->arraystate->dims, dims, ndim * sizeof(int));
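For context, the lower-bound question can be reproduced with a plain array literal that uses a non-default lower bound (ordinary array syntax, not specific to the patch):

SELECT array_lower('[2:4]={3,1,2}'::int[], 1);   -- returns 2

The open point is whether the sorted result keeps that lower bound (lbs[0]), while the dimensions themselves are expected to stay unchanged and can simply be asserted.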
Logical replication timeout
Hello,

For some unknown reason (probably a very big transaction at the source), we experienced a logical decoding breakdown, due to a timeout from the subscriber side (either wal_receiver_timeout or a connection drop by network equipment due to inactivity). The problem is that, due to that failure, the wal_receiver process stops. When the wal_sender is ready to send some data, it finds the connection broken and exits. A new wal_sender process is created that restarts from the beginning (restart LSN). This is an endless loop.

Checking the network connection between wal_sender and wal_receiver, we found that no traffic occurs for hours. We first increased wal_receiver_timeout up to 12h and still got a disconnection on the receiver side:

2024-10-17 16:31:58.645 GMT [1356203:2] user=,db=,app=,client= ERROR: terminating logical replication worker due to timeout
2024-10-17 16:31:58.648 GMT [849296:212] user=,db=,app=,client= LOG: background worker "logical replication worker" (PID 1356203) exited with exit code 1

We then set this parameter to 0, but then got disconnected by the network (note the slight difference in the message):

2024-10-21 11:45:42.867 GMT [1697787:2] user=,db=,app=,client= ERROR: could not receive data from WAL stream: could not receive data from server: Connection timed out
2024-10-21 11:45:42.869 GMT [849296:40860] user=,db=,app=,client= LOG: background worker "logical replication worker" (PID 1697787) exited with exit code 1

The message is generated in the libpqrcv_receive function (replication/libpqwalreceiver/libpqwalreceiver.c), which calls pqsecure_raw_read (interfaces/libpq/fe-secure.c). The last message "Connection timed out" is the errno translation from the recv system function: ETIMEDOUT Connection timed out (POSIX.1-2001)

When those timeouts occurred, the sender was still busy deleting files from data/pg_replslot/bdcpb21_sene, accumulating more than 6 million small ".spill" files. It seems this very long pause happens at the cleanup stage, where PG is blindly trying to delete those files. strace on the wal sender shows tons of calls like:

unlink("pg_replslot/bdcpb21_sene/xid-2721821917-lsn-439C-0.spill") = -1 ENOENT (Aucun fichier ou dossier de ce type)
unlink("pg_replslot/bdcpb21_sene/xid-2721821917-lsn-439C-100.spill") = -1 ENOENT (Aucun fichier ou dossier de ce type)
unlink("pg_replslot/bdcpb21_sene/xid-2721821917-lsn-439C-200.spill") = -1 ENOENT (Aucun fichier ou dossier de ce type)
unlink("pg_replslot/bdcpb21_sene/xid-2721821917-lsn-439C-300.spill") = -1 ENOENT (Aucun fichier ou dossier de ce type)
unlink("pg_replslot/bdcpb21_sene/xid-2721821917-lsn-439C-400.spill") = -1 ENOENT (Aucun fichier ou dossier de ce type)
unlink("pg_replslot/bdcpb21_sene/xid-2721821917-lsn-439C-500.spill") = -1 ENOENT (Aucun fichier ou dossier de ce type)

This occurs in ReorderBufferRestoreCleanup (backend/replication/logical/reorderbuffer.c). The call stack suggests this probably occurs in DecodeCommit or DecodeAbort (backend/replication/logical/decode.c):

unlink("pg_replslot/bdcpb21_sene/xid-2730444214-lsn-43A6-8800.spill") = -1 ENOENT (Aucun fichier ou dossier de ce type)
> /usr/lib64/libc-2.17.so(unlink+0x7) [0xf12e7]
> /usr/pgsql-15/bin/postgres(ReorderBufferRestoreCleanup.isra.17+0x5d) [0x769e3d]
> /usr/pgsql-15/bin/postgres(ReorderBufferCleanupTXN+0x166) [0x76aec6] <=== replication/logical/reorderbuffer.c:1480 (but this (static) function is only used within this module ...)
> /usr/pgsql-15/bin/postgres(xact_decode+0x1e7) [0x75f217] <=== replication/logical/decode.c:175
> /usr/pgsql-15/bin/postgres(LogicalDecodingProcessRecord+0x73) [0x75eee3] <=== replication/logical/decode.c:90, calls rmgr.rm_decode(ctx, &buf), one of the 6 resource manager methods
> /usr/pgsql-15/bin/postgres(XLogSendLogical+0x4e) [0x78294e]
> /usr/pgsql-15/bin/postgres(WalSndLoop+0x151) [0x785121]
> /usr/pgsql-15/bin/postgres(exec_replication_command+0xcba) [0x785f4a]
> /usr/pgsql-15/bin/postgres(PostgresMain+0xfa8) [0x7d0588]
> /usr/pgsql-15/bin/postgres(ServerLoop+0xa8a) [0x493b97]
> /usr/pgsql-15/bin/postgres(PostmasterMain+0xe6c) [0x74d66c]
> /usr/pgsql-15/bin/postgres(main+0x1c5) [0x494a05]
> /usr/lib64/libc-2.17.so(__libc_start_main+0xf4) [0x22554]
> /usr/pgsql-15/bin/postgres(_start+0x28) [0x494fb8]

We did not find any other option than deleting the subscription to stop that loop and start a new one (thus losing transactions).

The publisher is PostgreSQL 15.6. The subscriber is PostgreSQL 14.5.

Thanks
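One way to confirm that the sender keeps restarting from the same point is to watch the slot's restart_lsn between wal_sender restarts (a generic monitoring query, suggested here rather than taken from the original report):

SELECT slot_name, active, restart_lsn, confirmed_flush_lsn
FROM pg_replication_slots
WHERE slot_name = 'bdcpb21_sene';

If restart_lsn never advances, the decoder is indeed replaying the same large transaction from its beginning on every reconnection.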
Re: Pgoutput not capturing the generated columns
On Wed, Nov 6, 2024 at 11:35 AM Peter Smith wrote: > > I am observing some unexpected errors with the following scenario. > You are getting an expected ERROR. It is because of the design of logical decoding which relies on historic snapshots. > == > Tables: > > Publisher table: > test_pub=# create table t1 (a int, b int GENERATED ALWAYS AS (a * 2) STORED); > CREATE TABLE > test_pub=# insert into t1 values (1); > INSERT 0 1 > > ~ > > And Subscriber table: > test_sub=# create table t1(a int, b int); > CREATE TABLE > > == > TEST PART 1. > > I create 2 publications, having different parameter values. > > test_pub=# create publication pub1 for table t1 with > (publish_generated_columns=true); > CREATE PUBLICATION > test_pub=# create publication pub2 for table t1 with > (publish_generated_columns=false); > CREATE PUBLICATION > > ~ > > And I try creating a subscription simultaneously subscribing to both > of these publications. This fails with an expected error. > > test_sub=# create subscription sub1 connection 'dbname=test_pub' > publication pub1, pub2; > ERROR: cannot use different column lists for table "public.t1" in > different publications > > == > TEST PART 2. > > Now on publisher set parameter for pub2 to be true; > > test_pub=# alter publication pub2 set (publish_generated_columns); > ALTER PUBLICATION > test_pub=# \dRp+ > Publication pub1 > Owner | All tables | Inserts | Updates | Deletes | Truncates | Via > root | Genera > ted columns > --++-+-+-+---+--+--- > > postgres | f | t | t | t | t | f| > t > Tables: > "public.t1" > > Publication pub2 > Owner | All tables | Inserts | Updates | Deletes | Truncates | Via > root | Genera > ted columns > --++-+-+-+---+--+--- > > postgres | f | t | t | t | t | f| > t > Tables: > "public.t1" > > ~ > > Now the create subscriber works OK. > > test_sub=# create subscription sub1 connection 'dbname=test_pub' > publication pub1,pub2; > NOTICE: created replication slot "sub1" on publisher > CREATE SUBSCRIPTION > > == > TEST PART 3. > > Now on Publisher let's alter that parameter back to false again... > > test_pub=# alter publication pub2 set (publish_generated_columns=false); > ALTER PUBLICATION > > And insert some data. > > test_pub=# insert into t1 values (2); > INSERT 0 1 > > ~ > > Now the subscriber starts failing again... > > ERROR: cannot use different values of publish_generated_columns for > table "public.t1" in different publications > etc... > > == > TEST PART 4. > > Finally, on the Publisher alter that parameter back to true again! > > test_pub=# alter publication pub2 set (publish_generated_columns); > ALTER PUBLICATION ... > > > ~~ > > Unfortunately, even though the publication parameters are the same > again, the subscription seems to continue forever failing > > ERROR: cannot use different values of publish_generated_columns for > table "public.t1" in different publications > The reason is that the failing 'insert' uses a historic snapshot, which has a catalog state where 'publish_generated_columns' is still false. So, you are seeing that error repeatedly. This behavior exists from the very beginning of logical replication and another issue due to the same reason was reported recently [1] which is actually a setup issue. We should improve this situation some day but it is not the responsibility of this patch. [1] - https://www.postgresql.org/message-id/18683-a98f79c0673be358%40postgresql.org -- With Regards, Amit Kapila.
Re: Pgoutput not capturing the generated columns
On Wed, Nov 6, 2024 at 7:34 AM Peter Smith wrote: > > Hi Vignesh, > > Here are my review comments for patch v49-0001. >

I have a question on the display of this new parameter.

postgres=# \dRp+
                                    Publication pub_gen
  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root | Generated columns
----------+------------+---------+---------+---------+-----------+----------+-------------------
 KapilaAm | f          | t       | t       | t       | t         | f        | t
Tables:
    "public.test_gen"

The current theory for the display of the "Generated Columns" option could be that new parameters are added at the end, which sounds reasonable. The other way to look at it is what would be easiest for users to interpret. I think the value of the "Via root" option should be either after "All tables" or at the end, as that is higher-level table information than operations or column-level information. Since "Via root" is currently at the end, "Generated Columns" should be added before it.

Thoughts? -- With Regards, Amit Kapila.
Building Postgres 17.0 with meson
I'm trying to build Postgres 17.0 and am learning meson as I go. The build setup command I have so far is:

meson setup build --prefix=%prefix% --buildtype=release -Dssl=auto -Dzlib=auto -Dicu=auto

but I believe that cmd expects ssl, zlib, and icu to be installed in default locations. How do I specify to meson that I want it to use the versions of those 3 software packages that I have built, e.g. the openssl I want it to use is located in D:\Postgres-Builds\OpenSSL\OpenSSL-Install\OpenSSL-3.1.6-wx6 and similar locations for icu and zlib? Thanks, Mark
Re: define pg_structiszero(addr, s, r)
On Wed, Nov 06, 2024 at 03:53:48PM +1300, David Rowley wrote: > I'm not sure if I'm clear on what works for you. The latest patch I > saw did 1 size_t per iteration. Are you saying we should do just > size_t per loop? or we should form the code in a way that allows the > compiler to use SIMD instructions? Oh, sorry. I thought that you wanted to keep the size_t checks as done in [1] anyway. But your suggestion is different, and instructions like xmmword do show up once you group the checks 8 at a time:

bool
pg_memory_is_all_zeros_size_t_times_8(const void *ptr, size_t len)
{
    const char *p = (const char *) ptr;
    const char *end = &p[len];
    const char *aligned_end = (const char *) ((uintptr_t) end & (~(sizeof(size_t) - 1)));

    while (((uintptr_t) p & (sizeof(size_t) - 1)) != 0)
    {
        if (p == end)
            return true;
        if (*p++ != 0)
            return false;
    }

    for (; p < aligned_end - (sizeof(size_t) * 7); p += sizeof(size_t) * 8)
    {
        if (((size_t *) p)[0] != 0 |
            ((size_t *) p)[1] != 0 |
            ((size_t *) p)[2] != 0 |
            ((size_t *) p)[3] != 0 |
            ((size_t *) p)[4] != 0 |
            ((size_t *) p)[5] != 0 |
            ((size_t *) p)[6] != 0 |
            ((size_t *) p)[7] != 0)
            return false;
    }

    while (p < end)
    {
        if (*p++ != 0)
            return false;
    }

    return true;
}

That's smart for large areas to cover. The patch should document why we are doing it this way. This should have a few more parentheses in the second loop, or -Wparentheses would complain. Should the last loop check only 1 byte at a time, or should this include one more step before the last one you wrote to do a couple of checks with size_t? That may matter for areas small enough (len < sizeof(size_t) * 8) that the second step is not taken, but large enough (len > sizeof(size_t)) to apply a couple of size_t checks per loop. [1]: https://www.postgresql.org/message-id/zym0uvnm+fswq...@ip-10-97-1-34.eu-west-3.compute.internal -- Michael signature.asc Description: PGP signature
Re: Pgoutput not capturing the generated columns
On Wed, Nov 6, 2024 at 3:26 PM Amit Kapila wrote: > > On Wed, Nov 6, 2024 at 7:34 AM Peter Smith wrote: > > > > Hi Vignesh, > > > > Here are my review comments for patch v49-0001. > > > > I have a question on the display of this new parameter. > > postgres=# \dRp+ > Publication pub_gen > Owner | All tables | Inserts | Updates | Deletes | Truncates | Via > root | Generated columns > --++-+-+-+---+--+--- > KapilaAm | f | t | t | t | t | f| > t > Tables: > "public.test_gen" > > The current theory for the display of the "Generated Columns" option > could be that let's add new parameters at the end which sounds > reasonable. The other way to look at it is how it would be easier for > users to interpret. I think the value of the "Via root" option should > be either after "All tables" or at the end as that is higher level > table information than operations or column-level information. As > currently, it is at the end, so "Generated Columns" should be added > before. > > Thoughts? > FWIW, I've always felt the CREATE PUBLICATION parameters publish publish_via_root publish_generated_columns Should be documented (e.g. on CREATE PUBLICATION page) in alphabetical order: publish publish_generated_columns publish_via_root ~ Following on from that. IMO it will make sense for the describe (\dRp+) columns for those parameters to be in the same order as the parameters in the documentation. So the end result would be the same order as what you are wanting, even though the reason might be different. == Kind Regards, Peter Smith. Fujitsu Australia.
Re: Eager aggregation, take 3
On Thu, Aug 29, 2024 at 10:26 AM Richard Guo wrote: > > > 2. I think there might be techniques we could use to limit planning > > effort at an earlier stage when the approach doesn't appear promising. > > For example, if the proposed grouping column is already unique, the > > exercise is pointless (I think). Ideally we'd like to detect that > > without even creating the grouped_rel. But the proposed grouping > > column might also be *mostly* unique. For example, consider a table > > with a million rows and a column 500,000 distinct values. I suspect it > > will be difficult for partial aggregation to work out to a win in a > > case like this, because I think that the cost of performing the > > partial aggregation will not reduce the cost either of the final > > aggregation or of the intervening join steps by enough to compensate. > > It would be best to find a way to avoid generating a lot of rels and > > paths in cases where there's really not much hope of a win. > > > > One could, perhaps, imagine going further with this by postponing > > eager aggregation planning until after regular paths have been built, > > so that we have good cardinality estimates. Suppose the query joins a > > single fact table to a series of dimension tables. The final plan thus > > uses the fact table as the driving table and joins to the dimension > > tables one by one. Do we really need to consider partial aggregation > > at every level? Perhaps just where there's been a significant row > > count reduction since the last time we tried it, but at the next level > > the row count will increase again? > > > > Maybe there are other heuristics we could use in addition or instead. > > Yeah, one of my concerns with this work is that it can use > significantly more CPU time and memory during planning once enabled. > It would be great if we have some efficient heuristics to limit the > effort. I'll work on that next and see what happens. >

In v13, the latest version, we can do:

/* ... and initialize these targets */
if (!init_grouping_targets(root, rel, target, agg_input, &group_clauses, &group_exprs))
    return NULL;

if (rel->reloptkind == RELOPT_BASEREL && group_exprs != NIL)
{
    foreach_node(Var, var, group_exprs)
    {
        if (var->varno == rel->relid && has_unique_index(rel, var->varattno))
            return NULL;
    }
}

since in init_grouping_targets we already Asserted that group_exprs is a list of Vars.

Also, in create_rel_agg_info, estimate_num_groups:

result->group_exprs = group_exprs;
result->grouped_rows = estimate_num_groups(root, result->group_exprs, rel->rows, NULL, NULL);

/*
 * The grouped paths for the given relation are considered useful iff
 * the row reduction ratio is greater than EAGER_AGGREGATE_RATIO.
 */
agg_info->agg_useful = (agg_info->grouped_rows <= rel->rows * (1 - EAGER_AGGREGATE_RATIO));

If there are too many Vars associated with group_exprs, then result->grouped_rows will be less accurate, and therefore agg_info->agg_useful will be less accurate. Should we limit the number of Vars associated with group_exprs?
for example: SET enable_eager_aggregate TO on; drop table if exists eager_agg_t1,eager_agg_t2, eager_agg_t3; CREATE TABLE eager_agg_t1 (a int, b int, c double precision); CREATE TABLE eager_agg_t2 (a int, b int, c double precision); INSERT INTO eager_agg_t1 SELECT i % 100, i, i FROM generate_series(1, 5)i; INSERT INTO eager_agg_t2 SELECT i % 10, i, i FROM generate_series(1, 5)i; INSERT INTO eager_agg_t2 SELECT i % 10, i, i FROM generate_series(-4, -2)i; explain(costs off, verbose, settings) SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON abs(t1.b) = abs(t2.b % 10 + t2.a) group by 1; explain(costs off, verbose, settings) SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON abs(t1.b) = abs(t2.b % 10 + t2.a) group by 1; QUERY PLAN -- Finalize HashAggregate Output: t1.a, avg(t2.c) Group Key: t1.a -> Merge Join Output: t1.a, (PARTIAL avg(t2.c)) Merge Cond: ((abs(((t2.b % 10) + t2.a))) = (abs(t1.b))) -> Sort Output: t2.b, t2.a, (PARTIAL avg(t2.c)), (abs(((t2.b % 10) + t2.a))) Sort Key: (abs(((t2.b % 10) + t2.a))) -> Partial HashAggregate Output: t2.b, t2.a, PARTIAL avg(t2.c), abs(((t2.b % 10) + t2.a)) Group Key: t2.b, t2.a -> Seq Scan on public.eager_agg_t2 t2 Output: t2.a, t2.b, t2.c -> Sort Output: t1.a, t1.b, (abs(t1.b)) Sort Key: (abs(t1.b))
Re: Pgoutput not capturing the generated columns
On Wed, Nov 6, 2024 at 10:26 AM Peter Smith wrote: > > On Wed, Nov 6, 2024 at 3:26 PM Amit Kapila wrote: > > > > On Wed, Nov 6, 2024 at 7:34 AM Peter Smith wrote: > > > > > > Hi Vignesh, > > > > > > Here are my review comments for patch v49-0001. > > > > > > > I have a question on the display of this new parameter. > > > > postgres=# \dRp+ > > Publication pub_gen > > Owner | All tables | Inserts | Updates | Deletes | Truncates | Via > > root | Generated columns > > --++-+-+-+---+--+--- > > KapilaAm | f | t | t | t | t | f > > | t > > Tables: > > "public.test_gen" > > > > The current theory for the display of the "Generated Columns" option > > could be that let's add new parameters at the end which sounds > > reasonable. The other way to look at it is how it would be easier for > > users to interpret. I think the value of the "Via root" option should > > be either after "All tables" or at the end as that is higher level > > table information than operations or column-level information. As > > currently, it is at the end, so "Generated Columns" should be added > > before. > > > > Thoughts? > > > > FWIW, I've always felt the CREATE PUBLICATION parameters > publish > publish_via_root > publish_generated_columns > > Should be documented (e.g. on CREATE PUBLICATION page) in alphabetical order: > publish > publish_generated_columns > publish_via_root > > ~ > > Following on from that. IMO it will make sense for the describe > (\dRp+) columns for those parameters to be in the same order as the > parameters in the documentation. So the end result would be the same > order as what you are wanting, even though the reason might be > different. > Sounds reasonable to me. I have made some minor comments and function name changes in the attached. Please include in the next version. -- With Regards, Amit Kapila. diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c index 92a5fa65e0..6dcb399ee3 100644 --- a/src/backend/catalog/pg_publication.c +++ b/src/backend/catalog/pg_publication.c @@ -261,15 +261,14 @@ is_schema_publication(Oid pubid) * publication, false otherwise. * * If a column list is found, the corresponding bitmap is returned through the - * cols parameter (if provided) and is constructed within the specified memory - * context (mcxt). + * cols parameter, if provided. The bitmap is constructed within the given + * memory context (mcxt). */ bool -check_fetch_column_list(Publication *pub, Oid relid, MemoryContext mcxt, +check_and_fetch_column_list(Publication *pub, Oid relid, MemoryContext mcxt, Bitmapset **cols) { - HeapTuple cftuple = NULL; - Datum cfdatum = 0; + HeapTuple cftuple; boolfound = false; if (pub->alltables) @@ -280,6 +279,7 @@ check_fetch_column_list(Publication *pub, Oid relid, MemoryContext mcxt, ObjectIdGetDatum(pub->oid)); if (HeapTupleIsValid(cftuple)) { + Datum cfdatum; boolisnull; /* Lookup the column list attribute. */ @@ -289,7 +289,7 @@ check_fetch_column_list(Publication *pub, Oid relid, MemoryContext mcxt, /* Was a column list found? */ if (!isnull) { - /* Build the column list bitmap in the mcxt context. */ + /* Build the column list bitmap in the given memory context. 
*/ if (cols) *cols = pub_collist_to_bitmapset(*cols, cfdatum, mcxt); diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c index 1a7b9dee10..32e1134b54 100644 --- a/src/backend/replication/pgoutput/pgoutput.c +++ b/src/backend/replication/pgoutput/pgoutput.c @@ -1051,11 +1051,11 @@ check_and_init_gencol(PGOutputData *data, List *publications, foreach_ptr(Publication, pub, publications) { /* -* The column list takes precedence over 'publish_generated_columns' -* parameter. Those will be checked later, see -* pgoutput_column_list_init. +* The column list takes precedence over the +* 'publish_generated_columns' parameter. Those will be checked later, +* seepgoutput_column_list_init. */ - if (check_fetch_column_list(pub, entry->publish_as_relid, NULL, NULL)) + if (check_and_fetch_column_list(pub, entry->publish_as_relid, NULL, NULL)) continue; if (first) @@ -1106,9 +1106,9 @@ pgoutput_column_list_init(PGOutputData *data, List *publicati
Re: Building Postgres 17.0 with meson
On Wed, 06 Nov 2024 09:59:37 +0530 Mark Hill wrote ---

I'm trying to build Postgres 17.0 and am learning meson as I go. The build setup
command I have so far is:

    meson setup build --prefix=%prefix% --buildtype=release -Dssl=auto -Dzlib=auto -Dicu=auto

but I believe that command expects ssl, zlib, and icu to be installed in default
locations. How do I tell meson to use the versions of those three packages that I
have built myself? For example, the OpenSSL I want it to use is located in
D:\Postgres-Builds\OpenSSL\OpenSSL-Install\OpenSSL-3.1.6-wx6, with ICU and zlib in
similar locations.

Thanks, Mark

Specify the library paths with "-Dextra_lib_dirs" during meson setup, e.g.:

    meson setup build --prefix=%prefix% --buildtype=release -Dextra_lib_dirs=D:\Postgres-Builds\OpenSSL\OpenSSL-Install\OpenSSL-3.1.6-wx6

https://www.postgresql.org/docs/current/install-meson.html#CONFIGURE-EXTRA-LIB-DIRS-MESON

Regards,
Srinath Reddy Sadipiralla
Member Technical Staff
ZOHO
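If meson also needs to find the headers of the custom builds, a fuller invocation might look like the following. This is only a sketch: the ICU and zlib paths are hypothetical, and each entry should point at the include\ or lib\ directory of the corresponding build; both options take comma-separated lists of directories:

    meson setup build --prefix=%prefix% --buildtype=release ^
      -Dssl=auto -Dzlib=auto -Dicu=auto ^
      -Dextra_include_dirs=D:\Postgres-Builds\OpenSSL\OpenSSL-Install\OpenSSL-3.1.6-wx6\include,D:\Postgres-Builds\ICU\icu-install\include,D:\Postgres-Builds\zlib\zlib-install\include ^
      -Dextra_lib_dirs=D:\Postgres-Builds\OpenSSL\OpenSSL-Install\OpenSSL-3.1.6-wx6\lib,D:\Postgres-Builds\ICU\icu-install\lib,D:\Postgres-Builds\zlib\zlib-install\lib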
Re: [PATCH] pg_stat_activity: make slow/hanging authentication more visible
On Fri, Nov 01, 2024 at 02:47:38PM -0700, Jacob Champion wrote:
> On Sun, Sep 1, 2024 at 5:10 PM Michael Paquier wrote:
>> Could it be more transparent to use a "startup" or
>> "starting" state instead that gets also used by pgstat_bestart() in
>> the case of the patch where !pre_auth?
>
> Done. (I've used "starting".)

0003 looks much cleaner this way.

> Added a new "Auth" class (to cover both authn and authz during
> startup), plus documentation.

+PAM_ACCT_MGMT	"Waiting for the local PAM service to validate the user account."
+PAM_AUTHENTICATE	"Waiting for the local PAM service to authenticate the user."

Is "local" required for both? Perhaps just use "the PAM service".

+SSPI_LOOKUP_ACCOUNT_SID	"Waiting for Windows to find the user's account SID."

We don't document SID in doc/. So perhaps this should be "SID (system
identifier)".

+SSPI_MAKE_UPN	"Waiting for Windows to translate a Kerberos UPN."

UPN is mentioned once in doc/ already. Perhaps this one is OK left alone.

Except for these tweaks, 0004 looks OK.

> The more I look at this, the more uneasy I feel about the goal. Best I
> can tell, the pre-auth step can't ignore irrelevant fields, because
> they may contain junk from the previous owner of the shared memory. So
> if we want to optimize, we can only change the second step to skip
> fields that were already filled in by the pre-auth step.
>
> That has its own problems: not every backend type uses the pre-auth
> step in the current patch. Which means a bunch of backends that don't
> benefit from the two-step initialization nevertheless have to either
> do two PGSTAT_BEGIN_WRITE_ACTIVITY() dances in a row, or else we
> duplicate a bunch of the logic to make sure they maintain the same
> efficient code path as before.
>
> Finally, if we're okay with all of that, future maintainers need to be
> careful about which fields get copied in the first (preauth) step, the
> second step, or both. GSS, for example, can be set up during transport
> negotiation (first step) or authentication (second step), so we have
> to duplicate the logic there. SSL is currently first-step-only, I
> think -- but are we sure we want to hardcode the assumption that cert
> auth can't change any of those parameters after the transport has been
> established? (I've been brainstorming ways we might use TLS 1.3's
> post-handshake CertificateRequest, for example.)

The point about future field maintenance, and what one would need to
think about down the road, is a good one. I still feel slightly uneasy
about the way 0003 is shaped with its new pgstat_bestart_pre_auth(),
but I think I'm just going to put my hands down on 0003 and see if I
can finish with something I'm a bit more comfortable with. Let's see...

> So before I commit to this path, I just want to double-check that all
> of the above sounds good and non-controversial. :)

The goal of the thread is sound. I'm OK with 0002, which adds the wait
parameter to BackgroundPsql so that some actions can be taken before a
manual wait_connect(). I'll go do this one. Also perhaps 0001 while at
it, but I am a bit puzzled by the removal of the three ok() calls in
037_invalid_database.pl.
--
Michael
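For anyone wanting to see the behavior under discussion, a hedged sketch of how a hung authentication attempt could be observed from a second session. It assumes the patch's proposed "starting" backend state and its new Auth wait-event class; the exact wait-event display names would come from the patch's wait_event_names.txt entries:

    -- Run from another connection while a client is stuck during authentication.
    SELECT pid, state, wait_event_type, wait_event
    FROM pg_stat_activity
    WHERE state = 'starting';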
Re: Converting contrib SQL functions to new style
On Tue, Nov 05, 2024 at 08:05:16PM -0500, Tom Lane wrote:
> No, I don't think so. For one thing, it would not help existing
> installations unless they issue "ALTER EXTENSION UPDATE", which
> people are not likely to do in a minor update. But also, we don't
> know of live attacks against these functions with their current
> definitions, so I don't think this is urgent.

OK, thanks.
--
Michael