Re: Large expressions in indexes can't be stored (non-TOASTable)

2024-10-14 Thread Nathan Bossart
On Tue, Oct 08, 2024 at 01:50:52PM -0500, Nathan Bossart wrote:
> 0002 could probably use some more commentary, but otherwise I think it is
> in decent shape.  You (Michael) seem to be implying that I should
> back-patch the actual fixes and only apply the new assertions to v18.  Am I
> understanding you correctly?

Here is a reorganized patch set.  0001 would be back-patched, but the
others would only be applied to v18.

-- 
nathan
>From 6792eeb656e8fc707d85dc07595e2183853d2ce5 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Mon, 14 Oct 2024 14:43:26 -0500
Subject: [PATCH v3 1/3] Ensure we have a snapshot when updating various system
 catalogs.

Reported-by: Alexander Lakhin
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org
Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
Backpatch-through: FIXME
---
 src/backend/commands/tablecmds.c            | 13 +++
 src/backend/replication/logical/tablesync.c | 21 ++
 src/backend/replication/logical/worker.c    | 24 +
 3 files changed, 58 insertions(+)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 1ccc80087c..c6f82e93b9 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -19283,9 +19283,22 @@ ATExecDetachPartition(List **wqueue, AlteredTableInfo *tab, Relation rel,
tab->rel = rel;
}
 
+   /*
+    * Detaching the partition might involve TOAST table access, so ensure we
+* have a valid snapshot.  We only expect to get here without a snapshot
+* in the concurrent path.
+*/
+   if (concurrent)
+   PushActiveSnapshot(GetTransactionSnapshot());
+   else
+   Assert(HaveRegisteredOrActiveSnapshot());
+
/* Do the final part of detaching */
DetachPartitionFinalize(rel, partRel, concurrent, defaultPartOid);
 
+   if (concurrent)
+   PopActiveSnapshot();
+
	ObjectAddressSet(address, RelationRelationId, RelationGetRelid(partRel));
 
/* keep our lock until commit */
diff --git a/src/backend/replication/logical/tablesync.c b/src/backend/replication/logical/tablesync.c
index e03e761392..d3fbb18438 100644
--- a/src/backend/replication/logical/tablesync.c
+++ b/src/backend/replication/logical/tablesync.c
@@ -377,6 +377,12 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
replorigin_session_origin_lsn = InvalidXLogRecPtr;
replorigin_session_origin_timestamp = 0;
 
+   /*
+    * Updating pg_replication_origin might involve TOAST table access, so
+* ensure we have a valid snapshot.
+*/
+   PushActiveSnapshot(GetTransactionSnapshot());
+
/*
 * Drop the tablesync's origin tracking if exists.
 *
@@ -387,6 +393,8 @@ process_syncing_tables_for_sync(XLogRecPtr current_lsn)
 */
replorigin_drop_by_name(originname, true, false);
 
+   PopActiveSnapshot();
+
finish_sync_worker();
}
else
@@ -483,6 +491,12 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)
started_tx = true;
}
 
+   /*
+    * Updating pg_replication_origin might involve TOAST table
+* access, so ensure we have a valid snapshot.
+*/
+   PushActiveSnapshot(GetTransactionSnapshot());
+
/*
	 * Remove the tablesync origin tracking if exists.
 *
@@ -500,6 +514,8 @@ process_syncing_tables_for_apply(XLogRecPtr current_lsn)

											   sizeof(originname));
			replorigin_drop_by_name(originname, true, false);
 
+   PopActiveSnapshot();
+
/*
	 * Update the state to READY only after the origin cleanup.
 */
@@ -1455,8 +1471,13 @@ LogicalRepSyncTableStart(XLogRecPtr *origin_startpos)
	 * Then advance to the LSN got from walrcv_create_slot. This is WAL
 * logged for the purpose of recovery. Locks are to prevent the
 * replication origin from vanishing while advancing.
+*
+    * Updating pg_replication_origin might involve TOAST table access, so
+* ensure we have a valid snapshot.
 */
+   PushActiveSnapshot(GetTransactionSnapshot());
ori

Re: Changing the default random_page_cost value

2024-10-14 Thread Bruce Momjian
On Mon, Sep 30, 2024 at 10:05:29AM -0400, Greg Sabino Mullane wrote:
> On Fri, Sep 27, 2024 at 12:03 PM Dagfinn Ilmari Mannsåker wrote:
> 
> It might also be worth mentioning cloudy block storage (e.g. AWS' EBS),
> which is typically backed by SSDs, but has extra network latency.
> 
> 
> That seems a little too in the weeds for me, but wording suggestions are
> welcome. To get things moving forward, I made a doc patch which changes a few
> things, namely:
> 
> * Mentions the distinction between ssd and hdd right up front.
> * Moves the tablespace talk to the very end, as tablespace use is getting
> rarer (again, thanks in part to ssds)
> * Mentions the capability to set per-database and per-role since we mention
> per-tablespace.
> * Removes a lot of the talk of caches and justifications for the 4.0 setting.
> While those are interesting, I've been tuning this parameter for many years
> and never really cared about the "90% cache rate". The proof is in the
> pudding: rpc is the canonical "try it and see" parameter. Tweak. Test. Repeat.

I am not a fan of this patch.  I don't see why _removing_ the magnetic
part helps, because you are then left with no rationale for why 1.2 was
chosen.  I would put the magnetic in a separate paragraph perhaps, and
recommend 4.0 for it.  Also, per-tablespace makes sense because of
physical media differences, but what purpose would per-database and
per-role serve?  Also, per-tablespace is not a connection-activated item
like the other two.
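
For reference, the scoping mechanisms under discussion, with made-up
object names -- the per-database and per-role settings take effect at
connection time, while the per-tablespace setting is a property of the
tablespace itself:

    ALTER TABLESPACE fast_ssd SET (random_page_cost = 1.1);
    ALTER DATABASE analytics SET random_page_cost = 1.1;
    ALTER ROLE reporting SET random_page_cost = 1.1;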

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  When a patient asks the doctor, "Am I going to die?", he means 
  "Am I going to die soon?"




Re: Statistics Import and Export

2024-10-14 Thread Corey Huinker
>
> However, this function seems to accept -1 for the relpages parameter.
> Below is an example of execution:
> ---
> postgres=> CREATE TABLE data1(c1 INT PRIMARY KEY, c2 VARCHAR(10));
> CREATE TABLE
> postgres=> SELECT pg_set_relation_stats('data1', relpages=>-1);
>  pg_set_relation_stats
> ---
>  t
> (1 row)
> postgres=> SELECT relname, relpages FROM pg_class WHERE relname='data1';
>  relname | relpages
> -+--
>  data1   |   -1
> (1 row)
> ---
>
> The attached patch modifies the pg_set_relation_stats function to work as
> described in the manual.
>
> Regards,
> Noriyoshi Shinoda
>

Accepting -1 is correct. I thought I had fixed that in a recent patch.
Perhaps signals got crossed somewhere along the way.


Re: Logging parallel worker draught

2024-10-14 Thread Alena Rybakina

Sorry, I was in a hurry and didn't check my patch properly.

On 14.10.2024 23:20, Alena Rybakina wrote:

On 28.08.2024 15:58, Benoit Lobréau wrote:

Hi,

Here is a new version of the patch. Sorry for the long delay, I was 
hit by a motivation drought and was quite busy otherwise.


The guc is now called `log_parallel_workers` and has three possible 
values:


* "none": disables logging
* "all": logs parallel worker info for all parallel queries or utilities
* "failure": logs only when the number of parallel workers planned 
couldn't be reached.


For this, I added several members to the EState struct.

Each gather node / gather merge node updates the values and the 
offending queries are displayed during standard_ExecutorEnd.


For CREATE INDEX / REINDEX on btree and brin, I check the parallel 
context struct (pcxt) during _bt_end_parallel() or 
_brin_end_parallel() and display a log message when needed.


For vacuum, I do the same in parallel_vacuum_end().

I added some information to the error message for parallel queries as
an experiment. I find it useful, but it can be removed if you're not
convinced.


2024-08-27 15:59:11.089 CEST [54585] LOG:  1 parallel nodes planned 
(1 obtained all their workers, 0 obtained none), 2 workers planned (2 
workers launched)

2024-08-27 15:59:11.089 CEST [54585] STATEMENT:  EXPLAIN (ANALYZE)
    SELECT i, avg(j) FROM test_pql GROUP BY i;

2024-08-27 15:59:14.006 CEST [54585] LOG:  2 parallel nodes planned 
(0 obtained all their workers, 1 obtained none), 4 workers planned (1 
workers launched)

2024-08-27 15:59:14.006 CEST [54585] STATEMENT:  EXPLAIN (ANALYZE)
    SELECT i, avg(j) FROM test_pql GROUP BY i
    UNION
    SELECT i, avg(j) FROM test_pql GROUP BY i;

For CREATE INDEX / REINDEX / VACUUM:

2024-08-27 15:58:59.769 CEST [54521] LOG:  1 workers planned (0 
workers launched)

2024-08-27 15:58:59.769 CEST [54521] STATEMENT:  REINDEX TABLE test_pql;

Do you think this is better?

I am not sure if a struct is needed to store the es_nworkers* and if 
the modification I did to parallel.h is ok.


Thanks to: Jehan-Guillaume de Rorthais, Guillaume Lelarge and Franck 
Boudehen for the help and motivation boost. 


Hi!

I think it's a good idea to log this. I suggest some changes that 
might improve your patch.



2. I think you can move the nworkers_to_launch and nworkers_launched
vacuum parameters into the ParallelContext struct.


Sorry, I didn’t notice right away that the second point is impossible to 
fulfill here.



Attached is the correct version of the patch.

--
Regards,
Alena Rybakina
Postgres Professional
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5c04a27eb66..1e2fadffaf1 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7704,8 +7704,8 @@ log_line_prefix = '%m [%p] %q%u@%d/%a '
   
   

-Controls whether a log message is produced to display information on the number of
-workers spawned when a parallel query or utility is executed. The default value is
+Controls whether a log message about the number of workers produced during an
+execution of the parallel query or utility statement. The default value is
 none which disables logging. all displays
 information for all parallel queries, whereas failures displays
 information only when the number of workers launched is lower than the number of
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index fdfbbc7dc42..402be20e27c 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -2544,10 +2544,9 @@ _brin_end_parallel(BrinLeader *brinleader, BrinBuildState *state)
/* Shutdown worker processes */
WaitForParallelWorkersToFinish(brinleader->pcxt);
 
-   if ((log_parallel_workers == LOG_PARALLEL_WORKERS_ALL &&
-        brinleader->pcxt->nworkers_to_launch > 0) ||
-       (log_parallel_workers == LOG_PARALLEL_WORKERS_FAILURE &&
-        brinleader->pcxt->nworkers_to_launch > brinleader->pcxt->nworkers_launched))
+   if (LoggingParallelWorkers(log_parallel_workers,
+                              brinleader->pcxt->nworkers_to_launch,
+                              brinleader->pcxt->nworkers_launched))
elog(LOG, "%i workers planned (%i workers launched)",
brinleader->pcxt->nworkers_to_launch,
brinleader->pcxt->nworkers_launched);
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index e31bd8e223e..1ca027d2727 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -1615,10 +1615,9 @@ _bt_end_parallel(BTLeader *btleader)
/* Shutdown worker processes */
WaitForParallelWorkersToFinish(btleader->pcxt);
 
-   if ((log_parallel_workers == LOG_PARALLEL_WORKERS_ALL &

Re: Doc: typo in config.sgml

2024-10-14 Thread Bruce Momjian
On Mon, Oct 14, 2024 at 03:05:35PM -0400, Bruce Momjian wrote:
> I did some more research and we were able to clarify our behavior in
> release.sgml:

I have specified some more details in my patched version:

We can only use Latin1 characters, not all UTF8 characters,
because some rendering engines do not support non-Latin1 UTF8
characters.  Specifically, the HTML rendering engine can display
all UTF8 characters, but the PDF rendering engine can only display
Latin1 characters.  In PDF files, non-Latin1 UTF8 characters are
displayed as "###".

In the SGML files we encode non-ASCII Latin1 characters as HTML
entities, e.g., Álvaro.  Oddly, it is possible to safely
represent Latin1 characters in SGML files as UTF8 for HTML and
PDF output, but we currently disallow this via the Makefile
"check-non-ascii" rule.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  When a patient asks the doctor, "Am I going to die?", he means 
  "Am I going to die soon?"




Re: Make foreach_ptr macro work in C++ extensions

2024-10-14 Thread Bruce Momjian
On Mon, Oct 14, 2024 at 03:29:34PM -0500, Nathan Bossart wrote:
> > I think it would be good to backpatch this to PG17 as well,
> > so as not to introduce differences in the macro across versions.
> 
> I'd consider this a "low-risk" fix, which our versioning page [0] says is
> acceptable for a minor release, so I'll plan on back-patching unless
> someone objects.

Agreed.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  When a patient asks the doctor, "Am I going to die?", he means 
  "Am I going to die soon?"




Re: Improve node type forward reference

2024-10-14 Thread Nathan Bossart
On Mon, Oct 14, 2024 at 09:47:59AM +0200, Peter Eisentraut wrote:
> But we can do this better by using an incomplete struct, like
> 
> struct Query *viewQuery ...;
> 
> That way, everything has the correct type and fewer casts are required. This
> technique is already used elsewhere in node type definitions.

I noticed that the examples in parsenodes.h are for structs defined within
the same file.  If the struct is defined in a separate file, I guess you
might need to include another header file wherever it is used, but that
doesn't seem too bad.
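
A minimal sketch of the technique, with a hypothetical containing struct
(only the viewQuery field comes from the quoted example):

    /* forward declaration: the full definition lives in another header */
    struct Query;

    typedef struct SomeNode
    {
        NodeTag     type;
        struct Query *viewQuery;    /* usable as a pointer without the
                                     * full definition being visible */
    } SomeNode;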

> The second patch just removes some more unnecessary casts around
> copyObject() that I found while working on this.

LGTM

-- 
nathan




Re: Consider the number of columns in the sort cost model

2024-10-14 Thread Alena Rybakina

Hi!

On 14.10.2024 22:59, Kirill Reshke wrote:

The patch in the attachment is a simplistic attempt to differentiate
these sortings by the number of columns. It is not ideal because
comparisons depend on the duplicates in each column, but it may be the
subject of further work.

Even such a trivial model allows us to resolve the case like the below:
CREATE INDEX ON test (x1,x2,x3,x8);
EXPLAIN (ANALYZE) SELECT * FROM test WHERE x1<9E4 ORDER BY x1,x2,x3,x8;

The current master will choose SeqScan for four columns as well as for a
single column. With the model, the optimiser will understand that
sorting four columns costs more than sorting a single column and switch
to IndexScan.

--
regards, Andrei Lepikhov


Also, this patch needs to be rebased, again.

I also noticed that the code needs to be updated, so I rebased the code 
to master during the review and attached a patch.


I have a question about estimating the cost of an Append node with your 
sort cost model.


I see that if pathkeys is not a subset of the subpath's pathkeys, then we
calculate the cost taking pathkeys into account. However, I didn't
see a similar estimation for the opposite situation.
I see that the cost of Append is the sum of the subpath costs, but we
don't take pathkeys into account for the subpaths, right?


--
Regards,
Alena Rybakina
Postgres Professional
From 9e9c5ff8aa977ec339c20433a427d98e15f8ed9d Mon Sep 17 00:00:00 2001
From: Alena Rybakina 
Date: Tue, 15 Oct 2024 01:05:38 +0300
Subject: [PATCH] Introduce the number of columns in the cost-sort model.

The sort node calls the comparison operator for each pair of attributes for
each couple of tuples. However, the current cost model uses
a '2.0*cpu_operator_cost' multiplier, which performs some sort of averaging.
This technique can lead to incorrect estimations when sorting on three, four,
or more columns, quite common in practice.
Moreover, further elaboration of the optimiser forms more strict requirements
for the balance of sortings, as caused by IncrementalSort, MergeAppend, and
MergeJoin.
In this patch, the multiplier is a linear function of the number of columns.
The 1.0 term smooths out the fact that the dependence on the number of
columns is weaker than linear.
This is a worst-case formula: the actual number of comparisons depends on the
distinct values in each column. As a TODO, we can naturally elaborate this
model in a next step, involving distinct statistics to make estimations more
precise.
---
 .../postgres_fdw/expected/postgres_fdw.out| 15 +++--
 contrib/postgres_fdw/postgres_fdw.c   |  2 +-
 src/backend/optimizer/path/costsize.c | 34 ++
 src/backend/optimizer/plan/createplan.c   |  2 +-
 src/backend/optimizer/plan/planner.c  |  2 +-
 src/backend/optimizer/prep/prepunion.c|  2 +-
 src/backend/optimizer/util/pathnode.c |  8 +--
 src/include/optimizer/cost.h  |  2 +-
 src/test/regress/expected/aggregates.out  | 13 ++--
 src/test/regress/expected/join.out| 20 +++---
 src/test/regress/expected/partition_join.out  |  8 ++-
 src/test/regress/expected/union.out   | 66 +--
 12 files changed, 95 insertions(+), 79 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out 
b/contrib/postgres_fdw/expected/postgres_fdw.out
index f2bcd6aa98c..5da33ab69ec 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9995,13 +9995,16 @@ SELECT t1.a,t2.b,t3.c FROM fprt1 t1 INNER JOIN fprt2 t2 
ON (t1.a = t2.b) INNER J
 -- left outer join + nullable clause
 EXPLAIN (VERBOSE, COSTS OFF)
 SELECT t1.a,t2.b,t2.c FROM fprt1 t1 LEFT JOIN (SELECT * FROM fprt2 WHERE a < 
10) t2 ON (t1.a = t2.b and t1.b = t2.a) WHERE t1.a < 10 ORDER BY 1,2,3;
-                                  QUERY PLAN
-------------------------------------------------------------------------------
- Foreign Scan
+                                  QUERY PLAN
+------------------------------------------------------------------------------
+ Sort
    Output: t1.a, fprt2.b, fprt2.c
-   Relations: (public.ftprt1_p1 t1) LEFT JOIN (public.ftprt2_p1 fprt2)
-   Remote SQL: SELECT r5.a, r6.b, r6.c FROM (public.fprt1_p1 r5 LEFT JOIN public.fprt2_p1 r6 ON (((r5.a = r6.b)) AND ((r5.b = r6.a)) AND ((r6.a < 10)))) WHERE ((r5.a < 10)) ORDER BY r5.a ASC NULLS LAST, r6.b ASC NU

Re: Improving the notation for ecpg.addons rules

2024-10-14 Thread Tom Lane
I wrote:
> Anyway, my thought was to merge this into the other patch series,
> but I didn't do that yet.

I'd forgotten about merging these two patches into the bigger
ecpg patch series.  Since there didn't seem to be any objection
to either, I went ahead and pushed them.

regards, tom lane




Re: Logging parallel worker draught

2024-10-14 Thread Alena Rybakina

On 28.08.2024 15:58, Benoit Lobréau wrote:

Hi,

Here is a new version of the patch. Sorry for the long delay, I was 
hit by a motivation drought and was quite busy otherwise.


The guc is now called `log_parallel_workers` and has three possible 
values:


* "none": disables logging
* "all": logs parallel worker info for all parallel queries or utilities
* "failure": logs only when the number of parallel workers planned 
couldn't be reached.


For this, I added several members to the EState struct.

Each gather node / gather merge node updates the values and the 
offending queries are displayed during standard_ExecutorEnd.


For CREATE INDEX / REINDEX on btree and brin, I check the parallel 
context struct (pcxt) during _bt_end_parallel() or 
_brin_end_parallel() and display a log message when needed.


For vacuum, I do the same in parallel_vacuum_end().

I added some information to the error message for parallel queries as
an experiment. I find it useful, but it can be removed if you're not
convinced.


2024-08-27 15:59:11.089 CEST [54585] LOG:  1 parallel nodes planned (1 
obtained all their workers, 0 obtained none), 2 workers planned (2 
workers launched)

2024-08-27 15:59:11.089 CEST [54585] STATEMENT:  EXPLAIN (ANALYZE)
    SELECT i, avg(j) FROM test_pql GROUP BY i;

2024-08-27 15:59:14.006 CEST [54585] LOG:  2 parallel nodes planned (0 
obtained all their workers, 1 obtained none), 4 workers planned (1 
workers launched)

2024-08-27 15:59:14.006 CEST [54585] STATEMENT:  EXPLAIN (ANALYZE)
    SELECT i, avg(j) FROM test_pql GROUP BY i
    UNION
    SELECT i, avg(j) FROM test_pql GROUP BY i;

For CREATE INDEX / REINDEX / VACUUM:

2024-08-27 15:58:59.769 CEST [54521] LOG:  1 workers planned (0 
workers launched)

2024-08-27 15:58:59.769 CEST [54521] STATEMENT:  REINDEX TABLE test_pql;

Do you think this is better?

I am not sure if a struct is needed to store the es_nworkers* and if 
the modification I did to parallel.h is ok.


Thanks to: Jehan-Guillaume de Rorthais, Guillaume Lelarge and Franck 
Boudehen for the help and motivation boost. 


Hi!

I think it's a good idea to log this. I suggest some changes that might 
improve your patch.


1. I think you should rewrite the first statement of the documentation
about the parameter as follows:


Controls a log message about the number of workers produced during an 
execution of the parallel query or utility statement.



2. I think you can move the nworkers_to_launch and nworkers_launched
vacuum parameters into the ParallelContext struct.


3. I think you should write the logging check condition in an
independent function and provide the necessary parameters to it (a
sketch follows below). To be honest, if the parameters weren't stored in
a different struct for parallel queries, I would have put it in a macro.


I attached the diff file including my proposals.
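
A minimal sketch of what such a helper could look like, derived from the
condition in the existing diff hunks (the function name follows the diff;
the exact signature is an assumption):

static bool
LoggingParallelWorkers(int log_parallel_workers,
                       int nworkers_to_launch,
                       int nworkers_launched)
{
    /* "all": log whenever any parallel workers were planned */
    if (log_parallel_workers == LOG_PARALLEL_WORKERS_ALL &&
        nworkers_to_launch > 0)
        return true;

    /* "failure": log only when we got fewer workers than planned */
    if (log_parallel_workers == LOG_PARALLEL_WORKERS_FAILURE &&
        nworkers_to_launch > nworkers_launched)
        return true;

    return false;
}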


--
Regards,
Alena Rybakina
Postgres Professional
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 5c04a27eb66..1e2fadffaf1 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -7704,8 +7704,8 @@ log_line_prefix = '%m [%p] %q%u@%d/%a '
   
   

-Controls whether a log message is produced to display information on the number of
-workers spawned when a parallel query or utility is executed. The default value is
+Controls whether a log message about the number of workers produced during an
+execution of the parallel query or utility statement. The default value is
 none which disables logging. all displays
 information for all parallel queries, whereas failures displays
 information only when the number of workers launched is lower than the number of
diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index fdfbbc7dc42..3ed10c7f8a0 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -2544,10 +2544,9 @@ _brin_end_parallel(BrinLeader *brinleader, BrinBuildState *state)
	/* Shutdown worker processes */
	WaitForParallelWorkersToFinish(brinleader->pcxt);
 
-   if ((log_parallel_workers == LOG_PARALLEL_WORKERS_ALL &&
-        brinleader->pcxt->nworkers_to_launch > 0) ||
-       (log_parallel_workers == LOG_PARALLEL_WORKERS_FAILURE &&
-        brinleader->pcxt->nworkers_to_launch > brinleader->pcxt->nworkers_launched))
+   if (LoggingParallelWorkers(log_parallel_workers,
+                              brinleader->pvs->pcxt->parallel_workers_to_launch,
+                              brinleader->pvs->pcxt->parallel_workers_launched))
	elog(LOG, "%i workers planned (%i workers launched)",
		 brinleader->pcxt->nworkers_to_launch,
		 brinleader->pcxt->nworkers_launched);
diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index e31bd8e223e..85fede89

Make foreach_ptr macro work in C++ extensions

2024-10-14 Thread Jelte Fennema-Nio
I've been writing a C++ extension recently and it turns out that the
foreach_ptr macro added in PG17 by one of my patches causes a
compilation failure when used in C++. This is due to C++'s stricter
rules around implicit casts from void pointers.

Attached is a tiny patch that fixes that by adding an explicit cast to
the macro. I think it would be good to backpatch this to PG17 as well,
so as not to introduce differences in the macro across versions.

(CCed Nathan because he committed the original patch)
From 3c3e94fdb935d71c95cdbee4cfaf971c1a9b93a3 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Mon, 14 Oct 2024 21:54:48 +0200
Subject: [PATCH v1] Make foreach_ptr macro work in C++ extensions

C++ has stronger requirements on casting from void pointers than C. It
turns out that because of this it's impossible to use the foreach_ptr
macro in a C++ extension. This additional explicit cast makes it work in
both C and C++ extensions.
---
 src/include/nodes/pg_list.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/include/nodes/pg_list.h b/src/include/nodes/pg_list.h
index 52df93759f..d131350e10 100644
--- a/src/include/nodes/pg_list.h
+++ b/src/include/nodes/pg_list.h
@@ -485,7 +485,7 @@ for_each_cell_setup(const List *lst, const ListCell *initcell)
 		for (ForEachState var##__state = {(lst), 0}; \
 			 (var##__state.l != NIL && \
 			  var##__state.i < var##__state.l->length && \
-			 (var = func(&var##__state.l->elements[var##__state.i]), true)); \
+			 (var = (type pointer) func(&var##__state.l->elements[var##__state.i]), true)); \
 			 var##__state.i++)
 
 /*
-- 
2.43.0
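
For context, a hypothetical use of the macro (do_something() and the list
contents are placeholders).  In C the old definition compiled because
void * converts implicitly; a C++ compiler rejects that assignment until
the explicit cast above is added:

static void
process_all(List *rels)
{
    /* foreach_ptr declares "rv" itself, with type RangeVar * */
    foreach_ptr(RangeVar, rv, rels)
        do_something(rv);
}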



Re: Changing the default random_page_cost value

2024-10-14 Thread Tom Lane
I wrote:
> I recall that when we settled on 4.0 as a good number for
> spinning-rust drives, it came out of some experimentation that
> I'd done that involved multiple-day-long tests.  I don't recall any
> more details than that sadly, but perhaps trawling the mailing list
> archives would yield useful info.  It looks like the 4.0 value came
> in with b1577a7c7 of 2000-02-15, so late 1999/early 2000 would be the
> time frame to look in.

I tried asking https://www.postgresql.org/search/ about
random_page_cost, and got nothing except search engine timeouts :-(.
However, some digging in my own local archives yielded

https://www.postgresql.org/message-id/flat/25387.948414692%40sss.pgh.pa.us

https://www.postgresql.org/message-id/flat/14601.949786166%40sss.pgh.pa.us

That confirms my recollection of multiple-day test runs, but doesn't
offer much more useful detail than that :-(.  What I think I did
though was to create some large tables (much bigger than the RAM on
the machine I had) and actually measure the runtime of seq scans
versus full-table index scans, repeating plenty 'o times to try to
average out the noise.  There was some talk in those threads of
reducing that to a publishable script, but it was never followed up
on.
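
A rough sketch of that kind of measurement in today's terms (the table
name and sizes are illustrative; a serious test needs data much larger
than RAM, many repetitions, and some way to defeat the physically
correlated heap order a fresh load produces):

CREATE TABLE bench (id int PRIMARY KEY, payload text);
INSERT INTO bench
    SELECT g, repeat('x', 200) FROM generate_series(1, 50000000) g;
VACUUM ANALYZE bench;

-- sequential scan timing
SET enable_indexscan = off; SET enable_bitmapscan = off;
EXPLAIN (ANALYZE, BUFFERS) SELECT sum(length(payload)) FROM bench;

-- full-table index scan timing over the same rows
SET enable_seqscan = off; SET enable_indexscan = on;
EXPLAIN (ANALYZE, BUFFERS)
    SELECT sum(length(payload)) FROM bench WHERE id >= 0;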

regards, tom lane




simplify regular expression locale global variables

2024-10-14 Thread Peter Eisentraut

We currently have

static PG_Locale_Strategy pg_regex_strategy;
static pg_locale_t pg_regex_locale;
static Oid  pg_regex_collation;

but after the recent improvements to pg_locale_t handling, we don't need 
all three anymore.  All the information we have is contained in 
pg_locale_t, so we just need to keep that one.  This allows us to 
structure the locale-using regular expression code more similar to other 
locale-using code, mainly by provider, avoiding another layer that is 
specific only to the regular expression code.  The first patch 
implements that.


The second patch removes a call to pg_set_regex_collation() that I think 
is unnecessary.


The third patch adds a pg_unset_regex_collation() call that undoes what 
pg_set_regex_collation() does.  I mainly used this to verify the second 
patch, but maybe it's also useful on its own, not sure.


(I don't have any plans to get rid of the remaining global variable. 
That would certainly be nice from an intellectual point of view, but 
fiddling this into the regular expression code looks quite messy.  In 
any case, it's probably easier with one variable instead of three, if 
someone wants to try.)

From 1799abec05ae3d49a7a57333acd1d377e26d0fe9 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 15 Oct 2024 08:01:41 +0200
Subject: [PATCH 1/3] Remove pg_regex_collation and pg_regex_strategy

We don't need three global variables to describe the locale strategy
for regular expressions.  We only need to keep pg_regex_locale.  This
works now because pg_locale_t now contains all the required
information (such as a ctype_is_c field).  This allows us to structure
the locale-using regular expression code more similar to other
locale-using code, mainly by provider, avoiding another layer that is
specific only to the regular expression code.
---
 src/backend/regex/regc_pg_locale.c | 430 +
 1 file changed, 185 insertions(+), 245 deletions(-)

diff --git a/src/backend/regex/regc_pg_locale.c b/src/backend/regex/regc_pg_locale.c
index b75784b6ce5..4691e796385 100644
--- a/src/backend/regex/regc_pg_locale.c
+++ b/src/backend/regex/regc_pg_locale.c
@@ -63,18 +63,7 @@
  * NB: the coding here assumes pg_wchar is an unsigned type.
  */
 
-typedef enum
-{
-   PG_REGEX_STRATEGY_C,/* C locale (encoding independent) */
-   PG_REGEX_STRATEGY_BUILTIN,  /* built-in Unicode semantics */
-   PG_REGEX_STRATEGY_LIBC_WIDE,    /* Use locale_t <wctype.h> functions */
-   PG_REGEX_STRATEGY_LIBC_1BYTE,   /* Use locale_t <ctype.h> functions */
-   PG_REGEX_STRATEGY_ICU,  /* Use ICU uchar.h functions */
-} PG_Locale_Strategy;
-
-static PG_Locale_Strategy pg_regex_strategy;
 static pg_locale_t pg_regex_locale;
-static Oid pg_regex_collation;
 
 /*
  * Hard-wired character properties for C locale
@@ -232,7 +221,6 @@ void
 pg_set_regex_collation(Oid collation)
 {
pg_locale_t locale = 0;
-   PG_Locale_Strategy strategy;
 
if (!OidIsValid(collation))
{
@@ -253,8 +241,9 @@ pg_set_regex_collation(Oid collation)
 * catalog access is available, so we can't call
 * pg_newlocale_from_collation().
 */
-   strategy = PG_REGEX_STRATEGY_C;
-   collation = C_COLLATION_OID;
+   static struct pg_locale_struct dummy_locale = {.ctype_is_c = true};
+
+   locale = &dummy_locale;
}
else
{
@@ -264,121 +253,80 @@ pg_set_regex_collation(Oid collation)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 errmsg("nondeterministic collations 
are not supported for regular expressions")));
-
-   if (locale->ctype_is_c)
-   {
-   /*
-    * C/POSIX collations use this path regardless of database
-* encoding
-*/
-   strategy = PG_REGEX_STRATEGY_C;
-   locale = 0;
-   collation = C_COLLATION_OID;
-   }
-   else if (locale->provider == COLLPROVIDER_BUILTIN)
-   {
-   Assert(GetDatabaseEncoding() == PG_UTF8);
-   strategy = PG_REGEX_STRATEGY_BUILTIN;
-   }
-#ifdef USE_ICU
-   else if (locale->provider == COLLPROVIDER_ICU)
-   {
-   strategy = PG_REGEX_STRATEGY_ICU;
-   }
-#endif
-   else
-   {
-   Assert(locale->provider == COLLPROVIDER_LIBC);
-   if (GetDatabaseEncoding() == PG_UTF8)
-   strategy = PG_REGEX_STRATEGY_LIBC_WIDE;
-   else
-   strategy = PG_REGEX_STRATEGY_LIBC_1BYTE;
-   }
}
 
-   pg_regex_strategy = strategy;

Re: Consider the number of columns in the sort cost model

2024-10-14 Thread Andrei Lepikhov

On 10/15/24 12:15, David Rowley wrote:

As for your patch, I'm suspicious that the general idea you're
proposing is an actual improvement.
I didn't intend to treat it as an 'improvement' but as an intermediate
patch. The main purpose here is to debate the approach and introduce
consideration of the number of columns. A conservative development
approach has been preferred before.


it seems you're just charging cpu_operator_cost * <number of sort
columns>. It seems like it won't be very hard to fool that into
doing the wrong thing when the first column to sort by is distinct or
almost distinct. There's going to be far fewer or no tiebreaker
comparisons for that case.
As I've written above, it is for starters. It allows us to analyse how the
balance between different types of orderings can be changed in extreme
cases. I can join this patch with the following one, implementing
differentiation by distincts, but the effect on regression tests would be
smoothed out.
The primary idea is 1) to elaborate GROUP-BY optimisation and 2) give 
the optimiser a tool to choose a more optimal sort, comparing 
MergeAppend/Sort/IncrementalSort costs.
The whole idea is implemented in the branch [1] and described in the
post [2]. Of course, we differentiate sortings by the number of distinct
values of the first column (the only trustworthy statistic). It is not so
difficult (but I doubt the real necessity) to use extended statistics and
reflect in the cost the values of different combinations of columns.




As mentioned by Kirill, I also don't understand the cost_sort
signature change. Why would you do that over just doing
list_length(pathkeys) within cost_sort()? Your throwing away a
parameter that might  be the most useful one of the bunch for allowing
better sort cost estimates.
OK, maybe it is too much for an intermediate patch. We can change the
interface later, if necessary.

Perhaps that could be done within the EquivalenceClass.

Thanks for the idea!

[1] https://github.com/postgrespro/postgres/tree/sort-columnsnum
[2] https://danolivo.substack.com/p/elaboration-of-the-postgresql-sort

--
regards, Andrei Lepikhov





Re: replace strtok()

2024-10-14 Thread Peter Eisentraut

On 10.10.24 14:59, Ranier Vilela wrote:

Please look at the SCRAM secret, which breaks parse_scram_secret(),
perhaps because strsep() doesn't return NULL where strtok() did:

CREATE ROLE r PASSWORD
'SCRAM-SHA-256$4096:hpFyHTUsSWcR7O9P$LgZFIt6Oqdo27ZFKbZ2nV+=vtnYM995pDh9ca6WSi120qVV5NeluNfUPkwm7Vqat25RjSPLkGeoZBQs6wVv+um4=';

Core was generated by `postgres: law regression [local] CREATE ROLE  '.
Program terminated with signal SIGSEGV, Segmentation fault.

#0  __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:74
(gdb) bt
#0  __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:74
#1  0x563625e9e5b0 in parse_scram_secret (...) at auth-scram.c:655

Thanks for the report.

It seems to me that it could be due to incorrect use of the strsep function.
See:
https://man7.org/linux/man-pages/man3/strsep.3.html 

"

In case no delimiter was found, the token
is taken to be the entire string/*stringp/, and/*stringp/ is made
NULL.

"
So, it is necessary to check the *stringp* against NULL too.


Thanks for the analysis.  I think moreover we *only* need to check the 
"stringp" for NULL, not the return value of strsep(), which would never 
be NULL in our case.  So I propose the attached patch as a variant of yours.
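
A standalone illustration of the behavior difference (not part of the
patch): strtok() returns NULL when it runs out of tokens, while strsep()
returns the remaining string and signals exhaustion through the string
pointer instead.

#include <stdio.h>
#include <string.h>

int
main(void)
{
    char  buf[] = "no-delimiter-here";
    char *s = buf;
    char *tok = strsep(&s, "$");

    printf("tok = %s\n", tok);             /* "no-delimiter-here" */
    printf("s   = %s\n", s ? s : "NULL");  /* "NULL": input exhausted */
    return 0;
}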
From df4d43acb2956509e8166f2df4a3a2be1493ff6f Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 15 Oct 2024 08:39:30 +0200
Subject: [PATCH v2] Fix strsep() use for SCRAM secrets parsing

The previous code (from commit 5d2e1cc117b) did not detect end of
string correctly, so it would fail to error out if fewer than the
expected number of fields were present, which could then later lead to
a crash when NULL string pointers are accessed.

Reported-by: Alexander Lakhin 
Reported-by: Ranier Vilela 
Discussion: 
https://www.postgresql.org/message-id/flat/79692bf9-17d3-41e6-b9c9-fc8c39442...@eisentraut.org
---
 src/backend/libpq/auth-scram.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/backend/libpq/auth-scram.c b/src/backend/libpq/auth-scram.c
index 03c3c27..56df870e9ef 100644
--- a/src/backend/libpq/auth-scram.c
+++ b/src/backend/libpq/auth-scram.c
@@ -608,13 +608,17 @@ parse_scram_secret(const char *secret, int *iterations,
 * SCRAM-SHA-256$:$:
 */
v = pstrdup(secret);
-   if ((scheme_str = strsep(&v, "$")) == NULL)
+   scheme_str = strsep(&v, "$");
+   if (v == NULL)
goto invalid_secret;
-   if ((iterations_str = strsep(&v, ":")) == NULL)
+   iterations_str = strsep(&v, ":");
+   if (v == NULL)
goto invalid_secret;
-   if ((salt_str = strsep(&v, "$")) == NULL)
+   salt_str = strsep(&v, "$");
+   if (v == NULL)
goto invalid_secret;
-   if ((storedkey_str = strsep(&v, ":")) == NULL)
+   storedkey_str = strsep(&v, ":");
+   if (v == NULL)
goto invalid_secret;
serverkey_str = v;
 

base-commit: 7cdfeee320e72162b62dee638e713c2b8680
-- 
2.47.0



Re: Consider the number of columns in the sort cost model

2024-10-14 Thread Kirill Reshke
Hi!

On Thu, 22 Aug 2024 at 23:46, Andrei Lepikhov  wrote:
>
> Hi,
>
> I would like to propose a slight elaboration of the sort cost model.
> In practice, we frequently see the choice of suboptimal sortings, which
> slump performance by 10-50%.
>
> The key reason here is the optimiser's blindness to the fact that
> sorting calls a comparison operator for each pair of attributes in the
> tuple. Just for example:
>
> SET work_mem = '128MB';
> CREATE TABLE test (
>x1 int,x2 int,x3 int,x4 int,x5 int,x6 int,x7 int,x8 int,payload text);
> INSERT INTO test (x1,x2,x3,x4,x5,x6,x7,x8,payload) (
>SELECT 1,2,3,4,5,6,7,(1E5-x), 'abc'||x
>  FROM generate_series(1,1E5) AS x);
> VACUUM ANALYZE test;
>
> Let's execute the following three queries:
>
> EXPLAIN (ANALYZE)
> SELECT * FROM test WHERE x1<9E4 ORDER BY x8;
> EXPLAIN (ANALYZE)
> SELECT * FROM test WHERE x1<9E4 ORDER BY x1,x2,x3,x8;
> EXPLAIN (ANALYZE)
> SELECT * FROM test WHERE x1<9E4 ORDER BY x1,x2,x3,x4,x5,x6,x7,x8;
>
> My laptop executes these three queries in 100ms, 300ms, and 560ms,
> respectively. At the same time, the costs of these query plans are
> similar. This corner case shows that sorting matters and in the corner
> case, dependence on the number of comparisons looks close to linear.


Indeed. My numbers are 44.167, 88.759, and 136.864 ms.


> The patch in the attachment is a simplistic attempt to differentiate
> these sortings by the number of columns. It is not ideal because
> comparisons depend on the duplicates in each column, but it may be the
> subject of further work.
>
> Even such a trivial model allows us to resolve the case like the below:
> CREATE INDEX ON test (x1,x2,x3,x8);
> EXPLAIN (ANALYZE) SELECT * FROM test WHERE x1<9E4 ORDER BY x1,x2,x3,x8;
>
> The current master will choose SeqScan for four columns as well as for a
> single column. With the model, the optimiser will understand that
> sorting four columns costs more than sorting a single column and switch
> to IndexScan.
>
> --
> regards, Andrei Lepikhov


I confirm: before applying your patch, Index Scan is not chosen,
whereas afterwards it is.

Two questions:
1) Why do we change the `cost_sort` definition? We can calculate the
length of `pathkeys` inside cost_sort if we want, just like we do
inside cost_gather_merge. No need to make extension developers suffer
through an API change, is there?

2) I'm not very sure about the (1.0 + cmpMultiplier) coefficient. I
understand that we need to multiply by something that is not 0 (and
that's why we add 1), and this is strictly better than multiplying by
2, but why exactly is this formula like this, not (1.0 + cmpMultiplier
^ 2 / 10) for example? Maybe we need some more sophisticated approach?

Also, this patch needs to be rebased, again.

-- 
Best regards,
Kirill Reshke




Re: New "raw" COPY format

2024-10-14 Thread Joel Jacobson
On Mon, Oct 14, 2024, at 10:51, Joel Jacobson wrote:
> On Mon, Oct 14, 2024, at 10:07, Joel Jacobson wrote:
>> Attached is a first draft implementation of the new proposed COPY "raw" 
>> format.
>>
>> The first two patches are just the bug fix in HEAD, reported separately:
>> https://commitfest.postgresql.org/50/5297/

Rebase only.

/Joel

v6-0001-Fix-thinko-in-tests-for-COPY-options-force_not_null-.patch
Description: Binary data


v6-0002-Fix-validation-of-FORCE_NOT_NULL-FORCE_NULL-for-all-.patch
Description: Binary data


v6-0003-Replace-binary-flags-binary-and-csv_mode-with-format.patch
Description: Binary data


v6-0004-Reorganize-ProcessCopyOptions-for-clarity-and-consis.patch
Description: Binary data


v6-0005-Add-raw-COPY-format-support-for-unstructured-text-da.patch
Description: Binary data


Re: a litter question about mdunlinkfiletag function

2024-10-14 Thread Bruce Momjian
On Mon, Sep 30, 2024 at 10:43:17AM +0800, px shi wrote:
> Hi, hackers
> 
> When calculating the path, forknum is hardcoded as MAIN_FORKNUM:
> 
> /* Compute the path. */
> p = relpathperm(ftag->rnode, MAIN_FORKNUM);
> 
> 
> But since the ftag structure already contains forknum:
> 
> typedef struct FileTag
> {
> int16 handler; /* SyncRequestHandler value, saving space */
> int16 forknum; /* ForkNumber, saving space */
> RelFileNode rnode;
> uint32 segno;
> } FileTag;
> 
> 
> Wouldn’t it be more flexible to use the value from the ftag structure 
> directly?

You will find other places where relpathperm() is called without having
a FileTag structure available, e.g. ReorderBufferProcessTXN().

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  When a patient asks the doctor, "Am I going to die?", he means 
  "Am I going to die soon?"




Re: a litter question about mdunlinkfiletag function

2024-10-14 Thread px shi
>
> You will find other places where relpathperm() is called without having
> a FileTag structure available, e.g. ReorderBufferProcessTXN().
>

I apologize for the confusion. What I meant to say is that in the
mdunlinkfiletag() function, the forknum is currently hardcoded as
MAIN_FORKNUM when calling relpathperm(). While mdunlinkfiletag currently
only handles MAIN_FORKNUM, wouldn’t it be more flexible to retrieve the
forknum from the ftag structure instead?
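
That is, roughly (a sketch of the suggested one-line change; the cast is
needed because FileTag stores the fork number as an int16):

p = relpathperm(ftag->rnode, (ForkNumber) ftag->forknum);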


Bruce Momjian wrote on Tue, Oct 15, 2024 at 04:59:

> On Mon, Sep 30, 2024 at 10:43:17AM +0800, px shi wrote:
> > Hi, hackers
> >
> > When calculating the path, forknum is hardcoded as MAIN_FORKNUM:
> >
> > /* Compute the path. */
> > p = relpathperm(ftag->rnode, MAIN_FORKNUM);
> >
> >
> > But since the ftag structure already contains forknum:
> >
> > typedef struct FileTag
> > {
> > int16 handler; /* SyncRequestHandler value, saving space */
> > int16 forknum; /* ForkNumber, saving space */
> > RelFileNode rnode;
> > uint32 segno;
> > } FileTag;
> >
> >
> > Wouldn’t it be more flexible to use the value from the ftag structure
> directly?
>
> You will find other places where relpathperm() is called without having
> a FileTag structure available, e.g. ReorderBufferProcessTXN().
>
> --
>   Bruce Momjian  https://momjian.us
>   EDB  https://enterprisedb.com
>
>   When a patient asks the doctor, "Am I going to die?", he means
>   "Am I going to die soon?"
>


Re: NOT ENFORCED constraint feature

2024-10-14 Thread Amul Sul
On Wed, Oct 9, 2024 at 2:44 PM Joel Jacobson  wrote:
>
> On Tue, Oct 8, 2024, at 11:06, Amul Sul wrote:
> > The attached patch proposes adding the ability to define CHECK and
> > FOREIGN KEY constraints as NOT ENFORCED.
>
> Thanks for working on this!
>
> > Adding NOT ENFORCED to CHECK constraints is simple, see 0001 patch,
>
> I've looked at the 0001 patch and think it looks simple and straight forward.
>

Thanks for looking into it.

> > but implementing it for FOREIGN KEY constraints requires more code due
> > to triggers, see 0002 - 0005 patches.
>
> I can't say that much yet about the code changes in 0002 - 0005 yet,
> but I've tested the patches and successfully experimented with the feature.
>
> Also think the documentation is good and sound. Only found a minor typo:
> -Adding a enforced CHECK or NOT NULL
> +Adding an enforced CHECK or NOT NULL
>

Ok, will fix it in the next version.

> > There are various approaches for
> > implementing NOT ENFORCED foreign keys, what I thought of:
> >
> > 1. When defining a NOT ENFORCED foreign key, skip the creation of
> > triggers used for referential integrity check, while defining an
> > ENFORCED foreign key, remain the same as the current behaviour. If an
> > existing foreign key is changed to NOT ENFORCED, the triggers are
> > dropped, and when switching it back to ENFORCED, the triggers are
> > recreated.
> >
> > 2. Another approach could be to create the NOT ENFORCED constraint
> > with the triggers as usual, but disable those triggers by updating the
> > pg_trigger catalog so that they are never executed for the check. And
> > enable them when the constraint is changed back to ENFORCED.
> >
> > 3. Similarly, a final approach would involve updating the logic where
> > trigger execution is decided and skipping the execution if the
> > constraint is not enforced, rather than modifying the pg_trigger
> > catalog.
> >
> > In the attached patch, the first approach has been implemented. This
> > requires more code changes but prevents unused triggers from being
> > left in the database and avoids the need for changes all over the
> > place to skip trigger execution, which could be missed in future code
> > additions.
>
> I also like the first approach, since I think it's nice that the pg_trigger
> entries are inserted / deleted upon enforced / not enforced.
>
> > The ALTER CONSTRAINT operation in the patch added code to handle
> > dropping and recreating triggers. An alternative approach would be to
> > simplify the process by dropping and recreating the FK constraint,
> > which would automatically handle skipping or creating triggers for NOT
> > ENFORCED or ENFORCED FK constraints. However, I wasn't sure if this
> > was the right approach, as I couldn't find any existing ALTER
> > operations that follow this pattern.
>
> I think the current approach of dropping and recreating the triggers is best,
> since if we would instead be dropping and recreating the FK constraint,
> that would cause problems if some other future SQL feature would need to
> introduce dependencies on the FK constraints via pg_depend.
>

Yes, that was my initial thought as well, and recreating the
dependencies would be both painful and prone to bugs.
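
For reference, the feature under discussion would look roughly like this
(illustrative table names; the clause follows the SQL standard's
[NOT] ENFORCED syntax, and the exact ALTER grammar is whatever the patch
implements):

CREATE TABLE orders (
    id          int PRIMARY KEY,
    customer_id int,
    CONSTRAINT orders_customer_fk
        FOREIGN KEY (customer_id) REFERENCES customers (id) NOT ENFORCED
);

-- under approach 1, this recreates the RI triggers
ALTER TABLE orders ALTER CONSTRAINT orders_customer_fk ENFORCED;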

Regards,
Amul




Re: NOT ENFORCED constraint feature

2024-10-14 Thread Amul Sul
On Wed, Oct 9, 2024 at 6:45 PM Andrew Dunstan  wrote:
>
>
> On 2024-10-09 We 5:14 AM, Joel Jacobson wrote:
>
> On Tue, Oct 8, 2024, at 11:06, Amul Sul wrote:
>
> The attached patch proposes adding the ability to define CHECK and
> FOREIGN KEY constraints as NOT ENFORCED.
>
> Thanks for working on this!
>
> Adding NOT ENFORCED to CHECK constraints is simple, see 0001 patch,
>
> I've looked at the 0001 patch and think it looks simple and straight forward.
>
> but implementing it for FOREIGN KEY constraints requires more code due
> to triggers, see 0002 - 0005 patches.
>
> I can't say that much yet about the code changes in 0002 - 0005 yet,
> but I've tested the patches and successfully experimented with the feature.
>
> Also think the documentation is good and sound. Only found a minor typo:
> -Adding a enforced CHECK or NOT NULL
> +Adding an enforced CHECK or NOT NULL
>
> There are various approaches for
> implementing NOT ENFORCED foreign keys, what I thought of:
>
> 1. When defining a NOT ENFORCED foreign key, skip the creation of
> triggers used for referential integrity check, while defining an
> ENFORCED foreign key, remain the same as the current behaviour. If an
> existing foreign key is changed to NOT ENFORCED, the triggers are
> dropped, and when switching it back to ENFORCED, the triggers are
> recreated.
>
> 2. Another approach could be to create the NOT ENFORCED constraint
> with the triggers as usual, but disable those triggers by updating the
> pg_trigger catalog so that they are never executed for the check. And
> enable them when the constraint is changed back to ENFORCED.
>
> 3. Similarly, a final approach would involve updating the logic where
> trigger execution is decided and skipping the execution if the
> constraint is not enforced, rather than modifying the pg_trigger
> catalog.
>
> In the attached patch, the first approach has been implemented. This
> requires more code changes but prevents unused triggers from being
> left in the database and avoids the need for changes all over the
> place to skip trigger execution, which could be missed in future code
> additions.
>
> I also like the first approach, since I think it's nice that the pg_trigger
> entries are inserted / deleted upon enforced / not enforced.
>
>
>
> I also prefer this, as it gets us out from any possible dance with enabling / 
> disabling triggers.
>

Agreed. Thanks for the inputs.

Regards,
Amul.




Re: Consider the number of columns in the sort cost model

2024-10-14 Thread Andrei Lepikhov

On 10/15/24 02:59, Kirill Reshke wrote:

On Thu, 22 Aug 2024 at 23:46, Andrei Lepikhov  wrote:
Two questions:

Thank you for the interest to this feature!

1) Why do we change the `cost_sort` definition? We can calculate the
length of `pathkeys` inside cost_sort if we want, just like we do
inside cost_gather_merge.
I am suspicious of that but open to hearing other opinions. The 
coefficient incorporates knowledge about how many comparisons will be 
made with this sorting operator. The caller can have some additional 
knowledge about that. For example, IncrementalSort knows about groups - 
each group has the same prefix, which means each tuple comparison will 
inevitably cause a comparison for each column from the prefix. Also, it 
may be knowledge about the number of distinct values for each column, 
etc. I'm not sure we should elaborate cost_sort a lot here.


2) I'm not very sure about the (1.0 + cmpMultiplier) coefficient. I
understand that we need to multiply by something that is not 0 (and
that's why we add 1), and this is strictly better than multiplying by
2, but why exactly is this formula like this, not (1.0 + cmpMultiplier
^ 2 / 10) for example? Maybe we need some more sophisticated approach?
Hmmm, I think you misunderstood the general idea behind this 
coefficient. I explained it earlier this year in more detail [1], 
showing the direction of future improvement.
The value '1' in this formula stands in for second-order effects that we
don't otherwise model. Of course, it could be chosen differently, but
after testing, I decided to keep it simple.
The cmpMultiplier value reflects the number of comparisons the executor
will make when comparing a pair of tuples. It is a fraction of the
maximum number of potential comparisons per tuple-comparison operation
(much as Hash Agg already employs in its costing). Right now, we don't
use distinct statistics on the sorting columns and take a conservative
approach, which may be improved in the future. But I don't think that is
a real problem here, because we use the same approach in all places and,
with this feature, change the balance only relative to hashing methods
(in favour of their usage). I think it is not harmful, because this
characteristic is a second-order factor and shouldn't significantly
reduce the performance of actual queries.
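
In code terms, the proposed change amounts to something like this (a
sketch; the variable names are assumed rather than taken verbatim from
the patch):

/* old model: a fixed average of two comparisons per tuple pair */
comparison_cost = 2.0 * cpu_operator_cost;

/* patched model: worst case, one comparison per sort column, with the
 * constant 1.0 smoothing the weaker-than-linear real dependence */
cmpMultiplier = list_length(pathkeys);
comparison_cost = (1.0 + cmpMultiplier) * cpu_operator_cost;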


Also, this patch needs to be rebased, again.

Thanks, done.

[1] https://danolivo.substack.com/p/elaboration-of-the-postgresql-sort

--
regards, Andrei Lepikhov
From 3ba39d245e079235f6352e5cb90224cc9ecee9b6 Mon Sep 17 00:00:00 2001
From: "Andrei V. Lepikhov" 
Date: Tue, 15 Oct 2024 11:33:01 +0700
Subject: [PATCH v3] Introduce the number of columns in the cost-sort model.

The sort node calls the comparison operator for each pair of attributes for
each couple of tuples. However, the current cost model uses
a '2.0*cpu_operator_cost' multiplier, which performs some sort of averaging.
This technique can lead to incorrect estimations when sorting on three, four,
or more columns, quite common in practice.
Moreover, further elaboration of the optimiser forms more strict requirements
for the balance of sortings, as caused by IncrementalSort, MergeAppend, and
MergeJoin.
In this patch, the multiplier is a linear function of the number of columns.
The 1.0 term smooths out the fact that the dependence on the number of
columns is weaker than linear.
This is a worst-case formula: the actual number of comparisons depends on the
distinct values in each column. As a TODO, we can naturally elaborate this
model in a next step, involving distinct statistics to make estimations more
precise.
---
 .../postgres_fdw/expected/postgres_fdw.out| 15 +++--
 contrib/postgres_fdw/postgres_fdw.c   |  2 +-
 src/backend/optimizer/path/costsize.c | 34 ++
 src/backend/optimizer/plan/createplan.c   |  2 +-
 src/backend/optimizer/plan/planner.c  |  2 +-
 src/backend/optimizer/prep/prepunion.c|  2 +-
 src/backend/optimizer/util/pathnode.c |  8 +--
 src/include/optimizer/cost.h  |  2 +-
 src/test/regress/expected/aggregates.out  | 13 ++--
 src/test/regress/expected/join.out| 20 +++---
 src/test/regress/expected/partition_join.out  |  8 ++-
 src/test/regress/expected/union.out   | 66 +--
 12 files changed, 95 insertions(+), 79 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index f2bcd6aa98..5da33ab69e 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9995,13 +9995,16 @@ SELECT t1.a,t2.b,t3.c FROM fprt1 t1 INNER JOIN fprt2 t2 ON (t1.a = t2.b) INNER J
 -- left outer join + nullable clause
 EXPLAIN (VERBOSE, COSTS OFF)
 SELECT t1.a,t2.b,t2.c FROM fprt1 t1 LEFT JOIN (SELECT * FROM fprt2 WHERE a < 10) t2 ON (t1.a = t2.b and t1.b = t2.a) WHERE t1.a < 10 ORDER BY 1,2,3;
-  

Re: BitmapHeapScan streaming read user and prelim refactoring

2024-10-14 Thread Melanie Plageman
On Tue, Oct 8, 2024 at 4:56 PM Thomas Munro  wrote:
>
> Just by the way, you can't use anonymous structs or unions in C99
> (that was added to C11), but most compilers accept them silently
> unless you use eg -std=c99.  Some buildfarm animal or other would
> bleat about that (ask me how I know), but our CI doesn't pick it up
> :-(  Maybe we should fix that.  That's why you see a lot of code
> around that accesses unions through a member with a name like "u",
> because the author was shaking her fist at the god of standards and
> steadfastly refusing to think of a name.  It's a shame in cases where
> you want to retrofit a union into existing code without having to
> adjust the code that accesses one of the members...

Wow, that is terrible news. I fact-checked this with ast-grep, and you
are right -- we don't use anonymous unions or structs.

Perhaps this isn't the time or place but 1) I was under the impression
we sometimes did things not allowed in C99 if they were really useful
2) remind me what we are waiting for to die so that we can use
features from C11?
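
A minimal illustration of the difference Thomas describes:

typedef struct NodeA            /* needs C11: anonymous union */
{
    int kind;
    union
    {
        int   ival;
        char *sval;
    };                          /* accessed as n.ival, n.sval */
} NodeA;

typedef struct NodeB            /* portable C99 spelling */
{
    int kind;
    union
    {
        int   ival;
        char *sval;
    } u;                        /* accessed as n.u.ival, n.u.sval */
} NodeB;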

- Melanie




Re: Changing the default random_page_cost value

2024-10-14 Thread David Rowley
On Tue, 15 Oct 2024 at 10:15, Bruce Momjian  wrote:
> I am not a fan of this patch.  I don't see why _removing_ the magnetic
> part helps, because you are then left with no rationale for why 1.2 was chosen.  I
> would put the magnetic in a separate paragraph perhaps, and recommend
> 4.0 for it.  Also, per-tablespace makes sense because of physical media
> differences, but what purpose would per-database and per-role serve?
> Also, per-tablespace is not a connection-activated item like the other
> two.

Yeah, I think any effort to change the default value for this setting
would require some analysis to prove that the newly proposed default
is a more suitable setting than the current default. I mean, why 1.2
and not 1.1 or 1.3? Where's the evidence that 1.2 is the best value
for this?

I don't think just providing evidence that random read times are
closer to sequential read times on SSDs are closer than they are with
HDDs is going to be enough. What we want to know is if the planner
costs become more closely related to the execution time or not. From
my experience, random_page_cost really only has a loose grasp on
reality, so you might find that it's hard to prove this with any
degree of confidence (just have a look at how inconsiderate
index_pages_fetched() is to other queries running on the database, for
example).

I suggest first identifying all the locations that use
random_page_cost then coming up with some test cases that run queries
that exercise those locations in a way that does things like vary the
actual selectivity of some value to have the planner switch plans then
try varying the random_page_cost to show that the switchover point is
more correct with the new value than it is with the old value. It
would be nice to have this as a script so that other people could
easily run it on their hardware to ensure that random_page_cost we
choose as the new default is representative of the average hardware.
You'll likely need to do this with varying index sizes. I imagine to
properly test this so that we'd have any useful degree of confidence
that the new value is better than the old one would likely require a
benchmark that runs for several hours. At the upper end, you'd likely
want the data sizes to exceed the size of RAM. Another dimension that
the tests should likely explore is varying data locality.
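
A sketch of the kind of switchover test described above (all names and
sizes are illustrative): sweep a predicate's selectivity, note where the
planner flips between plans, and compare that point against measured
runtimes under different random_page_cost values.

CREATE TABLE t (a int, filler text);
INSERT INTO t SELECT g, repeat('x', 100) FROM generate_series(1, 10000000) g;
CREATE INDEX ON t (a);
VACUUM ANALYZE t;

SET random_page_cost = 1.2;   -- repeat the sweep for 4.0, 1.1, ...
EXPLAIN (ANALYZE, BUFFERS) SELECT sum(length(filler)) FROM t WHERE a < 10000;
EXPLAIN (ANALYZE, BUFFERS) SELECT sum(length(filler)) FROM t WHERE a < 100000;
EXPLAIN (ANALYZE, BUFFERS) SELECT sum(length(filler)) FROM t WHERE a < 1000000;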

David




Re: Changing the default random_page_cost value

2024-10-14 Thread Tom Lane
David Rowley  writes:
> Yeah, I think any effort to change the default value for this setting
> would require some analysis to prove that the newly proposed default
> is a more suitable setting than the current default. I mean, why 1.2
> and not 1.1 or 1.3? Where's the evidence that 1.2 is the best value
> for this?

Yeah, that's been my main concern about this proposal too.

I recall that when we settled on 4.0 as a good number for
spinning-rust drives, it came out of some experimentation that
I'd done that involved multiple-day-long tests.  I don't recall any
more details than that sadly, but perhaps trawling the mailing list
archives would yield useful info.  It looks like the 4.0 value came
in with b1577a7c7 of 2000-02-15, so late 1999/early 2000 would be the
time frame to look in.

regards, tom lane




Re: Make foreach_ptr macro work in C++ extensions

2024-10-14 Thread Nathan Bossart
On Mon, Oct 14, 2024 at 10:05:42PM +0200, Jelte Fennema-Nio wrote:
> I've been writing a C++ extension recently and it turns out that the
> foreach_ptr macro added in PG17 by one of my patches causes a
> compilation failure when used in C++. This is due to C++'s stricter
> rules around implicit casts from void pointers.

Whoops.
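
(For anyone following along, here is a minimal sketch of the language
difference at issue; this is illustrative, not code from the patch:)

#include <stdlib.h>

void
example(void)
{
	void	   *vp = malloc(sizeof(int));
	int		   *a = vp;				/* valid C, but a compile error in C++ */
	int		   *b = (int *) vp;		/* the explicit cast is valid in both */

	(void) a;
	(void) b;
	free(vp);
}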

> Attached is a tiny patch that fixes that by adding an explicit cast to
> the macro.

Looks reasonable to me.

> I think it would be good to backpatch this to PG17 as well,
> so as not to introduce differences in the macro across versions.

I'd consider this a "low-risk" fix, which our versioning page [0] says is
acceptable for a minor release, so I'll plan on back-patching unless
someone objects.

[0] https://www.postgresql.org/support/versioning/

-- 
nathan




Re: Add contrib/pg_logicalsnapinspect

2024-10-14 Thread Masahiko Sawada
On Sun, Oct 13, 2024 at 11:23 PM Bertrand Drouvot
 wrote:
>
> Hi,
>
> On Mon, Oct 14, 2024 at 09:57:22AM +1100, Peter Smith wrote:
> > Here are some minor review comments for v15-0002.
> >
> > ==
> > contrib/pg_logicalinspect/pg_logicalinspect.c
> >
> > 1.
> > +pg_get_logical_snapshot_meta(PG_FUNCTION_ARGS)
> > +{
> > +#define PG_GET_LOGICAL_SNAPSHOT_META_COLS 3
> > + SnapBuildOnDisk ondisk;
> > + HeapTuple tuple;
> > + Datum values[PG_GET_LOGICAL_SNAPSHOT_META_COLS] = {0};
> > + bool nulls[PG_GET_LOGICAL_SNAPSHOT_META_COLS] = {0};
> > + TupleDesc tupdesc;
> > + char path[MAXPGPATH];
> > + int i = 0;
> > + text*filename_t = PG_GETARG_TEXT_PP(0);
> > +
> > + sprintf(path, "%s/%s",
> > + PG_LOGICAL_SNAPSHOTS_DIR,
> > + text_to_cstring(filename_t));
> > +
> > + /* Build a tuple descriptor for our result type */
> > + if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
> > + elog(ERROR, "return type must be a row type");
> > +
> > + /* Validate and restore the snapshot to 'ondisk' */
> > + SnapBuildRestoreSnapshot(&ondisk, path, CurrentMemoryContext, false);
> >
> > The sprintf should be deferred. Could you do it after the ERROR check?
>
> I think that makes sense, done in v16 attached.
>
> > ==
> > src/backend/replication/logical/snapbuild.c
> >
> > 3.
> >  /*
> > - * Restore a snapshot into 'builder' if previously one has been stored at 
> > the
> > - * location indicated by 'lsn'. Returns true if successful, false 
> > otherwise.
> > + * Restore the logical snapshot file contents to 'ondisk'.
> > + *
> > + * If 'missing_ok' is true, will not throw an error if the file is not 
> > found.
> > + * 'context' is the memory context where the catalog modifying/committed 
> > xid
> > + * will live.
> >   */
> > -static bool
> > -SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn)
> > +bool
> > +SnapBuildRestoreSnapshot(SnapBuildOnDisk *ondisk, const char *path,
> > + MemoryContext context, bool missing_ok)
> >
> > nit - I think it's better to describe parameters in the same order
> > that they are declared.
>
> Done in v16.
>
> > Also, include a 'path' description, so it is
> > not the only one omitted.
>
> I don't think that's worth it, as it's self-explanatory IMHO.

Thank you for updating the patches!

I fixed a compiler warning reported by -Wtypedef-redefinition related to
the declaration of the SnapBuild struct, then pushed both patches.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Doc: typo in config.sgml

2024-10-14 Thread Yugo NAGATA
Hi Bruce,

On Mon, 14 Oct 2024 16:31:11 -0400
Bruce Momjian  wrote:

> On Mon, Oct 14, 2024 at 03:05:35PM -0400, Bruce Momjian wrote:
> > I did some more research and was able to clarify our behavior in
> > release.sgml:
> 
> I have specified some more details in my patched version:
> 
> We can only use Latin1 characters, not all UTF8 characters,
> because some rendering engines do not support non-Latin1 UTF8
> characters.  Specifically, the HTML rendering engine can display
> all UTF8 characters, but the PDF rendering engine can only display
> Latin1 characters.  In PDF files, non-Latin1 UTF8 characters are
> displayed as "###".
> 
> In the SGML files we encode non-ASCII Latin1 characters as HTML
> entities, e.g., &Aacute;lvaro.  Oddly, it is possible to safely
> represent Latin1 characters in SGML files as UTF8 for HTML and
> PDF output, but we currently disallow this via the Makefile
> "check-non-ascii" rule.
> 

I agree with encoding non-Latin1 characters and disallowing non-ASCII
characters totally.

I found your patch includes fixes in *.svg files, so how about also
checking them in check-non-ascii? Also, I think it is better to use perl
instead of grep because non-GNU grep doesn't support hex escape sequences.
I've attached an updated patch for the Makefile. The changes in
release.sgml above are not applied
yet, though.


Regards,
Yugo Nagata

-- 
Yugo NAGATA 
diff --git a/doc/src/sgml/Makefile b/doc/src/sgml/Makefile
index 65ed32cd0a..3d992ebd84 100644
--- a/doc/src/sgml/Makefile
+++ b/doc/src/sgml/Makefile
@@ -143,7 +143,7 @@ postgres.txt: postgres.html
 ## Print
 ##
 
-postgres.pdf:
+postgres.pdf pdf:
 	$(error Invalid target;  use postgres-A4.pdf or postgres-US.pdf as targets)
 
 XSLTPROC_FO_FLAGS += --stringparam img.src.path '$(srcdir)/'
@@ -194,7 +194,7 @@ MAKEINFO = makeinfo
 ##
 
 # Quick syntax check without style processing
-check: postgres.sgml $(ALLSGML) check-tabs check-nbsp
+check: postgres.sgml $(ALLSGML) check-tabs check-non-ascii
 	$(XMLLINT) $(XMLINCLUDE) --noout --valid $<
 
 
@@ -257,15 +257,16 @@ endif # sqlmansectnum != 7
 
 # tabs are harmless, but it is best to avoid them in SGML files
 check-tabs:
-	@( ! grep '	' $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml $(srcdir)/*.xsl) ) || \
+	@( ! grep '	' $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml $(srcdir)/*.xsl $(srcdir)/images/*.svg) ) || \
 	(echo "Tabs appear in SGML/XML files" 1>&2;  exit 1)
 
-# Non-breaking spaces are harmless, but it is best to avoid them in SGML files.
-# Use perl command because non-GNU grep or sed could not have hex escape sequence.
-check-nbsp:
-	@ ( $(PERL) -ne '/\xC2\xA0/ and print("$$ARGV:$$_"),$$n++; END {exit($$n>0)}' \
-	  $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml $(srcdir)/*.xsl) ) || \
-	(echo "Non-breaking spaces appear in SGML/XML files" 1>&2;  exit 1)
+# Disallow non-ASCII characters because some rendering engines do not
+# support non-Latin1 UTF8 characters. Use perl command because non-GNU grep
+# or sed could not have hex escape sequence.
+check-non-ascii:
+	@ ( $(PERL) -ne '/[^\x00-\x7f]/ and print("$$ARGV:$$_"),$$n++; END {exit($$n>0)}' \
+	  $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml $(srcdir)/*.xsl $(srcdir)/images/*.svg) ) || \
+	(echo "Non-ASCII characters appear in SGML/XML files;  use HTML entities for Latin1 characters" 1>&2;  exit 1)
 
 ##
 ## Clean
diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index 1ef5322b91..f5e115e8d6 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -1225,7 +1225,7 @@ CREATE COLLATION ignore_accents (provider = icu, locale = 'und-u-ks-level1-kc-tr
 
 -- ignore differences in accents and case
 CREATE COLLATION ignore_accent_case (provider = icu, deterministic = false, locale = 'und-u-ks-level1');
-SELECT 'Å' = 'A' COLLATE ignore_accent_case; -- true
+SELECT '&Aring;' = 'A' COLLATE ignore_accent_case; -- true
 SELECT 'z' = 'Z' COLLATE ignore_accent_case; -- true
 
 -- upper case letters sort before lower case.
@@ -1282,7 +1282,7 @@ SELECT 'w;x*y-z' = 'wxyz' COLLATE num_ignore_punct; -- true
  'ab' = U&'a\2063b'
  'x-y' = 'x_y'
  'g' = 'G'
- 'n' = 'ñ'
+ 'n' = '&ntilde;'
  'y' = 'z'
 

@@ -1346,7 +1346,7 @@ SELECT 'w;x*y-z' = 'wxyz' COLLATE num_ignore_punct; -- true
 
 
  At every level, even with full normalization off, basic normalization is
- performed. For example, 'á' may be composed of the
+ performed. For example, '&aacute;' may be composed of the
  code points U&'\0061\0301' or the single code
  point U&'\00E1', and those sequences will be
  considered equal even at the identic level. To treat
@@ -1430,8 +1430,8 @@ SELECT 'x-y' = 'x_y' COLLATE level4; -- false
  false
  
   Backwards comparison for the level 2 differences. For example,
-  locale und-u-kb sorts 'àe'
-  before 'aé'.
+  local

Re: Statistics Import and Export

2024-10-14 Thread Corey Huinker
On Mon, Oct 14, 2024 at 3:40 PM Corey Huinker 
wrote:

> However, this function seems to accept -1 for the relpages parameter.
>> Below is an example of execution:
>> ---
>> postgres=> CREATE TABLE data1(c1 INT PRIMARY KEY, c2 VARCHAR(10));
>> CREATE TABLE
>> postgres=> SELECT pg_set_relation_stats('data1', relpages=>-1);
>>  pg_set_relation_stats
>> ---
>>  t
>> (1 row)
>> postgres=> SELECT relname, relpages FROM pg_class WHERE relname='data1';
>>  relname | relpages
>> -+--
>>  data1   |   -1
>> (1 row)
>> ---
>>
>> The attached patch modifies the pg_set_relation_stats function to work as
>> described in the manual.
>>
>> Regards,
>> Noriyoshi Shinoda
>>
>
> Accepting -1 is correct. I thought I had fixed that in a recent patch.
> Perhaps signals got crossed somewhere along the way.
>

Just to be sure, I went back to v29, fixed a typo and some whitespace
issues in stats_import.out, confirmed that it passed regression tests, then
changed the relpages lower bound from -1 back to 0, and sure enough, the
regression test for pg_upgrade failed again.

It seems that partitioned tables have a relpages of -1, so regression tests
involving tables alpha_neg and alpha_pos (and 35 others, all seemingly
partitioned) fail. So it was the docs that were wrong, not the code.

e839c8ecc9352b7754e74f19ace013c0c0d18613 doesn't include the stuff that
modified pg_dump/pg_upgrade, so it wouldn't have turned up this problem.


Fix for consume_xids advancing XIDs incorrectly

2024-10-14 Thread Yushi Ogiwara

Hi,

I found that the consume_xids function incorrectly advances XIDs as 
shown:


postgres=# select txid_current();
 txid_current
--
746
(1 row)

postgres=# select consume_xids('100');
 consume_xids
--
847
(1 row)

In the example, the consume_xids function consumes 100 XIDs when XID = 
746, so the desired outcome from consume_xids should be 746 + 100 = 846, 
which differs from the actual outcome, 847.


Behavior inside a transaction block:

postgres=# select txid_current();
 txid_current
--
 1410
(1 row)

postgres=# begin;
BEGIN
postgres=*# select consume_xids('100');
 consume_xids
--
 1511
(1 row)
postgres=*# select consume_xids('100');
 consume_xids
--
 1611
(1 row)

Here, the first call inside the transaction block consumes 100+1 XIDs 
(incorrect), while the second call consumes exactly 100 XIDs (as 
expected).


Summary:

The function performs incorrectly when:
- Outside of a transaction block
- The first call inside a transaction block
But works correctly when:
- After the second call inside a transaction block

The issue arises because consume_xids does not correctly count the 
consumed XIDs when it calls the GetTopTransactionId function, which 
allocates a new XID when none has been assigned.


With the attached patch, the consume_xids works as expected, both inside 
and outside of transaction blocks as shown below:


postgres=# select txid_current();
 txid_current
--
 1546
(1 row)

postgres=# select consume_xids('100');
 consume_xids
--
 1646
(1 row)

postgres=# begin;
BEGIN
postgres=*# select consume_xids('100');
 consume_xids
--
 1746
(1 row)

postgres=*# select consume_xids('100');
 consume_xids
--
 1846
(1 row)

Regards,
Yushi

From 59bc3f300b8447a407c8f90ebb542d1c0ef0e932 Mon Sep 17 00:00:00 2001
From: Yushi Ogiwara 
Date: Fri, 11 Oct 2024 14:45:26 +0900
Subject: [PATCH] fix xid

---
 src/test/modules/xid_wraparound/xid_wraparound.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/test/modules/xid_wraparound/xid_wraparound.c b/src/test/modules/xid_wraparound/xid_wraparound.c
index dce81c0c6d..c5f3b717cf 100644
--- a/src/test/modules/xid_wraparound/xid_wraparound.c
+++ b/src/test/modules/xid_wraparound/xid_wraparound.c
@@ -89,7 +89,11 @@ consume_xids_common(FullTransactionId untilxid, uint64 nxids)
 	 * the cache overflows, but beyond that, we don't keep track of the
 	 * consumed XIDs.
 	 */
-	(void) GetTopTransactionId();
+	if (!FullTransactionIdIsValid(GetTopFullTransactionIdIfAny()))
+	{
+		(void) GetTopTransactionId();
+		consumed++;
+	}
 
 	for (;;)
 	{
-- 
2.40.1



Re: Consider the number of columns in the sort cost model

2024-10-14 Thread David Rowley
On Tue, 15 Oct 2024 at 17:48, Andrei Lepikhov  wrote:
> I am suspicious of that but open to hearing other opinions. The
> coefficient incorporates knowledge about how many comparisons will be
> made with this sorting operator. The caller can have some additional
> knowledge about that. For example, IncrementalSort knows about groups -
> each group has the same prefix, which means each tuple comparison will
> inevitably cause a comparison for each column from the prefix. Also, it
> may be knowledge about the number of distinct values for each column,
> etc. I'm not sure we should elaborate cost_sort a lot here.

I agree that cost_sort() likely needs a bit of work to try to bring it
closer to the current reality.  There have been plenty of performance
improvements to sort over the years, and I don't recall the costing
code ever being touched; likely that's because there are just so many
things that are costed incorrectly or not costed at all.

As for your patch, I'm suspicious that the general idea you're
proposing is an actual improvement.  I've only glanced at the patch,
but at least from this fragment here:

@@ -2150,9 +2156,12 @@ cost_sort(Path *path, PlannerInfo *root,
 {
  Cost startup_cost;
  Cost run_cost;
+ int cmpMultiplier = npathkeys;

and:

 static void
 cost_tuplesort(Cost *startup_cost, Cost *run_cost,
-double tuples, int width,
+double tuples, int width, int cmpMultiplier,
 Cost comparison_cost, int sort_mem,
 double limit_tuples)
 {
@@ -1913,7 +1917,7 @@ cost_tuplesort(Cost *startup_cost, Cost *run_cost,
  tuples = 2.0;

  /* Include the default cost-per-comparison */
- comparison_cost += 2.0 * cpu_operator_cost;
+ comparison_cost += (1.0 + cmpMultiplier) * cpu_operator_cost;

it seems you're just charging cpu_operator_cost * (1 + the number of
pathkeys). It seems like it won't be very hard to fool that into
doing the wrong thing when the first column to sort by is distinct or
almost distinct. There are going to be far fewer or no tiebreaker
comparisons for that case.

As mentioned by Kirill, I also don't understand the cost_sort
signature change. Why would you do that over just doing
list_length(pathkeys) within cost_sort()? You're throwing away a
parameter that might be the most useful one of the bunch for allowing
better sort cost estimates.

I think maybe what is worth working on is seeing if you can better
estimate the number of tiebreak comparisons by checking the number of
distinct values for each sort key.  That might require what we
discussed on another thread about having estimate_num_groups() better
use the n_distinct estimate from the EquivalenceMember with the least
distinct values.  It'll be another question whether all that can be done
and still perform well enough for cost_sort(). You may have
to write it first before we can figure that out.  It may be possible
to buy back some of the overheads with some additional caching...
Perhaps that could be done within the EquivalenceClass.

David




Re: Jargon and acronyms on this mailing list

2024-10-14 Thread Bruce Momjian
On Thu, Sep  5, 2024 at 02:14:48PM +0200, Álvaro Herrera wrote:
> On 2024-Aug-30, Greg Sabino Mullane wrote:
> 
> > I normally wouldn't mention my blog entries here, but this one was about
> > the hackers mailing list, so wanted to let people know about it in case you
> > don't follow Planet Postgres. I scanned the last year's worth of posts and
> > gathered the most used acronyms and jargon. The most commonly used acronym
> > was IMO (in my opinion), followed by FWIW (for what it's worth), and IIUC
> > (if I understand correctly). The complete list can be found in the post
> > below, I'll refrain from copying everything here.
> > 
> > https://www.crunchydata.com/blog/understanding-the-postgres-hackers-mailing-list
> 
> Good post, thanks for taking the time.
> 
> This seems a great resource to link in the
> https://wiki.postgresql.org/wiki/So,_you_want_to_be_a_developer%3F
> page or maybe in
> https://wiki.postgresql.org/wiki/Developer_FAQ
> or both ... and also this one
> http://rhaas.blogspot.com/2024/08/posting-your-patch-on-pgsql-hackers.html

Good idea, URL added to both.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  When a patient asks the doctor, "Am I going to die?", he means 
  "Am I going to die soon?"




Re: Use function smgrclose() to replace the loop

2024-10-14 Thread Ilia Evdokimov



On 14.08.2024 09:32, Steven Niu wrote:

Hi, Kirill, Junwang,

I made this patch to address the refactor issue in our previous email 
discussion.
https://www.postgresql.org/message-id/flat/CABBtG=cdtcbdcbk7mcsy6bjr3s5xutog0vsffuw8olduyyc...@mail.gmail.com 



That is, the for loops in the functions smgrdestroy() and
smgrdounlinkall() can be replaced with smgrclose().


for (forknum = 0; forknum <= MAX_FORKNUM; forknum++)
    smgrsw[reln->smgr_which].smgr_close(reln, forknum);
-->
smgrclose(rels[i]);

Please let me know if you have any questions.

Best Regards,
Steven from Highgo.com


Hello,

Are you sure we can refactor the loop with smgrclose()? I see it is not
quite the same.


Regards,
Ilia Evdokimov,
Tantor Labs LLC.





Re: New "raw" COPY format

2024-10-14 Thread Joel Jacobson
On Sun, Oct 13, 2024, at 14:39, Joel Jacobson wrote:
> On Sun, Oct 13, 2024, at 11:52, Tatsuo Ishii wrote:
>> After copy imported the "unstructured text file" in "row" COPY format,
>> what the column type is?  text? or bytea? If it's text, how do you
>> handle encoding conversion if the "unstructured text file" is encoded
>> in server side unsafe encoding such as SJIS?
>>
>>>  All characters are taken literally.
>>>  There is no special handling for quotes, backslashes, or escape sequences.
>>
>> If SJIS text is imported "literally" (i.e. no encoding conversion), it
>> should be rejected.
>
> I think encoding conversion is still necessary,
> and should work the same as for the COPY formats "text" and "csv".

Attached is a first draft implementation of the new proposed COPY "raw" format.

The first two patches are just the bug fix in HEAD, reported separately:
https://commitfest.postgresql.org/50/5297/

* v4-0001-Fix-thinko-in-tests-for-COPY-options-force_not_null-.patch
The first patch fixes a thinko in tests for COPY options force_not_null and 
force_null.

* v4-0002-Fix-validation-of-FORCE_NOT_NULL-FORCE_NULL-for-all-.patch
The second patch fixes validation of FORCE_NOT_NULL/FORCE_NULL for all-columns 
case.

* v4-0003-Replace-binary-flags-binary-and-csv_mode-with-format.patch
The third patch introduces a new enum CopyFormat, with options for the three 
current formats.

* v4-0004-Reorganize-ProcessCopyOptions-for-clarity-and-consis.patch
The fourth patch reorganizes ProcessCopyOptions for clarity and consistent 
option handling.

* v4-0005-Add-raw-COPY-format-support-for-unstructured-text-da.patch
Finally, the fifth patch introduces the new "raw" COPY format.

Docs and tests updated.

The raw format currently goes through the same multiple stages
as the text and CSV formats. I'm not sure what the best approach would be
if we want to create a special fast parsing path for this.

/Joel

v4-0001-Fix-thinko-in-tests-for-COPY-options-force_not_null-.patch
Description: Binary data


v4-0002-Fix-validation-of-FORCE_NOT_NULL-FORCE_NULL-for-all-.patch
Description: Binary data


v4-0003-Replace-binary-flags-binary-and-csv_mode-with-format.patch
Description: Binary data


v4-0004-Reorganize-ProcessCopyOptions-for-clarity-and-consis.patch
Description: Binary data


v4-0005-Add-raw-COPY-format-support-for-unstructured-text-da.patch
Description: Binary data


Re: Inval reliability, especially for inplace updates

2024-10-14 Thread Nitin Motiani
On Sun, Oct 13, 2024 at 6:15 AM Noah Misch  wrote:
>
> On Sat, Oct 12, 2024 at 06:05:06PM +0530, Nitin Motiani wrote:
> > 1. In heap_inplace_update_and_unlock, currently both buffer and tuple
> > are unlocked outside the critical section. Why do we have to move the
> > buffer unlock within the critical section here? My guess is that it
> > needs to be unlocked for the inplace invals to be processed. But what
> > is the reasoning behind that?
>
> AtInplace_Inval() acquires SInvalWriteLock.  There are two reasons to want to
> release the buffer lock before acquiring SInvalWriteLock:
>
> 1. Otherwise, we'd need to maintain the invariant that no other part of the
>system tries to lock the buffer while holding SInvalWriteLock.  (That would
>cause an undetected deadlock.)
>
> 2. Concurrency is better if we release a no-longer-needed LWLock before doing
>something time-consuming, like acquiring another LWLock potentially is.
>
> Inplace invals do need to happen in the critical section, because we've
> already written the change to shared buffers, making it the new authoritative
> value.  If we fail to invalidate, other backends may continue operating with
> stale caches.
>

Thanks for the clarification.
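
(To restate the ordering being described, a rough sketch; this is a
simplification with assumed call details, not the actual
heap_inplace_update_and_unlock code:)

START_CRIT_SECTION();

/*
 * The new tuple version has already been written into the shared
 * buffer, so it is the authoritative value from this point on.
 */
MarkBufferDirty(buffer);

/*
 * Release the buffer lock first: AtInplace_Inval() acquires
 * SInvalWriteLock, and holding a buffer lock across that would risk
 * an undetected deadlock.
 */
LockBuffer(buffer, BUFFER_LOCK_UNLOCK);

/*
 * Still inside the critical section: a failure to invalidate here
 * must PANIC rather than let other backends keep using stale caches.
 */
AtInplace_Inval();

END_CRIT_SECTION();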

> > 2. Is there any benefit in CacheInvalidateHeapTupleCommon taking the
> > preapre_callback argument? Wouldn't it be simpler to just pass an
> > InvalidationInfo* to the function?
>
> CacheInvalidateHeapTupleCommon() has three conditions that cause it to return
> without invoking the callback.  Every heap_update() calls
> CacheInvalidateHeapTuple().  In typical performance-critical systems, non-DDL
> changes dwarf DDL.  Hence, the overwhelming majority of heap_update() calls
> involve !IsCatalogRelation().  I wouldn't want to allocate InvalidationInfo in
> DDL-free transactions.  To pass in InvalidationInfo*, I suppose I'd move those
> three conditions to a function and make the callers look like:
>
> CacheInvalidateHeapTuple(Relation relation,
>  HeapTuple tuple,
>  HeapTuple newtuple)
> {
> if (NeedsInvalidateHeapTuple(relation))
> CacheInvalidateHeapTupleCommon(relation, tuple, newtuple,
>
> PrepareInvalidationState());
> }
>
> I don't have a strong preference between that and the callback way.
>

Thanks. I would have probably done it using the
NeedsInvalidateHeapTuple. But I don't have a strong enough preference
to change it from the callback way. So the current approach seems
good.

> > Also is inval-requires-xid-v0.patch planned to be fixed up to inplace160?
>
> I figure I'll pursue that on a different thread, after inplace160 and
> inplace180.  If there's cause to pursue it earlier, let me know.
>
Sure. Can be done in a different thread.

Thanks,
Nitin Motiani
Google




Re: New "raw" COPY format

2024-10-14 Thread Joel Jacobson
On Mon, Oct 14, 2024, at 10:07, Joel Jacobson wrote:
> Attached is a first draft implementation of the new proposed COPY "raw" 
> format.
>
> The first two patches are just the bug fix in HEAD, reported separately:
> https://commitfest.postgresql.org/50/5297/

I forgot about adding support for the old syntax format.
Fixed in new version. Only the fifth patch is updated.

Before, only the new syntax worked:
COPY ... (FORMAT raw);

Now this also works:
COPY ... RAW;

/Joel

v5-0001-Fix-thinko-in-tests-for-COPY-options-force_not_null-.patch
Description: Binary data


v5-0002-Fix-validation-of-FORCE_NOT_NULL-FORCE_NULL-for-all-.patch
Description: Binary data


v5-0003-Replace-binary-flags-binary-and-csv_mode-with-format.patch
Description: Binary data


v5-0004-Reorganize-ProcessCopyOptions-for-clarity-and-consis.patch
Description: Binary data


v5-0005-Add-raw-COPY-format-support-for-unstructured-text-da.patch
Description: Binary data


Re: Add contrib/pg_logicalsnapinspect

2024-10-14 Thread Peter Smith
FYI - Although I did not re-apply/test the latest patchset v16*, by
visual inspection of the minor v15/v16 diffs it looks good to me.

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Improve node type forward reference

2024-10-14 Thread Peter Eisentraut

This is a small code simplification.

In primnodes.h, we have

typedef struct IntoClause
{
...
/* materialized view's SELECT query */
Node   *viewQuery ...;

with the comment

/* (Although it's actually Query*, we declare
 * it as Node* to avoid a forward reference.)

But we can do this better by using an incomplete struct, like

struct Query *viewQuery ...;

That way, everything has the correct type and fewer casts are required. 
This technique is already used elsewhere in node type definitions.
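
(A minimal, generic sketch of the technique, with illustrative names:)

struct Query;					/* incomplete type; no definition needed */

typedef struct ExampleClause
{
	struct Query *viewQuery;	/* pointer to incomplete type is allowed */
	/* struct Query embedded;     would not compile: size is unknown */
} ExampleClause;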


The second patch just removes some more unnecessary casts around 
copyObject() that I found while working on this.

From df3f09a1d12c91595ad0702a17a2a7d884f5193f Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 14 Oct 2024 08:52:34 +0200
Subject: [PATCH 1/2] Improve node type forward reference

Instead of using Node *, we can use an incomplete struct.  That way,
everything has the correct type and fewer casts are required.  This
technique is already used elsewhere in node type definitions.
---
 src/backend/commands/createas.c | 2 +-
 src/backend/parser/analyze.c| 2 +-
 src/include/nodes/primnodes.h   | 5 ++---
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 0b629b1f79c..68ec122dbf9 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -133,7 +133,7 @@ create_ctas_internal(List *attrList, IntoClause *into)
if (is_matview)
{
/* StoreViewQuery scribbles on tree, so make a copy */
-   Query  *query = (Query *) copyObject(into->viewQuery);
+   Query  *query = copyObject(into->viewQuery);
 
StoreViewQuery(intoRelationAddr.objectId, query, false);
CommandCounterIncrement();
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index e901203424d..5e5e0b1c5cf 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -3076,7 +3076,7 @@ transformCreateTableAsStmt(ParseState *pstate, 
CreateTableAsStmt *stmt)
 * in the IntoClause because that's where intorel_startup() can
 * conveniently get it from.
 */
-   stmt->into->viewQuery = (Node *) copyObject(query);
+   stmt->into->viewQuery = copyObject(query);
}
 
/* represent the command as a utility Query */
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index ea47652adb8..f2677b2da1d 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -152,8 +152,7 @@ typedef struct TableFunc
  * For CREATE MATERIALIZED VIEW, viewQuery is the parsed-but-not-rewritten
  * SELECT Query for the view; otherwise it's NULL.  This is irrelevant in
  * the query jumbling as CreateTableAsStmt already includes a reference to
- * its own Query, so ignore it.  (Although it's actually Query*, we declare
- * it as Node* to avoid a forward reference.)
+ * its own Query, so ignore it.
  */
 typedef struct IntoClause
 {
@@ -166,7 +165,7 @@ typedef struct IntoClause
OnCommitAction onCommit;/* what do we do at COMMIT? */
char   *tableSpaceName; /* table space to use, or NULL */
/* materialized view's SELECT query */
-   Node   *viewQuery pg_node_attr(query_jumble_ignore);
+   struct Query *viewQuery pg_node_attr(query_jumble_ignore);
boolskipData;   /* true for WITH NO DATA */
 } IntoClause;
 
-- 
2.47.0

From 2cc8cde745140a17c35c17fe3ff10f0a4d5cd12a Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 14 Oct 2024 09:00:15 +0200
Subject: [PATCH 2/2] Fix unnecessary casts of copyObject() result

The result is already of the correct type, so these casts don't do
anything.
---
 src/backend/commands/trigger.c | 2 +-
 src/backend/nodes/nodeFuncs.c  | 4 ++--
 src/backend/rewrite/rewriteManip.c | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 3671e82535e..09356e46d16 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -1169,7 +1169,7 @@ CreateTriggerFiringOn(CreateTrigStmt *stmt, const char 
*queryString,
 * Initialize our fabricated parse node by copying the 
original
 * one, then resetting fields that we pass separately.
 */
-   childStmt = (CreateTrigStmt *) copyObject(stmt);
+   childStmt = copyObject(stmt);
childStmt->funcname = NIL;
childStmt->whenClause = NULL;
 
diff --git a/src/backend/nodes/nodeFuncs.c b/src/backend/nodes/nodeFuncs.c
index 0d00e029f32..f76072228c9 100644
--- a/src/backend/nodes/nodeFuncs.c
+++ b/src/backend/nodes/nodeFuncs.c
@@ -2996,7 +2996,7 @@ expression_tr

Check for tuplestorestate nullness before dereferencing

2024-10-14 Thread Alexander Kuznetsov

Hello everyone,

I'd like to propose adding a check for the nullness of tuplestorestate before 
dereferencing it
in src/backend/executor/nodeMaterial.c. The patch is attached.

I am proposing this fix based on the assumption that tuplestorestate could be 
NULL
since there is a check for it when calculating eof_tuplestore at line 85.
However, since this code hasn't been changed since 2006 and hasn't caused any 
issues,
it is possible that the check for (tuplestorestate == NULL) is redundant when 
calculating eof_tuplestore.

--
Best regards,
Alexander Kuznetsov
From b5217fd138f35fb5bf70ad8741ebed5330457891 Mon Sep 17 00:00:00 2001
From: Alexander Kuznetsov 
Date: Thu, 10 Oct 2024 17:38:10 +0300
Subject: [PATCH] Check for tuplestorestate nullness before dereferencing

tuplestorestate can be NULL when calculating eof_tuplestore,
where tuplestorestate is dereferenced by tuplestore_gettuple().
Add check for nullness before dereferencing.

Found by ALT Linux Team with Svace.
---
 src/backend/executor/nodeMaterial.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/executor/nodeMaterial.c b/src/backend/executor/nodeMaterial.c
index 22e1787fbd..5bc8561f3a 100644
--- a/src/backend/executor/nodeMaterial.c
+++ b/src/backend/executor/nodeMaterial.c
@@ -95,7 +95,7 @@ ExecMaterial(PlanState *pstate)
 			 * to return the one before that, if possible. So do an extra
 			 * fetch.
 			 */
-			if (!tuplestore_advance(tuplestorestate, forward))
+			if (tuplestorestate == NULL || !tuplestore_advance(tuplestorestate, forward))
 return NULL;	/* the tuplestore must be empty */
 		}
 		eof_tuplestore = false;
-- 
2.42.2



Re: Enable data checksums by default

2024-10-14 Thread Peter Eisentraut

On 03.10.24 23:13, Nathan Bossart wrote:

On Tue, Oct 01, 2024 at 11:15:02AM -0400, Peter Eisentraut wrote:

I have committed 0001 (the new option) and 0004 (the docs tweak).  I think
there is consensus for the rest, too, but I'll leave it for a few more days
to think about.  I guess the test failure has to be addressed.


Here is a rebased patch with the test fix (for cfbot).  I have made no
other changes.


I have committed the test changes (patch 0002).  (I renamed the option 
to no_data_checksums to keep the wording consistent with the initdb option.)


Right now, with checksums off by default, this doesn't do much, but you 
can test this like


PG_TEST_INITDB_EXTRA_OPTS=--data-checksums meson test ...

and everything will pass.  To make that work, I had to adjust the order 
of how the initdb options are assembled in Cluster.pm a bit.


I will work on the patch that flips the default next.





Re: Add system column support to the USING clause

2024-10-14 Thread Denis Garsh
If we are not going to implement this functionality, maybe it’s worth 
adding an explicit description to the documentation
(https://www.postgresql.org/docs/current/queries-table-expressions.html)?
A note like: "JOIN USING doesn’t support system columns. If you need to 
join tables by a system column, use the JOIN ON clause."


--
Best regards,
Denis Garsh,
C Developer,
d.ga...@arenadata.io





Re: Check for tuplestorestate nullness before dereferencing

2024-10-14 Thread Alena Rybakina

Hi!

On 14.10.2024 16:41, Ilia Evdokimov wrote:


On 14.10.2024 12:25, Alexander Kuznetsov wrote:

Hello everyone,

I'd like to propose adding a check for the nullness of
tuplestorestate before dereferencing it
in src/backend/executor/nodeMaterial.c. The patch is attached.

I am proposing this fix based on the assumption that tuplestorestate 
could be NULL
since there is a check for it when calculating eof_tuplestore at line 
85.
However, since this code hasn't been changed since 2006 and hasn't 
caused any issues,
it is possible that the check for (tuplestorestate == NULL) is 
redundant when calculating eof_tuplestore.




Hi Alexander,

The 'tuplestorestate' variable may be initialized at line 64 if it is 
NULL. You should consider initializing this variable earlier.



To be honest, I'm not sure this change is necessary. Looking at the 
code, I see that ExecMaterial can handle a NULL tuplestorestate, and 
this code path can be reached if the flags are non-zero, but I think 
those cases have been handled.


As I see it, node->eflags can be zero if it passes the output of a 
subquery, during the initialization of the Material node execution, and 
when the subquery is rescanned.


After the subplan scan is complete, we see that the eof_underlying 
variable becomes true, and this part of the code will no longer be 
accessible. tuplestorestate also becomes NULL.


I also noticed that tuplestorestate=NULL is an indicator that the scan 
is complete, so if this section of code is called, something is wrong 
earlier in the code.


--
Regards,
Alena Rybakina
Postgres Professional





Re: allowing extensions to control planner behavior

2024-10-14 Thread Jakub Wartak
Hi Robert,

On Thu, Oct 10, 2024 at 6:52 PM Robert Haas  wrote:

> On Wed, Sep 18, 2024 at 11:48 AM Robert Haas 
> wrote:
> > Still, I think it's a pretty useful starting point. It is mostly
> > enough to give you control over join planning, and if combined with
> > similar work for scan planning, I think it would be enough for
> > pg_hint_plan. If we also got control over appendrel and agg planning,
> > then you could do a bunch more cool things.
>
> Here's a new set of patches where I added a similar mechanism for scan
> type control. See the commit message for some notes on limitations of
> this approach. In the absence of better ideas, I'd like to proceed
> with something along the lines of 0001 and 0002.
>
> I upgraded the hint_via_alias contrib module (which is not intended
> for commit, it's just a demo) so that it can hint either scan type or
> join type. I think this is sufficient to demonstrate that it's quite
> easy to use hooks to leverage this new infrastructure.


Thank you! I've played a tiny bit with 'hint_via_alias', and IMHO this is
a step in a good direction.


> In fact, the
> biggest thing I'm unhappy about right now is the difficulty of
> providing the hooks with any sort of useful information. I don't think
> it should be the job of this patch set to solve that problem, but I do
> think we should do something about it. The problem exists on two
> levels:
>
> 1. If you want to specify in-query hints using comments, how does your
> extension get access to the comments? [..]Still, it's not clear what other
> approach you could
> adopt.
>

No, I don't think the ability to influence the optimizer should be tied to
SQL comments as a "vehicle" for transferring information, so -1 for going
into any discussion about it and wasting your time on this. Rationale: as
you note, such an "interface" is quirky, and troublesome even for users.
IMHO comments are mostly just used as a way to run experiments, but no one
in touch with reality would ever embed queries with hints in comments
inside a production application if they had other choices, and that's for
multiple reasons (e.g. it's better to have control over it in the DB, as
performance is a function of time [and pg version]; the ORM might even be
unable to do it in the first place; people may not have access to the
source code). You could just assume we have "SET
extension.influence_optimizer='SeqScan(t)'" and be done with it as far as
production troubleshooting goes. It's better because it's easier to use
(one does not even need to launch an editor to modify the query) during
those experiments. E.g. I would find it much faster to iterate in psql
with a loop of SET + `\i query.sql` rather than re-editing what is often
dozens of KBs. And there's even a logon trigger today, so it could be
(ab)used to SET that GUC only under specific conditions (e.g. only for a
specific application_name, and that could even be forced by e.g. the
pgjdbc driver --
jdbcurl?ApplicationName=enable_this_workaround_just_here).

The issue, however, is how you bind such influence to just one specific
query without changing the app in the long run. My take is that we should
utilize compute_query_id (hash), and then the extension should allow
something along the lines of mapping (queryId <->
influence_optimizer), or it could even provide `ALTER SQL <queryId> SET
influence_optimizer='SeqScan(t)'`. Users could take that hash from the %Q
in the log_line_prefix.

2. If you want a certain base relation or join relation to be treated
> in a certain way, how do you identify it? You might think that this is
> easy because, even when a query contains multiple references to a
> relation with the same name, or identical aliases in different parts
> of the query, EXPLAIN renames them so they have disjoint names. What
> would be nice is if you could feed those names back into your
> extension and use them as a way of specifying what behavior you want
> where. But that doesn't work, because what actually happens is that
> the plan can contain duplicated aliases, and then when EXPLAIN
> deparses it using ruleutils.c, that's when we rename things so that
> they're distinct. This means that, at the time we're planning, we
> don't yet know what name EXPLAIN will end up giving to any relation
> yet, which means we can't use the names that EXPLAIN produced for an
> earlier plan for the same query to associate behaviors with relations.
> I wonder if we could think about reversing the order of operations
> here and making it so that we do the distinct-ification during parse
> analysis or maybe early in planning, so that the name you see EXPLAIN
> print out is the real name of that thing and not just a value invented
> for display purposes.
>

This, if that's possible? Or simply some counter numbering the plan
operations? Or Andrei's idea of using hashing?

So again, I am definitely not saying that these patches get us all the
> way to where we should be -- not in te

Re: Missing deconstruct_array_builtin usage

2024-10-14 Thread Peter Eisentraut

On 14.10.24 08:12, Bertrand Drouvot wrote:

It seems that src/backend/utils/adt/float.c file still has functions
that can use [de]construct_array_builtin(), for example
float8_combine(). I think it's an oversight of d746021de1.

Thanks for looking at it.

Good catch, please find attached v2 taking care of those. I looked at the
remaining [de]construct_array() usages and that looks ok to me.


This looks good to me.  I can't think of a reason why it was not 
included in the original patch.






Re: Adding OLD/NEW support to RETURNING

2024-10-14 Thread Dean Rasheed
On Mon, 14 Oct 2024 at 16:34, jian he  wrote:
>
> typedef struct ReturningOption
> {
> NodeTagtype;
> ReturningOptionKind option; /* specified option */
> char   *value;/* option's value */
> ParseLoclocation;/* token location, or -1 if unknown */
> } ReturningOption;
>
> @@ -4304,6 +4332,16 @@ raw_expression_tree_walker_impl(Node *no
>   return true;
>   }
>   break;
> + case T_ReturningClause:
> + {
> + ReturningClause *returning = (ReturningClause *) node;
> +
> + if (WALK(returning->options))
> + return true;
> + if (WALK(returning->exprs))
> + return true;
> + }
> + break;
>
> + if (WALK(returning->options))
> + return true;
> T_ReturningOption is primitive, so we only need to
> "if (WALK(returning->exprs))"?

No, it still needs to walk the options so that it will call the
callback for each option. The fact that T_ReturningOption is primitive
doesn't change that, it just means that there is no more structure
*below* a ReturningOption that needs to be traversed. The
ReturningOption itself still needs to be traversed. For example,
imagine you wanted to use raw_expression_tree_walker() to print out
the whole structure of a raw parse tree. You'd want your printing
callback to be called for every node, including the ReturningOption
nodes.
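
(For illustration, a sketch of what such a printing walker might look
like; this is not from the patch:)

static bool
print_walker(Node *node, void *context)
{
	if (node == NULL)
		return false;

	elog(NOTICE, "visiting node tag %d", (int) nodeTag(node));

	/*
	 * Recursing is what gets print_walker called for each ReturningOption;
	 * for primitive nodes there is simply nothing further below to visit.
	 */
	return raw_expression_tree_walker(node, print_walker, context);
}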

Regards,
Dean




Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

2024-10-14 Thread Pavel Stehule
On Mon, Oct 14, 2024 at 5:38, jian he 
wrote:

> On Mon, Oct 14, 2024 at 1:13 AM Tom Lane  wrote:
> >
> > Pavel Stehule  writes:
> > > On Sat, Oct 12, 2024 at 9:33, jian he wrote:
> > >> + /*
> > >> + * If we have a location (which, as said above, we really always
> should)
> > >> + * then report a line number to aid in localizing problems in big
> scripts.
> > >> + */
> > >> + if (location >= 0)
> > >> so this part will always be true?
> >
> > > yes, after  CleanQuerytext the location should not be -1 ever
> >
> > Right, but we might not have entered either of those previous
> > if-blocks.
>
>
> in src/backend/parser/gram.y
> your makeRawStmt changes (v4) seem to guarantee that
> RawStmt->stmt_location >= 0.
> other places {DefineView,DoCopy,PrepareQuery} use makeNode(RawStmt).
> In these cases, I am not so sure RawStmt->stmt_location >= 0 is always
> true.
>
> in execute_sql_string
>
> raw_parsetree_list = pg_parse_query(sql);
>dest = CreateDestReceiver(DestNone);
> foreach(lc1, raw_parsetree_list)
> {
> RawStmt*parsetree = lfirst_node(RawStmt, lc1);
> MemoryContext per_parsetree_context,
> oldcontext;
> List   *stmt_list;
> ListCell   *lc2;
> callback_arg.stmt_location = parsetree->stmt_location;
> callback_arg.stmt_len = parsetree->stmt_len;
> per_parsetree_context =
> AllocSetContextCreate(CurrentMemoryContext,
>   "execute_sql_string per-statement
> context",
>   ALLOCSET_DEFAULT_SIZES);
> oldcontext = MemoryContextSwitchTo(per_parsetree_context);
> CommandCounterIncrement();
> stmt_list = pg_analyze_and_rewrite_fixedparams(parsetree,
>sql,
>NULL,
>0,
>NULL);
>
> Based on the above code, we do
> `callback_arg.stmt_location = parsetree->stmt_location;`
> pg_parse_query(sql) doesn't use script_error_callback.
>
> So if we are in script_error_callback
> `intlocation = callback_arg->stmt_location;`
> location >= 0 will be always true?
>
>
>
> > The question here is whether the raw parser (gram.y)
> > ever throws an error that doesn't include a cursor position.  IMO it
> > shouldn't, but a quick look through gram.y finds a few ereports that
> > lack parser_errposition.  We could go fix those, and probably should,
> > but imagining that none will ever be introduced again seems like
> > folly.
> >
>
> I don't know how to add the error position inside the function
> insertSelectOptions.
> maybe we can add
> `parser_errposition(exprLocation(limitClause->limitCount))`,
> but the limitCount position is only a nearby position.
> I am also not sure about func mergeTableFuncParameters.
>
>
> for other places in gram.y, I've added error positions for ereport
> that lack it  , please check the attached.
>

I think this can be a separate commitfest issue.

Regards

Pavel


Re: Index AM API cleanup

2024-10-14 Thread Peter Eisentraut

On 24.09.24 11:09, Mark Dilger wrote:




On Sep 24, 2024, at 10:50 AM, Peter Eisentraut  wrote:

Next, I have reviewed patches

v17-0010-Track-sort-direction-in-SortGroupClause.patch
v17-0011-Track-scan-reversals-in-MergeJoin.patch

Both of these seem ok and sensible to me.

They take the concept of the "reverse" flag that already exists in the affected 
code and just apply it more consistently throughout the various code layers, instead of 
relying on strategy numbers as intermediate storage.  This is both helpful for your 
ultimate goal in this patch series, and it also makes the affected code areas simpler and 
more consistent and robust.



Thanks for the review!

Yes, I found the existing use of a btree strategy number rather than a boolean 
"reverse" flag made using the code from other index AMs needlessly harder.  I 
am glad you see it the same way.


committed





Re: Better error reporting from extension scripts (Was: Extend ALTER OPERATOR)

2024-10-14 Thread Tom Lane
jian he  writes:
> On Mon, Oct 14, 2024 at 1:13 AM Tom Lane  wrote:
>> Right, but we might not have entered either of those previous
>> if-blocks.

> in src/backend/parser/gram.y
> your makeRawStmt changes (v4) seem to guarantee that
> RawStmt->stmt_location >= 0.

Yes, I would expect that any RawStmt we see here will have valid
stmt_location.  What you seem to be missing is that an error could
be thrown from

> raw_parsetree_list = pg_parse_query(sql);

before execute_sql_string reaches its loop over RawStmts.  In that
case we'll reach script_error_callback with callback_arg.stmt_location
still being -1.

> pg_parse_query(sql) doesn't use script_error_callback.

Eh?  We've put that on the error context callback stack.
It is not pg_parse_query's decision whether it will be called.
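
(For reference, a minimal sketch of the error-context-stack idiom in
play here; function and variable names are illustrative:)

static void
script_error_cb(void *arg)
{
	errcontext("extension script \"%s\"", (const char *) arg);
}

static void
run_script(char *filename)
{
	ErrorContextCallback errcallback;

	errcallback.callback = script_error_cb;
	errcallback.arg = filename;
	errcallback.previous = error_context_stack;
	error_context_stack = &errcallback;

	/*
	 * Any ereport(ERROR) raised from here on, including one thrown by
	 * pg_parse_query(), invokes script_error_cb before the error is
	 * reported, no matter which function actually threw it.
	 */

	error_context_stack = errcallback.previous;
}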

regards, tom lane




Re: Remove obsolete comment in reorderbuffer.h

2024-10-14 Thread Masahiko Sawada
On Sun, Oct 13, 2024 at 8:57 PM Amit Kapila  wrote:
>
> On Sat, Oct 12, 2024 at 5:42 AM Masahiko Sawada  wrote:
> >
> > I realized an obsolete comment in the definition of ReorderBufferTXN.
> >
> > Commit 9fab40ad32e changed ReorderBuffer to use Slab Context for
> > allocating ReorderBufferTXN entries instead of using a caching
> > mechanism, so the txn->node is no longer used as an element of the
> > list of preallocated ReorderBufferTXNs.
> >
>
> I agree, and your patch looks good to me.

Thank you for reviewing the patch. Pushed.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Missing deconstruct_array_builtin usage

2024-10-14 Thread Masahiko Sawada
On Mon, Oct 14, 2024 at 3:05 AM Peter Eisentraut  wrote:
>
> On 14.10.24 08:12, Bertrand Drouvot wrote:
> >> It seems that src/backend/utils/adt/float.c file still has functions
> >> that can use [de]construct_array_builtin(), for example
> >> float8_combine(). I think it's an oversight of d746021de1.
> > Thanks for looking at it.
> >
> > Good catch, please find attached v2 taking care of those. I looked at the
> > remaining [de]construct_array() usages and that looks ok to me.
>
> This looks good to me.  I can't think of a reason why it was not
> included in the original patch.

Thank you for reviewing the patch. Pushed (v2 patch).

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: pg_upgrade check for invalid databases

2024-10-14 Thread Bruce Momjian
On Sun, Oct 13, 2024 at 08:28:57AM -0400, Thomas Krennwallner wrote:
> In v2 I've made changes to the patch incorporating the suggestions here:
> 
> * Default behaviour is to just fail with a report of all invalid databases
> 
> * A new option --skip-invalid-databases will then skip the checks, and
> would not transfer any invalid database to the new cluster. A warning
> with a report file will then follow after a successful run.
> 
> Dropping invalid databases in the old cluster will make invalid
> databases unrecoverable, so I opted for a skip over invalid databases
> approach that would leave invalid databases in the old cluster.
> 
> Apart from a missing --skip-invalid-databases test, does this attempt look OK?

I don't think there is enough demand for the feature of skipping invalid
databases because we have gotten few reports of such problems, and also
because your case is related to an external tool causing this problem.

What might be acceptable would be to add an option that makes
pg_upgrade more tolerant of problems in many areas, but that requires a
lot more research and discussion.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  When a patient asks the doctor, "Am I going to die?", he means 
  "Am I going to die soon?"




Re: ECPG cleanup and fix for clang compile-time problem

2024-10-14 Thread Tom Lane
John Naylor  writes:
>> [v5]

Thanks for reviewing!  I pushed 0001-0005 and 0009, adopting
your suggestions except for

> + /* List a token here if pgc.l assigns to base_yylval.str for it */
> Does pgc.l need to have a similar comment?

That's not a bad suggestion, but I couldn't see any very useful place
to put such a comment.

> I haven't looked closely at 0006 through 0009. One possible concern is
> that the regression tests might not cover very well, but if you can
> get valgrind silent for memory leaks for what they do cover, that's
> certainly a good step.

Attached are rebased and renumbered 0006-0008, mostly to keep the
cfbot happy.  We could actually stop here, if we were feeling lazy,
but now that I've done the work I'm inclined to push forward with
the rest.

The rest is just memory leak removal, and I suspect that nobody really
cares that much about small leakage in the preprocessor: you'd have to
be running some darn big files through it to notice.  FTR, here are
the total leaks reported by valgrind for running the ecpg regression
tests, using code like

$ grep lost: *log | tr -d ',' | awk '{sum += $5} END {print sum}'

Before these patches:   25743
after 0003: 59049363
after 0005: 141556  (this is master now)
after 0006(0001):   132633
after 0007(0002):   9087
after 0008(0003):   0

So clearly, 0003 by itself wasn't good enough, but arguably no
real users will notice the extra inefficiency as of HEAD.
Still, I'd kind of like to get 0007 (now 0002) in there, and
I believe 0006 (0001) is a necessary prerequisite to that.

regards, tom lane

From 43f3f423db776695172f74d35322f99b23706b86 Mon Sep 17 00:00:00 2001
From: Tom Lane 
Date: Mon, 14 Oct 2024 14:00:40 -0400
Subject: [PATCH v6 1/3] ecpg: fix some memory leakage of data-type-related
 structures.

ECPGfree_type() and related functions were quite incomplete
about removing subsidiary data structures.  Possibly this is
because ecpg wasn't careful to make sure said data structures
always had their own storage.  Previous patches in this series
cleaned up a lot of that, and I had to add a couple more
mm_strdup's here.

Also, ecpg.trailer tended to overwrite struct_member_list[struct_level]
without bothering to free up its previous contents, thus potentially
leaking a lot of struct-member-related storage.  Add
ECPGfree_struct_member() calls at appropriate points.  (Note: the
lifetime of those lists is not obvious.  They are still live after
initial construction, in order to handle cases like

struct foo { ... } foovar1, foovar2;

We can't delete the list immediately after parsing the right brace,
because it needs to be copied for each of the variables.  Instead,
it's kept around until the next struct declaration.)

Discussion: https://postgr.es/m/2011420.1713493...@sss.pgh.pa.us
---
 src/interfaces/ecpg/preproc/ecpg.header  |  3 ++-
 src/interfaces/ecpg/preproc/ecpg.trailer | 11 +--
 src/interfaces/ecpg/preproc/type.c   | 15 +--
 src/interfaces/ecpg/preproc/type.h   |  5 +++--
 src/interfaces/ecpg/preproc/variable.c   |  7 ++-
 5 files changed, 29 insertions(+), 12 deletions(-)

diff --git a/src/interfaces/ecpg/preproc/ecpg.header b/src/interfaces/ecpg/preproc/ecpg.header
index d3df8eabbb..d54eca918d 100644
--- a/src/interfaces/ecpg/preproc/ecpg.header
+++ b/src/interfaces/ecpg/preproc/ecpg.header
@@ -506,11 +506,12 @@ add_typedef(const char *name, const char *dimension, const char *length,
 		this->name = mm_strdup(name);
 		this->brace_level = braces_open;
 		this->type = (struct this_type *) mm_alloc(sizeof(struct this_type));
+		this->type->type_storage = NULL;
 		this->type->type_enum = type_enum;
 		this->type->type_str = mm_strdup(name);
 		this->type->type_dimension = mm_strdup(dimension); /* dimension of array */
 		this->type->type_index = mm_strdup(length);	/* length of string */
-		this->type->type_sizeof = ECPGstruct_sizeof;
+		this->type->type_sizeof = ECPGstruct_sizeof ? mm_strdup(ECPGstruct_sizeof) : NULL;
 		this->struct_member_list = (type_enum == ECPGt_struct || type_enum == ECPGt_union) ?
 			ECPGstruct_member_dup(struct_member_list[struct_level]) : NULL;
 
diff --git a/src/interfaces/ecpg/preproc/ecpg.trailer b/src/interfaces/ecpg/preproc/ecpg.trailer
index 0a77559e83..41029701fc 100644
--- a/src/interfaces/ecpg/preproc/ecpg.trailer
+++ b/src/interfaces/ecpg/preproc/ecpg.trailer
@@ -751,6 +751,7 @@ var_type: simple_type
 			else
 $$.type_sizeof = cat_str(3, "sizeof(", this->name, ")");
 
+			ECPGfree_struct_member(struct_member_list[struct_level]);
 			struct_member_list[struct_level] = ECPGstruct_member_dup(this->struct_member_list);
 		}
 	}
@@ -878,6 +879,7 @@ var_type: simple_type
 			else
 $$.type_sizeof = cat_str(3, "sizeof(", this->name, ")");
 
+			ECPGfree_struct_member(struct_member_list[struct_level]);
 			struct_member_list[struct_level] = ECPGstruct_member_dup(this->struct_member_list);
 		}
 	}
@@ -900,6 +9

Re: Doc: typo in config.sgml

2024-10-14 Thread Bruce Momjian
On Fri, Oct 11, 2024 at 12:36:53PM +0900, Yugo NAGATA wrote:
> On Fri, 11 Oct 2024 12:16:50 +0900 (JST)
> Tatsuo Ishii  wrote:
> 
> > > We can check non-ASCII letters SGML/XML files by preparing "allowlist"
> > > that contains lines which are allowed to have non-ascii characters,
> > > although this list will need to be maintained when lines in it are 
> > > modified.
> > > I've attached a patch to add a simple Perl script to do this.
> > 
> > I doubt it really works. For example, nbsp can be used formatting
> > (that's the purpose of the character in the first place). Whenever a
> > developer decides to or not to use nbsp, "allowlist" needs to be
> > maintained. It's too annoying.
> 
> I suppose non-ascii characters including nbsp are basically disallowed,
> so the allowlist will not increase unless there is some special reason.
> 
> However, it is true that there might be a cost for maintaining the list
> more or less, so if people don't think it is worth adding this check, 
> I will withdraw this proposal.

I did some more research and was able to clarify our behavior in
release.sgml:

We can only use Latin1 characters, not all UTF8 characters,
because rendering engines must support the referenced characters,
and they currently only support Latin1.  In the SGML files we
encode non-ASCII Latin1 characters as HTML entities, e.g.,
&Aacute;lvaro Herrera.  Oddly, it is possible to add Latin1
characters as UTF8, but we currently prevent this via the
Makefile "check-non-ascii" check.

We used to use UTF8 characters in SGML files, but only UTF8 characters
that had Latin1 equivalents, and I think the toolchain would convert
UTF8 to Latin1 for us.

What I ended up doing was to change the UTF8-encoded characters to HTML
entities, and then modify the Makefile to check for any non-ASCII
characters.  This will catch non-breaking spaces and any other UTF8
characters.

I also added a dummy 'pdf' target that is the same as the postgres.pdf
dummy target;  we already had an "html" target, so I thought a "pdf" one
made sense.

Patch attached.  I plan to apply this in a few days to master.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  When a patient asks the doctor, "Am I going to die?", he means 
  "Am I going to die soon?"
diff --git a/doc/src/sgml/Makefile b/doc/src/sgml/Makefile
index 65ed32cd0ab..87d21783e52 100644
--- a/doc/src/sgml/Makefile
+++ b/doc/src/sgml/Makefile
@@ -143,7 +143,7 @@ postgres.txt: postgres.html
 ## Print
 ##
 
-postgres.pdf:
+postgres.pdf pdf:
 	$(error Invalid target;  use postgres-A4.pdf or postgres-US.pdf as targets)
 
 XSLTPROC_FO_FLAGS += --stringparam img.src.path '$(srcdir)/'
@@ -194,7 +194,7 @@ MAKEINFO = makeinfo
 ##
 
 # Quick syntax check without style processing
-check: postgres.sgml $(ALLSGML) check-tabs check-nbsp
+check: postgres.sgml $(ALLSGML) check-tabs check-non-ascii
 	$(XMLLINT) $(XMLINCLUDE) --noout --valid $<
 
 
@@ -262,10 +262,9 @@ check-tabs:
 
 # Non-breaking spaces are harmless, but it is best to avoid them in SGML files.
 # Use perl command because non-GNU grep or sed could not have hex escape sequence.
-check-nbsp:
-	@ ( $(PERL) -ne '/\xC2\xA0/ and print("$$ARGV:$$_"),$$n++; END {exit($$n>0)}' \
-	  $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml $(srcdir)/*.xsl) ) || \
-	(echo "Non-breaking spaces appear in SGML/XML files" 1>&2;  exit 1)
+check-non-ascii:
+	@( ! grep -P '[^\x00-\x7f]' $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml $(srcdir)/*.xsl) ) || \
+	(echo "Non-ASCII characters appear in SGML/XML files;  use HTML entities for Latin1 characters" 1>&2;  exit 1)
 
 ##
 ## Clean
diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index 1ef5322b912..f5e115e8d6e 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -1225,7 +1225,7 @@ CREATE COLLATION ignore_accents (provider = icu, locale = 'und-u-ks-level1-kc-tr
 
 -- ignore differences in accents and case
 CREATE COLLATION ignore_accent_case (provider = icu, deterministic = false, locale = 'und-u-ks-level1');
-SELECT 'Å' = 'A' COLLATE ignore_accent_case; -- true
+SELECT '&Aring;' = 'A' COLLATE ignore_accent_case; -- true
 SELECT 'z' = 'Z' COLLATE ignore_accent_case; -- true
 
 -- upper case letters sort before lower case.
@@ -1282,7 +1282,7 @@ SELECT 'w;x*y-z' = 'wxyz' COLLATE num_ignore_punct; -- true
  'ab' = U&'a\2063b'
  'x-y' = 'x_y'
  'g' = 'G'
- 'n' = 'ñ'
+ 'n' = '&ntilde;'
  'y' = 'z'
 

@@ -1346,7 +1346,7 @@ SELECT 'w;x*y-z' = 'wxyz' COLLATE num_ignore_punct; -- true
 
 
  At every level, even with full normalization off, basic normalization is
- performed. For example, 'á' may be composed of the
+ performed. For example, '&aacute;' may be composed of the
  code points U&'\0061\0301' or the single code
  point U&'\00E1', and those sequences will be
 considered equal even at the identic level.

Re: Add support to TLS 1.3 cipher suites and curves lists

2024-10-14 Thread Peter Eisentraut

On 26.09.24 11:01, Daniel Gustafsson wrote:

Attached is a v7 which addresses a test failure in the CI.  It turns out
that the test_misc module gathers GUC names using the :alpha: character
class, which only allows alphabetic characters, whereas GUC names can
contain digits.  The 0001 patch fixes this by instead using the :alnum:
character class, which allows all alphanumeric characters.  This is not
directly related to this patch; it just happened to be exposed by it.
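
To make the character-class difference concrete, here is a small
illustration using POSIX regexes.  The digit-bearing GUC name is assumed
to be the ssl_tls13_ciphers setting this patch adds; any name containing
a digit shows the same effect:

#include <regex.h>
#include <stdio.h>

int
main(void)
{
	/* ssl_tls13_ciphers is assumed here as the digit-bearing GUC name */
	const char *names[] = {"work_mem", "ssl_tls13_ciphers"};
	regex_t		alpha_re,
				alnum_re;

	/* :alpha: misses digits; :alnum: accepts them */
	regcomp(&alpha_re, "^[[:alpha:]_]+$", REG_EXTENDED | REG_NOSUB);
	regcomp(&alnum_re, "^[[:alnum:]_]+$", REG_EXTENDED | REG_NOSUB);

	for (int i = 0; i < 2; i++)
		printf("%-20s alpha:%-3s alnum:%s\n", names[i],
			   regexec(&alpha_re, names[i], 0, NULL, 0) == 0 ? "yes" : "no",
			   regexec(&alnum_re, names[i], 0, NULL, 0) == 0 ? "yes" : "no");

	regfree(&alpha_re);
	regfree(&alnum_re);
	return 0;
}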


If we are raising the minimum version to OpenSSL 1.1.1, couldn't we then 
remove the version check introduced by commit cdbc0c0 ("Only perform 
pg_strong_random init when required")?
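
For context, the check in question guards an explicit initialization
step that newer OpenSSL no longer needs: from 1.1.1 onward the CSPRNG
seeds itself on first use.  A minimal sketch of why the guard becomes
dead code (plain OpenSSL, not the actual pg_strong_random code):

#include <stdio.h>
#include <openssl/rand.h>

int
main(void)
{
	unsigned char buf[16];

	/*
	 * With OpenSSL >= 1.1.1 the CSPRNG seeds itself on first use, so no
	 * explicit initialization call is needed before RAND_bytes().
	 */
	if (RAND_bytes(buf, sizeof(buf)) != 1)
	{
		fprintf(stderr, "RAND_bytes failed\n");
		return 1;
	}
	for (size_t i = 0; i < sizeof(buf); i++)
		printf("%02x", buf[i]);
	putchar('\n');
	return 0;
}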


FWIW, these patches generally look okay to me.  I haven't done much 
in-depth checking, but overall everything looks sensible.  I think Jacob 
already provided more in-depth reviews, but let me know if you need 
anything else on this.






Re: Check for tuplestorestate nullness before dereferencing

2024-10-14 Thread Ilia Evdokimov



On 14.10.2024 12:25, Alexander Kuznetsov wrote:

Hello everyone,

I'd like to propose adding a check for the nullness of tuplestorestate
before dereferencing it in src/backend/executor/nodeMaterial.c.  The
patch is attached.

I am proposing this fix based on the assumption that tuplestorestate
could be NULL, since there is a check for it when calculating
eof_tuplestore at line 85.  However, since this code hasn't been changed
since 2006 and hasn't caused any issues, it is possible that the check
for (tuplestorestate == NULL) is redundant when calculating
eof_tuplestore.




Hi Alexander,

The 'tuplestorestate' variable may be initialized at line 64 if it is 
NULL. You should consider initializing this variable earlier.
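
To make the two options concrete, here is a stripped-down toy model in
C of the pattern under discussion; the real Tuplestorestate is opaque
and the real logic does much more, so treat this as a sketch only:

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Toy stand-in for Tuplestorestate: just tracks an EOF flag. */
typedef struct Tuplestore
{
	bool		at_eof;
} Tuplestore;

typedef struct MaterialState
{
	Tuplestore *tuplestorestate;	/* stays NULL until first needed */
} MaterialState;

/*
 * The shape of the code in question: eof_tuplestore is computed with a
 * NULL guard (so the variable can legitimately be NULL there), but a
 * later dereference relies on the store having been created in between.
 * Creating the store earlier makes the later dereference safe.
 */
static bool
exec_material(MaterialState *node)
{
	Tuplestore *ts = node->tuplestorestate;
	bool		eof_tuplestore = (ts == NULL) || ts->at_eof;

	if (ts == NULL)
	{
		/* create the store before any later dereference */
		ts = calloc(1, sizeof(Tuplestore));
		node->tuplestorestate = ts;
	}

	/* ... later code can now dereference ts safely ... */
	return eof_tuplestore || ts->at_eof;
}

int
main(void)
{
	MaterialState node = {NULL};

	printf("eof on first call: %d\n", (int) exec_material(&node));
	free(node.tuplestorestate);
	return 0;
}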


Regards,
Ilia Evdokimov,
Tantor Labs LLC.





Re: Adding OLD/NEW support to RETURNING

2024-10-14 Thread jian he
On Mon, Oct 14, 2024 at 7:03 PM Dean Rasheed  wrote:
>
> > > typedef struct ReturningOption
> > > {
> > >     NodeTag     type;
> > >     bool        isNew;
> > >     char       *name;
> > >     int         location;
> > > } ReturningOption;
>
> Thinking about that struct some more, I think "isNew" is better done
> as an enum, since this is meant to be a generic option. So even though
> it might never have more than 2 possible values, I think it's neater
> done that way.
>

typedef struct ReturningOption
{
    NodeTag     type;
    ReturningOptionKind option;     /* specified option */
    char       *value;              /* option's value */
    ParseLoc    location;           /* token location, or -1 if unknown */
} ReturningOption;
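
The ReturningOptionKind it references would presumably be something
like the following; the value names here are my guess, not necessarily
the patch's spelling:

/* Hypothetical spelling of the discriminator; illustrative only. */
typedef enum ReturningOptionKind
{
	RETURNING_OPTION_OLD,		/* alias for OLD rows in RETURNING */
	RETURNING_OPTION_NEW,		/* alias for NEW rows in RETURNING */
} ReturningOptionKind;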



@@ -4304,6 +4332,16 @@ raw_expression_tree_walker_impl(Node *no
 				return true;
 			}
 			break;
+		case T_ReturningClause:
+			{
+				ReturningClause *returning = (ReturningClause *) node;
+
+				if (WALK(returning->options))
+					return true;
+				if (WALK(returning->exprs))
+					return true;
+			}
+			break;


+		if (WALK(returning->options))
+			return true;

T_ReturningOption is primitive, so do we only need
"if (WALK(returning->exprs))"?