Re: Implement missing join selectivity estimation for range types
I cannot figure out why it aborts. As Tom mentioned upthread about the test cases, similar to the check_estimated_rows function in src/test/regress/sql/stats_ext.sql, we can test it with something like:

create or replace function check_estimated_rows(text) returns table (ok bool)
language plpgsql as
$$
declare
    ln text;
    tmp text[];
    first_row bool := true;
begin
    for ln in execute format('explain analyze %s', $1)
    loop
        if first_row then
            first_row := false;
            tmp := regexp_match(ln, 'rows=(\d*) .* rows=(\d*)');
            return query select 0.2 < tmp[1]::float8 / tmp[2]::float8
                            and tmp[1]::float8 / tmp[2]::float8 < 5;
        end if;
    end loop;
end;
$$;

select * from check_estimated_rows($$select * from test_range_join_1, test_range_join_2 where ir1 && ir2$$);
select * from check_estimated_rows($$select * from test_range_join_1, test_range_join_2 where ir1 << ir2$$);
select * from check_estimated_rows($$select * from test_range_join_1, test_range_join_2 where ir1 >> ir2$$);

Do you need 3 tables to do the test? Since we need to actually run the query and then compare the estimated rows with the actually returned rows, executing a query that joins 3 tables will take a lot of time. So a two-table join with a WHERE qual should be fine?

+ /* Fast-forwards i and j to start of iteration */
+ for (i = 0; range_cmp_bound_values(typcache, &hist1[i], &hist2[0]) < 0; i++);
+ for (j = 0; range_cmp_bound_values(typcache, &hist2[j], &hist1[0]) < 0; j++);
+
+ /* Do the estimation on overlapping regions */
+ while (i < nhist1 && j < nhist2)
+ {
+     double cur_sel1,
+            cur_sel2;
+     RangeBound cur_sync;
+
+     if (range_cmp_bound_values(typcache, &hist1[i], &hist2[j]) < 0)
+         cur_sync = hist1[i++];
+     else if (range_cmp_bound_values(typcache, &hist1[i], &hist2[j]) > 0)
+         cur_sync = hist2[j++];
+     else
+     {
+         /* If equal, skip one */
+         cur_sync = hist1[i];

In this part, range_cmp_bound_values() is computed twice in the "if / else if". You can compute it just once:

    int cmp;

    cmp = range_cmp_bound_values(typcache, &hist1[i], &hist2[j]);
    if (cmp < 0)
        ...
    else if (cmp > 0)
        ...
    else
        ...

(a fuller sketch is at the end of this mail). Also, I think you can fold the following into the main while loop:

+ for (i = 0; range_cmp_bound_values(typcache, &hist1[i], &hist2[0]) < 0; i++);
+ for (j = 0; range_cmp_bound_values(typcache, &hist2[j], &hist1[0]) < 0; j++);

Splitting range and multirange into 2 patches might be a good idea. It seems the same function (calc_hist_join_selectivity) with the same signature appears in both src/backend/utils/adt/multirangetypes_selfuncs.c and src/backend/utils/adt/rangetypes_selfuncs.c, and the complaints from the previous mail are not resolved.
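For illustration, here is roughly how that part could look with a single comparison per iteration (an untested sketch based only on the quoted hunk; the rest of the loop body is assumed unchanged):

    /* Do the estimation on overlapping regions */
    while (i < nhist1 && j < nhist2)
    {
        double      cur_sel1,
                    cur_sel2;   /* used further down, as in the original */
        RangeBound  cur_sync;
        int         cmp;

        /* Compare the two current bounds only once per iteration */
        cmp = range_cmp_bound_values(typcache, &hist1[i], &hist2[j]);

        if (cmp < 0)
            cur_sync = hist1[i++];
        else if (cmp > 0)
            cur_sync = hist2[j++];
        else
        {
            /* If equal, skip one */
            cur_sync = hist1[i];
            /* ... */
        }

        /* ... cur_sel1/cur_sel2 computation as in the original patch ... */
    }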
Re: [PoC] Improve dead tuple storage for lazy vacuum
On Mon, Jan 22, 2024 at 2:24 PM Masahiko Sawada wrote: > > For the next version patch, I'll work on this idea and try to clean up > locking stuff both in tidstore and radix tree. Or if you're already > working on some of them, please let me know. I'll review it. Okay go ahead, sounds good. I plan to look at the tests since they haven't been looked at in a while.
Bug report and fix about building historic snapshot
Hello, I found a bug in building the historic snapshot; the steps to reproduce are as follows:

Prepare:
(pub)create table t1 (id int primary key);
(pub)insert into t1 values (1);
(pub)create publication pub for table t1;
(sub)create table t1 (id int primary key);

Reproduce:
(pub)begin; insert into t1 values (2); (txn1 in session1)
(sub)create subscription sub connection 'hostaddr=127.0.0.1 port=5432 user=xxx dbname=postgres' publication pub; (pub will switch to BUILDING_SNAPSHOT state soon)
(pub)begin; insert into t1 values (3); (txn2 in session2)
(pub)create table t2 (id int primary key); (session3)
(pub)commit; (commit txn1, and pub will switch to FULL_SNAPSHOT state soon)
(pub)begin; insert into t2 values (1); (txn3 in session3)
(pub)commit; (commit txn2, and pub will switch to CONSISTENT state soon)
(pub)commit; (commit txn3; replaying txn3 will fail because its snapshot cannot see t2)

Reasons: We currently don't track transactions that begin after BUILDING_SNAPSHOT and commit before FULL_SNAPSHOT when building the historic snapshot in logical decoding. This can cause a transaction that begins after FULL_SNAPSHOT to take an incorrect historic snapshot, because transactions committed in the BUILDING_SNAPSHOT state are not processed by SnapBuildCommitTxn(). To fix it, we can forcibly track transactions that begin after BUILDING_SNAPSHOT and commit before FULL_SNAPSHOT by using SnapBuildCommitTxn().

--
Regards, ChangAo Chen

0001-Track-transactions-committed-in-BUILDING_SNAPSHOT.patch Description: Binary data
Re: heavily contended lwlocks with long wait queues scale badly
On Fri, Jan 19, 2024 at 01:49:59PM +0100, Jakub Wartak wrote: > Hi Michael, just to reassure you that it is a good thing. We have a > customer who reported much better performance on 16.x than on 13~15 in > very heavy duty LWLock/lockmanager scenarios (ofc, before that was > committed/released), so I gave it a try here today to see how much can > be attributed to that single commit. Ahh. Thanks for the feedback. -- Michael signature.asc Description: PGP signature
RE: Synchronizing slots from primary to standby
On Monday, January 22, 2024 11:36 AM shveta malik wrote:
Hi,

> On Fri, Jan 19, 2024 at 4:18 PM shveta malik wrote:
> >
> > PFA v64.
>
> V64 fails to apply to HEAD due to a recent commit. Rebased it. PFA v64_2. It has
> no new changes.

I noticed a few things while analyzing the patch.

1.
sleep_ms = Min(sleep_ms * 2, MAX_WORKER_NAPTIME_MS);

The initial value of sleep_ms is 0 (the default value for a static variable), which will never advance in this expression. We should initialize sleep_ms to a positive number.

2.
/* Wait a bit, we don't expect to have to wait long */
rc = WaitLatch(MyLatch, WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH, 10L, WAIT_EVENT_BGWORKER_SHUTDOWN);

The slotsync worker is not a bgworker anymore after the 0003 patch, so I think a new wait event is needed.

3.
slot->effective_catalog_xmin = xmin_horizon;

The same assignment is also needed in local_slot_update() to make ReplicationSlotsComputeRequiredXmin() work.

Best Regards,
Hou zj
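To illustrate point 1, the intent is presumably something like the following (just a sketch; names other than sleep_ms and MAX_WORKER_NAPTIME_MS are made up for the example, and Min() is the usual c.h macro):

/*
 * Sketch for point 1: start from a positive nap time so the doubling in
 * Min(sleep_ms * 2, MAX_WORKER_NAPTIME_MS) actually advances.
 */
#define MIN_WORKER_NAPTIME_MS	200		/* hypothetical starting nap time */
#define MAX_WORKER_NAPTIME_MS	30000

static long sleep_ms = MIN_WORKER_NAPTIME_MS;	/* was 0 by default */

static void
update_nap_time(bool did_work)
{
	if (did_work)
		sleep_ms = MIN_WORKER_NAPTIME_MS;	/* reset after activity */
	else
		sleep_ms = Min(sleep_ms * 2, MAX_WORKER_NAPTIME_MS);	/* back off */
}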
Re: Test slots invalidations in 035_standby_logical_decoding.pl only if dead rows are removed
Hi, On Mon, Jan 22, 2024 at 03:54:44PM +0900, Michael Paquier wrote: > On Fri, Jan 19, 2024 at 09:03:01AM +, Bertrand Drouvot wrote: > > On Fri, Jan 19, 2024 at 09:00:01AM +0300, Alexander Lakhin wrote: > +# Launch $sql and wait for a new snapshot that has a newer horizon before > +# doing the vacuum with $vac_option on $to_vac. > +sub wait_until_vacuum_can_remove > > This had better document what the arguments of this routine are, > because that's really unclear. $to_vac is the relation that will be > vacuumed, for one. Agree, done that way in v8 attached. > Also, wouldn't it be better to document in the test why > txid_current_snapshot() is chosen? Contrary to txid_current(), it > does not generate a Transaction/COMMIT to make the test more > predictible, something you have mentioned upthread, and implied in the > test. Good point, added more comments in v8 (but not mentioning txid_current() as after giving more thought about it then I think it was not the right approach in any case). > > - INSERT INTO flush_wal DEFAULT VALUES; -- see create table flush_wal > > This removes two INSERTs on flush_wal and refactors the code to do the > INSERT in wait_until_vacuum_can_remove(), using a SQL comment to > document a reference about the reason why an INSERT is used. Couldn't > you just use a normal comment here? Agree, done in v8. > >> I've re-tested the v6 patch today and confirmed that it makes the test > >> more stable. I ran (with bgwriter_delay = 1 in temp.config) 20 tests in > >> parallel and got failures ('inactiveslot slot invalidation is logged with > >> vacuum on pg_authid') on iterations 2, 6, 6 with no patch applied. > >> (With unlimited CPU, when average test duration is around 70 seconds.) > >> > >> But with v6 applied, 60 iterations succeeded. > > > > Nice! Thanks for the testing! > > I need to review what you have more thoroughly, but is it OK to assume > that both of you are happy with the latest version of the patch in > terms of stability gained? That's still not the full picture, still a > good step towards that. Yeah, I can clearly see how this patch helps from a theoritical perspective but rely on Alexander's testing to see how it actually stabilizes the test. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com >From 1380408965d8c7cb1a2e63043af7d1b8033a7e89 Mon Sep 17 00:00:00 2001 From: bdrouvot Date: Tue, 9 Jan 2024 05:08:35 + Subject: [PATCH v8] Fix 035_standby_logical_decoding.pl race condition We want to ensure that vacuum was able to remove dead rows (aka no other transactions holding back global xmin) before testing for slots invalidation on the standby. While at it, also fixing some typos/bad test description in it. --- .../t/035_standby_logical_decoding.pl | 78 +++ 1 file changed, 45 insertions(+), 33 deletions(-) 100.0% src/test/recovery/t/ diff --git a/src/test/recovery/t/035_standby_logical_decoding.pl b/src/test/recovery/t/035_standby_logical_decoding.pl index 8bc39a5f03..6664405772 100644 --- a/src/test/recovery/t/035_standby_logical_decoding.pl +++ b/src/test/recovery/t/035_standby_logical_decoding.pl @@ -238,6 +238,35 @@ sub check_for_invalidation ) or die "Timed out waiting confl_active_logicalslot to be updated"; } +# Launch $sql and wait for a new snapshot that has a newer horizon before +# doing the vacuum. Arguments are: $vac_option (the option to be passed to the +# vacuum command), $sql (the sql to launch before triggering the vacuum) and +# $to_vac (the relation to vacuum). 
+# Note that to get the horizon we're using pg_current_snapshot() (it does not +# generate a Transaction/COMMIT wal record, so we don't increase the risk to see +# a xl_running_xacts that would advance active replication slot's catalog_xmin). +# Indeed advancing the active replication slot's catalog_xmin would break some +# tests that expect the active slot to conflict with the catalog xmin horizon. +sub wait_until_vacuum_can_remove +{ + my ($vac_option, $sql, $to_vac) = @_; + + # Get the current xid horizon + my $xid_horizon = $node_primary->safe_psql('testdb', qq[select pg_snapshot_xmin(pg_current_snapshot());]); + + # Launch our sql + $node_primary->safe_psql('testdb', qq[$sql]); + + # Wait until we get a newer horizon + $node_primary->poll_query_until('testdb', + "SELECT (select pg_snapshot_xmin(pg_current_snapshot())::text::int - $xid_horizon) > 0") + or die "new snapshot does not have a newer horizon"; + + # Launch the vacuum command and insert into flush_wal (see create table + # flush_wal for the reason why). + $node_primary->safe_psql('testdb', qq[VACUUM $vac_option verbose $to_vac; + INSERT INTO flush_wal DEFAULT VALUES;]); +} # Initialize primary node @@ -248,6 +277,7 @@ $node_primary->append_conf( wal_level = 'logical' max_replication_slots = 4 max_wal_senders = 4 +autovac
RE: speed up a logical replica setup
Dear Peter, > > > Yet other options could be > > pg_buildsubscriber, pg_makesubscriber as 'build' or 'make' in the name > > sounds like we are doing some work to create the subscriber which I > > think is the case here. > > I see your point here. pg_createsubscriber is not like createuser in > that it just runs an SQL command. It does something different than > CREATE SUBSCRIBER. So a different verb would make that clearer. Maybe > something from here: https://www.thesaurus.com/browse/convert I read the link and found a good verb "switch". So, how about using "pg_switchsubscriber"? Best Regards, Hayato Kuroda FUJITSU LIMITED
Re: Add \syncpipeline command to pgbench
That looks like a bug with how opened pipelines are not caught at the end of the script processing. startpipeline seems to have similar related issues. $ cat only_startpipeline.sql \startpipeline SELECT 1; With only 1 transaction, pgbench will consider this a success despite not sending anything since the pipeline was not flushed: pgbench -t1 -Mextended -f only_startpipeline.sql [...] number of transactions per client: 1 number of transactions actually processed: 1/1 With 2 transactions, the error will happen when \startpipeline is called a second time: pgbench -t2 -Mextended -f only_startpipeline.sql [...] pgbench: error: client 0 aborted in command 0 (startpipeline) of script 0; already in pipeline mode number of transactions per client: 2 number of transactions actually processed: 1/2 I've split the changes into two patches. 0001 introduces a new error when the end of a pgbench script is reached while there's still an ongoing pipeline. 0002 adds the \syncpipeline command (original patch with an additional test case). Regards, Anthonin On Mon, Jan 22, 2024 at 7:16 AM Michael Paquier wrote: > > On Fri, Jan 19, 2024 at 08:55:31AM +0100, Anthonin Bonnefoy wrote: > > I've updated the doc to group the commands. It does look better and > > more consistent with similar command groups like \if. > > I was playing with a few meta command scenarios while looking at this > patch, and this sequence generates an error that should never happen: > $ cat /tmp/test.sql > \startpipeline > \syncpipeline > $ pgbench -n -f /tmp/test.sql -M extended > [...] > pgbench: error: unexpected transaction status 1 > pgbench: error: client 0 aborted while receiving the transaction status > > It looks to me that we need to be much smarter than that for the error > handling we'd need when a sync request is optionally sent when a > transaction stops at the end of pgbench. Could you look at it? > -- > Michael v3-0001-Abort-pgbench-if-script-end-is-reached-with-an-op.patch Description: Binary data v3-0002-pgbench-Add-syncpipeline-to-pgbench.patch Description: Binary data
Re: speed up a logical replica setup
On Mon, Jan 22, 2024 at 2:38 PM Hayato Kuroda (Fujitsu) wrote: > > > > > Yet other options could be > > > pg_buildsubscriber, pg_makesubscriber as 'build' or 'make' in the name > > > sounds like we are doing some work to create the subscriber which I > > > think is the case here. > > > > I see your point here. pg_createsubscriber is not like createuser in > > that it just runs an SQL command. It does something different than > > CREATE SUBSCRIBER. Right. So a different verb would make that clearer. Maybe > > something from here: https://www.thesaurus.com/browse/convert > > I read the link and found a good verb "switch". So, how about using > "pg_switchsubscriber"? > I also initially thought on these lines and came up with a name like pg_convertsubscriber but didn't feel strongly about it as that would have sounded meaningful if we use a name like pg_convertstandbytosubscriber. Now, that has become too long. Having said that, I am not opposed to it having a name on those lines. BTW, another option that occurred to me today is pg_preparesubscriber. We internally create slots and then wait for wal, etc. which makes me sound like adding 'prepare' in the name can also explain the purpose. -- With Regards, Amit Kapila.
Re: Synchronizing slots from primary to standby
On Mon, Jan 22, 2024 at 1:11 PM Bertrand Drouvot wrote: > > Hi, > Thanks for sharing the feedback. > > > Then we also discussed whether extending libpqwalreceiver's connect > > API is a good idea and whether we need to further extend it in the > > future. As far as I can see, slotsync worker's primary requirement is > > to execute SQL queries which the current API is sufficient, and don't > > see something that needs any drastic change in this API. Note that > > tablesync worker that executes SQL also uses these APIs, so we may > > need something in the future for either of those. Then finally we need > > a slotsync worker to also connect to a database to use SQL and fetch > > results. > > > > On my side the nits concerns about using the libpqrcv_connect / > walrcv_connect are: > > - cosmetic: the "rcv" do not really align with the sync slot worker > But note that the same API is even used for apply worker as well. One can think that this is a connection used to receive WAL or slot_info. minor comments on the patch: === 1. + /* First time slot update, the function must return true */ + Assert(local_slot_update(remote_slot)); Isn't moving this code to Assert in update_and_persist_slot() wrong? It will make this function call no-op in non-assert builds? 2. + ereport(LOG, + errmsg("newly locally created slot \"%s\" is sync-ready now", I think even without 'locally' in the above LOG message, it is clear. 3. +/* + * walrcv_get_dbinfo_for_failover_slots_fn + * + * Run LIST_DBID_FOR_FAILOVER_SLOTS on primary server to get the + * list of unique DBIDs for failover logical slots + */ +typedef List *(*walrcv_get_dbinfo_for_failover_slots_fn) (WalReceiverConn *conn); This looks like a leftover from the previous version of the patch. -- With Regards, Amit Kapila.
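Regarding comment 1, keeping the call outside the Assert avoids losing the side effect in non-assert builds; the usual pattern is something like this (sketch only):

	bool		slot_updated PG_USED_FOR_ASSERTS_ONLY;

	/* First time slot update, the function must return true */
	slot_updated = local_slot_update(remote_slot);
	Assert(slot_updated);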
Re: initdb's -c option behaves wrong way?
> On 19 Jan 2024, at 17:33, Tom Lane wrote: > > Daniel Gustafsson writes: >> I'll give some more time for opinions, then I'll go ahead with one of the >> patches with a backpatch to v16. > > OK, I take back my previous complaint that just using strncasecmp > would be enough --- I was misremembering how the code worked, and > you're right that it would use the spelling from the command line > rather than that from the file. > > However, the v3 patch is flat broken. You can't assume you have > found a match until you've verified that whitespace and '=' > appear next --- otherwise, you'll be fooled by a guc_name that > is a prefix of one that appears in the file. I think the simplest > change that does it correctly is as attached. The attached v4 looks good to me, I don't think it moves the needle wrt readability (ie, no need to move the search). Feel free to go ahead with that version, or let me know if you want me to deal with it. -- Daniel Gustafsson
Re: Prefetch the next tuple's memory during seqscans
On Sat, 20 Jan 2024 at 16:35, vignesh C wrote: > I'm seeing that there has been no activity in this thread for more > than 6 months, I'm planning to close this in the current commitfest > unless someone is planning to take it forward. Thanks for the reminder about this. Since the heapgettup/heapgettup_pagemode refactor I was unable to see the same performance gains as I was before. Also, since reading "The Art of Writing Efficient Programs" I'm led to believe that modern processor hardware prefetchers can detect and prefetch on both forward and backward access patterns. I also saw some discussion on twitter about this [1]. I'm not sure yet how this translates to non-uniform access patterns, e.g. tuples are varying cachelines apart and we do something like only deform attributes in the first cacheline. Will the prefetcher still see the pattern in this case? If it's non-uniform, then how does it know which cacheline to fetch? If the tuple spans multiple cacheline and we deform the whole tuple, will accessing the next cacheline in a forward direction make the hardware prefetcher forget about the more general backward access that's going on? These are questions I'll need to learn the answers to before I can understand what's the best thing to do in this area. The only way to tell is to design a benchmark and see how far we can go before the hardware prefetcher no longer detects the pattern. I've withdrawn the patch. I can resubmit once I've done some more experimentation if that experimentation yields positive results. David [1] https://twitter.com/ID_AA_Carmack/status/1470832912149135360
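For context, the explicit-prefetch approach the withdrawn patch was based on can be sketched roughly as follows (illustrative standalone C only, using the GCC/Clang __builtin_prefetch builtin; this is not the actual patch):

/* Toy backward scan: hint the cache about the next (lower-addressed) item. */
typedef struct DemoTuple
{
	char		data[64];		/* pretend one tuple spans about a cacheline */
} DemoTuple;

static long
scan_backward(DemoTuple *tuples, int ntuples)
{
	long		sum = 0;

	for (int i = ntuples - 1; i >= 0; i--)
	{
		if (i > 0)
			__builtin_prefetch(&tuples[i - 1], 0, 3);	/* read, high locality */

		sum += tuples[i].data[0];	/* stand-in for tuple deforming */
	}
	return sum;
}

A benchmark along these lines, varying the stride between items, would show where the hardware prefetcher stops detecting the pattern.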
Re: Make documentation builds reproducible
On 20.01.24 23:59, Tom Lane wrote: Peter Eisentraut writes: On 20.01.24 17:03, Tom Lane wrote: * I gather that the point here is to change some generated anchor tags. Would any of these tags be things people would be likely to have bookmarked? No, because the problem is that the anchor names are randomly generated in each build. D'oh. No objection then. Thanks, committed.
Re: XLog size reductions: smaller XLRec block header for PG17
Hi,

> I'm seeing that there has been no activity in this thread for nearly 4
> months, I'm planning to close this in the current commitfest unless
> someone is planning to take it forward.

I don't think that closing CF entries purely due to inactivity is a good practice (nor something we have done before) as long as there is code, it applies, etc. There are a lot of patches and few people working on them. Inactivity in a given thread doesn't necessarily indicate lack of interest; more likely it indicates lack of resources.

--
Best regards,
Aleksander Alekseev
Re: FEATURE REQUEST: Role vCPU limit/priority
Hi Yoni, > It would be useful to have the ability to define for a role default vCPU > affinity limits/thread priority settings so that more active sessions could > coexist similar to MySQL resource groups. To me this sounds like a valuable feature. Would you be interested in working on it? Typically it is a good idea to start with an RFC document and discuss it with the community. -- Best regards, Aleksander Alekseev
Re: Remove unused fields in ReorderBufferTupleBuf
Hi, > Hi, this patch was marked in CF as "Needs Review" [1], but there has > been no activity on this thread for 5+ months. > > Do you wish to keep this open, or can you post something to elicit > more interest in reviews for the latest patch set? Otherwise, if > nothing happens then the CF entry will be closed ("Returned with > feedback") at the end of this CF. I don't think that closing CF entries only due to inactivity is a good practice, nor something we typically do. When someone will have spare time this person will (hopefully) review the code. -- Best regards, Aleksander Alekseev
Re: BUG: Former primary node might stuck when started as a standby
Hi,

> But node1 knows that it's a standby now and it's expected to get all the
> WAL records from the primary, doesn't it?

Yes, but node1 doesn't know whether it always was a standby or not. What if node1 was always a standby, node2 was the primary, then node2 died and node3 is the new primary? If node1 sees an inconsistency in the WAL records, it should report it and stop doing anything, since it doesn't have all the information needed to resolve the inconsistencies in all the possible cases. Only the DBA has this information.

> > It's been a while since I seriously played with replication, but if
> > memory serves, a proper way to switch node1 to a replica mode would be
> > to use pg_rewind on it first.
>
> Perhaps that's true generally, but as we can see, without the extra
> records replayed, this scenario works just fine. Moreover, existing tests
> rely on it, e.g., 009_twophase.pl or 012_subtransactions.pl (in fact, my
> research of the issue was initiated per a test failure).

I suggest focusing on the particular flaky tests, then, and on how to fix them.

--
Best regards,
Aleksander Alekseev
Re: Removing const-false IS NULL quals and redundant IS NOT NULL quals
On Thu, 28 Dec 2023 at 00:38, Andy Fan wrote:
> I also want to add notnullattnums for the UniqueKey stuff as well, by
> comparing your implementation with mine, I found you didn't consider
> the NOT NULL generated by filter. After apply your patch:
>
> create table a(a int);
> explain (costs off) select * from a where a > 3 and a is null;
>             QUERY PLAN
> -----------------------------------
>  Seq Scan on a
>    Filter: ((a IS NULL) AND (a > 3))
> (2 rows)
>
> [1] https://www.postgresql.org/message-id/attachment/151254/v1-0001-uniquekey-on-base-relation-and-used-it-for-mark-d.patch

I believe these are two different things and we should not mix the two up.

Looking at your patch, I see you have:

+ /* The not null attrs from catalogs or baserestrictinfo. */
+ Bitmapset *notnullattrs;

Whereas, I have:

/* zero-based set containing attnums of NOT NULL columns */
Bitmapset *notnullattnums;

I'm a bit worried that your definition of notnullattrs could lead to confusion about which optimisations will be possible. Let's say, for example, I want to write some code that optimises the expression evaluation code to transform EEOP_FUNCEXPR_STRICT into EEOP_FUNCEXPR when all function arguments are Vars that have NOT NULL constraints and are not nullable by any outer join. With my definition, it should be safe to do this, but with your definition, we can't trust that we'll not see any NULLs: if the strict function is evaluated before the strict base qual that filters the NULLs, then the strict function could be called with NULL.

Perhaps we'd want another Bitmapset that has members for strict OpExprs that filter NULLs, and we could document that it's only safe to assume there are no NULLs beyond the scan level, but I'd say that's another patch and I don't want to feed you design ideas here and derail this patch.

David
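For illustration, the kind of safety check being described might look roughly like this (a rough hand-written sketch, not from either patch; it assumes the zero-based notnullattnums field quoted above, the usual planner headers, and that Var.varnullingrels tracks outer-join nullability):

/*
 * Sketch only: decide whether a strict FuncExpr over base-relation Vars can
 * never see a NULL input, based purely on catalog NOT NULL constraints
 * (rel->notnullattnums, zero-based as quoted above) and the Var not being
 * nullable by any outer join (empty varnullingrels).
 */
static bool
strict_func_args_never_null(FuncExpr *fexpr, RelOptInfo *rel)
{
	ListCell   *lc;

	foreach(lc, fexpr->args)
	{
		Node	   *arg = (Node *) lfirst(lc);
		Var		   *var;

		if (!IsA(arg, Var))
			return false;
		var = (Var *) arg;

		if (var->varlevelsup != 0 || !bms_is_empty(var->varnullingrels))
			return false;

		/* system columns and whole-row Vars are not covered by the set */
		if (var->varattno <= 0)
			return false;

		if (!bms_is_member(var->varattno - 1, rel->notnullattnums))
			return false;
	}
	return true;
}

With the "catalog or baserestrictinfo" definition, a true result from a check like this would not be enough, for the reason given above.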
Re: Remove unused fields in ReorderBufferTupleBuf
On Wed, Jul 26, 2023 at 7:22 PM Aleksander Alekseev wrote: > > > Here is the corrected patch. I added it to the nearest CF [1]. > > I played a bit more with the patch. There was an idea to make > ReorderBufferTupleBufData an opaque structure known only within > reorderbyffer.c but it turned out that replication/logical/decode.c > accesses it directly so I abandoned that idea for now. > > > Alternatively we could convert ReorderBufferTupleBufData macro to an > > inlined function. At least it will add some type safety. > > Here is v3 that implements it too as a separate patch. > But why didn't you pursue your idea of getting rid of the wrapper structure ReorderBufferTupleBuf which after this patch will have just one member? I think there could be hassles in backpatching bug-fixes in some cases but in the longer run it would make the code look clean. -- With Regards, Amit Kapila.
Re: Removing const-false IS NULL quals and redundant IS NOT NULL quals
On Mon, 22 Jan 2024 at 17:32, Peter Smith wrote: > Hi, This patch has a CF status of "Needs Review" [1], but it seems > there were CFbot test failures last time it was run [2]. I've attached v11 which updates the expected results in some newly added regression tests. No other changes. David From 095744f5583ab5446c1cdb75bfd3b40c7ab493d8 Mon Sep 17 00:00:00 2001 From: David Rowley Date: Thu, 7 Dec 2023 22:52:34 +1300 Subject: [PATCH v11] Reduce NullTest quals to constant TRUE or FALSE --- .../postgres_fdw/expected/postgres_fdw.out| 16 +- contrib/postgres_fdw/sql/postgres_fdw.sql | 4 +- src/backend/optimizer/plan/initsplan.c| 202 +-- src/backend/optimizer/util/joininfo.c | 28 ++ src/backend/optimizer/util/plancat.c | 19 ++ src/backend/optimizer/util/relnode.c | 3 + src/include/nodes/pathnodes.h | 7 +- src/include/optimizer/planmain.h | 4 + src/test/regress/expected/equivclass.out | 18 +- src/test/regress/expected/join.out| 67 +++-- src/test/regress/expected/predicate.out | 244 ++ src/test/regress/parallel_schedule| 2 +- src/test/regress/sql/predicate.sql| 122 + 13 files changed, 663 insertions(+), 73 deletions(-) create mode 100644 src/test/regress/expected/predicate.out create mode 100644 src/test/regress/sql/predicate.sql diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out index d83f6ae8cb..b5a38aeb21 100644 --- a/contrib/postgres_fdw/expected/postgres_fdw.out +++ b/contrib/postgres_fdw/expected/postgres_fdw.out @@ -656,20 +656,20 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" = 100)) AND ((c2 = 0)) (3 rows) -EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- NullTest - QUERY PLAN -- +EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c3 IS NULL;-- NullTest + QUERY PLAN +-- Foreign Scan on public.ft1 t1 Output: c1, c2, c3, c4, c5, c6, c7, c8 - Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL)) + Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c3 IS NULL)) (3 rows) -EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest - QUERY PLAN -- +EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c3 IS NOT NULL;-- NullTest +QUERY PLAN +-- Foreign Scan on public.ft1 t1 Output: c1, c2, c3, c4, c5, c6, c7, c8 - Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NOT NULL)) + Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE ((c3 IS NOT NULL)) (3 rows) EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql b/contrib/postgres_fdw/sql/postgres_fdw.sql index 90c8fa4b70..f410c3db4e 100644 --- a/contrib/postgres_fdw/sql/postgres_fdw.sql +++ b/contrib/postgres_fdw/sql/postgres_fdw.sql @@ -332,8 +332,8 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft_empty ORDER BY c1; -- === EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 1; -- Var, OpExpr(b), Const EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE t1.c1 = 100 AND t1.c2 = 0; -- BoolExpr -EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- NullTest -EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest +EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c3 IS NULL;-- NullTest +EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c3 IS 
NOT NULL;-- NullTest EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE round(abs(c1), 0) = 1; -- FuncExpr EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 = -c1; -- OpExpr(l) EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE (c1 IS NOT NULL) IS DISTINCT FROM (c1 IS NOT NULL); -- DistinctExpr diff --git a
Re: introduce dynamic shared memory registry
On Mon, Jan 22, 2024 at 3:43 AM Nathan Bossart wrote:
>
> Oops. I've attached an attempt at fixing this. I took the opportunity to
> clean up the surrounding code a bit.

The code looks cleaner and more readable with the patch. All the call sites take care of dsm_attach() returning a NULL value. So, the attached patch looks good to me.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
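For reference, the call-site pattern being checked is roughly the following (a sketch only, not taken from the attached patch; the handle variable is illustrative):

	dsm_segment *seg = dsm_attach(handle);

	if (seg == NULL)
		ereport(ERROR,
				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
				 errmsg("could not attach to dynamic shared memory segment")));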
Re: Synchronizing slots from primary to standby
On Mon, Jan 22, 2024 at 12:28 PM Amit Kapila wrote: > > On Fri, Jan 19, 2024 at 3:55 PM shveta malik wrote: > > > > On Fri, Jan 19, 2024 at 10:35 AM Masahiko Sawada > > wrote: > > > > > > > > > Thank you for updating the patch. I have some comments: > > > > > > --- > > > +latestWalEnd = GetWalRcvLatestWalEnd(); > > > +if (remote_slot->confirmed_lsn > latestWalEnd) > > > +{ > > > +elog(ERROR, "exiting from slot synchronization as the > > > received slot sync" > > > + " LSN %X/%X for slot \"%s\" is ahead of the > > > standby position %X/%X", > > > + LSN_FORMAT_ARGS(remote_slot->confirmed_lsn), > > > + remote_slot->name, > > > + LSN_FORMAT_ARGS(latestWalEnd)); > > > +} > > > > > > IIUC GetWalRcvLatestWalEnd () returns walrcv->latestWalEnd, which is > > > typically the primary server's flush position and doesn't mean the LSN > > > where the walreceiver received/flushed up to. > > > > yes. I think it makes more sense to use something which actually tells > > flushed-position. I gave it a try by replacing GetWalRcvLatestWalEnd() > > with GetWalRcvFlushRecPtr() but I see a problem here. Lets say I have > > enabled the slot-sync feature in a running standby, in that case we > > are all good (flushedUpto is the same as actual flush-position > > indicated by LogstreamResult.Flush). But if I restart standby, then I > > observed that the startup process sets flushedUpto to some value 'x' > > (see [1]) while when the wal-receiver starts, it sets > > 'LogstreamResult.Flush' to another value (see [2]) which is always > > greater than 'x'. And we do not update flushedUpto with the > > 'LogstreamResult.Flush' value in walreceiver until we actually do an > > operation on primary. Performing a data change on primary sends WALs > > to standby which then hits XLogWalRcvFlush() and updates flushedUpto > > same as LogstreamResult.Flush. Until then we have a situation where > > slots received on standby are ahead of flushedUpto and thus slotsync > > worker keeps one erroring out. I am yet to find out why flushedUpto is > > set to a lower value than 'LogstreamResult.Flush' at the start of > > standby. Or maybe am I using the wrong function > > GetWalRcvFlushRecPtr() and should be using something else instead? > > > > Can we think of using GetStandbyFlushRecPtr()? We probably need to > expose this function, if this works for the required purpose. I think we can. For the records, the problem while using flushedUpto (or GetWalRcvFlushRecPtr()) directly is that it is not set to the latest flushed position immediately after startup. It points to some prior location (perhaps segment or page start) after startup until some data is flushed next which then updates it to the latest flushed position, thus we can not use it directly. GetStandbyFlushRecPtr() OTOH takes care of it i.e. it returns correct flushed-location at any point of time. I have changed v65 to use this one. thanks Shveta
Re: Synchronizing slots from primary to standby
On Fri, Jan 19, 2024 at 11:48 AM Peter Smith wrote: > > Here are some review comments for patch v63-0003. Thanks Peter. I have addressed all in v65. > > 4b. > It was a bit different when there were ERRORs but now they are LOGs > somehow it seems wrong for this function to say what the *caller* will > do. Maybe you can rewrite all the errmsg so the don't say "skipping" > but they just say "bad configuration for slot synchronization" > > If valid is false then you can LOG "skipping" at the caller... I have made this change but now in the log file we see 3 logs like below, does it seem apt? Was the earlier one better where we get the info in 2 lines? [34416] LOG: bad configuration for slot synchronization [34416] HINT: hot_standby_feedback must be enabled. [34416] LOG: skipping slot synchronization thanks Shveta
Re: Synchronizing slots from primary to standby
On Sat, Jan 20, 2024 at 7:44 PM Amit Kapila wrote: > > On Sat, Jan 20, 2024 at 10:52 AM Dilip Kumar wrote: > > > > On Fri, Jan 19, 2024 at 5:24 PM Amit Kapila wrote: > > > > > > On Wed, Jan 17, 2024 at 4:00 PM shveta malik > > > wrote: > > > > > > > > > > I had some off-list discussions with Sawada-San, Hou-San, and Shveta > > > on the topic of extending replication commands instead of using the > > > current model where we fetch the required slot information via SQL > > > using a database connection. I would like to summarize the discussion > > > and would like to know the thoughts of others on this topic. > > > > > > In the current patch, we launch the slotsync worker on physical > > > standby which connects to the specified database (currently we let > > > users specify the required dbname in primary_conninfo) on the primary. > > > It then fetches the required information for failover marked slots > > > from the primary and also does some primitive checks on the upstream > > > node via SQL (the additional checks are like whether the upstream node > > > has a specified physical slot or whether the upstream node is a > > > primary node or a standby node). To fetch the required information it > > > uses a libpqwalreciever API which is mostly apt for this purpose as it > > > supports SQL execution but for this patch, we don't need a replication > > > connection, so we extend the libpqwalreciever connect API. > > > > What sort of extension we have done to 'libpqwalreciever'? Is it > > something like by default this supports replication connections so we > > have done an extension to the API so that we can provide an option > > whether to create a replication connection or a normal connection? > > > > Yeah and in the future there could be more as well. The other function > added walrcv_get_dbname_from_conninfo doesn't appear to be a problem > either for now. > > > > Now, the concerns related to this could be that users would probably > > > need to change existing mechanisms/tools to update priamry_conninfo I'm concerned about this. In fact, a primary_conninfo value generated by pg_basebackup does not work with enable_syncslot. > > > and one of the alternatives proposed is to have an additional GUC like > > > slot_sync_dbname. Users won't be able to drop the database this worker > > > is connected to aka whatever is specified in slot_sync_dbname but as > > > the user herself sets up the configuration it shouldn't be a big deal. > > > > Yeah for this purpose users may use template1 or so which they > > generally don't plan to drop. > > > > Using template1 has other problems like users won't be able to create > a new database. See [2] (point number 2.2) > > > > > So in case the user wants to drop that > > database user needs to turn off the slot syncing option and then it > > can be done? > > > > Right. If the user wants to continue using slot syncing, they need to switch the database to connect. Which requires modifying primary_conninfo and reloading the configuration file. Which further leads to restarting the physical replication. If they use synchronous replication, it means the application temporarily stops during that. > > > > Then we also discussed whether extending libpqwalreceiver's connect > > > API is a good idea and whether we need to further extend it in the > > > future. As far as I can see, slotsync worker's primary requirement is > > > to execute SQL queries which the current API is sufficient, and don't > > > see something that needs any drastic change in this API. 
Note that > > > tablesync worker that executes SQL also uses these APIs, so we may > > > need something in the future for either of those. Then finally we need > > > a slotsync worker to also connect to a database to use SQL and fetch > > > results. > > > > While looking into the patch v64-0002 I could not exactly point out > > what sort of extensions are there in libpqwalreceiver.c, I just saw > > one extra API for fetching the dbname from connection info? > > > > Right, the worry was that we may need it in the future. Yes. IIUC the slotsync worker uses libpqwalreceiver to establish a non-replication connection and to execute SQL query. But neither of them are relevant with replication. I'm a bit concerned that when we need to extend the slotsync feature in the future we will end up extending libpqwalreceiver, even if the new feature is not also relevant with replication. > > > > Now, let us consider if we extend the replication commands like > > > READ_REPLICATION_SLOT and or introduce a new set of replication > > > commands to fetch the required information then we don't need a DB > > > connection with primary or a connection in slotsync worker. As per my > > > current understanding, it is quite doable but I think we will slowly > > > go in the direction of making replication commands something like SQL > > > because today we need to extend it to fetch all slots info that have > > > failover marked as true, the existenc
Re: partitioning and identity column
On 17.01.24 06:36, Ashutosh Bapat wrote: On Wed, Jan 17, 2024 at 12:30 AM Peter Eisentraut wrote: On 09.01.24 15:10, Ashutosh Bapat wrote: Here's complete patch-set. Looks good! Committed. Thanks a lot Peter. I found another piece of code that might need updating, or at least the comment. In MergeAttributes(), in the part that merges the specified column definitions into the inherited ones, it says /* * Identity is never inherited. The new column can have an * identity definition, so we always just take that one. */ def->identity = newdef->identity; This is still correct for regular inheritance, but not for partitioning. I think for partitioning, this is not reachable because you can't specify identity information when you create a partition(?). So maybe something like if (newdef->identity) { Assert(!is_partioning); /* * Identity is never inherited. The new column can have an * identity definition, so we always just take that one. */ def->identity = newdef->identity; } Thoughts?
Re: make dist using git archive
Hi, On Mon, Jan 22, 2024 at 3:32 PM Peter Eisentraut wrote: > > One of the goals is to make the creation of the distribution tarball > more directly traceable to the git repository. That is why we removed > the "make distprep" step. > > Here I want to take another step in that direction, by changing "make > dist" to directly use "git archive", rather than the custom shell script > it currently runs. > > The simple summary is that it would run > > git archive --format tar.gz --prefix postgresql-17devel/ HEAD -o > postgresql-17devel.tar.gz > > (with appropriate version numbers, of course), and that's the tarball we > would ship. > > There are analogous commands for other compression formats. > > The actual command gets subtly more complicated if you need to run this > in a separate build directory. In my attached patch, the make version > doesn't support vpath at the moment, just so that it's easier to > understand for now. The meson version is a bit hairy. > > I have studied and tested this quite a bit, and I have found that the > archives produced this way are deterministic and reproducible, meaning > for a given commit the result file should always be bit-for-bit identical. > > The exception is that if you use a git version older than 2.38.0, gzip > records the platform in the archive, so you'd get a different output on > Windows vs. macOS vs. "UNIX" (everything else). In git 2.38.0, this was > changed so that everything is recorded as "UNIX" now. This is just > something to keep in mind. This issue is specific to the gzip format, > it does not affect other compression formats. > > Meson has its own distribution building command (meson dist), but opted > against using that. The main problem is that the way they have > implemented it, it is not deterministic in the above sense. (Another > point is of course that we probably want a "make" version for the time > being.) > > But the target name "dist" in meson is reserved for that reason, so I > needed to call the custom target "pgdist". > > I did take one idea from meson: It runs a check before git archive that > the checkout is clean. That way, you avoid mistakes because of > uncommitted changes. This works well in my "make" implementation. In > the meson implementation, I had to find a workaround, because a > custom_target cannot have a dependency on a run_target. As also > mentioned above, the whole meson implementation is a bit ugly. > > Anyway, with the attached patch you can do > > make dist > > or > > meson compile -C build pgdist I played this with meson build on macOS, the packages are generated in source root but not build root, I'm sure if this is by design but I think polluting *working directory* is not good. Another thing I'd like to point out is, should we also introduce *git commit* or maybe *git tag* to package name, something like: git archive --format tar.gz --prefix postgresql-17devel/ HEAD -o postgresql-17devel-`git rev-parse --short HEAD`.tar.gz git archive --format tar.gz --prefix postgresql-17devel/ HEAD -o postgresql-`git describe --tags`.tar.gz > > and it produces the same set of tarballs as before, except it's done > differently. > > The actual build scripts need some fine-tuning, but the general idea is > correct, I think. I think this is a good idea, thanks for working on this. -- Regards Junwang Zhao
Re: Remove unused fields in ReorderBufferTupleBuf
On Mon, Jan 22, 2024 at 4:17 PM Aleksander Alekseev wrote: > > Hi, > > > Hi, this patch was marked in CF as "Needs Review" [1], but there has > > been no activity on this thread for 5+ months. > > > > Do you wish to keep this open, or can you post something to elicit > > more interest in reviews for the latest patch set? Otherwise, if > > nothing happens then the CF entry will be closed ("Returned with > > feedback") at the end of this CF. > > I don't think that closing CF entries only due to inactivity is a good > practice, nor something we typically do. When someone will have spare > time this person will (hopefully) review the code. Agree. IMHO, a patch not picked up by anyone doesn't mean it's the author's problem to mark it "Returned with feedback". And, the timing to mark things this way is not quite right as we are getting close to the PG17 release. The patch may be fixing a problem (like the one that's closed due to inactivity https://commitfest.postgresql.org/46/3503/) or may be a feature or an improvement that no reviewer/committer has had a chance to look at. I think a new label such as "Returned due to lack of interest" or "Returned due to disinterest" (or some better wording) helps reviewers/committers to pick things up. This new label can be attributed to the class of patches for which initially there's some positive feedback on the idea, the patch is being kept latest, but finds no one to get it reviewed for say X number of months. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: partitioning and identity column
On Mon, Jan 22, 2024 at 5:32 PM Peter Eisentraut wrote:
>
> I found another piece of code that might need updating, or at least the
> comment.
>
> In MergeAttributes(), in the part that merges the specified column
> definitions into the inherited ones, it says
>
>     /*
>      * Identity is never inherited. The new column can have an
>      * identity definition, so we always just take that one.
>      */
>     def->identity = newdef->identity;
>
> This is still correct for regular inheritance, but not for partitioning.
> I think for partitioning, this is not reachable because you can't
> specify identity information when you create a partition(?). So maybe
> something like

You may specify the information when creating a partition, but it will cause an error. We have tests in identity.sql for the same (look for pitest1_pfail).

>
> if (newdef->identity)
> {
>     Assert(!is_partioning);
>     /*
>      * Identity is never inherited. The new column can have an
>      * identity definition, so we always just take that one.
>      */
>     def->identity = newdef->identity;
> }
>
> Thoughts?

That code block already has Assert(!is_partition) at line 3085. I thought that Assert is enough.

There's another thing I found: the file isn't using check_stack_depth() in the functions that traverse inheritance hierarchies. This isn't just a problem with the identity-related function but with most of the functions in that file. Do you think it's worth fixing?

--
Best Wishes,
Ashutosh Bapat
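For reference, the check_stack_depth() guard being suggested typically goes at the top of each recursive step, e.g. (a made-up walker, not actual tablecmds.c code, shown only to illustrate where the call sits):

static void
walk_inheritance_tree(Oid relid)
{
	List	   *children;
	ListCell   *lc;

	/*
	 * Raise a clean "stack depth limit exceeded" error instead of crashing
	 * on pathologically deep inheritance hierarchies.
	 */
	check_stack_depth();

	children = find_inheritance_children(relid, NoLock);
	foreach(lc, children)
		walk_inheritance_tree(lfirst_oid(lc));
}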
Re: Synchronizing slots from primary to standby
On Mon, Jan 22, 2024 at 5:28 PM Masahiko Sawada wrote: > > On Sat, Jan 20, 2024 at 7:44 PM Amit Kapila wrote: > > > > On Sat, Jan 20, 2024 at 10:52 AM Dilip Kumar wrote: > > > > > > On Fri, Jan 19, 2024 at 5:24 PM Amit Kapila > > > wrote: > > > > > > > > On Wed, Jan 17, 2024 at 4:00 PM shveta malik > > > > wrote: > > > > > > > > > > > > > I had some off-list discussions with Sawada-San, Hou-San, and Shveta > > > > on the topic of extending replication commands instead of using the > > > > current model where we fetch the required slot information via SQL > > > > using a database connection. I would like to summarize the discussion > > > > and would like to know the thoughts of others on this topic. > > > > > > > > In the current patch, we launch the slotsync worker on physical > > > > standby which connects to the specified database (currently we let > > > > users specify the required dbname in primary_conninfo) on the primary. > > > > It then fetches the required information for failover marked slots > > > > from the primary and also does some primitive checks on the upstream > > > > node via SQL (the additional checks are like whether the upstream node > > > > has a specified physical slot or whether the upstream node is a > > > > primary node or a standby node). To fetch the required information it > > > > uses a libpqwalreciever API which is mostly apt for this purpose as it > > > > supports SQL execution but for this patch, we don't need a replication > > > > connection, so we extend the libpqwalreciever connect API. > > > > > > What sort of extension we have done to 'libpqwalreciever'? Is it > > > something like by default this supports replication connections so we > > > have done an extension to the API so that we can provide an option > > > whether to create a replication connection or a normal connection? > > > > > > > Yeah and in the future there could be more as well. The other function > > added walrcv_get_dbname_from_conninfo doesn't appear to be a problem > > either for now. > > > > > > Now, the concerns related to this could be that users would probably > > > > need to change existing mechanisms/tools to update priamry_conninfo > > I'm concerned about this. In fact, a primary_conninfo value generated > by pg_basebackup does not work with enable_syncslot. > Right, but if we want can't we extend pg_basebackup to do that? It is just that I am not sure that it is a good idea to extend pg_basebackup in the first version. > > > > and one of the alternatives proposed is to have an additional GUC like > > > > slot_sync_dbname. Users won't be able to drop the database this worker > > > > is connected to aka whatever is specified in slot_sync_dbname but as > > > > the user herself sets up the configuration it shouldn't be a big deal. > > > > > > Yeah for this purpose users may use template1 or so which they > > > generally don't plan to drop. > > > > > > > Using template1 has other problems like users won't be able to create > > a new database. See [2] (point number 2.2) > > > > > > > > So in case the user wants to drop that > > > database user needs to turn off the slot syncing option and then it > > > can be done? > > > > > > > Right. > > If the user wants to continue using slot syncing, they need to switch > the database to connect. Which requires modifying primary_conninfo and > reloading the configuration file. Which further leads to restarting > the physical replication. If they use synchronous replication, it > means the application temporarily stops during that. 
> Yes, that would be an inconvenience but the point is we don't expect this to change often. > > > > > > Then we also discussed whether extending libpqwalreceiver's connect > > > > API is a good idea and whether we need to further extend it in the > > > > future. As far as I can see, slotsync worker's primary requirement is > > > > to execute SQL queries which the current API is sufficient, and don't > > > > see something that needs any drastic change in this API. Note that > > > > tablesync worker that executes SQL also uses these APIs, so we may > > > > need something in the future for either of those. Then finally we need > > > > a slotsync worker to also connect to a database to use SQL and fetch > > > > results. > > > > > > While looking into the patch v64-0002 I could not exactly point out > > > what sort of extensions are there in libpqwalreceiver.c, I just saw > > > one extra API for fetching the dbname from connection info? > > > > > > > Right, the worry was that we may need it in the future. > > Yes. IIUC the slotsync worker uses libpqwalreceiver to establish a > non-replication connection and to execute SQL query. But neither of > them are relevant with replication. > But we are already using libpqwalreceiver to execute SQL queries via tablesync worker. I'm a bit concerned that when we > need to extend the slotsync feature in the future we will end up > extending libpqwalreceiver, even if the new feature is n
Does redundant extension exist During faster COPY in PG16
Hi, I'm learning about the faster COPY in PG16 and have some questions about the relation extension lock improvement. From ./src/backend/storage/buffer/bufmgr.c:1901 (ExtendBufferedRelShared):

```
	/*
	 * Lock relation against concurrent extensions, unless requested not to.
	 *
	 * We use the same extension lock for all forks. That's unnecessarily
	 * restrictive, but currently extensions for forks don't happen often
	 * enough to make it worth locking more granularly.
	 *
	 * Note that another backend might have extended the relation by the time
	 * we get the lock.
	 */
	if (!(flags & EB_SKIP_EXTENSION_LOCK))
	{
		LockRelationForExtension(bmr.rel, ExclusiveLock);
		if (bmr.rel)
			bmr.smgr = RelationGetSmgr(bmr.rel);
	}
	...
	smgrzeroextend(bmr.smgr, fork, first_block, extend_by, false);
```

During concurrent extension, once we obtain the extension lock, we use smgrzeroextend() to extend the relation file instead of searching the FSM through GetPageWithFreeSpace(). Is this approach reasonable? During concurrent extensions, if one backend bulk-extends successfully, the other backends waiting on the extension lock already have free pages to use. If all concurrently extending backends extend the relation file after getting the extension lock, the extension lock will be held for (number of extending backends * smgrzeroextend() execution time).

Any feedback is welcome.

--
Best Regards
Kewen He
from 阿里邮箱 macOS
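For comparison, the alternative the question implies, i.e. re-checking free space after acquiring the lock instead of always extending, might look roughly like this (a sketch only, not actual bufmgr.c/hio.c code; some_extend_routine is a stand-in for the real extension path):

static BlockNumber
get_block_for_insert(Relation rel, Size spaceNeeded)
{
	BlockNumber targetBlock;

	LockRelationForExtension(rel, ExclusiveLock);

	/* Re-check the FSM: a concurrent bulk extension may have added pages. */
	targetBlock = GetPageWithFreeSpace(rel, spaceNeeded);
	if (targetBlock == InvalidBlockNumber)
	{
		/* Nothing usable recorded; extend the relation while holding the lock. */
		targetBlock = some_extend_routine(rel);
	}

	UnlockRelationForExtension(rel, ExclusiveLock);
	return targetBlock;
}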
Re: Improving EXPLAIN's display of SubPlan nodes
Hi, > EXPLAIN has always been really poor at displaying SubPlan nodes > in expressions: you don't get much more than "(SubPlan N)". > This is mostly because every time I thought about it, I despaired > of trying to represent all the information in a SubPlan accurately. > However, a recent discussion[1] made me realize that we could do > a lot better just by displaying the SubLinkType and the testexpr > (if relevant). So here's a proposed patch. You can see what > it does by examining the regression test changes. > > There's plenty of room to bikeshed about exactly how to display > this stuff, and I'm open to suggestions. > > BTW, I was somewhat depressed to discover that we have exactly > zero regression coverage of the ROWCOMPARE_SUBLINK code paths; > not only was EXPLAIN output not covered, but the planner and > executor too. So I added some simple tests for that. Otherwise > I think existing coverage is enough for this. I reviewed the code and tested the patch on MacOS. It looks good to me. Although something like: ``` + Filter: (ANY (base_tbl.a = $1) FROM SubPlan 1 (returns $1)) + SubPlan 1 (returns $1) ``` ... arguably doesn't give much more information to the user comparing to what we have now: ``` - Filter: (SubPlan 1) - SubPlan 1 ``` ... I believe this is the right step toward more detailed EXPLAINs, and perhaps could be useful for debugging and/or educational purposes. Also the patch improves code coverage. -- Best regards, Aleksander Alekseev
Re: tablecmds.c/MergeAttributes() cleanup
On 06.12.23 09:23, Peter Eisentraut wrote: The (now) second patch is also still of interest to me, but I have since noticed that I think [0] should be fixed first, but to make that fix simpler, I would like the first patch from here. [0]: https://www.postgresql.org/message-id/flat/24656cec-d6ef-4d15-8b5b-e8dfc9c833a7%40eisentraut.org The remaining patch in this series needed a rebase and adjustment. The above precondition still applies. From c4c4462fa58c73fc383c4935c8d2753b7a0b4c9d Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Mon, 22 Jan 2024 13:23:43 +0100 Subject: [PATCH v5] MergeAttributes: convert pg_attribute back to ColumnDef before comparing MergeAttributes() has a loop to merge multiple inheritance parents into a column column definition. The merged-so-far definition is stored in a ColumnDef node. If we have to merge columns from multiple inheritance parents (if the name matches), then we have to check whether various column properties (type, collation, etc.) match. The current code extracts the pg_attribute value of the currently-considered inheritance parent and compares it against the merged-so-far ColumnDef value. If the currently considered column doesn't match any previously inherited column, we make a new ColumnDef node from the pg_attribute information and add it to the column list. This patch rearranges this so that we create the ColumnDef node first in either case. Then the code that checks the column properties for compatibility compares ColumnDef against ColumnDef (instead of ColumnDef against pg_attribute). This makes the code more symmetric and easier to follow. Also, later in MergeAttributes(), there is a similar but separate loop that merges the new local column definition with the combination of the inheritance parents established in the first loop. That comparison is already ColumnDef-vs-ColumnDef. With this change, both of these can use similar-looking logic. (A future project might be to extract these two sets of code into a common routine that encodes all the knowledge of whether two column definitions are compatible. But this isn't currently straightforward because we want to give different error messages in the two cases.) Furthermore, by avoiding the use of Form_pg_attribute here, we make it easier to make changes in the pg_attribute layout without having to worry about the local needs of tablecmds.c. Because MergeAttributes() is hugely long, it's sometimes hard to know where in the function you are currently looking. To help with that, I also renamed some variables to make it clearer where you are currently looking. The first look is "prev" vs. "new", the second loop is "inh" vs. "new". Discussion: https://www.postgresql.org/message-id/flat/52a125e4-ff9a-95f5-9f61-b87cf447e...@eisentraut.org --- src/backend/commands/tablecmds.c | 198 --- 1 file changed, 102 insertions(+), 96 deletions(-) diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 2a56a4357c9..04e674bbdae 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -2704,7 +2704,8 @@ MergeAttributes(List *columns, const List *supers, char relpersistence, parent_attno - 1); char *attributeName = NameStr(attribute->attname); int exist_attno; - ColumnDef *def; + ColumnDef *newdef; + ColumnDef *savedef; /* * Ignore dropped columns in the parent. 
@@ -2713,14 +2714,38 @@ MergeAttributes(List *columns, const List *supers, char relpersistence, continue; /* leave newattmap->attnums entry as zero */ /* -* Does it conflict with some previously inherited column? +* Create new column definition +*/ + newdef = makeColumnDef(attributeName, attribute->atttypid, + attribute->atttypmod, attribute->attcollation); + newdef->storage = attribute->attstorage; + newdef->generated = attribute->attgenerated; + if (CompressionMethodIsValid(attribute->attcompression)) + newdef->compression = + pstrdup(GetCompressionMethodName(attribute->attcompression)); + + /* +* Regular inheritance children are independent enough not to +* inherit identity columns. But partitions are integral part +* of a partitioned table and inherit identity column. +
Re: Add \syncpipeline command to pgbench
On 2024-Jan-22, Anthonin Bonnefoy wrote: > That looks like a bug with how opened pipelines are not caught at the > end of the script processing. startpipeline seems to have similar > related issues. Ah, yeah. Your fix looks necessary on a quick look. I'll review and see about backpatching this. > 0002 adds the \syncpipeline command (original patch with an additional > test case). I can look into this one later, unless Michael wants to. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "No me acuerdo, pero no es cierto. No es cierto, y si fuera cierto, no me acuerdo." (Augusto Pinochet a una corte de justicia)
Re: [17] CREATE SUBSCRIPTION ... SERVER
Hi Jeff, On Tue, Jan 16, 2024 at 7:25 AM Jeff Davis wrote: > > On Fri, 2024-01-12 at 17:17 -0800, Jeff Davis wrote: > > I think 0004 needs a bit more work, so I'm leaving it off for now, > > but > > I'll bring it back in the next patch set. > > Here's the next patch set. 0001 - 0003 are mostly the same with some > improved error messages and some code fixes. I am looking to start > committing 0001 - 0003 soon, as they have received some feedback > already and 0004 isn't required for the earlier patches to be useful. > I am reviewing the patches. Here are some random comments. 0002 adds a prefix "regress_" to almost every object that is created in foreign_data.sql. The commit message doesn't say why it's doing so. But more importantly, the new tests added are lost in all the other changes. It will be good to have prefix adding changes into its own patch explaining the reason. The new tests may stay in 0002. Interestingly the foreign server created in the new tests doesn't have "regress_" prefix. Why? Dummy FDW makes me nervous. The way it's written, it may grow into a full-fledged postgres_fdw and in the process might acquire the same concerns that postgres_fdw has today. But I will study the patches and discussion around it more carefully. I enhanced the postgres_fdw TAP test to use foreign table. Please see the attached patch. It works as expected. Of course a follow-on work will require linking the local table and its replica on the publisher table so that push down will work on replicated tables. But the concept at least works with your changes. Thanks for that. I am not sure we need a full-fledged TAP test for testing subscription. I wouldn't object to it, but TAP tests are heavy. It should be possible to write the same test as a SQL test by creating two databases and switching between them. Do you think it's worth trying that way? > 0004 could use more discussion. The purpose is to split the privileges > of pg_create_subscription into two: pg_create_subscription, and > pg_create_connection. By separating the privileges, it's possible to > allow someone to create/manage subscriptions to a predefined set of > foreign servers (on which they have USAGE privileges) without allowing > them to write an arbitrary connection string. Haven't studied this patch yet. Will continue reviewing the patches. -- Best Wishes, Ashutosh Bapat diff --git a/contrib/postgres_fdw/t/010_subscription.pl b/contrib/postgres_fdw/t/010_subscription.pl index d1d80d0679..3ae2b6da4a 100644 --- a/contrib/postgres_fdw/t/010_subscription.pl +++ b/contrib/postgres_fdw/t/010_subscription.pl @@ -20,7 +20,7 @@ $node_subscriber->start; # Create some preexisting content on publisher $node_publisher->safe_psql('postgres', - "CREATE TABLE tab_ins AS SELECT generate_series(1,1002) AS a"); + "CREATE TABLE tab_ins AS SELECT a, a + 1 as b FROM generate_series(1,1002) AS a"); # Replicate the changes without columns $node_publisher->safe_psql('postgres', "CREATE TABLE tab_no_col()"); @@ -29,7 +29,7 @@ $node_publisher->safe_psql('postgres', # Setup structure on subscriber $node_subscriber->safe_psql('postgres', "CREATE EXTENSION postgres_fdw"); -$node_subscriber->safe_psql('postgres', "CREATE TABLE tab_ins (a int)"); +$node_subscriber->safe_psql('postgres', "CREATE TABLE tab_ins (a int, b int)"); # Setup logical replication my $publisher_connstr = $node_publisher->connstr . 
' dbname=postgres'; @@ -45,6 +45,9 @@ $node_subscriber->safe_psql('postgres', "CREATE USER MAPPING FOR PUBLIC SERVER tap_server" ); +$node_subscriber->safe_psql('postgres', + "CREATE FOREIGN TABLE f_tab_ins (a int, b int) SERVER tap_server OPTIONS(table_name 'tab_ins')" +); $node_subscriber->safe_psql('postgres', "CREATE SUBSCRIPTION tap_sub SERVER tap_server PUBLICATION tap_pub WITH (password_required=false)" ); @@ -53,16 +56,16 @@ $node_subscriber->safe_psql('postgres', $node_subscriber->wait_for_subscription_sync($node_publisher, 'tap_sub'); my $result = - $node_subscriber->safe_psql('postgres', "SELECT count(*) FROM tab_ins"); + $node_subscriber->safe_psql('postgres', "SELECT count(*) FROM (SELECT f.b = l.b as match FROM tab_ins l, f_tab_ins f WHERE l.a = f.a) WHERE match"); is($result, qq(1002), 'check initial data was copied to subscriber'); $node_publisher->safe_psql('postgres', - "INSERT INTO tab_ins SELECT generate_series(1,50)"); + "INSERT INTO tab_ins SELECT a, a + 1 FROM generate_series(1003,1050) a"); $node_publisher->wait_for_catchup('tap_sub'); $result = - $node_subscriber->safe_psql('postgres', "SELECT count(*) FROM tab_ins"); -is($result, qq(1052), 'check initial data was copied to subscriber'); + $node_subscriber->safe_psql('postgres', "SELECT count(*) FROM (SELECT f.b = l.b as match FROM tab_ins l, f_tab_ins f WHERE l.a = f.a) WHERE match"); +is($result, qq(1050), 'check initial data was copied to subscriber'); done_testing();
Re: Move walreceiver state assignment (to WALRCV_STREAMING) in WalReceiverMain()
Hi, On Mon, Jan 22, 2024 at 04:14:46PM +1100, Peter Smith wrote: > 2024-01 Commitfest. > > Hi, This patch has a CF status of "Needs Review" [1], but it seems > there were CFbot test failures last time it was run [2]. Please have a > look and post an updated version if necessary. Thanks for the warning! I don't think the current code is causing any issues so given the feedback I've had so far I think I'll withdraw the patch. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Re: [DOC] Add detail regarding resource consumption wrt max_connections
On Fri, 2024-01-19 at 17:37 -0500, reid.thomp...@crunchydata.com wrote: > On Sat, 2024-01-13 at 10:31 -0700, Roberto Mello wrote: > > > > I can add a suggestion for the user to consider increasing > > shared_buffers in accordance with higher max_connections, but it > > would be better if there was a "rule of thumb" guideline to go > > along. I'm open to suggestions. > > > > I can revise with a similar warning in max_prepared_xacts as well. > > > > Sincerely, > > > > Roberto > > Can a "close enough" rule of thumb be calculated from: > postgresql.conf -> log_min_messages = debug3 > > start postgresql with varying max_connections to get > CreateSharedMemoryAndSemaphores() sizes to generate a rough equation > or maybe it would be sufficient to advise to set log_min_messages = debug3 on a test DB and start/stop it with varying values of max_connections and look at the differing values in DEBUG: invoking IpcMemoryCreate(size=...) log messages for themselves.
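For what it's worth, on v13 and later the per-component numbers can also be pulled from SQL on a running test instance, without debug logging; a rough sketch (repeat at each max_connections value being compared):
```
-- total shared memory actually allocated with the current settings
SELECT pg_size_pretty(sum(allocated_size)) FROM pg_shmem_allocations;

-- the biggest consumers, to see which allocations grow with max_connections
SELECT name, pg_size_pretty(allocated_size) AS size
  FROM pg_shmem_allocations
 ORDER BY allocated_size DESC
 LIMIT 10;
```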
Re: Remove unused fields in ReorderBufferTupleBuf
Hi, > > I played a bit more with the patch. There was an idea to make > > ReorderBufferTupleBufData an opaque structure known only within > > reorderbuffer.c but it turned out that replication/logical/decode.c > > accesses it directly so I abandoned that idea for now. > > > > > Alternatively we could convert ReorderBufferTupleBufData macro to an > > > inlined function. At least it will add some type safety. > > > > Here is v3 that implements it too as a separate patch. > > > > But why didn't you pursue your idea of getting rid of the wrapper > structure ReorderBufferTupleBuf which after this patch will have just > one member? I think there could be hassles in backpatching bug-fixes > in some cases but in the longer run it would make the code look clean. Indeed. In fact it turned out that I suggested the same above but apparently forgot: > On top of that IMO it doesn't make much sense to keep a one-field > wrapper structure. Perhaps we should get rid of it entirely and just > use HeapTupleData instead. After actually trying the refactoring I agree that the code becomes cleaner and it's going to be beneficial in the long run. Here is the patch. -- Best regards, Aleksander Alekseev v4-0001-Remove-ReorderBufferTupleBuf-structure.patch Description: Binary data
Re: Improve WALRead() to suck data directly from WAL buffers when possible
Hi Bharath, Thanks for working on this. It seems like a nice improvement to have. Here are some comments on 0001 patch. 1- xlog.c + /* + * Fast paths for the following reasons: 1) WAL buffers aren't in use when + * server is in recovery. 2) WAL is inserted into WAL buffers on current + * server's insertion TLI. 3) Invalid starting WAL location. + */ Shouldn't the comment be something like "2) WAL is *not* inserted into WAL buffers on current server's insertion TLI" since the condition to break is tli != GetWALInsertionTimeLine() 2- + /* + * WAL being read doesn't yet exist i.e. past the current insert position. + */ + if ((startptr + count) > reservedUpto) + return ntotal; This question may not even make sense but I wonder whether we can read from startptr only to reservedUpto in case of startptr+count exceeds reservedUpto? 3- + /* Wait for any in-progress WAL insertions to WAL buffers to finish. */ + if ((startptr + count) > LogwrtResult.Write && + (startptr + count) <= reservedUpto) + WaitXLogInsertionsToFinish(startptr + count); Do we need to check if (startptr + count) <= reservedUpto as we already verified this condition a few lines above? 4- + Assert(nread > 0); + memcpy(dst, data, nread); + + /* + * Make sure we don't read xlblocks down below before the page + * contents up above. + */ + pg_read_barrier(); + + /* Recheck if the read page still exists in WAL buffers. */ + endptr = pg_atomic_read_u64(&XLogCtl->xlblocks[idx]); + + /* Return if the page got initalized while we were reading it. */ + if (expectedEndPtr != endptr) + break; + + /* + * Typically, we must not read a WAL buffer page that just got + * initialized. Because we waited enough for the in-progress WAL + * insertions to finish above. However, there can exist a slight + * window after the above wait finishes in which the read buffer page + * can get replaced especially under high WAL generation rates. After + * all, we are reading from WAL buffers without any locks here. So, + * let's not count such a page in. + */ + phdr = (XLogPageHeader) page; + if (!(phdr->xlp_magic == XLOG_PAGE_MAGIC && + phdr->xlp_pageaddr == (ptr - (ptr % XLOG_BLCKSZ)) && + phdr->xlp_tli == tli)) + break; I see that you recheck if the page still exists and so at the end. What would you think about memcpy'ing only after being sure that we will need and use the recently read data? If we break the loop during the recheck, we simply discard the data read in the latest attempt. I guess that this may not be a big deal but the data would be unnecessarily copied into the destination in such a case. 5- xlogreader.c + nread = XLogReadFromBuffers(startptr, tli, count, buf); + + if (nread > 0) + { + /* + * Check if its a full read, short read or no read from WAL buffers. + * For short read or no read, continue to read the remaining bytes + * from WAL file. + * + * XXX: It might be worth to expose WAL buffer read stats. + */ + if (nread == count) /* full read */ + return true; + else if (nread < count) /* short read */ + { + buf += nread; + startptr += nread; + count -= nread; + } Typo in the comment. Should be like "Check if *it's* a full read, short read or no read from WAL buffers." Also I don't think XLogReadFromBuffers() returns anything less than 0 and more than count. Is verifying nread > 0 necessary? I think if nread does not equal to count, we can simply assume that it's a short read. 
(or no read at all in case nread is 0 which we don't need to handle specifically) 6- + /* + * We determined how much of the page can be validly read as 'count', read + * that much only, not the entire page. Since WALRead() can read the page + * from WAL buffers, in which case, the page is not guaranteed to be + * zero-padded up to the page boundary because of the concurrent + * insertions. + */ I'm not sure about pasting this into most of the places we call WALRead(). Wouldn't it be better if we mentioned this somewhere around WALRead() only once? Best, -- Melih Mutlu Microsoft
Re: Remove unused fields in ReorderBufferTupleBuf
Hi, > > But why didn't you pursue your idea of getting rid of the wrapper > > structure ReorderBufferTupleBuf which after this patch will have just > > one member? I think there could be hassles in backpatching bug-fixes > > in some cases but in the longer run it would make the code look clean. > > Indeed. In fact it turned out that I suggested the same above but > apparently forgot: > > > On top of that IMO it doesn't make much sense to keep a one-field > > wrapper structure. Perhaps we should get rid of it entirely and just > > use HeapTupleData instead. > > After actually trying the refactoring I agree that the code becomes > cleaner and it's going to be beneficial in the long run. Here is the > patch. I made a mistake in v4: ``` -alloc_len = tuple_len + SizeofHeapTupleHeader; +alloc_len = tuple_len + HEAPTUPLESIZE; ``` Here is the corrected patch. -- Best regards, Aleksander Alekseev v5-0001-Remove-ReorderBufferTupleBuf-structure.patch Description: Binary data
Re: UUID v7
Hi, > But now (after big timeseries project with multiple time zones and DST > problems) I think differently. > Even though timestamp and timestamptz are practically the same, timestamptz > should be used to store the time in UTC. > Choosing timestamp is more likely to lead to problems and misunderstandings > than timestamptz. As somebody who contributed TZ support to TimescaleDB I'm more or less aware about the pros and cons of Timestamp and TimestampTz :) Engineering is all about compromises. I can imagine a project where it makes sense to use only TimestampTz for the entire database, and the opposite - when it's better to use only UTC and Timestamp. In this particular case I was merely concerned that the particular choice could be confusing for the users but I think I changed my mind by now, see below. >> Here's v12. Changes: >> 1. Documentation improvements >> 2. Code comments >> 3. Better commit message and reviews list > > > Thank you, Andrey! I have just checked v12 – cleanly applied to HEAD, and > functions work well. I especially like that fact that we keep > uuid_extract_time(..) here – this is a great thing to have for time-based > partitioning, and in many cases we will be able to decide not to have a > creation column timestamp (e.g., "created_at") at all, saving 8 bytes. > > The docs and comments look great too. > > Overall, the patch looks mature enough. It would be great to have it in pg17. > Yes, the RFC is not fully finalized yet, but it's very close. And many > libraries are already including implementation of UUIDv7 – here are some > examples: > > - https://www.npmjs.com/package/uuidv7 > - https://crates.io/crates/uuidv7 > - https://github.com/google/uuid/pull/139 Thanks! After playing with v12 I'm inclined to agree that it's RfC. I only have a couple of silly nitpicks: - It could make sense to decompose the C implementation of uuidv7() in two functions, for readability. - It could make sense to get rid of curly braces in SQL tests when calling uuid_extract_ver() and uuid_extract_ver(), for consistency. I'm not going to insist on these changes though and prefer leaving it to the author and the committer to decide. Also I take back what I said above about using Timestamp instead of TimestampTz. I forgot that Timestamps are implicitly casted to TimestampTz's, so users preferring Timestamps can do this: ``` =# select uuidv7('2024-01-22 12:34:56' :: timestamp); uuidv7 -- 018d3085-de00-77c1-9e7b-7b04ddb9ebb9 ``` Cfbot also seems to be happy with the patch so I'm changing the CF entry status to RfC. -- Best regards, Aleksander Alekseev
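To spell out the partitioning use case mentioned above: with uuid_extract_time() the creation time can be recovered straight from the key, so something like the following sketch (using the UUID generated in the example above; output formatting depends on the session time zone) should give back the embedded 2024-01-22 12:34:56 timestamp:
```
=# SELECT uuid_extract_time('018d3085-de00-77c1-9e7b-7b04ddb9ebb9');
```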
psql: Allow editing query results with \gedit
Assuming a SELECT statement reading from a single table, it is quite an effort to transform that statement to an UPDATE statement on that table, perhaps to fix a typo that the user has spotted in the query result. First, the general syntax is not the same with the order of syntax elements changed. Then the row in question needs to be pinned down by the primary key, requiring cut-and-paste of the PK columns. Furthermore, the value to be updated needs to be put into the command, with proper quoting. If the original value spans multiple line, copy-and-pasting it for editing is especially tedious. Suppose the following query where we spot a typo in the 2nd message: =# select id, language, message from messages where language = 'en'; id | language | message 1 | en | Good morning 2 | en | Hello warld The query needs to be transformed into this update: =# update messages set message = 'Hello world' where id = 2; This patch automates the tedious parts by opening the query result in a editor in JSON format, where the user can edit the data. On closing the editor, the JSON data is read back, and the differences are sent as UPDATE commands. New rows are INSERTed, and deleted rows are DELETEd. =# select id, language, message from messages where language = 'en' \gedit An editor opens: [ { "id": 1, "language": "en", "message": "Good morning" }, { "id": 2, "language": "en", "message": "Hello warld" } ] Let's fix the typo and save the file: [ { "id": 1, "language": "en", "message": "Good morning" }, { "id": 2, "language": "en", "message": "Hello world" } ] UPDATE messages SET message = 'Hello world' WHERE id = '2'; UPDATE 1 In this example, typing "WHERE id = 2" would not be too hard, but the primary key might be a composite key, with complex non-numeric values. This is supported as well. If expanded mode (\x) is enabled, \gedit will use the expanded JSON format, best suitable for long values. This patch requires the "psql JSON output format" patch. Christoph >From be3fef72af0adac7d3a6962f8cc78e167785a702 Mon Sep 17 00:00:00 2001 From: Christoph Berg Date: Mon, 22 Jan 2024 14:40:33 +0100 Subject: [PATCH] psql: Allow editing query results with \gedit Assuming a SELECT statement reading from a single table, it is quite an effort to transform that statement to an UPDATE statement on that table, perhaps to fix a typo that the user has spotted in the query result. First, the general syntax is not the same with the order of syntax elements changed. Then the row in question needs to be pinned down by the primary key, requiring cut-and-paste of the PK columns. Furthermore, the value to be updated needs to be put into the command, with proper quoting. If the original value spans multiple line, copy-and-pasting it for editing is especially tedious. Suppose the following query where we spot a typo in the 2nd message: =# select id, language, message from messages where language = 'en'; id | language | message 1 | en | Good morning 2 | en | Hello warld The query needs to be transformed into this update: =# update messages set message = 'Hello world' where id = 2; This patch automates the tedious parts by opening the query result in a editor in JSON format, where the user can edit the data. On closing the editor, the JSON data is read back, and the differences are sent as UPDATE commands. New rows are INSERTed, and deleted rows are DELETEd. 
=# select id, language, message from messages where language = 'en' \gedit An editor opens: [ { "id": 1, "language": "en", "message": "Good morning" }, { "id": 2, "language": "en", "message": "Hello warld" } ] Let's fix the typo and save the file: [ { "id": 1, "language": "en", "message": "Good morning" }, { "id": 2, "language": "en", "message": "Hello world" } ] UPDATE messages SET message = 'Hello world' WHERE id = '2'; UPDATE 1 In this example, typing "WHERE id = 2" would not be too hard, but the primary key might be a composite key, with complex non-numeric values. This is supported as well. If expanded mode (\x) is enabled, \gedit will use the expanded JSON format, best suitable for long values: =# \x =# select id, language, message from messages where language = 'en' \gedit [{ "id": 1, "language": "en", "message": "Good morning" },{ "id": 2, "language": "en", "message": "Hello world" }] \gedit requires the PK (or any UNIQUE key) to be part of the select statement: =# select language, message from messages \gedit \gedit: no key of table "messages" is contained in the returned query columns. Select more columns or manually specify a key using \gedit (key=...) If the user knows that their operation is sound even without a PK/UNIQUE key, they can force a (set of) column(s) to be used: =# select language, message from messages \gedit (key=message) [ { "language": "en", "message": "Good morning" }, { "language": "en_US", "message": "Hello world" } ] UPDATE messages SET language = 'en_US' WHERE message = 'Hello world'; UPDATE 1 -
Re: Synchronizing slots from primary to standby
On Mon, Jan 22, 2024 at 9:26 PM Amit Kapila wrote: > > On Mon, Jan 22, 2024 at 5:28 PM Masahiko Sawada wrote: > > > > On Sat, Jan 20, 2024 at 7:44 PM Amit Kapila wrote: > > > > > > On Sat, Jan 20, 2024 at 10:52 AM Dilip Kumar > > > wrote: > > > > > > > > On Fri, Jan 19, 2024 at 5:24 PM Amit Kapila > > > > wrote: > > > > > > > > > > On Wed, Jan 17, 2024 at 4:00 PM shveta malik > > > > > wrote: > > > > > > > > > > > > > > > > I had some off-list discussions with Sawada-San, Hou-San, and Shveta > > > > > on the topic of extending replication commands instead of using the > > > > > current model where we fetch the required slot information via SQL > > > > > using a database connection. I would like to summarize the discussion > > > > > and would like to know the thoughts of others on this topic. > > > > > > > > > > In the current patch, we launch the slotsync worker on physical > > > > > standby which connects to the specified database (currently we let > > > > > users specify the required dbname in primary_conninfo) on the primary. > > > > > It then fetches the required information for failover marked slots > > > > > from the primary and also does some primitive checks on the upstream > > > > > node via SQL (the additional checks are like whether the upstream node > > > > > has a specified physical slot or whether the upstream node is a > > > > > primary node or a standby node). To fetch the required information it > > > > > uses a libpqwalreciever API which is mostly apt for this purpose as it > > > > > supports SQL execution but for this patch, we don't need a replication > > > > > connection, so we extend the libpqwalreciever connect API. > > > > > > > > What sort of extension we have done to 'libpqwalreciever'? Is it > > > > something like by default this supports replication connections so we > > > > have done an extension to the API so that we can provide an option > > > > whether to create a replication connection or a normal connection? > > > > > > > > > > Yeah and in the future there could be more as well. The other function > > > added walrcv_get_dbname_from_conninfo doesn't appear to be a problem > > > either for now. > > > > > > > > Now, the concerns related to this could be that users would probably > > > > > need to change existing mechanisms/tools to update priamry_conninfo > > > > I'm concerned about this. In fact, a primary_conninfo value generated > > by pg_basebackup does not work with enable_syncslot. > > > > Right, but if we want can't we extend pg_basebackup to do that? It is > just that I am not sure that it is a good idea to extend pg_basebackup > in the first version. Okay. > > > > > > and one of the alternatives proposed is to have an additional GUC like > > > > > slot_sync_dbname. Users won't be able to drop the database this worker > > > > > is connected to aka whatever is specified in slot_sync_dbname but as > > > > > the user herself sets up the configuration it shouldn't be a big deal. > > > > > > > > Yeah for this purpose users may use template1 or so which they > > > > generally don't plan to drop. > > > > > > > > > > Using template1 has other problems like users won't be able to create > > > a new database. See [2] (point number 2.2) > > > > > > > > > > > So in case the user wants to drop that > > > > database user needs to turn off the slot syncing option and then it > > > > can be done? > > > > > > > > > > Right. > > > > If the user wants to continue using slot syncing, they need to switch > > the database to connect. 
Which requires modifying primary_conninfo and > > reloading the configuration file. Which further leads to restarting > > the physical replication. If they use synchronous replication, it > > means the application temporarily stops during that. > > > > Yes, that would be an inconvenience but the point is we don't expect > this to change often. > > > > > > > > > Then we also discussed whether extending libpqwalreceiver's connect > > > > > API is a good idea and whether we need to further extend it in the > > > > > future. As far as I can see, slotsync worker's primary requirement is > > > > > to execute SQL queries which the current API is sufficient, and don't > > > > > see something that needs any drastic change in this API. Note that > > > > > tablesync worker that executes SQL also uses these APIs, so we may > > > > > need something in the future for either of those. Then finally we need > > > > > a slotsync worker to also connect to a database to use SQL and fetch > > > > > results. > > > > > > > > While looking into the patch v64-0002 I could not exactly point out > > > > what sort of extensions are there in libpqwalreceiver.c, I just saw > > > > one extra API for fetching the dbname from connection info? > > > > > > > > > > Right, the worry was that we may need it in the future. > > > > Yes. IIUC the slotsync worker uses libpqwalreceiver to establish a > > non-replication connection and to execute SQL query. But neither of > > them are releva
Re: Add code indentation check to cirrus-ci (was Re: Add BF member koel-like indentation checks to SanityCheck CI)
On 2024-01-21 Su 22:19, Peter Smith wrote: 2024-01 Commitfest. Hi, This patch has a CF status of "Needs Review" [1], but it seems like there was some CFbot test failure last time it was run [2]. Please have a look and post an updated version if necessary. I don't think there's a consensus that we want this. It should probably be returned with feedback. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Re: psql JSON output format
Re: Laurenz Albe > > But I do think it has positive > > value. If we produce output that could be ingested back into PG later > > with the right tool, that leaves the door open for someone to build > > the tool later even if we don't have it today. If we produce output > > that loses information, no tool built later can make up for the loss. > I am all for losing as little information as possible, but I think > that this discussion is going off on a tangent. After all, we are not > talking about a data export tool here, we are talking about psql. I've just posted the other patch where I need the JSON format: https://www.postgresql.org/message-id/flat/Za6EfXeewwLRS_fs%40msg.df7cb.de There, I need to be able to read back the query output into psql, while at the same time being human-readable so the user can sanely edit the data in an editor. The default "aligned" format is only human-readable, while CSV is mostly only machine-readable. JSON is the best option between the two, I think. What I did now in v3 of this patch is to print boolean and numeric values (ints, floats, numeric) without quotes, while adding the quotes back to json. This solves the NULL vs 'null'::json problem. > I don't see anybody complain that float8 values lose precision in > the default aligned format, or that empty strings and NULL values > look the same in aligned format. Why do the goalposts move for the > JSON output format? I don't think psql output should be considered > a form of backup. Fwiw, not quoting numbers in JSON won't have any of these problems if the JSON reader just passes the strings read through. (Which PG's JSON parser does.) > Can we get consensus that SQL NULL columns should be omitted from the > output, and the rest left as it currently is? I think that would be an interesting option for a JSON export format. The psql JSON format is more for human inspection, where omitting the columns might create confusion. (We could make it a pset parameter of the format, but I think the default should be to show NULL columns.) Christoph >From 2b575846fee609237256fe8eaf38fd45005fe8da Mon Sep 17 00:00:00 2001 From: Christoph Berg Date: Fri, 8 Sep 2023 15:59:29 +0200 Subject: [PATCH] Add JSON output format to psql Query results are formatted as an array of JSON objects. In non-expanded mode, one object per line is printed, and in expanded mode, one key-value pair. Use `\pset format json` to enable, or run `psql --json` or `psql -J`. NULL values are printed as `null`, independently from the psql null setting. Numeric values are printed in plain, booleans as true/false, and other values are printed as quoted strings. --- doc/src/sgml/ref/psql-ref.sgml | 26 - src/bin/psql/command.c | 6 +- src/bin/psql/help.c| 1 + src/bin/psql/startup.c | 6 +- src/bin/psql/tab-complete.c| 2 +- src/fe_utils/print.c | 182 - src/include/fe_utils/print.h | 6 +- src/test/regress/expected/psql.out | 79 + src/test/regress/sql/psql.sql | 28 + 9 files changed, 325 insertions(+), 11 deletions(-) diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml index cc7d797159..f1f7eda082 100644 --- a/doc/src/sgml/ref/psql-ref.sgml +++ b/doc/src/sgml/ref/psql-ref.sgml @@ -274,6 +274,17 @@ EOF + + -J + --json + + + Switches to JSON output mode. This is + equivalent to \pset format json. + + + + -l --list @@ -2956,6 +2967,7 @@ lo_import 152801 asciidoc, csv, html, + json, latex, latex-longtable, troff-ms, unaligned, or wrapped. 
@@ -2972,7 +2984,7 @@ lo_import 152801 in by other programs, for example, tab-separated or comma-separated format. However, the field separator character is not treated specially if it appears in a column's value; - so CSV format may be better suited for such + the CSV or JSON formats may be better suited for such purposes. @@ -2997,6 +3009,18 @@ lo_import 152801 \pset csv_fieldsep. + json format + + JSON format + in psql + + writes output in JSON format, described in + https://www.ietf.org/rfc/rfc4627.txt";>RFC 4627. + The result is an array of JSON objects. In non-expanded mode, one + object per line is printed, while in expanded mode, each key-value + pair is a separate line. + + wrapped format is like aligned but wraps wide data values across lines to make the output fit in the target column width. The target width is determined as described under diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c index 5c906e4806..824900819b 100644 --- a/src/bin/psql/comm
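As an illustration of the v3 behavior described above (layout approximate, one object per line as in non-expanded mode): SQL NULL, numbers and booleans come out unquoted, while a json-typed 'null' stays a quoted string, so the two remain distinguishable:
```
=# \pset format json
Output format is json.
=# SELECT 1 AS n, true AS b, NULL AS missing, 'null'::json AS j;
[
{ "n": 1, "b": true, "missing": null, "j": "null" }
]
```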
Re: Support TZ format code in to_timestamp()
Hi, > > Hi, this patch was marked in CF as "Needs Review" [1], but there has > > been no activity on this thread for 7+ months. > > If nothing more is planned for this thread then it will be closed > > ("Returned with feedback") at the end of this CF. > > I still think it would be a good idea, but I can't deny the lack > of other interest in it. Unless someone steps up to review, > let's close it. I agree that it would be a good idea, and again I would like to condemn the approach "since no one reviews it we are going to reject it". A friendly reminder like "hey, this patch was waiting long enough, maybe someone could take a look" would be more appropriate IMO. I remember during previous commitfests some CF managers created a list of patches that could use more attention. That was useful. I will review the patch, but probably only tomorrow. -- Best regards, Aleksander Alekseev
Re: the s_lock_stuck on perform_spin_delay
On Mon, Jan 22, 2024 at 2:22 AM Andy Fan wrote: > I used sigismember(&BlockSig, SIGQUIT) to detect if a process is doing a > quickdie, however this is bad not only because it doesn't work on > Windows, but also it has too poor performance even it impacts on > USE_ASSERT_CHECKING build only. In v8, I introduced a new global > variable quickDieInProgress to handle this. OK, I like the split between 0001 and 0002. I still think 0001 has cosmetic problems, but if some committer wants to take it forward, they can decide what to do about that; you and I going back and forth doesn't seem like the right approach to sorting that out. Whether or not 0002 is adopted might affect what we do about the cosmetics in 0001, too. 0003 seems ... unfortunate. It seems like an admission that 0001 is wrong. Surely it *isn't* right to ignore the spinlock restrictions in quickdie() in general. For example, we could self-deadlock if we try to acquire a spinlock we already hold. If the problem here is merely the call in errstart() then maybe we need to rethink that particular call. If it goes any deeper than that, maybe we've got actual bugs we need to fix. +* It's likely to check the BlockSig to know if it is doing a quickdie +* with sigismember, but it is too expensive in test, so introduce +* quickDieInProgress to avoid that. This isn't very good English -- I realize that can sometimes be hard -- but also -- I don't think it likely that a future hacker would wonder why this isn't done that way. A static variable is normal for PostgreSQL; checking the signal mask would be a completely novel approach. So I think this comment is missing the mark topically. If this patch is right at all, the comment here should focus on why disabling these checks in quickdie() is necessary and appropriate, not why it's coded to match everything else in the system. -- Robert Haas EDB: http://www.enterprisedb.com
Re: psql: Allow editing query results with \gedit
Hi po 22. 1. 2024 v 16:06 odesílatel Christoph Berg napsal: > Assuming a SELECT statement reading from a single table, it is quite an > effort to transform that statement to an UPDATE statement on that table, > perhaps to fix a typo that the user has spotted in the query result. > > First, the general syntax is not the same with the order of syntax > elements changed. Then the row in question needs to be pinned down by > the primary key, requiring cut-and-paste of the PK columns. Furthermore, > the value to be updated needs to be put into the command, with proper > quoting. If the original value spans multiple line, copy-and-pasting it > for editing is especially tedious. > > Suppose the following query where we spot a typo in the 2nd message: > > =# select id, language, message from messages where language = 'en'; > id | language | message > 1 | en | Good morning > 2 | en | Hello warld > > The query needs to be transformed into this update: > > =# update messages set message = 'Hello world' where id = 2; > > This patch automates the tedious parts by opening the query result in a > editor in JSON format, where the user can edit the data. On closing the > editor, the JSON data is read back, and the differences are sent as > UPDATE commands. New rows are INSERTed, and deleted rows are DELETEd. > > =# select id, language, message from messages where language = 'en' \gedit > > An editor opens: > [ > { "id": 1, "language": "en", "message": "Good morning" }, > { "id": 2, "language": "en", "message": "Hello warld" } > ] > > Let's fix the typo and save the file: > [ > { "id": 1, "language": "en", "message": "Good morning" }, > { "id": 2, "language": "en", "message": "Hello world" } > ] > UPDATE messages SET message = 'Hello world' WHERE id = '2'; > UPDATE 1 > > In this example, typing "WHERE id = 2" would not be too hard, but the > primary key might be a composite key, with complex non-numeric values. > This is supported as well. > > If expanded mode (\x) is enabled, \gedit will use the expanded JSON > format, best suitable for long values. > > > This patch requires the "psql JSON output format" patch. > Introduction of \gedit is interesting idea, but in this form it looks too magic a) why the data are in JSON format, that is not native for psql (minimally now) b) the implicit transformation to UPDATEs and the next evaluation can be pretty dangerous. The concept of proposed feature is interesting, but the name \gedit is too generic, maybe too less descriptive for this purpose Maybe \geditupdates can be better - but still it can be dangerous and slow (without limits) In the end I am not sure if I like it or dislike it. Looks dangerous. I can imagine possible damage when some people will see vi first time and will try to finish vi, but in this command, it will be transformed to executed UPDATEs. More generating UPDATEs without knowledge of table structure (knowledge of PK) can be issue (and possibly dangerous too), and you cannot to recognize PK from result of SELECT (Everywhere PK is not "id" and it is not one column). Regards Pavel > Christoph >
Re: remaining sql/json patches
On Mon, Jan 22, 2024 at 10:28 PM Amit Langote wrote: > > > based on v35. > > Now I only applied from 0001 to 0007. > > For {DEFAULT expression ON EMPTY} | {DEFAULT expression ON ERROR} > > restrict DEFAULT expression be either Const node or FuncExpr node. > > so these 3 SQL/JSON functions can be used in the btree expression index. > > I'm not really excited about adding these restrictions into the > transformJsonFuncExpr() path. Index or any other code that wants to > put restrictions already have those in place, no need to add them > here. Moreover, by adding these restrictions, we might end up > preventing users from doing useful things with this like specify > column references. If there are semantic issues with allowing that, > we should discuss them. > after applying v36. The following index creation and query operation works. I am not 100% sure about these cases. just want confirmation, sorry for bothering you drop table t; create table t(a jsonb, b int); insert into t select '{"hello":11}',1; insert into t select '{"hello":12}',2; CREATE INDEX t_idx2 ON t (JSON_query(a, '$.hello1' RETURNING int default b + random() on error)); CREATE INDEX t_idx3 ON t (JSON_query(a, '$.hello1' RETURNING int default random()::int on error)); SELECT JSON_query(a, '$.hello1' RETURNING int default ret_setint() on error) from t; SELECT JSON_query(a, '$.hello1' RETURNING int default sum(b) over() on error) from t; SELECT JSON_query(a, '$.hello1' RETURNING int default sum(b) on error) from t group by a; but the following cases will fail related to index and default expression. create table zz(a int, b int); CREATE INDEX zz_idx1 ON zz ( (b + random()::int)); create table (a int, b int default ret_setint()); create table (a int, b int default sum(b) over());
Re: Support TZ format code in to_timestamp()
> On 22 Jan 2024, at 03:10, Tom Lane wrote: > I still think it would be a good idea, but I can't deny the lack > of other interest in it. Unless someone steps up to review, > let's close it. Since I had this on my (ever-growing) TODO I re-prioritized today and took a look at it since I think it's something we should support. Nothing really sticks out and I was unable to poke any holes so I don't have too much more to offer than a LGTM. + while (len > 0) + { + const datetkn *tp = datebsearch(lowtoken, zoneabbrevtbl->abbrevs, + zoneabbrevtbl->numabbrevs); My immediate reaction was that we should stop at prefix lengths of two since I could only think of abbreviations of two or more. Googling and reading found that there are indeed one-letter timezones (Alpha, Bravo, etc.). Not sure if it's worth mentioning that in the comment to help other readers who aren't neck deep in timezones? + /* FALL THRU */ Tiny nitpick, it looks a bit curious that we spell it FALL THRU here and "fall through" a few cases up in the same switch. While we are quite inconsistent across the tree, consistency within a file is preferable (regardless of which). -- Daniel Gustafsson
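For reference, the kind of input the patch is meant to accept looks roughly like this (a sketch; the one-letter Zulu/UTC abbreviation is an example of the short prefixes discussed above):
```
=# SELECT to_timestamp('2024-01-22 15:30:00 EST', 'YYYY-MM-DD HH24:MI:SS TZ');
=# SELECT to_timestamp('2024-01-22 15:30:00 Z', 'YYYY-MM-DD HH24:MI:SS TZ');
```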
Re: Add tuples_skipped to pg_stat_progress_copy
On 2024-01-17 14:47, Masahiko Sawada wrote: On Wed, Jan 17, 2024 at 2:22 PM torikoshia wrote: Hi, 132de9968840c introduced SAVE_ERROR_TO option to COPY and enabled to skip malformed data, but there is no way to watch the number of skipped rows during COPY. Attached patch adds tuples_skipped to pg_stat_progress_copy, which counts the number of skipped tuples because source data is malformed. If SAVE_ERROR_TO is not specified, this column remains zero. The advantage would be that users can quickly notice and stop COPYing when there is a larger amount of skipped data than expected, for example. As described in commit log, it is expected to add more choices for SAVE_ERROR_TO like 'log' and using such options may enable us to know the number of skipped tuples during COPY, but exposed in pg_stat_progress_copy would be easier to monitor. What do you think? +1 The patch is pretty simple. Here is a comment: + (if SAVE_ERROR_TO is specified, otherwise zero). + + To be precise, this counter only advances when a value other than 'ERROR' is specified to SAVE_ERROR_TO option. Thanks for your comment and review! Updated the patch according to your comment and option name change by b725b7eec. BTW, based on this patch, I think we can add another option which specifies the maximum tolerable number of malformed rows. I remember this was discussed in [1], and feel it would be useful when loading 'dirty' data but there is a limit to how dirty it can be. Attached 0002 is WIP patch for this(I haven't added doc yet). This may be better discussed in another thread, but any comments(e.g. necessity of this option, option name) are welcome. [1] https://www.postgresql.org/message-id/752672.1699474336%40sss.pgh.pa.us -- Regards, -- Atsushi Torikoshi NTT DATA Group CorporationFrom 571ada768bdb68a31f295cbcb28f4348f253989d Mon Sep 17 00:00:00 2001 From: Atsushi Torikoshi Date: Mon, 22 Jan 2024 23:57:24 +0900 Subject: [PATCH v2 1/2] Add tuples_skipped to pg_stat_progress_copy 132de9968840c enabled COPY to skip malformed data, but there is no way to watch the number of skipped rows during COPY. This patch adds tuples_skipped to pg_stat_progress_copy, which counts the number of skipped tuple because source data is malformed. This column only advances when a value other than stop is specified to ON_ERROR. Needs catalog bump. --- doc/src/sgml/monitoring.sgml | 11 +++ src/backend/catalog/system_views.sql | 3 ++- src/backend/commands/copyfrom.c | 5 + src/include/commands/progress.h | 1 + src/test/regress/expected/rules.out | 3 ++- 5 files changed, 21 insertions(+), 2 deletions(-) diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml index 6e74138a69..cfc13b3580 100644 --- a/doc/src/sgml/monitoring.sgml +++ b/doc/src/sgml/monitoring.sgml @@ -5780,6 +5780,17 @@ FROM pg_stat_get_backend_idset() AS backendid; WHERE clause of the COPY command. + + + + tuples_skipped bigint + + + Number of tuples skipped because they contain malformed data. + This counter only advances when a value other than + stop is specified to ON_ERROR. 
+ + diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql index e43e36f5ac..6288270e2b 100644 --- a/src/backend/catalog/system_views.sql +++ b/src/backend/catalog/system_views.sql @@ -1318,7 +1318,8 @@ CREATE VIEW pg_stat_progress_copy AS S.param1 AS bytes_processed, S.param2 AS bytes_total, S.param3 AS tuples_processed, -S.param4 AS tuples_excluded +S.param4 AS tuples_excluded, +S.param7 AS tuples_skipped FROM pg_stat_get_progress_info('COPY') AS S LEFT JOIN pg_database D ON S.datid = D.oid; diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c index 173a736ad5..8ab3777664 100644 --- a/src/backend/commands/copyfrom.c +++ b/src/backend/commands/copyfrom.c @@ -650,6 +650,7 @@ CopyFrom(CopyFromState cstate) CopyMultiInsertInfo multiInsertInfo = {0}; /* pacify compiler */ int64 processed = 0; int64 excluded = 0; + int64 skipped = 0; bool has_before_insert_row_trig; bool has_instead_insert_row_trig; bool leafpart_use_multi_insert = false; @@ -1012,6 +1013,10 @@ CopyFrom(CopyFromState cstate) */ cstate->escontext->error_occurred = false; + /* Report that this tuple was skipped by the ON_ERROR clause */ + pgstat_progress_update_param(PROGRESS_COPY_TUPLES_SKIPPED, + ++skipped); + continue; } diff --git a/src/include/commands/progress.h b/src/include/commands/progress.h index a458c8c50a..73afa77a9c 100644 --- a/src/include/commands/progress.h +++ b/src/include/commands/progress.h @@ -142,6 +142,7 @@ #define PROGRESS_COPY_TUPLES_EXCLUDED 3 #define PROGRESS_COPY_COMMAND 4 #define PROGRESS_COPY_TYPE 5 +#define PROGRESS_COPY_TUPLES_SKIPPED 6 /*
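With the column in place, a long-running load can be watched from another session along these lines (a sketch using the view and column names from the patch):
```
=# SELECT relid::regclass, tuples_processed, tuples_excluded, tuples_skipped
     FROM pg_stat_progress_copy;
```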
Re: Adding facility for injection points (or probe points?) for more advanced tests
On 22/01/2024 06:38, Michael Paquier wrote: 0001~0004 have been now applied, and I'm marking the CF entry as committed. Woo-hoo! I wrote the attached patch to enable injection points in the Cirrus CI config, to run the injection tests I wrote for a GIN bug today [1]. But that led to a crash in the asan-enabled build [2]. I didn't investigate it yet. [1] https://www.postgresql.org/message-id/d8f0b068-0e6e-4b2c-8932-62507eb7e1c6%40iki.fi [2] https://cirrus-ci.com/task/5242888636858368 -- Heikki Linnakangas Neon (https://neon.tech) From d641364bccec051761a9360453162929ef77ea3d Mon Sep 17 00:00:00 2001 From: Heikki Linnakangas Date: Mon, 22 Jan 2024 17:44:20 +0200 Subject: [PATCH 1/1] Enable injection points in cirrus CI --- .cirrus.tasks.yml | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml index e137769850d..0eacbbf3721 100644 --- a/.cirrus.tasks.yml +++ b/.cirrus.tasks.yml @@ -172,7 +172,7 @@ task: su postgres <<-EOF meson setup \ --buildtype=debug \ --Dcassert=true -Duuid=bsd -Dtcl_version=tcl86 -Ddtrace=auto \ +-Dcassert=true -Dinjection_points=true -Duuid=bsd -Dtcl_version=tcl86 -Ddtrace=auto \ -DPG_TEST_EXTRA="$PG_TEST_EXTRA" \ -Dextra_lib_dirs=/usr/local/lib -Dextra_include_dirs=/usr/local/include/ \ build @@ -327,7 +327,7 @@ task: configure_script: | su postgres <<-EOF ./configure \ ---enable-cassert --enable-debug --enable-tap-tests \ +--enable-cassert --enable-injection-points --enable-debug --enable-tap-tests \ --enable-nls \ --with-segsize-blocks=6 \ \ @@ -357,7 +357,7 @@ task: su postgres <<-EOF meson setup \ --buildtype=debug \ --Dcassert=true \ +-Dcassert=true -Dinjection_points=true \ ${LINUX_MESON_FEATURES} \ -DPG_TEST_EXTRA="$PG_TEST_EXTRA" \ build @@ -370,7 +370,7 @@ task: export CC='ccache gcc -m32' meson setup \ --buildtype=debug \ --Dcassert=true \ +-Dcassert=true -Dinjection_points=true \ ${LINUX_MESON_FEATURES} \ -Dllvm=disabled \ --pkg-config-path /usr/lib/i386-linux-gnu/pkgconfig/ \ @@ -482,7 +482,7 @@ task: --buildtype=debug \ -Dextra_include_dirs=/opt/local/include \ -Dextra_lib_dirs=/opt/local/lib \ - -Dcassert=true \ + -Dcassert=true -Dinjection_points=true \ -Duuid=e2fs -Ddtrace=auto \ -DPG_TEST_EXTRA="$PG_TEST_EXTRA" \ build @@ -556,7 +556,7 @@ task: # Use /DEBUG:FASTLINK to avoid high memory usage during linking configure_script: | vcvarsall x64 -meson setup --backend ninja --buildtype debug -Dc_link_args=/DEBUG:FASTLINK -Dcassert=true -Db_pch=true -Dextra_lib_dirs=c:\openssl\1.1\lib -Dextra_include_dirs=c:\openssl\1.1\include -DTAR=%TAR% -DPG_TEST_EXTRA="%PG_TEST_EXTRA%" build +meson setup --backend ninja --buildtype debug -Dc_link_args=/DEBUG:FASTLINK -Dcassert=true -Dinjection_points=true -Db_pch=true -Dextra_lib_dirs=c:\openssl\1.1\lib -Dextra_include_dirs=c:\openssl\1.1\include -DTAR=%TAR% -DPG_TEST_EXTRA="%PG_TEST_EXTRA%" build build_script: | vcvarsall x64 @@ -616,7 +616,7 @@ task: # disable -Dnls as the number of files it creates cause a noticable slowdown configure_script: | -%BASH% -c "meson setup -Ddebug=true -Doptimization=g -Dcassert=true -Db_pch=true -Dnls=disabled -DTAR=%TAR% build" +%BASH% -c "meson setup -Ddebug=true -Doptimization=g -Dcassert=true -Dinjection_points=true -Db_pch=true -Dnls=disabled -DTAR=%TAR% build" build_script: | %BASH% -c "ninja -C build" -- 2.39.2
Re: pg_stat_statements and "IN" conditions
> On Mon, Jan 22, 2024 at 05:33:26PM +1100, Peter Smith wrote: > 2024-01 Commitfest. > > Hi, This patch has a CF status of "Needs Review" [1], but it seems > there was a CFbot test failure last time it was run [2]. Please have a > look and post an updated version if necessary. > > == > [1] https://commitfest.postgresql.org/46/2837/ > [2] https://cirrus-ci.com/task/6688578378399744 It's the same failing pipeline Vignesh C was talking about above. I've fixed the issue in the latest patch version, but it looks like it wasn't picked up yet (from what I understand, the latest build for this CF is 8 weeks old).
Re: psql: Allow editing query results with \gedit
Pavel Stehule writes: > po 22. 1. 2024 v 16:06 odesílatel Christoph Berg napsal: >> This patch automates the tedious parts by opening the query result in a >> editor in JSON format, where the user can edit the data. On closing the >> editor, the JSON data is read back, and the differences are sent as >> UPDATE commands. New rows are INSERTed, and deleted rows are DELETEd. > Introduction of \gedit is interesting idea, but in this form it looks too > magic Yeah, I don't like it either --- it feels like something that belongs in an ETL tool not psql. The sheer size of the patch shows how far afield it is from anything that psql already does, necessitating writing tons of stuff that was not there before. The bits that try to parse the query to get necessary information seem particularly half-baked. > In the end I am not sure if I like it or dislike it. Looks dangerous. I can > imagine possible damage when some people will see vi first time and will > try to finish vi, but in this command, it will be transformed to executed > UPDATEs. Yup -- you'd certainly want some way of confirming that you actually want the changes applied. Our existing edit commands load the edited string back into the query buffer, where you can \r it if you don't want to run it. But I fear that the results of this operation would be too long for that to be workable. > More generating UPDATEs without knowledge of table structure > (knowledge of PK) can be issue (and possibly dangerous too), and you cannot > to recognize PK from result of SELECT (Everywhere PK is not "id" and it is > not one column). It does look like it's trying to find out the pkey from the system catalogs ... however, it's also accepting unique constraints without a lot of thought about the implications of NULLs. Even if you have a pkey, it's not exactly clear to me what should happen if the user changes the contents of a pkey field. That could be interpreted as either an INSERT or UPDATE, I think. Also, while I didn't read too closely, it's not clear to me how the code could reliably distinguish INSERT vs UPDATE vs DELETE cases --- surely we're not going to try to put a "diff" engine into this, and even if we did, diff is well known for producing somewhat surprising decisions about exactly which old lines match which new ones. That's part of the reason why I really don't like the idea that the deduced change commands will be summarily applied without the user even seeing them. regards, tom lane
Re: remaining sql/json patches
On 2024-Jan-18, Alvaro Herrera wrote: > > commands/explain.c (Hmm, I think this is a preexisting bug actually) > > > > 3893 18 : case T_TableFuncScan: > > 3894 18 : Assert(rte->rtekind == RTE_TABLEFUNC); > > 3895 18 : if (rte->tablefunc) > > 3896 0 : if (rte->tablefunc->functype == > > TFT_XMLTABLE) > > 3897 0 : objectname = "xmltable"; > > 3898 : else/* Must be > > TFT_JSON_TABLE */ > > 3899 0 : objectname = "json_table"; > > 3900 : else > > 3901 18 : objectname = NULL; > > 3902 18 : objecttag = "Table Function Name"; > > 3903 18 : break; > > Indeed I was completely wrong about this, and in order to gain coverage the only thing we needed was to add an EXPLAIN that uses the JSON format. I did that just now. I think your addition here works just fine. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
Re: Support TZ format code in to_timestamp()
Daniel Gustafsson writes: > On 22 Jan 2024, at 03:10, Tom Lane wrote: > + while (len > 0) > + { > + const datetkn *tp = datebsearch(lowtoken, zoneabbrevtbl->abbrevs, > + zoneabbrevtbl->numabbrevs); > My immediate reaction was that we should stop at prefix lengths of two since I > could only think of abbreviations of two or more. Googling and reading found > that there are indeed one-letter timezones (Alpha, Bravo etc..). Not sure if > it's worth mentioning that in the comment to help other readers who aren't > neck > deep in timezones? The one I usually think of is "Z" for UTC; I wasn't actually aware that there were any other single-letter abbrevs. But in any case I don't see a reason for this code to be making such assumptions. > + /* FALL THRU */ > Tiny nitpick, it looks a bit curious that we spell it FALL THRU here and "fall > through" a few cases up in the same switch. While we are quite inconsistent > across the tree, consistency within a file is preferrable (regardless of > which). Fair. I tend to shorten it, but I failed to notice that there was nearby precedent for the other way. regards, tom lane
Re: psql: Allow editing query results with \gedit
On Mon, Jan 22, 2024 at 8:06 AM Christoph Berg wrote: > Assuming a SELECT statement reading from a single table, it is quite an > effort to transform that statement to an UPDATE statement on that table, > perhaps to fix a typo that the user has spotted in the query result. > > Building off the other comments, I'd suggest trying to get rid of the intermediate JSON format and also just focus on a single row at any given time. For an update, the first argument to the metacommand could be the unique key value present in the previous result. The resultant UPDATE would just put that into the where clause, and every other column in the result would become a SET clause column with its current value filled in, ready to be edited. DELETE would be similar but without the need for a SET clause. INSERT can produce a template INSERT (cols) VALUES ... command with some smarts regarding auto-incrementing keys and placeholder values. David J.
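A rough sketch of what that interaction could look like, reusing the messages example from upthread (the metacommand name and syntax here are entirely hypothetical):
```
-- hypothetical: "2" is the key value of the row to edit
=# select id, language, message from messages where language = 'en' \gupdaterow 2
-- the editor would then open on a ready-made template such as:
UPDATE messages SET language = 'en', message = 'Hello warld' WHERE id = 2;
```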
Re: pg_stat_statements and "IN" conditions
Dmitry Dolgov <9erthali...@gmail.com> writes: >> On Mon, Jan 22, 2024 at 05:33:26PM +1100, Peter Smith wrote: >> Hi, This patch has a CF status of "Needs Review" [1], but it seems >> there was a CFbot test failure last time it was run [2]. Please have a >> look and post an updated version if necessary. >> >> == >> [1] https://commitfest.postgresql.org/46/2837/ >> [2] https://cirrus-ci.com/task/6688578378399744 > It's the same failing pipeline Vignesh C was talking above. I've fixed > the issue in the latest patch version, but looks like it wasn't picked > up yet (from what I understand, the latest build for this CF is 8 weeks > old). Please notice that at the moment, it's not being tested at all because of a patch-apply failure -- that's what the little triangular symbol means. The rest of the display concerns the test results from the last successfully-applied patch version. (Perhaps that isn't a particularly great UI design.) If you click on the triangle you find out == Applying patches on top of PostgreSQL commit ID b0f0a9432d0b6f53634a96715f2666f6d4ea25a1 === === applying patch ./v17-0001-Prevent-jumbling-of-every-element-in-ArrayExpr.patch patching file contrib/pg_stat_statements/Makefile Hunk #1 FAILED at 19. 1 out of 1 hunk FAILED -- saving rejects to file contrib/pg_stat_statements/Makefile.rej patching file contrib/pg_stat_statements/expected/merging.out patching file contrib/pg_stat_statements/meson.build ... regards, tom lane
Re: Add \syncpipeline command to pgbench
On 2024-Jan-22, Anthonin Bonnefoy wrote: > 0001 introduces a new error when the end of a pgbench script is > reached while there's still an ongoing pipeline. Pushed, backpatched to 14. I reworded the error message to be client %d aborted: end of script reached with pipeline open I hope this is OK. I debated a half a dozen alternatives ("with open pipeline", "without closing pipeline", "with unclosed pipeline" (???), "leaving pipeline open") and decided this was the least ugly. Thanks, -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ really, I see PHP as like a strange amalgamation of C, Perl, Shell inflex: you know that "amalgam" means "mixture with mercury", more or less, right? i.e., "deadly poison"
Re: the s_lock_stuck on perform_spin_delay
Robert Haas writes: > On Mon, Jan 22, 2024 at 2:22 AM Andy Fan wrote: >> I used sigismember(&BlockSig, SIGQUIT) to detect if a process is doing a >> quickdie, however this is bad not only because it doesn't work on >> Windows, but also it has too poor performance even it impacts on >> USE_ASSERT_CHECKING build only. In v8, I introduced a new global >> variable quickDieInProgress to handle this. > > OK, I like the split between 0001 and 0002. I still think 0001 has > cosmetic problems, but if some committer wants to take it forward, > they can decide what to do about that; you and I going back and forth > doesn't seem like the right approach to sorting that out. Whether or > not 0002 is adopted might affect what we do about the cosmetics in > 0001, too. Replacing ASSERT_NO_SPIN_LOCK with VerifyNoSpinLocksHeld, or adding more comments to each call of VerifyNoSpinLocksHeld, is exactly the kind of suggestion that makes me happy, since it makes things both cleaner and more practical. So I'd be absolutely willing to do more stuff like this. Thanks for such suggestions! That said, I can't figure out what the remaining cosmetic problems are... Since you are saying they may be related to 0002, I guess you are talking about the naming of START_SPIN_LOCK and END_SPIN_LOCK? > > 0003 seems ... unfortunate. It seems like an admission that 0001 is > wrong. Yes, that's what I was thinking. I wasn't sure whether I should merge 0003 into 0001 directly during this discussion, and in the end I kept it separate for easier discussion. > Surely it *isn't* right to ignore the spinlock restrictions in > quickdie() in general. For example, we could self-deadlock if we try > to acquire a spinlock we already hold. If the problem here is merely > the call in errstart() then maybe we need to rethink that particular > call. If it goes any deeper than that, maybe we've got actual bugs we > need to fix. I get your point! Acquiring an already held spinlock in quickdie is unlikely to happen, but since our existing infrastructure can handle it, then there is no reason to bypass it. Since the problem here is just errstart, we can do "if (!quickDieInProgress) VerifyNoSpinLocksHeld();" in errstart only. Another place besides errstart is the CHECK_FOR_INTERRUPTS in errfinish; I think we can add the same VerifyNoSpinLocksHeld check to CHECK_FOR_INTERRUPTS. > > +* It's likely to check the BlockSig to know if it is doing a quickdie > +* with sigismember, but it is too expensive in test, so introduce > +* quickDieInProgress to avoid that. > > This isn't very good English -- I realize that can sometimes be hard > -- but also -- I don't think it likely that a future hacker would > wonder why this isn't done that way. A static variable is normal for > PostgreSQL; checking the signal mask would be a completely novel > approach. So I think this comment is missing the mark topically. I was wrong to think sigismember is a syscall; now I see it is just a function in glibc. Even though I can't find its source code, I think it should just be some bit operation based on the definition of __sigset_t. Another drawback of sigismember is that it is not available on Windows. It is still unclear to me why sigaddset in quickdie compiles on Windows (I have a sigismember version and get a linker error on Windows). This is a blocker for me before submitting the next version of the patch. > If > this patch is right at all, the comment here should focus on why > disabling these checks in quickdie() is necessary and appropriate, not > why it's coded to match everything else in the system.
I agree, and I think patch 0003 is not right at all :( -- Best Regards Andy Fan
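To make the gate Andy describes concrete, here is a minimal assert-build-only sketch. The names quickDieInProgress and VerifyNoSpinLocksHeld() are the ones used by the patches in this thread; the wrapper function and where it would be called from (errstart(), CHECK_FOR_INTERRUPTS(), ...) are hypothetical, and whether such a bypass is acceptable at all is exactly what is being debated above:

#include <stdbool.h>

extern bool quickDieInProgress;		/* set on entry to quickdie(), per the patch */
extern void VerifyNoSpinLocksHeld(void);

/* Hypothetical helper: run the no-spinlock-held check except while dying. */
static inline void
VerifyNoSpinLocksHeldUnlessDying(void)
{
#ifdef USE_ASSERT_CHECKING
	if (!quickDieInProgress)
		VerifyNoSpinLocksHeld();
#endif
}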
Re: XLog size reductions: smaller XLRec block header for PG17
On Mon, 22 Jan 2024 at 16:08, Aleksander Alekseev wrote: > > Hi, > > > I'm seeing that there has been no activity in this thread for nearly 4 > > months, I'm planning to close this in the current commitfest unless > > someone is planning to take it forward. > > I don't think that closing CF entries purely due to inactivity is a > good practice (neither something we did before) as long as there is > code, it applies, etc. There are a lot of patches and few people > working on them. Inactivity in a given thread doesn't necessarily > indicate lack of interest, more likely lack of resources. There are a lot of patches like this, and there is no clear way to find out whether someone wants to work on them or has lost interest. That is the reason I thought to send out a mail, so that the author/reviewer can reply and move the entry to the next state, such as Ready for Committer. If the author/reviewer is not planning to work on it in this commitfest but plans to work on it in the next one, we can move it to the next commitfest. I don't see a better way to identify whether a patch still has interest behind it or not. Regards, Vignesh
Re: the s_lock_stuck on perform_spin_delay
On Mon, Jan 22, 2024 at 11:58 AM Andy Fan wrote: > I get your point! Acquiring an already held spinlock in quickdie is > unlikely to happen, but since our existing infrastructure can handle it, > then there is no reason to bypass it. No, the existing infrastructure cannot handle that at all. -- Robert Haas EDB: http://www.enterprisedb.com
Re: Improving EXPLAIN's display of SubPlan nodes
Aleksander Alekseev writes: > Although something like: > ``` > + Filter: (ANY (base_tbl.a = $1) FROM SubPlan 1 (returns $1)) > + SubPlan 1 (returns $1) > ``` > ... arguably doesn't give much more information to the user comparing > to what we have now: > ``` > - Filter: (SubPlan 1) > - SubPlan 1 > ``` Yeah, I would probably not have thought to do this on my own; but we had an actual user request for it. I think arguably the main benefit is to confirm "yes, this is the sub-select you think it is". The main thing that's still missing compared to what is in the plan data structure is information about which Param is which. I think we have the subplan output Params relatively well covered through the expedient of listing them in the generated plan_name, but it's still not apparent to me how we could shoehorn subplan input Params into this (or whether it's worth doing). regards, tom lane
Re: pg_stat_statements and "IN" conditions
> On Mon, Jan 22, 2024 at 11:35:22AM -0500, Tom Lane wrote: > Dmitry Dolgov <9erthali...@gmail.com> writes: > >> On Mon, Jan 22, 2024 at 05:33:26PM +1100, Peter Smith wrote: > >> Hi, This patch has a CF status of "Needs Review" [1], but it seems > >> there was a CFbot test failure last time it was run [2]. Please have a > >> look and post an updated version if necessary. > >> > >> == > >> [1] https://commitfest.postgresql.org/46/2837/ > >> [2] https://cirrus-ci.com/task/6688578378399744 > > > It's the same failing pipeline Vignesh C was talking above. I've fixed > > the issue in the latest patch version, but looks like it wasn't picked > > up yet (from what I understand, the latest build for this CF is 8 weeks > > old). > > Please notice that at the moment, it's not being tested at all because > of a patch-apply failure -- that's what the little triangular symbol > means. The rest of the display concerns the test results from the > last successfully-applied patch version. (Perhaps that isn't a > particularly great UI design.) > > If you click on the triangle you find out > > == Applying patches on top of PostgreSQL commit ID > b0f0a9432d0b6f53634a96715f2666f6d4ea25a1 === > === applying patch > ./v17-0001-Prevent-jumbling-of-every-element-in-ArrayExpr.patch > patching file contrib/pg_stat_statements/Makefile > Hunk #1 FAILED at 19. > 1 out of 1 hunk FAILED -- saving rejects to file > contrib/pg_stat_statements/Makefile.rej > patching file contrib/pg_stat_statements/expected/merging.out > patching file contrib/pg_stat_statements/meson.build Oh, I see, thanks. Give me a moment, will fix this.
Re: Improving EXPLAIN's display of SubPlan nodes
Hi Aleksander and Tom I do confirm that I requested to get this information, in order to recover the formula to filter on. Thanks to both of you Chantal Le 22/01/2024 à 18:07, Tom Lane a écrit : Aleksander Alekseev writes: Although something like: ``` + Filter: (ANY (base_tbl.a = $1) FROM SubPlan 1 (returns $1)) + SubPlan 1 (returns $1) ``` ... arguably doesn't give much more information to the user comparing to what we have now: ``` - Filter: (SubPlan 1) - SubPlan 1 ``` Yeah, I would probably not have thought to do this on my own; but we had an actual user request for it. I think arguably the main benefit is to confirm "yes, this is the sub-select you think it is". The main thing that's still missing compared to what is in the plan data structure is information about which Param is which. I think we have the subplan output Params relatively well covered through the expedient of listing them in the generated plan_name, but it's still not apparent to me how we could shoehorn subplan input Params into this (or whether it's worth doing). regards, tom lane
Re: the s_lock_stuck on perform_spin_delay
Robert Haas writes: > On Mon, Jan 22, 2024 at 11:58 AM Andy Fan wrote: >> I get your point! Acquiring an already held spinlock in quickdie is >> unlikely to happen, but since our existing infrastructure can handle it, >> then there is no reason to bypass it. > > No, the existing infrastructure cannot handle that at all. Actually, I mean we can handle it without 0003. Am I still wrong? Without 0003, if we acquire a spinlock that we already hold, VerifyNoSpinLocksHeld in SpinLockAcquire should catch it. -- Best Regards Andy Fan
Re: Permute underscore separated components of columns before fuzzy matching
Thank you for bringing that to my attention. Is there a way to subscribe to cf-bot failures? Apparently I confused myself with my naming. I attached a patch that fixes the bug (at least at my cassert test-world run). Regards Arne On 2024-01-22 06:38, Peter Smith wrote: 2024-01 Commitfest. Hi, This patch has a CF status of "Needs Review" [1], but it seems like there were CFbot test failures last time it was run [2]. Please have a look and post an updated version if necessary. == [1] https://commitfest.postgresql.org/46/4282/ [2] https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/46/4282 Kind Regards, Peter Smith.diff --git a/src/backend/parser/parse_relation.c b/src/backend/parser/parse_relation.c index 34a0ec5901..c416be2ea0 100644 --- a/src/backend/parser/parse_relation.c +++ b/src/backend/parser/parse_relation.c @@ -579,32 +579,13 @@ GetCTEForRTE(ParseState *pstate, RangeTblEntry *rte, int rtelevelsup) return NULL;/* keep compiler quiet */ } -/* - * updateFuzzyAttrMatchState - * Using Levenshtein distance, consider if column is best fuzzy match. - */ static void -updateFuzzyAttrMatchState(int fuzzy_rte_penalty, - FuzzyAttrMatchState *fuzzystate, RangeTblEntry *rte, - const char *actual, const char *match, int attnum) +updateFuzzyAttrMatchStateSingleString(int fuzzy_rte_penalty, + FuzzyAttrMatchState *fuzzystate, RangeTblEntry *rte, + const char *actual, const char *match, int attnum, int matchlen) { - int columndistance; - int matchlen; - - /* Bail before computing the Levenshtein distance if there's no hope. */ - if (fuzzy_rte_penalty > fuzzystate->distance) - return; - - /* - * Outright reject dropped columns, which can appear here with apparent - * empty actual names, per remarks within scanRTEForColumn(). - */ - if (actual[0] == '\0') - return; - /* Use Levenshtein to compute match distance. */ - matchlen = strlen(match); - columndistance = + int columndistance = varstr_levenshtein_less_equal(actual, strlen(actual), match, matchlen, 1, 1, 1, fuzzystate->distance + 1 @@ -667,6 +648,142 @@ updateFuzzyAttrMatchState(int fuzzy_rte_penalty, } } +static void putUnderscores(char* string, int start, int underscore_amount, int len) { + for (int i = 0; start + i < len && i < underscore_amount; i++) + string[start + i] = '_'; +} + +/* + * updateFuzzyAttrMatchState + * Using Levenshtein distance, consider if column is best fuzzy match. + */ +static void +updateFuzzyAttrMatchState(int fuzzy_rte_penalty, + FuzzyAttrMatchState *fuzzystate, RangeTblEntry *rte, + const char *actual, const char *match, int attnum) +{ + /* Memory segment to store the current permutation of the match string. */ + char* tmp_match; + int matchlen = strlen(match); + /* We keep track how many permutations we have already processed, to avoid long runtimes. */ + int underscore_permutations_count = 0; + /* The location the underscore we currently process within the match string. */ + int underscore_current = 1; + /* Variables to track the amount of underscores delimiting sections */ + int underscore_amount = 1; + int underscore_second_amount = 1; + + /* Bail before computing the Levenshtein distance if there's no hope. */ + if (fuzzy_rte_penalty > fuzzystate->distance) + return; + + /* + * Outright reject dropped columns, which can appear here with apparent + * empty actual names, per remarks within scanRTEForColumn(). 
+ */ + if (actual[0] == '\0') + return; + + updateFuzzyAttrMatchStateSingleString(fuzzy_rte_penalty, fuzzystate, rte, actual, match, attnum, matchlen); + /* We don't want to permute zero length strings, so check whether the string starts with an underscore. */ + if (match[0] == '_') { + while (underscore_current < matchlen - 1 && match[underscore_current] == '_') { + underscore_current++; + } + } + /* Advance to the next underscore. We do this once here to avoid pallocing, if the string does't contain an underscore at all. */ + while (underscore_current < matchlen - 1 && match[underscore_current] != '_') { + underscore_current++; + } + /* + * Check for permuting up to three sections separated by underscores. + * + * We count the number of underscores here, because we want to know whether we should consider + * permuting underscore separated sections. + */ + if (underscore_current < matchlen - 1) { + tmp_match = palloc(matchlen + 1); + tmp_match[matchlen] = '\0'; + while (underscore_permutations_count < 300 && underscore_current < matchlen - 1) { + /* + * If sections contain more than one underscore, we want to swap the sections separated by more than one instead. + * There would be no point in swapping zero length strings around. + * So we check how many consecutive underscores we have here. + */ + underscore_amount = 1; + while (underscore_current + underscore_amount < matchlen && match[underscore_current + underscore_amount] == '_') { +underscore_amount++; +
Re: psql: Allow editing query results with \gedit
On Mon, Jan 22, 2024 at 5:34 PM, David G. Johnston <david.g.johns...@gmail.com> wrote: > On Mon, Jan 22, 2024 at 8:06 AM Christoph Berg wrote: > >> Assuming a SELECT statement reading from a single table, it is quite an >> effort to transform that statement to an UPDATE statement on that table, >> perhaps to fix a typo that the user has spotted in the query result. >> >> > Building off the other comments, I'd suggest trying to get rid of the > intermediate JSON format and also just focus on a single row at any given > time. > > For an update, the first argument to the metacommand could be the unique > key value present in the previous result. The resultant UPDATE would just > put that into the where clause, and every other column in the result would > be a SET clause column with the thing being set to the current value, ready > to be edited. > > DELETE would be similar but without the need for a SET clause. > > INSERT can produce a template INSERT (cols) VALUES ... command with some > smarts regarding auto-incrementing keys and placeholder values. > > David J. > Can you imagine using it? I like psql, I like terminal applications, and my first database was FoxPro, but for data editing I am fairly sure I don't want to use psql. I can imagine enhancing the current \gexec command, because it is executed directly without any possibility to edit. I could see value in some special "edit" clause, like \gexec_edit or \gexec(edit) or \gexec edit. This would be like the current \gexec, but with the possibility to edit the result in an editor and to finish without saving. Then we could introduce SQL functions UpdateTemplate(cols text[], rowvalue), InsertTemplate, ... and you could write SELECT UpdateTemplate(ARRAY['a','b','c'], foo) FROM foo WHERE id IN (1,2) \gexec_with_edit But this still looks strange to me - like we are trying to reintroduce the need for sed or awk in SQL and psql. I would want forms like FoxPro, I would want a grid like FoxPro - but not in psql, and I would not develop it :-)
Commitfest 2024-01 third week update
Hi, Here's a quick status report after the third week:

Status summary:
status                  |  w1 |  w2 |  w3
------------------------+-----+-----+-----
Needs review:           | 238 | 213 | 181
Waiting on Author:      |  44 |  46 |  52
Ready for Committer:    |  27 |  27 |  26
Committed:              |  36 |  46 |  57
Moved to next CF:       |   1 |   3 |   4
Withdrawn:              |   2 |   4 |  12
Returned with Feedback: |   3 |  12 |  18
Rejected:               |   1 |   1 |   2
Total:                  | 352 | 352 | 352

If you have submitted a patch and it's in "Waiting on Author" state, please aim to get it to "Needs review" state soon if you can, as that's where people are most likely to be looking for things to review. I have pinged most threads that are in "Needs review" state and don't apply, don't compile warning-free, or don't pass check-world. I'll do some more of that sort of thing. I have sent a private mail through the commitfest app to patch owners who have submitted one or more patches but have not picked any patches to review. I have also sent out mails for threads that have had no activity for a long time; please respond to those mails if you are planning to continue working on the patch, and if you plan to work on it in the next commitfest instead, please move it there. If there is no response, I'm planning to return these patches; they can be added again when they are actively worked on. Regards, Vignesh
Re: XLog size reductions: smaller XLRec block header for PG17
On Mon, Jan 22, 2024 at 5:38 AM Aleksander Alekseev wrote: > I don't think that closing CF entries purely due to inactivity is a > good practice (neither something we did before) as long as there is > code, it applies, etc. There are a lot of patches and few people > working on them. Inactivity in a given thread doesn't necessarily > indicate lack of interest, more likely lack of resources. If the CF entry doesn't serve the purpose of allowing someone to find patches in need of review, it's worse than useless. Patches that aren't getting reviewed should stay in the CF until they do, or until we make a collective decision that we don't care about or want that particular patch enough to bother. But patches that are not ready for review need to get out until such time as they are. Otherwise they're just non-actionable clutter. And unfortunately we have so much of that in the CF app now that finding things that really need review is time consuming and difficult. That REALLY needs to be cleaned up. In the case of this particular patch, I think the problem is that there's no consensus on the design. There's not a ton of debate on this thread, but thread [1] linked in the original post contains a lot of vigorous debate about what the right thing to do is here and I don't believe we reached any meeting of the minds. In light of that lack of agreement, I'm honestly a bit confused why Matthias even found it worthwhile to submit this on a new thread. I think we all agree with him that there's room for improvement here, but we don't agree on what particular form that improvement should take, and as long as that agreement is lacking, I find it hard to imagine anything getting committed. The task right now is to get agreement on something, and leaving the CF entry open longer isn't going make the people who have already expressed opinions start agreeing with each other more than they do already. It looks like I never replied to https://www.postgresql.org/message-id/20221019192130.ebjbycpw6bzjry4v%40awork3.anarazel.de but, FWIW, I agree with Andres that applying the same technique to multiple fields that are stored together (DB OID, TS OID, rel #, block #) is unlikely in practice to produce many cases that regress. But the question for this thread is really more about whether we're OK with using ad-hoc bit swizzling to reduce the size of xlog records or whether we want to insist on the use of a uniform varint encoding. Heikki and Andres both seem to favor the latter. IIRC, I was initially more optimistic about ad-hoc bit swizzling being a potentially acceptable technique, but I'm not convinced enough about it to argue against two very smart committers both of whom know more about micro-optimizing performance than I do, and nobody else seems to making this argument on this thread either, so I just don't really see how this patch is ever going to go anywhere in its current form. -- Robert Haas EDB: http://www.enterprisedb.com
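For readers who haven't followed the linked thread: "uniform varint encoding" here means encoding integers in a variable number of bytes using one general-purpose scheme, instead of packing bits in field-specific ways. Below is a minimal LEB128-style sketch of the general technique, purely for illustration (it is not the encoding actually proposed for the WAL format, whose details differ):

#include <stdint.h>
#include <stddef.h>

/*
 * Encode an unsigned 64-bit value using 7 data bits per byte; the high bit
 * of each byte indicates whether another byte follows.  Returns the number
 * of bytes written (1..10).  The caller must supply a buffer of at least
 * 10 bytes.
 */
static size_t
encode_varint_u64(uint64_t value, uint8_t *buf)
{
	size_t		len = 0;

	do
	{
		uint8_t		byte = value & 0x7F;

		value >>= 7;
		if (value != 0)
			byte |= 0x80;		/* more bytes follow */
		buf[len++] = byte;
	} while (value != 0);

	return len;
}

Values under 128 take a single byte, while a full 64-bit value takes up to ten; that size-versus-decoding-cost trade-off is the crux of the disagreement described above.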
Re: the s_lock_stuck on perform_spin_delay
On Mon, Jan 22, 2024 at 12:13 PM Andy Fan wrote: > > On Mon, Jan 22, 2024 at 11:58 AM Andy Fan wrote: > >> I get your point! Acquiring an already held spinlock in quickdie is > >> unlikely to happen, but since our existing infrastructure can handle it, > >> then there is no reason to bypass it. > > > > No, the existing infrastructure cannot handle that at all. > > Actually, I mean we can handle it without 0003. Am I still wrong? > Without 0003, if we acquire a spinlock that we already hold, > VerifyNoSpinLocksHeld in SpinLockAcquire should catch it. But that's only going to run in assert-only builds. The whole point of the patch set is to tell developers that there are bugs in the code that need fixing, not to catch problems that actually occur in production. -- Robert Haas EDB: http://www.enterprisedb.com
Re: the s_lock_stuck on perform_spin_delay
Robert Haas writes: > On Mon, Jan 22, 2024 at 12:13 PM Andy Fan wrote: >> > On Mon, Jan 22, 2024 at 11:58 AM Andy Fan wrote: >> >> I get your point! Acquiring an already held spinlock in quickdie is >> >> unlikely to happen, but since our existing infrastructure can handle it, >> >> then there is no reason to bypass it. >> > >> > No, the existing infrastructure cannot handle that at all. >> >> Actually, I mean we can handle it without 0003. Am I still wrong? >> Without 0003, if we acquire a spinlock that we already hold, >> VerifyNoSpinLocksHeld in SpinLockAcquire should catch it. > But that's only going to run in assert-only builds. The whole point of > the patch set is to tell developers that there are bugs in the code > that need fixing, not to catch problems that actually occur in > production. I see. In that respect, yes. -- Best Regards Andy Fan
Re: Network failure may prevent promotion
On Thu, Jan 18, 2024 at 10:42 PM Heikki Linnakangas wrote: > Given that commit 728f86fec6 that introduced this issue was not strictly > required, perhaps we should just revert it for v16. +1 for the revert. This issue should be fixed in the upcoming minor release since it might cause unexpected delays in failover times. Regards, -- Fujii Masao
Re: Teach predtest about IS [NOT] proofs
James Coleman writes: > 0001 does the initial pure refactor. 0003 makes a lot of modifications > to what we can prove about implication and refutation. Finally, 0003 > isn't intended to be committed, but attempts to validate more > holistically that none of the changes creates any invalid proofs > beyond the mostly happy-path tests added in 0004. > I ended up not tackling changing how test_predtest tests run for now. > That's plausibly still useful, and I'd be happy to add that if you > generally agree with the direction of the patch and with that > abstraction being useful. > I added some additional verifications to the test_predtest module to > prevent additional obvious flaws. I looked through 0001 and made some additional cosmetic changes, mostly to get comments closer to the associated code; I also ran pgindent on it (see v5-0001 attached). That seems pretty committable to me at this point. I also like your 0002 additions to test_predtest.c (although why the mixture of ERROR and WARNING? ISTM they should all be WARNING, so we can press on with the test). One other thought is that maybe separating out predicate_implied_not_null_by_clause should be part of 0001? I'm less excited about the rest of v4-0002. @@ -740,6 +747,16 @@ predicate_refuted_by_recurse(Node *clause, Node *predicate, !weak)) return true; +/* + * Because weak refutation expands the allowed outcomes for B + * from "false" to "false or null", we can additionally prove + * weak refutation in the case that strong refutation is proven. + */ +if (weak && not_arg && +predicate_implied_by_recurse(predicate, not_arg, + true)) +return true; + switch (pclass) { case CLASS_AND: I don't buy this bit at all. If the prior recursive call fails to prove weak refutation in a case where strong refutation holds, how is that not a bug lower down? Moreover, in order to mask such a bug, you're doubling the time taken by failed proofs, which is an unfortunate thing --- we don't like spending a lot of time on something that fails to improve the plan. @@ -1138,32 +1155,114 @@ predicate_implied_by_simple_clause(Expr *predicate, Node *clause, Assert(list_length(op->args) == 2); rightop = lsecond(op->args); -/* - * We might never see a null Const here, but better check - * anyway - */ -if (rightop && IsA(rightop, Const) && -!((Const *) rightop)->constisnull) +if (rightop && IsA(rightop, Const)) { +Const*constexpr = (Const *) rightop; Node *leftop = linitial(op->args); -if (DatumGetBool(((Const *) rightop)->constvalue)) -{ -/* X = true implies X */ -if (equal(predicate, leftop)) -return true; -} +if (constexpr->constisnull) +return false; + +if (DatumGetBool(constexpr->constvalue)) +return equal(predicate, leftop); else -{ -/* X = false implies NOT X */ -if (is_notclause(predicate) && -equal(get_notclausearg(predicate), leftop)) -return true; -} +return is_notclause(predicate) && +equal(get_notclausearg(predicate), leftop); } } } break; I don't understand what this bit is doing ... and the fact that the patch removes all the existing comments and adds none isn't helping that. What it seems to mostly be doing is adding early "return false"s, which I'm not sure is a good thing, because it seems possible that operator_predicate_proof could apply here. +case IS_UNKNOWN: +/* + * When the clause is in the form "foo IS UNKNOWN" then + * we can prove weak implication of a predicate that + * is strict for "foo" and negated. 
This doesn't work + * for strong implication since if "foo" is "null" then + * the predicate will evaluate to "null" rather than + * "true". + */ The phrasing of this comment seems randomly inconsistent with others making similar arguments. +
Re: psql: Allow editing query results with \gedit
Pavel Stehule writes: > I would have forms like FoxPro, I would have a grid like FoxPro, but not in > psql, and I would not develop it :-) Yeah, that's something that was also bothering me, but I failed to put my finger on it. "Here's some JSON, edit it, and don't forget to keep the quoting correct" does not strike me as a user-friendly way to adjust data content. A spreadsheet-like display where you can change data within cells seems like a far better API, although I don't want to write that either. This kind of API would not readily support INSERT or DELETE cases, but TBH I think that's better anyway --- you're adding too much ambiguity in pursuit of a very secondary use-case. The stated complaint was "it's too hard to build UPDATE commands", which I can sympathize with. (BTW, I wonder how much of this already exists in pgAdmin.) regards, tom lane
Re: Permute underscore separated components of columns before fuzzy matching
Arne Roland writes: > Thank you for bringing that to my attention. Is there a way to subscribe > to cf-bot failures? I don't know of any push notification support in cfbot, but you can bookmark the page with your own active patches, and check it periodically: http://commitfest.cputube.org/arne-roland.html (For others, click on your own name in the main cfbot page's entry for one of your patches to find out how it spelled your name for this purpose.) regards, tom lane
Re: make dist using git archive
On 22.01.24 13:10, Junwang Zhao wrote: I played with this using the meson build on macOS; the packages are generated in the source root, not the build root. I'm not sure if this is by design, but I think polluting the *working directory* is not good. Yes, it's not good, but I couldn't find a way to make it work. This is part of the complications with meson I referred to. The @BUILD_ROOT@ placeholder in custom_target() is apparently always a relative path, but it doesn't know that git -C changes the current directory. Another thing I'd like to point out: should we also include the *git commit* or maybe the *git tag* in the package name, something like: git archive --format tar.gz --prefix postgresql-17devel/ HEAD -o postgresql-17devel-`git rev-parse --short HEAD`.tar.gz git archive --format tar.gz --prefix postgresql-17devel/ HEAD -o postgresql-`git describe --tags`.tar.gz I'm not sure why we would need it built-in. It can be done by hand, of course.
Re: Built-in CTYPE provider
On 18.01.24 23:03, Jeff Davis wrote: On Thu, 2024-01-18 at 13:53 +0100, Peter Eisentraut wrote: I think that would be a terrible direction to take, because it would regress the default sort order from "correct" to "useless". I don't agree that the current default is "correct". There are a lot of ways it can be wrong: * the environment variables at initdb time don't reflect what the users of the database actually want * there are so many different users using so many different applications connected to the database that no one "correct" sort order exists * libc has some implementation quirks * the version of Unicode that libc is based on is not what you expect * the version of libc is not what you expect These are arguments why the current defaults are not universally perfect, but I'd argue that they are still most often the right thing as the default. Aside from the overall message this sends about how PostgreSQL cares about locales and Unicode and such. Unicode is primarily about the semantics of characters and their relationships. The patches I propose here do a great job of that. Collation (relationships between *strings*) is a part of Unicode, but not the whole thing or even the main thing. I don't get this argument. Of course, people care about sorting and sort order. Whether you consider this part of Unicode or adjacent to it, people still want it. Maybe you don't intend for this to be the default provider? I am not proposing that this provider be the initdb-time default. ok But then who would really use it? I mean, sure, some people would, but how would you even explain, in practice, the particular niche of users or use cases? It's for users who want to respect Unicode support text from international sources in their database; but are not experts on the subject and don't know precisely what they want or understand the consequences. If and when such users do notice a problem with the sort order, they'd handle it at that time (perhaps with a COLLATE clause, or sorting in the application). Vision: * ICU offers COLLATE UNICODE, locale tailoring, case-insensitive matching, and customization with rules. It's the solution for everything from "slightly more advanced" to "very advanced". I am astonished by this. In your world, do users not want their text data sorted? Do they not care what the sort order is? You consider UCA sort order an "advanced" feature?
Re: partitioning and identity column
On 22.01.24 13:23, Ashutosh Bapat wrote: if (newdef->identity) { Assert(!is_partioning); /* * Identity is never inherited. The new column can have an * identity definition, so we always just take that one. */ def->identity = newdef->identity; } Thoughts? That code block already has Assert(!is_partition) at line 3085. I thought that Assert is enough. Ok. Maybe just rephrase that comment somehow then? There's another thing I found. The file isn't using check_stack_depth() in the function which traverse inheritance hierarchies. This isn't just a problem of the identity related function but most of the functions in that file. Do you think it's worth fixing it? I suppose the number of inheritance levels is usually not a problem for stack depth?
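For context on the stack-depth question: the usual convention in the backend is to call check_stack_depth() at the top of any function that can recurse to arbitrary depth, so it raises an error before the C stack actually overflows. A minimal sketch of the pattern (the function name and traversal are hypothetical; only the check_stack_depth() call is the point):

#include "postgres.h"
#include "miscadmin.h"

static void
walk_inheritance_tree(Oid relid)
{
	/* raise "stack depth limit exceeded" before the C stack can overflow */
	check_stack_depth();

	/* ... look up the children of relid and recurse into each one ... */
}

As noted above, whether the guard is worth adding here depends on how deep real-world inheritance hierarchies can realistically get.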
Re: Adding facility for injection points (or probe points?) for more advanced tests
On 22/01/2024 18:08, Heikki Linnakangas wrote: I wrote the attached patch to enable injection points in the Cirrus CI config, to run the injection tests I wrote for a GIN bug today [1]. But that led to a crash in the asan-enabled build [2]. I didn't investigate it yet. Pushed a fix for the crash. What do you think of enabling this in the Cirrus CI config? -- Heikki Linnakangas Neon (https://neon.tech)
Re: [17] CREATE SUBSCRIPTION ... SERVER
On Mon, 2024-01-22 at 18:41 +0530, Ashutosh Bapat wrote: > 0002 adds a prefix "regress_" to almost every object that is created > in foreign_data.sql. psql \dew outputs the owner, which in the case of a built-in FDW is the bootstrap superuser, which is not a stable name. I used the prefix to exclude the built-in FDW -- if you have a better suggestion, please let me know. (Though reading below, we might not even want a built-in FDW.) > Dummy FDW makes me nervous. The way it's written, it may grow into a > full-fledged postgres_fdw and in the process might acquire the same > concerns that postgres_fdw has today. But I will study the patches > and > discussion around it more carefully. I introduced that based on this comment[1]. I also thought it fit with your previous suggestion to make it work with postgres_fdw, but I suppose it's not required. We could just not offer the built-in FDW, and expect users to either use postgres_fdw or create their own dummy FDW. > I enhanced the postgres_fdw TAP test to use foreign table. Please see > the attached patch. It works as expected. Of course a follow-on work > will require linking the local table and its replica on the publisher > table so that push down will work on replicated tables. But the > concept at least works with your changes. Thanks for that. Thank you, I'll include it in the next patch set. > I am not sure we need a full-fledged TAP test for testing > subscription. I wouldn't object to it, but TAP tests are heavy. It > should be possible to write the same test as a SQL test by creating > two databases and switching between them. Do you think it's worth > trying that way? I'm not entirely sure what you mean here, but I am open to test simplifications if you see an opportunity. Regards, Jeff Davis > [1] https://www.postgresql.org/message-id/172273.1693403385%40sss.pgh.pa.us
Re: psql: Allow editing query results with \gedit
Re: Pavel Stehule > Introduction of \gedit is interesting idea, but in this form it looks too > magic > > a) why the data are in JSON format, that is not native for psql (minimally > now) Because we need something machine-readable. CSV would be an alternative, but that is hardly human-readable. > b) the implicit transformation to UPDATEs and the next evaluation can be > pretty dangerous. You can always run it in a transaction: begin; select * from tbl \gedit commit; Alternatively, we could try to make it not send the commands right away - either by putting them into the query buffer (that currently work only for single commands without any ';') or by opening another editor. Doable, but perhaps the transaction Just Works? > The concept of proposed feature is interesting, but the name \gedit is too > generic, maybe too less descriptive for this purpose > > Maybe \geditupdates can be better - but still it can be dangerous and slow > (without limits) An alternative I was considering was \edittable, but that would put more emphasis on "table" when it's actually more powerful than that, it works on a query result. (Similar in scope to updatable views.) > In the end I am not sure if I like it or dislike it. Looks dangerous. I can > imagine possible damage when some people will see vi first time and will > try to finish vi, but in this command, it will be transformed to executed > UPDATEs. If you close the editor with touching the file, nothing will be sent. And if you mess up the file, it will complain. I think it's unlikely that people who end up in an editor they can't operate would be able to modify JSON in a way that is still valid. Also, \e has the same problem. Please don't let "there are users who don't know what they are doing" spoil the idea. > More generating UPDATEs without knowledge of table structure > (knowledge of PK) can be issue (and possibly dangerous too), and you cannot > to recognize PK from result of SELECT (Everywhere PK is not "id" and it is > not one column). It *does* retrieve the proper PK from the table. All updates are based on the PK. Re: Tom Lane > > Introduction of \gedit is interesting idea, but in this form it looks too > > magic > > Yeah, I don't like it either --- it feels like something that belongs > in an ETL tool not psql. I tried to put it elsewhere first, starting with pspg: https://github.com/okbob/pspg/issues/200 The problem is that external programs like the pager neither have access to the query string/table name, nor the database connection. ETL would also not quite fit, this is meant for interactive use. > The sheer size of the patch shows how far > afield it is from anything that psql already does, necessitating > writing tons of stuff that was not there before. I've been working on this for several months, so it's already larger than the MVP would be. It does have features like (key=col1,col2) that could be omitted. > The bits that try > to parse the query to get necessary information seem particularly > half-baked. Yes, that's not pretty, and I'd be open for suggestions on how to improve that. I was considering: 1) this (dumb query parsing) 2) EXPLAIN the query to get the table 3) use libpq's PQftable The problem with 2+3 is that on views and partitioned tables, this would yield the base table name, not the table name used in the query. 1 turned out to be the most practical, and worked for me so far. If the parse tree would be available, using that would be much better. 
Should we perhaps add something like "explain (parse) select...", or add pg_parsetree(query) function? > Yup -- you'd certainly want some way of confirming that you actually > want the changes applied. Our existing edit commands load the edited > string back into the query buffer, where you can \r it if you don't > want to run it. But I fear that the results of this operation would > be too long for that to be workable. The results are as long as you like. The intended use case would be to change just a few rows. As said above, I was already thinking of making it user-confirmable, just the current version doesn't have it. > It does look like it's trying to find out the pkey from the system > catalogs ... however, it's also accepting unique constraints without > a lot of thought about the implications of NULLs. Right, the UNIQUE part doesn't take care of NULLs yet. Will fix that. (Probably by erroring out if any key column is NULL.) > Even if you have > a pkey, it's not exactly clear to me what should happen if the user > changes the contents of a pkey field. That could be interpreted as > either an INSERT or UPDATE, I think. A changed PK will be interpreted as DELETE + INSERT. (I shall make that more clear in the documentation.) > Also, while I didn't read too closely, it's not clear to me how the > code could reliably distinguish INSERT vs UPDATE vs DELETE cases --- > surely we're not going to try to put a "diff" engine into this, and > even if we did, diff is
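To make option 3 from Christoph's list concrete, here is a minimal libpq sketch that asks which relation a result column came from via PQftable(); the connection string, query, and view name are placeholders. As he notes above, for views and partitioned tables the relation reported this way is not necessarily the one named in the query, which is why the dumb query parsing of option 1 ended up being the most practical choice:

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
	PGconn	   *conn = PQconnectdb("dbname=postgres");
	PGresult   *res;

	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
		return 1;
	}

	res = PQexec(conn, "SELECT id FROM some_view");
	if (PQresultStatus(res) == PGRES_TUPLES_OK)
	{
		/*
		 * OID of the relation libpq reports for column 0, or InvalidOid if
		 * it is not a simple reference to a table column.
		 */
		Oid			reloid = PQftable(res, 0);

		printf("column 0 resolves to relation OID %u\n", reloid);
	}

	PQclear(res);
	PQfinish(conn);
	return 0;
}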
Multiple startup messages over the same connection
Hello, A question about protocol design: would it be possible to extend the protocol so it can handle multiple startup / authentication messages over a single connection? Are there any serious obstacles? (Possible issues with re-initialization of backends, I guess?) If that is possible, it could improve one important edge case: currently, when you have to talk to multiple databases on a single host, you need to open a separate connection to each of them. In some cases (multitenancy, for example), you may have thousands of databases on a host, which leads to inefficient connection utilization on clients (and on the db side too). A lot of other RDBMSes don't have this limitation. thank you, -Vladimir Churyukin
Re: make dist using git archive
On Mon Jan 22, 2024 at 1:31 AM CST, Peter Eisentraut wrote: From 4b128faca90238d0a0bb6949a8050c2501d1bd67 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Sat, 20 Jan 2024 21:54:36 +0100 Subject: [PATCH v0] make dist uses git archive --- GNUmakefile.in | 34 -- meson.build| 38 ++ 2 files changed, 50 insertions(+), 22 deletions(-) diff --git a/GNUmakefile.in b/GNUmakefile.in index eba569e930e..3e04785ada2 100644 --- a/GNUmakefile.in +++ b/GNUmakefile.in @@ -87,29 +87,19 @@ update-unicode: | submake-generated-headers submake-libpgport distdir= postgresql-$(VERSION) dummy = =install= +GIT = git + dist: $(distdir).tar.gz $(distdir).tar.bz2 - rm -rf $(distdir) - -$(distdir).tar: distdir - $(TAR) chf $@ $(distdir) - -.INTERMEDIATE: $(distdir).tar - -distdir-location: - @echo $(distdir) - -distdir: - rm -rf $(distdir)* $(dummy) - for x in `cd $(top_srcdir) && find . \( -name CVS -prune \) -o \( -name .git -prune \) -o -print`; do \ - file=`expr X$$x : 'X\./\(.*\)'`; \ - if test -d "$(top_srcdir)/$$file" ; then \ - mkdir "$(distdir)/$$file" && chmod 777 "$(distdir)/$$file"; \ - else \ - ln "$(top_srcdir)/$$file" "$(distdir)/$$file" >/dev/null 2>&1 \ - || cp "$(top_srcdir)/$$file" "$(distdir)/$$file"; \ - fi || exit; \ - done - $(MAKE) -C $(distdir) distclean + +.PHONY: check-dirty-index +check-dirty-index: + $(GIT) diff-index --quiet HEAD + +$(distdir).tar.gz: check-dirty-index + $(GIT) archive --format tar.gz --prefix $(distdir)/ HEAD -o $@ + +$(distdir).tar.bz2: check-dirty-index + $(GIT) -c tar.tar.bz2.command='$(BZIP2) -c' archive --format tar.bz2 --prefix $(distdir)/ HEAD -o $@ distcheck: dist rm -rf $(dummy) diff --git a/meson.build b/meson.build index c317144b6bc..f0d870c5192 100644 --- a/meson.build +++ b/meson.build @@ -3347,6 +3347,44 @@ run_target('help', +### +# Distribution archive +### + +git = find_program('git', required: false, native: true, disabler: true) +bzip2 = find_program('bzip2', required: false, native: true, disabler: true) This doesn't need to be a disabler. git is fine as-is. See later comment. Disablers only work like you are expecting when they are used like how git is used. Once you call a method like .path(), all bets are off. +distdir = meson.project_name() + '-' + meson.project_version() + +check_dirty_index = run_target('check-dirty-index', + command: [git, 'diff-index', '--quiet', 'HEAD']) Seems like you might want to add -C here too? + +tar_gz = custom_target('tar.gz', + build_always_stale: true, + command: [git, '-C', '@SOURCE_ROOT@', 'archive', +'--format', 'tar.gz', +'--prefix', distdir + '/', +'-o', '@BUILD_ROOT@/@OUTPUT@', +'HEAD', '.'], + install: false, + output: distdir + '.tar.gz', +) + +tar_bz2 = custom_target('tar.bz2', + build_always_stale: true, + command: [git, '-C', '@SOURCE_ROOT@', '-c', 'tar.tar.bz2.command=' + bzip2.path() + ' -c', 'archive', +'--format', 'tar.bz2', +'--prefix', distdir + '/', -'-o', '@BUILD_ROOT@/@OUTPUT@', +'-o', join_paths(meson.build_root(), '@OUTPUT@'), This will generate the tarballs in the build directory. Do the same for the previous target. Tested locally. +'HEAD', '.'], + install: false, + output: distdir + '.tar.bz2', +) The bz2 target should be wrapped in an `if bzip2.found()`. It is possible for git to be found, but not bzip2. I might also define the bz2 command out of line. Also, you may want to add these programs to meson_options.txt for overriding, even though the "meson-ic" way is to use a machine file. 
+ +alias_target('pgdist', [check_dirty_index, tar_gz, tar_bz2]) Are you intending for the check_dirty_index target to prohibit the other two targets from running? Currently that is not the case. If it is what you intend, use a stamp file or something to indicate a relationship. Alternatively, inline the git diff-index into the other commands. These might also do better as external scripts. It would reduce duplication between the autotools and Meson builds. + + + ### # The End, The End, My Friend ### I am not really following why we can't use the builtin Meson dist command. The only difference from my testing is it doesn't use a --prefix argument. -- Tristan Partin Neon (https://neon.tech)
Re: Does redundant extension exist During faster COPY in PG16
Hi, On 2024-01-22 19:54:00 +0800, 何柯文(渊云) wrote: > I'm learning faster COPY of PG16. I have some questions about extension lock > improvement. > From ./src/backend/storage/buffer/bufmgr.c:1901 (ExtendBufferedRelShared) > ``` > /* > * Lock relation against concurrent extensions, unless requested not to. > * > * We use the same extension lock for all forks. That's unnecessarily > * restrictive, but currently extensions for forks don't happen often > * enough to make it worth locking more granularly. > * > * Note that another backend might have extended the relation by the time > * we get the lock. > */ > if (!(flags & EB_SKIP_EXTENSION_LOCK)) > { > LockRelationForExtension(bmr.rel, ExclusiveLock); > if (bmr.rel) > bmr.smgr = RelationGetSmgr(bmr.rel); > } > ... > smgrzeroextend(bmr.smgr, fork, first_block, extend_by, false); > ``` > During concurrent extension, when we obtain the extension lock, we use > smgrzeroextend() to extend relation files instead of searching fsm through > GetPageWithFreeSpace(). Is this approach reasonable? I think so, yes. > During concurrent extensions, one backend bulk extend successfully, meaning > that other backends waiting on extension lock have free pages to use. If > all concurrent extend backends extend the relation file after getting the > extension lock, the extension lock will be held (extention backends * > smgrzeroextend() executing time). If there's this much contention on the extension lock, there's no harm in extending more - the space will be used soon. The alternatives would be a) to search the FSM with the extension lock held, making contention worse, b) to release the extension lock again if we couldn't immediately acquire it, search the fsm, and retry if we couldn't find any free space, which would substantially increase contention. The FSM is the source of substantial contention, disabling it actually results in substantial throughput increases. Vastly increasing the number of lookups in the FSM would make that considerably worse, without a meaningful gain in density. Greetings, Andres Freund
Re: Improve WALRead() to suck data directly from WAL buffers when possible
Hi, On 2024-01-10 19:59:29 +0530, Bharath Rupireddy wrote: > + /* > + * Typically, we must not read a WAL buffer page that just got > + * initialized. Because we waited enough for the in-progress WAL > + * insertions to finish above. However, there can exist a slight > + * window after the above wait finishes in which the read > buffer page > + * can get replaced especially under high WAL generation rates. > After > + * all, we are reading from WAL buffers without any locks here. > So, > + * let's not count such a page in. > + */ > + phdr = (XLogPageHeader) page; > + if (!(phdr->xlp_magic == XLOG_PAGE_MAGIC && > + phdr->xlp_pageaddr == (ptr - (ptr % XLOG_BLCKSZ)) && > + phdr->xlp_tli == tli)) > + break; I still think that anything that requires such checks shouldn't be merged. It's completely bogus to check page contents for validity when we should have metadata telling us which range of the buffers is valid and which not. Greetings, Andres Freund
core dumps in auto_prewarm, tests succeed
Hi, I noticed that I was getting core dumps while executing the tests, without the tests failing. Backtraces are vriations of this: #0 0x00ca29cd in pg_atomic_read_u32_impl (ptr=0x7fe13497a004) at ../../../../../home/andres/src/postgresql/src/include/port/atomics/generic.h:48 #1 0x00ca2b08 in pg_atomic_read_u32 (ptr=0x7fe13497a004) at ../../../../../home/andres/src/postgresql/src/include/port/atomics.h:239 #2 0x00ca3c3d in LWLockAttemptLock (lock=0x7fe13497a000, mode=LW_EXCLUSIVE) at ../../../../../home/andres/src/postgresql/src/backend/storage/lmgr/lwlock.c:825 #3 0x00ca440c in LWLockAcquire (lock=0x7fe13497a000, mode=LW_EXCLUSIVE) at ../../../../../home/andres/src/postgresql/src/backend/storage/lmgr/lwlock.c:1264 #4 0x7fe130204ab4 in apw_detach_shmem (code=0, arg=0) at ../../../../../home/andres/src/postgresql/contrib/pg_prewarm/autoprewarm.c:788 #5 0x00c81c99 in shmem_exit (code=0) at ../../../../../home/andres/src/postgresql/src/backend/storage/ipc/ipc.c:276 #6 0x00c81a7c in proc_exit_prepare (code=0) at ../../../../../home/andres/src/postgresql/src/backend/storage/ipc/ipc.c:198 #7 0x00c819a8 in proc_exit (code=0) at ../../../../../home/andres/src/postgresql/src/backend/storage/ipc/ipc.c:111 #8 0x00bdd0b3 in BackgroundWorkerMain () at ../../../../../home/andres/src/postgresql/src/backend/postmaster/bgworker.c:841 #9 0x00be861d in do_start_bgworker (rw=0x341ff20) at ../../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:5756 #10 0x00be8a34 in maybe_start_bgworkers () at ../../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:5980 #11 0x00be4f9f in process_pm_child_exit () at ../../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:3039 #12 0x00be2de4 in ServerLoop () at ../../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:1765 #13 0x00be27bf in PostmasterMain (argc=4, argv=0x33dbba0) at ../../../../../home/andres/src/postgresql/src/backend/postmaster/postmaster.c:1475 #14 0x00aca326 in main (argc=4, argv=0x33dbba0) at ../../../../../home/andres/src/postgresql/src/backend/main/main.c:198 The most likely culprit seems to be: https://postgr.es/m/E1rQvjC-002Chd-Cd%40gemulon.postgresql.org The test encountering this is pg_prewarm/001_basic: (gdb) p DataDir $12 = 0x33ef8a0 "/srv/dev/build/postgres/m-dev-assert/testrun/pg_prewarm/001_basic/data/t_001_basic_main_data/pgdata" A secondary issue is that the tests suceed despite two segfaults. 
The problem here likely is that we don't have sufficient error handling during shutdowns: 2024-01-22 12:31:34.133 PST [3054823] LOG: background worker "logical replication launcher" (PID 3054836) exited with exit code 1 2024-01-22 12:31:34.443 PST [3054823] LOG: background worker "autoprewarm leader" (PID 3054835) was terminated by signal 11: Segmentation fault 2024-01-22 12:31:34.443 PST [3054823] LOG: terminating any other active server processes 2024-01-22 12:31:34.444 PST [3054823] LOG: abnormal database system shutdown 2024-01-22 12:31:34.469 PST [3054823] LOG: database system is shut down 2024-01-22 12:31:34.555 PST [3055133] LOG: starting PostgreSQL 17devel on x86_64-linux, compiled by gcc-14.0.0, 64-bit 2024-01-22 12:31:34.555 PST [3055133] LOG: listening on Unix socket "/tmp/p6XG0kQW9w/.s.PGSQL.60402" 2024-01-22 12:31:34.557 PST [3055148] LOG: database system was interrupted; last known up at 2024-01-22 12:31:33 PST 2024-01-22 12:31:34.557 PST [3055148] LOG: database system was not properly shut down; automatic recovery in progress 2024-01-22 12:31:34.558 PST [3055148] LOG: redo starts at 0/1531090 2024-01-22 12:31:34.559 PST [3055148] LOG: invalid record length at 0/1555F60: expected at least 24, got 0 2024-01-22 12:31:34.559 PST [3055148] LOG: redo done at 0/1555F38 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s 2024-01-22 12:31:34.559 PST [3055146] LOG: checkpoint starting: end-of-recovery immediate wait 2024-01-22 12:31:34.570 PST [3055146] LOG: checkpoint complete: wrote 42 buffers (0.3%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.002 s, sync=0.001 s, total=0.011 s; sync files=0, longest=0.000 s, average=0.000 s; distance=147 kB, estimate=147 kB; lsn=0/1555F60, redo lsn=0/1555F60 2024-01-22 12:31:34.573 PST [3055133] LOG: database system is ready to accept connections ISTM that we shouldn't basically silently overlook shutdowns due to crashes in the tests. How to not do so is unfortunately not immediately obvious to me... Greetings, Andres Freund
Re: core dumps in auto_prewarm, tests succeed
On Mon, Jan 22, 2024 at 12:41:17PM -0800, Andres Freund wrote: > I noticed that I was getting core dumps while executing the tests, without the > tests failing. Backtraces are vriations of this: Looking, thanks for the heads-up. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Evaluate arguments of correlated SubPlans in the referencing ExprState
Hi, On 2024-01-22 10:30:22 +1100, Peter Smith wrote: > 2024-01 Commitfest. > > Hi, This patch has a CF status of "Ready for Committer", but it is > currently failing some CFbot tests [1]. Please have a look and post an > updated version.. I think this failure is independent of this patch - by coincidence I just sent an email about the issue https://www.postgresql.org/message-id/20240122204117.swton324xcoodnyi%40awork3.anarazel.de a few minutes ago. Greetings, Andres Freund
Re: pg_stat_statements and "IN" conditions
> On Mon, Jan 22, 2024 at 06:07:27PM +0100, Dmitry Dolgov wrote: > > Please notice that at the moment, it's not being tested at all because > > of a patch-apply failure -- that's what the little triangular symbol > > means. The rest of the display concerns the test results from the > > last successfully-applied patch version. (Perhaps that isn't a > > particularly great UI design.) > > > > If you click on the triangle you find out > > > > == Applying patches on top of PostgreSQL commit ID > > b0f0a9432d0b6f53634a96715f2666f6d4ea25a1 === > > === applying patch > > ./v17-0001-Prevent-jumbling-of-every-element-in-ArrayExpr.patch > > patching file contrib/pg_stat_statements/Makefile > > Hunk #1 FAILED at 19. > > 1 out of 1 hunk FAILED -- saving rejects to file > > contrib/pg_stat_statements/Makefile.rej > > patching file contrib/pg_stat_statements/expected/merging.out > > patching file contrib/pg_stat_statements/meson.build > > Oh, I see, thanks. Give me a moment, will fix this. Here is it. >From ac8a7c93fbb72c469ca7128280f52024adb860ab Mon Sep 17 00:00:00 2001 From: Dmitrii Dolgov <9erthali...@gmail.com> Date: Mon, 22 Jan 2024 21:11:15 +0100 Subject: [PATCH v18 1/4] Prevent jumbling of every element in ArrayExpr pg_stat_statements produces multiple entries for queries like SELECT something FROM table WHERE col IN (1, 2, 3, ...) depending on the number of parameters, because every element of ArrayExpr is jumbled. In certain situations it's undesirable, especially if the list becomes too large. Make an array of Const expressions contribute only the first/last elements to the jumble hash. Allow to enable this behavior via the new pg_stat_statements parameter query_id_const_merge with the default value off. Reviewed-by: Zhihong Yu, Sergey Dudoladov, Robert Haas, Tom Lane, Michael Paquier, Sergei Kornilov, Alvaro Herrera, David Geier Tested-by: Chengxi Sun --- contrib/pg_stat_statements/Makefile | 2 +- .../pg_stat_statements/expected/merging.out | 167 ++ contrib/pg_stat_statements/meson.build| 1 + .../pg_stat_statements/pg_stat_statements.c | 62 ++- contrib/pg_stat_statements/sql/merging.sql| 58 ++ doc/src/sgml/pgstatstatements.sgml| 57 +- src/backend/nodes/gen_node_support.pl | 21 ++- src/backend/nodes/queryjumblefuncs.c | 100 ++- src/backend/postmaster/postmaster.c | 3 + src/backend/utils/misc/postgresql.conf.sample | 1 - src/include/nodes/primnodes.h | 2 +- src/include/nodes/queryjumble.h | 9 +- 12 files changed, 458 insertions(+), 25 deletions(-) create mode 100644 contrib/pg_stat_statements/expected/merging.out create mode 100644 contrib/pg_stat_statements/sql/merging.sql diff --git a/contrib/pg_stat_statements/Makefile b/contrib/pg_stat_statements/Makefile index 414a30856e4..03a62d685f3 100644 --- a/contrib/pg_stat_statements/Makefile +++ b/contrib/pg_stat_statements/Makefile @@ -19,7 +19,7 @@ LDFLAGS_SL += $(filter -lm, $(LIBS)) REGRESS_OPTS = --temp-config $(top_srcdir)/contrib/pg_stat_statements/pg_stat_statements.conf REGRESS = select dml cursors utility level_tracking planning \ - user_activity wal entry_timestamp cleanup oldextversions + user_activity wal entry_timestamp cleanup oldextversions merging # Disabled because these tests require "shared_preload_libraries=pg_stat_statements", # which typical installcheck users do not have (e.g. buildfarm clients). 
NO_INSTALLCHECK = 1 diff --git a/contrib/pg_stat_statements/expected/merging.out b/contrib/pg_stat_statements/expected/merging.out new file mode 100644 index 000..1e58283afe8 --- /dev/null +++ b/contrib/pg_stat_statements/expected/merging.out @@ -0,0 +1,167 @@ +-- +-- Const merging functionality +-- +CREATE EXTENSION pg_stat_statements; +CREATE TABLE test_merge (id int, data int); +-- IN queries +-- No merging is performed, as a baseline result +SELECT pg_stat_statements_reset() IS NOT NULL AS t; + t +--- + t +(1 row) + +SELECT * FROM test_merge WHERE id IN (1, 2, 3, 4, 5, 6, 7, 8, 9); + id | data ++-- +(0 rows) + +SELECT * FROM test_merge WHERE id IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10); + id | data ++-- +(0 rows) + +SELECT * FROM test_merge WHERE id IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11); + id | data ++-- +(0 rows) + +SELECT query, calls FROM pg_stat_statements ORDER BY query COLLATE "C"; +query| calls +-+--- + SELECT * FROM test_merge WHERE id IN ($1, $2, $3, $4, $5, $6, $7, $8, $9) | 1 + SELECT * FROM test_merge WHERE id IN ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10) | 1 + SELECT * FROM test_merge WHERE id IN ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11) | 1 + SELECT pg_stat_statements_reset() IS NOT NULL AS t
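For interactive testing, a sketch along these lines should show the effect (this assumes pg_stat_statements is in shared_preload_libraries and that the new parameter can be changed with SET; the patch may instead require setting pg_stat_statements.query_id_const_merge in postgresql.conf):

CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
CREATE TABLE test_merge (id int, data int);

-- parameter introduced by this patch, default off
SET pg_stat_statements.query_id_const_merge = on;
SELECT pg_stat_statements_reset() IS NOT NULL AS t;

-- same query shape, different number of constants in the IN list
SELECT * FROM test_merge WHERE id IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
SELECT * FROM test_merge WHERE id IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11);

-- with merging enabled, the expectation is a single entry here rather than
-- one entry per distinct number of constants, unlike the baseline above
SELECT query, calls FROM pg_stat_statements ORDER BY query COLLATE "C";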
Re: Refactoring backend fork+exec code
Hi, On 2024-01-10 14:35:52 +0200, Heikki Linnakangas wrote: > Here's a patch that gets rid of AuxProcType. It's independent of the other > patches in this thread; if this is committed, I'll rebase the rest of the > patches over this and get rid of the new PMC_* enum. > > Three patches, actually. The first one fixes an existing comment that I > noticed to be incorrect while working on this. I'll push that soon, barring > objections. The second one gets rid of AuxProcType, and the third one > replaces IsBackgroundWorker, IsAutoVacuumLauncherProcess() and > IsAutoVacuumWorkerProcess() with checks on MyBackendType. So MyBackendType > is now the primary way to check what kind of a process the current process > is. > > 'am_walsender' would also be fairly straightforward to remove and replace > with AmWalSenderProcess(). I didn't do that because we also have > am_db_walsender and am_cascading_walsender which cannot be directly replaced > with MyBackendType. Given that, it might be best to keep am_walsender for > symmetry. > @@ -561,13 +561,13 @@ static void ShmemBackendArrayAdd(Backend *bn); > static void ShmemBackendArrayRemove(Backend *bn); > #endif /* EXEC_BACKEND > */ > > -#define StartupDataBase()StartChildProcess(StartupProcess) > -#define StartArchiver() > StartChildProcess(ArchiverProcess) > -#define StartBackgroundWriter() StartChildProcess(BgWriterProcess) > -#define StartCheckpointer() StartChildProcess(CheckpointerProcess) > -#define StartWalWriter() StartChildProcess(WalWriterProcess) > -#define StartWalReceiver() StartChildProcess(WalReceiverProcess) > -#define StartWalSummarizer() StartChildProcess(WalSummarizerProcess) > +#define StartupDataBase()StartChildProcess(B_STARTUP) > +#define StartArchiver() StartChildProcess(B_ARCHIVER) > +#define StartBackgroundWriter() StartChildProcess(B_BG_WRITER) > +#define StartCheckpointer() StartChildProcess(B_CHECKPOINTER) > +#define StartWalWriter() StartChildProcess(B_WAL_WRITER) > +#define StartWalReceiver() StartChildProcess(B_WAL_RECEIVER) > +#define StartWalSummarizer() StartChildProcess(B_WAL_SUMMARIZER) Not for this commit, but we ought to rip out these macros - all they do is to make it harder to understand the code. > @@ -5344,31 +5344,31 @@ StartChildProcess(AuxProcType type) > errno = save_errno; > switch (type) > { > - case StartupProcess: > + case B_STARTUP: > ereport(LOG, > (errmsg("could not fork startup > process: %m"))); > break; > - case ArchiverProcess: > + case B_ARCHIVER: > ereport(LOG, > (errmsg("could not fork > archiver process: %m"))); > break; > - case BgWriterProcess: > + case B_BG_WRITER: > ereport(LOG, > (errmsg("could not fork > background writer process: %m"))); > break; > - case CheckpointerProcess: > + case B_CHECKPOINTER: > ereport(LOG, > (errmsg("could not fork > checkpointer process: %m"))); > break; > - case WalWriterProcess: > + case B_WAL_WRITER: > ereport(LOG, > (errmsg("could not fork WAL > writer process: %m"))); > break; > - case WalReceiverProcess: > + case B_WAL_RECEIVER: > ereport(LOG, > (errmsg("could not fork WAL > receiver process: %m"))); > break; > - case WalSummarizerProcess: > + case B_WAL_SUMMARIZER: > ereport(LOG, > (errmsg("could not fork WAL > summarizer process: %m"))); > break; Seems we should replace this with something slightly more generic one of these days... 
> diff --git a/src/backend/utils/activity/backend_status.c > b/src/backend/utils/activity/backend_status.c > index 1a1050c8da1..92f24db4e18 100644 > --- a/src/backend/utils/activity/backend_status.c > +++ b/src/backend/utils/activity/backend_status.c > @@ -257,17 +257,16 @@ pgstat_beinit(void) > else > { > /* Must be an auxiliary process */ > - Assert(MyAuxP
Re: Optimizing nbtree ScalarArrayOp execution, allowing multi-column ordered scans, skip scan
On Fri, 19 Jan 2024 at 23:42, Peter Geoghegan wrote: > Thank you for your replies so far. > On Thu, Jan 18, 2024 at 11:39 AM Matthias van de Meent > wrote: > > I would agree with you if this was about new APIs and features, but > > here existing APIs are being repurposed without changing them. A > > maintainer of index AMs would not look at the commit title 'Enhance > > nbtree ScalarArrayOp execution' and think "oh, now I have to make sure > > my existing amsearcharray+amcanorder handling actually supports > > non-prefix arrays keys and return data in index order". > > This is getting ridiculous. > > It is quite likely that there are exactly zero affected out-of-core > index AMs. I don't count pgroonga as a counterexample (I don't think > that it actually fulfills the definition of a ). Basically, > "amcanorder" index AMs more or less promise to be compatible with > nbtree, down to having the same strategy numbers. So the idea that I'm > going to upset amsearcharray+amcanorder index AM authors is a > completely theoretical problem. The planner code evolved with nbtree, > hand-in-glove. And this is where I disagree with your (perceived) implicit argument that this should be and always stay this way. I don't mind changes in the planner for nbtree per se, but as I've mentioned before in other places, I really don't like how we handle amcanorder as if it were amisbtree. But it's not called that, so we shouldn't expect that to remain the case; and definitely not keep building on those expectations that it always is going to be the nbtree when amcanorder is set (or amsearcharray is set, or ..., or any combination of those that is currently used by btree). By keeping that expectation alive, this becomes a self-fulfilling prophecy, and I really don't like such restrictive self-fulfilling prophecies. It's nice that we have index AM feature flags, but if we only effectively allow one implementation for this set of flags by ignoring potential other users when changing behavior, then what is the API good for (apart from abstraction layering, which is nice)? /rant I'll see about the > I'm more than happy to add a reminder to the commit message about > adding a proforma listing to the compatibility section of the Postgres > 17 release notes. Can we actually agree on which theoretical index AM > types are affected first, though? Isn't it actually > amsearcharray+amcanorder+amcanmulticol external index AMs only? Do I > have that right? I think that may be right, but I could very well have built a btree-lite that doesn't have the historical baggage of having to deal with pages from before v12 (version 4) and some improvements that haven't made it to core yet. > BTW, how should we phrase this compatibility note, so that it can be > understood? It's rather awkward. Something like the following, maybe? """ Compatibility: The planner will now generate paths with array scan keys in any column for ordered indexes, rather than on only a prefix of the index columns. The planner still expects fully ordered data from those indexes. Historically, the btree AM couldn't output correctly ordered data for suffix array scan keys, which was "fixed" by prohibiting the planner from generating array scan keys without an equality prefix scan key up to that attribute. In this commit, the issue has been fixed, and the planner restriction has thus been removed, as the only in-core IndexAM that reports amcanorder now supports array scan keys on any column regardless of what prefix scan keys it has. 
""" > > > As I said to Heikki, thinking about this some more is on my todo list. > > > I mean the way that this worked had substantial revisions on HEAD > > > right before I posted v9. v9 was only to fix the bit rot that that > > > caused. > > > > Then I'll be waiting for that TODO item to be resolved. > > My TODO item is to resolve the question of whether and to what extent > these two optimizations should be combined. It's possible that I'll > decide that it isn't worth it, for whatever reason. That's fine, this decision (and any work related to it) is exactly what I was referring to with that mention of the TODO. > > > Even single digit > > > thousand of array elements should be considered huge. Plus this code > > > path is only hit with poorly written queries. > > > > Would you accept suggestions? > > You mean that you want to have a go at it yourself? Sure, go ahead. > > I cannot promise that I'll accept your suggested revisions, but if > they aren't too invasive/complicated compared to what I have now, then > I will just accept them. I maintain that this isn't really necessary, > but it might be the path of least resistance at this point. Attached 2 patches: v11.patch-a and v11.patch-b. Both are incremental on top of your earlier set, and both don't allocate additional memory in the merge operation in non-assertion builds. patch-a is a trivial and clean implementation of mergesort, which tends to reduce the total number of compare
Re: core dumps in auto_prewarm, tests succeed
On Mon, Jan 22, 2024 at 02:44:57PM -0600, Nathan Bossart wrote: > On Mon, Jan 22, 2024 at 12:41:17PM -0800, Andres Freund wrote: >> I noticed that I was getting core dumps while executing the tests, without >> the >> tests failing. Backtraces are variations of this: > > Looking, thanks for the heads-up. I think this is because the autoprewarm state was moved to a DSM segment, and DSM segments are detached before the on_shmem_exit callbacks are called during process exit. Moving apw_detach_shmem to the before_shmem_exit list seems to resolve the crashes.

diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index 9ea6c2252a..88c3a04109 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -165,7 +165,7 @@ autoprewarm_main(Datum main_arg)
 		first_time = false;
 
 	/* Set on-detach hook so that our PID will be cleared on exit. */
-	on_shmem_exit(apw_detach_shmem, 0);
+	before_shmem_exit(apw_detach_shmem, 0);
 
 	/*
 	 * Store our PID in the shared memory area --- unless there's already

-- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: core dumps in auto_prewarm, tests succeed
Hi, On 2024-01-22 15:19:36 -0600, Nathan Bossart wrote: > On Mon, Jan 22, 2024 at 02:44:57PM -0600, Nathan Bossart wrote: > > On Mon, Jan 22, 2024 at 12:41:17PM -0800, Andres Freund wrote: > >> I noticed that I was getting core dumps while executing the tests, without > >> the > >> tests failing. Backtraces are variations of this: > > > > Looking, thanks for the heads-up. > > I think this is because the autoprewarm state was moved to a DSM segment, > and DSM segments are detached before the on_shmem_exit callbacks are called > during process exit. Moving apw_detach_shmem to the before_shmem_exit list > seems to resolve the crashes. That seems plausible. Would still be nice to have at least this test ensure that the shutdown code works. Perhaps just a check of the control file after shutdown, ensuring that the state is "shutdowned" vs crashed? Greetings, Andres Freund
Re: Network failure may prevent promotion
Hi, On 2024-01-19 12:28:05 +0900, Michael Paquier wrote: > On Thu, Jan 18, 2024 at 03:42:28PM +0200, Heikki Linnakangas wrote: > > Given that commit 728f86fec6 that introduced this issue was not strictly > > required, perhaps we should just revert it for v16. > > Is there a point in keeping 728f86fec6 as well on HEAD? That does not > strike me as wise to keep that in the tree for now. If it needs to be > reworked, looking at this problem from scratch would be a safer > approach. IDK, I think we'll introduce this type of bug over and over if we don't fix it properly. Greetings, Andres Freund
Re: Improving EXPLAIN's display of SubPlan nodes
I wrote: > The main thing that's still missing compared to what is in the plan > data structure is information about which Param is which. I think > we have the subplan output Params relatively well covered through > the expedient of listing them in the generated plan_name, but it's > still not apparent to me how we could shoehorn subplan input > Params into this (or whether it's worth doing). Actually ... it looks like it probably isn't worth doing, because it's already the case that we don't expose input Params as such. EXPLAIN searches for the referent of an input Param and displays that (cf find_param_referent()). Just for experimental purposes, I wrote a follow-on patch to add printout of the parParam and args list (attached, as .txt so the cfbot doesn't think it's a patch). This produces results like explain (verbose, costs off) select array(select sum(x+y) s from generate_series(1,3) y group by y order by s) from generate_series(1,3) x; QUERY PLAN --- Function Scan on pg_catalog.generate_series x Output: ARRAY(SubPlan 1 PASSING $0 := x.x) ^ added by delta patch Function Call: generate_series(1, 3) SubPlan 1 -> Sort Output: (sum((x.x + y.y))), y.y Sort Key: (sum((x.x + y.y))) -> HashAggregate Output: sum((x.x + y.y)), y.y ^^^ actual reference to $0 Group Key: y.y -> Function Scan on pg_catalog.generate_series y Output: y.y Function Call: generate_series(1, 3) (13 rows) As you can see, it's not necessary to explain what $0 is because that name isn't shown anywhere else --- the references to "x.x" in the subplan are actually uses of $0. So now I'm thinking that we do have enough detail in the present proposal, and we just need to think about whether there's some nicer way to present it than the particular spelling I used here. One idea I considered briefly is to pull the same trick with regards to output parameters --- that is, instead of adding all the "returns $n" annotations to subplans, drill down and print the subplan's relevant targetlist expression instead of "$n". On balance I think that might be more confusing not less so, though. SQL users are used to the idea that a sub-select can "see" variables from the outer query, but not at all vice-versa. I think it probably wouldn't be formally ambiguous, because ruleutils already de-duplicates table aliases across the whole tree, but it still seems likely to be confusing. Also, people are already pretty used to seeing $n to represent the outputs of InitPlans, and I've not heard many complaints suggesting that we should change that. regards, tom lane diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c index 9ee35640ba..b3da98a2f7 100644 --- a/src/backend/utils/adt/ruleutils.c +++ b/src/backend/utils/adt/ruleutils.c @@ -8911,9 +8911,30 @@ get_rule_expr(Node *node, deparse_context *context, break; } if (subplan->useHashTable) - appendStringInfo(buf, "HASHED %s)", subplan->plan_name); + appendStringInfo(buf, "HASHED %s", subplan->plan_name); else - appendStringInfo(buf, "%s)", subplan->plan_name); + appendStringInfo(buf, "%s", subplan->plan_name); + if (subplan->parParam) + { + ListCell *lc3; + ListCell *lc4; + bool first = true; + + appendStringInfoString(buf, " PASSING "); + forboth(lc3, subplan->parParam, lc4, subplan->args) + { + int paramid = lfirst_int(lc3); + Node *arg = (Node *) lfirst(lc4); + + if (first) + first = false; + else + appendStringInfoString(buf, ", "); + appendStringInfo(buf, "$%d := ", paramid); + get_rule_expr(arg, context, showimplicit); + } + } +
Re: core dumps in auto_prewarm, tests succeed
On Mon, Jan 22, 2024 at 01:24:54PM -0800, Andres Freund wrote: > On 2024-01-22 15:19:36 -0600, Nathan Bossart wrote: >> I think this is because the autoprewarm state was moved to a DSM segment, >> and DSM segments are detached before the on_shmem_exit callbacks are called >> during process exit. Moving apw_detach_shmem to the before_shmem_exit list >> seems to resolve the crashes. > > That seems plausible. Would still be nice to have at least this test ensure > that the shutdown code works. Perhaps just a check of the control file after > shutdown, ensuring that the state is "shutdowned" vs crashed? I'll give that a try. I'll also expand the comment above the before_shmem_exit() call. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: core dumps in auto_prewarm, tests succeed
On Mon, Jan 22, 2024 at 03:38:15PM -0600, Nathan Bossart wrote: > On Mon, Jan 22, 2024 at 01:24:54PM -0800, Andres Freund wrote: >> On 2024-01-22 15:19:36 -0600, Nathan Bossart wrote: >>> I think this is because the autoprewarm state was moved to a DSM segment, >>> and DSM segments are detached before the on_shmem_exit callbacks are called >>> during process exit. Moving apw_detach_shmem to the before_shmem_exit list >>> seems to resolve the crashes. >> >> That seems plausible. Would still be nice to have at least this test ensure >> that the shutdown code works. Perhaps just a check of the control file after >> shutdown, ensuring that the state is "shutdowned" vs crashed? > > I'll give that a try. I'll also expand the comment above the > before_shmem_exit() call. Here is a patch. This might be a topic for another thread, but I do wonder whether we could put a generic pg_controldata check in node->stop that would at least make sure that the state is some sort of expected shut-down state. My first thought is that it could be a tad expensive, but... maybe it wouldn't be too bad. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com >From b339a84802b76d42f4863e52e6d91ab28873e00a Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Mon, 22 Jan 2024 16:22:57 -0600 Subject: [PATCH v1 1/1] fix autoprewarm core dump --- contrib/pg_prewarm/autoprewarm.c | 10 -- contrib/pg_prewarm/t/001_basic.pl | 6 ++ 2 files changed, 14 insertions(+), 2 deletions(-) diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c index 9ea6c2252a..06ee21d496 100644 --- a/contrib/pg_prewarm/autoprewarm.c +++ b/contrib/pg_prewarm/autoprewarm.c @@ -164,8 +164,14 @@ autoprewarm_main(Datum main_arg) if (apw_init_shmem()) first_time = false; - /* Set on-detach hook so that our PID will be cleared on exit. */ - on_shmem_exit(apw_detach_shmem, 0); + /* + * Set on-detach hook so that our PID will be cleared on exit. + * + * NB: Autoprewarm's state is stored in a DSM segment, and DSM segments + * are detached before calling the on_shmem_exit callbacks, so we must put + * apw_detach_shmem in the before_shmem_exit callback list. + */ + before_shmem_exit(apw_detach_shmem, 0); /* * Store our PID in the shared memory area --- unless there's already diff --git a/contrib/pg_prewarm/t/001_basic.pl b/contrib/pg_prewarm/t/001_basic.pl index bcd23a6914..825d3448ee 100644 --- a/contrib/pg_prewarm/t/001_basic.pl +++ b/contrib/pg_prewarm/t/001_basic.pl @@ -55,4 +55,10 @@ $node->wait_for_log( $node->stop; +# control file should indicate normal shut down +command_like( + [ 'pg_controldata', $node->data_dir() ], + qr/Database cluster state:\s*shut down/, + 'cluster shut down normally'); + done_testing(); -- 2.25.1
Re: psql: Allow editing query results with \gedit
Re: David G. Johnston > Building off the other comments, I'd suggest trying to get rid of the > intermediate JSOn format and also just focus on a single row at any given > time. We need *some* machine-readable format. It doesn't have to be JSON, but JSON is actually pretty nice to read - and if values are too long, or there are too many values, switch to extended mode: select * from messages \gedit (expanded) [{ "id": "1", "language": "en", "message": "This is a very long test text with little actual meaning." },{ "id": "2", "language": "en", "message": "Another one, a bit shorter." }] I tweaked the indentation in the psql JSON output patch specifically to make it readable. Restricting to a single line might make sense if it helps editing, but I don't think it does. > For an update the first argument to the metacommand could be the unique key > value present in the previous result. The resultant UPDATE would just put > that into the where clause and every other column in the result would be a > SET clause column with the thing being set the current value, ready to be > edited. Hmm, then you would still have to cut-and-paste the PK value. If that that's a multi-column non-numeric key, you are basically back to the original problem. Re: Tom Lane > Yeah, that's something that was also bothering me, but I failed to > put my finger on it. "Here's some JSON, edit it, and don't forget > to keep the quoting correct" does not strike me as a user-friendly > way to adjust data content. A spreadsheet-like display where you > can change data within cells seems like a far better API, although > I don't want to write that either. Right. I wouldn't want a form editing system in there either. But perhaps this middle ground of using a well-established format that is easy to generate and to parse (it's using the JSON parser from pgcommon) makes it fit into psql. If parsing the editor result fails, the user is asked if they want to re-edit with a parser error message, and if they go to the editor again, the cursor is placed in the line where the error is. (Also, what's wrong with having to strictly adhere to some syntax, we are talking about SQL here.) It's admittedly larger than the average \backslash command, but it does fit into psql's interactive usage. \crosstabview is perhaps a similar thing - it doesn't really fit into a simple "send query and display result" client, but since it solves an actual problem, it makes well sense to spend the extra code on it. > This kind of API would not readily support INSERT or DELETE cases, but > TBH I think that's better anyway --- you're adding too much ambiguity > in pursuit of a very secondary use-case. The stated complaint was > "it's too hard to build UPDATE commands", which I can sympathize with. I've been using the feature already for some time, and it's a real relief. In my actual use case here, I use it on my ham radio logbook: =# select start, call, qrg, name from log where cty = 'CE9' order by start; start │ call │ qrg │ name ┼┼─┼─── 2019-03-12 20:34:00+00 │ RI1ANL │7.076253 │ ∅ 2021-03-16 21:24:00+00 │ DP0GVN │2400.395 │ Felix 2022-01-15 17:19:00+00 │ DP0GVN │ 2400.01 │ Felix 2022-10-23 19:17:15+00 │ DP0GVN │ 2400.041597 │ ∅ 2023-10-01 14:05:00+00 │ 8J1RL │ 28.182575 │ ∅ 2024-01-22 21:15:15+00 │ DP1POL │ 10.138821 │ ∅ (6 Zeilen) The primary key is (start, call). 
If I now want to note that the last contact with Antarctica there was also with Felix, I'd have to transform that into update log set name = 'Felix' where start = '2024-01-22 21:15:15+00' and call = 'DP1POL'; \gedit is just so much easier. UPDATE is the core feature. If we want to say INSERT and DELETE aren't supported, but UPDATE support can go in, that'd be fine with me. > (BTW, I wonder how much of this already exists in pgAdmin.) pgadmin seems to support it. (Most other clients don't.) Obviously, I would want to do the updating using the client I also use for querying. Christoph
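For that row, the \gedit round trip might look roughly like the following sketch, extrapolating from the JSON layout shown earlier for the messages table (how NULLs and the timestamp are rendered is a guess, since that row's \gedit output isn't shown here):

=# select start, call, qrg, name from log
   where start = '2024-01-22 21:15:15+00' and call = 'DP1POL' \gedit
[{
  "start": "2024-01-22 21:15:15+00",
  "call": "DP1POL",
  "qrg": "10.138821",
  "name": null
}]

Changing "name": null to "name": "Felix" in the editor and saving would then have \gedit issue essentially the UPDATE spelled out above, keyed on the primary key (start, call).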
Re: Things I don't like about \du's "Attributes" column
On Sun, Jan 21, 2024 at 2:35 PM Pavel Luzanov wrote: > Another approach based on early suggestions. > > The Attributes column includes only the enabled logical attributes. > Regardless of whether the attribute is enabled by default or not. > > > The attribute names correspond to the keywords of the CREATE ROLE command. > The attributes are listed in the same order as in the documentation. > (I think that the LOGIN attribute should be moved to the first place, > both in the documentation and in the command.) > > I'd just flip INHERIT and LOGIN > The "Connection limit" and "Valid until" attributes are placed in separate > columns. > The "Password?" column has been added. > > Sample output. > > Patch v3: > =# \du > List of roles > Role name |Attributes > | Password? | Valid until | Connection limit > ---+---+---++-- > admin | INHERIT > | no|| -1 > alice | SUPERUSER LOGIN > | yes | infinity |5 > bob | CREATEDB INHERIT LOGIN REPLICATION BYPASSRLS > | no| 2022-01-01 00:00:00+03 | -1 > charlie | CREATEROLE INHERIT LOGIN > | yes ||0 > postgres | SUPERUSER CREATEDB CREATEROLE INHERIT LOGIN REPLICATION > BYPASSRLS | no|| -1 > (5 rows) > > > Small modification with newline separator for Attributes column: > > Patch v3 with newlines: > =# \du > List of roles > Role name | Attributes | Password? | Valid until | Connection > limit > ---+-+---++-- > postgres | SUPERUSER +| no|| > -1 >| CREATEDB +| || >| CREATEROLE +| || >| INHERIT+| || >| LOGIN +| || >| REPLICATION+| || >| BYPASSRLS | || > (5 rows) > > I'm strongly in favor of using mixed-case for the attributes. The SQL Command itself doesn't care about capitalization and it is much easier on the eyes. I'm also strongly in favor of newlines, as can be seen by the default bootstrap superuser entry putting everything on one line eats up 65 characters. List of roles Role name | Attributes | Password? | Valid until | Connection limit | Description ---+-+---+-+--+- davidj| Superuser +| no| | -1 | | CreateDB +| | | | | CreateRole +| | | | | Inherit+| | | | | Login +| | | | | Replication+| | | | | BypassRLS | | | | (1 row) As noted by Peter this patch didn't update the two affected expected output files. psql.out and, due to the system view change, rules.out. That particular change requires a documentation update to the pg_roles system view page. I'd suggest pulling out this system view change into its own patch. I will take another pass later when I get some more time. I want to re-review some of the older messages. But the tweaks I show and breaking out the view changes in to a separate patch both appeal to me right now. David J.
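For experimenting with the proposed layout before the psql changes land, something close to it can be approximated directly in SQL. This sketch uses pg_authid because the unpatched pg_roles does not reveal whether a password is set, so it is superuser-only; the labels and capitalization are only illustrative:

SELECT r.rolname AS "Role name",
       concat_ws(E'\n',
                 CASE WHEN r.rolsuper THEN 'Superuser' END,
                 CASE WHEN r.rolcreatedb THEN 'CreateDB' END,
                 CASE WHEN r.rolcreaterole THEN 'CreateRole' END,
                 CASE WHEN r.rolinherit THEN 'Inherit' END,
                 CASE WHEN r.rolcanlogin THEN 'Login' END,
                 CASE WHEN r.rolreplication THEN 'Replication' END,
                 CASE WHEN r.rolbypassrls THEN 'BypassRLS' END) AS "Attributes",
       r.rolpassword IS NOT NULL AS "Password?",
       r.rolvaliduntil AS "Valid until",
       r.rolconnlimit AS "Connection limit"
FROM pg_authid r
WHERE r.rolname !~ '^pg_'
ORDER BY 1;

In psql's default aligned output the embedded newlines show up as the "+" continuation marks seen in the examples above.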
Re: Built-in CTYPE provider
On Mon, 2024-01-22 at 19:49 +0100, Peter Eisentraut wrote: > > > I don't get this argument. Of course, people care about sorting and > sort order. Whether you consider this part of Unicode or adjacent to > it, people still want it. You said that my proposal sends a message that we somehow don't care about Unicode, and I strongly disagree. The built-in provider I'm proposing does implement Unicode semantics. Surely a database that offers UCS_BASIC (a SQL spec feature) isn't sending a message that it doesn't care about Unicode, and neither is my proposal. > > > > * ICU offers COLLATE UNICODE, locale tailoring, case-insensitive > > matching, and customization with rules. It's the solution for > > everything from "slightly more advanced" to "very advanced". > > I am astonished by this. In your world, do users not want their text > data sorted? Do they not care what the sort order is? I obviously care about Unicode and collation. I've put a lot of effort recently into contributions in this area, and I wouldn't have done that if I thought users didn't care. You've made much greater contributions and I thank you for that. The logical conclusion of your line of argument would be that libc's "C.UTF-8" locale and UCS_BASIC simply should not exist. But they do exist, and for good reason. One of those good reasons is that only *human* users care about the human-friendliness of sort order. If Postgres is just feeding the results to another system -- or an application layer that re-sorts the data anyway -- then stability, performance, and interoperability matter more than human-friendliness. (Though Unicode character semantics are still useful even when the data is not going directly to a human.) > You consider UCA > sort order an "advanced" feature? I said "slightly more advanced" compared with "basic". "Advanced" can be taken in either a positive way ("more useful") or a negative way ("complex"). I'm sorry for the misunderstanding, but my point was this: * The builtin provider is for people who are fine with code point order and no tailoring, but want Unicode character semantics, collation stability, and performance. * ICU is the right solution for anyone who wants human-friendly collation or tailoring, and is willing to put up with some collation stability risk and lower collation performance. Both have their place and the user is free to mix and match as needed, thanks to the COLLATE clause for columns and queries. Regards, Jeff Davis
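To illustrate the mix-and-match point with pieces that exist today (the exact name and syntax of the proposed builtin provider are still being discussed, and the ICU collations below are only present in builds configured with ICU), a minimal sketch:

CREATE TABLE books (
    sku   text COLLATE ucs_basic,    -- stable code-point order, no tailoring
    title text COLLATE "en-x-icu"    -- human-friendly, tailored ordering
);

-- Per-query tailoring is independent of the column default:
SELECT title FROM books ORDER BY title COLLATE "und-x-icu";

The proposed builtin provider would slot into the same COLLATE mechanism for the code-point-order case, while ICU remains the choice wherever linguistic ordering or tailoring is wanted.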