Failures with gcd functions with GCC snapshots and -O3 (?)

2021-06-03 Thread Michael Paquier
Hi all,

serinus has been complaining about the new gcd functions in 13~:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=serinus&dt=2021-06-03%2003%3A44%3A14

The overflow detection is going wrong in both directions, as seen here:
 SELECT gcd((-9223372036854775808)::int8, (-9223372036854775808)::int8); -- 
overflow
-ERROR:  bigint out of range
+ gcd
+--
+ -9223372036854775808
+(1 row)

That seems like a compiler bug to me as this host uses recent GCC
snapshots, and I cannot see a problem in GCC 10.2 on my own dev box.
But perhaps I am missing something?
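For readers following along, the expected error exists because the mathematical result does not fit in int8. A hedged Python sketch of the edge case (emulating 64-bit two's-complement limits; this models the check, it is not PostgreSQL's actual implementation):

```python
import math

INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1

def gcd_int8(a, b):
    """Illustrative model of the int8 gcd overflow check discussed above."""
    # gcd(INT64_MIN, 0) and gcd(INT64_MIN, INT64_MIN) are |INT64_MIN| = 2**63,
    # which exceeds INT64_MAX, so the real function reports "bigint out of range".
    result = math.gcd(a, b)  # Python's gcd is arbitrary-precision, always >= 0
    if result > INT64_MAX:
        raise OverflowError("bigint out of range")
    return result
```

The buildfarm failure above shows the miscompiled build returning -9223372036854775808 instead of raising this error.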

Thanks,
--
Michael




Re: Failures with gcd functions with GCC snapshots and -O3 (?)

2021-06-03 Thread Dean Rasheed
On Thu, 3 Jun 2021 at 08:26, Michael Paquier  wrote:
>
> Hi all,
>
> serinus has been complaining about the new gcd functions in 13~:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=serinus&dt=2021-06-03%2003%3A44%3A14
>
> The overflow detection is going wrong in both directions, as seen here:
>  SELECT gcd((-9223372036854775808)::int8, (-9223372036854775808)::int8); -- 
> overflow
> -ERROR:  bigint out of range
> + gcd
> +--
> + -9223372036854775808
> +(1 row)
>
> That seems like a compiler bug to me as this host uses recent GCC
> snapshots, and I cannot see a problem in GCC 10.2 on my own dev box.
> But perhaps I am missing something?
>

Huh, yeah. The code is pretty clear that that should throw an error:

if (arg1 == PG_INT64_MIN)
{
if (arg2 == 0 || arg2 == PG_INT64_MIN)
ereport(ERROR,
(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
 errmsg("bigint out of range")));

and FWIW it works OK on my dev box with gcc 10.2.1 and the same cflags.

Regards,
Dean




Re: Failures with gcd functions with GCC snapshots and -O3 (?)

2021-06-03 Thread Sergei Kornilov
Hello

I built gcc version 12.0.0 20210603 (experimental) locally, then tried
REL_13_STABLE with the same CFLAGS as serinus:
./configure --prefix=/home/melkij/tmp/pgdev/inst CFLAGS="-O3 -ggdb -g3 -Wall 
-Wextra -Wno-unused-parameter -Wno-sign-compare 
-Wno-missing-field-initializers" --enable-tap-tests --enable-cassert 
--enable-debug
check-world passed.

regards, Sergei




Re: Asynchronous Append on postgres_fdw nodes.

2021-06-03 Thread Etsuro Fujita
On Tue, May 11, 2021 at 6:55 PM Etsuro Fujita  wrote:
> On Tue, May 11, 2021 at 6:27 PM Andrey Lepikhov
>  wrote:
> > On 11/5/21 12:24, Etsuro Fujita wrote:
>
> > >>  ->  Append (actual rows=3000 loops=1)
> > >>->  Async Foreign Scan on f1 (actual rows=0 loops=1)
> > >>->  Async Foreign Scan on f2 (actual rows=0 loops=1)
> > >>->  Foreign Scan on f3 (actual rows=3000 loops=1)
> > >>
> > >> Here we give preference to the synchronous scan. Why?
> > >
> > > This would be expected behavior, and the reason is to avoid performance
> > > degradation; you might think it would be better to execute the async
> > > Foreign Scan nodes more aggressively, but it would require
> > > waiting/polling for file descriptor events many times, which is
> > > expensive and might cause performance degradation.  I think there is
> > > room for improvement, though.
> > Yes, I agree with you. Maybe you can add a note in the documentation on
> > async_capable, for example:
> > "... Synchronous and asynchronous scanning strategies can be mixed by
> > the optimizer in one scan plan of a partitioned table or a 'UNION ALL'
> > command. For performance reasons, synchronous scans execute before the
> > first async scan. ..."
>
> +1  But I think this is an independent issue, so I think it would be
> better to address the issue separately.

I think that since postgres-fdw.sgml would be for users rather than
developers, unlike fdwhandler.sgml, it would be better to explain this
more in a not-too-technical way.  So how about something like this?

Asynchronous execution is applied even when an Append node contains
subplan(s) executed synchronously as well as subplan(s) executed
asynchronously.  In that case, if the asynchronous subplans are ones
executed using postgres_fdw, tuples from the asynchronous subplans are
not returned until after at least one synchronous subplan returns all
tuples, as that subplan is executed while the asynchronous subplans
are waiting for the results of queries sent to foreign servers.  This
behavior might change in a future release.
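The ordering described in the proposed paragraph can be sketched as a toy scheduler (purely illustrative; these names are not postgres_fdw APIs):

```python
def run_append(async_subplans, sync_subplans):
    """Toy model of an Append node mixing async and sync subplans.

    Async subplans send their remote queries first but return no tuples yet;
    a synchronous subplan is then scanned to completion while those queries
    are in flight; only afterwards are the async results consumed.
    """
    events = []
    for name in async_subplans:
        events.append(("send", name))      # request dispatched, no rows yet
    for name in sync_subplans:
        events.append(("scan", name))      # sync subplan returns all its tuples
    for name in async_subplans:
        events.append(("collect", name))   # async tuples are returned last
    return events
```

This mirrors the EXPLAIN output quoted earlier, where the plain Foreign Scan on f3 returned all 3000 rows before the Async Foreign Scans on f1 and f2 produced anything.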

Best regards,
Etsuro Fujita




pg_upgrade is failed for 'plpgsql_call_handler' handler

2021-06-03 Thread tushar

Hi,

In one of my testing scenarios, I found that pg_upgrade fails for the
'plpgsql_call_handler' handler.


Steps to reproduce (on any supported version of PG):

Perform initdb (./initdb -D d1; ./initdb -D d2)

Start the d1 cluster (./pg_ctl -D d1 start), connect to postgres (./psql
postgres), and create this language:


postgres=# CREATE TRUSTED LANGUAGE plspl_sm HANDLER plpgsql_call_handler;
CREATE LANGUAGE

Stop the server (./pg_ctl -D d1 stop)

Perform pg_upgrade (./pg_upgrade -d d1 -D d2 -b . -B .)

It will fail with this message:

pg_restore: while PROCESSING TOC:
pg_restore: from TOC entry 825; 2612 16384 PROCEDURAL LANGUAGE plspl_sm edb
pg_restore: error: could not execute query: ERROR:  could not open 
extension control file 
"/home/edb/pg14/pg/edbpsql/share/postgresql/extension/plspl_sm.control": 
No such file or directory

Command was: CREATE OR REPLACE PROCEDURAL LANGUAGE "plspl_sm";

Is this expected?

--
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company





Re: [HACKERS] logical decoding of two-phase transactions

2021-06-03 Thread Amit Kapila
On Wed, Jun 2, 2021 at 4:34 AM Peter Smith  wrote:
>
> Please find attached the latest patch set v82*
>

A few comments on 0001:

1.
+ /*
+ * BeginTransactionBlock is necessary to balance the EndTransactionBlock
+ * called within the PrepareTransactionBlock below.
+ */
+ BeginTransactionBlock();
+ CommitTransactionCommand();
+
+ /*
+ * Update origin state so we can restart streaming from correct position
+ * in case of crash.
+ */
+ replorigin_session_origin_lsn = prepare_data.end_lsn;
+ replorigin_session_origin_timestamp = prepare_data.prepare_time;
+
+ PrepareTransactionBlock(gid);
+ CommitTransactionCommand();

Here, calling CommitTransactionCommand() twice looks a bit odd.
Before the first call, can we add a comment like "This is to
complete the Begin command started by the previous call"?

2.
@@ -85,11 +85,16 @@ typedef struct LogicalDecodingContext
  bool streaming;

  /*
- * Does the output plugin support two-phase decoding, and is it enabled?
+ * Does the output plugin support two-phase decoding.
  */
  bool twophase;

  /*
+ * Is two-phase option given by output plugin?
+ */
+ bool twophase_opt_given;
+
+ /*
  * State for writing output.

I think we should add a few comments explaining why we need a separate
twophase parameter here. The description of twophase_opt_given can be
changed to: "Is the two-phase option given by the output plugin? This
is to allow output plugins to enable two_phase at the start of
streaming. We can't rely on the twophase parameter, which only tells
whether the plugin provides all the necessary two_phase APIs, for this
purpose." Feel free to add more to it.

3.
@@ -432,10 +432,19 @@ CreateInitDecodingContext(const char *plugin,
  MemoryContextSwitchTo(old_context);

  /*
- * We allow decoding of prepared transactions iff the two_phase option is
- * enabled at the time of slot creation.
+ * We allow decoding of prepared transactions when the two_phase is
+ * enabled at the time of slot creation, or when the two_phase option is
+ * given at the streaming start.
  */
- ctx->twophase &= MyReplicationSlot->data.two_phase;
+ ctx->twophase &= (ctx->twophase_opt_given || slot->data.two_phase);
+
+ /* Mark slot to allow two_phase decoding if not already marked */
+ if (ctx->twophase && !slot->data.two_phase)
+ {
+ slot->data.two_phase = true;
+ ReplicationSlotMarkDirty();
+ ReplicationSlotSave();
+ }

Why do we need to change this during CreateInitDecodingContext, which
is called at create_slot time? At that time, we don't need to consider
any options, and there is no need to toggle the slot's two_phase value.
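To make the review point concrete, the enabling logic in the quoted hunk amounts to something like this (a simplified model with illustrative names, not the actual C code):

```python
def decide_twophase(plugin_supports, opt_given, slot_two_phase):
    """Model of ctx->twophase &= (ctx->twophase_opt_given || slot->data.two_phase).

    Returns (decoding_enabled, must_mark_slot): two-phase decoding is used
    only when the plugin supports it AND it was requested either at slot
    creation or via the streaming-start option; the slot is marked dirty
    the first time decoding becomes enabled for it.
    """
    enabled = plugin_supports and (opt_given or slot_two_phase)
    must_mark_slot = enabled and not slot_two_phase
    return enabled, must_mark_slot
```

The question above is whether this toggle belongs in CreateInitDecodingContext at all, since at slot-creation time no streaming options are in play.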


4.
- /* Binary mode and streaming are only supported in v14 and higher */
+ /*
+ * Binary, streaming, and two_phase are only supported in v14 and
+ * higher
+ */

We can say v15 for two_phase.

5.
-#define LOGICALREP_PROTO_MAX_VERSION_NUM LOGICALREP_PROTO_STREAM_VERSION_NUM
+#define LOGICALREP_PROTO_TWOPHASE_VERSION_NUM 3
+#define LOGICALREP_PROTO_MAX_VERSION_NUM 3

Isn't it better to define LOGICALREP_PROTO_MAX_VERSION_NUM as
LOGICALREP_PROTO_TWOPHASE_VERSION_NUM instead of specifying the
number directly?

6.
+/* Commit (and abort) information */
 typedef struct LogicalRepCommitData
 {
  XLogRecPtr commit_lsn;
@@ -122,6 +132,48 @@ typedef struct LogicalRepCommitData
  TimestampTz committime;
 } LogicalRepCommitData;

Is there a reason for the above comment addition? If so, how is it
related to this patch?

7.
+++ b/src/test/subscription/t/021_twophase.pl
@@ -0,0 +1,299 @@
+# logical replication of 2PC test
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;

In the nearby test files, we have Copyright notice like "# Copyright
(c) 2021, PostgreSQL Global Development Group". We should add one to
the new test files in this patch as well.

8.
+# Also wait for two-phase to be enabled
+my $twophase_query =
+ "SELECT count(1) = 0 FROM pg_subscription WHERE subtwophasestate NOT
IN ('e');";
+$node_subscriber->poll_query_until('postgres', $twophase_query)
+  or die "Timed out while waiting for subscriber to enable twophase";

Isn't it better to write this query as: "SELECT count(1) = 1 FROM
pg_subscription WHERE subtwophasestate = 'e';"? It looks a bit odd to
use the NOT IN operator here. Similarly, change the same query used at
the other place in the patch.

9.
+# check that transaction is in prepared state on subscriber
+my $result = $node_subscriber->safe_psql('postgres', "SELECT count(*)
FROM pg_prepared_xacts;");
+is($result, qq(1), 'transaction is prepared on subscriber');
+
+# Wait for the statistics to be updated
+$node_publisher->poll_query_until(
+ 'postgres', qq[
+ SELECT count(slot_name) >= 1 FROM pg_stat_replication_slots
+ WHERE slot_name = 'tap_sub'
+ AND total_txns > 0 AND total_bytes > 0;
+]) or die "Timed out while waiting for statistics to be updated";

I don't see the need to check for stats in this test. If we really
want to test stats then we can add a separate test in
contrib\test_decoding\sql\stats, but I suggest leaving it out. Please
do the same for the other stats tests.

Re: locking [user] catalog tables vs 2pc vs logical rep

2021-06-03 Thread Amit Kapila
On Thu, Jun 3, 2021 at 9:18 AM osumi.takami...@fujitsu.com
 wrote:
>
> On Tuesday, June 1, 2021 4:33 PM Peter Smith 
> > To: Andres Freund 
> > Cc: PostgreSQL-development ; Amit Kapila
> > ; Markus Wanner
> > 
> > Subject: Re: locking [user] catalog tables vs 2pc vs logical rep
> >
> > Hi.
> >
> > The attached PG docs patch about catalog deadlocks was previously
> > implemented in another thread [1], but it seems more relevant to this one.
> >
> > PSA.
> Thank you for providing the patch.
> I have updated your patch to include some other viewpoints.
>

I suggest creating a version of the patch for the synchronous
replication part for the back-branches as well.

-- 
With Regards,
Amit Kapila.




Are we missing (void) when return value of fsm_set_and_search is ignored?

2021-06-03 Thread Bharath Rupireddy
Hi,

It looks like for some of the fsm_set_and_search calls whose return
value is ignored (in fsm_search and RecordPageWithFreeSpace), there's
no (void) cast. Is that intentional? In the code base, we generally add
a (void) cast when the non-void return value of a function is ignored.

Thoughts?

With Regards,
Bharath Rupireddy.




Re: pg_upgrade is failed for 'plpgsql_call_handler' handler

2021-06-03 Thread Daniel Gustafsson
> On 3 Jun 2021, at 11:53, tushar  wrote:

> In one of my testing scenarios, I found that pg_upgrade fails for the
> 'plpgsql_call_handler' handler

This isn't really a pg_upgrade issue but a pg_dump issue.  The handler, inline,
and validator functions will be looked up among the functions loaded into
pg_dump and included in the CREATE LANGUAGE statement.  However, if they are
in pg_catalog then they won't be found (pg_catalog is excluded in getFuncs) and
a bare CREATE LANGUAGE statement will be emitted.  This bare statement will
then be interpreted as CREATE EXTENSION.
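The lookup behavior described can be sketched roughly like this (a hypothetical helper, not pg_dump's actual code):

```python
def dump_language_stmt(lang_name, handler, dumped_funcs):
    """Model of the behavior described above: if the handler was loaded into
    pg_dump's function list, emit a full CREATE LANGUAGE statement; if it
    lives in pg_catalog (which getFuncs skips), emit a bare statement, which
    the server then interprets as CREATE EXTENSION."""
    if handler in dumped_funcs:
        return f'CREATE LANGUAGE "{lang_name}" HANDLER {handler};'
    return f'CREATE OR REPLACE PROCEDURAL LANGUAGE "{lang_name}";'
```

In the reported case plpgsql_call_handler lives in pg_catalog, so the bare form is emitted, and restore fails looking for a plspl_sm.control extension file.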

This has been intentional since the language template work in 8.1; before
that, pg_dump would look up support functions in pg_catalog.

--
Daniel Gustafsson   https://vmware.com/





Re: Are we missing (void) when return value of fsm_set_and_search is ignored?

2021-06-03 Thread Dilip Kumar
On Thu, Jun 3, 2021 at 4:24 PM Bharath Rupireddy
 wrote:
>
> Hi,
>
> It looks like for some of the fsm_set_and_search calls whose return
> value is ignored (in fsm_search and RecordPageWithFreeSpace), there's
> no (void). Is it intentional?

Basically, fsm_set_and_search serves both "set" and "search", but it
only searches if "minValue" is > 0.  So if minValue is passed as 0 then
the return value is intentionally ignored.  In both places where the
returned value is ignored, minValue is passed as 0.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Are we missing (void) when return value of fsm_set_and_search is ignored?

2021-06-03 Thread Bharath Rupireddy
On Thu, Jun 3, 2021 at 4:47 PM Dilip Kumar  wrote:
>
> On Thu, Jun 3, 2021 at 4:24 PM Bharath Rupireddy
>  wrote:
> >
> > Hi,
> >
> > It looks like for some of the fsm_set_and_search calls whose return
> > value is ignored (in fsm_search and RecordPageWithFreeSpace), there's
> > no (void). Is it intentional?
>
> Basically, fsm_set_and_search serves both "set" and "search", but it
> only searches if "minValue" is > 0.  So if minValue is passed as 0 then
> the return value is intentionally ignored.  In both places where the
> returned value is ignored, minValue is passed as 0.

Thanks. I know why we are ignoring the return value. I was trying to
say that when we ignore (for whatever reason) the return value of any
non-void function, we do something like the below, right?

(void) fsm_set_and_search(rel, addr, slot, new_cat, 0);

instead of

fsm_set_and_search(rel, addr, slot, new_cat, 0);

With Regards,
Bharath Rupireddy.




Re: [BUG]Update Toast data failure in logical replication

2021-06-03 Thread Dilip Kumar
On Wed, Jun 2, 2021 at 7:23 PM Dilip Kumar  wrote:
>
> On Wed, Jun 2, 2021 at 7:20 PM Kuntal Ghosh  
> wrote:
> >
> > On Wed, Jun 2, 2021 at 3:10 PM Dilip Kumar  wrote:
> > >
> > > Yes, you are right.  I will change it in the next version, along with
> > > the test case.
> > >
> > +/*
> > + * if the key hasn't changed and we're only logging the key, we're 
> > done.
> > + * But if tuple has external data then we might have to detoast the 
> > key.
> > + */
> > This doesn't really mention why we need to detoast the key even when
> > the key remains the same. I guess we can add some more details here.
>
> Noted, let's see what others have to say about fixing this, then I
> will fix this along with one other pending comment and I will also add
> the test case.  Thanks for looking into this.

I have fixed all the pending issues. I see there is already a test
case for this, so I have updated its expected output.
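For reviewers, the decision flow in the attached patch boils down to this (a simplified model of ExtractReplicaIdentity()'s new logic, not the C code itself):

```python
def log_old_key(key_changed, tuple_has_external, key_has_external):
    """Returns whether an old-key tuple must be logged for replica identity.

    Per the patch: when the key is unchanged AND the tuple has no external
    (TOASTed) data, nothing is logged; if the key columns themselves are
    stored externally, they must be detoasted and logged even when the key
    is unchanged; if only non-key columns were external and the key is
    unchanged, the flattened key tuple is freed and nothing is logged.
    """
    if not key_changed and not tuple_has_external:
        return False
    if key_has_external:
        return True
    return key_changed
```

The changed test_decoding output above reflects the second case: an unchanged but externally stored key now appears as old-key data in the decoded UPDATE.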

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From fe1b9131c6c53a6335dd4fd7ceb5f8c18aacdba9 Mon Sep 17 00:00:00 2001
From: Dilip Kumar 
Date: Mon, 31 May 2021 14:37:26 +0530
Subject: [PATCH v3] Extract unchanged replica identity key if it is stored
 externally

If the replica identity is set to key and the key is not modified, we don't log
the key separately because it is logged along with the updated tuple.  But if
the key is stored externally, we must detoast it and log it separately.
---
 contrib/test_decoding/expected/toast.out |  2 +-
 src/backend/access/heap/heapam.c | 20 ++--
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/contrib/test_decoding/expected/toast.out b/contrib/test_decoding/expected/toast.out
index 75c4d22..769ab0e 100644
--- a/contrib/test_decoding/expected/toast.out
+++ b/contrib/test_decoding/expected/toast.out
@@ -77,7 +77,7 @@ SELECT substr(data, 1, 200) FROM pg_logical_slot_get_changes('regression_slot',
  table public.toasted_key: INSERT: id[integer]:1 toasted_key[text]:'1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123
  COMMIT
  BEGIN
- table public.toasted_key: UPDATE: id[integer]:1 toasted_key[text]:unchanged-toast-datum toasted_col1[text]:unchanged-toast-datum toasted_col2[text]:'987654321098765432109876543210987654321098765432109
+ table public.toasted_key: UPDATE: old-key: toasted_key[text]:'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678
  COMMIT
  BEGIN
  table public.toasted_key: UPDATE: old-key: toasted_key[text]:'123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 2433998..afd775e 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -8339,8 +8339,14 @@ ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool key_changed,
 		return tp;
 	}
 
-	/* if the key hasn't changed and we're only logging the key, we're done */
-	if (!key_changed)
+	/*
+	 * If the key hasn't changed and we're only logging the key, we're done.
+	 * But if the tuple has external data, the key itself might be stored
+	 * externally; in that case we need to extract its value, because
+	 * externally stored key data will not be logged with the main tuple and
+	 * must be logged as the old key tuple for the replica identity.
+	 */
+	if ((!key_changed) && !HeapTupleHasExternal(tp))
 		return NULL;
 
 	/* find out the replica identity columns */
@@ -8391,6 +8397,16 @@ ExtractReplicaIdentity(Relation relation, HeapTuple tp, bool key_changed,
 		key_tuple = toast_flatten_tuple(oldtup, desc);
 		heap_freetuple(oldtup);
 	}
+	/*
+	 * If the key tuple doesn't have any external data and the key has not
+	 * changed, just free the key tuple and return NULL.
+	 */
+	else if (!key_changed)
+	{
+		heap_freetuple(key_tuple);
+		*copy = false;
+		return NULL;
+	}
 
 	return key_tuple;
 }
-- 
1.8.3.1



Re: Skip partition tuple routing with constant partition key

2021-06-03 Thread Amit Langote
Hou-san,

On Tue, Jun 1, 2021 at 5:43 PM houzj.f...@fujitsu.com
 wrote:
> From: Amit Langote 
> > > > Where are you thinking to cache the partidx list?  Inside
> > > > PartitionDesc or some executor struct?
> > >
> > > I was thinking cache the partidx list in PartitionDescData which is in
> > > relcache, if possible, we can use the cached partition between statements.
> >
> > Ah, okay.  I thought you were talking about a different idea.
> > How and where would you determine that a cached partidx value is indeed the 
> > correct one for
> > a given row?
> > Anyway, do you want to try writing a patch to see how it might work?
>
> Yeah, the different idea here is to see if it is possible to share the cached
> partition info between statements efficiently.
>
> But, after some research, I found something not as expected:

Thanks for investigating this.

> Currently, we try to use ExecPartitionCheck to check if the cached
> partition is the correct one. And if we want to share the cached partition
> between statements, we need to invoke ExecPartitionCheck for single-row
> INSERTs, but the first ExecPartitionCheck call will need to build the
> expression state tree for the partition. From some simple performance
> tests, the cost of building the state tree could exceed the savings from
> the cached partition, which could cause a performance regression.

Yeah, using the executor in the lower layer will defeat the whole
point of caching in that layer.

> So, if we want to share the cached partition between statements, it seems
> we cannot use ExecPartitionCheck. Instead, I tried directly invoking the
> partition support function (partsupfunc) to check if the cached info is
> correct. In this approach I cache the *bound offset* in PartitionDescData,
> and we can use the bound offset to get the bound datum from
> PartitionBoundInfoData and invoke the partsupfunc to do the check.
>
> Attached is a POC patch, just to share an idea about sharing cached
> partition info between statements.

I have not looked at your patch yet, but yeah, that's how I would
imagine doing it.
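The cached-offset idea under discussion could look roughly like this for range partitioning (a sketch under heavy simplifying assumptions — a single comparable key, no NULLs — not the actual POC patch):

```python
import bisect

def route_row(key, bounds, cached_offset=None):
    """Find the partition index for `key`, given sorted upper `bounds`
    (partition i covers [bounds[i-1], bounds[i])).

    If a cached bound offset is supplied, recheck it directly with the
    comparison (standing in for partsupfunc) and skip the binary search
    when it still matches; otherwise fall back to the full search.
    """
    def in_partition(off):
        lower = bounds[off - 1] if off > 0 else float("-inf")
        return lower <= key < bounds[off]

    if cached_offset is not None and cached_offset < len(bounds) \
            and in_partition(cached_offset):
        return cached_offset  # cache hit: one comparison instead of a search
    return bisect.bisect_right(bounds, key)
```

The point made above is that the recheck must avoid the executor machinery (ExecPartitionCheck), since building its expression state would cost more than the cache saves.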

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: Are we missing (void) when return value of fsm_set_and_search is ignored?

2021-06-03 Thread Julien Rouhaud
On Thu, Jun 3, 2021 at 6:54 PM Bharath Rupireddy
 wrote:
>
> It looks like for some of the fsm_set_and_search calls whose return
> value is ignored (in fsm_search and RecordPageWithFreeSpace), there's
> no (void) cast. Is that intentional? In the code base, we generally add
> a (void) cast when the non-void return value of a function is ignored.

That's a good practice, +1 for changing that.




Re: Are we missing (void) when return value of fsm_set_and_search is ignored?

2021-06-03 Thread Dilip Kumar
On Thu, Jun 3, 2021 at 5:11 PM Bharath Rupireddy
 wrote:
>
> On Thu, Jun 3, 2021 at 4:47 PM Dilip Kumar  wrote:
> >
> > On Thu, Jun 3, 2021 at 4:24 PM Bharath Rupireddy
> >  wrote:
> > >
> > > Hi,
> > >
> > > It looks like for some of the fsm_set_and_search calls whose return
> > > value is ignored (in fsm_search and RecordPageWithFreeSpace), there's
> > > no (void). Is it intentional?
> >
> > Basically, fsm_set_and_search serves both "set" and "search", but it
> > only searches if "minValue" is > 0.  So if minValue is passed as 0 then
> > the return value is intentionally ignored.  In both places where the
> > returned value is ignored, minValue is passed as 0.
>
> Thanks. I know why we are ignoring the return value. I was trying to
> say that when we ignore (for whatever reason) the return value of any
> non-void function, we do something like the below, right?
>
> (void) fsm_set_and_search(rel, addr, slot, new_cat, 0);
>
> instead of
>
> fsm_set_and_search(rel, addr, slot, new_cat, 0);

Okay, I thought you were asking whether ignoring the return value is
intentional or not.  Yeah, casting the return value to void is the
better practice when ignoring it.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Documentation missing for PGSSLCRLDIR

2021-06-03 Thread Daniel Gustafsson
> On 3 Jun 2021, at 05:13, Michael Paquier  wrote:

> I have noticed that the documentation for PGSSLCRLDIR is missing.
> That seems like an oversight in f5465fa.

+1 on applying this.

While looking at this I found another nearby oversight which needs a backport
down to 13 where it was introduced.  The PGSSLMAXPROTOCOLVERSION documentation
is linking to the minimum protocol version docs.  Fixed in the attached.

--
Daniel Gustafsson   https://vmware.com/



0001-Fix-linkend-reference-for-PGSSLMAXPROTOCOLVERSION.patch
Description: Binary data


A few source files have the wrong copyright year

2021-06-03 Thread David Rowley
I noticed earlier today when working in brin_minmax_multi.c that the
copyright year was incorrect. That caused me to wonder if any other
source files have the incorrect year.

git grep -E "Portions Copyright \(c\) ([0-9]{4}-[0-9]{4}|[0-9]{4}),
PostgreSQL Global Development Group" | grep -Ev "2021"

Seems fairly good at finding them. 14 in total.
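The same check can be sketched as a small script (an illustrative equivalent of the grep above, not the actual src/tools/copyright.pl):

```python
import re

COPYRIGHT_RE = re.compile(
    r"Portions Copyright \(c\) (?:\d{4}-)?(\d{4}),"
    r" PostgreSQL Global Development Group")

def stale_copyright(line, current_year=2021):
    """True if the line carries a PostgreSQL copyright notice whose
    (range-ending) year is not the current year."""
    m = COPYRIGHT_RE.search(line)
    return bool(m) and int(m.group(1)) != current_year
```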

The attached fixes.

I'll push this in the New Zealand morning unless anyone comes up with
a reason why I shouldn't before then.

David


fix_incorrect_copyright_years.patch
Description: Binary data


Re: A few source files have the wrong copyright year

2021-06-03 Thread Daniel Gustafsson
> On 3 Jun 2021, at 14:16, David Rowley  wrote:

> git grep -E "Portions Copyright \(c\) ([0-9]{4}-[0-9]{4}|[0-9]{4}),
> PostgreSQL Global Development Group" | grep -Ev "2021"
> 
> Seems fairly good at finding them. 14 in total.

src/tools/copyright.pl finds these as well as contrib/pageinspect/gistfuncs.c
which also ends the range in 2020, might want to include that one too when
pushing this.

--
Daniel Gustafsson   https://vmware.com/





Re: improve installation short version

2021-06-03 Thread Andrew Dunstan

On 6/2/21 5:36 PM, Tom Lane wrote:
> Andrew Dunstan  writes:
>> On 6/2/21 4:58 PM, Mark Dilger wrote:
>>> Prior to where your patch makes changes, the docs say, "If you are building 
>>> PostgreSQL for Microsoft Windows, read this chapter if you intend to build 
>>> with MinGW or Cygwin".  I think going on to tell the users to use `su` is a 
>>> bit odd.  Does that exist in standard MinGW and Cygwin environments?  I 
>>> thought "run as" was the Windows option for this.
>> Yes, good point. We should fix that. Yes, "runas" is a sort of su.
>> There's no adduser either.
> There's a whole lot of Unix systems that don't spell that command
> as "adduser", either.  That whole recipe has to be understood as
> a guide, not something you can blindly copy-and-paste.
>
> Maybe what we really need is an initial disclaimer saying something
> along the lines of "Here's approximately what you need to do; adapt
> these commands per local requirements."
>
> And then, perhaps, change the last line to "For more detail, see
> the rest of this chapter".
>
>   



Ok, patch attached


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index 3c0aa118c7..92b062bebb 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -34,22 +34,29 @@ documentation.  See standalone-profile.xsl for details.
  
   Short Version
 
+  
+   The recipe below will work on some Unix platforms. It needs to be adapted
+   to local requirements on your platform.
+  
+
   
 
 ./configure
 make
 su
+# The following commands will be executed as root
 make install
 adduser postgres
 mkdir /usr/local/pgsql/data
 chown postgres /usr/local/pgsql/data
 su - postgres
+# The following commands will be executed as postgres
 /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
 /usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -l logfile start
 /usr/local/pgsql/bin/createdb test
 /usr/local/pgsql/bin/psql test
 
-   The long version is the rest of this
+   For more detail see the rest of this
chapter.
   
  


Re: A few source files have the wrong copyright year

2021-06-03 Thread David Rowley
On Fri, 4 Jun 2021 at 00:30, Daniel Gustafsson  wrote:
> src/tools/copyright.pl finds these as well as contrib/pageinspect/gistfuncs.c
> which also ends the range in 2020, might want to include that one too when
> pushing this.

Thanks. I wasn't aware of that.

David




Re: Support for CREATE MODULE?

2021-06-03 Thread Peter Eisentraut

On 02.06.21 16:43, Jim Mlodgenski wrote:

It's already quite hard to tell which part
of a multiply.qualified.name is which, given that SQL says that you can
optionally put a "catalog" (database) name in front of the others.
I really doubt there is a way to shoehorn sub-schemas in there without
creating terrible ambiguities.  Is "a.b.c" a reference to object c in
schema b in database a, or is it a reference to object c in sub-schema b
in schema a?

That was the area I found the most difficult to reason about. I tried to make
some simplifying assumptions by checking whether "a" was the current database.
Since we don't support cross-database access, if it was not, I assumed "a"
was a schema. I'm not sure if that would be valid, but it did scope things
down to a more manageable problem.


Given that, as you said, the concept of modules is in the SQL standard, 
there is surely some guidance in there about how this is supposed to 
affect name resolution.  So let's start with that.  Maybe we won't like 
it in the end or whatever, but we should surely look there first.





Re: Duplicate history file?

2021-06-03 Thread Kyotaro Horiguchi
At Tue, 01 Jun 2021 13:03:22 +0900, Tatsuro Yamada 
 wrote in 
> Hi Horiguchi-san,
> 
> On 2021/05/31 16:58, Kyotaro Horiguchi wrote:
> > So, I started a thread for this topic diverged from the following
> > thread.
> > https://www.postgresql.org/message-id/4698027d-5c0d-098f-9a8e-8cf09e36a...@nttcom.co.jp_1
> > 
> >> So, what should we do for the user? I think we should put some notes
> >> in postgresql.conf or in the documentation. For example, something
> >> like this:
> > I'm not sure about the exact configuration you have in mind, but that
> > would happen on the cascaded standby in the case where the upstream
> > promotes. In this case, the history file for the new timeline is
> > archived twice.  walreceiver triggers archiving of the new history
> > file at the time of the promotion, then startup does the same when it
> > restores the file from archive.  Is it what you complained about?
> 
> 
> Thank you for creating a new thread and explaining this.
> We are not using cascading replication in our environment, but I think
> the situation is similar. In short, when I perform a promotion,
> the archive_command fails due to the history file.

Ah, I remembered that PG-REX starts a primary as a standby then
promotes it.

> I've created a reproduction script that includes building replication,
> and I'll share it with you. (I used Robert's test.sh as a reference
> for creating the reproduction script. Thanks)
> 
> The scenario (sr_test_historyfile.sh) is as follows.
> 
> #1 Start pgprimary as a main
> #2 Create standby
> #3 Start pgstandby as a standby
> #4 Execute archive command
> #5 Shutdown pgprimary
> #6 Start pgprimary as a standby
> #7 Promote pgprimary
> #8 Execute archive_command again, but failed since duplicate history
>file exists (see pgstandby.log)

OK, now I clearly understand what you meant.  (However, it is not a
legitimate state for a standby to be running while the primary is not.)
Anyway, the "test ! -f" can be problematic in this case.

> Note that this may not be appropriate if you consider it as a recovery
> procedure for a replication configuration. However, I'm sharing it as-is
> because this seems to be the procedure used in the customer's
> environment (PG-REX).

Understood.

> Regarding "test ! -f",
> I am wondering how many people use the test command in their
> archive_command. If I remember correctly, the guide provided by the
> NTT OSS Center that we are using does not recommend using the test
> command.

I think, as the PG-REX documentation says, a simple cp works well as
long as the PG-REX assumptions hold - that no double failure happens,
and that the instructions are followed.


On the other hand, I found that the behavior happens more generally.

If a standby with archive_mode=always crashes, it starts recovery from
the last checkpoint.  If the checkpoint was in an already-archived
segment, the restarted standby will fetch that segment from the archive
and then fail to archive it.  (See the first attached file.)

So, the fear you stated upthread applies to wider situations.  The
following suggestion is rather harmful for the archive_mode=always
setting.

https://www.postgresql.org/docs/14/continuous-archiving.html
> The archive command should generally be designed to refuse to
> overwrite any pre-existing archive file. This is an important safety
> feature to preserve the integrity of your archive in case of
> administrator error (such as sending the output of two different
> servers to the same archive directory).

I'm not sure how we should treat this.  Since the archive must store
the files actually applied to the server data, the fact that a file has
already been archived cannot by itself justify omitting archiving.  We
need to make sure the new file is byte-identical to the already-archived
version.  We could compare just the *restored* file to the same file in
pg_wal, but that might be too much of a penalty for the benefit.  (See
the second attached file.)

Otherwise the documentation would need something like the following if
we keep the current behavior.

> The archive command should generally be designed to refuse to
> overwrite any pre-existing archive file. This is an important safety
> feature to preserve the integrity of your archive in case of
> administrator error (such as sending the output of two different
> servers to the same archive directory).
+ For a standby with archive_mode=always, there are cases where the
+ same file is archived more than once.  For safety, it is recommended
+ that when the destination file exists, the archive_command return
+ zero if it is byte-identical to the source file.
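For illustration, an archive_command along those lines could look like
the following (just a sketch with a placeholder archive path, not a
vetted recipe; note that a partially-copied destination file will also
fail the cmp and thus the command, which is the safe outcome):

```
# postgresql.conf sketch: succeed when the segment is new, or when an
# identical copy has already been archived
archive_command = 'test ! -f /mnt/server/archivedir/%f && cp %p /mnt/server/archivedir/%f || cmp -s %p /mnt/server/archivedir/%f'
```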

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
# Copyright (c) 2021, PostgreSQL Global Development Group

#
# Tests related to WAL archiving and recovery.
#
use strict;
use warnings;
use PostgresNode;
use TestLib;
use Test::More tests => 1;
use Config;

my $backup_name='mybackup';

my $primary = get_new_node('primary');
$primary->init(
has_archiving=> 1,
allows_streaming

Re: Are we missing (void) when return value of fsm_set_and_search is ignored?

2021-06-03 Thread Bharath Rupireddy
On Thu, Jun 3, 2021 at 5:22 PM Julien Rouhaud  wrote:
>
> On Thu, Jun 3, 2021 at 6:54 PM Bharath Rupireddy
>  wrote:
> >
> > It looks like for some of the fsm_set_and_search calls whose return
> > value is ignored (in fsm_search and RecordPageWithFreeSpace), there's
> > no (void). Is it intentional? In the code base, we generally have
> > (void) when non-void return value of a function is ignored.
>
> That's a good practice, +1 for changing that.

Thanks. PSA v1 patch.

With Regards,
Bharath Rupireddy.


v1-0001-Use-void-when-return-value-of-fsm_set_and_search-.patch
Description: Binary data


Re: Are we missing (void) when return value of fsm_set_and_search is ignored?

2021-06-03 Thread Peter Eisentraut

On 03.06.21 12:54, Bharath Rupireddy wrote:

It looks like for some of the fsm_set_and_search calls whose return
value is ignored (in fsm_search and RecordPageWithFreeSpace), there's
no (void). Is it intentional? In the code base, we generally have
(void) when non-void return value of a function is ignored.


I don't think that is correct.  I don't see anyone writing

(void) printf(...);

so this is not a generally applicable strategy.

We have pg_nodiscard for functions where you explicitly want callers to 
check the return value.  In all other cases, callers are free to ignore 
return values.





Re: improve installation short version

2021-06-03 Thread Peter Eisentraut

On 02.06.21 23:27, Andrew Dunstan wrote:



It's not the fault of your patch, but the docs seem to be misleading in ways 
that your comments don't fix.  (Or perhaps my Windows knowledge is just too 
lacking to realize why these instructions are ok?)

Prior to where your patch makes changes, the docs say, "If you are building PostgreSQL for 
Microsoft Windows, read this chapter if you intend to build with MinGW or Cygwin".  I think 
going on to tell the users to use `su` is a bit odd.  Does that exist in standard MinGW and Cygwin 
environments?  I thought "run as" was the Windows option for this.



Yes, good point. We should fix that. Yes, "runas" is a sort of su.
There's no adduser either.


I think those instructions were written before "sudo" became common. 
Maybe we should update them a bit.





Re: Fixup some appendStringInfo and appendPQExpBuffer calls

2021-06-03 Thread Peter Eisentraut

On 03.06.21 06:17, Tom Lane wrote:

David Rowley  writes:

There are quite a few other places in that file that should be using
DatumGetCString() instead of DatumGetPointer().
Should we fix those too for PG14?


+1.  I'm surprised we are not getting compiler warnings.


Well, DatumGetPointer() returns Pointer, and Pointer is char *.




Re: parent foreign tables and row marks

2021-06-03 Thread Amit Langote
On Thu, Jun 3, 2021 at 10:07 AM Amit Langote  wrote:
> On Thu, Jun 3, 2021 at 3:39 AM Tom Lane  wrote:
> >
> > Amit Langote  writes:
> > > I noticed that 428b260f87e8 (v12) broke the cases where a parent
> > > foreign table has row marks assigned.
> >
> > Indeed :-(.  Fix pushed.  I tweaked the comments and test case slightly.
>
> Thank you.

Ah, I had forgotten to propose that we replace the following in the
preprocess_targetlist()'s row marks loop:

/* child rels use the same junk attrs as their parents */
if (rc->rti != rc->prti)
continue;

by an Assert as follows:

+   /* No child row marks yet. */
+   Assert(rc->rti == rc->prti);

I think the only place that sets prti that is != rti of a row mark is
expand_single_inheritance_child() and we can be sure that that
function now always runs after preprocess_targetlist() has run.
Attached a patch.

Thoughts?

-- 
Amit Langote
EDB: http://www.enterprisedb.com


no-child-rowmarks.patch
Description: Binary data


Re: speed up verifying UTF-8

2021-06-03 Thread Heikki Linnakangas

On 02/06/2021 19:26, John Naylor wrote:
For v10, I've split the patch up into two parts. 0001 uses pure C 
everywhere. This is much smaller and easier to review, and gets us the 
most bang for the buck.


One concern Heikki raised upthread is that platforms with poor 
unaligned-memory access will see a regression. We could easily add an 
#ifdef to take care of that, but I haven't done so here.


To recap: On ascii-only input with storage taken out of the picture,
profiles of COPY FROM show a reduction from nearly 10% down to just over
1%. In microbenchmarks found earlier in this thread, this works out to
about 7 times faster. On multibyte/mixed input, 0001 is a bit faster, 
but not really enough to make a difference in copy performance.


Nice!

This kind of bit-twiddling is fun, so I couldn't resist tinkering with 
it, to see if we can shave some more instructions from it:



+/* from https://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord */
+#define HAS_ZERO(chunk) ( \
+	((chunk) - UINT64CONST(0x0101010101010101)) & \
+	~(chunk) & \
+	UINT64CONST(0x8080808080808080))
+
+/* Verify a chunk of bytes for valid ASCII including a zero-byte check. */
+static inline int
+check_ascii(const unsigned char *s, int len)
+{
+	uint64		half1,
+				half2,
+				highbits_set;
+
+	if (len >= 2 * sizeof(uint64))
+	{
+		memcpy(&half1, s, sizeof(uint64));
+		memcpy(&half2, s + sizeof(uint64), sizeof(uint64));
+
+		/* If there are zero bytes, bail and let the slow path handle it. */
+		if (HAS_ZERO(half1) || HAS_ZERO(half2))
+			return 0;
+
+		/* Check if any bytes in this chunk have the high bit set. */
+		highbits_set = ((half1 | half2) & UINT64CONST(0x8080808080808080));
+
+		if (!highbits_set)
+			return 2 * sizeof(uint64);
+		else
+			return 0;
+	}
+	else
+		return 0;
+}


Some ideas:

1. Better to check if any high bits are set first. We care more about 
the speed of that than of detecting zero bytes, because input with high 
bits is valid but zeros are an error.


2. Since we check that there are no high bits, we can do the zero-checks 
with fewer instructions like this:


/* NB: this is only correct if 'chunk' doesn't have any high bits set */
#define HAS_ZERO(chunk) \
  ((((chunk) + \
     UINT64CONST(0x7f7f7f7f7f7f7f7f)) & \
    UINT64CONST(0x8080808080808080)) != UINT64CONST(0x8080808080808080))

3. It's probably cheaper to perform the HAS_ZERO check just once on
(half1 | half2). We have to compute (half1 | half2) anyway.



Putting all that together:

/* Verify a chunk of bytes for valid ASCII including a zero-byte check. */
static inline int
check_ascii(const unsigned char *s, int len)
{
	uint64		half1,
				half2,
				highbits_set;
	uint64		x;

	if (len >= 2 * sizeof(uint64))
	{
		memcpy(&half1, s, sizeof(uint64));
		memcpy(&half2, s + sizeof(uint64), sizeof(uint64));

		/* Check if any bytes in this chunk have the high bit set. */
		highbits_set = ((half1 | half2) & UINT64CONST(0x8080808080808080));
		if (highbits_set)
			return 0;

		/*
		 * Check if there are any zero bytes in this chunk. This is only
		 * correct if there are no high bits set, but we checked that
		 * already.
		 */
		x = (half1 | half2) + UINT64CONST(0x7f7f7f7f7f7f7f7f);
		x &= UINT64CONST(0x8080808080808080);
		if (x != UINT64CONST(0x8080808080808080))
			return 0;

		return 2 * sizeof(uint64);
	}
	else
		return 0;
}

In quick testing, that indeed compiles into fewer instructions. With 
GCC, there's no measurable difference in performance. But with clang, 
this version is much faster than the original, because the original 
version is much slower than when compiled with GCC. In other words, this 
version seems to avoid some clang misoptimization. I tested only with 
ASCII input, I haven't tried other cases.


What test set have you been using for performance testing this? I'd like 
to know how this version compares, and I could also try running it on my 
old raspberry pi, which is more strict about alignment.


0002 adds the SSE4 implementation on x86-64, and is equally fast on all 
input, at the cost of greater complexity.


Didn't look closely, but seems reasonable at a quick glance.

- Heikki




missing GRANT on pg_subscription columns

2021-06-03 Thread Euler Taveira
Hi,

I was checking the GRANT on pg_subscription and noticed that the command is not
correct. There is a comment that says "All columns of pg_subscription except
subconninfo are readable". However, there are columns that aren't included: oid
and subsynccommit. It seems an oversight in the commits 6f236e1eb8c and
887227a1cc8.

There are monitoring tools and data collectors that aren't using a
superuser to read catalog information (I usually recommend using pg_monitor).
Hence, you cannot join pg_subscription with relations such as
pg_subscription_rel or pg_stat_subscription because column oid has no
column-level privilege. I'm attaching a patch to fix it (indeed, 2 patches
because of additional columns for v14). We should add instructions in the minor
version release notes too.

This issue was reported by Israel Barth.


--
Euler Taveira
EDB   https://www.enterprisedb.com/
From 64b138b9974934f689e57fc34d370424b2a348a9 Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Mon, 31 May 2021 19:40:36 -0300
Subject: [PATCH] Grant read privilege to additional pg_subscription columns

pg_subscription should be read by PUBLIC except subconninfo column.
Documentation is correct but the GRANT command is not. Columns oid and
subsynccommit don't have the right privileges. It seems an oversight in
the commits 6f236e1eb8c and 887227a1cc8. The current behavior prohibits
joins between pg_subscription and related tables (pg_subscription_rel
and pg_stat_subscription) for non-superusers.
---
 src/backend/catalog/system_views.sql | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 5c84d758bb..5088e7f1d5 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1254,5 +1254,5 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
 
 -- All columns of pg_subscription except subconninfo are readable.
 REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (subdbid, subname, subowner, subenabled, subbinary, substream, subslotname, subpublications)
+GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subbinary, substream, subslotname, subsynccommit, subpublications)
 ON pg_subscription TO public;
-- 
2.20.1

From d54165988c69a021962adaad6e2e31f80f4ad85c Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Tue, 1 Jun 2021 11:17:26 -0300
Subject: [PATCH] Grant read privilege to additional pg_subscription columns

pg_subscription should be read by PUBLIC except subconninfo column.
Documentation is correct but the GRANT command is not. Columns oid and
subsynccommit don't have the right privileges. It seems an oversight in
the commits 6f236e1eb8c and 887227a1cc8. The current behavior prohibits
joins between pg_subscription and related tables (pg_subscription_rel
and pg_stat_subscription) for non-superusers.
---
 src/backend/catalog/system_views.sql | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 51d738cc42..eb363c9ade 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -1122,7 +1122,7 @@ REVOKE ALL ON pg_replication_origin_status FROM public;
 
 -- All columns of pg_subscription except subconninfo are readable.
 REVOKE ALL ON pg_subscription FROM public;
-GRANT SELECT (subdbid, subname, subowner, subenabled, subslotname, subpublications)
+GRANT SELECT (oid, subdbid, subname, subowner, subenabled, subslotname, subsynccommit, subpublications)
 ON pg_subscription TO public;
 
 
-- 
2.20.1



Re: Failures with gcd functions with GCC snapshots GCC and -O3 (?)

2021-06-03 Thread Tom Lane
Michael Paquier  writes:
> serinus has been complaining about the new gcd functions in 13~:

moonjelly, which also runs a bleeding-edge gcc, started to fail the same
way at about the same time.  Given that none of our code in that area
has changed, it's hard to think it's anything but a broken compiler.
Maybe somebody should report that to gcc upstream?

regards, tom lane




Re: pg_upgrade is failed for 'plpgsql_call_handler' handler

2021-06-03 Thread Tom Lane
Daniel Gustafsson  writes:
>> On 3 Jun 2021, at 11:53, tushar  wrote:
>> In one of my testing scenario, i found pg_upgrade is failed for 
>> 'plpgsql_call_handler' handle

> This is intentional since the language template work in 8.1, before then
> pg_dump would look up support functions in pg_catalog.

I don't see any particular need to support reaching inside the guts
of another PL language implementation, as this test case does.
We'd be perfectly within our rights to rename plpgsql_call_handler
as something else; that should be nobody's business except that
of the plpgsql extension.

But yeah, the behavior you're seeing here is intended to support
normally-packaged languages.  pg_dump won't ordinarily dump objects
in pg_catalog, because it assumes stuff in pg_catalog is to
be treated as built-in.

regards, tom lane




Re: pg_upgrade is failed for 'plpgsql_call_handler' handler

2021-06-03 Thread Daniel Gustafsson
> On 3 Jun 2021, at 16:12, Tom Lane  wrote:
> 
> Daniel Gustafsson  writes:
>>> On 3 Jun 2021, at 11:53, tushar  wrote:
>>> In one of my testing scenario, i found pg_upgrade is failed for 
>>> 'plpgsql_call_handler' handle
> 
>> This is intentional since the language template work in 8.1, before then
>> pg_dump would look up support functions in pg_catalog.
> 
> I don't see any particular need to support reaching inside the guts
> of another PL language implementation, as this test case does.
> We'd be perfectly within our rights to rename plpgsql_call_handler
> as something else; that should be nobody's business except that
> of the plpgsql extension.
> 
> But yeah, the behavior you're seeing here is intended to support
> normally-packaged languages.  pg_dump won't ordinarily dump objects
> in pg_catalog, because it assumes stuff in pg_catalog is to
> be treated as built-in.

Agreed, I don't think there is anything we could/should do here (the lack of
complaints in the archives back that up).

--
Daniel Gustafsson   https://vmware.com/





Re: speed up verifying UTF-8

2021-06-03 Thread Greg Stark
> 3. It's probably cheaper perform the HAS_ZERO check just once on (half1
| half2). We have to compute (half1 | half2) anyway.

Wouldn't you have to check (half1 & half2) ?




Re: speed up verifying UTF-8

2021-06-03 Thread Greg Stark
I haven't looked at the surrounding code. Are we processing all the
COPY data in one long stream or processing each field individually? If
we're processing much more than 128 bits and happy to detect NUL
errors only at the end after wasting some work then you could hoist
that has_zero check entirely out of the loop (removing the branch
though it's probably a correctly predicted branch anyways).

Do something like:

zero_accumulator = zero_accumulator & next_chunk

in the loop and then only at the very end check for zeros in that.




Re: Multi-Column List Partitioning

2021-06-03 Thread Nitin Jadhav
> Approach 1 sounds better.  It sounds like approach 1 might help us
> implement support for allowing NULLs in range partition bounds in the
> future, if at all.  For now, it might be better to not allocate the
> isnull array except for list partitioning.

Thanks for confirming.

> I'll wait for you to post a new patch addressing at least the comments
> in my earlier email.  Also, please make sure to run `make check`
> successfully before posting the patch. :)

I have fixed all of the review comments given by you and Jeevan in the
attached patch and also the attached patch contains more changes
compared to the previous patch. Following are the implementation
details.

1. Regarding syntax, the existing syntax will work fine for the
single-column list partitioning. However I have used the new syntax
for the multi-column list partitioning as we discussed earlier. I have
used a combination of 'AND' and 'OR' logic for the partition
constraints as given in the below example.

postgres@17503=#create table t(a int, b text) partition by list(a,b);
CREATE TABLE
postgres@17503=#create table t1 partition of t for values in ((1,'a'),
(NULL,'b'));
CREATE TABLE
postgres@17503=#\d+ t
                                       Partitioned table "public.t"
 Column |  Type   | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
--------+---------+-----------+----------+---------+----------+-------------+--------------+-------------
 a      | integer |           |          |         | plain    |             |              |
 b      | text    |           |          |         | extended |             |              |
Partition key: LIST (a, b)
Partitions: t1 FOR VALUES IN ((1, 'a'), (NULL, 'b'))

postgres@17503=#\d+ t1
                                             Table "public.t1"
 Column |  Type   | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
--------+---------+-----------+----------+---------+----------+-------------+--------------+-------------
 a      | integer |           |          |         | plain    |             |              |
 b      | text    |           |          |         | extended |             |              |
Partition of: t FOR VALUES IN ((1, 'a'), (NULL, 'b'))
Partition constraint: (((a = 1) AND (b = 'a'::text)) OR ((a IS NULL) AND (b = 'b'::text)))
Access method: heap

2. In the existing code, NULL values were handled differently: a NULL
value was not added to the 'datums' array; instead, the partition
index was stored directly in the 'null_index' variable. Now there is a
possibility of multiple NULL values, so I have introduced a new member
'isnulls' in the 'PartitionBoundInfoData' struct which indicates
whether the corresponding element in 'datums' is NULL. 'null_index'
can no longer be used to store the partition index directly, so I
removed it and made the necessary changes in multiple places.

3. I have added test cases for 'create table' and 'insert' statements
related to multi-column list partitioning and these are working fine
with 'make check'.

4. Handled the partition pruning code to accommodate these changes for
single-column list partitioning. However it is pending for
multi-column list partitioning.

5. I have done necessary changes in partition wise join related code
to accommodate for single-column list partitioning. However it is
pending for multi-column list partitioning.

Kindly review the patch and let me know if any changes are required.

Pending items:
1. Support of partition pruning for multi-column list partitioning.
2. Support of partition wise join for multi-column list partitioning.

I will continue to work on the above 2 items.
Kindly let me know if I am missing something.

Thanks & Regards,
Nitin Jadhav


On Wed, May 26, 2021 at 10:27 AM Amit Langote  wrote:
>
> On Sun, May 23, 2021 at 6:49 PM Nitin Jadhav
>  wrote:
> > > IMO, it is not such a bad syntax from a user's PoV.  It's not hard to
> > > understand from this syntax that the partition constraint is something
> > > like (a, b) = (1, 2) OR (a, b) = (1, 5) OR ..., where the = performs
> > > row-wise comparison.
> >
> > Thanks for suggesting to use row-wise comparison.
>
> Actually, I was just describing how the *users* may want to visualize
> the partition constraint...
>
> > I have few queries
> > with respect to handling of NULL values.
> >
> > 1. What should be the partition constraint for the above case. AFAIK,
> > row-wise comparison wont work with NULL values as shown in [1]. I mean
> > two rows are considered equal if all their corresponding members are
> > non-null and equal. The rows are unequal if any corresponding members
> > are non-null and unequal. Otherwise the result of the row comparison
> > is unknown (null). So we should generate different types of
> > constraints for NULL values.
> >
> > Ex:
> > CREATE TABLE t(a int, b int) PARTITION BY LIST(a,b);
> > CREATE TABLE t_1 PARTITION OF t FOR VALUES IN (1, 1), (1, NULL),
> > (NULL, 1), (NULL, NULL);
> >
> > As per my knowledge

Re: Asynchronous Append on postgres_fdw nodes.

2021-06-03 Thread Andrey Lepikhov

On 3/6/21 14:49, Etsuro Fujita wrote:

On Tue, May 11, 2021 at 6:55 PM Etsuro Fujita  wrote:

On Tue, May 11, 2021 at 6:27 PM Andrey Lepikhov
 wrote:

On 11/5/21 12:24, Etsuro Fujita wrote:



  ->  Append (actual rows=3000 loops=1)
->  Async Foreign Scan on f1 (actual rows=0 loops=1)
->  Async Foreign Scan on f2 (actual rows=0 loops=1)
->  Foreign Scan on f3 (actual rows=3000 loops=1)

Here we give preference to the synchronous scan. Why?


This would be expected behavior, and the reason is avoid performance
degradation; you might think it would be better to execute the async
Foreign Scan nodes more aggressively, but it would require
waiting/polling for file descriptor events many times, which is
expensive and might cause performance degradation.  I think there is
room for improvement, though.

Yes, I agree with you. Maybe you can add a note in the documentation on
async_capable, for example:
"... Synchronous and asynchronous scanning strategies can be mixed by
the optimizer in one scan plan of a partitioned table or a 'UNION ALL'
command. For performance reasons, synchronous scans execute before the
first async scan. ..."


+1  But I think this is an independent issue, so I think it would be
better to address the issue separately.


I think that since postgres-fdw.sgml would be for users rather than
developers, unlike fdwhandler.sgml, it would be better to explain this
more in a not-too-technical way.  So how about something like this?

Asynchronous execution is applied even when an Append node contains
subplan(s) executed synchronously as well as subplan(s) executed
asynchronously.  In that case, if the asynchronous subplans are ones
executed using postgres_fdw, tuples from the asynchronous subplans are
not returned until after at least one synchronous subplan returns all
tuples, as that subplan is executed while the asynchronous subplans
are waiting for the results of queries sent to foreign servers.  This
behavior might change in a future release.

Good, this text is clear for me.

--
regards,
Andrey Lepikhov
Postgres Professional




Re: speed up verifying UTF-8

2021-06-03 Thread John Naylor
On Thu, Jun 3, 2021 at 10:42 AM Greg Stark  wrote:
>
> I haven't looked at the surrounding code. Are we processing all the
> COPY data in one long stream or processing each field individually?

It happens on 64kB chunks.

> If
> we're processing much more than 128 bits and happy to detect NUL
> errors only at the end after wasting some work then you could hoist
> that has_zero check entirely out of the loop (removing the branch
> though it's probably a correctly predicted branch anyways).
>
> Do something like:
>
> zero_accumulator = zero_accumulator & next_chunk
>
> in the loop and then only at the very end check for zeros in that.

That's the approach taken in the SSE4 patch, and in fact that's the logical
way to do it there. I hadn't considered doing it that way in the pure C
case, but I think it's worth trying.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: speed up verifying UTF-8

2021-06-03 Thread John Naylor
I wrote:

> On Thu, Jun 3, 2021 at 10:42 AM Greg Stark  wrote:
> >

> > If
> > we're processing much more than 128 bits and happy to detect NUL
> > errors only at the end after wasting some work then you could hoist
> > that has_zero check entirely out of the loop (removing the branch
> > though it's probably a correctly predicted branch anyways).
> >
> > Do something like:
> >
> > zero_accumulator = zero_accumulator & next_chunk
> >
> > in the loop and then only at the very end check for zeros in that.
>
> That's the approach taken in the SSE4 patch, and in fact that's the
logical way to do it there. I hadn't considered doing it that way in the
pure C case, but I think it's worth trying.

Actually, I spoke too quickly. We can't have an error accumulator in the C
case because we need to return how many bytes were valid. In fact, in the
SSE case, it checks the error vector at the end and then reruns with the
fallback case to count the valid bytes.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: security_definer_search_path GUC

2021-06-03 Thread Pavel Stehule
čt 3. 6. 2021 v 8:14 odesílatel Joel Jacobson  napsal:

> On Thu, Jun 3, 2021, at 00:55, Marko Tiikkaja wrote:
>
> On Wed, Jun 2, 2021 at 11:32 PM Joel Jacobson  wrote:
>
> But if running a recent PostgreSQL version, with support for extensions, I
> think an even cleaner solution
> would be to package such compatibility versions in a "compat" extension,
> that would just install them into the public schema.
>
>
> Writing, verifying and shipping extension upgrade scripts is not pleasant.
>
>
> I agree. Thanks for acknowledging this problem.
>
> I'm experimenting with an idea that I hope can simplify the "verifying"
> part of the problem.
> hope to have something to show you all soon.
>
> I'd much prefer something that's integrated to the workflow I already have.
>
>
> Fair point. I guess also the initial switching cost of changing workflow
> is quite high and difficult to motivate. So even if extensions ergonomics
> are improved, many existing users will not migrate their workflows anyway
> due to this.
>
>
>
> And if you wonder what functions in public come from the compat extension,
> you can use use \dx+.
>
>
> They still show up everywhere when looking at "public".  So this is only
> slightly better, and a maintenance burden.
>
>
> Good point. I find this annoying as well sometimes.
>
> It's easy to get a list of all objects for an extension, via \dx+
>
> But it's hard to see what objects in a schema, that are provided by
> different extensions, via e.g. \df public.*
>
> What about adding a new "Extension" column next to "Schema" to the
> relevant commands, such as \df?
>

I think that for \df+ it can be very useful. I don't think it is
important enough to be in the short form, but it can be nice in the
enhanced form.

Pavel


> /Joel
>


Re: security_definer_search_path GUC

2021-06-03 Thread Marko Tiikkaja
On Thu, Jun 3, 2021 at 9:14 AM Joel Jacobson  wrote:

> On Thu, Jun 3, 2021, at 00:55, Marko Tiikkaja wrote:
>
> They still show up everywhere when looking at "public".  So this is only
> slightly better, and a maintenance burden.
>
>
> Good point. I find this annoying as well sometimes.
>
> It's easy to get a list of all objects for an extension, via \dx+
>
> But it's hard to see what objects in a schema, that are provided by
> different extensions, via e.g. \df public.*
>
> What about adding a new "Extension" column next to "Schema" to the
> relevant commands, such as \df?
>

That's just one part of it.  The other part of my original proposal was to
avoid having to  SET search_path  for all SECURITY DEFINER functions.  I
still think either being able to lock search_path or the separate prosecdef
search_path is the best option here.


.m


Re: security_definer_search_path GUC

2021-06-03 Thread Pavel Stehule
čt 3. 6. 2021 v 17:54 odesílatel Marko Tiikkaja  napsal:

> On Thu, Jun 3, 2021 at 9:14 AM Joel Jacobson  wrote:
>
>> On Thu, Jun 3, 2021, at 00:55, Marko Tiikkaja wrote:
>>
>> They still show up everywhere when looking at "public".  So this is only
>> slightly better, and a maintenance burden.
>>
>>
>> Good point. I find this annoying as well sometimes.
>>
>> It's easy to get a list of all objects for an extension, via \dx+
>>
>> But it's hard to see what objects in a schema, that are provided by
>> different extensions, via e.g. \df public.*
>>
>> What about adding a new "Extension" column next to "Schema" to the
>> relevant commands, such as \df?
>>
>
> That's just one part of it.  The other part of my original proposal was to
> avoid having to  SET search_path  for all SECURITY DEFINER functions.  I
> still think either being able to lock search_path or the separate prosecdef
> search_path is the best option here.
>

I agree that some way of locking search_path, or of controlling who can
change it and when, could increase security. This should be a core
feature. It's maybe a more generic issue - the same functionality could
be needed for the work_mem setting, maybe
max_parallel_workers_per_gather, and other GUCs

Regards

Pavel

>
>
> .m
>


Re: Move pg_attribute.attcompression to earlier in struct for reduced size?

2021-06-03 Thread Alvaro Herrera
On 2021-Jun-02, Michael Paquier wrote:

> After 12 runs of VACUUM FULL on my laptop, I have removed the two
> highest and the two lowest to remove some noise, and did an average of
> the rest:
> - HEAD, 100 text columns: 5720ms
> - REL_13_STABLE, 100 text columns: 4308ms
> - HEAD, 200 text columns: 10020ms
> - REL_13_STABLE, 200 text columns:  8319ms
> 
> So yes, that looks much visible to me, and an argument in favor of the
> removal of the forced recompression on HEAD when rewriting tuples.

Just to be clear -- that's the time to vacuum the table without changing
the compression algorithm, right?  So the overhead is just the check for
whether the recompression is needed, not the recompression itself?

If the check for recompression is this expensive, then yeah I agree that
we should take it out.  If recompression is actually occurring, then I
don't think this is a good test :-)

-- 
Álvaro Herrera   Valdivia, Chile




Re: security_definer_search_path GUC

2021-06-03 Thread Mark Dilger



> On Jun 3, 2021, at 9:03 AM, Pavel Stehule  wrote:
> 
> I agree so some possibility of locking search_path or possibility to control 
> who and when can change it can increase security. This should be a core 
> feature. It's maybe more generic issue - same functionality can be required 
> for work_mem setting, maybe max_paralel_workers_per_gather, and other GUC

Chapman already suggested a mechanism in [1] to allow chaining together 
additional validators for GUCs.

When setting search_path, the check_search_path(char **newval, void **extra, 
GucSource source) function is invoked.  As I understand Chapman's proposal, 
additional validators could be added to any GUC.  You could implement 
search_path restrictions by defining additional validators that enforce 
whatever restriction you like.

Marko, does his idea sound workable for your needs?  I understood your original 
proposal as only restricting the value of search_path within security definer 
functions.  This idea would allow you to restrict it everywhere, and not 
tailored to just that context.

[1] https://www.postgresql.org/message-id/608c9a81.3020...@anastigmatix.net

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: security_definer_search_path GUC

2021-06-03 Thread Marko Tiikkaja
On Thu, Jun 3, 2021 at 7:30 PM Mark Dilger 
wrote:

> > On Jun 3, 2021, at 9:03 AM, Pavel Stehule 
> wrote:
> >
> > I agree so some possibility of locking search_path or possibility to
> control who and when can change it can increase security. This should be a
> core feature. It's maybe more generic issue - same functionality can be
> required for work_mem setting, maybe max_paralel_workers_per_gather, and
> other GUC
>
> Chapman already suggested a mechanism in [1] to allow chaining together
> additional validators for GUCs.
>
> When setting search_path, the check_search_path(char **newval, void
> **extra, GucSource source) function is invoked.  As I understand Chapman's
> proposal, additional validators could be added to any GUC.  You could
> implement search_path restrictions by defining additional validators that
> enforce whatever restriction you like.
>
> Marko, does his idea sound workable for your needs?  I understood your
> original proposal as only restricting the value of search_path within
> security definer functions.  This idea would allow you to restrict it
> everywhere, and not tailored to just that context.
>

Yeah, that would work for my use case just as well.


.m


Re: Decoding of two-phase xacts missing from CREATE_REPLICATION_SLOT command

2021-06-03 Thread Jeff Davis
On Thu, 2021-06-03 at 09:29 +0530, Amit Kapila wrote:
> The idea is to support two_phase via protocol with a subscriber-side
> work where we can test it as well. The code to support it via
> replication protocol is present in the first patch of subscriber-side
> work [1] which uses that code as well. Basically, we don't have a
> good
> way to test it without subscriber-side work so decided to postpone it
> till the corresponding work is done.

Thank you for clarifying.

Right now, it feels a bit incomplete. If it's not much work, I
recommend breaking out the CREATE_REPLICATION_SLOT changes and updating
pg_recvlogical, so that it can go in v14 (and
pg_create_logical_replication_slot() will match
CREATE_REPLICATION_SLOT). But if that's complicated or controversial,
then I'm fine waiting for the other work to complete.

Regards,
Jeff Davis






Re: security_definer_search_path GUC

2021-06-03 Thread Pavel Stehule
On Thu, 3 Jun 2021 at 18:30, Mark Dilger 
wrote:

>
>
> > On Jun 3, 2021, at 9:03 AM, Pavel Stehule 
> wrote:
> >
> > I agree so some possibility of locking search_path or possibility to
> control who and when can change it can increase security. This should be a
> core feature. It's maybe more generic issue - same functionality can be
> required for work_mem setting, maybe max_paralel_workers_per_gather, and
> other GUC
>
> Chapman already suggested a mechanism in [1] to allow chaining together
> additional validators for GUCs.
>
> When setting search_path, the check_search_path(char **newval, void
> **extra, GucSource source) function is invoked.  As I understand Chapman's
> proposal, additional validators could be added to any GUC.  You could
> implement search_path restrictions by defining additional validators that
> enforce whatever restriction you like.
>

This design looks good for extensions, but I am not sure if it is good for
users. Some declarative way without necessity to programming or install
some extension can be nice.

Pavel


> Marko, does his idea sound workable for your needs?  I understood your
> original proposal as only restricting the value of search_path within
> security definer functions.  This idea would allow you to restrict it
> everywhere, and not tailored to just that context.
>
> [1]
> https://www.postgresql.org/message-id/608c9a81.3020...@anastigmatix.net
>
> —
> Mark Dilger
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
>
>


Re: missing GRANT on pg_subscription columns

2021-06-03 Thread Tom Lane
"Euler Taveira"  writes:
> I was checking the GRANT on pg_subscription and noticed that the command
> is not correct. There is a comment that says "All columns of
> pg_subscription except subconninfo are readable". However, there are
> columns that aren't included: oid and subsynccommit. It seems an
> oversight in the commits 6f236e1eb8c and 887227a1cc8.

Ugh.

> There are monitoring tools and data collectors that aren't using a
> superuser to read catalog information (I usually recommend using pg_monitor).
> Hence, you cannot join pg_subscription with relations such as
> pg_subscription_rel or pg_stat_subscription because column oid has no
> column-level privilege. I'm attaching a patch to fix it (indeed, 2 patches
> because of additional columns for v14). We should add instructions in the
> minor version release notes too.

I agree with fixing this in HEAD.  But given that this has been wrong
since v10 with zero previous complaints, I doubt that it is worth the
complication of trying to do something about it in the back branches.
Maybe we could just adjust the docs there, instead.

regards, tom lane




Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

2021-06-03 Thread Arne Roland
Hi,


I haven't tested the parallel case, but I think we should sort out (3) 
get_cheapest_fractional_path_for_pathkeys as mentioned above.


I am lost about the comment regarding startup_new_fractional. Could you 
elaborate on what you mean by that?


Apart from that, I'd argue for a small test case. I attached a slimmed down 
case of what we were trying to fix. It might be worth integrating that with an 
existing test, since more than a third of the time seems to be consumed by the 
creation and attachment of partitions.


Regards
Arne



merge-fraction-cheapest_example.sql
Description: merge-fraction-cheapest_example.sql


Re: SSL SNI

2021-06-03 Thread Jacob Champion
On Wed, 2021-04-07 at 15:32 +0200, Peter Eisentraut wrote:
> Committed like that.  (Default to on, but it's easy to change if there 
> are any further thoughts.)

Hi Peter,

It looks like this code needs some guards for a NULL conn->pghost. For
example when running

    psql 'dbname=postgres sslmode=require hostaddr=127.0.0.1'

with no PGHOST in the environment, psql is currently segfaulting for me.

--Jacob


Re: Support for NSS as a libpq TLS backend

2021-06-03 Thread Jeff Davis
On Tue, 2020-10-27 at 23:39 -0700, Andres Freund wrote:
> Maybe we should just have --with-ssl={openssl,nss}? That'd avoid
> needing
> to check for errors.

[ apologies for the late reply ]

Would it be more proper to call it --with-tls={openssl,nss} ?

Regards,
Jeff Davis






Re: SSL SNI

2021-06-03 Thread Tom Lane
Jacob Champion  writes:
> It looks like this code needs some guards for a NULL conn->pghost. For 
> example when running
> psql 'dbname=postgres sslmode=require hostaddr=127.0.0.1'
> with no PGHOST in the environment, psql is currently segfaulting for
> me.

Duplicated here:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x7f3adec47ec3 in __strspn_sse42 () from /lib64/libc.so.6
(gdb) bt
#0  0x7f3adec47ec3 in __strspn_sse42 () from /lib64/libc.so.6
#1  0x7f3adf6b7026 in initialize_SSL (conn=0xed4160)
at fe-secure-openssl.c:1090
#2  0x7f3adf6b8755 in pgtls_open_client (conn=conn@entry=0xed4160)
at fe-secure-openssl.c:132
#3  0x7f3adf6b3955 in pqsecure_open_client (conn=conn@entry=0xed4160)
at fe-secure.c:180
#4  0x7f3adf6a4808 in PQconnectPoll (conn=conn@entry=0xed4160)
at fe-connect.c:3102
#5  0x7f3adf6a5b31 in connectDBComplete (conn=conn@entry=0xed4160)
at fe-connect.c:2219
#6  0x7f3adf6a8968 in PQconnectdbParams (keywords=keywords@entry=0xed40c0, 
values=values@entry=0xed4110, expand_dbname=expand_dbname@entry=1)
at fe-connect.c:669
#7  0x00404db2 in main (argc=, argv=0x7ffc58477208)
at startup.c:266

You don't seem to need the "sslmode=require" either, just an
SSL-enabled server.

regards, tom lane




Re: Support for NSS as a libpq TLS backend

2021-06-03 Thread Daniel Gustafsson
> On 3 Jun 2021, at 19:37, Jeff Davis  wrote:
> 
> On Tue, 2020-10-27 at 23:39 -0700, Andres Freund wrote:
>> Maybe we should just have --with-ssl={openssl,nss}? That'd avoid
>> needing
>> to check for errors.
> 
> [ apologies for the late reply ]
> 
> Would it be more proper to call it --with-tls={openssl,nss} ?

Well, we use SSL for everything else (GUCs, connection params and env vars etc)
so I think --with-ssl is sensible.

However, SSL and TLS are used quite interchangeably these days so I think it
makes sense to provide --with-tls as an alias.

--
Daniel Gustafsson   https://vmware.com/





Re: SSL SNI

2021-06-03 Thread Tom Lane
I wrote:
> Jacob Champion  writes:
>> It looks like this code needs some guards for a NULL conn->pghost. For 
>> example when running
>> psql 'dbname=postgres sslmode=require hostaddr=127.0.0.1'
>> with no PGHOST in the environment, psql is currently segfaulting for
>> me.

> Duplicated here:

It looks like the immediate problem can be resolved by just adding
a check for conn->pghost not being NULL, since the comment above
says

 * Per RFC 6066, do not set it if the host is a literal IP address (IPv4
 * or IPv6).

and having only hostaddr certainly fits that case.  But I didn't
check to see if any more problems arise later.

regards, tom lane




Re: BUG #16079: Question Regarding the BUG #16064

2021-06-03 Thread Jeff Davis
On Mon, 2020-12-21 at 13:44 -0500, Tom Lane wrote:
> Hm.  I'm less concerned about that scenario than about somebody
> snooping
> the on-the-wire traffic.  If we're going to invent a connection
> setting
> for this, I'd say that in addition to "ok to send cleartext password"
> and "never ok to send cleartext password", there should be a setting
> for
> "send cleartext password only if connection is encrypted".  Possibly
> that should even be the default.

There was a fair amount of related discussion here:


https://www.postgresql.org/message-id/227015d8417f2b4fef03f8966dbfa5cbcc4f44da.camel%40j-davis.com

My feeling after all of that discussion is that the next step would be
to move to some kind of negotiation between client and server about
which methods are mutually acceptable. Right now, the protocol is
structured around the server driving the authentication process, and
the most the client can do is abort.

> BTW, do we have a client-side setting to insist that passwords not be
> sent in MD5 hashing either?  A person who is paranoid about this
> would
> likely want to disable that code path as well.

channel_binding=require is one way to do it, but it also requires ssl.

Regards,
Jeff Davis






Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

2021-06-03 Thread Tomas Vondra
Hi,

On 6/3/21 7:17 PM, Arne Roland wrote:
> Hi,
> 
> 
> I haven't tested the parallel case, but I think we should sort out (3)
> get_cheapest_fractional_path_for_pathkeys as mentioned above.
> 

Not sure what you refer to by "above" - it's probably better to reply
in-line to the existing message, which makes it much clearer.

> 
> I am lost about the comment regarding startup_new_fractional. Could you
> elaborate what you mean by that?
> 

Not sure what this refers to either - there's no startup_new_fractional
in my message and 'git grep startup_new_fractional' returns nothing.

> 
> Apart from that, I'd argue for a small test case. I attached a slimmed
> down case of what we were trying to fix. It might be worth to integrate
> that with an existing test, since more than a third of the time seems to
> be consumed by the creation and attachment of partitions.
> 

Maybe, if there's a suitable table to reuse, we can do that. But I don't
think it matters that it takes ~1/3 of the time to attach the partitions.
What's more important is whether it measurably slows down the test
suite, and I don't think that's an issue.

In any case, this seems a bit premature - we need something to test the
patch etc. We can worry about how expensive the test is much later.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: SSL SNI

2021-06-03 Thread Tom Lane
I wrote:
> It looks like the immediate problem can be resolved by just adding
> a check for conn->pghost not being NULL,

... scratch that.  There's another problem here, which is that this
code should not be looking at conn->pghost AT ALL.  That will do the
wrong thing with a multi-element host list.  The right thing to be
looking at is conn->connhost[conn->whichhost].host --- with a test
to make sure it's not NULL or an empty string.  (I didn't stop to
study this code close enough to see if it'll ignore an empty
string without help.)

regards, tom lane




Re: security_definer_search_path GUC

2021-06-03 Thread Mark Dilger



> On Jun 3, 2021, at 9:38 AM, Pavel Stehule  wrote:
> 
> This design looks good for extensions, but I am not sure if it is good for 
> users. Some declarative way without necessity to programming or install some 
> extension can be nice.

I agree, though "some declarative way" is a bit vague.  I've had ideas that 
perhaps superusers should be able to further restrict the [min,max] fields of 
int and real GUCs.  Since -1 is sometimes used to mean "disabled", syntax to 
allow specifying a set might be necessary, something like [-1, 60..600].  For 
text and enum GUCs, perhaps a set of regexps would work, some being required to 
match and others being required not to match, such as:

search_path !~ '\mcustomerx\M'
search_path ~ '^pg_catalog,'

If we did something like this, we'd need it to play nicely with other filters 
provided by extensions, because I'm reasonably sure not all filters could be 
done merely using set notation and regular expression syntax.  In fact, I find 
it hard to convince myself that set notation and regular expression syntax 
would even be useful in a large enough number of cases to be worth 
implementing.  What are your thoughts on that?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: CALL versus procedures with output-only arguments

2021-06-03 Thread Tom Lane
I wrote:
> Hmm, actually we could make step 2 a shade tighter: if a candidate
> routine is a function, match against proargtypes.  If it's a procedure,
> match against coalesce(proallargtypes, proargtypes).  If we find
> multiple matches, raise ambiguity error.

Where do we stand on this topic?

I'm willing to have a go at implementing things that way, but
time's a-wasting.

regards, tom lane




Re: security_definer_search_path GUC

2021-06-03 Thread Isaac Morland
I thought everybody was already doing this, but maybe not. I put the
following in all my function definitions:

SET search_path FROM CURRENT

(with the exception of a very few functions which explicitly need to use
the caller's search path)

It seems to me that if this was the default (note: I'm totally ignoring
backward compatibility issues for now), then most of these issues wouldn't
exist. My schema creation scripts start with an appropriate search path
setting and that value then gets built into every function they create.

Related question: how can function compilation work when the behaviour
depends on the search path of the caller? In other words, the behaviour of
the function can be totally different on each call. Are there any popular
programming environments in which the behaviour of a called function
depends on the caller's environment (actually yes: shell scripting, with
$PATH especially; but besides that and stored procedures)?

I also want to mention that I consider any suggestion to eliminate the
search_path concept as a complete non-starter. It would be no different
from proposing that the next version of a programming language eliminate
(or stop using) the module system. If I could make it happen easily, I
would go in the other direction and allow schemas to be hierarchical (note:
totally ignoring all sorts of very important choices which are more than
just details about how this should work). I would like to be able to have
an extension or subsystem exist in a single schema, with its objects broken
up into schemas within the schema. Same reason as most languages have
hierarchical module systems.

On Thu, 3 Jun 2021 at 14:25, Mark Dilger 
wrote:

>
>
> > On Jun 3, 2021, at 9:38 AM, Pavel Stehule 
> wrote:
> >
> > This design looks good for extensions, but I am not sure if it is good
> for users. Some declarative way without necessity to programming or install
> some extension can be nice.
>
> I agree, though "some declarative way" is a bit vague.  I've had ideas
> that perhaps superusers should be able to further restrict the [min,max]
> fields of int and real GUCs.  Since -1 is sometimes used to mean
> "disabled", syntax to allow specifying a set might be necessary, something
> like [-1, 60..600].  For text and enum GUCs, perhaps a set of regexps would
> work, some being required to match and others being required not to match,
> such as:
>
> search_path !~ '\mcustomerx\M'
> search_path ~ '^pg_catalog,'
>
> If we did something like this, we'd need it to play nicely with other
> filters provided by extensions, because I'm reasonably sure not all filters
> could be done merely using set notation and regular expression syntax.  In
> fact, I find it hard to convince myself that set notation and regular
> expression syntax would even be useful in a large enough number of cases to
> be worth implementing.  What are your thought on that?
>
> —
> Mark Dilger
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
>
>
>
>


Re: speed up verifying UTF-8

2021-06-03 Thread John Naylor
On Thu, Jun 3, 2021 at 9:16 AM Heikki Linnakangas  wrote:

> Some ideas:
>
> 1. Better to check if any high bits are set first. We care more about
> the speed of that than of detecting zero bytes, because input with high
> bits is valid but zeros are an error.
>
> 2. Since we check that there are no high bits, we can do the zero-checks
> with fewer instructions like this:

Both ideas make sense, and I like the shortcut we can take with the zero
check. I think Greg is right that the zero check needs “half1 & half2”, so
I tested with that (updated patches attached).

> What test set have you been using for performance testing this? I'd like

The microbenchmark is the same one you attached to [1], which I extended
with a 95% multibyte case. With the new zero check:

clang 12.0.5 / MacOS:

master:

 chinese | mixed | ascii
-+---+---
 981 |   688 |   371

0001:

 chinese | mixed | ascii
-+---+---
 932 |   548 |   110

plus optimized zero check:

 chinese | mixed | ascii
-+---+---
 689 |   573 |59

It makes sense that the Chinese text case is faster since the zero check is
skipped.

gcc 4.8.5 / Linux:

master:

 chinese | mixed | ascii
-+---+---
2561 |  1493 |   825

0001:

 chinese | mixed | ascii
-+---+---
2968 |  1035 |   158

plus optimized zero check:

 chinese | mixed | ascii
-+---+---
2413 |  1078 |   137

The second machine is a bit older and has an old compiler, but there is
still a small speed increase. In fact, without Heikki's tweaks, 0001
regresses on multibyte.

(Note: I'm not seeing the 7x improvement I claimed for 0001 here, but that
was from memory and I think that was a different machine and newer gcc. We
can report a range of results as we proceed.)

[1]
https://www.postgresql.org/message-id/06d45421-61b8-86dd-e765-f1ce527a5...@iki.fi

--
John Naylor
EDB: http://www.enterprisedb.com


Re: security_definer_search_path GUC

2021-06-03 Thread Pavel Stehule
On Thu, 3 Jun 2021 at 20:25, Mark Dilger 
wrote:

>
>
> > On Jun 3, 2021, at 9:38 AM, Pavel Stehule 
> wrote:
> >
> > This design looks good for extensions, but I am not sure if it is good
> for users. Some declarative way without necessity to programming or install
> some extension can be nice.
>
> I agree, though "some declarative way" is a bit vague.  I've had ideas
> that perhaps superusers should be able to further restrict the [min,max]
> fields of int and real GUCs.  Since -1 is sometimes used to mean
> "disabled", syntax to allow specifying a set might be necessary, something
> like [-1, 60..600].  For text and enum GUCs, perhaps a set of regexps would
> work, some being required to match and others being required not to match,
> such as:
>
> search_path !~ '\mcustomerx\M'
> search_path ~ '^pg_catalog,'
>
> If we did something like this, we'd need it to play nicely with other
> filters provided by extensions, because I'm reasonably sure not all filters
> could be done merely using set notation and regular expression syntax.  In
> fact, I find it hard to convince myself that set notation and regular
> expression syntax would even be useful in a large enough number of cases to
> be worth implementing.  What are your thought on that?
>

I don't think so for immutable strings we need regular expressions.  Maybe
use some special keyword

search_path only "pg_catalog"




> —
> Mark Dilger
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
>
>


Re: speed up verifying UTF-8

2021-06-03 Thread Heikki Linnakangas

On 03/06/2021 17:33, Greg Stark wrote:
>> 3. It's probably cheaper perform the HAS_ZERO check just once on (half1
>> | half2). We have to compute (half1 | half2) anyway.
>
> Wouldn't you have to check (half1 & half2) ?


Ah, you're right of course. But & is not quite right either, it will 
give false positives. That's ok from a correctness point of view here, 
because we then fall back to checking byte by byte, but I don't think 
it's a good tradeoff.


I think this works, however:

/* Verify a chunk of bytes for valid ASCII including a zero-byte check. */
static inline int
check_ascii(const unsigned char *s, int len)
{
    uint64      half1,
                half2,
                highbits_set;
    uint64      x1,
                x2;
    uint64      x;

    if (len >= 2 * sizeof(uint64))
    {
        memcpy(&half1, s, sizeof(uint64));
        memcpy(&half2, s + sizeof(uint64), sizeof(uint64));

        /* Check if any bytes in this chunk have the high bit set. */
        highbits_set = ((half1 | half2) & UINT64CONST(0x8080808080808080));
        if (highbits_set)
            return 0;

        /*
         * Check if there are any zero bytes in this chunk.
         *
         * First, add 0x7f to each byte. This sets the high bit in each byte,
         * unless it was a zero. We already checked that none of the bytes had
         * the high bit set previously, so the max value each byte can have
         * after the addition is 0x7f + 0x7f = 0xfe, and we don't need to
         * worry about carrying over to the next byte.
         */
        x1 = half1 + UINT64CONST(0x7f7f7f7f7f7f7f7f);
        x2 = half2 + UINT64CONST(0x7f7f7f7f7f7f7f7f);

        /* then check that the high bit is set in each byte. */
        x = (x1 | x2);
        x &= UINT64CONST(0x8080808080808080);
        if (x != UINT64CONST(0x8080808080808080))
            return 0;

        return 2 * sizeof(uint64);
    }
    else
        return 0;
}

- Heikki




Re: security_definer_search_path GUC

2021-06-03 Thread Mark Dilger



> On Jun 3, 2021, at 12:06 PM, Pavel Stehule  wrote:
> 
> 
> 
> čt 3. 6. 2021 v 20:25 odesílatel Mark Dilger  
> napsal:
> 
> 
> > On Jun 3, 2021, at 9:38 AM, Pavel Stehule  wrote:
> > 
> > This design looks good for extensions, but I am not sure if it is good for 
> > users. Some declarative way without necessity to programming or install 
> > some extension can be nice.
> 
> I agree, though "some declarative way" is a bit vague.  I've had ideas that 
> perhaps superusers should be able to further restrict the [min,max] fields of 
> int and real GUCs.  Since -1 is sometimes used to mean "disabled", syntax to 
> allow specifying a set might be necessary, something like [-1, 60..600].  For 
> text and enum GUCs, perhaps a set of regexps would work, some being required 
> to match and others being required not to match, such as:
> 
> search_path !~ '\mcustomerx\M'
> search_path ~ '^pg_catalog,'
> 
> If we did something like this, we'd need it to play nicely with other filters 
> provided by extensions, because I'm reasonably sure not all filters could be 
> done merely using set notation and regular expression syntax.  In fact, I 
> find it hard to convince myself that set notation and regular expression 
> syntax would even be useful in a large enough number of cases to be worth 
> implementing.  What are your thought on that?
> 
> I don't think so for immutable strings we need regular expressions.  Maybe 
> use some special keyword
> 
> search_path only "pg_catalog" 

I think we're trying to solve different problems.  I'm trying to allow 
non-superusers to set GUCs while putting constraints on what values they 
choose.  You appear to be trying to revoke the ability to set a GUC by forcing 
it to only ever have a single value.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: speed up verifying UTF-8

2021-06-03 Thread John Naylor
On Thu, Jun 3, 2021 at 3:08 PM Heikki Linnakangas  wrote:
>
> On 03/06/2021 17:33, Greg Stark wrote:
> >> 3. It's probably cheaper perform the HAS_ZERO check just once on (half1
> >> | half2). We have to compute (half1 | half2) anyway.
> >
> > Wouldn't you have to check (half1 & half2) ?
>
> Ah, you're right of course. But & is not quite right either, it will
> give false positives. That's ok from a correctness point of view here,
> because we then fall back to checking byte by byte, but I don't think
> it's a good tradeoff.

Ah, of course.

> /*
>  * Check if there are any zero bytes in this chunk.
>  *
>  * First, add 0x7f to each byte. This sets the high bit in each byte,
>  * unless it was a zero. We already checked that none of the bytes had
>  * the high bit set previously, so the max value each byte can have
>  * after the addition is 0x7f + 0x7f = 0xfe, and we don't need to
>  * worry about carrying over to the next byte.
>  */
> x1 = half1 + UINT64CONST(0x7f7f7f7f7f7f7f7f);
> x2 = half2 + UINT64CONST(0x7f7f7f7f7f7f7f7f);
>
> /* then check that the high bit is set in each byte. */
> x = (x1 | x2);
> x &= UINT64CONST(0x8080808080808080);
> if (x != UINT64CONST(0x8080808080808080))
>         return 0;

That seems right, I'll try that and update the patch. (Forgot to attach
earlier anyway)

--
John Naylor
EDB: http://www.enterprisedb.com


Re: speed up verifying UTF-8

2021-06-03 Thread Heikki Linnakangas

On 03/06/2021 22:10, John Naylor wrote:
> On Thu, Jun 3, 2021 at 3:08 PM Heikki Linnakangas  wrote:
>> x1 = half1 + UINT64CONST(0x7f7f7f7f7f7f7f7f);
>> x2 = half2 + UINT64CONST(0x7f7f7f7f7f7f7f7f);
>>
>> /* then check that the high bit is set in each byte. */
>> x = (x1 | x2);
>> x &= UINT64CONST(0x8080808080808080);
>> if (x != UINT64CONST(0x8080808080808080))
>>         return 0;
>
> That seems right, I'll try that and update the patch. (Forgot to attach
> earlier anyway)


Ugh, actually that has the same issue as before. If one of the bytes in 
one half is zero, but not in the other half, this fails to detect it. 
Sorry for the noise..


- Heikki




Re: [PATCH] expand the units that pg_size_pretty supports on output

2021-06-03 Thread David Christensen
New versions attached to address the initial CF feedback and rebase on HEAD
as of now.

0001-Expand-the-units-that-pg_size_pretty-numeric-knows-a.patch

- expands the units that pg_size_pretty() can handle up to YB.

0002-Expand-the-supported-units-in-pg_size_bytes-to-cover.patch

- expands the units that pg_size_bytes() can handle, up to YB.


0002-Expand-the-supported-units-in-pg_size_bytes-to-cover.patch
Description: Binary data


0001-Expand-the-units-that-pg_size_pretty-numeric-knows-a.patch
Description: Binary data


Re: speed up verifying UTF-8

2021-06-03 Thread Heikki Linnakangas

On 03/06/2021 22:16, Heikki Linnakangas wrote:
> On 03/06/2021 22:10, John Naylor wrote:
>> On Thu, Jun 3, 2021 at 3:08 PM Heikki Linnakangas <hlinn...@iki.fi> wrote:
>>> x1 = half1 + UINT64CONST(0x7f7f7f7f7f7f7f7f);
>>> x2 = half2 + UINT64CONST(0x7f7f7f7f7f7f7f7f);
>>>
>>> /* then check that the high bit is set in each byte. */
>>> x = (x1 | x2);
>>> x &= UINT64CONST(0x8080808080808080);
>>> if (x != UINT64CONST(0x8080808080808080))
>>>         return 0;
>>
>> That seems right, I'll try that and update the patch. (Forgot to attach
>> earlier anyway)
>
> Ugh, actually that has the same issue as before. If one of the bytes in
> one half is zero, but not in the other half, this fails to detect it.
> Sorry for the noise..


If you replace (x1 | x2) with (x1 & x2) above, I think it's correct.

- Heikki




Re: security_definer_search_path GUC

2021-06-03 Thread Pavel Stehule
On Thu, 3 Jun 2021 at 21:11, Mark Dilger 
wrote:

>
>
> > On Jun 3, 2021, at 12:06 PM, Pavel Stehule 
> wrote:
> >
> >
> >
> > čt 3. 6. 2021 v 20:25 odesílatel Mark Dilger <
> mark.dil...@enterprisedb.com> napsal:
> >
> >
> > > On Jun 3, 2021, at 9:38 AM, Pavel Stehule 
> wrote:
> > >
> > > This design looks good for extensions, but I am not sure if it is good
> for users. Some declarative way without necessity to programming or install
> some extension can be nice.
> >
> > I agree, though "some declarative way" is a bit vague.  I've had ideas
> that perhaps superusers should be able to further restrict the [min,max]
> fields of int and real GUCs.  Since -1 is sometimes used to mean
> "disabled", syntax to allow specifying a set might be necessary, something
> like [-1, 60..600].  For text and enum GUCs, perhaps a set of regexps would
> work, some being required to match and others being required not to match,
> such as:
> >
> > search_path !~ '\mcustomerx\M'
> > search_path ~ '^pg_catalog,'
> >
> > If we did something like this, we'd need it to play nicely with other
> filters provided by extensions, because I'm reasonably sure not all filters
> could be done merely using set notation and regular expression syntax.  In
> fact, I find it hard to convince myself that set notation and regular
> expression syntax would even be useful in a large enough number of cases to
> be worth implementing.  What are your thought on that?
> >
> > I don't think so for immutable strings we need regular expressions.
> Maybe use some special keyword
> >
> > search_path only "pg_catalog"
>
> I think we're trying to solve different problems.  I'm trying to allow
> non-superusers to set GUCs while putting constraints on what values they
> choose.  You appear to be trying to revoke the ability to set a GUC by
> forcing it to only ever have a single value.
>

My proposal doesn't mean the search_path cannot be changed - it limits
possible values like your patch. Maybe we can get inspiration from
pg_hba.conf


>
> —
> Mark Dilger
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
>
>


Re: CALL versus procedures with output-only arguments

2021-06-03 Thread Andrew Dunstan


On 6/3/21 2:29 PM, Tom Lane wrote:
> I wrote:
>> Hmm, actually we could make step 2 a shade tighter: if a candidate
>> routine is a function, match against proargtypes.  If it's a procedure,
>> match against coalesce(proallargtypes, proargtypes).  If we find
>> multiple matches, raise ambiguity error.
> Where do we stand on this topic?
>
> I'm willing to have a go at implementing things that way, but
> time's a-wasting.
>
>   



So AIUI your suggestion is that ALTER/DROP ROUTINE will look for an
ambiguity. If it doesn't find one it proceeds, otherwise it complains in
which case the user will have to fall back to ALTER/DROP
FUNCTION/PROCEDURE. Is that right? It seems a reasonable approach, and I
wouldn't expect to find too many ambiguous cases in practice.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Support for NSS as a libpq TLS backend

2021-06-03 Thread Andrew Dunstan


On 6/3/21 1:47 PM, Daniel Gustafsson wrote:
>> On 3 Jun 2021, at 19:37, Jeff Davis  wrote:
>>
>> On Tue, 2020-10-27 at 23:39 -0700, Andres Freund wrote:
>>> Maybe we should just have --with-ssl={openssl,nss}? That'd avoid
>>> needing
>>> to check for errors.
>> [ apologies for the late reply ]
>>
>> Would it be more proper to call it --with-tls={openssl,nss} ?
> Well, we use SSL for everything else (GUCs, connection params and env vars 
> etc)
> so I think --with-ssl is sensible.
>
> However, SSL and TLS are used quite interchangeably these days so I think it
> makes sense to provide --with-tls as an alias.
>

Yeah, but it's annoying to have to start every talk I give touching this
subject with the slide that says "When we say SSL we really mean TLS".
Maybe release 15 would be a good time to rename user-visible option
names etc, with support for legacy names.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Make unlogged table resets detectable

2021-06-03 Thread Jeff Davis
One problem with unlogged tables is that the application has no way to
tell whether they were reset or whether they just happen to be empty.

This can be a problem with sharding, where you might have different
shards of an unlogged table on different servers. If one server
crashes, you'll be missing only one shard of the data, which may appear
inconsistent. In that case, you'd like the application (or sharding
solution) to be able to detect that one shard was lost, and TRUNCATE
those that remain to get back to a reasonable state.

It would be easy enough for the init fork to have a single page with a
flag set. That way, when the main fork is replaced with the init fork,
other code could detect that a reset happened.

When detected, depending on a GUC, the behavior could be to auto-
truncate it (to get the current silent behavior), or refuse to perform
the operation (except an explicit TRUNCATE), or issue a
warning/log/notice.

The biggest challenge would be: when should we detect that the reset
has happened? There might be a lot of entry points. Another idea would
be to just have a SQL function that the application could call whenever
it needs to know.

Thoughts?

Jeff Davis






Re: Support for NSS as a libpq TLS backend

2021-06-03 Thread Jeff Davis
On Thu, 2021-06-03 at 15:53 -0400, Andrew Dunstan wrote:
> Yeah, but it's annoying to have to start every talk I give touching
> this
> subject with the slide that says "When we say SSL we really mean
> TLS".
> Maybe release 15 would be a good time to rename user-visible option
> names etc, with support for legacy names.

Sounds good to me, though I haven't looked into how big of a diff that
will be.

Also, do we have precedent for GUC aliases? That might be a little
weird.

Regards,
Jeff Davis






Re: CALL versus procedures with output-only arguments

2021-06-03 Thread Tom Lane
Andrew Dunstan  writes:
> So AIUI your suggestion is that ALTER/DROP ROUTINE will look for an
> ambiguity. If it doesn't find one it proceeds, otherwise it complains in
> which case the user will have to fall back to ALTER/DROP
> FUNCTION/PROCEDURE. Is that right? It seems a reasonable approach, and I
> wouldn't expect to find too many ambiguous cases in practice.

Yeah, I think that practical problems would be pretty rare.  My impression
is that users tend not to use function/procedure name overloading too much
in the first place, and none of this affects you at all till you do.

Once you do, you'll possibly notice that PG's rules for which combinations
of signatures are allowed are different from the spec's.  I believe that
we're largely more generous than the spec, but there are a few cases where
this proposal isn't.  An example is that (AFAICT) the spec allows having
both
create procedure divide(x int, y int, OUT q int) ...
create procedure divide(x int, y int, OUT q int, OUT r int) ...
which I want to reject because they have the same input parameters.
This is perhaps annoying.  But seeing that the spec won't allow you to
also have divide() procedures for other datatypes, I'm having a hard
time feeling that this is losing on the overloading-flexibility front.

regards, tom lane




Re: Race condition in recovery?

2021-06-03 Thread Robert Haas
On Thu, May 27, 2021 at 2:26 AM Dilip Kumar  wrote:
> Changed as suggested.

I don't think the code as written here is going to work on Windows,
because your code doesn't duplicate enable_restoring's call to
perl2host or its backslash-escaping logic. It would really be better
if we could use enable_restoring directly. Also, I discovered that the
'return' in cp_history_files should really say 'exit', because
otherwise it generates a complaint every time it's run. It should also
have 'use strict' and 'use warnings' at the top.

Here's a version of your test case patch with the 1-line code fix
added, the above issues addressed, and a bunch of cosmetic tweaks.
Unfortunately, it doesn't pass for me consistently. I'm not sure if
that's because I broke something with my changes, or because the test
contains an underlying race condition which we need to address.
Attached also are the log files from a failed run if you want to look
at them. The key lines seem to be:

2021-06-03 16:16:53.984 EDT [47796] LOG:  restarted WAL streaming at
0/300 on timeline 2
2021-06-03 16:16:54.197 EDT [47813] 025_stuck_on_old_timeline.pl LOG:
statement: SELECT count(*) FROM tab_int
2021-06-03 16:16:54.197 EDT [47813] 025_stuck_on_old_timeline.pl
ERROR:  relation "tab_int" does not exist at character 22

Or from the main log:

Waiting for replication conn cascade's replay_lsn to pass '0/300' on standby
done
error running SQL: 'psql::1: ERROR:  relation "tab_int" does not exist
LINE 1: SELECT count(*) FROM tab_int
 ^'

I wonder whether that problem points to an issue with this incantation:

$node_standby->wait_for_catchup($node_cascade, 'replay',
$node_standby->lsn('replay'));

But I'm not sure, and I'm out of time to investigate for today.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


v4-0001-Fix-corner-case-failure-of-new-standby-to-follow-.patch
Description: Binary data


025_stuck_on_old_timeline_primary.log
Description: Binary data


025_stuck_on_old_timeline_cascade.log
Description: Binary data


regress_log_025_stuck_on_old_timeline
Description: Binary data


025_stuck_on_old_timeline_standby.log
Description: Binary data


Re: CALL versus procedures with output-only arguments

2021-06-03 Thread Andrew Dunstan


On 6/3/21 4:21 PM, Tom Lane wrote:
> Andrew Dunstan  writes:
>> So AIUI your suggestion is that ALTER/DROP ROUTINE will look for an
>> ambiguity. If it doesn't find one it proceeds, otherwise it complains in
>> which case the user will have to fall back to ALTER/DROP
>> FUNCTION/PROCEDURE. Is that right? It seems a reasonable approach, and I
>> wouldn't expect to find too many ambiguous cases in practice.
> Yeah, I think that practical problems would be pretty rare.  My impression
> is that users tend not to use function/procedure name overloading too much
> in the first place, and none of this affects you at all till you do.
>
> Once you do, you'll possibly notice that PG's rules for which combinations
> of signatures are allowed are different from the spec's.  I believe that
> we're largely more generous than the spec, but there are a few cases where
> this proposal isn't.  An example is that (AFAICT) the spec allows having
> both
>   create procedure divide(x int, y int, OUT q int) ...
>   create procedure divide(x int, y int, OUT q int, OUT r int) ...
> which I want to reject because they have the same input parameters.
> This is perhaps annoying.  But seeing that the spec won't allow you to
> also have divide() procedures for other datatypes, I'm having a hard
> time feeling that this is losing on the overloading-flexibility front.
>
>   



Not sure I follow the "other datatypes" bit. Are you saying the spec
won't let you have this?:

create procedure divide(x int, y int, OUT q int);
create procedure divide(x int, y int, OUT q float);

cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: pgsql: Add regression test for recovery pause.

2021-06-03 Thread Andrew Dunstan


On 6/2/21 6:25 PM, Andrew Dunstan wrote:
>
>
> Looks to me like we're getting munged by the msys shell, and unlike on
> msys2 there isn't a way to disable it:
> https://stackoverflow.com/questions/7250130/how-to-stop-mingw-and-msys-from-mangling-path-names-given-at-the-command-line
>
>
> c.f. commit 73ff3a0abbb
>
>
> Maybe a robust solution would be to have the query piped to psql on its
> stdin rather than on the command line. poll_query_until looks on a quick
> check like the only place in PostgresNode where we use "psql -c"
>
>
> I'll experiment a bit tomorrow.
>
>
>


My suspicion was correct. Fix pushed.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: pgsql: Add regression test for recovery pause.

2021-06-03 Thread Tom Lane
Andrew Dunstan  writes:
> My suspicion was correct. Fix pushed.

Great, thanks.

Do we need to worry about back-patching that?  It seems only
accidental if no existing back-branch test cases hit this.

regards, tom lane




Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

2021-06-03 Thread Zhihong Yu
On Thu, Jun 3, 2021 at 11:12 AM Tomas Vondra 
wrote:

> Hi,
>
> On 6/3/21 7:17 PM, Arne Roland wrote:
> > Hi,
> >
> >
> > I haven't tested the parallel case, but I think we should sort out (3)
> > get_cheapest_fractional_path_for_pathkeys as mentioned above.
> >
>
> Not sure what you refer to by "above" - it's probably better to reply
> in-line to the existing message, which makes it much clearer.
>
> >
> > I am lost about the comment regarding startup_new_fractional. Could you
> > elaborate what you mean by that?
> >
>
> Not sure what this refers to either - there's no startup_new_fractional
> in my message and 'git grep startup_new_fractional' returns nothing.
>
> >
> > Apart from that, I'd argue for a small test case. I attached a slimmed
> > down case of what we were trying to fix. It might be worth to integrate
> > that with an existing test, since more than a third of the time seems to
> > be consumed by the creation and attachment of partitions.
> >
>
> Maybe, if there's a suitable table to reuse, we can do that. But I don't
> think it matters that it takes ~1/3 of the time to attach the partitions.
> What's more important is whether it measurably slows down the test
> suite, and I don't think that's an issue.
>
> In any case, this seems a bit premature - we need something to test the
> patch etc. We can worry about how expensive the test is much later.
>
> regards
>
> --
> Tomas Vondra
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
>
Hi,
In REL_11_STABLE branch, a search revealed the following:

src/backend/optimizer/path/pathkeys.c: *
get_cheapest_fractional_path_for_pathkeys
src/backend/optimizer/path/pathkeys.c:get_cheapest_fractional_path_for_pathkeys(List
*paths,
src/backend/optimizer/plan/planagg.c:
get_cheapest_fractional_path_for_pathkeys(final_rel->pathlist,
src/include/optimizer/paths.h:extern Path
*get_cheapest_fractional_path_for_pathkeys(List *paths,

It seems this function has been refactored out in subsequent releases.

FYI


Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

2021-06-03 Thread Zhihong Yu
On Thu, Jun 3, 2021 at 1:50 PM Zhihong Yu  wrote:

>
>
> On Thu, Jun 3, 2021 at 11:12 AM Tomas Vondra <
> tomas.von...@enterprisedb.com> wrote:
>
>> Hi,
>>
>> On 6/3/21 7:17 PM, Arne Roland wrote:
>> > Hi,
>> >
>> >
>> > I haven't tested the parallel case, but I think we should sort out (3)
>> > get_cheapest_fractional_path_for_pathkeys as mentioned above.
>> >
>>
>> Not sure what you refer to by "above" - it's probably better to reply
>> in-line to the existing message, which makes it much clearer.
>>
>> >
>> > I am lost about the comment regarding startup_new_fractional. Could you
>> > elaborate what you mean by that?
>> >
>>
>> Not sure what this refers to either - there's no startup_new_fractional
>> in my message and 'git grep startup_new_fractional' returns nothing.
>>
>> >
>> > Apart from that, I'd argue for a small test case. I attached a slimmed
>> > down case of what we were trying to fix. It might be worth integrating
>> > that with an existing test, since more than a third of the time seems to
>> > be consumed by the creation and attachment of partitions.
>> >
>>
>> Maybe, if there's a suitable table to reuse, we can do that. But I don't
>> think it matters that it takes ~1/3 of the time to attach the partitions.
>> What's more important is whether it measurably slows down the test
>> suite, and I don't think that's an issue.
>>
>> In any case, this seems a bit premature - we need something to test the
>> patch etc. We can worry about how expensive the test is much later.
>>
>> regards
>>
>> --
>> Tomas Vondra
>> EnterpriseDB: http://www.enterprisedb.com
>> The Enterprise PostgreSQL Company
>>
>>
> Hi,
> In REL_11_STABLE branch, a search revealed the following:
>
> src/backend/optimizer/path/pathkeys.c: *
> get_cheapest_fractional_path_for_pathkeys
> src/backend/optimizer/path/pathkeys.c:get_cheapest_fractional_path_for_pathkeys(List
> *paths,
> src/backend/optimizer/plan/planagg.c:
> get_cheapest_fractional_path_for_pathkeys(final_rel->pathlist,
> src/include/optimizer/paths.h:extern Path
> *get_cheapest_fractional_path_for_pathkeys(List *paths,
>
> It seems this function has been refactored out in subsequent releases.
>
> FYI
>

Sent a bit too soon.

The above function still exists.
But startup_new_fractional was nowhere to be found.


Re: Support for NSS as a libpq TLS backend

2021-06-03 Thread Daniel Gustafsson
> On 3 Jun 2021, at 22:14, Jeff Davis  wrote:
> 
> On Thu, 2021-06-03 at 15:53 -0400, Andrew Dunstan wrote:
>> Yeah, but it's annoying to have to start every talk I give touching
>> this
>> subject with the slide that says "When we say SSL we really mean
>> TLS".
>> Maybe release 15 would be a good time to rename user-visible option
>> names etc, with support for legacy names.

Perhaps.  Having spent some time in this space, I'd say SSL has become the de
facto term for an encrypted connection at the socket layer, with TLS being the
current protocol suite (additionally, often referred to as SSL/TLS).  Offering
tls* counterparts to our ssl GUCs etc will offer a level of correctness but I
doubt we'll ever get rid of ssl* so we might not help too many users by the
added complexity.

It might also put us in a hard spot if the next TLS spec ends up being called
something other than TLS?  It's clearly happened before =)

> Sounds good to me, though I haven't looked into how big of a diff that
> will be.
> 
> Also, do we have precedent for GUC aliases? That might be a little
> weird.

I don't think we do currently, but I have a feeling the topic has surfaced here
before.

If we end up settling on this being something we want I can volunteer to do the
legwork, but it seems a discussion best had before a patch is drafted.

--
Daniel Gustafsson   https://vmware.com/





DELETE CASCADE

2021-06-03 Thread David Christensen
Hi -hackers,

Presented for discussion is a POC for a DELETE CASCADE functionality,
which will allow you one-shot usage of treating existing NO ACTION and
RESTRICT FK constraints as if they were originally defined as CASCADE
constraints.  I can't tell you how many times this functionality would have
been useful in the field, and despite the expected answer of "define your
constraints right in the first place", this is not always an option, nor is
the ability to change that easily (or create new constraints that need to
revalidate against big tables) always the best option.

That said, I'm happy to quibble about the specific approach to be taken;
I've written this based on the most straightforward way I could come up
with to accomplish this, but if there are better directions to take to get
the equivalent functionality I'm happy to discuss.

From the commit message:

Proof of concept of allowing a DELETE statement to override formal FK's
handling from RESTRICT/NO
ACTION and treat as CASCADE instead.

Syntax is "DELETE CASCADE ..." instead of "DELETE ... CASCADE" due to
unresolvable bison conflicts.

Sample session:

  postgres=# create table foo (id serial primary key, val text);
  CREATE TABLE
  postgres=# create table bar (id serial primary key, foo_id int references
foo(id), val text);
  CREATE TABLE
  postgres=# insert into foo (val) values ('a'),('b'),('c');
  INSERT 0 3
  postgres=# insert into bar (foo_id, val) values
(1,'d'),(1,'e'),(2,'f'),(2,'g');
  INSERT 0 4
  postgres=# select * from foo;
   id | val
  ----+-----
    1 | a
    2 | b
    3 | c
  (3 rows)

  postgres=# select * from bar;
   id | foo_id | val
  ----+--------+-----
    1 |      1 | d
    2 |      1 | e
    3 |      2 | f
    4 |      2 | g
  (4 rows)

  postgres=# delete from foo where id = 1;
  ERROR:  update or delete on table "foo" violates foreign key constraint
"bar_foo_id_fkey" on table "bar"
  DETAIL:  Key (id)=(1) is still referenced from table "bar".
  postgres=# delete cascade from foo where id = 1;
  DELETE 1
  postgres=# select * from foo;
   id | val
  ----+-----
    2 | b
    3 | c
  (2 rows)

  postgres=# select * from bar;
   id | foo_id | val
  ----+--------+-----
    3 |      2 | f
    4 |      2 | g
  (2 rows)


Best,

David


0001-Add-support-for-DELETE-CASCADE.patch
Description: Binary data


Re: CALL versus procedures with output-only arguments

2021-06-03 Thread Tom Lane
Andrew Dunstan  writes:
> Not sure I follow the "other datatypes" bit. Are you saying the spec
> won't let you have this?:

> create procedure divide(x int, y int, OUT q int);
> create procedure divide(x int, y int, OUT q float);

In fact it won't, because the spec's rule is simply "you can't have
two procedures with the same name and same number of parameters"
(where they count OUT parameters, I believe).  However the case
I was considering was wanting to have

create procedure divide(x int, y int, OUT q int) ...
create procedure divide(x numeric, y numeric, OUT q numeric) ...

which likewise falls foul of the spec's restriction, but which
IMO must be allowed in Postgres.

regards, tom lane




Re: Support for NSS as a libpq TLS backend

2021-06-03 Thread Tom Lane
Daniel Gustafsson  writes:
> It might also put us in a hard spot if the next TLS spec ends up being called
> something other than TLS?  It's clearly happened before =)

Good point.  I'm inclined to just stick with the SSL terminology.

>> Also, do we have precedent for GUC aliases? That might be a little
>> weird.

> I don't think we do currently, but I have a feeling the topic has surfaced 
> here
> before.

We do, look for "sort_mem" in guc.c.  So it's not like it'd be
inconvenient to implement.  But I think user confusion and the
potential for the new terminology to fail to be any more
future-proof are good reasons to just leave the names alone.

regards, tom lane




Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

2021-06-03 Thread Tomas Vondra
On 6/3/21 10:52 PM, Zhihong Yu wrote:
> ...
> 
> Sent a bit too soon.
> 
> The above function still exists.
> But startup_new_fractional was nowhere to be found.

Actually, there are two comments

/* XXX maybe we should have startup_new_fractional? */

in the patch I posted - I completely forgot about that. But that's a
typo, I think - it should be

/* XXX maybe we should have startup_neq_fractional? */

and the new flag would work similarly to startup_neq_total, i.e. it's
pointless to add paths where startup == fractional cost.

At least I think that was the idea when I wrote the patch; it was way
too long ago.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Support for NSS as a libpq TLS backend

2021-06-03 Thread Daniel Gustafsson
> On 3 Jun 2021, at 22:55, Tom Lane  wrote:

>>> Also, do we have precedent for GUC aliases? That might be a little
>>> weird.
> 
>> I don't think we do currently, but I have a feeling the topic has surfaced 
>> here
>> before.
> 
> We do, look for "sort_mem" in guc.c.

I knew it seemed familiar but I failed to find it, thanks for the pointer.

--
Daniel Gustafsson   https://vmware.com/





Re: Support for NSS as a libpq TLS backend

2021-06-03 Thread Bruce Momjian
On Thu, Jun  3, 2021 at 04:55:45PM -0400, Tom Lane wrote:
> Daniel Gustafsson  writes:
> > It might also put us in a hard spot if the next TLS spec ends up being called
> > something other than TLS?  It's clearly happened before =)
> 
> Good point.  I'm inclined to just stick with the SSL terminology.

I wonder if we should use SSL/TLS in more places in our documentation.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: pgsql: Add regression test for recovery pause.

2021-06-03 Thread Andrew Dunstan


On 6/3/21 4:45 PM, Tom Lane wrote:
> Andrew Dunstan  writes:
>> My suspicion was correct. Fix pushed.
> Great, thanks.
>
> Do we need to worry about back-patching that?  It seems only
> accidental if no existing back-branch test cases hit this.


Well, we haven't had breakage, but it's also useful to keep things in
sync as much as possible. I'll do it shortly.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Support for NSS as a libpq TLS backend

2021-06-03 Thread Tom Lane
Bruce Momjian  writes:
> I wonder if we should use SSL/TLS in more places in our documentation.

No objection to doing that in the docs; I'm just questioning
switching the code-visible names.

regards, tom lane




Re: CALL versus procedures with output-only arguments

2021-06-03 Thread Peter Eisentraut

On 02.06.21 02:04, Tom Lane wrote:

I wrote:

It's possible that we could revert proargtypes and still accommodate
the spec's definition for ALTER/DROP ROUTINE/PROCEDURE.  I'm imagining
some rules along the line of:
1. If arg list contains any parameter modes, then it must be PG
syntax, so interpret it according to our traditional rules.
2. Otherwise, try to match the given arg types against *both*
proargtypes and proallargtypes.  If we get multiple matches,
complain that the command is ambiguous.  (In the case of DROP
PROCEDURE, it's probably OK to consider only proallargtypes.)


Hmm, actually we could make step 2 a shade tighter: if a candidate
routine is a function, match against proargtypes.  If it's a procedure,
match against coalesce(proallargtypes, proargtypes).  If we find
multiple matches, raise ambiguity error.

The cases where you get the error could be resolved by either
using traditional PG syntax, or (in most cases) by saying
FUNCTION or PROCEDURE instead of ROUTINE.


I'm ok with this proposal.




Re: Support for NSS as a libpq TLS backend

2021-06-03 Thread Daniel Gustafsson
> On 3 Jun 2021, at 23:11, Tom Lane  wrote:
> 
> Bruce Momjian  writes:
>> I wonder if we should use SSL/TLS in more places in our documentation.
> 
> No objection to doing that in the docs; I'm just questioning
> switching the code-visible names.

As long as it's still searchable by "SSL", "TLS" and "SSL/TLS" and not just the
latter.

--
Daniel Gustafsson   https://vmware.com/





Re: DELETE CASCADE

2021-06-03 Thread Isaac Morland
On Thu, 3 Jun 2021 at 16:49, David Christensen <
david.christen...@crunchydata.com> wrote:

> Hi -hackers,
>
> Presented for discussion is a POC for a DELETE CASCADE functionality,
> which will allow you one-shot usage of treating existing NO ACTION and
> RESTRICT FK constraints as if they were originally defined as CASCADE
> constraints.  I can't tell you how many times this functionality would have
> been useful in the field, and despite the expected answer of "define your
> constraints right in the first place", this is not always an option, nor is
> the ability to change that easily (or create new constraints that need to
> revalidate against big tables) always the best option.
>

I would sometimes find this convenient. There are circumstances where I
don't want every DELETE to blunder all over the database deleting stuff,
but certain specific DELETEs should take care of the referencing tables.

An additional syntax to say "CASCADE TO table1, table2" would be safer and
sometimes useful in the case where I know I want to cascade to specific
other tables but not all (and in particular not to ones I didn't think of
when I wrote the query); I might almost suggest omitting the cascade to all
syntax (or maybe have a separate syntax, literally "CASCADE TO ALL TABLES"
or some such).

What happens if I don't have delete permission on the referencing table?
When a foreign key reference delete cascades, I can cause records to
disappear from a referencing table even if I don't have delete permission
on that table. This feels like it's just supposed to be a convenience that
replaces multiple DELETE invocations but one way or the other we need to be
clear on the behaviour.

Sidebar: isn't this inconsistent with trigger behaviour in general? When I
say "ON DELETE CASCADE" what I mean and what I get are the same: whenever
the referenced row is deleted, the referencing row also disappears,
regardless of the identity or permissions of the role running the actual
DELETE. But any manually implemented trigger runs as the caller; I cannot
make the database do something when a table update occurs; I can only make
the role doing the table update perform some additional actions.


Re: CALL versus procedures with output-only arguments

2021-06-03 Thread Andrew Dunstan


On 6/3/21 4:50 PM, Tom Lane wrote:
> Andrew Dunstan  writes:
>> Not sure I follow the "other datatypes" bit. Are you saying the spec
>> won't let you have this?:
>> create procedure divide(x int, y int, OUT q int);
>> create procedure divide(x int, y int, OUT q float);
> In fact it won't, because the spec's rule is simply "you can't have
> two procedures with the same name and same number of parameters"
> (where they count OUT parameters, I believe).  


Oh. That's a truly awful rule.



> However the case
> I was considering was wanting to have
>
>   create procedure divide(x int, y int, OUT q int) ...
>   create procedure divide(x numeric, y numeric, OUT q numeric) ...
>
> which likewise falls foul of the spec's restriction, but which
> IMO must be allowed in Postgres.
>


Right, we should certainly allow that.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: CALL versus procedures with output-only arguments

2021-06-03 Thread Tom Lane
Peter Eisentraut  writes:
> On 02.06.21 02:04, Tom Lane wrote:
>> Hmm, actually we could make step 2 a shade tighter: if a candidate
>> routine is a function, match against proargtypes.  If it's a procedure,
>> match against coalesce(proallargtypes, proargtypes).  If we find
>> multiple matches, raise ambiguity error.

> I'm ok with this proposal.

Cool.  Do you want to try to implement it, or shall I?

A question that maybe we should refer to the RMT is whether it's
too late for this sort of redesign for v14.  I dislike reverting
the OUT-procedure feature altogether in v14, but perhaps that's
the sanest way to proceed.

regards, tom lane




Re: alter table set TABLE ACCESS METHOD

2021-06-03 Thread Jeff Davis
On Thu, 2021-05-06 at 17:24 -0700, Jeff Davis wrote:
> It's the table AM's responsibility to detoast out-of-line datums and
> toast any values that are too large (see
> heapam.c:heap_prepare_insert()).

Do we have general agreement on this point? Did I miss another purpose
of detoasting in tablecmds.c, or can we just remove that part of the
patch?

Regards,
Jeff Davis






Re: CALL versus procedures with output-only arguments

2021-06-03 Thread Peter Eisentraut

On 03.06.21 22:21, Tom Lane wrote:

Once you do, you'll possibly notice that PG's rules for which combinations
of signatures are allowed are different from the spec's.  I believe that
we're largely more generous than the spec, but there are a few cases where
this proposal isn't.  An example is that (AFAICT) the spec allows having
both
create procedure divide(x int, y int, OUT q int) ...
create procedure divide(x int, y int, OUT q int, OUT r int) ...
which I want to reject because they have the same input parameters.
This is perhaps annoying.  But seeing that the spec won't allow you to
also have divide() procedures for other datatypes, I'm having a hard
time feeling that this is losing on the overloading-flexibility front.


I'm okay with disallowing this.  In my experience, overloading of 
procedures is done even more rarely than of functions, so this probably 
won't affect anything in practice.


(I'm by no means suggesting this, but I could imagine a catalog 
representation that allows this but still checks that you can't have 
multiple candidates that differ only by the type of an OUT parameters. 
Say with some kind of bitmap or boolean array that indicates where the 
OUT parameters are.  Then you can only have one candidate with a given 
number of arguments, but the above could be allowed.  Again, I'm not 
suggesting this, but it's a possibility in theory.)





Re: DELETE CASCADE

2021-06-03 Thread David G. Johnston
On Thu, Jun 3, 2021 at 1:49 PM David Christensen <
david.christen...@crunchydata.com> wrote:

> Presented for discussion is a POC for a DELETE CASCADE functionality,
> which will allow you one-shot usage of treating existing NO ACTION and
> RESTRICT FK constraints as if they were originally defined as CASCADE
> constraints.
>

ON DELETE NO ACTION constraints become ON DELETE CASCADE constraints - ON
DELETE SET NULL constraints are ignored, and not possible to emulate via
this feature.


>   I can't tell you how many times this functionality would have been
> useful in the field, and despite the expected answer of "define your
> constraints right in the first place", this is not always an option, nor is
> the ability to change that easily (or create new constraints that need to
> revalidate against big tables) always the best option.
>

Once...but I agreed.

>
> That said, I'm happy to quibble about the specific approach to be taken;
> I've written this based on the most straightforward way I could come up
> with to accomplish this, but if there are better directions to take to get
> the equivalent functionality I'm happy to discuss.
>
>
This behavior should require the same permissions as actually creating an
ON DELETE CASCADE FK on the cascaded-to tables.  i.e., Table Owner role
membership (the requirement for FK permissions can be assumed by the
presence of the existing FK constraint and being the table's owner).

Having the defined FK behaviors be more readily changeable, while not
mitigating this need, is IMO a more important feature to implement.  If
there is a reason that cannot be implemented (besides no one has bothered
to take the time) then I would consider that reason to also apply to
prevent implementing this work-around.

David J.


Re: CALL versus procedures with output-only arguments

2021-06-03 Thread Tom Lane
Peter Eisentraut  writes:
> On 03.06.21 22:21, Tom Lane wrote:
>> An example is that (AFAICT) the spec allows having both
>>  create procedure divide(x int, y int, OUT q int) ...
>>  create procedure divide(x int, y int, OUT q int, OUT r int) ...
>> which I want to reject because they have the same input parameters.

> (I'm by no means suggesting this, but I could imagine a catalog 
> representation that allows this but still checks that you can't have 
> multiple candidates that differ only by the type of an OUT parameters. 
> Say with some kind of bitmap or boolean array that indicates where the 
> OUT parameters are.  Then you can only have one candidate with a given 
> number of arguments, but the above could be allowed.  Again, I'm not 
> suggesting this, but it's a possibility in theory.)

We could certainly do something like that in a green field.  But one
of the reasons I'm unhappy about the current design is that I'm convinced
that altering the definition of pg_proc.proargtypes will break client-side
code that's looking at the catalogs.  I don't think we get to monkey with
such fundamental bits of the catalog data without a really good reason.
Allowing different OUT parameters for the same IN parameters doesn't seem
to me to qualify, given that there are other reasons why that's dubious.

regards, tom lane




Re: DELETE CASCADE

2021-06-03 Thread David Christensen
On Thu, Jun 3, 2021 at 4:48 PM David G. Johnston 
wrote:

> On Thu, Jun 3, 2021 at 1:49 PM David Christensen <
> david.christen...@crunchydata.com> wrote:
>
>> Presented for discussion is a POC for a DELETE CASCADE functionality,
>> which will allow you one-shot usage of treating existing NO ACTION and
>> RESTRICT FK constraints as if they were originally defined as CASCADE
>> constraints.
>>
>
> ON DELETE NO ACTION constraints become ON DELETE CASCADE constraints - ON
> DELETE SET NULL constraints are ignored, and not possible to emulate via
> this feature.
>

I have not tested this part per se (clearly I need to expand the
existing test suite), but my reasoning here was that ON DELETE SET
NULL/DEFAULT would still be applied with their defined behaviors (since
we're still calling the underlying RI triggers using SPI), with the same
results; the intent of this feature is just to suppress the RESTRICT
action and cascade the DELETE to all tables (on down the chain) that would
normally block it, without having to manually figure out all the
dependencies, which the database itself can infer.
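
For concreteness, here is a minimal sketch of the intended usage; the
placement of the CASCADE keyword is my guess at the POC grammar and may not
match the patch exactly:

```sql
-- Sketch only: table names and the exact CASCADE syntax are illustrative.
CREATE TABLE parent (id int PRIMARY KEY);
CREATE TABLE child  (pid int REFERENCES parent (id));  -- NO ACTION by default

INSERT INTO parent VALUES (1);
INSERT INTO child  VALUES (1);

DELETE FROM parent WHERE id = 1;           -- fails with an FK violation
DELETE FROM parent WHERE id = 1 CASCADE;   -- POC: cascades to child as if the
                                           -- FK were ON DELETE CASCADE
```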


>   I can't tell you how many times this functionality would have been
>> useful in the field, and despite the expected answer of "define your
>> constraints right in the first place", this is not always an option, nor is
>> the ability to change that easily (or create new constraints that need to
>> revalidate against big tables) always the best option.
>>
>
> Once...but I agreed.
>

Heh.


> That said, I'm happy to quibble about the specific approach to be taken;
>> I've written this based on the most straightforward way I could come up
>> with to accomplish this, but if there are better directions to take to get
>> the equivalent functionality I'm happy to discuss.
>>
>>
> This behavior should require the same permissions as actually creating an
> ON DELETE CASCADE FK on the cascaded-to tables.  i.e., Table Owner role
> membership (the requirement for FK permissions can be assumed by the
> presence of the existing FK constraint and being the table's owner).
>

I'm not sure whether this would be overly prohibitive or not, but if you're
the table owner this should just work, as you point out.  I think this
restriction could be fine for the common case, and if there were a way to
hint, if/when this failed to cascade, at the actual reason for the failure,
I'm fine with that part too. (I was assuming that DELETE permission on the
underlying tables + existence of the FK would be enough in practice, but we
could definitely tighten that up.)
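
As a sketch of the distinction being discussed (hypothetical roles, and the
owner-only rule proposed above rather than the POC's current behavior, which
requires only DELETE permission plus the existing FK):

```sql
-- Hypothetical scenario, not the POC's current behavior.
CREATE TABLE parent (id int PRIMARY KEY);
CREATE TABLE child  (pid int REFERENCES parent (id));

CREATE ROLE bob;
GRANT DELETE ON parent TO bob;   -- DELETE on parent, but not on child,
                                 -- and bob is not the tables' owner

SET ROLE bob;
-- Under the owner-only rule this would error out, even though an FK
-- originally defined as ON DELETE CASCADE would have removed the child
-- rows regardless of bob's permissions on child.
DELETE FROM parent WHERE id = 1 CASCADE;
```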


> Having the defined FK behaviors be more readily changeable, while not
> mitigating this need, is IMO a more important feature to implement.  If
> there is a reason that cannot be implemented (besides no one has bothered
> to take the time) then I would consider that reason to also apply to
> prevent implementing this work-around.
>

Agreed that this would be a nice feature to have too; no one wants to break
FK consistency to change things, or to require a rescan of an already-valid
constraint.


Re: DELETE CASCADE

2021-06-03 Thread David Christensen
On Thu, Jun 3, 2021 at 4:15 PM Isaac Morland 
wrote:

> On Thu, 3 Jun 2021 at 16:49, David Christensen <
> david.christen...@crunchydata.com> wrote:
>
>> Hi -hackers,
>>
>> Presented for discussion is a POC for a DELETE CASCADE functionality,
>> which will allow you one-shot usage of treating existing NO ACTION and
>> RESTRICT FK constraints as if they were originally defined as CASCADE
>> constraints.  I can't tell you how many times this functionality would have
>> been useful in the field, and despite the expected answer of "define your
>> constraints right in the first place", this is not always an option, nor is
>> the ability to change that easily (or create new constraints that need to
>> revalidate against big tables) always the best option.
>>
>
> I would sometimes find this convenient. There are circumstances where I
> don't want every DELETE to blunder all over the database deleting stuff,
> but certain specific DELETEs should take care of the referencing tables.
>
> An additional syntax to say "CASCADE TO table1, table2" would be safer and
> sometimes useful in the case where I know I want to cascade to specific
> other tables but not all (and in particular not to ones I didn't think of
> when I wrote the query); I might almost suggest omitting the cascade to all
> syntax (or maybe have a separate syntax, literally "CASCADE TO ALL TABLES"
> or some such).
>

I'm not fond of the syntax requirements for that level of explicitness here,
plus it seems like it would complicate the functionality of the patch (which
currently just slightly refactors the RI triggers to account for a single
state variable, rather than doing anything smarter than that).  I do
understand the desire/need for visibility into what would be affected by an
offhand statement.

What happens if I don't have delete permission on the referencing table?
> When a foreign key reference delete cascades, I can cause records to
> disappear from a referencing table even if I don't have delete permission
> on that table. This feels like it's just supposed to be a convenience that
> replaces multiple DELETE invocations but one way or the other we need to be
> clear on the behaviour.
>

Did you test this and find a failure? Because it is literally using all of
the same RI proc code/permissions as defined, I would expect that it would
just abort the transaction.  (I am working on expanding the test suite for
this feature to allow for test cases like this, so keep 'em coming... :-))


> Sidebar: isn't this inconsistent with trigger behaviour in general? When I
> say "ON DELETE CASCADE" what I mean and what I get are the same: whenever
> the referenced row is deleted, the referencing row also disappears,
> regardless of the identity or permissions of the role running the actual
> DELETE. But any manually implemented trigger runs as the caller; I cannot
> make the database do something when a table update occurs; I can only make
> the role doing the table update perform some additional actions.
>

Have you found a failure?  Because all this is doing is effectively calling
the guts of the cascade RI routines, so no differences should occur.  If
not, I'm not quite clear on your objection; can you clarify?

David


Re: DELETE CASCADE

2021-06-03 Thread David Christensen
>
> What happens if I don't have delete permission on the referencing table?
>> When a foreign key reference delete cascades, I can cause records to
>> disappear from a referencing table even if I don't have delete permission
>> on that table. This feels like it's just supposed to be a convenience that
>> replaces multiple DELETE invocations but one way or the other we need to be
>> clear on the behaviour.
>>
>
> Did you test this and find a failure? Because it is literally using all of
> the same RI proc code/permissions as defined I would expect that it would
> just abort the transaction.  (I am working on expanding the test suite for
> this feature to allow for test cases like this, so keep 'em coming... :-))
>

Enclosed is a basic test script and the corresponding output run through
`psql -e` (I will adapt it into part of the regression test, but wanted to
get this out there).  TL;DR: DELETE CASCADE behaves exactly as if said
constraint were defined as an ON DELETE CASCADE FK constraint wrt DELETE
permission behavior.  I do agree in this case that it makes sense to throw
an error if we're trying to bypass the RESTRICT behavior and we are not a
member of the table owner role (and since this would be called/checked
recursively for each table involved in the graph, I think we can count on it
reporting the appropriate error message in this case).


test_delete_cascade.out
Description: Binary data


test_delete_cascade.sql
Description: Binary data

