(LOCK TABLE options) “ONLY” and “NOWAIT” are not yet implemented

2021-09-28 Thread bt21tanigaway

Hi,

(LOCK TABLE options) “ONLY” and “NOWAIT” are not yet implemented in 
tab-complete. I made a patch for these options.
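For reference, the full LOCK grammar these completions aim to cover is sketched below (based on the documented syntax; the table name is illustrative):

```sql
-- ONLY locks just the named table, not its descendant tables/partitions;
-- NOWAIT makes LOCK report an error instead of waiting for conflicting locks.
LOCK TABLE ONLY mytable IN ACCESS EXCLUSIVE MODE NOWAIT;
LOCK ONLY mytable NOWAIT;  -- the TABLE keyword and the IN ... MODE clause are both optional
```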


regards,
Koyu Tanigawa
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5cd5838668..12c55f5904 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3599,39 +3599,49 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH("(");

 /* LOCK */
-	/* Complete LOCK [TABLE] with a list of tables */
+	/* Complete LOCK [TABLE] [ONLY] with a list of tables */
 	else if (Matches("LOCK"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables,
-   " UNION SELECT 'TABLE'");
+COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, " UNION SELECT 'TABLE'" " UNION SELECT 'ONLY'");
+
 	else if (Matches("LOCK", "TABLE"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, "");
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, " UNION SELECT 'ONLY'");

+	else if (Matches("LOCK", "TABLE", "ONLY") || Matches("LOCK", "ONLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, "");
 	/* For the following, handle the case of a single table only for now */

-	/* Complete LOCK [TABLE]  with "IN" */
-	else if (Matches("LOCK", MatchAnyExcept("TABLE")) ||
-			 Matches("LOCK", "TABLE", MatchAny))
-		COMPLETE_WITH("IN");
+	/* Complete LOCK [TABLE] [ONLY]  with "IN" or "NOWAIT" */
+	else if (Matches("LOCK", MatchAnyExcept("TABLE|ONLY")) ||
+			 Matches("LOCK", "TABLE", MatchAnyExcept("ONLY")) ||
+			 Matches("LOCK", "ONLY", MatchAny) ||
+			 Matches("LOCK", "TABLE", "ONLY", MatchAny))
+		COMPLETE_WITH("IN", "NOWAIT");

-	/* Complete LOCK [TABLE]  IN with a lock mode */
+	/* Complete LOCK [TABLE] [ONLY]  IN with a lock mode */
 	else if (Matches("LOCK", MatchAny, "IN") ||
-			 Matches("LOCK", "TABLE", MatchAny, "IN"))
+			 Matches("LOCK", "TABLE", MatchAny, "IN") ||
+			 Matches("LOCK", "TABLE", "ONLY", MatchAny, "IN"))
 		COMPLETE_WITH("ACCESS SHARE MODE",
 	  "ROW SHARE MODE", "ROW EXCLUSIVE MODE",
 	  "SHARE UPDATE EXCLUSIVE MODE", "SHARE MODE",
 	  "SHARE ROW EXCLUSIVE MODE",
 	  "EXCLUSIVE MODE", "ACCESS EXCLUSIVE MODE");

-	/* Complete LOCK [TABLE]  IN ACCESS|ROW with rest of lock mode */
+	/* Complete LOCK [TABLE][ONLY]  IN ACCESS|ROW with rest of lock mode */
 	else if (Matches("LOCK", MatchAny, "IN", "ACCESS|ROW") ||
-			 Matches("LOCK", "TABLE", MatchAny, "IN", "ACCESS|ROW"))
+			 Matches("LOCK", "TABLE", MatchAny, "IN", "ACCESS|ROW") ||
+			 Matches("LOCK", "TABLE", "ONLY", MatchAny, "IN", "ACCESS|ROW"))
 		COMPLETE_WITH("EXCLUSIVE MODE", "SHARE MODE");

-	/* Complete LOCK [TABLE]  IN SHARE with rest of lock mode */
+	/* Complete LOCK [TABLE] [ONLY]  IN SHARE with rest of lock mode */
 	else if (Matches("LOCK", MatchAny, "IN", "SHARE") ||
-			 Matches("LOCK", "TABLE", MatchAny, "IN", "SHARE"))
-		COMPLETE_WITH("MODE", "ROW EXCLUSIVE MODE",
-	  "UPDATE EXCLUSIVE MODE");
+			 Matches("LOCK", "TABLE", MatchAny, "IN", "SHARE") ||
+			 Matches("LOCK", "TABLE", "ONLY", MatchAny, "IN", "SHARE"))
+		COMPLETE_WITH("MODE", "ROW EXCLUSIVE MODE", "UPDATE EXCLUSIVE MODE");
+
+  /* Complete LOCK [TABLE] [ONLY]  [IN lockmode MODE] with "NOWAIT"*/
+	else if (HeadMatches("LOCK") && TailMatches("MODE"))
+		COMPLETE_WITH("NOWAIT");

 /* NOTIFY --- can be inside EXPLAIN, RULE, etc */
 	else if (TailMatches("NOTIFY"))


Re: (LOCK TABLE options) “ONLY” and “NOWAIT” are not yet implemented

2021-09-28 Thread Fujii Masao




On 2021/09/28 16:13, bt21tanigaway wrote:

Hi,

(LOCK TABLE options) “ONLY” and “NOWAIT” are not yet implemented in 
tab-complete. I made a patch for these options.


Thanks for the patch!

The patch seems to forget to handle the tab-completion for
"LOCK ONLY  IN".

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: (LOCK TABLE options) “ONLY” and “NOWAIT” are not yet implemented

2021-09-28 Thread bt21tanigaway

On 2021-09-28 16:36, Fujii Masao wrote:

On 2021/09/28 16:13, bt21tanigaway wrote:

Hi,

(LOCK TABLE options) “ONLY” and “NOWAIT” are not yet implemented in 
tab-complete. I made a patch for these options.


Thanks for the patch!
The patch seems to forget to handle the tab-completion for
"LOCK ONLY  IN".


Thanks for your comment!
I attach a new patch fixed to this mail.

Regards,

Koyu Tanigawa





Re: (LOCK TABLE options) “ONLY” and “NOWAIT” are not yet implemented

2021-09-28 Thread bt21tanigaway

On 2021-09-28 17:03, bt21tanigaway wrote:

On 2021-09-28 16:36, Fujii Masao wrote:

On 2021/09/28 16:13, bt21tanigaway wrote:

Hi,

(LOCK TABLE options) “ONLY” and “NOWAIT” are not yet implemented in 
tab-complete. I made a patch for these options.


Thanks for the patch!
The patch seems to forget to handle the tab-completion for
"LOCK ONLY  IN".


Thanks for your comment!
I attach a new patch fixed to this mail.

Regards,

Koyu Tanigawa


Sorry, I forgot to attach patch file.
"fix-tab-complete2.patch" is fixed!

Regards,

Koyu Tanigawadiff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5cd5838668..251b2af9a5 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3599,39 +3599,52 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH("(");

 /* LOCK */
-	/* Complete LOCK [TABLE] with a list of tables */
+	/* Complete LOCK [TABLE] [ONLY] with a list of tables */
 	else if (Matches("LOCK"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables,
-   " UNION SELECT 'TABLE'");
+COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, " UNION SELECT 'TABLE'" " UNION SELECT 'ONLY'");
+
 	else if (Matches("LOCK", "TABLE"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, "");
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, " UNION SELECT 'ONLY'");

+	else if (Matches("LOCK", "TABLE", "ONLY") || Matches("LOCK", "ONLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, "");
 	/* For the following, handle the case of a single table only for now */

-	/* Complete LOCK [TABLE]  with "IN" */
-	else if (Matches("LOCK", MatchAnyExcept("TABLE")) ||
-			 Matches("LOCK", "TABLE", MatchAny))
-		COMPLETE_WITH("IN");
+	/* Complete LOCK [TABLE] [ONLY]  with "IN" or "NOWAIT" */
+	else if (Matches("LOCK", MatchAnyExcept("TABLE|ONLY")) ||
+			 Matches("LOCK", "TABLE", MatchAnyExcept("ONLY")) ||
+			 Matches("LOCK", "ONLY", MatchAny) ||
+			 Matches("LOCK", "TABLE", "ONLY", MatchAny))
+		COMPLETE_WITH("IN", "NOWAIT");

-	/* Complete LOCK [TABLE]  IN with a lock mode */
+	/* Complete LOCK [TABLE] [ONLY]  IN with a lock mode */
 	else if (Matches("LOCK", MatchAny, "IN") ||
-			 Matches("LOCK", "TABLE", MatchAny, "IN"))
+			 Matches("LOCK", "TABLE", MatchAny, "IN") ||
+			 Matches("LOCK", "ONLY", MatchAny, "IN") ||
+			 Matches("LOCK", "TABLE", "ONLY", MatchAny, "IN"))
 		COMPLETE_WITH("ACCESS SHARE MODE",
 	  "ROW SHARE MODE", "ROW EXCLUSIVE MODE",
 	  "SHARE UPDATE EXCLUSIVE MODE", "SHARE MODE",
 	  "SHARE ROW EXCLUSIVE MODE",
 	  "EXCLUSIVE MODE", "ACCESS EXCLUSIVE MODE");

-	/* Complete LOCK [TABLE]  IN ACCESS|ROW with rest of lock mode */
+	/* Complete LOCK [TABLE][ONLY]  IN ACCESS|ROW with rest of lock mode */
 	else if (Matches("LOCK", MatchAny, "IN", "ACCESS|ROW") ||
-			 Matches("LOCK", "TABLE", MatchAny, "IN", "ACCESS|ROW"))
+			 Matches("LOCK", "TABLE", MatchAny, "IN", "ACCESS|ROW") ||
+			 Matches("LOCK", "ONLY", MatchAny, "IN", "ACCESS|ROW") ||
+			 Matches("LOCK", "TABLE", "ONLY", MatchAny, "IN", "ACCESS|ROW"))
 		COMPLETE_WITH("EXCLUSIVE MODE", "SHARE MODE");

-	/* Complete LOCK [TABLE]  IN SHARE with rest of lock mode */
+	/* Complete LOCK [TABLE] [ONLY]  IN SHARE with rest of lock mode */
 	else if (Matches("LOCK", MatchAny, "IN", "SHARE") ||
-			 Matches("LOCK", "TABLE", MatchAny, "IN", "SHARE"))
-		COMPLETE_WITH("MODE", "ROW EXCLUSIVE MODE",
-	  "UPDATE EXCLUSIVE MODE");
+			 Matches("LOCK", "TABLE", MatchAny, "IN", "SHARE") ||
+			 Matches("LOCK", "ONLY", MatchAny, "IN", "SHARE") ||
+			 Matches("LOCK", "TABLE", "ONLY", MatchAny, "IN", "SHARE"))
+		COMPLETE_WITH("MODE", "ROW EXCLUSIVE MODE", "UPDATE EXCLUSIVE MODE");
+
+  /* Complete LOCK [TABLE] [ONLY]  [IN lockmode MODE] with "NOWAIT"*/
+	else if (HeadMatches("LOCK") && TailMatches("MODE"))
+		COMPLETE_WITH("NOWAIT");

 /* NOTIFY --- can be inside EXPLAIN, RULE, etc */
 	else if (TailMatches("NOTIFY"))


Re: improvements in Unicode tables generation code

2021-09-28 Thread Peter Eisentraut


On 20.07.21 13:57, Peter Eisentraut wrote:
Perhaps we should change the script or Makefile so that it doesn't 
create all the maps in one go?


I agree, either comment it better or just write one file at a time. 
I'll take another look at that.


Here is a patch that does it one file (pair) at a time.  The other rules 
besides UCS_to_most.pl actually had the same problem, since they produce 
two output files, so running in parallel called each script twice.  In 
this patch, all of that is heavily refactored and works correctly now. 
Note that UCS_to_most.pl already accepted a command-line argument to 
specify which encoding to work with.
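The multi-output trickery referred to here is the usual idiom: only one of the two outputs depends on the script, and the second output depends on the first with an empty recipe, so a parallel make invokes the script exactly once. A hedged sketch (file names are illustrative, not the actual rules):

```make
# Hypothetical parallel-safe two-output rule: the script runs once,
# for the first target only.
foo_to_utf8.map: UCS_to_FOO.pl FOO.TXT
	$(PERL) -I $(srcdir) $<

# The second output is produced by the same invocation; the empty
# recipe (';') merely orders it after the first target.
utf8_to_foo.map: foo_to_utf8.map ;
```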


Here is an updated patch fixing a thinko that made the previous patch
not actually work.
From d243f21f4f4d38db08f427556256c87681a2c831 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 28 Sep 2021 10:17:31 +0200
Subject: [PATCH v2] Make Unicode makefile parallel-safe

Fix the rules so that each rule is parallel safe, using the same
trickery that we use elsewhere in the tree for rules that produce more
than one output file.  Refactor the whole makefile so that there is
less repetition.

Discussion: 
https://www.postgresql.org/message-id/18e34084-aab1-1b4c-edd1-c4f9fb04f714%40enterprisedb.com
---
 src/backend/utils/mb/Unicode/Makefile | 134 +-
 1 file changed, 45 insertions(+), 89 deletions(-)

diff --git a/src/backend/utils/mb/Unicode/Makefile 
b/src/backend/utils/mb/Unicode/Makefile
index ed6fc07e08..6e54b8f291 100644
--- a/src/backend/utils/mb/Unicode/Makefile
+++ b/src/backend/utils/mb/Unicode/Makefile
@@ -12,101 +12,57 @@ subdir = src/backend/utils/mb/Unicode
 top_builddir = ../../../../..
 include $(top_builddir)/src/Makefile.global
 
-ISO8859MAPS = iso8859_2_to_utf8.map utf8_to_iso8859_2.map \
-   iso8859_3_to_utf8.map utf8_to_iso8859_3.map \
-   iso8859_4_to_utf8.map utf8_to_iso8859_4.map \
-   iso8859_5_to_utf8.map utf8_to_iso8859_5.map \
-   iso8859_6_to_utf8.map utf8_to_iso8859_6.map \
-   iso8859_7_to_utf8.map utf8_to_iso8859_7.map \
-   iso8859_8_to_utf8.map utf8_to_iso8859_8.map \
-   iso8859_9_to_utf8.map utf8_to_iso8859_9.map \
-   iso8859_10_to_utf8.map utf8_to_iso8859_10.map \
-   iso8859_13_to_utf8.map utf8_to_iso8859_13.map \
-   iso8859_14_to_utf8.map utf8_to_iso8859_14.map \
-   iso8859_15_to_utf8.map utf8_to_iso8859_15.map \
-   iso8859_16_to_utf8.map utf8_to_iso8859_16.map
-
-WINMAPS = win866_to_utf8.map utf8_to_win866.map \
-   win874_to_utf8.map utf8_to_win874.map \
-   win1250_to_utf8.map utf8_to_win1250.map \
-   win1251_to_utf8.map utf8_to_win1251.map \
-   win1252_to_utf8.map utf8_to_win1252.map \
-   win1253_to_utf8.map utf8_to_win1253.map \
-   win1254_to_utf8.map utf8_to_win1254.map \
-   win1255_to_utf8.map utf8_to_win1255.map \
-   win1256_to_utf8.map utf8_to_win1256.map \
-   win1257_to_utf8.map utf8_to_win1257.map \
-   win1258_to_utf8.map utf8_to_win1258.map
-
-GENERICMAPS = $(ISO8859MAPS) $(WINMAPS) \
-   gbk_to_utf8.map utf8_to_gbk.map \
-   koi8r_to_utf8.map utf8_to_koi8r.map \
-   koi8u_to_utf8.map utf8_to_koi8u.map
-
-SPECIALMAPS = euc_cn_to_utf8.map utf8_to_euc_cn.map \
-   euc_jp_to_utf8.map utf8_to_euc_jp.map \
-   euc_kr_to_utf8.map utf8_to_euc_kr.map \
-   euc_tw_to_utf8.map utf8_to_euc_tw.map \
-   sjis_to_utf8.map utf8_to_sjis.map \
-   gb18030_to_utf8.map utf8_to_gb18030.map \
-   big5_to_utf8.map utf8_to_big5.map \
-   johab_to_utf8.map utf8_to_johab.map \
-   uhc_to_utf8.map utf8_to_uhc.map \
-   euc_jis_2004_to_utf8.map utf8_to_euc_jis_2004.map \
-   shift_jis_2004_to_utf8.map utf8_to_shift_jis_2004.map
-
-MAPS = $(GENERICMAPS) $(SPECIALMAPS)
-
-ISO8859TEXTS = 8859-2.TXT 8859-3.TXT 8859-4.TXT 8859-5.TXT \
-   8859-6.TXT 8859-7.TXT 8859-8.TXT 8859-9.TXT \
-   8859-10.TXT 8859-13.TXT 8859-14.TXT 8859-15.TXT \
-   8859-16.TXT
-
-WINTEXTS = CP866.TXT CP874.TXT CP936.TXT \
-   CP1250.TXT CP1251.TXT \
-   CP1252.TXT CP1253.TXT CP1254.TXT CP1255.TXT \
-   CP1256.TXT CP1257.TXT CP1258.TXT
-
-GENERICTEXTS = $(ISO8859TEXTS) $(WINTEXTS) \
-   KOI8-R.TXT KOI8-U.TXT
 
-all: $(MAPS)
-
-$(GENERICMAPS): UCS_to_most.pl $(GENERICTEXTS)
-   $(PERL) -I $(srcdir) $<
-
-johab_to_utf8.map utf8_to_johab.map: UCS_to_JOHAB.pl JOHAB.TXT
-   $(PERL) -I $(srcdir) $<
-
-uhc_to_utf8.map utf8_to_uhc.map: UCS_to_UHC.pl windows-949-2000.xml
-   $(PERL) -I $(srcdir) $<
-
-euc_jp_to_utf8.map utf8_to_euc_jp.map: UCS_to_EUC_JP.pl CP932.TXT JIS0212.TXT
-   $(PERL) -I $(srcdir) $<
+# Define a rule to create the map files from downloaded text input
+# files using a script.  Arguments:
+#
+# 1: encoding name used in output files (lower case)
+# 2: script name
+# 3: input text files
+# 4: argument to pass to script (optional)
+#
+# We also collect all the input and output files in variables to
+# define the build and clean rules below.
+#
+# N

Re: storing an explicit nonce

2021-09-28 Thread Ants Aasma
On Mon, 27 Sept 2021 at 23:34, Bruce Momjian  wrote:

> On Sun, Sep  5, 2021 at 10:51:42PM +0800, Sasasu wrote:
> > Hi, community,
> >
> > It looks like we are still considering AES-CBC, AES-XTS, and
> AES-GCM(-SIV).
> > I want to say something that we don't think about.
> >
> > For AES-CBC, the IV should be not predictable. I think LSN or HASH(LSN,
> > block number or something) is predictable. There are many CVE related to
> > AES-CBC with a predictable IV.
>
> The LSN would change every time the page is modified, so while the LSN
> could be predicted, it would not be reused.  However, there is currently
> no work being done on page-level encryption of Postgres.
>

We are still working on our TDE patch. Right now the focus is on
refactoring temporary file access to make the TDE patch itself smaller.
Reconsidering encryption mode choices given concerns expressed is next.
Currently a viable option seems to be AES-XTS with LSN added into the IV.
XTS doesn't have an issue with predictable IV and isn't totally broken in
case of IV reuse.
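The CVE-class weakness of CBC with a predictable IV can be demonstrated without any crypto library: an attacker who knows the upcoming IV can use one chosen-plaintext query to confirm a guess about a previously encrypted block. A self-contained sketch (a toy keyed function stands in for AES; all names are illustrative):

```python
import hashlib
import os

BLOCK = 16

def toy_block_cipher(key: bytes, block: bytes) -> bytes:
    # Stand-in for one AES block encryption: a keyed pseudorandom function
    # suffices to demonstrate the attack, since only ciphertext equality
    # is compared (nothing is ever decrypted).
    return hashlib.sha256(key + block).digest()[:BLOCK]

def cbc_encrypt_block(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    # CBC for a single block: C = E_k(IV XOR P)
    xored = bytes(a ^ b for a, b in zip(iv, plaintext))
    return toy_block_cipher(key, xored)

key = os.urandom(BLOCK)
secret = b"attack at dawn!!"          # victim's 16-byte plaintext block

iv1 = bytes(BLOCK)                    # predictable IVs: attacker knows both
iv2 = bytes(BLOCK - 1) + b"\x01"
victim_ct = cbc_encrypt_block(key, iv1, secret)

def confirm_guess(guess: bytes) -> bool:
    # Attacker submits P' = guess XOR IV1 XOR IV2; the oracle encrypts it
    # under IV2, yielding E_k(IV1 XOR guess), which equals victim_ct
    # if and only if guess == secret.
    crafted = bytes(g ^ a ^ b for g, a, b in zip(guess, iv1, iv2))
    return cbc_encrypt_block(key, iv2, crafted) == victim_ct

print(confirm_guess(b"attack at dawn!!"))  # True
print(confirm_guess(b"retreat at noon!"))  # False
```

An XTS tweak, by contrast, is not fed through this chaining construction, which is why tweak predictability is not a comparable problem.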

-- 

Ants Aasma
Senior Database Engineer
www.cybertec-postgresql.com


Re: [PATCH] Add `verify-system` sslmode to use system CA pool for server cert

2021-09-28 Thread thomas
On Tue, 28 Sep 2021 02:09:11 +0100, Bruce Momjian  said:
> I think public CA's are not a good idea for complex setups since
> they open the ability for an external party to create certificates that
> are trusted by your server's CA, e.g., certificate authentication.

I'm not arguing for, and in fact would argue against, public CA for
client certs.

So that's a separate issue.

Note that mTLS prevents a MITM attack that exposes server data even if
server cert is compromised or re-issued, so if the install is using
client certs (with private CA) then the public CA for server matters
much less.

You can end up at the wrong server, yes, and provide data as INSERT,
but can't steal or corrupt existing data.

And you say for complex setups. Fair enough. But currently I'd say the
default is wrong, and what should be the default is not configurable.

--
typedef struct me_s {
  char name[]  = { "Thomas Habets" };
  char email[] = { "tho...@habets.se" };
  char kernel[]= { "Linux" };
  char *pgpKey[]   = { "http://www.habets.pp.se/pubkey.txt" };
  char pgp[] = { "9907 8698 8A24 F52F 1C2E  87F6 39A4 9EEA 460A 0169" };
  char coolcmd[]   = { "echo '. ./_&. ./_'>_;. ./_" };
} me_t;




Re: Failed transaction statistics to measure the logical replication progress

2021-09-28 Thread Amit Kapila
On Tue, Sep 28, 2021 at 11:35 AM Masahiko Sawada  wrote:
>
> On Tue, Sep 28, 2021 at 1:54 PM Amit Kapila  wrote:
> >
> > >
> > > Then, if, we proceed in this direction,
> > > the place to implement those stats
> > > would be on the LogicalRepWorker struct, instead ?
> > >
> >
> > Or, we can make existing stats persistent and then add these stats on
> > top of it. Sawada-San, do you have any thoughts on this matter?
>
> I think that making existing stats including received_lsn and
> last_msg_receipt_time persistent by using stats collector could cause
> massive reporting messages. We can report these messages with a
> certain interval to reduce the amount of messages but we will end up
> seeing old stats on the view.
>

Can't we keep the current and new stats both in-memory and persist on
disk? So, the persistent stats data will be used to fill the in-memory
counters after restarting of workers, otherwise, we will always refer
to in-memory values.

> Another idea could be to have a separate
> view, say pg_stat_subscription_xact but I'm not sure it's a better
> idea.
>

Yeah, that is another idea but I am afraid that having three different
views for subscription stats will be too much. I think it would be
better if we can display these additional stats via the existing view
pg_stat_subscription or the new view pg_stat_subscription_errors (or
whatever name we want to give it).

-- 
With Regards,
Amit Kapila.




Re: Gather performance analysis

2021-09-28 Thread Amit Kapila
On Tue, Sep 28, 2021 at 12:19 PM Dilip Kumar  wrote:
>
> On Mon, Sep 27, 2021 at 10:52 PM Robert Haas  wrote:
>
> >
> > And most of the time, that's probably a good bet. But, if you do
> > somehow hit the losing case repeatedly, then you could see a
> > significant regression. And that might explain Tomas's results.
> > Perhaps for some reason they just happen to hit that case over and
> > over again. If that's true, it would be useful to know why it happens
> > in that case and not others, because then maybe we could avoid the
> > problem somehow. However, I'm not sure how to figure that out, and I'm
> > not even entirely sure it's important to figure it out.
> >
>
> Yeah, if it is losing in some cases then it is definitely good to know
> the reason, I was just looking into the performance numbers shared by
> Tomas in query-results, I can see the worst case is
> with 1000 rows, 10 columns and 4 threads and queue size 64k.
> Basically, if we see the execution time with head is ~804ms whereas
> with patch it is ~1277 ms.  But then I just tried to notice the
> pattern with different queue size so number are like below,
>
>           16k       64k       256k      1024k
> Head      1232.779  804.24    1134.723  901.257
> Patch     1371.493  1277.705  862.598   783.481
>
> So what I have noticed is that in most of the cases on head as well as
> with the patch, increasing the queue size make it faster, but with
> head suddenly for this particular combination of rows, column and
> thread the execution time is very low for 64k queue size and then
> again the execution time increased with 256k queue size and then
> follow the pattern.  So this particular dip in the execution time on
> the head looks a bit suspicious to me.
>

I concur with your observation. Isn't it possible to repeat the same
test in the same environment to verify these results?

-- 
With Regards,
Amit Kapila.




RE: Added schema level support for publication.

2021-09-28 Thread tanghy.f...@fujitsu.com
On Monday, September 27, 2021 1:32 PM, vignesh C  wrote:

>Attached v33 patch has the preprocess_pubobj_list review comment fix
>suggested by Alvaro at [1]. The
>v33-0006-Alternate-grammar-for-ALL-TABLES-IN-SCHEMA.patch patch has
>the grammar changes as suggested by Alvaro at [1]. If we agree this is
>better, I will merge this into the 0001 patch.
>[1] - 
>https://www.postgresql.org/message-id/202109241325.eag5g6mpvoup%40alvherre.pgsql

About the schema patch, I think a schema and a table which belongs to this 
schema shouldn't be specified at the same time. 
But what if someone uses "ALTER TABLE ... SET SCHEMA ..." after "CREATE 
PUBLICATION"?

For example:

create schema sch1;
create schema sch2;
create table sch2.t (a int);
create publication pub1 for all tables in schema sch1, table sch2.t;
alter table sch2.t set schema sch1;

postgres=# \dRp+
                              Publication pub1
  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
----------+------------+---------+---------+---------+-----------+----------
 postgres | f          | t       | t       | t       | t         | f
Tables:
    "sch1.t"
Tables from schemas:
    "sch1"

Table t has been output twice.
I think this should not be supported; should we do something about this scenario?

Regards
Tang


Re: Gather performance analysis

2021-09-28 Thread Tomas Vondra

On 9/28/21 12:53 PM, Amit Kapila wrote:

On Tue, Sep 28, 2021 at 12:19 PM Dilip Kumar  wrote:


On Mon, Sep 27, 2021 at 10:52 PM Robert Haas  wrote:



And most of the time, that's probably a good bet. But, if you do
somehow hit the losing case repeatedly, then you could see a
significant regression. And that might explain Tomas's results.
Perhaps for some reason they just happen to hit that case over and
over again. If that's true, it would be useful to know why it happens
in that case and not others, because then maybe we could avoid the
problem somehow. However, I'm not sure how to figure that out, and I'm
not even entirely sure it's important to figure it out.



Yeah, if it is losing in some cases then it is definitely good to know
the reason, I was just looking into the performance numbers shared by
Tomas in query-results, I can see the worst case is
with 1000 rows, 10 columns and 4 threads and queue size 64k.
Basically, if we see the execution time with head is ~804ms whereas
with patch it is ~1277 ms.  But then I just tried to notice the
pattern with different queue size so number are like below,

         16k       64k       256k      1024k
Head     1232.779  804.24    1134.723  901.257
Patch    1371.493  1277.705  862.598   783.481

So what I have noticed is that in most of the cases on head as well as
with the patch, increasing the queue size make it faster, but with
head suddenly for this particular combination of rows, column and
thread the execution time is very low for 64k queue size and then
again the execution time increased with 256k queue size and then
follow the pattern.  So this particular dip in the execution time on
the head looks a bit suspicious to me.



I concur with your observation. Isn't it possible to repeat the same
test in the same environment to verify these results?



I can repeat any tests we need on that machine, of course.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Gather performance analysis

2021-09-28 Thread Dilip Kumar
On Tue, Sep 28, 2021 at 5:21 PM Tomas Vondra
 wrote:
> >> Yeah, if it is losing in some cases then it is definitely good to know
> >> the reason, I was just looking into the performance numbers shared by
> >> Tomas in query-results, I can see the worst case is
> >> with 1000 rows, 10 columns and 4 threads and queue size 64k.
> >> Basically, if we see the execution time with head is ~804ms whereas
> >> with patch it is ~1277 ms.  But then I just tried to notice the
> >> pattern with different queue size so number are like below,
> >>
> >>          16k       64k       256k      1024k
> >> Head     1232.779  804.24    1134.723  901.257
> >> Patch    1371.493  1277.705  862.598   783.481
> >>
> >> So what I have noticed is that in most of the cases on head as well as
> >> with the patch, increasing the queue size make it faster, but with
> >> head suddenly for this particular combination of rows, column and
> >> thread the execution time is very low for 64k queue size and then
> >> again the execution time increased with 256k queue size and then
> >> follow the pattern.  So this particular dip in the execution time on
> >> the head looks a bit suspicious to me.
> >>
> >
> > I concur with your observation. Isn't it possible to repeat the same
> > test in the same environment to verify these results?
> >
>
> I can repeat any tests we need on that machine, of course.
>

I think that would be great, can we just test this specific target
where we are seeing a huge dip with the patch, e.g.
with 1000 rows, 10 columns and 4 threads, and queue size 64k.  In
my performance machine, I tried to run this test multiple times but on
the head, it is taking ~2000 ms whereas with the patch it is ~1500 ms,
so I am not able to reproduce this.  So it would be good if you can
run only this specific test and repeat it a couple of times on your
performance machine.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: when the startup process doesn't (logging startup delays)

2021-09-28 Thread Nitin Jadhav
> That's really not what I'm complaining about. I think if we're going
> to give an example at all, 10ms is a better example than 100ms,
> because 10s is a value that people are more likely to find useful. But
> I'm not sure that it's necessary to mention a specific value, and if
> it is, I think it needs to be phrased in a less confusing way.
>
> > >>It also looks pretty silly to say that if you set the value to 10s, 
> > >>something
> > >>will happen every 10s. What else would anyone expect to happen?
> >
> > @Robert: that's consistent with existing documentation, even though it might
> > seem obvious and silly to us.
> >
> > | For example, if you set this to 250ms then all automatic vacuums and 
> > analyzes that run 250ms or longer will be logged
> > | For example, if you set it to 250ms then all SQL statements that run 
> > 250ms or longer will be logged
>
> Fair enough, but I still don't like it much. I tried my hand at
> rewriting this and came up with the attached:
>
> + Sets the amount of time after which the startup process will log
> + a message about a long-running operation that is still in progress,
> + as well as the interval between further progress messages for that
> + operation. This setting is applied separately to each operation.
> + For example, if syncing the data directory takes 25 seconds and
> + thereafter resetting unlogged relations takes 8 seconds, and if this
> + setting has the default value of 10 seconds, then a message will be
> + logged for syncing the data directory after it has been in progress
> + for 10 seconds and again after it has been in progress for 20 seconds,
> + but nothing will be logged for resetting unlogged relations.
> + A setting of 0 disables the feature.
>
> I prefer this to Nitin's version because I think it could be unclear
> to someone that the value applies separately to each operation,
> whereas I don't think we need to document that we can't guarantee that
> the messages will be perfectly on time - that's true of every kind of
> scheduled event in pretty much every computer system - or what happens
> if the system clock goes backwards. Those are things we should try to
> get right, as well as we can anyway, but we don't need to tell the
> user that we got them right.

I thought mentioning the unit in milliseconds and the example in
seconds would confuse the user, so I changed the example to
milliseconds. The message behind the above description looks good to me
however I feel some sentences can be explained in less words. The
information related to the units is missing and I feel it should be
mentioned in the document. The example says, if the setting has the
default value of 10 seconds, then it explains the behaviour. I feel it
may not be the default value; it can be any value set by the user. So
mentioning 'default' in the example does not look good to me. I feel
we just have to mention "if this setting is set to the value of 10
seconds". Below is the modified version of the above information.

+ Sets the amount of time after every such interval the startup process
+ will log a message about a long-running operation that is still in
+ progress. This setting is applied separately to each operation.
+ For example, if syncing the data directory takes 25 seconds and
+ thereafter resetting unlogged relations takes 8 seconds, and if this
+ setting is set to the value of 10 seconds, then a message will be
+ logged for syncing the data directory after it has been in progress
+ for 10 seconds and again after it has been in progress for 20 seconds,
+ but nothing will be logged for resetting unlogged relations.
+ A setting of 0 disables the feature. If this value
+ is specified without units, it is taken as milliseconds.
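As a concrete illustration of the proposed semantics, a hypothetical postgresql.conf fragment (values chosen only for the example):

```
# Emit a progress message every 10 seconds, separately for each
# long-running startup operation:
log_startup_progress_interval = 10s    # a bare '10000' would mean 10000 ms
# Setting it to 0 disables the feature.
```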

> Well, I see that -1 is now disallowed, and that's good as far as it
> goes, but 0 still does not actually disable the feature. I don't
> understand why you posted the previous version of the patch without
> testing that it works, and I even less understand why you are posting
> another version without fixing the bug that I pointed out to you in
> the last version.

I had added additional code to check the value of the
'log_startup_progress_interval' variable and  disable the feature in
case of -1 in the earlier versions of the patch (Specifically
v9.patch). There was a review comment for v9 patch and it resulted in
major refactoring of the patch. The comment was

> With these changes you'd have only 1 place in the code that needs to
> care about log_startup_progress_interval, as opposed to 3 as you have
> it currently, and only one place that enables the timeout, as opposed
> to 2 as you have it currently. I think that would be tidier.

Based on the above comment and the idea behind enabling the timer, it
does not log anything if the value is set to -1. So I

Re: [PATCH] Print error when libpq-refs-stamp fails

2021-09-28 Thread Daniel Gustafsson
> On 28 Sep 2021, at 00:55, Rachel Heaton  wrote:

> While developing I got this error and it was difficult to figure out what was 
> going on. 
> 
> Thanks to Jacob, I was able to learn the context of the failure, so we 
> created this small patch. 

I can see that, and I think this patch makes sense even though we don't have
much of a precedent for outputting informational messages from Makefiles.

> The text of the error message, of course, is up for debate, but hopefully 
> this will make it more clear to others. 

+   echo 'libpq must not call exit'; exit 1; \

Since it's not actually libpq which calls exit (no such patch would ever be
committed), I think it would be clearer to indicate that a library linked to is
the culprit.  How about something like "libpq must not be linked against any
library calling exit"?
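For context, the check under discussion follows this general pattern; a simplified sketch with the suggested wording (the exact rule in the tree differs in details such as symbol filtering):

```make
# Fail the build if the freshly built shared library ends up with an
# undefined reference to exit() -- i.e., it (or a library it links
# against) could call exit.
libpq-refs-stamp: $(shlib)
	@if nm -A -u $< 2>/dev/null | grep -w exit >/dev/null; then \
	    echo 'libpq must not be linked against any library calling exit'; \
	    exit 1; \
	fi
	touch $@
```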

--
Daniel Gustafsson   https://vmware.com/





Re: PROXY protocol support

2021-09-28 Thread Magnus Hagander
On Fri, Sep 10, 2021 at 1:44 AM Jacob Champion  wrote:
>
> On Wed, 2021-09-08 at 18:51 +, Jacob Champion wrote:
> > I still owe you that overall review. Hoping to get to it this week.
>
> And here it is. I focused on things other than UnwrapProxyConnection()
> for this round, since I think that piece is looking solid.

Thanks!
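For context on what UnwrapProxyConnection has to deal with: a PROXY v2 header is a fixed 16-byte prologue (12-byte signature, version/command byte, family byte, payload length) followed by an address block. A hedged Python sketch of the parsing, not the patch's C code (function and variable names are illustrative):

```python
import socket
import struct

PROXY_V2_SIG = b"\r\n\r\n\x00\r\nQUIT\n"  # 12-byte magic from the PROXY spec

def parse_proxy_v2(buf: bytes):
    """Return (src_ip, src_port, dst_ip, dst_port) for a TCP-over-IPv4 header."""
    if buf[:12] != PROXY_V2_SIG:
        raise ValueError("not a PROXY v2 header")
    ver_cmd, fam, length = struct.unpack("!BBH", buf[12:16])
    if ver_cmd >> 4 != 2:
        raise ValueError("unsupported PROXY version")
    addr = buf[16:16 + length]
    if fam == 0x11 and length >= 12:  # TCP over IPv4
        src, dst, sport, dport = struct.unpack("!4s4sHH", addr[:12])
        return socket.inet_ntoa(src), sport, socket.inet_ntoa(dst), dport
    raise ValueError("address family not handled in this sketch")

# Build a sample header: version 2 / PROXY command (0x21), TCP4 (0x11).
hdr = (PROXY_V2_SIG
       + struct.pack("!BBH", 0x21, 0x11, 12)
       + socket.inet_aton("10.0.0.1") + socket.inet_aton("10.0.0.2")
       + struct.pack("!HH", 50000, 5432))
print(parse_proxy_v2(hdr))  # ('10.0.0.1', 50000, '10.0.0.2', 5432)
```

Since the proxy sends this header immediately on connect, the whole thing is normally available in the first read, which is why a short dedicated timeout (as discussed below) would suffice.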


> > +   if (port->isProxy)
> > +   {
> > +   ereport(LOG,
> > +   (errcode_for_socket_access(),
> > +errmsg("Ident authentication cannot be 
> > used over PROXY connections")));
>
> What are the rules on COMMERROR vs LOG when dealing with authentication
> code? I always thought COMMERROR was required, but I see now that LOG
> (among others) is suppressed to the client during authentication.

I honestly don't know :) In this case, LOG is what's used for all the
other message in errors in ident_inet(), so I picked it for
consistency.


> >  #ifdef USE_SSL
> > /* No SSL when disabled or on Unix sockets */
> > -   if (!LoadedSSL || IS_AF_UNIX(port->laddr.addr.ss_family))
> > +   if (!LoadedSSL || (IS_AF_UNIX(port->laddr.addr.ss_family) 
> > && !port->isProxy))
> > SSLok = 'N';
> > else
> > SSLok = 'S';/* Support for SSL */
> > @@ -2087,7 +2414,7 @@ retry1:
> >
> >  #ifdef ENABLE_GSS
> > /* No GSSAPI encryption when on Unix socket */
> > -   if (!IS_AF_UNIX(port->laddr.addr.ss_family))
> > +   if (!IS_AF_UNIX(port->laddr.addr.ss_family) || 
> > port->isProxy)
> > GSSok = 'G';
>
> Now that we have port->daddr, could these checks be simplified to just
> IS_AF_UNIX(port->daddr...)? Or is there a corner case I'm missing for
> the port->isProxy case?

Yeah, I think they could.


> > +* Note: AuthenticationTimeout is applied here while waiting for the
> > +* startup packet, and then again in InitPostgres for the duration of 
> > any
> > +* authentication operations.  So a hostile client could tie up the
> > +* process for nearly twice AuthenticationTimeout before we kick him 
> > off.
>
> This comment needs to be adjusted after the move; waiting for the
> startup packet comes later, and it looks like we can now tie up 3x the
> timeout for the proxy case.

Good point.


> > +   /* Check if this is a proxy connection and if so unwrap the 
> > proxying */
> > +   if (port->isProxy)
> > +   {
> > +   enable_timeout_after(STARTUP_PACKET_TIMEOUT, 
> > AuthenticationTimeout * 1000);
> > +   if (UnwrapProxyConnection(port) != STATUS_OK)
> > +   proc_exit(0);
>
> I think the timeout here could comfortably be substantially less than
> the overall authentication timeout, since the proxy should send its
> header immediately even if the client takes its time with the startup
> packet. The spec suggests allowing 3 seconds minimum to cover a
> retransmission. Maybe something to tune in the future?

Maybe. I'll leave it with a new comment for now about us doing it, and
that we may want to consider it in the future.

>
> > +   /* Also listen to the PROXY port on this address, 
> > if configured */
> > +   if (ProxyPortNumber)
> > +   {
> > +   if (strcmp(curhost, "*") == 0)
> > +   socket = 
> > StreamServerPort(AF_UNSPEC, NULL,
> > +   
> >   (unsigned short) ProxyPortNumber,
> > +   
> >   NULL,
> > +   
> >   ListenSocket, MAXLISTEN);
>
> Sorry if you already talked about this upthread somewhere, but it looks
> like another downside of treating "proxy mode" as a server-wide on/off
> switch is that it cuts the effective MAXLISTEN in half, from 64 to 32,
> since we're opening two sockets for every address. If I've understood
> that correctly, it might be worth mentioning in the docs.

Correct. I don't see a way to avoid that without complicating things
(as long as we want the ports to be separate), but I also don't see it
as something that's really likely to be an issue in reality.

I would agree with documenting it, but I can't actually find us
documenting the MAXLISTEN value anywhere. Do we?


> > -   if (!success && elemlist != NIL)
> > +   if (socket == NULL && elemlist != NIL)
> > ereport(FATAL,
> > (errmsg("could not create any 
> > TCP/IP sockets")));
>
> With this change in PostmasterMain, it looks like `success` is no
> longer a useful variable. But I'm not convinced that this is the
> correct logic -- this is just chec

Re: pgcrypto support for bcrypt $2b$ hashes

2021-09-28 Thread Daniel Gustafsson
> On 28 Sep 2021, at 05:15, Daniel Fone  wrote:
> 
> Hi Daniel,
> 
> Thanks for the feedback.
> 
>> On 26/09/2021, at 12:09 AM, Daniel Gustafsson  wrote:
>> 
>> But 2b and 2a hashes aren't equal, although very similar.  2a should have the
>> many-buggy to one-correct collision safety and 2b hashes shouldn't.  The fact
>> that your hashes work isn't conclusive evidence.
> 
> I was afraid this might be a bit naive. Re-reading the crypt_blowfish release 
> notes, it’s principally the changes introducing $2y$ into 1.2 that we need, 
> with support for OpenBSD $2b$ introduced in 1.3. Do I understand this 
> correctly?

Yeah, we'd want a port of 1.3 into pgcrypto essentially.

>> Upgrading our crypt_blowfish.c to the upstream 1.3 version would be the 
>> correct
>> fix IMO, but since we have a few local modifications it's not a drop-in.  I
>> don't think it would be too hairy, but one needs to be very careful when
>> dealing with crypto.
> 
> My C experience is limited, but I can make an initial attempt if the effort 
> would be worthwhile. Is this realistically a patch that a newcomer to the 
> codebase should attempt?

I don't see why not, the best first patches are those scratching an itch.  If
you feel up for it then give it a go, I - and the rest of pgsql-hackers - can
help if you need to bounce ideas.  Many of the changes in the pgcrypto BF code
are whitespace and formatting, which are performed via pgindent.  I would
suggest familiarizing yourself with pgindent in order to tease them out more
easily.  Another set of changes is around error handling and reporting, which
is Postgres-specific.

>> Actually it is, in table F.16 in the below documentation page we refer to our
>> supported level as "Blowfish-based, variant 2a”.
> 
> Sorry I wasn’t clear. My point was that the docs only mention $2a$, and $2x$ 
> isn’t mentioned even though pgcrypto supports it. As part of the upgrade to 
> 1.3, perhaps the docs can be updated to mention variants x, y, and b as well.

Aha, now I see what you mean, yes you are right.  I think the docs should be
updated regardless of the above as a first step to properly match what's in the
tree. Unless there are objections I propose to apply the attached.

--
Daniel Gustafsson   https://vmware.com/



pgcrypto_blowfish.diff
Description: Binary data



Re: Couldn't we mark enum_in() as immutable?

2021-09-28 Thread Daniel Gustafsson
> On 27 Sep 2021, at 23:54, Tom Lane  wrote:

> So it seems like the results of
> enum_in() are immutable for as long as anybody could care about them,
> and that's good enough.

+1. I can't think of a situation where this wouldn't hold.

--
Daniel Gustafsson   https://vmware.com/





Re: [PATCH] Support pg_ident mapping for LDAP

2021-09-28 Thread Magnus Hagander
On Wed, Sep 1, 2021 at 8:43 PM Jacob Champion  wrote:
>
> On Wed, 2021-09-01 at 15:42 +, Jacob Champion wrote:
> > The cfbot found a failure in postgres_fdw, which I completely neglected
> > in my design. I think the desired functionality should be to allow the
> > ldapuser connection option during CREATE USER MAPPING but not CREATE
> > SERVER.
>
> Fixed in v2, attached.

A couple of quick comments from a quick look-over:

I'm a bit hesitant about the ldapuser libpq parameter. Do we really
want to limit ourselves to just ldap, if we allow this? I mean, why
not allow say radius or pam to also specify a different username for
the external system? If we want to do that, now or in the future, we
should have a much more generic parameter name, something like
authuser?

Why do we actually need ldap_map_dn? Shouldn't this just be what
happens if you specify map= on an ldap connection?

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: Trap errors from streaming child in pg_basebackup to exit early

2021-09-28 Thread Magnus Hagander
On Fri, Sep 3, 2021 at 11:53 AM Daniel Gustafsson  wrote:
>
> > On 1 Sep 2021, at 12:28, Bharath Rupireddy 
> >  wrote:
> >
> > On Wed, Sep 1, 2021 at 1:56 PM Daniel Gustafsson  wrote:
> >> A v2 with the above fixes is attached.
> >
> > Thanks for the updated patch. Here are some comments:
> >
> > 1) Do we need to set bgchild = -1 before the exit(1); in the code
> > below so that we don't kill(bgchild, SIGTERM); unnecessarily in
> > kill_bgchild_atexit?
>
> Good point. We can also inspect bgchild_exited in kill_bgchild_atexit.
>
> > 2) Missing "," after "On Windows, we use a ."
> > + * that time. On Windows we use a background thread which can communicate
> >
> > 3) How about "/* Flag to indicate whether or not child process exited
> > */" instead of +/* State of child process */?
>
> Fixed.
>
> > 4) Instead of just exiting from the main pg_basebackup process when
> > the child WAL receiver dies, can't we think of restarting the child
> > process, probably with the WAL streaming position where it left off or
> > stream from the beginning? This way, the work that the main
> > pg_basebackup has done so far doesn't get wasted. I'm not sure if this
> > affects the pg_basebackup functionality. We can restart the child
> > process for 1 or 2 times, if it still dies, we can kill the main
> > pg_basebackup process too. Thoughts?
>
> I was toying with the idea, but I ended up not pursuing it.  This error is 
> well
> into the “really shouldn’t happen, but can” territory and it’s quite likely
> that some level of manual intervention is required to make it successfully
> restart.  I’m not convinced that adding complicated logic to restart (and even
> more complicated tests to simulate and test it) will be worthwhile.
>

I think the restart scenario, while nice, definitely means moving the
goalposts quite far. Let's get this detection in first at least, and
then we can always consider that a separate patch in the future.

Might be worth noting in one of the comments the difference in
behaviour if the backend process/thread *crashes* -- that is, on Unix
it will be detected via the signal handler and on Windows the whole
process including the main thread will die, so there is nothing to
detect.
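The Unix-side mechanism described above can be sketched in a few lines: a
SIGCHLD handler lets the parent notice that a background child died, so it can
react instead of waiting indefinitely. This is a toy illustration under POSIX
assumptions, not pg_basebackup's actual code (which is written in C).

```python
import os
import signal
import time

# Hypothetical sketch: parent notices a failed background child via SIGCHLD.
child_exited = False

def handle_sigchld(signum, frame):
    global child_exited
    os.waitpid(-1, os.WNOHANG)  # reap the dead child
    child_exited = True

signal.signal(signal.SIGCHLD, handle_sigchld)

pid = os.fork()
if pid == 0:
    os._exit(1)  # the "background process" fails immediately

time.sleep(0.2)  # the parent is busy with its main work
print("child exited:", child_exited)
```

On Windows, as noted above, there is no analogue of this: a crashing thread
takes the whole process down, so there is nothing for the main thread to
detect.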

Other places in the code just refers to the background process as "the
background process".  The term "WAL receiver" is new from this patch.
While I agree it's in many ways a better term, I think (1) we should
try to be consistent, and (2) maybe use a different term than "WAL
receiver" specifically because we have a backend component with that
name.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: (LOCK TABLE options) “ONLY” and “NOWAIT” are not yet implemented

2021-09-28 Thread Shinya Kato

On 2021-09-28 17:06, bt21tanigaway wrote:

On 2021-09-28 17:03, bt21tanigaway wrote:

On 2021-09-28 16:36, Fujii Masao wrote:

On 2021/09/28 16:13, bt21tanigaway wrote:

Hi,

(LOCK TABLE options) “ONLY” and “NOWAIT” are not yet implemented in 
tab-complete. I made a patch for these options.


Thanks for the patch!
The patch seems to forget to handle the tab-completion for
"LOCK ONLY  IN".


Thanks for your comment!
I attach a new patch fixed to this mail.

Regards,

Koyu Tanigawa


Sorry, I forgot to attach patch file.
"fix-tab-complete2.patch" is fixed!

Regards,

Koyu Tanigawa

Thank you for your patch.
I have two comments.

1. When I executed git apply, an error occurred.
---
$ git apply ~/Downloads/fix-tab-complete2.patch
/home/penguin/Downloads/fix-tab-complete2.patch:14: indent with spaces.
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, " UNION SELECT 'TABLE'" " UNION SELECT 'ONLY'");

warning: 1 line adds whitespace errors.
---

2. The command "LOCK TABLE a, b;" can be executed, but tab-completion 
doesn't work properly. Is it OK?


--
Regards,

--
Shinya Kato
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Couldn't we mark enum_in() as immutable?

2021-09-28 Thread Andrew Dunstan


On 9/27/21 5:54 PM, Tom Lane wrote:
> Currently enum_in() is marked as stable, on the reasonable grounds
> that it depends on system catalog contents.  However, after the
> discussion at [1] I'm wondering why it wouldn't be perfectly safe,
> and useful, to mark it as immutable.
>
> Here's my reasoning: "immutable" promises that the function will
> always give the same results for the same inputs.  However, one of
> the inputs is the type OID for the desired enum type.  It's certainly
> possible that the enum type could be dropped later, and then its type
> OID could be recycled, so that at some future epoch enum_in() might
> give a different result for the "same" type OID.  But I cannot think
> of any situation where a stored output value of enum_in() would
> outlive the dropping of the enum type.  It certainly couldn't happen
> for a table column of that type, nor would it be safe for a stored
> rule to outlive a type it mentions.  So it seems like the results of
> enum_in() are immutable for as long as anybody could care about them,
> and that's good enough.
>
> Moreover, if it's *not* good enough, then our existing practice of
> folding enum literals to OID constants on-sight must be unsafe too.
>
> So it seems like this would be okay, and if we did it it'd eliminate
> some annoying corner cases for query optimization, as seen in the
> referenced thread.
>
> I think that a similar argument could be made about enum_out, although
> maybe I'm missing some interesting case there.
>
> Thoughts?
>
>   


The value returned depends on the label values in pg_enum, so if someone
decided to rename a label that would affect it, no? Same for enum_out.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





RE: (LOCK TABLE options) “ONLY” and “NOWAIT” are not yet implemented

2021-09-28 Thread Shinya11.Kato
>-Original Message-
>From: bt21tanigaway 
>Sent: Tuesday, September 28, 2021 5:06 PM
>To: Fujii Masao ;
>pgsql-hackers@lists.postgresql.org
>Subject: Re: (LOCK TABLE options) “ONLY” and “NOWAIT” are not yet
>implemented
>
>On 2021-09-28 17:03, bt21tanigaway wrote:
>> On 2021-09-28 16:36, Fujii Masao wrote:
>>> On 2021/09/28 16:13, bt21tanigaway wrote:
 Hi,

 (LOCK TABLE options) “ONLY” and “NOWAIT” are not yet implemented in
 tab-complete. I made a patch for these options.
>>>
>>> Thanks for the patch!
>>> The patch seems to forget to handle the tab-completion for "LOCK ONLY
>>>  IN".
>>
>> Thanks for your comment!
>> I attach a new patch fixed to this mail.
>>
>> Regards,
>>
>> Koyu Tanigawa
>
>Sorry, I forgot to attach patch file.
>"fix-tab-complete2.patch" is fixed!
>
>Regards,
>
>Koyu Tanigawa
Thank you for your patch.
I have two comments.

1. When I executed git apply, an error occurred.
---
$ git apply ~/Downloads/fix-tab-complete2.patch
/home/penguin/Downloads/fix-tab-complete2.patch:14: indent with spaces.
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, " UNION SELECT 'TABLE'" " UNION SELECT 'ONLY'");
warning: 1 line adds whitespace errors.
---

2. The command "LOCK TABLE a, b;" can be executed, but tab-completion doesn't 
work properly. Is it OK?

-- 
Regards,

--
Shinya Kato
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION


Re: Fixing WAL instability in various TAP tests

2021-09-28 Thread Andrew Dunstan


On 9/27/21 10:20 PM, Tom Lane wrote:
> Michael Paquier  writes:
>> On Mon, Sep 27, 2021 at 04:19:27PM -0400, Tom Lane wrote:
>>> I tried the same thing (i.e., re-enable bloom's TAP test) on my laptop
>>> just now, and it passed fine.  The laptop is not exactly the same
>>> as longfin was in 2018, but it ought to be close enough.  Not sure
>>> what to make of that --- maybe the failure is only intermittent,
>>> or else we fixed the underlying issue since then.
>> Honestly, I have no idea what change in the backend matters here.  And
>> it is not like bloom has changed in any significant way since d3c09b9.
> I went so far as to check out 03faa4a8dd on longfin's host, and I find
> that I cannot reproduce the failure shown at
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=longfin&dt=2018-11-25+23%3A59%3A03
>
> So that's the same hardware, and identical PG source tree, and different
> results.  This seems to leave only two theories standing:
>
> 1.  It was a since-fixed macOS bug.  (Unlikely, especially if we also saw
> it on other platforms.)
>
> 2.  The failure manifested only in the buildfarm, not under manual "make
> check".  This is somewhat more plausible, especially since subsequent
> buildfarm script changes might then explain why it went away.  But I have
> no idea what the "subsequent script changes" might've been.
>
>   


Nothing I can think of.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: verify_heapam for sequences?

2021-09-28 Thread Peter Eisentraut

On 30.08.21 21:00, Mark Dilger wrote:

The attached patch changes both contrib/amcheck/ and src/bin/pg_amcheck/ to 
allow checking sequences.  In both cases, the changes required are fairly 
minor, though they both entail some documentation changes.

It seems fairly straightforward that if a user calls verify_heapam() on a 
sequence, then the new behavior is what they want.  It is not quite so clear 
for pg_amcheck.

In pg_amcheck, the command-line arguments allow discriminating between tables and indexes 
with materialized views quietly treated as tables (which, of course, they are.)  In v14, 
sequences were not treated as tables, nor checked at all.  In this new patch, sequences 
are quietly treated the same way as tables.  By "quietly", I mean there are no 
command-line switches to specifically filter them in or out separately from filtering 
ordinary tables.

This is a user-facing behavioral change, and the user might not be imagining 
sequences specifically when specifying a table name pattern that matches both 
tables and sequences.  Do you see any problem with that?  It was already true 
that materialized views matching a table name pattern would be checked, so this 
new behavior is not entirely out of line with the old behavior.

The new behavior is documented, and since I'm updating the docs, I made the 
behavior with respect to materialized views more explicit.


committed




Re: POC: Cleaning up orphaned files using undo logs

2021-09-28 Thread Antonin Houska
Amit Kapila  wrote:

> On Mon, Sep 20, 2021 at 10:55 AM Antonin Houska  wrote:
> >
> > Amit Kapila  wrote:
> >
> > > On Thu, Sep 9, 2021 at 8:33 PM Antonin Houska  wrote:
> >
> > > > The problem of the temporary undo log is that it's loaded into local 
> > > > buffers
> > > > and that backend can exit w/o flushing local buffers to disk, and thus 
> > > > we are
> > > > not guaranteed to find enough information when trying to discard the 
> > > > undo log
> > > > the backend wrote. I'm thinking about the following solutions:
> > > >
> > > > 1. Let the backend manage temporary undo log on its own (even the slot
> > > >metadata would stay outside the shared memory, and in particular the
> > > >insertion pointer could start from 1 for each session) and remove the
> > > >segment files at the same moment the temporary relations are removed.
> > > >
> > > >However, by moving the temporary undo slots away from the shared 
> > > > memory,
> > > >computation of oldestFullXidHavingUndo (see the PROC_HDR structure) 
> > > > would
> > > >be affected. It might seem that a transaction which only writes undo 
> > > > log
> > > >for temporary relations does not need to affect 
> > > > oldestFullXidHavingUndo,
> > > >but it needs to be analyzed thoroughly. Since oldestFullXidHavingUndo
> > > >prevents transactions to be truncated from the CLOG too early, I 
> > > > wonder if
> > > >the following is possible (This scenario is only applicable to the 
> > > > zheap
> > > >storage engine [1], which is not included in this patch, but should 
> > > > already
> > > >be considered.):
> > > >
> > > >A transaction creates a temporary table, does some (many) changes 
> > > > and then
> > > >gets rolled back. The undo records are being applied and it takes 
> > > > some
> > > >time. Since XID of the transaction did not affect 
> > > > oldestFullXidHavingUndo,
> > > >the XID can disappear from the CLOG due to truncation.
> > > >
> > >
> > > By above do you mean to say that in zheap code, we don't consider XIDs
> > > that operate on temp table/undo for oldestFullXidHavingUndo?
> >
> > I was referring to the code
> >
> > /* We can't process temporary undo logs. */
> > if (log->meta.persistence == UNDO_TEMP)
> > continue;
> >
> > in undodiscard.c:UndoDiscard().
> >
> 
> Here, I think it will just skip undo of temporary undo logs and
> oldestFullXidHavingUndo should be advanced after skipping it.

Right, it'll be advanced, but the transaction XID (if the transaction wrote
only to temporary relations) might still be needed.

> > >
> > > > However zundo.c in
> > > >[1] indicates that the transaction status *is* checked during undo
> > > >execution, so we might have a problem.
> > > >
> > >
> > > It would be easier to follow if you can tell which exact code are you
> > > referring here?
> >
> > In meant the call of TransactionIdDidCommit() in
> > zundo.c:zheap_exec_pending_rollback().
> >
> 
> IIRC, this should be called for temp tables after they have exited as
> this is only to apply the pending undo actions if any, and in case of
> temporary undo after session exit, we shouldn't need it.

I see (had to play with debugger a bit). Currently this works because the
temporary relations are dropped by AbortTransaction() ->
smgrDoPendingDeletes(), before the undo execution starts. The situation will
change as soon as the file removal will also be handled by the undo subsystem,
however I'm still not sure how to hit the TransactionIdDidCommit() call for
the XID already truncated from CLOG.

I'm starting to admit that there's no issue here: temporary undo is always
applied immediately in the foreground, and thus the zheap_exec_pending_rollback()
function never needs to examine an XID which no longer exists in the CLOG.

> I am not able to understand what exact problem you are facing for temp
> tables after the session exit. Can you please explain it a bit more?

The problem is that the temporary undo buffers are loaded into backend-local
buffers. Thus there's no guarantee that we'll find a consistent information in
the undo file even if the backend exited cleanly (local buffers are not
flushed at backend exit and there's no WAL for them). However, we need to read
the undo file to find out if (part of) it can be discarded.

I'm trying to find out whether we can ignore the temporary undo when trying to
advance oldestFullXidHavingUndo or not. If we can, then each backend can manage
its temporary undo on its own and - instead of checking which chunks can be
discarded - simply delete the undo files on exit as a whole, just like it
deletes temporary relations. Thus we wouldn't need to pay any special
attention to discarding. Also, if backends managed the temporary undo this
way, it wouldn't be necessary to track it via the shared memory
(UndoLogMetaData).

(With this approach, the undo record to delete the temporary relation must not
be

Re: how to distinguish between using the server as a standby or for executing a targeted recovery in PG 11?

2021-09-28 Thread Andrew Dunstan


On 9/24/21 12:46 PM, Bharath Rupireddy wrote:
> Hi,
>
> I'm trying to set up a postgres server with version 11 in targeted
> recovery mode (for the first time after my journey started with
> postgres) and I came across the explanation at [1] in PG 12 and newer
> versions that we have a clear differentiation as to what is the
> "standby" mode or "targeted recovery" mode. How do we differentiate
> these two modes in PG 11? Can anyone please help me with it?
>
> PS: hackers-list may not be the right place to ask, but I'm used to
> seeking help from it.
>


see 
and 


(And yes, pgsql-general would be the right forum)


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





reindexdb usage message about system catalogs

2021-09-28 Thread Magnus Hagander
Reindexdb help has this for selection of what to reindex:

  -s, --system reindex system catalogs
  -S, --schema=SCHEMA  reindex specific schema(s) only
  -t, --table=TABLEreindex specific table(s) only

Is there a reason the "only" is missing from the -s option? AFAIK that's
what it means, so the attached patch should be correct?

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/
diff --git a/doc/src/sgml/ref/reindexdb.sgml b/doc/src/sgml/ref/reindexdb.sgml
index 80a7f84886..8cb8bf4fa3 100644
--- a/doc/src/sgml/ref/reindexdb.sgml
+++ b/doc/src/sgml/ref/reindexdb.sgml
@@ -208,7 +208,7 @@ PostgreSQL documentation
   --system
   

-Reindex database's system catalogs.
+Reindex database's system catalogs only.

   
  
diff --git a/src/bin/scripts/reindexdb.c b/src/bin/scripts/reindexdb.c
index a0b0250c49..64fbb40baf 100644
--- a/src/bin/scripts/reindexdb.c
+++ b/src/bin/scripts/reindexdb.c
@@ -798,7 +798,7 @@ help(const char *progname)
 	printf(_("  -i, --index=INDEXrecreate specific index(es) only\n"));
 	printf(_("  -j, --jobs=NUM   use this many concurrent connections to reindex\n"));
 	printf(_("  -q, --quiet  don't write any messages\n"));
-	printf(_("  -s, --system reindex system catalogs\n"));
+	printf(_("  -s, --system reindex system catalogs only\n"));
 	printf(_("  -S, --schema=SCHEMA  reindex specific schema(s) only\n"));
 	printf(_("  -t, --table=TABLEreindex specific table(s) only\n"));
 	printf(_("  --tablespace=TABLESPACE  tablespace where indexes are rebuilt\n"));


Document spaces in .pgpass need to be escaped

2021-09-28 Thread James Coleman
A coworker has a space in a Postgres password and noticed .pgpass
didn't work; escaping it fixed the issue. That requirement wasn't
documented (despite other escaping requirements being documented), so
I've attached a patch to add that comment.
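For illustration, here is a sketch of backslash-aware field splitting for a
.pgpass-style line. parse_pgpass_line is a hypothetical helper, not libpq's
actual code; it just models the escaping rule the patch documents: a backslash
makes the next character (':', '\', or, per the report above, a space) literal
instead of a field separator.

```python
# Hypothetical model of .pgpass line parsing with backslash escapes.
def parse_pgpass_line(line):
    fields, cur, i = [], [], 0
    while i < len(line):
        ch = line[i]
        if ch == "\\" and i + 1 < len(line):
            cur.append(line[i + 1])  # take the escaped character literally
            i += 2
        elif ch == ":":
            fields.append("".join(cur))  # unescaped ':' ends a field
            cur = []
            i += 1
        else:
            cur.append(ch)
            i += 1
    fields.append("".join(cur))
    return fields

# hostname:port:database:username:password, with ':' and ' ' escaped:
print(parse_pgpass_line(r"localhost:5432:mydb:alice:p\:ss\ word"))
```

Under this model the final field comes out as the intended password
"p:ss word", with the escapes stripped.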

Thanks,
James Coleman


v1-0001-Document-spaces-in-.pgpass-need-to-be-escaped.patch
Description: Binary data


Re: Add prefix pg_catalog to pg_get_statisticsobjdef_columns() in describe.c (\dX)

2021-09-28 Thread Magnus Hagander
On Mon, Sep 27, 2021 at 2:14 AM Tatsuro Yamada <
tatsuro.yamada...@nttcom.co.jp> wrote:

> Hi,
>
> I found other functions that we should add "pg_catalog" prefix in
> describe.c. This fix is similar to the following commit.
>

Hi!

Yup, that's correct. Applied and backpatched to 14 (but it won't be in the
14.0 release).

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: POC: Cleaning up orphaned files using undo logs

2021-09-28 Thread Dmitry Dolgov
> On Tue, Sep 21, 2021 at 10:07:55AM +0200, Dmitry Dolgov wrote:
> On Tue, 21 Sep 2021 09:00 Antonin Houska,  wrote:
>
> > Dmitry Dolgov <9erthali...@gmail.com> wrote:
> >
> > > Yep, makes sense, thanks. I have few more questions:
> > >
> > > * The use case with orphaned files is working somewhat differently after
> > >   the rebase on the latest master, do you observe it as well? The
> > >   difference is ApplyPendingUndo -> SyncPostCheckpoint doesn't clean up
> > >   an orphaned relation file immediately (only later on checkpoint)
> > >   because of empty pendingUnlinks. I haven't investigated more yet, but
> > >   seems like after this commit:
> > >
> > > commit 7ff23c6d277d1d90478a51f0dd81414d343f3850
> > > Author: Thomas Munro 
> > > Date:   Mon Aug 2 17:32:20 2021 +1200
> > >
> > > Run checkpointer and bgwriter in crash recovery.
> > >
> > > Start up the checkpointer and bgwriter during crash recovery (except in
> > > --single mode), as we do for replication.  This wasn't done back in
> > > commit cdd46c76 out of caution.  Now it seems like a better idea to make
> > > the environment as similar as possible in both cases.  There may also be
> > > some performance advantages.
> > >
> > >   something has to be updated (pendingOps are empty right now, so no
> > >   unlink request is remembered).
> >
> > I haven't been debugging that part recently, but yes, this commit is
> > relevant,
> > thanks for pointing that out! Attached is a patch that should fix it. I'll
> > include it in the next version of the patch series, unless you tell me that
> > something is still wrong.
> >
>
> Sure, but  I can take a look only in a couple of days.

Thanks for the patch.

Hm, maybe there is some misunderstanding. My question above was about
the changed behaviour, where orphaned files (e.g. relation files left
behind after the backend was killed) are removed only by the checkpointer
when it kicks in. As far as I understand, the original intention was to do
this job right away, which is why SyncPre/PostCheckpoint was invoked. But the
recent changes around the checkpointer make the current implementation
insufficient.

The patch you've proposed removes the invocation of SyncPre/PostCheckpoint,
do I see that correctly? In that sense it doesn't change anything, except
removing non-functioning code of course. But the question, probably
reformulated from a more design-oriented point of view, stays the same — when,
and by which process, do such orphaned files have to be removed? I had assumed
that by removing them right away the previous version was trying to avoid any
kind of thundering-herd effect from removing too many files at once, but maybe
I'm mistaken here.




Re: Added schema level support for publication.

2021-09-28 Thread vignesh C
On Mon, Sep 27, 2021 at 2:46 PM Greg Nancarrow  wrote:
>
> On Mon, Sep 27, 2021 at 2:32 PM vignesh C  wrote:
> >
> > Attached v33 patch has the preprocess_pubobj_list review comment fix
> > suggested by Alvaro at [1].
>
> A minor point I noticed in the v33-0002 patch, in the code added to
> the listSchemas() function of src/bin/psql/describe.c, shouldn't it
> "return false" (not true) if PSQLexec() fails?
> Also, since the PQExpBufferData buf is re-used in the added code, it's
> handling is a little inconsistent to similar existing code.
> See below for suggested update.

Modified
This is handled in the v34 patch attached at [1].
[1] - 
https://www.postgresql.org/message-id/CALDaNm2Z9TfuoCf09YGKfwy7F1NwC4iCXJGTaZS%3DchH6VHtadQ%40mail.gmail.com

Regards,
Vignesh




Re: Added schema level support for publication.

2021-09-28 Thread vignesh C
On Mon, Sep 27, 2021 at 4:51 PM Greg Nancarrow  wrote:
>
> On Mon, Sep 27, 2021 at 2:32 PM vignesh C  wrote:
> >
> > Attached v33 patch has the preprocess_pubobj_list review comment fix
> > suggested by Alvaro at [1].
>
> In the v33-0003 patch, there's a couple of error-case tests that have
> comments copied from success-case tests:
>
> +-- should be able to add table to schema publication
> ...
> +-- should be able to drop table from schema publication
> ...
>
> These should be changed to something similar to that used for other
> error-case tests, like:
>
> +-- fail - can't add a table of the same schema to the schema publication
> +-- fail - can't drop a table from the schema publication which isn't
> in the publication

Modified

> Also, for the following error:
>
> ERROR:  cannot add ... to publication
> DETAIL:  Table's schema "" is already part of the publication
> or part of the specified schema list.
>
> there needs to be a test case to test the "... or part of the
> specified schema list" case.

Added the test
This is handled in the v34 patch attached at [1].
[1] - 
https://www.postgresql.org/message-id/CALDaNm2Z9TfuoCf09YGKfwy7F1NwC4iCXJGTaZS%3DchH6VHtadQ%40mail.gmail.com

Regards,
Vignesh




Re: Added schema level support for publication.

2021-09-28 Thread vignesh C
On Tue, Sep 28, 2021 at 4:35 PM tanghy.f...@fujitsu.com
 wrote:
>
> On Monday, September 27, 2021 1:32 PM, vignesh C  wrote:
>
> >Attached v33 patch has the preprocess_pubobj_list review comment fix
> >suggested by Alvaro at [1]. The
> >v33-0006-Alternate-grammar-for-ALL-TABLES-IN-SCHEMA.patch patch has
> >the grammar changes as suggested by Alvaro at [1]. If we agree this is
> >better, I will merge this into the 0001 patch.
> >[1] - 
> >https://www.postgresql.org/message-id/202109241325.eag5g6mpvoup%40alvherre.pgsql
>
> About the schema patch, I think a schema and a table which belongs to this 
> schema shouldn't be specified at the same time.
> But what if someone uses "ALTER TABLE ... SET SCHEMA ..." after "CREATE 
> PUBLICATION"?
>
> For example:
>
> create schema sch1;
> create schema sch2;
> create table sch2.t (a int);
> create publication pub1 for all tables in schema sch1, table sch2.t; alter 
> table sch2.t set schema sch1;
>
> postgres=# \dRp+
>                           Publication pub1
>   Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
> ----------+------------+---------+---------+---------+-----------+----------
>  postgres | f          | t       | t       | t       | t         | f
> Tables:
>     "sch1.t"
> Tables from schemas:
>     "sch1"
>
> Table t has been output twice.
> I think this should not be supported, should we do something for this 
> scenario?

Yes, this should not be supported; we should throw an error in this case.
This is handled in the v34 patch attached at [1].
[1] - 
https://www.postgresql.org/message-id/CALDaNm2Z9TfuoCf09YGKfwy7F1NwC4iCXJGTaZS%3DchH6VHtadQ%40mail.gmail.com

Regards,
Vignesh




Re: when the startup process doesn't (logging startup delays)

2021-09-28 Thread Robert Haas
On Tue, Sep 28, 2021 at 8:06 AM Nitin Jadhav
 wrote:
> I thought mentioning the unit in milliseconds and the example in
> seconds would confuse the user, so I changed the example to
> milliseconds. The message behind the above description looks good to me;
> however, I feel some sentences can be explained in fewer words. The
> information related to the units is missing and I feel it should be
> mentioned in the document. The example says, if the setting has the
> default value of 10 seconds, then it explains the behaviour. I feel it
> may not be the default value, it can be any value set by the user. So
> mentioning 'default' in the example does not look good to me. I feel
> we just have to mention "if this setting is set to the value of 10
> seconds". Below is the modified version of the above information.

It is common to mention what the default is as part of the
documentation of a GUC. I don't see why this one should be an
exception, especially since not mentioning it reduces the length of
the documentation by exactly one word.

I don't mind the sentence you added at the end to clarify the default
units, but the way you've rewritten the first sentence makes it, in my
opinion, much less clear.

> I had added additional code to check the value of the
> 'log_startup_progress_interval' variable and  disable the feature in
> case of -1 in the earlier versions of the patch (Specifically
> v9.patch). There was a review comment for v9 patch and it resulted in
> major refactoring of the patch.
...
> The answer for the question of "I don't understand why you posted the
> previous version of the patch without testing that it works" is, for
> the value of -1, the above description was my understanding and for
> the value of 0, the older versions of the patch was behaving as
> documented. But with the later versions the behaviour got changed and
> I missed updating the documentation. So I modified it in the last
> version.

v9 was posted on August 3rd. I told you that it wasn't working on
September 23rd. You posted a new version that still does not work on
September 27th. I think you should have tested each version of your
patch before posting it, and especially after any major refactorings.
And if for whatever reason you didn't, then certainly after I told you
on September 23rd that it didn't work, I think you should have fixed
it before posting a new version.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Couldn't we mark enum_in() as immutable?

2021-09-28 Thread Tom Lane
Andrew Dunstan  writes:
> On 9/27/21 5:54 PM, Tom Lane wrote:
>> Currently enum_in() is marked as stable, on the reasonable grounds
>> that it depends on system catalog contents.  However, after the
>> discussion at [1] I'm wondering why it wouldn't be perfectly safe,
>> and useful, to mark it as immutable.

> The value returned depends on the label values in pg_enum, so if someone
> decided to rename a label that would affect it, no? Same for enum_out.

Hm.  I'd thought about this to the extent of considering that if we
rename label A to B, then stored values of "A" would now print as "B",
and const-folding "A" earlier would track that which seems OK.
But you're right that then introducing a new definition of "A"
(via ADD or RENAME) would make things messy.

>> Moreover, if it's *not* good enough, then our existing practice of
>> folding enum literals to OID constants on-sight must be unsafe too.

I'm still a little troubled by this angle.  However, we've gotten away
with far worse instability for datetime literals, so maybe it's not a
problem in practice.
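(For concreteness, the hazard under discussion can be sketched like this
-- a hypothetical illustration, not a case from the thread:)

```sql
CREATE TYPE mood AS ENUM ('A', 'B');
-- If enum_in() were immutable, a constant 'A' could be folded to A's OID
-- in, say, an index predicate or a generated column expression.
ALTER TYPE mood RENAME VALUE 'A' TO 'C';
ALTER TYPE mood ADD VALUE 'A';  -- a *new* label 'A' with a different OID
-- Anything that const-folded the literal 'A' earlier now disagrees with
-- what enum_in('A') would produce today.
```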

regards, tom lane




Re: Couldn't we mark enum_in() as immutable?

2021-09-28 Thread Komяpa
PostGIS has a very similar thing: ST_Transform is marked as immutable but
does depend on contents of spatial_ref_sys table. Although it is shipped
with extension and almost never changes incompatibly, there are scenarios
where it breaks: dump/restore + index or generated column can fail the
import if data gets fed into the immutable function before the contents of
spatial_ref_sys is restored. I'd love this issue to be addressed at the
core level as benefits of having it as immutable outweigh even this
unfortunate issue.



On Tue, Sep 28, 2021 at 6:04 PM Tom Lane  wrote:

> Andrew Dunstan  writes:
> > On 9/27/21 5:54 PM, Tom Lane wrote:
> >> Currently enum_in() is marked as stable, on the reasonable grounds
> >> that it depends on system catalog contents.  However, after the
> >> discussion at [1] I'm wondering why it wouldn't be perfectly safe,
> >> and useful, to mark it as immutable.
>
> > The value returned depends on the label values in pg_enum, so if someone
> > decided to rename a label that would affect it, no? Same for enum_out.
>
> Hm.  I'd thought about this to the extent of considering that if we
> rename label A to B, then stored values of "A" would now print as "B",
> and const-folding "A" earlier would track that which seems OK.
> But you're right that then introducing a new definition of "A"
> (via ADD or RENAME) would make things messy.
>
> >> Moreover, if it's *not* good enough, then our existing practice of
> >> folding enum literals to OID constants on-sight must be unsafe too.
>
> I'm still a little troubled by this angle.  However, we've gotten away
> with far worse instability for datetime literals, so maybe it's not a
> problem in practice.
>
> regards, tom lane
>
>
>


Re: Non-decimal integer literals

2021-09-28 Thread Peter Eisentraut

On 07.09.21 13:50, Zhihong Yu wrote:

On 16.08.21 17:32, John Naylor wrote:
 > The one thing that jumped out at me on a cursory reading is
 > the {integer} rule, which seems to be used nowhere except to
 > call process_integer_literal, which must then inspect the token
text to
 > figure out what type of integer it is. Maybe consider 4 separate
 > process_*_literal functions?

Agreed, that can be done in a simpler way.  Here is an updated patch.

Hi,
Minor comment:

+SELECT int4 '0o112';

Maybe involve digits of up to 7 in the octal test case.


Good point, here is a lightly updated patch.
From 43957a1f48ed6f750f231ef8e3533d74d7ac4cc9 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 28 Sep 2021 17:14:44 +0200
Subject: [PATCH v3] Non-decimal integer literals

Add support for hexadecimal, octal, and binary integer literals:

0x42F
0o273
0b100101

per SQL:202x draft.

This adds support in the lexer as well as in the integer type input
functions.

Discussion: 
https://www.postgresql.org/message-id/flat/b239564c-cad0-b23e-c57e-166d883cb...@enterprisedb.com
---
 doc/src/sgml/syntax.sgml   | 26 ++
 src/backend/catalog/information_schema.sql |  6 +-
 src/backend/catalog/sql_features.txt   |  1 +
 src/backend/parser/scan.l  | 87 +--
 src/backend/utils/adt/int8.c   | 54 
 src/backend/utils/adt/numutils.c   | 97 ++
 src/fe_utils/psqlscan.l| 55 
 src/interfaces/ecpg/preproc/pgc.l  | 64 +-
 src/test/regress/expected/int2.out | 19 +
 src/test/regress/expected/int4.out | 37 +
 src/test/regress/expected/int8.out | 19 +
 src/test/regress/sql/int2.sql  |  7 ++
 src/test/regress/sql/int4.sql  | 11 +++
 src/test/regress/sql/int8.sql  |  7 ++
 14 files changed, 425 insertions(+), 65 deletions(-)

diff --git a/doc/src/sgml/syntax.sgml b/doc/src/sgml/syntax.sgml
index d66560b587..a4f04199c6 100644
--- a/doc/src/sgml/syntax.sgml
+++ b/doc/src/sgml/syntax.sgml
@@ -694,6 +694,32 @@ Numeric Constants
 
 
 
+
+ Additionally, non-decimal integer constants can be used in these forms:
+
+0xhexdigits
+0ooctdigits
+0bbindigits
+
+ hexdigits is one or more hexadecimal digits
+ (0-9, A-F), octdigits is one or more octal
+ digits (0-7), bindigits is one or more binary
+ digits (0 or 1).  Hexadecimal digits and the radix prefixes can be in
+ upper or lower case.  Note that only integers can have non-decimal forms,
+ not numbers with fractional parts.
+
+
+
+ These are some examples of this:
+0b100101
+0B10011001
+0o273
+0O755
+0x42f
+0X
+
+
+
 
  integer
  bigint
diff --git a/src/backend/catalog/information_schema.sql 
b/src/backend/catalog/information_schema.sql
index 11d9dd60c2..ce88c483a2 100644
--- a/src/backend/catalog/information_schema.sql
+++ b/src/backend/catalog/information_schema.sql
@@ -119,7 +119,7 @@ CREATE FUNCTION _pg_numeric_precision(typid oid, typmod 
int4) RETURNS integer
  WHEN 1700 /*numeric*/ THEN
   CASE WHEN $2 = -1
THEN null
-   ELSE (($2 - 4) >> 16) & 65535
+   ELSE (($2 - 4) >> 16) & 0xFFFF
END
  WHEN 700 /*float4*/ THEN 24 /*FLT_MANT_DIG*/
  WHEN 701 /*float8*/ THEN 53 /*DBL_MANT_DIG*/
@@ -147,7 +147,7 @@ CREATE FUNCTION _pg_numeric_scale(typid oid, typmod int4) 
RETURNS integer
WHEN $1 IN (1700) THEN
 CASE WHEN $2 = -1
  THEN null
- ELSE ($2 - 4) & 65535
+ ELSE ($2 - 4) & 0xFFFF
  END
ELSE null
   END;
@@ -163,7 +163,7 @@ CREATE FUNCTION _pg_datetime_precision(typid oid, typmod 
int4) RETURNS integer
WHEN $1 IN (1083, 1114, 1184, 1266) /* time, timestamp, same + tz */
THEN CASE WHEN $2 < 0 THEN 6 ELSE $2 END
WHEN $1 IN (1186) /* interval */
-   THEN CASE WHEN $2 < 0 OR $2 & 65535 = 65535 THEN 6 ELSE $2 & 65535 END
+   THEN CASE WHEN $2 < 0 OR $2 & 0xFFFF = 0xFFFF THEN 6 ELSE $2 & 0xFFFF END
ELSE null
   END;
 
diff --git a/src/backend/catalog/sql_features.txt 
b/src/backend/catalog/sql_features.txt
index 9f424216e2..d6359503f3 100644
--- a/src/backend/catalog/sql_features.txt
+++ b/src/backend/catalog/sql_features.txt
@@ -526,6 +526,7 @@ T652SQL-dynamic statements in SQL routines  
NO
 T653   SQL-schema statements in external routines  YES 
 T654   SQL-dynamic statements in external routines NO  
 T655   Cyclically dependent routines   YES 
+T661   Non-decimal integer literalsYES SQL:202x draft
 T811   Basic SQL/JSON constructor functionsNO  
 T812   SQL/JSON: JSON_OBJECTAGG

Re: Non-decimal integer literals

2021-09-28 Thread Peter Eisentraut



On 09.09.21 16:08, Vik Fearing wrote:

Even without that point, this patch *is* going to break valid queries,
because every one of those cases is a valid number-followed-by-identifier
today,


Ah, true that.  So if this does go in, we may as well add the
underscores at the same time.


Yeah, looks like I'll need to look into the identifier lexing issues 
previously discussed.  I'll attack that during the next commit fest.



so I kind of wonder why we're in such a hurry to adopt something
that hasn't even made it past draft-standard status.

I don't really see a hurry here.  I am fine with waiting until the draft
becomes final.


Right, the point is to explore this now so that it can be ready when the 
standard is ready.





Re: Couldn't we mark enum_in() as immutable?

2021-09-28 Thread Andrew Dunstan


On 9/28/21 11:04 AM, Tom Lane wrote:
> Andrew Dunstan  writes:
>> On 9/27/21 5:54 PM, Tom Lane wrote:
>>> Currently enum_in() is marked as stable, on the reasonable grounds
>>> that it depends on system catalog contents.  However, after the
>>> discussion at [1] I'm wondering why it wouldn't be perfectly safe,
>>> and useful, to mark it as immutable.
>> The value returned depends on the label values in pg_enum, so if someone
>> decided to rename a label that would affect it, no? Same for enum_out.
> Hm.  I'd thought about this to the extent of considering that if we
> rename label A to B, then stored values of "A" would now print as "B",
> and const-folding "A" earlier would track that which seems OK.
> But you're right that then introducing a new definition of "A"
> (via ADD or RENAME) would make things messy.
>
>>> Moreover, if it's *not* good enough, then our existing practice of
>>> folding enum literals to OID constants on-sight must be unsafe too.
> I'm still a little troubled by this angle.  However, we've gotten away
> with far worse instability for datetime literals, so maybe it's not a
> problem in practice.
>
>   


Yeah, I suspect it's not.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: [PATCH] Print error when libpq-refs-stamp fails

2021-09-28 Thread Rachel Heaton
On Tue, Sep 28, 2021 at 6:14 AM Daniel Gustafsson  wrote:

>
> Since it's not actually libpq which calls exit (no such patch would ever be
> committed), I think it would be clearer to indicate that a library linked to 
> is
> the culprit.  How about something like "libpq must not be linked against any
> library calling exit"?
>

Excellent update to the error message. Patch attached.


0002-Print-error-when-libpq-refs-stamp-fails.patch
Description: Binary data


Re: FETCH FIRST clause PERCENT option

2021-09-28 Thread Jaime Casanova
On Thu, Mar 25, 2021 at 11:15:10AM -0400, David Steele wrote:
> On 1/26/21 11:29 AM, Surafel Temesgen wrote:
> > 
> > On Mon, Jan 25, 2021 at 2:39 PM Kyotaro Horiguchi
> > mailto:horikyota@gmail.com>> wrote:
> > 
> > Sorry for the dealy. I started to look this.
> > 
> > Hi kyotaro,
> > Thanks for looking into this but did we agree to proceed
> > on this approach? I fear that it will be a waste of effort if
> > Andrew comes up with the patch for his approach.
> > Andrew Gierth: Are you working on your approach?
> 
> [Added Andrew to recipients]
> 
> Andrew, would you care to comment?
> 

Hi everyone,

This is still marked as needs review, but I think it should be marked
as "Returned with feedback" or "waiting on author".

suggestions?

-- 
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL




Re: [BUG] failed assertion in EnsurePortalSnapshotExists()

2021-09-28 Thread Tom Lane
"Drouvot, Bertrand"  writes:
> Does it make sense (as it is currently) to set the ActiveSnapshot to 
> NULL and not ensuring the same is done for ActivePortal->portalSnapshot?

I think that patch is just a kluge :-(

After tracing through this I've figured out what is happening, and
why you need to put the exception block inside a loop to make it
happen.  The first iteration goes fine, and we end up with an empty
ActiveSnapshot stack (and no portalSnapshots) because that's how
the COMMIT leaves it.  In the second iteration, we try to
re-establish the portal snapshot, but at the point where
EnsurePortalSnapshotExists is called (from the first INSERT
command) we are already inside a subtransaction thanks to the
plpgsql exception block.  This means that the stacked ActiveSnapshot
has as_level = 2, although the Portal owning it belongs to the
outer transaction.  So at the next exception, AtSubAbort_Snapshot
zaps the stacked ActiveSnapshot, but the Portal stays alive and
now it has a dangling portalSnapshot pointer.
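
For reference, the failing pattern has roughly this shape (a sketch of my
understanding, not the exact reproducer from the report):

```sql
CREATE TABLE t (a int);

CREATE PROCEDURE p() LANGUAGE plpgsql AS $$
BEGIN
  FOR i IN 1..3 LOOP
    BEGIN
      INSERT INTO t VALUES (i);          -- EnsurePortalSnapshotExists runs here,
      RAISE division_by_zero;            -- inside the exception subtransaction
    EXCEPTION WHEN division_by_zero THEN
      NULL;                              -- subxact abort zaps the as_level-2 snapshot
    END;
    COMMIT;                              -- leaves the ActiveSnapshot stack empty
  END LOOP;
END $$;

CALL p();
```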

Basically there seem to be two ways to fix this, both requiring
API additions to snapmgr.c (hence, cc'ing Alvaro for opinion):

1. Provide a variant of PushActiveSnapshot that allows the caller
to specify the as_level to use, and then have
EnsurePortalSnapshotExists, as well as other places that create
Portal-associated snapshots, use this with as_level equal to the
Portal's createSubid.  The main hazard I see here is that this could
theoretically allow the ActiveSnapshot stack to end up with
out-of-order as_level values, causing issues.  For the moment we
could probably just add assertions to check that the passed as_level
is >= next level, or maybe even that this variant is only called with
empty ActiveSnapshot stack.

2. Provide a way for AtSubAbort_Portals to detect whether a
portalSnapshot pointer points to a snap that's going to be
deleted by AtSubAbort_Snapshot, and then just have it clear any
portalSnapshots that are about to become dangling.  (This'd amount
to accepting the possibility that portalSnapshot is of a different
subxact level from the portal, and dealing with the situation.)

I initially thought #2 would be the way to go, but it turns out
to be a bit messy since what we have is a Snapshot pointer not an
ActiveSnapshotElt pointer.  We'd have to do something like search the
ActiveSnapshot stack looking for pointer equality to the caller's
Snapshot pointer, which seems fragile --- do we assume as_snap is
unique for any other purpose?

That being the case, I'm now leaning to #1.  Thoughts?

regards, tom lane




Re: Fixing WAL instability in various TAP tests

2021-09-28 Thread Tom Lane
I wrote:
> So that's the same hardware, and identical PG source tree, and different
> results.  This seems to leave only two theories standing:

I forgot theory 3: it's intermittent.  Apparently the probability has
dropped a lot since 2018, but behold:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=flaviventris&dt=2021-09-28%2014%3A20%3A41

(with successful runs just before and after this one, on the same
animal)

Note that the delta is not exactly like the previous result, either.
So there's more than one symptom, but in any case it seems like
we have an issue in WAL replay.  I wonder whether it's bloom's fault
or a core bug.

regards, tom lane




Re: [PATCH] Support pg_ident mapping for LDAP

2021-09-28 Thread Jacob Champion
On Tue, 2021-09-28 at 15:38 +0200, Magnus Hagander wrote:
> I'm a bit hesitant about the ldapuser libpq parameter. Do we really
> want to limit ourselves to just ldap, if we allow this? I mean, why
> not allow say radius or pam to also specify a different username for
> the external system? If we want to do that, now or in the future, we
> should have a much more generic parameter name, something like
> authuser?

I'd be on board with a more general option name.

So from the perspective of a SASL exchange, PGUSER would be the
authorization identity, and PGAUTHUSER, say, would be the
authentication identity. Is "auth" a clear enough prefix that users and
devs will understand what the difference is between the two?

 | authn authz
-+---
  envvar | PGAUTHUSERPGUSER
conninfo | authuser  user
frontend | conn->pgauthuser  conn->pguser backend | port->auth_user   
port->user_name

> Why do we actually need ldap_map_dn? Shouldn't this just be what
> happens if you specify map= on an ldap connection?

For simple-bind setups, I think requiring users to match an entire DN
is probably unnecessary (and/or dangerous once you start getting into
regex mapping), so the map uses the bare username by default. My intent
was for that to have the same default behavior as cert maps.

Thanks,
--Jacob


Re: Fixing WAL instability in various TAP tests

2021-09-28 Thread Mark Dilger



> On Sep 28, 2021, at 10:27 AM, Tom Lane  wrote:
> 
> I wonder whether it's bloom's fault
> or a core bug.

Looking closer at the TAP test, it's not ORDERing the result set from the 
SELECTs on either node, but it is comparing the sets for stringwise equality, 
which is certainly order dependent.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: [PATCH] Support pg_ident mapping for LDAP

2021-09-28 Thread Jacob Champion
On Tue, 2021-09-28 at 18:02 +, Jacob Champion wrote:
> On Tue, 2021-09-28 at 15:38 +0200, Magnus Hagander wrote:
> > I'm a bit hesitant about the ldapuser libpq parameter. Do we really
> > want to limit ourselves to just ldap, if we allow this? I mean, why
> > not allow say radius or pam to also specify a different username for
> > the external system? If we want to do that, now or in the future, we
> > should have a much more generic parameter name, something like
> > authuser?
> 
> I'd be on board with a more general option name.
> 
> So from the perspective of a SASL exchange, PGUSER would be the
> authorization identity, and PGAUTHUSER, say, would be the
> authentication identity. Is "auth" a clear enough prefix that users and
> devs will understand what the difference is between the two?
> 
>  | authn authz
> -+---
>   envvar | PGAUTHUSERPGUSER
> conninfo | authuser  user
> frontend | conn->pgauthuser  conn->pguser backend | port->auth_user   
> port->user_name
> 
> > Why do we actually need ldap_map_dn? Shouldn't this just be what
> > happens if you specify map= on an ldap connection?
> 
> For simple-bind setups, I think requiring users to match an entire DN
> is probably unnecessary (and/or dangerous once you start getting into
> regex mapping), so the map uses the bare username by default. My intent
> was for that to have the same default behavior as cert maps.
> 
> Thanks,
> --Jacob



Re: Fixing WAL instability in various TAP tests

2021-09-28 Thread Tom Lane
I wrote:
> So there's more than one symptom, but in any case it seems like
> we have an issue in WAL replay.  I wonder whether it's bloom's fault
> or a core bug.

Actually ... I bet it's just the test script's fault.  It waits for the
standby to catch up like this:

my $caughtup_query =
  "SELECT pg_current_wal_lsn() <= write_lsn FROM pg_stat_replication 
WHERE application_name = '$applname';";
$node_primary->poll_query_until('postgres', $caughtup_query)
  or die "Timed out while waiting for standby 1 to catch up";

which seems like completely the wrong condition.  Don't we need the
standby to have *replayed* the WAL, not merely written it to disk?

I'm also wondering why this doesn't use wait_for_catchup, instead
of reinventing the query to use.

regards, tom lane




Re: [PATCH] Support pg_ident mapping for LDAP

2021-09-28 Thread Jacob Champion
On Tue, 2021-09-28 at 18:08 +, Jacob Champion wrote:
> >  | authn authz
> > -+---
> >   envvar | PGAUTHUSERPGUSER
> > conninfo | authuser  user
> > frontend | conn->pgauthuser  conn->pguser backend | port->auth_user   
> > port->user_name

Ugh, PEBKAC problems today... apologies. This should have been

 | authn authz
-+---
  envvar | PGAUTHUSERPGUSER
conninfo | authuser  user
frontend | conn->pgauthuser  conn->pguser
 backend | port->auth_user   port->user_name

--Jacob


Re: Fixing WAL instability in various TAP tests

2021-09-28 Thread Tom Lane
Mark Dilger  writes:
> Looking closer at the TAP test, it's not ORDERing the result set from the 
> SELECTs on either node, but it is comparing the sets for stringwise equality, 
> which is certainly order dependent.

Well, it's forcing a bitmap scan, so what we're getting is the native
ordering of a bitmapscan result.  That should match, given that what
we're doing is physical replication.  I think adding ORDER BY would
be more likely to obscure real issues than hide test instability.

regards, tom lane




Re: Fixing WAL instability in various TAP tests

2021-09-28 Thread Mark Dilger



> On Sep 28, 2021, at 11:07 AM, Mark Dilger  
> wrote:
> 
> Looking closer at the TAP test, it's not ORDERing the result set from the 
> SELECTs on either node, but it is comparing the sets for stringwise equality, 
> which is certainly order dependent.

Taking the output from the buildfarm page, parsing out the first test's results 
and comparing got vs. expected for this test:

is($primary_result, $standby_result, "$test_name: query result matches");

the primary result had all the same rows as the standby, along with additional 
rows.  Comparing the results, they match other than rows missing from the 
standby that are present on the primary.  That seems consistent with the view 
that the query on the standby is running before all the data has replicated 
across.

However, the missing rows all have column i either 0 or 3, though the test 
round-robins i=0..9 as it performs the inserts.  I would expect the wal for the 
inserts to not cluster around any particular value of i.  The DELETE and VACUUM 
commands do operate on a single value of i, so that would make sense if the 
data failed to be deleted on the standby after successfully being deleted on 
the primary, but then I'd expect the standby to have more rows, not fewer.

Perhaps having the bloom index messed up answers that, though.  I think it 
should be easy enough to get the path to the heap main table fork and the bloom 
main index fork for both the primary and standby and do a filesystem comparison 
as part of the wal test.  That would tell us if they differ, and also if the 
differences are limited to just one or the other.

I'll go write that up.


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Fixing WAL instability in various TAP tests

2021-09-28 Thread Tom Lane
Mark Dilger  writes:
> Perhaps having the bloom index messed up answers that, though.  I think it 
> should be easy enough to get the path to the heap main table fork and the 
> bloom main index fork for both the primary and standby and do a filesystem 
> comparison as part of the wal test.  That would tell us if they differ, and 
> also if the differences are limited to just one or the other.

I think that's probably overkill, and definitely out-of-scope for
contrib/bloom.  If we fear that WAL replay is not reproducing the data
accurately, we should be testing for that in some more centralized place.

Anyway, I confirmed my diagnosis by adding a delay in WAL apply
(0001 below); that makes this test fall over spectacularly.
And 0002 fixes it.  So I propose to push 0002 as soon as the
v14 release freeze ends.

Should we back-patch 0002?  I'm inclined to think so.  Should
we then also back-patch enablement of the bloom test?  Less
sure about that, but I'd lean to doing so.  A test that appears
to be there but isn't actually invoked is pretty misleading.

regards, tom lane

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index e51a7a749d..eecbe57aee 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7370,6 +7370,9 @@ StartupXLOG(void)
 			{
 bool		switchedTLI = false;
 
+if (random() < INT_MAX/100)
+	pg_usleep(10);
+
 #ifdef WAL_DEBUG
 if (XLOG_DEBUG ||
 	(rmid == RM_XACT_ID && trace_recovery_messages <= DEBUG2) ||
diff --git a/contrib/bloom/t/001_wal.pl b/contrib/bloom/t/001_wal.pl
index 55ad35926f..be8916a8eb 100644
--- a/contrib/bloom/t/001_wal.pl
+++ b/contrib/bloom/t/001_wal.pl
@@ -16,12 +16,10 @@ sub test_index_replay
 {
 	my ($test_name) = @_;
 
+	local $Test::Builder::Level = $Test::Builder::Level + 1;
+
 	# Wait for standby to catch up
-	my $applname = $node_standby->name;
-	my $caughtup_query =
-	  "SELECT pg_current_wal_lsn() <= write_lsn FROM pg_stat_replication WHERE application_name = '$applname';";
-	$node_primary->poll_query_until('postgres', $caughtup_query)
-	  or die "Timed out while waiting for standby 1 to catch up";
+	$node_primary->wait_for_catchup($node_standby);
 
 	my $queries = qq(SET enable_seqscan=off;
 SET enable_bitmapscan=on;


Re: POC: Cleaning up orphaned files using undo logs

2021-09-28 Thread Antonin Houska
Dmitry Dolgov <9erthali...@gmail.com> wrote:

> > On Tue, Sep 21, 2021 at 10:07:55AM +0200, Dmitry Dolgov wrote:
> > On Tue, 21 Sep 2021 09:00 Antonin Houska,  wrote:
> >
> > > Dmitry Dolgov <9erthali...@gmail.com> wrote:
> > >
> > > > Yep, makes sense, thanks. I have few more questions:
> > > >
> > > > * The use case with orphaned files is working somewhat differently after
> > > >   the rebase on the latest master, do you observe it as well? The
> > > >   difference is ApplyPendingUndo -> SyncPostCheckpoint doesn't clean up
> > > >   an orphaned relation file immediately (only later on checkpoint)
> > > >   because of empty pendingUnlinks. I haven't investigated more yet, but
> > > >   seems like after this commit:
> > > >
> > > > commit 7ff23c6d277d1d90478a51f0dd81414d343f3850
> > > > Author: Thomas Munro 
> > > > Date:   Mon Aug 2 17:32:20 2021 +1200
> > > >
> > > > Run checkpointer and bgwriter in crash recovery.
> > > >
> > > > Start up the checkpointer and bgwriter during crash recovery
> > > (except in
> > > > --single mode), as we do for replication.  This wasn't done back
> > > in
> > > > commit cdd46c76 out of caution.  Now it seems like a better idea
> > > to make
> > > > the environment as similar as possible in both cases.  There may
> > > also be
> > > > some performance advantages.
> > > >
> > > >   something has to be updated (pendingOps are empty right now, so no
> > > >   unlink request is remembered).
> > >
> > > I haven't been debugging that part recently, but yes, this commit is
> > > relevant,
> > > thanks for pointing that out! Attached is a patch that should fix it. I'll
> > > include it in the next version of the patch series, unless you tell me 
> > > that
> > > something is still wrong.
> > >
> >
> > Sure, but  I can take a look only in a couple of days.
> 
> Thanks for the patch.
> 
> Hm, maybe there is some misunderstanding. My question above was about
> the changed behaviour, when orphaned files (e.g. created relation files
> after the backend was killed) are removed only by checkpointer when it
> kicks in. As far as I understand, the original intention was to do this
> job right away, that's why SyncPre/PostCheckpoint was invoked. But the
> recent changes around checkpointer make the current implementation
> insufficient.

> The patch you've proposed removes invokation of SyncPre/PostCheckpoint,
> do I see it correctly? In this sense it doesn't change anything, except
> removing non-functioning code of course.

Yes, it sounds like a misunderstanding. I thought you were complaining about
code that is no longer needed.

The original intention was to make sure that the files are eventually
unlinked. IIRC, before commit 7ff23c6d27 the SyncPre/PostCheckpoint calls
were necessary because the checkpointer wasn't running that early during
startup. Without these calls the startup process would exit without doing
anything. Sorry, I see now that the comment incorrectly says "... it seems
simpler ...", but in fact it was necessary.

> But the question, probably
> reformulated from the more design point of view, stays the same — when
> and by which process such orphaned files have to be removed? I've
> assumed by removing right away the previous version was trying to avoid
> any kind of thunder effects of removing too many at once, but maybe I'm
> mistaken here.

I'm just trying to use the existing infrastructure: the effects of DROP TABLE
also appear to be performed by the checkpointer. However, I don't know why the
unlinks need to be performed by the checkpointer.

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com




Re: statement_timeout vs DECLARE CURSOR

2021-09-28 Thread Tom Lane
[ redirect to -hackers ]

I wrote:
>> Christophe Pettus  writes:
>>> A bit more poking revealed the reason: The ON HOLD cursor's query is 
>>> executed at commit time (which is, logically, not interruptible), but 
>>> that's all wrapped in the single statement outside of a transaction.

>> Hmm ... seems like a bit of a UX failure.  I wonder why we don't persist
>> such cursors before we get into the uninterruptible part of COMMIT.

> Oh, I see the issue.  It's not that that part of COMMIT isn't
> interruptible; you can control-C out of it just fine.  The problem
> is that finish_xact_command() disarms the statement timeout before
> starting CommitTransactionCommand at all.

> We could imagine pushing the responsibility for that down into
> xact.c, allowing it to happen after CommitTransaction has finished
> running user-defined code.  But it seems like a bit of a mess
> because there are so many other code paths there.  Not sure how
> to avoid future bugs-of-omission.

Actually ... maybe it needn't be any harder than the attached?

This makes it possible for a statement timeout interrupt to occur
anytime during CommitTransactionCommand, but I think
CommitTransactionCommand had better be able to protect itself
against that anyway, for a couple of reasons:

1. It's not significantly different from a query-cancel interrupt,
which surely could arrive during that window.

2. COMMIT-within-procedures already exposes us to statement timeout
during COMMIT.
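
For reference, the symptom Christophe described can be sketched as (a
hypothetical illustration of the shape of the problem):

```sql
-- A held cursor's query is executed (materialized) at COMMIT time, after
-- finish_xact_command() has already disarmed the statement timeout, so
-- before the patch this COMMIT was not subject to the timeout.
SET statement_timeout = '1s';
BEGIN;
DECLARE c CURSOR WITH HOLD FOR SELECT pg_sleep(60);
COMMIT;
```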

regards, tom lane

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 0775abe35d..9ec96f2453 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -2713,9 +2713,6 @@ start_xact_command(void)
 static void
 finish_xact_command(void)
 {
-	/* cancel active statement timeout after each command */
-	disable_statement_timeout();
-
 	if (xact_started)
 	{
 		CommitTransactionCommand();
@@ -2733,6 +2730,9 @@ finish_xact_command(void)
 
 		xact_started = false;
 	}
+
+	/* cancel active statement timeout after each command */
+	disable_statement_timeout();
 }
 
 


Re: pgcrypto support for bcrypt $2b$ hashes

2021-09-28 Thread Daniel Fone

> On 29/09/2021, at 2:33 AM, Daniel Gustafsson  wrote:
> 
>> On 28 Sep 2021, at 05:15, Daniel Fone  wrote:
>> 
>>> On 26/09/2021, at 12:09 AM, Daniel Gustafsson  wrote:
>>> 
>>> Upgrading our crypt_blowfish.c to the upstream 1.3 version would be the 
>>> correct
>>> fix IMO, but since we have a few local modifications it's not a drop-in.  I
>>> don't think it would be too hairy, but one needs to be very careful when
>>> dealing with crypto.
>> 
>> My C experience is limited, but I can make an initial attempt if the effort 
>> would be worthwhile. Is this realistically a patch that a newcomer to the 
>> codebase should attempt?
> 
I don't see why not; the best first patches are those scratching an itch.  If
> you feel up for it then give it a go, I - and the rest of pgsql-hackers - can
> help if you need to bounce ideas.

I’m glad you said that. I couldn’t resist trying and have attached a patch. By 
referencing the respective git logs, I didn’t have too much difficulty 
identifying the material changes in each codebase. I’ve documented all the 
postgres-specific changes to upstream in the header comment for each file.

Daniel



0001-Merge-upstream-crypt_blowfish-1.3.patch
Description: Binary data


Some thoughts about the TAP tests' wait_for_catchup()

2021-09-28 Thread Tom Lane
I noticed that some test scripts, instead of using wait_for_catchup
to wait for replication catchup, use ad-hoc code like

my $primary_lsn =
  $primary->safe_psql('postgres', 'select pg_current_wal_lsn()');
$standby->poll_query_until('postgres',
qq{SELECT '$primary_lsn'::pg_lsn <= pg_last_wal_replay_lsn()})
  or die "standby never caught up";

This does not look much like what wait_for_catchup does, which
typically ends up issuing queries like

SELECT pg_current_wal_lsn() <= replay_lsn AND state = 'streaming'
FROM pg_catalog.pg_stat_replication WHERE application_name = 'standby';

It seems to me that for most purposes wait_for_catchup's approach is
strictly worse, for two reasons:

1. It continually recomputes the primary's pg_current_wal_lsn().
Thus, if anything is happening on the primary (e.g. autovacuum),
we're moving the goalposts as to how much the standby is required
to replay before we continue.  That slows down the tests, makes
them less reproducible, and could help to mask actual bugs of the
form this-hasn't-been-done-when-it-should-have.

2. It's querying the primary's view of the standby's state, which
introduces a reporting delay.  This has exactly the same three
problems as the other point.

So I think we ought to change wait_for_catchup to do things more
like the first way, where feasible.  This seems pretty easy for
the call sites that provide the standby's PostgresNode; but it's
not so easy for the ones that just provide a subscription name.

So my first question is: to what extent are these tests specifically
intended to exercise the replay reporting mechanisms?  Would we lose
important coverage if we changed *all* the call sites to use the
query-the-standby approach?  If so, which ones might be okay to
change?

The second question is what do we want the API to look like?  This
is simple if we can change all wait_for_catchup callers to pass
the standby node.  Otherwise, the choices are to split wait_for_catchup
into two implementations, or to keep its current API and have it
do significantly different things depending on isa("PostgresNode").

Thoughts?

regards, tom lane




Re: Incorrect fd handling in syslogger.c for Win64 under EXEC_BACKEND

2021-09-28 Thread Michael Paquier
On Tue, Sep 28, 2021 at 02:36:52PM +0900, Michael Paquier wrote:
> I wrote that a bit too quickly.  After looking at it, what we could
> use to parse the handle pointer is scanint8() instead, even if that's
> a bit ugly.  I also found the code a bit confused regarding "fd", that
> could be manipulated as an int or intptr_t, so something like the
> attached should improve the situation.

As Jacob reminded me, the code is currently correct, as handles are
4 bytes on both Win32 and Win64:
https://docs.microsoft.com/en-us/windows/win32/winauto/32-bit-and-64-bit-interoperability

Sorry for the noise.  It looks like I got confused by intptr_t :p
--
Michael


signature.asc
Description: PGP signature


Re: POC: Cleaning up orphaned files using undo logs

2021-09-28 Thread Thomas Munro
On Wed, Sep 29, 2021 at 8:18 AM Antonin Houska  wrote:
> I'm just trying to use the existing infrastructure: the effect of DROP TABLE
> also appears to be performed by the checkpointer. However, I don't know why the
> unlinks need to be performed by the checkpointer.

For DROP TABLE, we leave an empty file (I've been calling it a
"tombstone file") so that GetNewRelFileNode() won't let you reuse the
same relfilenode in the same checkpoint cycle.  One reason is that
wal_level=minimal has a data-eating crash recovery failure mode if you
reuse a relfilenode in a checkpoint cycle.




Re: Add prefix pg_catalog to pg_get_statisticsobjdef_columns() in describe.c (\dX)

2021-09-28 Thread Tatsuro Yamada

Hi Magnus!



I found other functions in describe.c to which we should add the "pg_catalog"
prefix. This fix is similar to the following commit.

Yup, that's correct. Applied and backpatched to 14 (but it won't be in the 14.0 
release).



Thank you!


Regards,
Tatsuro Yamada






Re: Add jsonlog log_destination for JSON server logs

2021-09-28 Thread Michael Paquier
On Tue, Sep 28, 2021 at 12:30:10PM +0900, Michael Paquier wrote:
> 0001 is in a rather commitable shape, and I have made the code
> consistent with HEAD.  However, I think that its handling of
> _get_osfhandle() is clunky for 64-bit compilations as long is 32b in
> WIN32 but intptr_t is platform-dependent as it could be 32b or 64b, so
> atoi() would overflow if the handle is larger than INT_MAX for 64b
> builds:
> https://docs.microsoft.com/en-us/cpp/c-runtime-library/standard-types
> This problem deserves a different thread.

This happens to not be a problem as only 32 bits are significant for
handles for both Win32 and Win64.  This also means that we should be
able to remove the use for "long" in this code, making the routines
more symmetric.  I have done more tests with Win32 and Win64, and
applied it.  I don't have MinGW environments at hand, but looking at
the upstream code that should not matter.  The buildfarm will let
us know soon enough if there is a problem thanks to the TAP tests of
pg_ctl.
--
Michael




Re: Make relfile tombstone files conditional on WAL level

2021-09-28 Thread Thomas Munro
On Fri, Mar 5, 2021 at 11:02 AM Thomas Munro  wrote:
> This is a WIP with an open question to research: what could actually
> break if we did this?

I thought this part of bgwriter.c might be a candidate:

	if (FirstCallSinceLastCheckpoint())
	{
		/*
		 * After any checkpoint, close all smgr files.  This is so we
		 * won't hang onto smgr references to deleted files indefinitely.
		 */
		smgrcloseall();
	}

Hmm, on closer inspection, isn't the lack of real interlocking with
checkpoints a bit suspect already?  What stops bgwriter from writing
to the previous relfilenode generation's fd, if a relfilenode is
recycled while BgBufferSync() is running?  Not sinval, and not the
above code that only runs between BgBufferSync() invocations.




Re: reindexdb usage message about system catalogs

2021-09-28 Thread Michael Paquier
On Tue, Sep 28, 2021 at 04:15:22PM +0200, Magnus Hagander wrote:
> Is there a reason the "only" is missing from the -s option? AFAIK that's
> what it means, so the attached patch should be correct?

I cannot think of a reason.  This seems historically inherited from
pg_dump, and the option got added when the tool was moved from
contrib/ to src/bin/ as of 85e9a5a.
--
Michael




Re: Fixing WAL instability in various TAP tests

2021-09-28 Thread Michael Paquier
On Tue, Sep 28, 2021 at 03:00:13PM -0400, Tom Lane wrote:
> Should we back-patch 0002?  I'm inclined to think so.  Should
> we then also back-patch enablement of the bloom test?  Less
> sure about that, but I'd lean to doing so.  A test that appears
> to be there but isn't actually invoked is pretty misleading.

A backpatch sounds appropriate to me for both patches.  The only risk
I can see here is somebody implementing a new test by copy-pasting
this one if we were to keep things as they are on the stable branches.
--
Michael




RE: Added schema level support for publication.

2021-09-28 Thread tanghy.f...@fujitsu.com
On Tuesday, September 28, 2021 10:49 PM, vignesh C
 wrote:
> 
> Yes this should not be supported, we should throw an error in this case.
> This is handled in the v34 patch attached at [1].
> [1] - https://www.postgresql.org/message-
> id/CALDaNm2Z9TfuoCf09YGKfwy7F1NwC4iCXJGTaZS%3DchH6VHtadQ%40mail.g
> mail.com
> 

Thanks for fixing it. I confirmed the error can be output as expected.

Here is a problem related to the publish_via_partition_root option when using
this patch. With this option on, I think the pg_get_publication_tables function
gives an unexpected result and the subscriber gets duplicate data during table
sync.


For example:
(I used pg_publication_tables view to make it looks clearer)

create schema sch1;
create table sch1.tbl1 (a int) partition by range ( a );
create table sch1.tbl1_part1 partition of sch1.tbl1 for values from (1) to (10);
create table sch1.tbl1_part2 partition of sch1.tbl1 for values from (10) to 
(20);
create table sch1.tbl1_part3 partition of sch1.tbl1 for values from (20) to 
(30);
create publication pub for all tables in schema sch1 
with(publish_via_partition_root=1);

postgres=# select * from pg_publication_tables where pubname='pub';
 pubname | schemaname | tablename
---------+------------+------------
 pub     | sch1       | tbl1_part1
 pub     | sch1       | tbl1_part2
 pub     | sch1       | tbl1_part3
 pub     | sch1       | tbl1
(4 rows)


It shows both the partitioned table and its leaf partitions. But the result of
FOR ALL TABLES publication couldn't show the leaf partitions.


postgres=# create publication pub_all for all tables 
with(publish_via_partition_root=1);
CREATE PUBLICATION
postgres=# select * from pg_publication_tables where pubname='pub_all';
 pubname | schemaname | tablename
---------+------------+-----------
 pub_all | sch1       | tbl1
(1 row)


How about making the following change to avoid it? I tried it, and it also
fixed the duplicate data issue during table sync.


diff --git a/src/backend/catalog/pg_publication.c b/src/backend/catalog/pg_publication.c
index 04e785b192..4e8ccdabc6 100644
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -632,7 +632,8 @@ GetSchemaPublicationRelations(Oid schemaid, PublicationPartOpt pub_partopt)
 		Form_pg_class relForm = (Form_pg_class) GETSTRUCT(tuple);
 		Oid			relid = relForm->oid;
 
-		if (is_publishable_class(relid, relForm))
+		if (is_publishable_class(relid, relForm) &&
+			!(relForm->relispartition && pub_partopt == PUBLICATION_PART_ROOT))
 			result = lappend_oid(result, relid);
 	}


Regards
Tang


Re: Make relfile tombstone files conditional on WAL level

2021-09-28 Thread Thomas Munro
On Wed, Aug 4, 2021 at 3:22 AM Robert Haas  wrote:
> It's not really clear to me what problem is at hand. The problems that
> the tombstone system created for the async I/O stuff weren't really
> explained properly, IMHO. And I don't think the current system is all
that ugly. It's not the most beautiful thing in the world, but we have
> lots of way worse hacks. And, it's easy to understand, requires very
> little code, and has few moving parts that can fail. As hacks go it's
> a quality hack, I would say.

It's not really an AIO problem.  It's just that while testing the AIO
stuff across a lot of operating systems, we had tests failing on
Windows because the extra worker processes you get if you use
io_method=worker were holding cached descriptors and causing stuff
like DROP TABLESPACE to fail.  AFAIK every problem we discovered in
that vein is a current live bug in all versions of PostgreSQL for
Windows (it just takes other backends or the bgwriter to hold an fd at
the wrong moment).  The solution I'm proposing to that general class
of problem is https://commitfest.postgresql.org/34/2962/ .

In the course of thinking about that, it seemed natural to look into
the possibility of getting rid of the tombstones, so that at least
Unix systems don't find themselves having to suffer through a
CHECKPOINT just to drop a tablespace that happens to contain a
tombstone.




RE: Added schema level support for publication.

2021-09-28 Thread houzj.f...@fujitsu.com
On Tues, Sep 28, 2021 10:46 PM vignesh C  wrote:
> Attached v34 patch has the changes for the same.

Thanks for updating the patch.
Here are a few comments.

1)
+ * ALL TABLES IN SCHEMA schema [[, ...]

[[ -> [

2)
+   /* ALTER PUBLICATION ... ADD/DROP TABLE/ALL TABLES IN SCHEMA parameters 
*/

The two '/'s seem a bit unclear, and the comment doesn't mention the SET case.
Maybe we can write it like:

/* parameters used for ALTER PUBLICATION ... ADD/DROP/SET publication objects */

3)
+   /*
+* Check if setting the relation to a different schema will result in 
the
+* publication having schema and same schema's table in the publication.
+*/
+   if (stmt->objectType == OBJECT_TABLE)
+   {
+   ListCell   *lc;
+   List   *schemaPubids = GetSchemaPublications(nspOid);
+   foreach(lc, schemaPubids)
+   {
+   Oid pubid = lfirst_oid(lc);
+   if (list_member_oid(GetPublicationRelations(pubid, 
PUBLICATION_PART_ALL),
+   relid))
+   ereport(ERROR,

How about we check this case like the following ?

List   *schemaPubids = GetSchemaPublications(nspOid);
List   *relPubids = GetRelationPublications(RelationGetRelid(rel));
if (list_intersection(schemaPubids, relPubids))
ereport(ERROR, ...

Best regards,
Hou zj


Re: how to distinguish between using the server as a standby or for executing a targeted recovery in PG 11?

2021-09-28 Thread Bharath Rupireddy
On Tue, Sep 28, 2021 at 7:39 PM Andrew Dunstan  wrote:
>
>
> On 9/24/21 12:46 PM, Bharath Rupireddy wrote:
> > Hi,
> >
> > I'm trying to set up a postgres server with version 11 in targeted
> > recovery mode (for the first time after my journey started with
> > postgres) and I came across the explanation at [1] in PG 12 and newer
> > versions that we have a clear differentiation as to what is the
> > "standby" mode or "targeted recovery" mode. How do we differentiate
> > these two modes in PG 11? Can anyone please help me with it?
> >
> > PS: hackers-list may not be the right place to ask, but I'm used to
> > seeking help from it.
> >
>
>
> see 
> and 

Thanks! It looks like 'standby_mode = off' in recovery.conf together with
a 'recovery_target' puts the server in "targeted recovery" mode.

Regards,
Bharath Rupireddy.




Re: Added schema level support for publication.

2021-09-28 Thread Amit Kapila
On Wed, Sep 29, 2021 at 9:07 AM houzj.f...@fujitsu.com
 wrote:
>
> On Tues, Sep 28, 2021 10:46 PM vignesh C  wrote:
> > Attached v34 patch has the changes for the same.
>
> 3)
> +   /*
> +* Check if setting the relation to a different schema will result in 
> the
> +* publication having schema and same schema's table in the 
> publication.
> +*/
> +   if (stmt->objectType == OBJECT_TABLE)
> +   {
> +   ListCell   *lc;
> +   List   *schemaPubids = GetSchemaPublications(nspOid);
> +   foreach(lc, schemaPubids)
> +   {
> +   Oid pubid = lfirst_oid(lc);
> +   if (list_member_oid(GetPublicationRelations(pubid, 
> PUBLICATION_PART_ALL),
> +   relid))
> +   ereport(ERROR,
>
> How about we check this case like the following ?
>
> List   *schemaPubids = GetSchemaPublications(nspOid);
> List   *relPubids = GetRelationPublications(RelationGetRelid(rel));
> if (list_intersection(schemaPubids, relPubids))
> ereport(ERROR, ...
>

Won't this allow changing one of the partitions for which only the
partitioned table is part of the target schema? And if we followed the
above, we probably wouldn't be able to name the exact publication in
the error message.

-- 
With Regards,
Amit Kapila.




Re: (LOCK TABLE options) “ONLY” and “NOWAIT” are not yet implemented

2021-09-28 Thread bt21tanigaway

On 2021-09-28 22:55, shinya11.k...@nttdata.com wrote:

-Original Message-
From: bt21tanigaway 
Sent: Tuesday, September 28, 2021 5:06 PM
To: Fujii Masao ;
pgsql-hackers@lists.postgresql.org
Subject: Re: (LOCK TABLE options) “ONLY” and “NOWAIT” are not yet
implemented

On 2021-09-28 17:03, bt21tanigaway wrote:

On 2021-09-28 16:36, Fujii Masao wrote:

On 2021/09/28 16:13, bt21tanigaway wrote:

Hi,

(LOCK TABLE options) “ONLY” and “NOWAIT” are not yet implemented in
tab-complete. I made a patch for these options.


Thanks for the patch!
The patch seems to forget to handle the tab-completion for "LOCK ONLY
<table> IN".


Thanks for your comment!
I attach a new patch fixed to this mail.

Regards,

Koyu Tanigawa


Sorry, I forgot to attach patch file.
"fix-tab-complete2.patch" is fixed!

Regards,

Koyu Tanigawa

Thank you for your patch.
I have two comments.

1. When I executed git apply, an error occurred.
---
$ git apply ~/Downloads/fix-tab-complete2.patch
/home/penguin/Downloads/fix-tab-complete2.patch:14: indent with spaces.
COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, " UNION SELECT 'TABLE'" " UNION SELECT 'ONLY'");
warning: 1 line adds whitespace errors.
---


Thank you for your feedback.
I might have added whitespace when I was checking the patch file.
I attach a new patch to this mail.


2. The command "LOCK TABLE a, b;" can be executed, but tab-completion
doesn't work properly. Is it OK?

It's OK for now.
But it should be able to handle the case of multiple tables in the future.

Regards,

Koyu Tanigawa

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 5cd5838668..eada1f453c 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3599,39 +3599,52 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH("(");
 
 /* LOCK */
-	/* Complete LOCK [TABLE] with a list of tables */
+	/* Complete LOCK [TABLE] [ONLY] with a list of tables */
 	else if (Matches("LOCK"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables,
-   " UNION SELECT 'TABLE'");
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, " UNION SELECT 'TABLE'" " UNION SELECT 'ONLY'");
+
 	else if (Matches("LOCK", "TABLE"))
-		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, "");
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, " UNION SELECT 'ONLY'");
 
+	else if (Matches("LOCK", "TABLE", "ONLY") || Matches("LOCK", "ONLY"))
+		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, "");
 	/* For the following, handle the case of a single table only for now */
-
-	/* Complete LOCK [TABLE]  with "IN" */
-	else if (Matches("LOCK", MatchAnyExcept("TABLE")) ||
-			 Matches("LOCK", "TABLE", MatchAny))
-		COMPLETE_WITH("IN");
-
-	/* Complete LOCK [TABLE]  IN with a lock mode */
+	
+	/* Complete LOCK [TABLE] [ONLY]  with "IN" or "NOWAIT" */
+	else if (Matches("LOCK", MatchAnyExcept("TABLE|ONLY")) ||
+			 Matches("LOCK", "TABLE", MatchAnyExcept("ONLY")) ||
+			 Matches("LOCK", "ONLY", MatchAny) ||
+			 Matches("LOCK", "TABLE", "ONLY", MatchAny))
+		COMPLETE_WITH("IN", "NOWAIT");
+
+	/* Complete LOCK [TABLE] [ONLY]  IN with a lock mode */
 	else if (Matches("LOCK", MatchAny, "IN") ||
-			 Matches("LOCK", "TABLE", MatchAny, "IN"))
+			 Matches("LOCK", "TABLE", MatchAny, "IN") ||
+			 Matches("LOCK", "ONLY", MatchAny, "IN") ||
+			 Matches("LOCK", "TABLE", "ONLY", MatchAny, "IN"))
 		COMPLETE_WITH("ACCESS SHARE MODE",
 	  "ROW SHARE MODE", "ROW EXCLUSIVE MODE",
 	  "SHARE UPDATE EXCLUSIVE MODE", "SHARE MODE",
 	  "SHARE ROW EXCLUSIVE MODE",
 	  "EXCLUSIVE MODE", "ACCESS EXCLUSIVE MODE");
 
-	/* Complete LOCK [TABLE]  IN ACCESS|ROW with rest of lock mode */
+	/* Complete LOCK [TABLE][ONLY]  IN ACCESS|ROW with rest of lock mode */
 	else if (Matches("LOCK", MatchAny, "IN", "ACCESS|ROW") ||
-			 Matches("LOCK", "TABLE", MatchAny, "IN", "ACCESS|ROW"))
+			 Matches("LOCK", "TABLE", MatchAny, "IN", "ACCESS|ROW") ||
+			 Matches("LOCK", "ONLY", MatchAny, "IN", "ACCESS|ROW") ||
+			 Matches("LOCK", "TABLE", "ONLY", MatchAny, "IN", "ACCESS|ROW"))
 		COMPLETE_WITH("EXCLUSIVE MODE", "SHARE MODE");
 
-	/* Complete LOCK [TABLE]  IN SHARE with rest of lock mode */
+	/* Complete LOCK [TABLE] [ONLY]  IN SHARE with rest of lock mode */
 	else if (Matches("LOCK", MatchAny, "IN", "SHARE") ||
-			 Matches("LOCK", "TABLE", MatchAny, "IN", "SHARE"))
-		COMPLETE_WITH("MODE", "ROW EXCLUSIVE MODE",
-	  "UPDATE EXCLUSIVE MODE");
+			 Matches("LOCK", "TABLE", MatchAny, "IN", "SHARE") ||
+			 Matches("LOCK", "ONLY", MatchAny, "IN", "SHARE") ||
+			 Matches("LOCK", "TABLE", "ONLY", MatchAny, "IN", "SHARE"))
+		COMPLETE_WITH("MODE", "ROW EXCLUSIVE MODE", "UPDATE EXCLUSIVE MODE");
+
+  /* Complete LOCK [TABLE] [ONLY]  [IN lockmode MODE] with "NOWAIT"*/
+	else if (HeadMatches("LOCK") && TailMatches("MODE"))
+		COMPLETE_WITH("NOWAIT");
 
 /* NOTIFY --- can be inside EXPLAIN, RULE, etc */
 	else if (TailMatches("NOTIFY"))


Re: Added schema level support for publication.

2021-09-28 Thread Amit Kapila
On Tue, Sep 28, 2021 at 8:15 PM vignesh C  wrote:
>
> On Mon, Sep 27, 2021 at 12:15 PM houzj.f...@fujitsu.com
>  wrote:
>
> Attached v34 patch has the changes for the same.
>

Few comments on v34-0001-Added-schema-level-support-for-publication
==
1.
+ * This rule parses publication object with and without keyword prefix.

I think we should write it as: "This rule parses publication objects
with and without keyword prefixes."

2.
+ * For the object without keyword prefix, we cannot just use
relation_expr here,
+ * because some extended expression in relation_expr cannot be used as a

/expression/expressions

3.
+/*
+ * Process pubobjspec_list to check for errors in any of the objects and
+ * convert PUBLICATIONOBJ_CONTINUATION into appropriate PublicationObjSpecType
+ * type.

4.
+ /*
+ * Check if setting the relation to a different schema will result in the
+ * publication having schema and same schema's table in the publication.
+ */
+ if (stmt->objectType == OBJECT_TABLE)
+ {
+ ListCell   *lc;
+ List*schemaPubids = GetSchemaPublications(nspOid);
+ foreach(lc, schemaPubids)
+ {
+ Oid pubid = lfirst_oid(lc);
+ if (list_member_oid(GetPublicationRelations(pubid, PUBLICATION_PART_ALL),
+ relid))
+ ereport(ERROR,
+ errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot move table \"%s\" to schema \"%s\"",
+RelationGetRelationName(rel), stmt->newschema),
+ errdetail("Altering table will result in having schema \"%s\" and
same schema's table \"%s\" in the publication \"%s\" which is not
supported.",
+   stmt->newschema,
+   RelationGetRelationName(rel),
+   get_publication_name(pubid, false)));
+ }
+ }

Let's slightly change the comment as: "Check that setting the relation
to a different schema won't result in the publication having schema
and the same schema's table." and errdetail as: "The schema \"%s\" and
same schema's table \"%s\" cannot be part of the same publication
\"%s\"."

Maybe it is better to specify that this will disallow the partition table case.

5.
ObjectsInPublicationToOids()
{
..
+ pubobj = (PublicationObjSpec *) lfirst(cell);
+ if (pubobj->pubobjtype == PUBLICATIONOBJ_TABLE)

It is better to keep an empty line between above two lines.

6.
List*schemaPubids = GetSchemaPublications(nspOid);
foreach(lc, schemaPubids)
..
Oid pubid = lfirst_oid(lc);
if (list_member_oid(GetPublicationRelations(pubid, PUBLICATION_PART_ALL),

Add an empty line between each of the above two lines.

7.
+ /*
+ * Schema lock is held until the publication is altered to prevent
+ * concurrent schema deletion. No need to unlock the schemas, the locks
+ * will be released automatically at the end of alter publication command.
+ */
+ LockSchemaList(schemaidlist);

I think it is better to add a similar comment at other places where
this function is called. And we can shorten the comment atop
LockSchemaList to something like: "The schemas specified in the schema
list are locked in AccessShareLock mode in order to prevent concurrent
schema deletion."

8. In CreatePublication(), the check if (stmt->for_all_tables) can be
the first check and then in else if we can process tables and schemas.

9.
AlterPublication()
{
..
+ /* Lock the publication so nobody else can do anything with it. */
+ LockDatabaseObject(PublicationRelationId, pubform->oid, 0,
+AccessExclusiveLock);

I think it is better to say why we need this lock. So, can we change
the comment to something like: "Lock the publication so nobody else
can do anything with it. This prevents concurrent alter to add
table(s) that were already going to become part of the publication by
adding corresponding schema(s) via this command and similarly it will
prevent the concurrent addition of schema(s) for which there is any
corresponding table being added by this command."

-- 
With Regards,
Amit Kapila.




Re: [Proposal] Global temporary tables

2021-09-28 Thread Pavel Stehule
hi

On Sun, Sep 26, 2021 at 6:05 AM, wenjing wrote:

>
>
> On Sep 23, 2021, at 21:55, Tony Zhu wrote:
>
> Hi Wenjing
>
> we have reviewed the code and run the regression tests; all tests pass.
> We believe the feature's code quality is ready for production, and I
> will change the status to "Ready for Committer".
>
> Thank you very much for your attention and testing.
> As we communicated, I fixed several issues and attached the latest patch.
>

It looks like the Windows build is broken:

https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.148542

Regards

Pavel

>
>
> Wenjing
>
>
>


Re: Failed transaction statistics to measure the logical replication progress

2021-09-28 Thread Greg Nancarrow
On Tue, Sep 28, 2021 at 8:04 PM Amit Kapila  wrote:
>
> > Another idea could be to have a separate
> > view, say pg_stat_subscription_xact but I'm not sure it's a better
> > idea.
> >
>
> Yeah, that is another idea but I am afraid that having three different
> views for subscription stats will be too much. I think it would be
> better if we can display these additional stats via the existing view
> pg_stat_subscription or the new view pg_stat_subscription_errors (or
> whatever name we want to give it).
>


It seems that we have come full-circle in the discussion of how these
new stats could best be added.
Other than adding a new "pg_stats_subscription_xact" view (which Amit
thought would result in too many views), I don't see any clear solution
here, because the new xact-related cumulative stats don't really
integrate consistently with the existing in-memory stats in
pg_stat_subscription. IMHO the new stats also wouldn't integrate well
into the existing error-related stats in the
"pg_stat_subscription_errors" view: even if its name was changed, that
view maintains lower-level error details, the way error entries are
removed doesn't allow the "success" stats to be maintained for the
subscription, and it doesn't fit well with the added
"pg_stat_reset_subscription_error" function either.
If we can't really add another view like "pg_stats_subscription_xact",
it seems we need to find some way for the new stats to fit more
consistently into the existing "pg_stat_subscription" view.

Regards,
Greg Nancarrow
Fujitsu Australia




RE: Failed transaction statistics to measure the logical replication progress

2021-09-28 Thread osumi.takami...@fujitsu.com
Hi,


Thank you, Amit-san and Sawada-san for the discussion.
On Tuesday, September 28, 2021 7:05 PM Amit Kapila  
wrote:
> > Another idea could be to have a separate view, say
> > pg_stat_subscription_xact but I'm not sure it's a better idea.
> >
> 
> Yeah, that is another idea but I am afraid that having three different
> views for subscription stats will be too much. I think it would be
> better if we can display these additional stats via the existing view
> pg_stat_subscription or the new view pg_stat_subscription_errors (or
> whatever name we want to give it).
pg_stat_subscription_errors specializes in showing an error record.
So, it would be awkward to combine it with other normal xact stats.


> > > > Then, if, we proceed in this direction, the place to implement
> > > > those stats would be on the LogicalRepWorker struct, instead ?
> > > >
> > >
> > > Or, we can make existing stats persistent and then add these stats
> > > on top of it. Sawada-San, do you have any thoughts on this matter?
> >
> > I think that making existing stats including received_lsn and
> > last_msg_receipt_time persistent by using stats collector could cause
> > massive reporting messages. We can report these messages with a
> > certain interval to reduce the amount of messages but we will end up
> > seeing old stats on the view.
> >
> 
> Can't we keep the current and new stats both in-memory and persist on disk?
> So, the persistent stats data will be used to fill the in-memory counters 
> after
> restarting of workers, otherwise, we will always refer to in-memory values.
I feel this isn't impossible.
The times we would have to update the xact stats values are at the end
of applying a message such as COMMIT, COMMIT PREPARED, or STREAM_ABORT,
or when an error happens during apply. So, if we want, we can update the
xact stats values at those moments accordingly.
I'm thinking that we would have a hash table whose key is a pair of
subid + relid and whose entry is the proposed stats structure, and
update the entry at the above times.

Here, one thing still a bit unclear to me is whether we should move the
existing stats of pg_stat_subscription (such as last_lsn and reply_lsn)
into the hash entry or not.
Currently, in pg_stat_get_subscription() for the pg_stat_subscription
view, the stats values are read directly from (a copy of) the existing
LogicalRepCtx->workers. I feel we need to avoid a situation where some
existing data is fetched from LogicalRepWorker while the new xact stats
come from the hash in the same function, in order to keep the function
consistent. Is this correct?

Another thing we need to discuss is where to put the new file holding
the contents of pg_stat_subscription. I'm thinking pg_logical/, because
the above idea no longer interacts with the stats collector.

Let me know if I miss something.


Best Regards,
Takamichi Osumi



Re: POC: Cleaning up orphaned files using undo logs

2021-09-28 Thread Antonin Houska
Thomas Munro  wrote:

> On Wed, Sep 29, 2021 at 8:18 AM Antonin Houska  wrote:
> > I'm just trying to use the existing infrastructure: the effect of DROP TABLE
> > also appears to be performed by the checkpointer. However, I don't know why
> > the unlinks need to be performed by the checkpointer.
> 
> For DROP TABLE, we leave an empty file (I've been calling it a
> "tombstone file") so that GetNewRelFileNode() won't let you reuse the
> same relfilenode in the same checkpoint cycle.  One reason is that
> wal_level=minimal has a data-eating crash recovery failure mode if you
> reuse a relfilenode in a checkpoint cycle.

Interesting. Is the problem that REDO of the DROP TABLE command deletes the
relfilenode which already contains the new data, but the new data cannot be
recovered because (due to wal_level=minimal) it's not present in WAL? In this
case I suppose that the checkpoint just ensures that the DROP TABLE won't be
replayed during the next crash recovery.

BTW, does that comment fix attached make sense to you? The corresponding code
in InitSync() is

/*
 * Create pending-operations hashtable if we need it.  Currently, we 
need
 * it if we are standalone (not under a postmaster) or if we are a
 * checkpointer auxiliary process.
 */
if (!IsUnderPostmaster || AmCheckpointerProcess())

I suspect this is also related to the commit 7ff23c6d27.

Thanks for your answer, I was considering to add you to CC :-)

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com

diff --git a/src/backend/storage/sync/sync.c b/src/backend/storage/sync/sync.c
index 1c78581354..ae6c5ff8e4 100644
--- a/src/backend/storage/sync/sync.c
+++ b/src/backend/storage/sync/sync.c
@@ -563,7 +563,7 @@ RegisterSyncRequest(const FileTag *ftag, SyncRequestType type,
 
 	if (pendingOps != NULL)
 	{
-		/* standalone backend or startup process: fsync state is local */
+		/* standalone backend or checkpointer process: fsync state is local */
 		RememberSyncRequest(ftag, type);
 		return true;
 	}


Re: Added schema level support for publication.

2021-09-28 Thread Greg Nancarrow
On Wed, Sep 29, 2021 at 3:16 PM Amit Kapila  wrote:
>
> 4.
> + /*
> + * Check if setting the relation to a different schema will result in the
> + * publication having schema and same schema's table in the publication.
> + */
> + if (stmt->objectType == OBJECT_TABLE)
> + {
> + ListCell   *lc;
> + List*schemaPubids = GetSchemaPublications(nspOid);
> + foreach(lc, schemaPubids)
> + {
> + Oid pubid = lfirst_oid(lc);
> + if (list_member_oid(GetPublicationRelations(pubid, PUBLICATION_PART_ALL),
> + relid))
> + ereport(ERROR,
> + errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> + errmsg("cannot move table \"%s\" to schema \"%s\"",
> +RelationGetRelationName(rel), stmt->newschema),
> + errdetail("Altering table will result in having schema \"%s\" and
> same schema's table \"%s\" in the publication \"%s\" which is not
> supported.",
> +   stmt->newschema,
> +   RelationGetRelationName(rel),
> +   get_publication_name(pubid, false)));
> + }
> + }
>
> Let's slightly change the comment as: "Check that setting the relation
> to a different schema won't result in the publication having schema
> and the same schema's table." and errdetail as: "The schema \"%s\" and
> same schema's table \"%s\" cannot be part of the same publication
> \"%s\"."
>

Since this code is in AlterTableNamespace() and the relation being
checked may or may not be part of a publication, I'd use "a
publication" instead of "the publication" in the comment.
Also, I'd say that we're doing the check because the mentioned
combination is not supported.

i.e. "Check that setting the relation to a different schema won't
result in a publication having both a schema and the same schema's
table, as this is not supported."

Regards,
Greg Nancarrow
Fujitsu Australia




RE: Failed transaction statistics to measure the logical replication progress

2021-09-28 Thread osumi.takami...@fujitsu.com
On Wednesday, September 29, 2021 2:59 PM Greg Nancarrow wrote:
> On Tue, Sep 28, 2021 at 8:04 PM Amit Kapila 
> wrote:
> >
> > > Another idea could be to have a separate view, say
> > > pg_stat_subscription_xact but I'm not sure it's a better idea.
> > >
> >
> > Yeah, that is another idea but I am afraid that having three different
> > views for subscription stats will be too much. I think it would be
> > better if we can display these additional stats via the existing view
> > pg_stat_subscription or the new view pg_stat_subscription_errors (or
> > whatever name we want to give it).
> 
> It seems that we have come full-circle in the discussion of how these new 
> stats
> could be best added.
> Other than adding a new "pg_stats_subscription_xact" view (which Amit
> thought would result in too many views) I don't see any clear solution here,
> because the new xact-related cumulative stats don't really consistently
> integrate with the existing in-memory stats in pg_stat_subscription, and IMHO
> the new stats wouldn't integrate well into the existing error-related stats 
> in the
> "pg_stat_subscription_errors" view (even if its name was changed, that view in
> any case maintains lower-level error details and the way error entries are
> removed doesn't allow the "success" stats to be maintained for the
> subscription and doesn't fit well with the added
> "pg_stat_reset_subscription_error" function either).
> If we can't really add another view like "pg_stats_subscription_xact", it 
> seems
> we need to find some way the new stats fit more consistently into the existing
> "pg_stat_subscription" view.
Yeah, I agree with your conclusion.
I feel we cannot avoid changing some of the underlying structure of
pg_stat_subscription for the xact stats.



Best Regards,
Takamichi Osumi



RE: Added schema level support for publication.

2021-09-28 Thread houzj.f...@fujitsu.com
On Wed, Sep 29, 2021 12:34 PM Amit Kapila 
> On Wed, Sep 29, 2021 at 9:07 AM houzj.f...@fujitsu.com
>  wrote:
> >
> > On Tues, Sep 28, 2021 10:46 PM vignesh C  wrote:
> > > Attached v34 patch has the changes for the same.
> >
> > 3)
> > +   /*
> > +* Check if setting the relation to a different schema will result 
> > in the
> > +* publication having schema and same schema's table in the
> publication.
> > +*/
> > +   if (stmt->objectType == OBJECT_TABLE)
> > +   {
> > +   ListCell   *lc;
> > +   List   *schemaPubids =
> GetSchemaPublications(nspOid);
> > +   foreach(lc, schemaPubids)
> > +   {
> > +   Oid pubid = lfirst_oid(lc);
> > +   if (list_member_oid(GetPublicationRelations(pubid,
> PUBLICATION_PART_ALL),
> > +   relid))
> > +   ereport(ERROR,
> >
> > How about we check this case like the following ?
> >
> > List   *schemaPubids = GetSchemaPublications(nspOid);
> > List   *relPubids = GetRelationPublications(RelationGetRelid(rel));
> > if (list_intersection(schemaPubids, relPubids))
> > ereport(ERROR, ...
> >
> 
> Won't this allow changing one of the partitions for which only the
> partitioned table is part of the target schema?

I think it still disallows changing a partition's schema to the published one.
I tested with the following SQL:
-
create schema sch1;
create schema sch2;
create schema sch3;

create table sch1.tbl1 (a int) partition by range (a);
create table sch2.tbl1_part1 partition of sch1.tbl1 for values from (1) to (101);
create table sch3.tbl1_part2 partition of sch1.tbl1 for values from (101) to (200);
create publication pub for ALL TABLES IN schema sch1, TABLE sch2.tbl1_part1;
alter table sch2.tbl1_part1 set schema sch1;
---* It will report an error here *
-

Did I miss something?

Best regards,
Hou zj