Re: [PATCH] Add pretty-printed XML output option

2023-02-09 Thread Jim Jones

On 09.02.23 08:23, Tom Lane wrote:

Um ... why are you using PG_TRY here at all?  It seems like
you have full control of the actually likely error cases.
The only plausible error out of the StringInfo calls is OOM,
and you probably don't want to trap that at all.


My intention was to catch any unexpected error from 
xmlDocDumpFormatMemory and handle it properly. But I guess you're right, 
I can control the likely error cases by checking doc and nbytes.


Are you suggesting something along these lines?

    xmlDocPtr   doc;
    xmlChar    *xmlbuf = NULL;
    text       *arg = PG_GETARG_TEXT_PP(0);
    StringInfoData buf;
    int         nbytes;

    doc = xml_parse(arg, XMLOPTION_DOCUMENT, false,
                    GetDatabaseEncoding(), NULL);

    /* fail if the input could not be parsed */
    if (!doc)
        elog(ERROR, "could not parse the given XML document");

    /* let libxml2 produce the indented serialization */
    xmlDocDumpFormatMemory(doc, &xmlbuf, &nbytes, 1);

    xmlFreeDoc(doc);

    /* nbytes == 0 means xmlDocDumpFormatMemory() failed */
    if (!nbytes)
        elog(ERROR, "could not indent the given XML document");

    initStringInfo(&buf);
    appendStringInfoString(&buf, (const char *) xmlbuf);

    xmlFree(xmlbuf);

    PG_RETURN_XML_P(stringinfo_to_xmltype(&buf));


Thanks!

Best, Jim





Re: Time delayed LR (WAS Re: logical replication restrictions)

2023-02-09 Thread Amit Kapila
On Thu, Feb 9, 2023 at 10:45 AM Kyotaro Horiguchi
 wrote:
>
> At Wed, 8 Feb 2023 09:03:03 +, "Hayato Kuroda (Fujitsu)" 
>  wrote in
> > Thank you for reviewing! PSA new version.
>
> +   if (statusinterval_ms > 0 && diffms > statusinterval_ms)
>
> The next expected feedback time is measured from the last status
> report.  Thus, it seems to me this may suppress feedbacks from being
> sent for an unexpectedly long time especially when min_apply_delay is
> shorter than wal_r_s_interval.
>

I think the minimum time before we send any feedback during the wait
is wal_receiver_status_interval. Now, if there is no transaction for a
long time before we get a new transaction, there should be keep-alive
messages in between, which would allow us to send feedback at regular
intervals (wal_receiver_status_interval). So, I think we should be
able to send feedback in less than 2 * wal_receiver_status_interval,
unless wal_sender/receiver timeout is very large and there is a very
low volume of transactions. Now, we could try to send the feedback
before we start waiting, or maybe after every
wal_receiver_status_interval / 2, but I think that would lead to more
spurious feedback messages than the benefit we get from them.
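
To make the timing concrete, here is a simplified sketch of the kind of
wait loop being discussed (remaining_delay_ms() and the last_received
handling are hypothetical simplifications; the real logic lives in the
patch's maybe_apply_delay()):

    while (true)
    {
        long    diffms = remaining_delay_ms();  /* hypothetical helper */
        long    status_interval_ms = wal_receiver_status_interval * 1000L;
        long    timeout_ms;

        if (diffms <= 0)
            break;              /* min_apply_delay has fully elapsed */

        /* never sleep past the next expected feedback time */
        timeout_ms = (status_interval_ms > 0) ?
            Min(diffms, status_interval_ms) : diffms;

        WaitLatch(MyLatch,
                  WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
                  timeout_ms,
                  WAIT_EVENT_RECOVERY_APPLY_DELAY);
        ResetLatch(MyLatch);

        /* keep the walsender alive while we are intentionally idle */
        if (status_interval_ms > 0 && diffms > status_interval_ms)
            send_feedback(last_received, true, false, true);
    }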

-- 
With Regards,
Amit Kapila.




Re: Can we do something to help stop users mistakenly using force_parallel_mode?

2023-02-09 Thread John Naylor
On Thu, Feb 9, 2023 at 5:50 AM David Rowley  wrote:
> Attached updated patch.

Looks good at a glance, just found a spurious word:

+ "by forcing the planner into to generate plans which contains nodes "

--
John Naylor
EDB: http://www.enterprisedb.com


Re: Is psSocketPoll doing the right thing?

2023-02-09 Thread Fujii Masao




On 2023/02/09 11:50, Kyotaro Horiguchi wrote:

Hello.

While looking at a patch, I found that pqSocketPoll passes the result
from poll(2) through to the caller and throws away revents.  If I
understand it correctly, poll() *doesn't* return -1 or set errno merely
because it has set POLLERR, POLLHUP, POLLNVAL, or POLLRDHUP for some of
the target sockets, and returns 0 unless poll() itself failed to work.


If I understand correctly, poll() returns >0 if "revents"
has any of those bits set, not 0 or -1.

You're thinking that pqSocketPoll() should check "revents" and
return -1 if any of those bits is set?
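
For illustration, a minimal standalone sketch of that alternative -- a
poll() wrapper that inspects revents and reports error conditions as -1
(this is not libpq's actual pqSocketPoll()):

    #include <poll.h>

    static int
    socket_poll_checked(int sock, int forRead, int forWrite, int timeout_ms)
    {
        struct pollfd pfd;
        int     rc;

        pfd.fd = sock;
        pfd.events = 0;
        if (forRead)
            pfd.events |= POLLIN;
        if (forWrite)
            pfd.events |= POLLOUT;
        pfd.revents = 0;

        rc = poll(&pfd, 1, timeout_ms);
        if (rc <= 0)
            return rc;          /* timeout (0) or poll() failure (-1) */

        /* poll() returned >0: error conditions only show up in revents */
        if (pfd.revents & (POLLERR | POLLHUP | POLLNVAL))
            return -1;          /* surface socket-level trouble to the caller */

        return rc;
    }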

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Consolidate ItemPointer to Datum conversion functions

2023-02-09 Thread Peter Eisentraut

On 06.02.23 11:11, Heikki Linnakangas wrote:

On 06/02/2023 11:54, Peter Eisentraut wrote:

Instead of defining the same set of macros several times, define it once
in an appropriate header file.  In passing, convert to inline functions.


Looks good to me. Did you consider moving PG_GETARG_ITEMPOINTER and 
PG_RETURN_ITEMPOINTER, too? They're only used in tid.c, but for most 
datatypes, we define the PG_GETARG and PG_RETURN macros in the same 
header file as the Datum conversion functions.


Yeah, that makes sense.  Here is an updated patch for that.
From 47b316eb6cfa50412d81fd183a8b54e0a104a685 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 9 Feb 2023 09:23:20 +0100
Subject: [PATCH v2] Consolidate ItemPointer to Datum conversion functions

Instead of defining the same set of macros several times, define it
once in an appropriate header file.  In passing, convert to inline
functions.

Discussion: 
https://www.postgresql.org/message-id/flat/844dd4c5-e5a1-3df1-bfaf-d1e1c2a16e45%40enterprisedb.com
---
 contrib/pageinspect/btreefuncs.c |  2 --
 contrib/pageinspect/ginfuncs.c   |  3 ---
 contrib/pageinspect/gistfuncs.c  |  2 --
 src/backend/utils/adt/tid.c  |  5 -
 src/include/storage/itemptr.h| 20 
 5 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/contrib/pageinspect/btreefuncs.c b/contrib/pageinspect/btreefuncs.c
index e4e5dc3c81..9cdc8e182b 100644
--- a/contrib/pageinspect/btreefuncs.c
+++ b/contrib/pageinspect/btreefuncs.c
@@ -50,8 +50,6 @@ PG_FUNCTION_INFO_V1(bt_multi_page_stats);
 
 #define IS_INDEX(r) ((r)->rd_rel->relkind == RELKIND_INDEX)
 #define IS_BTREE(r) ((r)->rd_rel->relam == BTREE_AM_OID)
-#define DatumGetItemPointer(X)  ((ItemPointer) DatumGetPointer(X))
-#define ItemPointerGetDatum(X)  PointerGetDatum(X)
 
 /* 
  * structure for single btree page statistics
diff --git a/contrib/pageinspect/ginfuncs.c b/contrib/pageinspect/ginfuncs.c
index efaa47e86d..0f846988df 100644
--- a/contrib/pageinspect/ginfuncs.c
+++ b/contrib/pageinspect/ginfuncs.c
@@ -21,9 +21,6 @@
 #include "utils/builtins.h"
 #include "utils/rel.h"
 
-#define DatumGetItemPointer(X)  ((ItemPointer) DatumGetPointer(X))
-#define ItemPointerGetDatum(X)  PointerGetDatum(X)
-
 
 PG_FUNCTION_INFO_V1(gin_metapage_info);
 PG_FUNCTION_INFO_V1(gin_page_opaque_info);
diff --git a/contrib/pageinspect/gistfuncs.c b/contrib/pageinspect/gistfuncs.c
index 3a947c82af..100697814d 100644
--- a/contrib/pageinspect/gistfuncs.c
+++ b/contrib/pageinspect/gistfuncs.c
@@ -31,8 +31,6 @@ PG_FUNCTION_INFO_V1(gist_page_items_bytea);
 
 #define IS_GIST(r) ((r)->rd_rel->relam == GIST_AM_OID)
 
-#define ItemPointerGetDatum(X)  PointerGetDatum(X)
-
 
 Datum
 gist_page_opaque_info(PG_FUNCTION_ARGS)
diff --git a/src/backend/utils/adt/tid.c b/src/backend/utils/adt/tid.c
index 251219a1ef..77fb74ab0c 100644
--- a/src/backend/utils/adt/tid.c
+++ b/src/backend/utils/adt/tid.c
@@ -37,11 +37,6 @@
 #include "utils/varlena.h"
 
 
-#define DatumGetItemPointer(X)  ((ItemPointer) DatumGetPointer(X))
-#define ItemPointerGetDatum(X)  PointerGetDatum(X)
-#define PG_GETARG_ITEMPOINTER(n) DatumGetItemPointer(PG_GETARG_DATUM(n))
-#define PG_RETURN_ITEMPOINTER(x) return ItemPointerGetDatum(x)
-
 #define LDELIM '('
 #define RDELIM ')'
 #define DELIM  ','
diff --git a/src/include/storage/itemptr.h b/src/include/storage/itemptr.h
index 354e50e68b..fafefa14cd 100644
--- a/src/include/storage/itemptr.h
+++ b/src/include/storage/itemptr.h
@@ -222,4 +222,24 @@ extern int32 ItemPointerCompare(ItemPointer arg1, ItemPointer arg2);
 extern void ItemPointerInc(ItemPointer pointer);
 extern void ItemPointerDec(ItemPointer pointer);
 
+/* 
+ * Datum conversion functions
+ * 
+ */
+
+static inline ItemPointer
+DatumGetItemPointer(Datum X)
+{
+   return (ItemPointer) DatumGetPointer(X);
+}
+
+static inline Datum
+ItemPointerGetDatum(const ItemPointerData *X)
+{
+   return PointerGetDatum(X);
+}
+
+#define PG_GETARG_ITEMPOINTER(n) DatumGetItemPointer(PG_GETARG_DATUM(n))
+#define PG_RETURN_ITEMPOINTER(x) return ItemPointerGetDatum(x)
+
 #endif /* ITEMPTR_H */
-- 
2.39.1
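
As a usage note, with the macros now available from itemptr.h, a
C-language function over tids is written in the usual fmgr style; a
trivial, purely illustrative sketch (this function is hypothetical, not
part of the patch):

    #include "postgres.h"
    #include "fmgr.h"
    #include "storage/itemptr.h"

    PG_FUNCTION_INFO_V1(tid_identity);

    /* Return the tid argument unchanged, just to show the macros in use. */
    Datum
    tid_identity(PG_FUNCTION_ARGS)
    {
        ItemPointer tid = PG_GETARG_ITEMPOINTER(0);

        PG_RETURN_ITEMPOINTER(tid);
    }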



Re: Exit walsender before confirming remote flush in logical replication

2023-02-09 Thread Masahiko Sawada
Hi,

On Wed, Feb 8, 2023 at 6:47 PM Hayato Kuroda (Fujitsu)
 wrote:
>
> >
> > Dear Horiguchi-san,
> >
> > Thank you for checking the patch! PSA new version.
>
> PSA rebased patch that supports updated time-delayed patch[1].
>

Thank you for the patch! Here are some comments on the v5 patch:

+/*
+ * Options for controlling the behavior of the walsender. Options can be
+ * specified in the START_STREAMING replication command. Currently only one
+ * option is allowed.
+ */
+typedef struct
+{
+WalSndShutdownMode shutdown_mode;
+} WalSndOptions;
+
+static WalSndOptions *my_options = NULL;

I'm not sure we need to have it as a struct at this stage since we
support only one option. I wonder if we can have just one value, say
shutdown_mode, and make it a struct when we really need that.
Even if we use the WalSndOptions struct, I don't think we need to
dynamically allocate it. Since a walsender can start logical
replication multiple times in principle, my_options is never freed.
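
In other words, something like this minimal sketch (the enum mirrors the
patch under review, with the spelling normalized; the simplification
itself is only a suggestion):

    /*
     * Keep the mode in a plain static variable instead of a dynamically
     * allocated WalSndOptions struct.
     */
    typedef enum
    {
        WALSND_SHUTDOWN_MODE_WAIT_FLUSH,    /* default: wait for remote flush */
        WALSND_SHUTDOWN_MODE_IMMEDIATE      /* exit without confirming flush */
    } WalSndShutdownMode;

    static WalSndShutdownMode shutdown_mode = WALSND_SHUTDOWN_MODE_WAIT_FLUSH;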

---
+/*
+ * Parse given shutdown mode.
+ *
+ * Currently two values are accepted - "wait_flush" and "immediate"
+ */
+static void
+ParseShutdownMode(char *shutdownmode)
+{
+if (pg_strcasecmp(shutdownmode, "wait_flush") == 0)
+my_options->shutdown_mode = WALSND_SHUTDOWN_MODE_WAIT_FLUSH;
+else if (pg_strcasecmp(shutdownmode, "immediate") == 0)
+my_options->shutdown_mode = WALSND_SHUTDOWN_MODE_IMMIDEATE;
+else
+ereport(ERROR,
+errcode(ERRCODE_SYNTAX_ERROR),
+errmsg("SHUTDOWN_MODE requires
\"wait_flush\" or \"immediate\""));
+}

I think we should make the error message consistent with other enum
parameters. How about the message like:

ERROR:  invalid value shutdown mode: "%s"

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: proposal: psql: psql variable BACKEND_PID

2023-02-09 Thread Peter Eisentraut

On 03.02.23 11:41, Pavel Stehule wrote:
We can simply allow access to the backend process ID through a psql
variable. I propose the name "BACKEND_PID". The advantages of usage are
simple accessibility by the command \set, and less typing than using the
function pg_backend_pid, because psql variables are supported by the
tab-completion routine. The implementation is very simple, because we can
use the function PQbackendPID.


What would this be useful for?

You can mostly do this using

select pg_backend_pid() AS "BACKEND_PID" \gset





Re: proposal: psql: psql variable BACKEND_PID

2023-02-09 Thread Pavel Stehule
čt 9. 2. 2023 v 9:57 odesílatel Peter Eisentraut <
peter.eisentr...@enterprisedb.com> napsal:

> On 03.02.23 11:41, Pavel Stehule wrote:
> > We can simply allow access to the backend process ID through a psql
> > variable. I propose the name "BACKEND_PID". The advantages of usage are
> > simple accessibility by the command \set, and less typing than using the
> > function pg_backend_pid, because psql variables are supported by the
> > tab-completion routine. The implementation is very simple, because we can
> > use the function PQbackendPID.
>
> What would this be useful for?
>
> You can mostly do this using
>
>  select pg_backend_pid() AS "BACKEND_PID" \gset
>

I have two (or three) motivations:

first, and the main one for me - I can use psql variable tab completion -
just :B<TAB> - it is significantly faster
second - I can see all connection-related information with \set
third - there is no hook on reconnect in psql - so if you implement
BACKEND_PID yourself, you have to ensure the query with pg_backend_pid()
is re-run after any reconnect or connection change.

It is true that you can run "select pg_backend_pid() AS "BACKEND_PID" \gset"
and store it in .psqlrc. But most of the time I am in a customer's
environment, and I don't have the time or possibility to do a complete
setup of .psqlrc. It looks (to me) like a generally useful feature to have
everywhere.

Regards

Pavel


Re: A bug with ExecCheckPermissions

2023-02-09 Thread Sergey Shinderuk

On 08.02.2023 21:23, Alvaro Herrera wrote:

On 2023-Feb-08, Amit Langote wrote:


On Wed, Feb 8, 2023 at 16:19 Alvaro Herrera  wrote:



I think we should also patch ExecCheckPermissions to use forboth(),
scanning the RTEs as it goes over the perminfos, and make sure that the
entries are consistent.


Hmm, we can’t use forboth here, because not all RTEs have the corresponding
RTEPermissionInfo, inheritance children RTEs, for example.


Doh, of course.


Also, it doesn’t make much sense to reinstate the original loop over
range table and fetch the RTEPermissionInfo for the RTEs with non-0
perminfoindex, because the main goal of the patch was to make
ExecCheckPermissions() independent of range table length.


Yeah, I'm thinking of a mechanism that would allow us to detect bugs in
development builds — no need to have it run in production builds.
However, I can't see any useful way to implement it.




Maybe something like the attached would do?


--
Sergey Shinderuk  https://postgrespro.com/
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 39bfb48dc22..94866743f48 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -577,6 +577,28 @@ ExecCheckPermissions(List *rangeTable, List *rteperminfos,
ListCell   *l;
boolresult = true;
 
+#ifdef USE_ASSERT_CHECKING
+   Bitmapset  *indexset = NULL;
+
+   /* Check that rteperminfos is consistent with rangeTable */
+   foreach(l, rangeTable)
+   {
+   RangeTblEntry *rte = lfirst_node(RangeTblEntry, l);
+
+   if (rte->perminfoindex != 0)
+   {
+   /* Sanity checks */
+   (void) getRTEPermissionInfo(rteperminfos, rte);
+   /* Many-to-one mapping not allowed */
+   Assert(!bms_is_member(rte->perminfoindex, indexset));
+   indexset = bms_add_member(indexset, rte->perminfoindex);
+   }
+   }
+
+   /* All rteperminfos are referenced */
+   Assert(bms_num_members(indexset) == list_length(rteperminfos));
+#endif
+
foreach(l, rteperminfos)
{
RTEPermissionInfo *perminfo = lfirst_node(RTEPermissionInfo, l);
diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 375b17b9f3b..f6d19226a0a 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -1399,6 +1399,7 @@ RI_Initial_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
pkrte->relid = RelationGetRelid(pk_rel);
pkrte->relkind = pk_rel->rd_rel->relkind;
pkrte->rellockmode = AccessShareLock;
+   pkrte->perminfoindex = 2;
 
pk_perminfo = makeNode(RTEPermissionInfo);
pk_perminfo->relid = RelationGetRelid(pk_rel);
@@ -1409,6 +1410,7 @@ RI_Initial_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
fkrte->relid = RelationGetRelid(fk_rel);
fkrte->relkind = fk_rel->rd_rel->relkind;
fkrte->rellockmode = AccessShareLock;
+   fkrte->perminfoindex = 1;
 
fk_perminfo = makeNode(RTEPermissionInfo);
fk_perminfo->relid = RelationGetRelid(fk_rel);


Inconsistent nullingrels due to oversight in deconstruct_distribute_oj_quals

2023-02-09 Thread Richard Guo
When we try to generate qual variants with different nullingrels in
deconstruct_distribute_oj_quals, we traverse all the JoinTreeItems and
adjust qual nulling bits as we crawl up the join tree.  For a
SpecialJoinInfo which commutes with current sjinfo from below left, in
the next level up it would null all the relids in its righthand.  So we
adjust qual nulling bits as below.

 /*
  * Adjust qual nulling bits for next level up, if needed.  We
  * don't want to put sjinfo's own bit in at all, and if we're
  * above sjinfo then we did it already.
  */
 if (below_sjinfo)
 quals = (List *)
 add_nulling_relids((Node *) quals,
othersj->min_righthand,
bms_make_singleton(othersj->ojrelid));

It seems to me there is an oversight here.  Actually, in the next level
up this othersj would null all the relids in its syn_righthand, not only
the relids in its min_righthand.  If the quals happen to contain
references to relids which are in othersj->syn_righthand but not in
othersj->min_righthand, these relids would not get updated with
othersj->ojrelid added.  And this would leave the qual nulling bits
inconsistent.

I've managed to devise a query that can show this problem.

create table t1(a int, b int);
create table t2(a int, b int);
create table t3(a int, b int);
create table t4(a int, b int);

insert into t1 select i, i from generate_series(1,10)i;
insert into t2 select i, i from generate_series(1,10)i;
insert into t3 select i, i from generate_series(1,1000)i;
insert into t4 select i, i from generate_series(1,1000)i;
analyze;

select * from t1 left join (t2 left join t3 on t2.a > t3.a) on t1.b = t2.b
left join t4 on t2.b = t3.b;

This query would trigger the Assert() in search_indexed_tlist_for_var.
So I wonder whether we should use othersj->syn_righthand here.

--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -2046,7 +2046,7 @@ deconstruct_distribute_oj_quals(PlannerInfo *root,
if (below_sjinfo)
quals = (List *)
add_nulling_relids((Node *) quals,
-  othersj->min_righthand,
+  othersj->syn_righthand,
   bms_make_singleton(othersj->ojrelid));

Thanks
Richard


Re: daitch_mokotoff module

2023-02-09 Thread Dag Lem
Tomas Vondra  writes:

> On 2/8/23 15:31, Dag Lem wrote:
>> Alvaro Herrera  writes:
>> 
>>> On 2023-Jan-17, Dag Lem wrote:
>>>
 + * Daitch-Mokotoff Soundex
 + *
 + * Copyright (c) 2021 Finance Norway
 + * Author: Dag Lem 
>>>
>>> Hmm, I don't think we accept copyright lines that aren't "PostgreSQL
>>> Global Development Group".  Is it okay to use that, and update the year
>>> to 2023?  (Note that answering "no" very likely means your patch is not
>>> candidate for inclusion.)  Also, we tend not to have "Author:" lines.
>>>
>> 
>> You'll have to forgive me for not knowing about this rule:
>> 
>>   grep -ER "Copyright.*[0-9]{4}" contrib/ | grep -v PostgreSQL
>> 
>> In any case, I have checked with the copyright owner, and it would be OK
>> to assign the copyright to "PostgreSQL Global Development Group".
>> 
>
> I'm not entirely sure what's the rule either, and I'm a committer. My
> guess is these cases are either old and/or adding a code that already
> existed elsewhere (like some of the double metaphone, for example), or
> maybe both. But I'd bet we'd prefer not adding more ...
>
>> To avoid going back and forth with patches, how do you propose that the
>> sponsor and the author of the contributed module should be credited?
>> Woule something like this be acceptable?
>> 
>
> We generally credit contributors in two ways - by mentioning them in the
> commit message, and by listing them in the release notes (for individual
> features).
>

I'll ask again, would the proposed credits be acceptable? In this case,
the code already existed elsewhere (as in your example for double
metaphone) as a separate extension. The copyright owner is OK with
copyright assignment, however I find it quite unreasonable that proper
credits should not be given. Neither commit messages nor release notes
follow the contributed module, which is in its entirety contributed by
an external entity.

I'll also point out that in addition to credits in code all over the
place, PostgreSQL has much more prominent credits in the documentation:

  grep -ER "Author" doc/ | grep -v PostgreSQL

"Author" is even documented as a top level section in the Reference
Pages as "Author (only used in the contrib section)", see

  https://www.postgresql.org/docs/15/docguide-style.html#id-1.11.11.8.2

If there really exists some new rule which says that for new
contributions under contrib/, credits should not be allowed in any way
in either code or documentation (IANAL, but AFAIU this would be in
conflict with laws on author's moral rights in several countries), then
one would reasonably expect that you'd be upfront about this, both in
documentation, and also as the very first thing when a contribution is
first proposed for inclusion.

Best regards

Dag Lem




Re: A bug with ExecCheckPermissions

2023-02-09 Thread Amit Langote
Hi,

On Thu, Feb 9, 2023 at 14:44 Sergey Shinderuk 
wrote:

> On 08.02.2023 21:23, Alvaro Herrera wrote:
> > On 2023-Feb-08, Amit Langote wrote:
> >
> >> On Wed, Feb 8, 2023 at 16:19 Alvaro Herrera 
> wrote:
> >
> >>> I think we should also patch ExecCheckPermissions to use forboth(),
> >>> scanning the RTEs as it goes over the perminfos, and make sure that the
> >>> entries are consistent.
> >>
> >> Hmm, we can’t use forboth here, because not all RTEs have the
> corresponding
> >> RTEPermissionInfo, inheritance children RTEs, for example.
> >
> > Doh, of course.
> >
> >> Also, it doesn’t make much sense to reinstate the original loop over
> >> range table and fetch the RTEPermissionInfo for the RTEs with non-0
> >> perminfoindex, because the main goal of the patch was to make
> >> ExecCheckPermissions() independent of range table length.
> >
> > Yeah, I'm thinking in a mechanism that would allow us to detect bugs in
> > development builds — no need to have it run in production builds.
> > However, I can't see any useful way to implement it.
> >
>
>
> Maybe something like the attached would do?


Thanks for the patch.  Something like this makes sense to me.

Thanks, Amit Langote
EDB: http://www.enterprisedb.com


RE: Time delayed LR (WAS Re: logical replication restrictions)

2023-02-09 Thread Takamichi Osumi (Fujitsu)
Hi,


On Thursday, February 9, 2023 4:56 PM Amit Kapila  
wrote:
> On Thu, Feb 9, 2023 at 12:17 AM Peter Smith 
> wrote:
> >
> > On Wed, Feb 8, 2023 at 8:03 PM Hayato Kuroda (Fujitsu)
> >  wrote:
> > >
> > ...
> > > > ==
> > > >
> > > > src/backend/replication/logical/worker.c
> > > >
> > > > 2. maybe_apply_delay
> > > >
> > > > + if (wal_receiver_status_interval > 0 &&
> > > > +     diffms > wal_receiver_status_interval * 1000L)
> > > > + {
> > > > +     WaitLatch(MyLatch,
> > > > +               WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
> > > > +               wal_receiver_status_interval * 1000L,
> > > > +               WAIT_EVENT_RECOVERY_APPLY_DELAY);
> > > > +     send_feedback(last_received, true, false, true);
> > > > + }
> > > >
> > > > I felt that introducing another variable like:
> > > >
> > > > long statusinterval_ms = wal_receiver_status_interval * 1000L;
> > > >
> > > > would help here by doing 2 things:
> > > > 1) The condition would be easier to read because the ms units
> > > > would be the same
> > > > 2) Won't need * 1000L repeated in two places.
> > > >
> > > > Only, do take care to assign this variable in the right place in
> > > > this loop in case the configuration is changed.
> > >
> > > Fixed. Calculations are done on two lines - first one is the
> > > entrance of the loop, and second one is the after SIGHUP is detected.
> > >
> >
> > TBH, I expected you would write this as just a *single* variable
> > assignment before the condition like below:
> >
> > SUGGESTION (tweaked comment and put single assignment before
> > condition)
> > /*
> >  * Call send_feedback() to prevent the publisher from exiting by
> >  * timeout during the delay, when the status interval is greater than
> >  * zero.
> >  */
> > status_interval_ms = wal_receiver_status_interval * 1000L;
> > if (status_interval_ms > 0 && diffms > status_interval_ms)
> > {
> >     ...
> >
> > ~
> >
> > I understand in theory, your code is more efficient, but in practice,
> > I think the overhead of a single variable assignment every loop
> > iteration (which is doing WaitLatch anyway) is of insignificant
> > concern, whereas having one assignment is simpler than having two IMO.
> >
> 
> Yeah, that sounds better to me as well.
OK, fixed.

The comment adjustment suggested by Peter-san above
was also included in this v33.
Please have a look at the attached patch.



Best Regards,
Takamichi Osumi



Attachment: v33-0001-Time-delayed-logical-replication-subscriber.patch


Re: Improve logging when using Huge Pages

2023-02-09 Thread Alvaro Herrera
On 2023-Feb-08, Justin Pryzby wrote:

> I don't think it makes sense to run postgres -C huge_pages_active,
> however, so I see no issue that that would always returns "false".

Hmm, I would initialize it to return "unknown" rather than "off" — and
make sure it turns "off" at the appropriate time.  Otherwise you're just
moving the confusion elsewhere.

> If need be, maybe the documentation could say "indicates whether huge
> pages are active for the running server".

Dunno, that seems way too subtle.

> Does anybody else want to vote for a function rather than a
> RUNTIME_COMPUTED GUC ?

I don't think I'd like to have SELECT show_block_size() et al, so I'd
rather not go that way.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"¿Qué importan los años?  Lo que realmente importa es comprobar que
a fin de cuentas la mejor edad de la vida es estar vivo"  (Mafalda)




Re: ANY_VALUE aggregate

2023-02-09 Thread Vik Fearing

On 1/23/23 08:50, David Rowley wrote:

On Thu, 19 Jan 2023 at 06:01, Vik Fearing  wrote:

Thank you for the review.  Attached is a new version rebased to d540a02a72.


I've only a bunch of nit-picks, personal preferences and random
thoughts to offer as a review:

1. I'd be inclined *not* to mention the possible future optimisation in:

+ * Currently this just returns the first value, but in the future it might be
+ * able to signal to the aggregate that it does not need to be called anymore.

I think it's unlikely that the transfn would "signal" such a thing. It
seems more likely if we did anything about it that nodeAgg.c would
maybe have some additional knowledge not to call that function if the
agg state already has a value. Just so we're not preempting how we
might do such a thing in the future, it seems best just to remove the
mention of it. I don't really think it serves as a good reminder that
we might want to do this one day anyway.


Modified.  My logic in having the transition function signal that it is 
finished is to one day allow something like:


  array_agg(x order by y limit z)


2. +any_value_trans(PG_FUNCTION_ARGS)

Many of transition function names end in "transfn", not "trans". I
think it's better to follow the existing (loosely followed) naming
pattern that a few aggregates seem to follow rather than invent a new
one.


Renamed.
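
For reference, since the aggregate is strict and ignores nulls, the
renamed transition function can be as minimal as returning its current
state; a sketch along those lines (the patch's actual body may differ):

    /*
     * The first non-null input initializes the transition state, and
     * every later call simply keeps the existing state.
     */
    Datum
    any_value_transfn(PG_FUNCTION_ARGS)
    {
        PG_RETURN_DATUM(PG_GETARG_DATUM(0));
    }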


3. I tend to try to copy the capitalisation of keywords from the
surrounding regression tests. I see the following breaks that.

+SELECT any_value(v) FILTER (WHERE v > 2) FROM (VALUES (1), (2), (3)) AS v (v);

(obviously, ideally, we'd always just follow the same capitalisation
of keywords everywhere in each .sql file, but we've long broken that
and the best way can do is be consistent with surrounding tests)


Downcased.


4. I think I'd use the word "Returns" instead of "Chooses" in:

+Chooses a non-deterministic value from the non-null input values.


Done.


5. I've not managed to find a copy of the 2023 draft, so I'm assuming
you've got the ignoring of NULLs correct.


Yes, I do.  This is part of , so SQL:2016 10.9 
GR 7.a applies.



6. Is it worth adding a WindowFunc test somewhere in window.sql with
an any_value(...) over (...)?  Is what any_value() returns as a
WindowFunc equally as non-deterministic as when it's used as an
Aggref? Can we assume there's no guarantee that it'll return the same
value for each partition in each row? Does the spec mention anything
about that?


This is governed by SQL:2016 10.9 GR 1.d and 1.e which defines the 
source rows for the aggregate: either a group or a window frame.  There 
is no difference in behavior.  I don't think a windowed test is useful 
here unless I were to implement moving transitions.  I think that might 
be overkill for this function.



7. I wondered if it's worth adding a
SupportRequestOptimizeWindowClause support function for this
aggregate. I'm thinking that it might not be as likely people would
use something more specific like first_value/nth_value/last_value
instead of using any_value as a WindowFunc. Also, I'm currently
thinking that a SupportRequestWFuncMonotonic for any_value() is not
worth the dozen or so lines of code it would take to write it. I'm
assuming it would always be a MONOTONICFUNC_BOTH function. It seems
unlikely that someone would have a subquery with a WHERE clause in the
upper-level query referencing the any_value() aggregate.  Thought I'd
mention both of these things anyway as someone else might think of
some good reason we should add them that I didn't think of.


I thought about this for a while and decided that it was not worthwhile.

v4 attached.  I am putting this back to Needs Review in the commitfest 
app, but these changes were editorial so it is probably RfC like Peter 
had set it.  I will let you be the judge of that.

--
Vik Fearing
From 410751cfa6367e5436e20011f3f47f37888190a1 Mon Sep 17 00:00:00 2001
From: Vik Fearing 
Date: Thu, 9 Feb 2023 10:37:10 +0100
Subject: [PATCH v4] Implement ANY_VALUE aggregate

SQL:2023 defines an ANY_VALUE aggregate whose purpose is to emit an
implementation-dependent (i.e. non-deterministic) value from the
aggregated rows.
---
 doc/src/sgml/func.sgml   | 14 ++
 src/backend/catalog/sql_features.txt |  1 +
 src/backend/utils/adt/misc.c | 10 ++
 src/include/catalog/pg_aggregate.dat |  4 
 src/include/catalog/pg_proc.dat  |  8 
 src/test/regress/expected/aggregates.out | 24 
 src/test/regress/sql/aggregates.sql  |  6 ++
 7 files changed, 67 insertions(+)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index e09e289a43..8bdef6eb32 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -19735,6 +19735,20 @@ SELECT NULLIF(value, '(none)') ...
  
 
  
+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>any_value</primary>
+        </indexterm>
+        <function>any_value</function> ( <type>anyelement</type> )
+        <returnvalue>same as input type</returnvalue>
+       </para>
+       <para>

Re: [PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent

2023-02-09 Thread Aleksander Alekseev
Hi,

> Yes, and the fact is that cmin == cmax is something that we don't normally
> produce

Not sure if this is particularly relevant to this discussion, but I
can't resist noticing that the heap doesn't even store cmin and
cmax... There is only HeapTupleHeaderData.t_cid and flags. cmin/cmax
are merely smoke and mirrors we use to trick a user.

And yes, the patch doesn't seem to break the mirrors much:

```
=# create table t (a int unique, b int);
=# insert into t values (1,1), (1,2) on conflict (a) do update set b = 0;
=# SELECT xmin, xmax, cmin, cmax, * FROM t;
 xmin | xmax | cmin | cmax | a | b
--+--+--+--+---+---
  731 |0 |0 |0 | 1 | 0
=# begin;
=# insert into t values (2,1), (2,2), (3,1) on conflict (a) do update set b = 0;
=# SELECT xmin, xmax, cmin, cmax, * FROM t;
 xmin | xmax | cmin | cmax | a | b
--+--+--+--+---+---
  731 |0 |0 |0 | 1 | 0
  732 |0 |0 |0 | 2 | 0
  732 |0 |0 |0 | 3 | 1

=# insert into t values (2,1), (2,2), (3,1) on conflict (a) do update set b = 0;
=# SELECT xmin, xmax, cmin, cmax, * FROM t;
 xmin | xmax | cmin | cmax | a | b
--+--+--+--+---+---
  731 |0 |0 |0 | 1 | 0
  732 |  732 |1 |1 | 2 | 0
  732 |  732 |1 |1 | 3 | 0

=# commit;
=# SELECT xmin, xmax, cmin, cmax, * FROM t;
 xmin | xmax | cmin | cmax | a | b
--+--+--+--+---+---
  731 |0 |0 |0 | 1 | 0
  732 |  732 |1 |1 | 2 | 0
  732 |  732 |1 |1 | 3 | 0
```

> That's a spectactularly wrong argument in almost all cases. Unless you have a
> way to get to full branch coverage or use a model checker that basically does
> the same, testing isn't going to give you a whole lot of confidence that you
> haven't introduced bugs.

But neither will reviewing a lot of code...

> I've said my piece, as-is I vote to reject the patch.

Fair enough. I'm merely saying that rejecting a patch because it
doesn't include a TLA+ model is something novel :)

> I don't buy your argument about DO UPDATE needing to be brought into
> line with DO NOTHING. In any case I'm pretty sure that Tom's remarks
> in 2016 about a behavioral inconsistencies (which you cited) actually
> called for making DO NOTHING more like DO UPDATE -- not the other way
> around.

Interesting. Yep, we could use a bit of input from Tom on this one.

This of course would break backward compatibility. But we can always
invent something like:

```
INSERT INTO ..
ON CONFLICT DO [NOTHING|UPDATE .. ]
[ALLOWING|FORBIDDING] SELF CONFLICTS;
```

... if we really want to.

> problem. To me the best argument is also the simplest: who would want
> us to allow it, and for what purpose?

Good question.

This arguably has little use for application developers. As an
application developer you typically know your unique constraints and
using this knowledge you can rewrite the query as needed and add any
other accompanying logic.

However, extension developers, as an example, often don't know the
underlying unique constraints (more specifically, it's difficult to
look for them and process them manually) and often have to process any
garbage the application developer passes to an extension.

This of course is applicable not only to extensions, but to any
middleware between the DBMS and the application.

-- 
Best regards,
Aleksander Alekseev




RE: Exit walsender before confirming remote flush in logical replication

2023-02-09 Thread Hayato Kuroda (Fujitsu)
Dear Osumi-san,

Thank you for reviewing! PSA new version.

> (1)
> 
> +  Decides the condition for exiting the walsender process.
> +  'wait_flush', which is the default, the 
> walsender
> +  will wait for all the sent WALs to be flushed on the subscriber 
> side,
> +  before exiting the process. 'immediate' will 
> exit
> +  without confirming the remote flush. This may break the consistency
> +  between publisher and subscriber, but it may be useful for a system
> +  that has a high-latency network to reduce the amount of time for
> +  shutdown.
>
> (1-1)
> 
> The first part "exiting the walsender process" can be improved.
> Probably, you can say "the exiting walsender process" or
> "Decides the behavior of the walsender process at shutdown" instread.

Fixed. Second idea was chosen.

> (1-2)
> 
> Also, the next sentence can be improved something like
> "If the shutdown mode is wait_flush, which is the default, the
> walsender waits for all the sent WALs to be flushed on the subscriber side.
> If it is immediate, the walsender exits without confirming the remote flush".

Fixed.

> (1-3)
> 
> We don't need to wrap wait_flush and immediate by single quotes
> within the literal tag.

This style was ported from the SNAPSHOT options part, so I decided to keep it.


> (2)
> 
> +   /* minapplydelay affects SHUTDOWN_MODE option */
> 
> I think we can move this comment to just above the 'if' condition
> and combine it with the existing 'if' conditions comments.

Moved and added some comments.

> (3) 001_rep_changes.pl
> 
> (3-1) Question
> 
> In general, do we add this kind of check when we extend the protocol
> (STREAM_REPLICATION command)
> or add a new condition for apply worker exit ?
> In case when we would like to know the restart of the walsender process in TAP
> tests,
> then could you tell me why the new test code matches the purpose of this 
> patch ?

The replication command is not for normal users, so I think we don't have to
test it directly.

The check that waits for the apply worker to restart was added to improve the
robustness. I think there is a possibility that the test fails when the apply
worker receives a transaction before it checks the new subscription option.
Now the failure can be avoided by confirming the reload of pg_subscription and
the restart.

> (3-2)
> 
> +  "Timed out while waiting for apply to restart after changing 
> min_apply_delay
> to non-zero value";
> 
> Probably, we can partly change this sentence like below, because we check
> walsender's pid.
> FROM: "... while waiting for apply to restart..."
> TO:   "... while waiting for the walsender to restart..."

Right, fixed.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



Attachment: v6-0001-Time-delayed-logical-replication-subscriber.patch
Attachment: v6-0002-Extend-START_REPLICATION-command-to-accept-walsen.patch


RE: Exit walsender before confirming remote flush in logical replication

2023-02-09 Thread Hayato Kuroda (Fujitsu)
Dear Sawada-san,

Thank you for reviewing!

> +/*
> + * Options for controlling the behavior of the walsender. Options can be
> + * specified in the START_STREAMING replication command. Currently only one
> + * option is allowed.
> + */
> +typedef struct
> +{
> +WalSndShutdownMode shutdown_mode;
> +} WalSndOptions;
> +
> +static WalSndOptions *my_options = NULL;
> 
> I'm not sure we need to have it as a struct at this stage since we
> support only one option. I wonder if we can have one value, say
> shutdown_mode, and we can make it a struct when we really need it.
> Even if we use WalSndOptions struct, I don't think we need to
> dynamically allocate it. Since a walsender can start logical
> replication multiple times in principle, my_options is not freed.

+1, removed the structure.

> ---
> +/*
> + * Parse given shutdown mode.
> + *
> + * Currently two values are accepted - "wait_flush" and "immediate"
> + */
> +static void
> +ParseShutdownMode(char *shutdownmode)
> +{
> +if (pg_strcasecmp(shutdownmode, "wait_flush") == 0)
> +my_options->shutdown_mode =
> WALSND_SHUTDOWN_MODE_WAIT_FLUSH;
> +else if (pg_strcasecmp(shutdownmode, "immediate") == 0)
> +my_options->shutdown_mode =
> WALSND_SHUTDOWN_MODE_IMMIDEATE;
> +else
> +ereport(ERROR,
> +errcode(ERRCODE_SYNTAX_ERROR),
> +errmsg("SHUTDOWN_MODE requires
> \"wait_flush\" or \"immediate\""));
> +}
> 
> I think we should make the error message consistent with other enum
> parameters. How about the message like:
> 
> ERROR:  invalid value shutdown mode: "%s"

Modified to match the style of other enum parameters, and a hint message was
also provided.
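
For illustration, the reworked report might look roughly like this (a
sketch in the style of other enum-valued options; the exact wording is in
the attached patch):

    ereport(ERROR,
            errcode(ERRCODE_SYNTAX_ERROR),
            errmsg("invalid value for shutdown mode: \"%s\"", shutdownmode),
            errhint("Available values: wait_flush, immediate."));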

New patch is attached in [1].

[1]: 
https://www.postgresql.org/message-id/TYAPR01MB586683FC450662990E356A0EF5D99%40TYAPR01MB5866.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



Re: [PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent

2023-02-09 Thread Aleksander Alekseev
Hi,

```
=# commit;
=# SELECT xmin, xmax, cmin, cmax, * FROM t;
 xmin | xmax | cmin | cmax | a | b
--+--+--+--+---+---
  731 |0 |0 |0 | 1 | 0
  732 |  732 |1 |1 | 2 | 0
  732 |  732 |1 |1 | 3 | 0
```

Oops, you got me :) This of course isn't right - the xmax transaction
is committed but we still see the data, etc.

If we really are going to work on this, this part is going to require more work.

-- 
Best regards,
Aleksander Alekseev




Re: Rework LogicalOutputPluginWriterUpdateProgress (WAS Re: Logical replication timeout ...)

2023-02-09 Thread Amit Kapila
On Thu, Feb 9, 2023 at 11:21 AM Amit Kapila  wrote:
>
>
> How about renaming ProcessPendingWrites to WaitToSendPendingWrites or
> WalSndWaitToSendPendingWrites?
>

How about renaming WalSndUpdateProgress() to
WalSndUpdateProgressAndSendKeepAlive() or
WalSndUpdateProgressAndKeepAlive()?

One thing to note about the changes we are discussing here is that
some of the plugins like wal2json already call
OutputPluginUpdateProgress in their commit callback. They may need to
update it accordingly.

One difference I see with the patch is that I think we will end up
sending keepalive for empty prepared transactions even though we don't
skip sending begin/prepare messages for those. The reason why we don't
skip sending prepare for empty 2PC xacts is that if the WALSender
restarts after the PREPARE of a transaction and before the COMMIT
PREPARED of the same transaction then we won't be able to figure out
if we have skipped sending BEGIN/PREPARE of a transaction. To skip
sending prepare for empty xacts, we previously thought of some ideas
like (a) At commit-prepare time have a check on the subscriber-side to
know whether there is a corresponding prepare for it before actually
doing commit-prepare but that sounded costly. (b) somehow persist the
information whether the PREPARE for a xact is already sent and then
use that information for commit prepared but again that also didn't
sound like a good idea.

-- 
With Regards,
Amit Kapila.




Re: [PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent

2023-02-09 Thread Andres Freund
Hi,

On 2023-02-09 13:06:04 +0300, Aleksander Alekseev wrote:
> > Yes, and the fact is that cmin == cmax is something that we don't normally
> > produce
> 
> Not sure if this is particularly relevant to this discussion but I
> can't resist noticing that the heap doesn't even store cmin and
> cmax... There is only HeapTupleHeaderData.t_cid and flags. cmin/cmax
> are merely smoke and mirrors we use to trick a user.

No, they're not just that. Yes, cmin/cmax aren't both stored on-disk, but if
both are needed, they *are* stored in-memory. We can do that because it's only
ever needed from within a transaction.
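
(The in-memory mechanism referred to here is the backend-local "combo
command ID" mapping in src/backend/utils/time/combocid.c; conceptually it
is something like this:)

    /*
     * Conceptual sketch of the combo command ID idea -- see combocid.c
     * for the real implementation.  The tuple header has room for one
     * command ID, so when a tuple is inserted and deleted in the same
     * transaction, t_cid holds a "combo" ID that indexes a backend-local
     * table mapping it back to the real (cmin, cmax) pair.
     */
    typedef struct
    {
        CommandId   cmin;       /* inserting command ID */
        CommandId   cmax;       /* deleting command ID */
    } ComboCidKeyData;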


> > That's a spectactularly wrong argument in almost all cases. Unless you have 
> > a
> > way to get to full branch coverage or use a model checker that basically 
> > does
> > the same, testing isn't going to give you a whole lot of confidence that you
> > haven't introduced bugs.
> 
> But neither will reviewing a lot of code...

And yet my review did figure out that your patch would have visibility
problems, which you did end up having, as you noticed yourself downthread :)


> > I've said my piece, as-is I vote to reject the patch.
> 
> Fair enough. I'm merely saying that rejecting a patch because it
> doesn't include a TLA+ model is something novel :)

I obviously am not suggesting that (although some things could probably
benefit). Just that having an example showing something working isn't
sufficient to consider something suspicious OK.

And changes affecting heapam.c visibility semantics need extremely careful
review, I have the battle scars to prove that to be true :P.

Greetings,

Andres Freund




Re: [PATCH] Add pretty-printed XML output option

2023-02-09 Thread Peter Eisentraut

On 02.02.23 21:35, Jim Jones wrote:
This small patch introduces an XML pretty print function. It basically 
takes advantage of the indentation feature of xmlDocDumpFormatMemory 
from libxml2 to format XML strings.


I suggest we call it "xmlformat", which is an established term for this.





Reconcile stats in find_tabstat_entry() and get rid of PgStat_BackendFunctionEntry

2023-02-09 Thread Drouvot, Bertrand

Hi hackers,

Please find attached a patch proposal for $SUBJECT.

The idea has been raised in [1] by Andres: it would allow us to simplify
even further the work done to generate the pg_stat_get_xact*() functions
with macros.

Indeed, with the reconciliation done in find_tabstat_entry(), all the
pg_stat_get_xact*() functions (making use of find_tabstat_entry()) now
"look the same", whether or not they take live subtransactions into
account.

Looking forward to your feedback,

Regards

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

[1]: https://www.postgresql.org/message-id/20230111225907.6el6c5j3hukizqxc%40awork3.anarazel.de

diff --git a/src/backend/utils/activity/pgstat.c b/src/backend/utils/activity/pgstat.c
index 0fa5370bcd..a08a2c8e66 100644
--- a/src/backend/utils/activity/pgstat.c
+++ b/src/backend/utils/activity/pgstat.c
@@ -291,7 +291,7 @@ static const PgStat_KindInfo pgstat_kind_infos[PGSTAT_NUM_KINDS] = {
.shared_size = sizeof(PgStatShared_Function),
.shared_data_off = offsetof(PgStatShared_Function, stats),
.shared_data_len = sizeof(((PgStatShared_Function *) 0)->stats),
-   .pending_size = sizeof(PgStat_BackendFunctionEntry),
+   .pending_size = sizeof(PgStat_FunctionCounts),
 
.flush_pending_cb = pgstat_function_flush_cb,
},
diff --git a/src/backend/utils/activity/pgstat_function.c b/src/backend/utils/activity/pgstat_function.c
index 6139a27fee..0fdcefb783 100644
--- a/src/backend/utils/activity/pgstat_function.c
+++ b/src/backend/utils/activity/pgstat_function.c
@@ -73,7 +73,7 @@ pgstat_init_function_usage(FunctionCallInfo fcinfo,
   PgStat_FunctionCallUsage *fcu)
 {
PgStat_EntryRef *entry_ref;
-   PgStat_BackendFunctionEntry *pending;
+   PgStat_FunctionCounts *pending;
boolcreated_entry;
 
if (pgstat_track_functions <= fcinfo->flinfo->fn_stats)
@@ -121,10 +121,10 @@ pgstat_init_function_usage(FunctionCallInfo fcinfo,
 
pending = entry_ref->pending;
 
-   fcu->fs = &pending->f_counts;
+   fcu->fs = pending;
 
	/* save stats for this function, later used to compensate for recursion */
-   fcu->save_f_total_time = pending->f_counts.f_total_time;
+   fcu->save_f_total_time = pending->f_total_time;
 
/* save current backend-wide total time */
fcu->save_total = total_func_time;
@@ -192,10 +192,10 @@ pgstat_end_function_usage(PgStat_FunctionCallUsage *fcu, bool finalize)
 bool
 pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
 {
-   PgStat_BackendFunctionEntry *localent;
+   PgStat_FunctionCounts *localent;
PgStatShared_Function *shfuncent;
 
-   localent = (PgStat_BackendFunctionEntry *) entry_ref->pending;
+   localent = (PgStat_FunctionCounts *) entry_ref->pending;
shfuncent = (PgStatShared_Function *) entry_ref->shared_stats;
 
/* localent always has non-zero content */
@@ -203,11 +203,11 @@ pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
if (!pgstat_lock_entry(entry_ref, nowait))
return false;
 
-   shfuncent->stats.f_numcalls += localent->f_counts.f_numcalls;
+   shfuncent->stats.f_numcalls += localent->f_numcalls;
shfuncent->stats.f_total_time +=
-   INSTR_TIME_GET_MICROSEC(localent->f_counts.f_total_time);
+   INSTR_TIME_GET_MICROSEC(localent->f_total_time);
shfuncent->stats.f_self_time +=
-   INSTR_TIME_GET_MICROSEC(localent->f_counts.f_self_time);
+   INSTR_TIME_GET_MICROSEC(localent->f_self_time);
 
pgstat_unlock_entry(entry_ref);
 
@@ -215,11 +215,11 @@ pgstat_function_flush_cb(PgStat_EntryRef *entry_ref, bool nowait)
 }
 
 /*
- * find any existing PgStat_BackendFunctionEntry entry for specified function
+ * find any existing PgStat_FunctionCounts entry for specified function
  *
  * If no entry, return NULL, don't create a new one
  */
-PgStat_BackendFunctionEntry *
+PgStat_FunctionCounts *
 find_funcstat_entry(Oid func_id)
 {
PgStat_EntryRef *entry_ref;
diff --git a/src/backend/utils/activity/pgstat_relation.c b/src/backend/utils/activity/pgstat_relation.c
index 2e20b93c20..b33965c581 100644
--- a/src/backend/utils/activity/pgstat_relation.c
+++ b/src/backend/utils/activity/pgstat_relation.c
@@ -466,14 +466,30 @@ PgStat_TableStatus *
 find_tabstat_entry(Oid rel_id)
 {
PgStat_EntryRef *entry_ref;
+   PgStat_TableStatus *tablestatus = NULL;
 
	entry_ref = pgstat_fetch_pending_entry(PGSTAT_KIND_RELATION, MyDatabaseId, rel_id);
if (!entry_ref)
		entry_ref = pgstat_fetch_pending_entry(PGSTAT_KIND_RELATION, InvalidOid, rel_id);
 
if (entry_ref)
-   return entry_ref->pending;
-   return NULL;
+   {
+   PgStat_TableSta

Re: [PATCH] Make ON CONFLICT DO NOTHING and ON CONFLICT DO UPDATE consistent

2023-02-09 Thread Aleksander Alekseev
Hi,

> And yet my review did figure out that your patch would have visibility
> problems, which you did end up having, as you noticed yourself downthread :)

Yep, this particular implementation turned out to be buggy.

>> I don't buy your argument about DO UPDATE needing to be brought into
>> line with DO NOTHING. In any case I'm pretty sure that Tom's remarks
>> in 2016 about a behavioral inconsistencies (which you cited) actually
>> called for making DO NOTHING more like DO UPDATE -- not the other way
>> around.
>
> Interesting. Yep, we could use a bit of input from Tom on this one.
>
> This of course would break backward compatibility. But we can always
> invent something like:
>
> ```
> INSERT INTO ..
> ON CONFLICT DO [NOTHING|UPDATE .. ]
> [ALLOWING|FORBIDDING] SELF CONFLICTS;
> ```
>
> ... if we really want to.

I suggest we discuss whether we even want to support something like this
before proceeding further, and then think about a particular
implementation if necessary.

One thing that occurred to me during the discussion is that we don't
necessarily have to physically write one tuple at a time to the heap.
Alternatively, we could use information about the existing unique
constraints and write only the needed tuples.

> However, extension developers, as an example, often don't know the
> underlying unique constraints (more specifically, it's difficult to
> look for them and process them manually) and often have to process any
> garbage the application developer passes to an extension.
>
> This of course is applicable not only to extensions, but to any
> middleware between the DBMS and the application.

This however is arguably a niche use case. So maybe we don't want to
spend time on this.

-- 
Best regards,
Aleksander Alekseev




Re: [PATCH] Compression dictionaries for JSONB

2023-02-09 Thread Aleksander Alekseev
Hi Andres,

> > So to clarify, are we talking about tuple-level compression? Or
> > perhaps page-level compression?
>
> Tuple level.

> although my own patch proposed attribute-level compression, not
> tuple-level one, it is arguably closer to tuple-level approach than
> page-level one

Just wanted to make sure that by tuple-level we mean the same thing.

When saying tuple-level do you mean that the entire tuple should be
compressed as one large binary (i.e. similarly to page-level
compression but more granularly), or every single attribute should be
compressed separately (similarly to how TOAST does this)?

-- 
Best regards,
Aleksander Alekseev




Re: Support logical replication of DDLs

2023-02-09 Thread Alvaro Herrera
I happened to notice that MINVFUNC in 0003 displays like this
  "fmt": "MINVFUNC==%{type}T",
in some cases; this appears in the JSON that's emitted by the regression
tests at some point.  How can we detect this kind of thing, so that
these mistakes become self-evident?  I thought the intention of the
regress module was to run the deparsed code, so the syntax error should
have become obvious.

...

Oh, I see the problem.  There are two 'fmt' lines for that clause (and
many others), one of which is used when the clause is not present.  So
there's never a syntax error, because this one never expands other than
to empty.

AFAICS this defeats the purpose of the 'present' field.  I mean, if the
clause is never to deparse, then why have it there in the first place?
If we want to have it, then it has to be correct.


I think we should design the code to avoid the repetition, because that
has an inherent risk of typo bugs and such.  Maybe we should set forth
policy that each 'fmt' string should appear in the source code only
once.  So instead of this

+   /* MINVFUNC */
+   if (OidIsValid(agg->aggminvtransfn))
+       tmp = new_objtree_VA("MINVFUNC=%{type}T", 1,
+                            "type", ObjTypeObject,
+                            new_objtree_for_qualname_id(ProcedureRelationId,
+                                                        agg->aggminvtransfn));
+   else
+   {
+       tmp = new_objtree("MINVFUNC==%{type}T");
+       append_bool_object(tmp, "present", false);
+   }

we would have something like

   tmp = new_objtree("MINVFUNC=%{type}T");
   if (OidIsValid(agg->aggminvtransfn))
   {
       append_bool_object(tmp, "present", true);
       append...(tmp, "type",
                 new_objtree_for_qualname_id(ProcedureRelationId,
                                             agg->aggminvtransfn));
   }
   else
   {
       append_bool_object(tmp, "present", false);
   }


-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"No necesitamos banderas
 No reconocemos fronteras"  (Jorge González)




Re: [PATCH] Compression dictionaries for JSONB

2023-02-09 Thread Andres Freund
Hi, 

On February 9, 2023 2:50:57 AM PST, Aleksander Alekseev 
 wrote:
>Hi Andres,
>
>> > So to clarify, are we talking about tuple-level compression? Or
>> > perhaps page-level compression?
>>
>> Tuple level.
>
>> although my own patch proposed attribute-level compression, not
>> tuple-level one, it is arguably closer to tuple-level approach than
>> page-level one
>
>Just wanted to make sure that by tuple-level we mean the same thing.
>
>When saying tuple-level do you mean that the entire tuple should be
>compressed as one large binary (i.e. similarly to page-level
>compression but more granularly), or every single attribute should be
>compressed separately (similarly to how TOAST does this)?

Good point - I should have been clearer. I meant attribute-wise compression.
Like we do today, except that we would use a dictionary to increase
compression rates.
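
To make "attribute-wise with a dictionary" concrete, a rough standalone
sketch using zstd's dictionary API (illustrative only; nothing of this
shape exists in the patch or in PostgreSQL today):

    #include <zstd.h>

    /*
     * Compress a single attribute value against a pre-built, shared
     * dictionary -- this is what lets many small, similar values (e.g.
     * JSONB documents with recurring keys) compress well.
     */
    static size_t
    compress_attribute(ZSTD_CCtx *cctx,
                       void *dst, size_t dst_capacity,
                       const void *attr, size_t attr_len,
                       const void *dict, size_t dict_len)
    {
        return ZSTD_compress_usingDict(cctx, dst, dst_capacity,
                                       attr, attr_len,
                                       dict, dict_len,
                                       3 /* compression level */);
    }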

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




Re: [PoC] Improve dead tuple storage for lazy vacuum

2023-02-09 Thread John Naylor
On Thu, Feb 9, 2023 at 2:08 PM Masahiko Sawada 
wrote:
> I think it's still important for lazy vacuum that an iteration over a
> TID store returns TIDs in ascending order, because otherwise a heap
> vacuum does random writes. That being said, we can have
> RT_ITERATE_NEXT() return key-value pairs in an order regardless of how
> the key chunks are stored in a node.

Okay, we can keep that possibility in mind if we need to go there.

> > Note: The regression seems to have started in v17, which is the first
> > with a full template.

> > 0007 removes some dead code. RT_NODE_INSERT_INNER is only called during
> > RT_SET_EXTEND, so it can't possibly find an existing key. This kind of
> > change is much easier with the inner/node cases handled together in a
> > template, as far as being sure of how those cases are different. I thought
> > about trying the search in assert builds and verifying it doesn't exist,
> > but thought yet another #ifdef would be too messy.

It just occurred to me that these facts might be related. v17 was the first
use of the full template, and I decided then I liked one of your earlier
patches where replace_node() calls node_update_inner() better than calling
node_insert_inner() with a NULL parent, which was a bit hard to understand.
That now-dead code was actually used in the latter case for updating the
(original) parent. It's possible that trying to use separate paths
contributed to the regression. I'll try the other way and report back.

> I've attached the patch that adds a simple benchmark test using
> TidStore. With this test, I got similar trends of results to yours
> with gcc, but I've not analyzed them in depth yet.

Thanks for that! I'll take a look.

> BTW it would be better to remove the RT_DEBUG macro from
> bench_radix_tree.c.

Absolutely.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: pg_stat_statements and "IN" conditions

2023-02-09 Thread Peter Eisentraut

On 07.02.23 21:14, Sergei Kornilov wrote:

It seems a little strange to me that with const_merge_threshold = 1, such a 
test case gives the same result as with const_merge_threshold = 2


What is the point of making this a numeric setting?  Either you want to 
merge all values or you don't want to merge any values.





Re: Introduce a new view for checkpointer related stats

2023-02-09 Thread Bharath Rupireddy
On Thu, Feb 9, 2023 at 12:33 PM Andres Freund  wrote:
>
> Hi,

Thanks for looking at this.

> On 2023-02-09 12:21:51 +0530, Bharath Rupireddy wrote:
> > @@ -1105,18 +1105,22 @@ CREATE VIEW pg_stat_archiver AS
> >
> >  CREATE VIEW pg_stat_bgwriter AS
> >  SELECT
> > -pg_stat_get_bgwriter_timed_checkpoints() AS checkpoints_timed,
> > -pg_stat_get_bgwriter_requested_checkpoints() AS checkpoints_req,
> > -pg_stat_get_checkpoint_write_time() AS checkpoint_write_time,
> > -pg_stat_get_checkpoint_sync_time() AS checkpoint_sync_time,
> > -pg_stat_get_bgwriter_buf_written_checkpoints() AS 
> > buffers_checkpoint,
> >  pg_stat_get_bgwriter_buf_written_clean() AS buffers_clean,
> >  pg_stat_get_bgwriter_maxwritten_clean() AS maxwritten_clean,
> > -pg_stat_get_buf_written_backend() AS buffers_backend,
> > -pg_stat_get_buf_fsync_backend() AS buffers_backend_fsync,
> >  pg_stat_get_buf_alloc() AS buffers_alloc,
> >  pg_stat_get_bgwriter_stat_reset_time() AS stats_reset;
> >
> > +CREATE VIEW pg_stat_checkpointer AS
> > +SELECT
> > +pg_stat_get_timed_checkpoints() AS timed_checkpoints,
> > +pg_stat_get_requested_checkpoints() AS requested_checkpoints,
> > +pg_stat_get_checkpoint_write_time() AS checkpoint_write_time,
> > +pg_stat_get_checkpoint_sync_time() AS checkpoint_sync_time,
> > +pg_stat_get_buf_written_checkpoints() AS 
> > buffers_written_checkpoints,
> > +pg_stat_get_buf_written_backend() AS buffers_written_backend,
> > +pg_stat_get_buf_fsync_backend() AS buffers_fsync_backend,
> > +pg_stat_get_checkpointer_stat_reset_time() AS stats_reset;
> > +
>
> I don't think the backend written stats belong more accurately in
> pg_stat_checkpointer than pg_stat_bgwriter.

We accumulate buffers_written_backend and buffers_fsync_backend of all
backends under the checkpointer stats to show the aggregated results to
the users. I think this is correct because the checkpointer is the one
that processes fsync requests (of course, backends themselves can fsync
when needed; that's what buffers_fsync_backend shows), whereas the
bgwriter doesn't, IIUC.

> I continue to be worried about breaking just about any postgres monitoring
> setup.

Hm. Yes, it requires minimal and straightforward changes in monitoring
scripts. Please note that we separated out bgwriter and checkpointer
in v9.2, over a decade ago, but we haven't had a chance to separate the stats
so far. We might do it at some point in time; IMHO, this is that time.

We did away with promote_trigger_file (cd4329d) very recently. The
agreement was that the changes required to move on to other mechanisms
of promotion are minimal, hence we didn't want it to be first
deprecated and then removed.

From the discussion upthread, it looks like Robert, Amit, Bertrand,
Greg and myself are in favour of not having a deprecated version but
moving them to the new pg_stat_checkpointer view.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Weird failure with latches in curculio on v15

2023-02-09 Thread Robert Haas
On Wed, Feb 8, 2023 at 7:24 PM Nathan Bossart  wrote:
> On Thu, Feb 09, 2023 at 08:56:24AM +0900, Michael Paquier wrote:
> > On Wed, Feb 08, 2023 at 02:25:54PM -0800, Nathan Bossart wrote:
> >> These are all good points.  Perhaps there could be a base archiver
> >> implementation that shell_archive uses (and that other modules could use if
> >> desired, which might be important for backward compatibility with the
> >> existing callbacks).  But if you want to do something fancier than
> >> archiving sequentially, you could write your own.
> >
> > Which is basically the kind of things you can already achieve with a
> > background worker and a module of your own?
>
> IMO one of the big pieces that's missing is a way to get the next N files
> to archive.  Right now, you'd have to trawl through archive_status on your
> own if you wanted to batch/parallelize.  I think one advantage of what
> Robert is suggesting is that we could easily provide a supported way to get
> the next set of files to archive, and we can asynchronously mark them
> "done".  Otherwise, each module has to implement this.

Right.

I think that we could certainly, as Michael suggests, have people
provide their own background worker rather than having the archiver
invoke the user-supplied code directly. As long as the functions that
you need in order to get the necessary information can be called from
some other process, that's fine. The only difficulty I see is that if
the archiving is happening from a separate background worker rather
than from the archiver, then what is the archiver doing? We could
somehow arrange to not run the archiver process at all, or I guess to
just sit there and have it do nothing. Or, we can decide not to have a
separate background worker and just have the archiver call the
user-supplied code directly. I kind of like that approach at the
moment; it seems more elegant to me.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pgsql: Use appropriate wait event when sending data in the apply worker

2023-02-09 Thread Robert Haas
On Mon, Feb 6, 2023 at 11:40 PM Amit Kapila  wrote:
> Use appropriate wait event when sending data in the apply worker.
>
> Currently, we reuse WAIT_EVENT_LOGICAL_PARALLEL_APPLY_STATE_CHANGE in the
> apply worker while sending data to the parallel apply worker via a shared
> memory queue. This is not appropriate as one won't be able to distinguish
> whether the worker is waiting for sending data or for the state change.
>
> The patch instead uses the wait event WAIT_EVENT_MQ_SEND, which has been
> already used in blocking mode while sending data via a shared memory
> queue.

This is not right at all. You should invent a new wait state if you're
waiting in a new place.
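
Something like this, for example (a sketch only; the wait event name and
the retry interval are made up):

    shm_mq_result result;

    /* Retry loop around a nowait send, with its own wait event. */
    for (;;)
    {
        result = shm_mq_send(mqh, nbytes, data, true /* nowait */, true);
        if (result != SHM_MQ_WOULD_BLOCK)
            break;

        (void) WaitLatch(MyLatch,
                         WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
                         SHM_SEND_RETRY_INTERVAL_MS,    /* made-up constant */
                         WAIT_EVENT_LOGICAL_APPLY_SEND_DATA);   /* new event */
        ResetLatch(MyLatch);
        CHECK_FOR_INTERRUPTS();
    }

That way pg_stat_activity can tell a blocked send apart from a wait for
a state change.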

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: AW: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-02-09 Thread Tomas Vondra



On 2/9/23 10:03, Hans Buschmann wrote:
> Hello Tomas,
> 
> 
> Thank you for looking at this.
> 
> 
> First, I miscalculated the factor which should be about 50, not 500. Sorry.
> 
> Then I want to show you the table definitions (simple, very similar,
> omitted child_tables and additional indexes, here always using "ONLY"):
> 
> cpsdb_matcol=# \d sa_upper;
>                                     Tabelle »public.sa_upper«
>     Spalte    |          Typ          | Sortierfolge | NULL erlaubt? |            Vorgabewert
> --------------+-----------------------+--------------+---------------+----------------------------------
>  id_sup       | integer               |              | not null      | generated by default as identity
>  sup_season   | smallint              |              |               |
>  sup_sa_code  | character varying(10) | C            |               |
>  sup_mat_code | character varying(4)  | C            |               |
>  sup_clr_code | character varying(3)  | C            |               |
> Indexe:
>     "sa_upper_active_pkey" PRIMARY KEY, btree (id_sup)
>  
> 
> cpsdb_matcol=# \d sa_lining+;
>                                     Tabelle »public.sa_lining«
>     Spalte    |          Typ          | Sortierfolge | NULL erlaubt? |            Vorgabewert
> --------------+-----------------------+--------------+---------------+----------------------------------
>  id_sli       | integer               |              | not null      | generated by default as identity
>  sli_season   | smallint              |              |               |
>  sli_sa_code  | character varying(10) | C            |               |
>  sli_mat_code | character varying(4)  | C            |               |
>  sli_clr_code | character varying(3)  | C            |               |
> Indexe:
>     "sa_lining_active_pkey" PRIMARY KEY, btree (id_sli)
>  
> 
> cpsdb_matcol=# \d sa_insole;
>                                     Tabelle »public.sa_insole«
>     Spalte    |          Typ          | Sortierfolge | NULL erlaubt? |            Vorgabewert
> --------------+-----------------------+--------------+---------------+----------------------------------
>  id_sin       | integer               |              | not null      | generated by default as identity
>  sin_season   | smallint              |              |               |
>  sin_sa_code  | character varying(10) | C            |               |
>  sin_mat_code | character varying(4)  | C            |               |
>  sin_clr_code | character varying(3)  | C            |               |
> Indexe:
>     "sa_insole_active_pkey" PRIMARY KEY, btree (id_sin)
>  
> 
> cpsdb_matcol=# \d sa_outsole;
>                                     Tabelle »public.sa_outsole«
>     Spalte    |          Typ          | Sortierfolge | NULL erlaubt? |            Vorgabewert
> --------------+-----------------------+--------------+---------------+----------------------------------
>  id_sou       | integer               |              | not null      | generated by default as identity
>  sou_season   | smallint              |              |               |
>  sou_sa_code  | character varying(10) | C            |               |
>  sou_mat_code | character varying(4)  | C            |               |
>  sou_clr_code | character varying(3)  | C            |               |
> Indexe:
>     "sa_outsole_active_pkey" PRIMARY KEY, btree (id_sou)
>  
> The xxx_target tables are very similar; here the upper one as an example:
> They are count_aggregates of the whole dataset, where
> up_mat_code=sup_mat_code etc.
> 
> cpsdb_matcol=# \d upper_target
>                     Tabelle »admin.upper_target«
>    Spalte    |   Typ    | Sortierfolge | NULL erlaubt? | Vorgabewert
> -------------+----------+--------------+---------------+-------------
>  id_up       | smallint |              |               |
>  nup         | integer  |              |               |
>  up_mat_code | text     | C            |               |
> 
> 
> 
> I have reworked the two queries to show their complete explain plans:
> 
> 1. query with left join in the qupd CTE:
> 
> \set only 'ONLY'
> 
> cpsdb_matcol=# explain analyze -- explain analyze verbose -- explain --
> select * from ( -- select count(*) from ( -- select length(sel) from (
> cpsdb_matcol-# with
> cpsdb_matcol-# qup as (
> cpsdb_matcol(# select
> cpsdb_matcol(#  curr_season -- all xxx_season are always smallint
> cpsdb_matcol(# ,curr_code -- all xxx_code are always varchar(10)
> cpsdb_matcol(# ,array_agg(id_up order by id_up)||array_fill(0::smallint,array[10]) as mat_arr
> cpsdb_matcol(# ,array_agg(curr_mat_code order by id_up) as matcode_arr
> cpsdb_matcol(# ,bit_or(imask) as ibitmask
> cpsdb_matcol(# from(
> cpsdb_matcol(# select
> cpsdb_matcol(#  sup_season as curr_season
> cpsdb_matcol(# ,sup_sa_code as curr_code
> cpsdb_matcol(# ,sup_mat_code as curr_mat_code
> cpsdb_matcol(# ,sup_clr_code as curr_clr_code
> cpsdb_matcol(# ,id_up
> cpsdb_matcol(# ,

Re: MacOS: xsltproc fails with "warning: failed to load external entity"

2023-02-09 Thread Peter Eisentraut

On 08.02.23 23:18, Tom Lane wrote:

I pushed the discussed documentation improvements, and changed the
behavior of "ninja docs" to only build the HTML docs.


I don't like this change.  Now the default set of docs is different 
between the make builds and the meson builds.  And people will be less 
likely to make sure the man pages still build.


What's wrong with just typing "ninja html"?





RE: [Proposal] Add foreign-server health checks infrastructure

2023-02-09 Thread Hayato Kuroda (Fujitsu)
Dear Horiguchi-san,

Thank you for checking! The patch will be attached to another mail.

> At Fri, 27 Jan 2023 06:57:01 +, "Hayato Kuroda (Fujitsu)"
>  wrote in
> > I found cfbot failure, PSA fixed version.
> 
> +   Unlike , this function checks socket
> +   health. This check is performed by polling the socket. This function 
> is
> +   currently available only on systems that support the non-standard
> +   POLLRDHUP extension to the
> poll system
> 
> I find it quite confusing that we have pqSocketCheck and PQconnCheck,
> that does almost the same thing.. Since pqSocketCheck is a static
> function, we can modify the function as we like.

Renamed to pqSocketIsReadableOrWritableOrValid(), but the name seems quite bad...

> I still don't understand why we need pqconnCheck_internal separate
> from pqSocketPoll(), and PQconnCheck from pqSocketCheck.

pqconnCheck_internal() was combined into pqSocketPoll(), but PQconnCheck() still
exists. libpq-fe.h does not include standard header files other than stdio.h, and
I'm not sure whether we can add an inclusion of time.h, because some platforms may
not have that header and adding it could break compatibility. If there is no such
system, we may be able to export pqSocketCheck() and remove PQconnCheck().

The side effect of this change is that the code becomes messy when we add
kqueue() support...

> https://www.postgresql.org/message-id/flat/TYAPR01MB58665BF23D38EDF1
> 0028DE2AF5299%40TYAPR01MB5866.jpnprd01.prod.outlook.com#47d21431bf9
> fa94f763c824f6e81fa54
> > IIUC, pqSocketCheck () calls pqSocketPoll(),
> > and in the pqSocketPoll() we poll()'d the POLLIN or POLLOUT event.
> > But according to [1], we must wait POLLRDHUP event,
> > so we cannot reuse it straightforward.
> 
> Yeah, I didn't suggest to use the function as-is. Couldn't we extend
> the fucntion by letting it accept end_time = 0 && !forRead &&
> !forWrite, not causing side effects?

Modified accordingly. Is it what you expected?
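
For reference, the core of the zero-timeout check is quite small; a
minimal standalone sketch (assuming Linux's non-standard POLLRDHUP, not
the patch itself):

    #define _GNU_SOURCE             /* for POLLRDHUP */
    #include <poll.h>

    /* Returns 1 if the peer closed the connection, 0 if alive, -1 on error. */
    static int
    peer_closed(int sock)
    {
        struct pollfd pfd = {0};

        pfd.fd = sock;
        pfd.events = POLLRDHUP;

        if (poll(&pfd, 1, 0) < 0)   /* zero timeout, so it never blocks */
            return -1;
        return (pfd.revents & (POLLRDHUP | POLLHUP | POLLERR | POLLNVAL)) ? 1 : 0;
    }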

Best Regards,
Hayato Kuroda
FUJITSU LIMITED





RE: [Proposal] Add foreign-server health checks infrastructure

2023-02-09 Thread Hayato Kuroda (Fujitsu)
Dear Katsuragi-san,

Thank you for reviewing! PSA new version patches.

> 0001:
> +   while (result < 0 && errno == EINTR);
> +
> +   if (result < 0)
> +   return -1;
> 
> this `return -1` is not indented properly.

This part is no longer needed. Please see another discussion[1].

> 0002:
> +postgres_fdw_verify_connection_states(server_name
> text) returns boolean
> ...
> +  extension to the poll system call, including
> Linux. This
> +  returns true if checked connections are still
> valid,
> +  or the checking is not supported on this platform.
> false
> +  is returned if the local session seems to be disconnected from
> other
> +  servers. Example usage of the function:
> 
> Here, 'still valid' seems a little bit confusing because this 'valid' is
> not
> the same as postgres_fdw_get_connections's 'valid' [1].
> 
> Should 'still valid' be 'existing connection is not closed by the remote
> peer'?
> But this description does not cover all the cases where this function
> returns true...
> I think one choice is to write all the cases like 'returns true if any
> of the
> following condition is satisfied. 1) existing connection is not closed
> by the
> remote peer 2) there is no connection for specified server yet 3) the
> checking
> is not supported...'. If my understanding is not correct, please point
> it out.

Modified like you pointed out.

> BTW, is it reasonable to return true if ConnectionHash is not
> initialized or
> there is no ConnCacheEntry for specified remote server? What do you
> think
> about returning NULL in that case?

I'm not sure which one is better, but modified accordingly.

> 0003:
> I think it is better that the test covers all the new functions.
> How about adding a test for postgres_fdw_verify_connection_states_all?
> 
> 
> +--
> ===
> 
> +-- test for postgres_fdw_verify_foreign_servers function
> +--
> ===
> 
> 
> Writing all the functions' name like 'test for
> postgres_fdw_verify_connection_states
> and postgres_fdw_can_verify_connection_states' looks straightforward.
> What do you think about this?

Added.

> 0004:
> Sorry, I have not read 0004. I will read.

No problem:-).

[1]: 
https://www.postgresql.org/message-id/TYAPR01MB58664E039F45959AB321FA1FF5D99%40TYAPR01MB5866.jpnprd01.prod.outlook.com

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



v31-0001-Add-PQconnCheck-and-PQcanConnCheck-to-libpq.patch
Description:  v31-0001-Add-PQconnCheck-and-PQcanConnCheck-to-libpq.patch


v31-0002-postgres_fdw-add-postgres_fdw_verify_connection_.patch
Description:  v31-0002-postgres_fdw-add-postgres_fdw_verify_connection_.patch


v31-0003-add-test.patch
Description: v31-0003-add-test.patch


v31-0004-add-kqueue-support-for-PQconnCheck-and-PQcanConn.patch
Description:  v31-0004-add-kqueue-support-for-PQconnCheck-and-PQcanConn.patch


Re: ICU locale validation / canonicalization

2023-02-09 Thread Peter Eisentraut

On 08.02.23 08:59, Jeff Davis wrote:

The overall benefit here is that we keep our catalogs consistently
using an independent standard format for ICU locale strings, rather
than whatever the user specifies. That makes it less likely that ICU
needs to use any fallback logic when trying to open a collator, which
could only be bad news.


One use case is that if a user specifies a locale, say, of 'de-AT', this 
might canonicalize to 'de' today, but we should still store what the 
user specified because 1) that documents what the user wanted, and 2) it 
might not canonicalize to the same thing tomorrow.






Re: MacOS: xsltproc fails with "warning: failed to load external entity"

2023-02-09 Thread Tom Lane
Peter Eisentraut  writes:
> On 08.02.23 23:18, Tom Lane wrote:
>> I pushed the discussed documentation improvements, and changed the
>> behavior of "ninja docs" to only build the HTML docs.

> I don't like this change.  Now the default set of docs is different 
> between the make builds and the meson builds.  And people will be less 
> likely to make sure the man pages still build.

What?  The default behavior of "make" has been to build only the
html docs for many years.  And I've never ever seen a case where
the html docs build and the man pages don't.

> What's wrong with just typing "ninja html"?

Don't really care how the command is spelled, but there needs to
be a convenient way to get that behavior.

regards, tom lane




Re: pg_stat_statements and "IN" conditions

2023-02-09 Thread Dmitry Dolgov
> On Tue, Feb 07, 2023 at 11:14:52PM +0300, Sergei Kornilov wrote:
> Hello!

Thanks for reviewing.

> Unfortunately, rebase is needed again due to recent changes in 
> queryjumblefuncs ( 9ba37b2cb6a174b37fc51d0649ef73e56eae27fc )

Yep, my favourite game, rebaseball. Will post a new version soon, after
figuring out all the recent questions.

> It seems a little strange to me that with const_merge_threshold = 1, such a 
> test case gives the same result as with const_merge_threshold = 2
>
> select pg_stat_statements_reset();
> set const_merge_threshold to 1;
> select * from test where i in (1,2,3);
> select * from test where i in (1,2);
> select * from test where i in (1);
> select query, calls from pg_stat_statements order by query;
>
> query| calls
> -+---
>  select * from test where i in (...) | 2
>  select * from test where i in ($1)  | 1
>
> Probably const_merge_threshold = 1 should produce only "i in (...)"?

Well, it's not intentional, probably I need to be more careful with
off-by-one. Although I agree to a certain extent with Peter questioning
the value of having a numerical option here; let me think about this.

> const_merge_threshold is "the minimal length of an array" (more or equal) or 
> "array .. length is larger than" (not equals)? I think the documentation is 
> ambiguous in this regard.
>
> I also noticed a typo in guc_tables.c: "Sets the minimal numer of constants 
> in an array" -> number

Yep, I'll rephrase the documentation.




Re: pg_stat_statements and "IN" conditions

2023-02-09 Thread Dmitry Dolgov
> On Thu, Feb 09, 2023 at 02:30:34PM +0100, Peter Eisentraut wrote:
> On 07.02.23 21:14, Sergei Kornilov wrote:
> > It seems a little strange to me that with const_merge_threshold = 1, such a 
> > test case gives the same result as with const_merge_threshold = 2
>
> What is the point of making this a numeric setting?  Either you want to
> merge all values or you don't want to merge any values.

At least in theory the definition of "too many constants" is different
for different use cases and I see allowing to configure it as a way of
reducing the level of surprise here. The main scenario for a numerical
setting would be to distinguish between normal usage with just a handful
of constants (and the user expecting to see them represented in pgss)
and some sort of outliers with thousands of constants in a query (e.g.
as a defence mechanism for the infrastructure working with those
metrics). But I agree that it's not clear how much value is in that.

Not having a strong opinion about this, I would be fine changing it to a
boolean option (with an actual limit hidden internally) if everyone
agrees it fits better.




Re: meson: Optionally disable installation of test modules

2023-02-09 Thread Nazir Bilal Yavuz

Hi,


On 2/8/23 13:30, Peter Eisentraut wrote:


If you feel that your patch is ready, please add it to the commit 
fest. I look forward to reviewing it.



Thanks! Commit fest entry link: https://commitfest.postgresql.org/42/4173/


Regards,
Nazir Bilal Yavuz
Microsoft





[Question] Revamping btree gist implementation of inet

2023-02-09 Thread Ankit Kumar Pandey

Hi,


I was looking at the bug mentioned at
https://www.postgresql.org/message-id/flat/201010112055.o9BKtZf7011251%40wwwmaster.postgresql.org


The issue appears to be in gbt_inet_compress, which doesn't store inet
details like ip_family and the netmask in inetKEY, and in
gbt_inet_consistent, which doesn't have enough info (because
gbt_inet_compress didn't store it) and falls back to the lossy
convert_network_to_scalar for its comparisons.

Looking at the reference implementation for inet in
src/backend/utils/adt/network_gist.c, if we add the missing inet data
(as seen in GistInetKey) into inetKEY and refactor gbt_inet_consistent
to use network functions instead of convert_network_to_scalar, the
mentioned bugs might be alleviated.
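
Concretely, I imagine extending the compressed key along these lines (a
sketch only; the added fields are borrowed from network_gist.c's
GistInetKey, nothing is implemented yet):

    /* current btree_gist key: two scalars from convert_network_to_scalar */
    typedef struct inetkey
    {
        double      lower;
        double      upper;
        /* proposed additions, mirroring GistInetKey: */
        unsigned char family;       /* PGSQL_AF_INET or PGSQL_AF_INET6 */
        unsigned char minbits;      /* smallest netmask length in subtree */
        unsigned char commonbits;   /* number of common address prefix bits */
        unsigned char ipaddr[16];   /* up to 128 bits of common address */
    } inetKEY;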


I just want to know if this is the right direction to approach this problem.


Thanks,

Ankit





Re: Weird failure with latches in curculio on v15

2023-02-09 Thread Tom Lane
Robert Haas  writes:
> I think that we could certainly, as Michael suggests, have people
> provide their own background worker rather than having the archiver
> invoke the user-supplied code directly. As long as the functions that
> you need in order to get the necessary information can be called from
> some other process, that's fine. The only difficulty I see is that if
> the archiving is happening from a separate background worker rather
> than from the archiver, then what is the archiver doing? We could
> somehow arrange to not run the archiver process at all, or I guess to
> just sit there and have it do nothing. Or, we can decide not to have a
> separate background worker and just have the archiver call the
> user-supplied code directly. I kind of like that approach at the
> moment; it seems more elegant to me.

I'm fairly concerned about the idea of making it common for people
to write their own main loop for the archiver.  That means that, if
we have a bug fix that requires the archiver to do X, we will not
just be patching our own code but trying to get an indeterminate
set of third parties to add the fix to their code.

If we think we need primitives to let the archiver hooks get all
the pending files, or whatever, by all means add those.  But don't
cede fundamental control of the archiver.  The hooks need to be
decoration on a framework we provide, not the framework themselves.

regards, tom lane




Re: AW: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2023-02-09 Thread Tomas Vondra
> 
> FWIW I suggest you provide the data in a form that's easier to use (like
> a working SQL script). More people are likely to look and help than when
> they have to extract stuff from an e-mail, fill in missing pieces etc.
> 

BTW if anyone wants to play with this, here are the SQL scripts I used
to create the tables and the queries. There's no data, but it's enough
to see how the plans change.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

query-complete.sql
Description: application/sql


query-smaller.sql
Description: application/sql


create-join.sql
Description: application/sql


Re: ICU locale validation / canonicalization

2023-02-09 Thread Robert Haas
On Wed, Feb 8, 2023 at 2:59 AM Jeff Davis  wrote:
> We do check that the value is accepted by ICU, but ICU seems to accept
> anything and use some fallback logic. Bogus strings will typically end
> up as the "root" locale (spelled "root" or "").

I've noticed this, and I think it's really frustrating. There's barely
any documentation of what strings you're allowed to specify, and the
documentation that does exist is extremely difficult to understand.
Normally, you could work around that problem to some degree by making
a guess at what you're supposed to be doing and then seeing whether
the program accepts it, but here that doesn't work either. It just
accepts anything you give it and then you have to try to figure out
whether the behavior is what you wanted. But there's also no real
documentation of what the behavior of any collation is, so you're
apparently just supposed to magically know what collations exist and
how they behave and then you can test whether the string you put in
gave you the behavior you wanted.

Adding validation and canonicalization wouldn't cure the documentation
problems, but it would be a big help. You still wouldn't know what
string you were supposed to be passing to ICU, but if you did pass it
a string, you'd find out what it thought that string meant. I think
that would be a huge step forward.

Unfortunately, I have no idea whether your specific ideas about how to
make that happen are any good or not. But I hope they are, because the
current situation is pessimal.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Inconsistent nullingrels due to oversight in deconstruct_distribute_oj_quals

2023-02-09 Thread Tom Lane
Richard Guo  writes:
> It seems to me there is oversight here.  Actually in next level up this
> othersj would null all the relids in its syn_righthand, not only the
> relids in its min_righthand.

Good point.  I think this code originated before it was clear to me
that nullingrels would need to follow the syntactic structure.

> This query would trigger the Assert() in search_indexed_tlist_for_var.
> So I wonder that we should use othersj->syn_righthand here.

There are two such calls in deconstruct_distribute_oj_quals ...
don't they both need this change?

regards, tom lane




Re: Weird failure with latches in curculio on v15

2023-02-09 Thread Robert Haas
On Thu, Feb 9, 2023 at 10:51 AM Tom Lane  wrote:
> I'm fairly concerned about the idea of making it common for people
> to write their own main loop for the archiver.  That means that, if
> we have a bug fix that requires the archiver to do X, we will not
> just be patching our own code but trying to get an indeterminate
> set of third parties to add the fix to their code.

I don't know what kind of bug we could really have in the main loop
that would be common to every implementation. They're probably all
going to check for interrupts, do some work, and then wait for I/O on
some things by calling select() or some equivalent. But the work, and
the wait for the I/O, would be different for every implementation. I
would anticipate that the amount of common code would be nearly zero.

Imagine two archive modules, one of which archives files via HTTP and
the other of which archives them via SSH. They need to do a lot of the
same things, but the code is going to be totally different. When the
HTTP archiver module needs to open a new connection, it's going to
call some libcurl function. When the SSH archiver module needs to do
the same thing, it's going to call some libssh function. It seems
quite likely that the HTTP implementation would want to juggle
multiple connections in parallel, but the SSH implementation might not
want to do that, or its logic for determining how many connections to
open might be completely different based on the behavior of that
protocol vs. the other protocol. Once either implementation has sent
as much data it can over the connections it has open, it needs to wait
for those sockets to become write-ready or, possibly, read-ready.
There again, each one will be calling into a different library to do
that. It could be that in this particular case, but would be waiting
for a set of file descriptors, and we could provide some framework for
waiting on a set of file descriptors provided by the module. But you
could also have some other archiver implementation that is, say,
waiting for a process to terminate rather than for a file descriptor
to become ready for I/O.

> If we think we need primitives to let the archiver hooks get all
> the pending files, or whatever, by all means add those.  But don't
> cede fundamental control of the archiver.  The hooks need to be
> decoration on a framework we provide, not the framework themselves.

I don't quite see how you can make asynchronous and parallel archiving
work if the archiver process only calls into the archive module at
times that it chooses. That would mean that the module has to return
control to the archiver when it's in the middle of archiving one or
more files -- and then I don't see how it can get control back at the
appropriate time. Do you have a thought about that?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Inconsistent nullingrels due to oversight in deconstruct_distribute_oj_quals

2023-02-09 Thread Tom Lane
I wrote:
> Richard Guo  writes:
>> It seems to me there is oversight here.  Actually in next level up this
>> othersj would null all the relids in its syn_righthand, not only the
>> relids in its min_righthand.

> Good point.  I think this code originated before it was clear to me
> that nullingrels would need to follow the syntactic structure.

Although ... the entire point here is that we're trying to build quals
that don't match the original syntactic structure.  I'm worried that
distribute_qual_to_rels will do the wrong thing (put the qual at the
wrong level) if we add more nullingrel bits than we meant to.  This
might be less trivial than it appears.

regards, tom lane




Re: HOT chain validation in verify_heapam()

2023-02-09 Thread Himanshu Upadhyaya
On Wed, Feb 8, 2023 at 11:17 PM Robert Haas  wrote:

> On Sun, Feb 5, 2023 at 3:57 AM Himanshu Upadhyaya
>  wrote:
> > Thanks, yes it's working fine with Prepared Transaction.
> > Please find attached the v9 patch incorporating all the review comments.
>
> I don't know quite how we're still going around in circles about this,
> but this code makes no sense to me at all:
>
> /*
>  * Add data to the predecessor array even if the current or
>  * successor's LP is not valid. We will not process/validate
> these
>  * offset entries while looping over the predecessor array but
>  * having all entries in the predecessor array will help in
>  * identifying(and validating) the Root of a chain.
>  */
> if (!lp_valid[ctx.offnum] || !lp_valid[nextoffnum])
> {
> predecessor[nextoffnum] = ctx.offnum;
> continue;
> }
>
> If the current offset number is not for a valid line pointer, then it
> makes no sense to talk about the successor. An invalid redirected line
> pointer is one that points off the end of the line pointer array, or
> to before the beginning of the line pointer array, or to a line
> pointer that is unused. An invalid line pointer that is LP_USED is one
> which points to a location outside the page, or to a location inside
> the page. In none of these cases does it make any sense to talk about
> the next tuple. If the line pointer isn't valid, it's pointing to some
> invalid location where there cannot possibly be a tuple. In other
> words, if lp_valid[ctx.offnum] is false, then nextoffnum is a garbage
> value, and therefore referencing predecessor[nextoffnum] is useless
> and dangerous.
>
> If the next offset number is not for a valid line pointer, we could in
> theory still assign to the predecessor array, as you propose here. In
> that case, the tuple or line pointer at ctx.offnum is pointing to the
> line pointer at nextoffnum and that is all fine. But what is the
> point? The comment claims that the point is that it will help us
> identify and validate the root of the hot chain. But if the line
> pointer at nextoffnum is not valid, it can't be the root of a hot
> chain. When we're talking about the root of a HOT chain, we're
> speaking about a tuple. If lp_valid[nextoffnum] is false, there is no
> tuple. Instead of pointing to a tuple, that line pointer is pointing
> to garbage.
>
>
Initially, while implementing the logic to identify the root of the HOT chain,
I was getting crashes and regression failures; at that time I thought of adding
this check along with a few other changes that were required. But you are
right: it's unnecessary to add data to the predecessor array in this case.
I am removing this from the patch.

-- 
Regards,
Himanshu Upadhyaya
EnterpriseDB: http://www.enterprisedb.com
From 19c93cce0189150d6bfe68237eb3d5a414a18ad9 Mon Sep 17 00:00:00 2001
From: Himanshu Upadhyaya 
Date: Thu, 9 Feb 2023 22:00:25 +0530
Subject: [PATCH v10] Implement HOT chain validation in verify_heapam()

Himanshu Upadhyaya, reviewed by Robert Haas, Aleksander Alekseev, Andres Freund
---
 contrib/amcheck/verify_heapam.c   | 303 +-
 src/bin/pg_amcheck/t/004_verify_heapam.pl | 241 -
 2 files changed, 527 insertions(+), 17 deletions(-)

diff --git a/contrib/amcheck/verify_heapam.c b/contrib/amcheck/verify_heapam.c
index 4fcfd6df72..7fc984dd33 100644
--- a/contrib/amcheck/verify_heapam.c
+++ b/contrib/amcheck/verify_heapam.c
@@ -150,7 +150,7 @@ typedef struct HeapCheckContext
 } HeapCheckContext;
 
 /* Internal implementation */
-static void check_tuple(HeapCheckContext *ctx);
+static void check_tuple(HeapCheckContext *ctx, bool *lp_valid);
 static void check_toast_tuple(HeapTuple toasttup, HeapCheckContext *ctx,
 			  ToastedAttribute *ta, int32 *expected_chunk_seq,
 			  uint32 extsize);
@@ -160,7 +160,7 @@ static void check_toasted_attribute(HeapCheckContext *ctx,
 	ToastedAttribute *ta);
 
 static bool check_tuple_header(HeapCheckContext *ctx);
-static bool check_tuple_visibility(HeapCheckContext *ctx);
+static bool check_tuple_visibility(HeapCheckContext *ctx, bool *lp_valid);
 
 static void report_corruption(HeapCheckContext *ctx, char *msg);
 static void report_toast_corruption(HeapCheckContext *ctx,
@@ -399,9 +399,14 @@ verify_heapam(PG_FUNCTION_ARGS)
 	for (ctx.blkno = first_block; ctx.blkno <= last_block; ctx.blkno++)
 	{
 		OffsetNumber maxoff;
+		OffsetNumber predecessor[MaxOffsetNumber];
+		OffsetNumber successor[MaxOffsetNumber];
+		bool		lp_valid[MaxOffsetNumber];
 
 		CHECK_FOR_INTERRUPTS();
 
+		memset(predecessor, 0, sizeof(OffsetNumber) * MaxOffsetNumber);
+
 		/* Optionally skip over all-frozen or all-visible blocks */
 		if (skip_option != SKIP_PAGES_NONE)
 		{
@@ -433,6 +438,10 @@ verify_heapam(PG_FUNCTION_ARGS)
 		for (ctx.offnum = FirstOffsetNumber; ctx.offnum <= maxoff;
 			

Re: Weird failure with latches in curculio on v15

2023-02-09 Thread Nathan Bossart
On Thu, Feb 09, 2023 at 11:12:21AM -0500, Robert Haas wrote:
> On Thu, Feb 9, 2023 at 10:51 AM Tom Lane  wrote:
>> If we think we need primitives to let the archiver hooks get all
>> the pending files, or whatever, by all means add those.  But don't
>> cede fundamental control of the archiver.  The hooks need to be
>> decoration on a framework we provide, not the framework themselves.
> 
> I don't quite see how you can make asynchronous and parallel archiving
> work if the archiver process only calls into the archive module at
> times that it chooses. That would mean that the module has to return
> control to the archiver when it's in the middle of archiving one or
> more files -- and then I don't see how it can get control back at the
> appropriate time. Do you have a thought about that?

I've been thinking about this, actually.  I'm wondering if we could provide
a list of files to the archiving callback (configurable via a variable in
ArchiveModuleState), and then have the callback return a list of files that
are archived.  (Or maybe we just put the list of files that need archiving
in ArchiveModuleState.)  The returned list could include files that were
sent to the callback previously.  The archive module would be responsible
for creating background worker(s) (if desired), dispatching files
to-be-archived to its background worker(s), and gathering the list of
archived files to return.

This is admittedly half-formed, but I'm tempted to hack something together
quickly to see whether it might be viable.
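
Roughly, I am picturing a callback shape like this (all names are
hypothetical, just to illustrate the idea):

    /* The set of files handed to, or returned by, the archive callback. */
    typedef struct ArchiveFileList
    {
        int         nfiles;
        const char *files[FLEXIBLE_ARRAY_MEMBER];   /* WAL file names */
    } ArchiveFileList;

    /*
     * Called with the files that currently need archiving; returns the
     * subset (possibly including files passed in earlier calls) that are
     * now durably archived, so the archiver can mark them "done".
     */
    typedef ArchiveFileList *(*ArchiveBatchCB) (ArchiveModuleState *state,
                                                const ArchiveFileList *pending);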

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: recovery modules

2023-02-09 Thread Nathan Bossart
On Wed, Feb 08, 2023 at 09:16:19PM -0800, Andres Freund wrote:
> Pushed. Thanks!

Thanks!

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: pg_stat_statements and "IN" conditions

2023-02-09 Thread Alvaro Herrera
On 2023-Feb-09, Dmitry Dolgov wrote:

> > On Thu, Feb 09, 2023 at 02:30:34PM +0100, Peter Eisentraut wrote:

> > What is the point of making this a numeric setting?  Either you want
> > to merge all values or you don't want to merge any values.
> 
> At least in theory the definition of "too many constants" is different
> for different use cases and I see allowing to configure it as a way of
> reducing the level of surprise here.

I was thinking about this a few days ago and I agree that we don't
necessarily want to make it just a boolean thing; we may want to make it
more complex.  One trivial idea is to make it group entries in powers of
10: for 0-9 elements, you get one entry, and 10-99 you get a different
one, and so on:

# group everything in a single bucket
const_merge_threshold = true / yes / on 

# group 0-9, 10-99, 100-999, 1000-
const_merge_threshold = powers

Ideally the value would be represented somehow in the query text. For
example

 query| calls
--+---
 select * from test where i in ({... 0-9 entries ...})| 2
 select * from test where i in ({... 10-99 entries ...})  | 1

What do you think?  The jumble would have to know how to reduce all
values within each power-of-ten group to one specific value, but I don't
think that should be particularly difficult.
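
Something like this hypothetical helper in the jumbling code should be
enough (a sketch, untested):

    /* Map an IN-list length to its power-of-ten bucket: 1, 10, 100, ... */
    static int
    const_merge_bucket(int nconsts)
    {
        int         bucket = 1;

        while (nconsts >= bucket * 10)
            bucket *= 10;
        return bucket;
    }

All lists whose lengths land in the same bucket would then jumble to the
same entry.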


-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Find a bug in a program, and fix it, and the program will work today.
Show the program how to find and fix a bug, and the program
will work forever" (Oliver Silfridge)




Re: MacOS: xsltproc fails with "warning: failed to load external entity"

2023-02-09 Thread Andres Freund
Hi,

On 2023-02-09 09:57:42 -0500, Tom Lane wrote:
> Peter Eisentraut  writes:
> > On 08.02.23 23:18, Tom Lane wrote:
> >> I pushed the discussed documentation improvements, and changed the
> >> behavior of "ninja docs" to only build the HTML docs.
>
> > I don't like this change.  Now the default set of docs is different
> > between the make builds and the meson builds.  And people will be less
> > likely to make sure the man pages still build.
>
> What?  The default behavior of "make" has been to build only the
> html docs for many years.  And I've never ever seen a case where
> the html docs build and the man pages don't.

I think this misunderstanding is again due to the confusion between the 'all'
target in doc/src/sgml and the default target, just like earlier in the thread
/ why I ended up with the prior set of targets under 'docs'.

  # Make "html" the default target, since that is what most people tend
  # to want to use.
  html:
  ...
  all: html man


Given the repeated confusion from that, among fairly senior hackers, perhaps
we ought to at least put those lines next to each other? It's certainly not
obvious as-is.



> > What's wrong with just typing "ninja html"?
>
> Don't really care how the command is spelled, but there needs to
> be a convenient way to get that behavior.

Perhaps we should have doc-html, doc-man, doc-all or such?

The shell autocompletions for ninja work pretty well for me, a prefix like
that would make it easier to discover such "sub"-targets.


I was pondering adding a 'help' target that shows important targets.

Greetings,

Andres Freund




Re: run pgindent on a regular basis / scripted manner

2023-02-09 Thread Andrew Dunstan


On 2023-02-08 We 21:29, Shinoda, Noriyoshi (PN Japan FSIP) wrote:


Hi,
I tried the committed pgindent.
The attached small patch changes spaces in the usage message to tabs.
Options other than --commit start with a tab.



Thanks, pushed.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com


Re: MacOS: xsltproc fails with "warning: failed to load external entity"

2023-02-09 Thread Tom Lane
Andres Freund  writes:
> I think this misunderstanding is again due to the confusion between the 'all'
> target in doc/src/sgml and the default target, just like earlier in the thread
> / why I ended up with the prior set of targets under 'docs'.

>   # Make "html" the default target, since that is what most people tend
>   # to want to use.
>   html:
>   ...
>   all: html man

> Given the repeated confusion from that, among fairly senior hackers, perhaps
> we ought to at least put those lines next to each other? It's certainly not
> obvious as-is.

I think there are ordering constraints between these and the
Makefile.global inclusion.  But we could add a comment beside the "all:"
line pointing out that that's not the default target.

> Perhaps we should have doc-html, doc-man, doc-all or such?

No objection here.

If we intend to someday build tarballs with meson, there'd need to be
a target that builds html+man, but that could perhaps be named
"distprep".

regards, tom lane




Minor meson gripe

2023-02-09 Thread Peter Geoghegan
Currently, meson has a test suite named "setup". According to the
Wiki, this is needed to get something equivalent to "make check", by
running "meson test -v --suite setup --suite regress".

Some questions about this:

* Isn't it confusing that we have a suite by that name, given that we
also need to use the unrelated --setup flag for some nearby testing
recipes?

* Why do we actually need a "setup" suite?

Offhand it appears that a simple "meson test -v --suite regress" works
just as well. Have I missed something?

-- 
Peter Geoghegan




Re: Worth using personality(ADDR_NO_RANDOMIZE) for EXEC_BACKEND on linux?

2023-02-09 Thread Thomas Munro
On Wed, Feb 8, 2023 at 5:12 PM Michael Paquier  wrote:
> At the end, I have just done this stuff down to ~12, 11 does not seem
> worth the trouble as the next stable version to go out of support.
> I'll reduce gokiburi's script a bit, as a result, until the oldest
> version support is v12.

For the record, according to [1] it's not necessary to use
--reset-author when back-patching.  (Maybe a little confusingly,
because it's not quite clear whether our policies consider the author
field to be meaningful or not.)

[1] 
https://www.postgresql.org/message-id/flat/1347459696.16215.11.camel%40vanquo.pezone.net




Re: Improve logging when using Huge Pages

2023-02-09 Thread Nathan Bossart
On Thu, Feb 09, 2023 at 10:40:13AM +0100, Alvaro Herrera wrote:
> On 2023-Feb-08, Justin Pryzby wrote:
>> I don't think it makes sense to run postgres -C huge_pages_active,
>> however, so I see no issue that that would always returns "false".
> 
> Hmm, I would initialize it to return "unknown" rather than "off" — and
> make sure it turns "off" at the appropriate time.  Otherwise you're just
> moving the confusion elsewhere.

I think this approach would address my concerns about using a GUC.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Weird failure with latches in curculio on v15

2023-02-09 Thread Andres Freund
Hi,

On 2023-02-09 11:12:21 -0500, Robert Haas wrote:
> On Thu, Feb 9, 2023 at 10:51 AM Tom Lane  wrote:
> > I'm fairly concerned about the idea of making it common for people
> > to write their own main loop for the archiver.  That means that, if
> > we have a bug fix that requires the archiver to do X, we will not
> > just be patching our own code but trying to get an indeterminate
> > set of third parties to add the fix to their code.

I'm somewhat concerned about that too, but perhaps from a different
angle. First, I think we don't do our users a service by defaulting the
in-core implementation to something that doesn't scale to even a moderately
busy server. Second, I doubt we'll get the API for any of this right, without
an actual user that does something more complicated than restoring one-by-one
in a blocking manner.


> I don't know what kind of bug we could really have in the main loop
> that would be common to every implementation. They're probably all
> going to check for interrupts, do some work, and then wait for I/O on
> some things by calling select() or some equivalent. But the work, and
> the wait for the I/O, would be different for every implementation. I
> would anticipate that the amount of common code would be nearly zero.

I don't think it's that hard to imagine problems. To be reasonably fast, a
decent restore implementation will have to 'restore ahead'. Which also
provides ample opportunities for things to go wrong. E.g.

- WAL source is switched, restore module needs to react to that, but doesn't,
  we end up with lots of wasted work, or worse, filename conflicts
- recovery follows a timeline, restore module doesn't catch on quickly enough
- end of recovery happens, restore just continues on


> > If we think we need primitives to let the archiver hooks get all
> > the pending files, or whatever, by all means add those.  But don't
> > cede fundamental control of the archiver.  The hooks need to be
> > decoration on a framework we provide, not the framework themselves.
>
> I don't quite see how you can make asynchronous and parallel archiving
> work if the archiver process only calls into the archive module at
> times that it chooses. That would mean that the module has to return
> control to the archiver when it's in the middle of archiving one or
> more files -- and then I don't see how it can get control back at the
> appropriate time. Do you have a thought about that?

I don't think archiver is the hard part, that already has a dedicated
process, and it also has something of a queuing system already. The startup
process imo is the complicated one...

If we had a 'restorer' process, startup fed some sort of a queue with things
to restore in the near future, it might be more realistic to do something you
describe?
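
To make that concrete, the handoff could be as simple as a small
shared-memory ring (every name here is invented, just to illustrate the
shape):

    /* Hypothetical handoff between startup and a restorer process. */
    typedef struct RestoreRequest
    {
        XLogSegNo   segno;          /* segment startup expects soon */
        TimeLineID  tli;            /* timeline it is currently following */
    } RestoreRequest;

    typedef struct RestorerShmem
    {
        slock_t     mutex;
        int         head;           /* next slot startup fills */
        int         tail;           /* next slot the restorer drains */
        RestoreRequest queue[MAX_RESTORE_AHEAD];    /* made-up constant */
    } RestorerShmem;

Timeline switches and end of recovery would then just be startup
resetting or retiring the queue.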

Greetings,

Andres Freund




Re: recovery modules

2023-02-09 Thread Nathan Bossart
rebased for cfbot

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From ea6339276c6863f260572dfc816f9dd27ac7b516 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Fri, 27 Jan 2023 21:01:22 -0800
Subject: [PATCH v9 1/3] s/ArchiveContext/ArchiveCallbacks

---
 src/backend/postmaster/pgarch.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index e551af2905..065d7d1313 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -97,7 +97,7 @@ char	   *XLogArchiveLibrary = "";
  */
 static time_t last_sigterm_time = 0;
 static PgArchData *PgArch = NULL;
-static ArchiveModuleCallbacks ArchiveContext;
+static ArchiveModuleCallbacks ArchiveCallbacks;
 
 
 /*
@@ -406,8 +406,8 @@ pgarch_ArchiverCopyLoop(void)
 			HandlePgArchInterrupts();
 
 			/* can't do anything if not configured ... */
-			if (ArchiveContext.check_configured_cb != NULL &&
-!ArchiveContext.check_configured_cb())
+			if (ArchiveCallbacks.check_configured_cb != NULL &&
+!ArchiveCallbacks.check_configured_cb())
 			{
 ereport(WARNING,
 		(errmsg("archive_mode enabled, yet archiving is not configured")));
@@ -508,7 +508,7 @@ pgarch_archiveXlog(char *xlog)
 	snprintf(activitymsg, sizeof(activitymsg), "archiving %s", xlog);
 	set_ps_display(activitymsg);
 
-	ret = ArchiveContext.archive_file_cb(xlog, pathname);
+	ret = ArchiveCallbacks.archive_file_cb(xlog, pathname);
 	if (ret)
 		snprintf(activitymsg, sizeof(activitymsg), "last was %s", xlog);
 	else
@@ -814,7 +814,7 @@ HandlePgArchInterrupts(void)
 /*
  * LoadArchiveLibrary
  *
- * Loads the archiving callbacks into our local ArchiveContext.
+ * Loads the archiving callbacks into our local ArchiveCallbacks.
  */
 static void
 LoadArchiveLibrary(void)
@@ -827,7 +827,7 @@ LoadArchiveLibrary(void)
  errmsg("both archive_command and archive_library set"),
  errdetail("Only one of archive_command, archive_library may be set.")));
 
-	memset(&ArchiveContext, 0, sizeof(ArchiveModuleCallbacks));
+	memset(&ArchiveCallbacks, 0, sizeof(ArchiveModuleCallbacks));
 
 	/*
 	 * If shell archiving is enabled, use our special initialization function.
@@ -844,9 +844,9 @@ LoadArchiveLibrary(void)
 		ereport(ERROR,
 (errmsg("archive modules have to define the symbol %s", "_PG_archive_module_init")));
 
-	(*archive_init) (&ArchiveContext);
+	(*archive_init) (&ArchiveCallbacks);
 
-	if (ArchiveContext.archive_file_cb == NULL)
+	if (ArchiveCallbacks.archive_file_cb == NULL)
 		ereport(ERROR,
 (errmsg("archive modules must register an archive callback")));
 
@@ -859,6 +859,6 @@ LoadArchiveLibrary(void)
 static void
 pgarch_call_module_shutdown_cb(int code, Datum arg)
 {
-	if (ArchiveContext.shutdown_cb != NULL)
-		ArchiveContext.shutdown_cb();
+	if (ArchiveCallbacks.shutdown_cb != NULL)
+		ArchiveCallbacks.shutdown_cb();
 }
-- 
2.25.1

>From e4e28ed2aea395e5120389e92d6944c48468208e Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Fri, 27 Jan 2023 21:16:34 -0800
Subject: [PATCH v9 2/3] move archive module exports to dedicated headers

---
 contrib/basic_archive/basic_archive.c   |  2 +-
 src/backend/postmaster/pgarch.c |  2 ++
 src/backend/postmaster/shell_archive.c  |  3 +-
 src/backend/utils/misc/guc_tables.c |  1 +
 src/include/postmaster/archive_module.h | 47 +
 src/include/postmaster/pgarch.h | 39 
 src/include/postmaster/shell_archive.h  | 24 +
 7 files changed, 77 insertions(+), 41 deletions(-)
 create mode 100644 src/include/postmaster/archive_module.h
 create mode 100644 src/include/postmaster/shell_archive.h

diff --git a/contrib/basic_archive/basic_archive.c b/contrib/basic_archive/basic_archive.c
index 36b7a4814a..384336884d 100644
--- a/contrib/basic_archive/basic_archive.c
+++ b/contrib/basic_archive/basic_archive.c
@@ -32,7 +32,7 @@
 
 #include "common/int.h"
 #include "miscadmin.h"
-#include "postmaster/pgarch.h"
+#include "postmaster/archive_module.h"
 #include "storage/copydir.h"
 #include "storage/fd.h"
 #include "utils/guc.h"
diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index 065d7d1313..281d9fd8b7 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -34,8 +34,10 @@
 #include "lib/binaryheap.h"
 #include "libpq/pqsignal.h"
 #include "pgstat.h"
+#include "postmaster/archive_module.h"
 #include "postmaster/interrupt.h"
 #include "postmaster/pgarch.h"
+#include "postmaster/shell_archive.h"
 #include "storage/fd.h"
 #include "storage/ipc.h"
 #include "storage/latch.h"
diff --git a/src/backend/postmaster/shell_archive.c b/src/backend/postmaster/shell_archive.c
index 7771b951b7..0f0558e155 100644
--- a/src/backend/postmaster/shell_archive.c
+++ b/src/backend/postmaster/shell_archive.c
@@ -20,7 +20,8 @@
 #include "access/xlog.h"
 #include "common/percentrepl.h"
 #include "pgstat.h"
-#i

Re: pg_stat_statements and "IN" conditions

2023-02-09 Thread Dmitry Dolgov
> On Thu, Feb 09, 2023 at 06:26:51PM +0100, Alvaro Herrera wrote:
> On 2023-Feb-09, Dmitry Dolgov wrote:
>
> > > On Thu, Feb 09, 2023 at 02:30:34PM +0100, Peter Eisentraut wrote:
>
> > > What is the point of making this a numeric setting?  Either you want
> > > to merge all values or you don't want to merge any values.
> >
> > At least in theory the definition of "too many constants" is different
> > for different use cases and I see allowing to configure it as a way of
> > reducing the level of surprise here.
>
> I was thinking about this a few days ago and I agree that we don't
> necessarily want to make it just a boolean thing; we may want to make it
> more complex.  One trivial idea is to make it group entries in powers of
> 10: for 0-9 elements, you get one entry, and 10-99 you get a different
> one, and so on:
>
> # group everything in a single bucket
> const_merge_threshold = true / yes / on
>
> # group 0-9, 10-99, 100-999, 1000-
> const_merge_treshold = powers
>
> Ideally the value would be represented somehow in the query text. For
> example
>
>  query| calls
> --+---
>  select * from test where i in ({... 0-9 entries ...})| 2
>  select * from test where i in ({... 10-99 entries ...})  | 1
>
> What do you think?  The jumble would have to know how to reduce all
> values within each power-of-ten group to one specific value, but I don't
> think that should be particularly difficult.

Yeah, it sounds appealing and conveniently addresses the question of
losing the information about how many constants originally were there.
Not sure if the example above would be the most natural way to represent
it in the query text, but otherwise I'm going to try implementing this.
Stay tuned.




Re: recovery modules

2023-02-09 Thread Andres Freund
Hi,

On 2023-02-09 11:39:17 -0800, Nathan Bossart wrote:
> rebased for cfbot

I think this is nearly ready. Michael, are you planning to commit this?

Personally I'd probably squash these into a single commit.


> diff --git a/doc/src/sgml/archive-modules.sgml 
> b/doc/src/sgml/archive-modules.sgml
> index ef02051f7f..2db1b19216 100644
> --- a/doc/src/sgml/archive-modules.sgml
> +++ b/doc/src/sgml/archive-modules.sgml
> @@ -47,23 +47,30 @@
> normal library search path is used to locate the library.  To provide the
> required archive module callbacks and to indicate that the library is
> actually an archive module, it needs to provide a function named
> -   _PG_archive_module_init.  This function is passed a
> -   struct that needs to be filled with the callback function pointers for
> -   individual actions.
> +   _PG_archive_module_init.  This function must return 
> an
> +   ArchiveModuleCallbacks struct filled with the
> +   callback function pointers for individual actions.

I'd probably mention that this should typically be of server lifetime / a
'static const' struct. Tableam documents this as follows:

  The result of the function must be a pointer to a struct of type
  TableAmRoutine, which contains everything that the
  core code needs to know to make use of the table access method. The return
  value needs to be of server lifetime, which is typically achieved by
  defining it as a static const variable in global
  scope


> +
> +  
> +   
> +archive_library is only loaded in the archiver 
> process.
> +   
> +  
>   

That's not really related to any of the changes here, right?

I'm not sure it's a good idea to document that. We e.g. probably should allow
the library to check that the configuration is correct, at postmaster start,
rather than later, at runtime.


>   
> @@ -73,6 +80,20 @@ typedef void (*ArchiveModuleInit) (struct 
> ArchiveModuleCallbacks *cb);
> The server will call them as required to process each individual WAL file.
>
>  
> +  
> +   Startup Callback
> +   
> +The startup_cb callback is called shortly after the
> +module is loaded.  This callback can be used to perform any additional
> +initialization required.  If the archive module needs to have a state, it
> +can use state->private_data to store it.

s/needs to have a state/has state/?


> @@ -83,7 +104,7 @@ typedef void (*ArchiveModuleInit) (struct 
> ArchiveModuleCallbacks *cb);
>  assumes the module is configured.
>  
>  
> -typedef bool (*ArchiveCheckConfiguredCB) (void);
> +typedef bool (*ArchiveCheckConfiguredCB) (ArchiveModuleState *state);
>  
>  
>  If true is returned, the server will proceed with

Hm. I wonder if ArchiveCheckConfiguredCB() should actually work without the
state. We're not really doing anything yet, at that point, so it shouldn't
really need state?

The reason I'm wondering is that I think we should consider calling this from
the GUC assignment hook, at least in postmaster. Which would make it more
convenient to not have state, I think?



> @@ -128,7 +149,7 @@ typedef bool (*ArchiveFileCB) (const char *file, const 
> char *path);
>  these situations.
>  
>  
> -typedef void (*ArchiveShutdownCB) (void);
> +typedef void (*ArchiveShutdownCB) (ArchiveModuleState *state);
>  
> 
>

Perhaps mention that this needs to free state it allocated in the
ArchiveModuleState, or it'll be leaked?

Greetings,

Andres Freund




Re: Can we do something to help stop users mistakenly using force_parallel_mode?

2023-02-09 Thread David Rowley
On Thu, 9 Feb 2023 at 21:20, John Naylor  wrote:
> Looks good at a glance, just found a spurious word:
>
> + "by forcing the planner into to generate plans which contains nodes "

Thanks for looking. I'll fix that.

Likely the hardest part to get right here is the new name.  Can anyone
think of anything better than debug_parallel_query?

David




Re: MacOS: xsltproc fails with "warning: failed to load external entity"

2023-02-09 Thread Andres Freund
Hi,

On 2023-02-09 13:48:46 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > I think this misunderstanding is again due to the confusion between the 
> > 'all'
> > target in doc/src/sgml and the default target, just like earlier in the 
> > thread
> > / why I ended up with the prior set of targets under 'docs'.
> 
> >   # Make "html" the default target, since that is what most people tend
> >   # to want to use.
> >   html:
> >   ...
> >   all: html man
> 
> > Given the repeated confusion from that, among fairly senior hackers, perhaps
> > we ought to at least put those lines next to each other? It's certainly not
> > obvious as-is.
> 
> I think there are ordering constraints between these and the
> Makefile.global inclusion.  But we could add a comment beside the "all:"
> line pointing out that that's not the default target.

Yes, html: has to happen before the inclusion of Makefile.global to become the
default target, but afaics we can just move "all: html man" up?


> If we intend to someday build tarballs with meson, there'd need to be
> a target that builds html+man, but that could perhaps be named
> "distprep".

Yea, a distprep target just depending on all the required targets seems to be
the way to go for that.


Not really related: I think we should seriously consider removing most of the
things distprep includes in the tarball. I'd leave docs in though. But IMO all
of the generated code doesn't make sense in this day and age. I guess that's a
discussion for a different thread and a different day.

Greetings,

Andres Freund




Re: Minor meson gripe

2023-02-09 Thread Andres Freund
Hi,

On 2023-02-09 11:01:31 -0800, Peter Geoghegan wrote:
> Currently, meson has a test suite named "setup". According to the
> Wiki, this is needed to get something equivalent to "make check", by
> running "meson test -v --suite setup --suite regress".

Yep.


> Some questions about this:
>
> * Isn't it confusing that we have a suite by that name, given that we
> also need to use the unrelated --setup flag for some nearby testing
> recipes?

Hm. I don't find it particularly confusing, but I don't think I'm a good judge
of that, too close.


> * Why do we actually need a "setup" suite?
>
> Offhand it appears that a simple "meson test -v --suite regress" works
> just as well. Have I missed something?

It'll work, but only if you have run setup before. And it'll not use changed C
code.

The setup suite creates the installation in tmp_install/. So if you haven't
run the tests before, it'll fail due to that missing. If you have run it
before, but have changed code, it'll not get used.


The background for the issue is that while meson test supports dependencies
for each test, and will build exactly the required dependencies if you run
individual tests with meson test, it unfortunately also adds all the test
dependencies to the default ninja target.

That's mostly for historical reasons, because initially meson didn't support
dependencies for tests. There's recent work on changing that though.

Creating the temp installation every time you run 'ninja' would not be
nice. On slower machines it can take quite a while.


I think medium term we should just stop requiring a temporary install to run
tests; it's substantial, unnecessary overhead, and it requires us to build
way too much to run basic tests. It'd not take a whole lot to make that work:

- a search path for finding extensions, which'd be very useful for other
  reasons as well

- a way to tell 'postgres', 'initdb' etc, which use find_other_exec(), that
  they should use PATH
  
- a way to tell initdb where to find things like postgres.bki, postgres where
  it can find timezone data, etc.


Greetings,

Andres Freund




pg_usleep for multisecond delays

2023-02-09 Thread Nathan Bossart
I just found myself carefully counting the zeros in a call to pg_usleep().
Besides getting my eyes checked, perhaps there should be a wrapper called
pg_ssleep() that can be used for multisecond sleeps.  Or maybe the
USECS_PER_SEC macro should be used more widely.  I attached a patch for the
former approach.  I don't have a strong opinion, but I do think it's worth
improving readability a bit here.
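
For reference, the wrapper itself is nearly trivial; a minimal sketch of the
idea (the attached patch may place and name things differently, and could use
USECS_PER_SEC where that macro is in scope):

    /* somewhere central, e.g. a header that already declares pg_usleep() */
    static inline void
    pg_ssleep(long seconds)
    {
    	pg_usleep(seconds * 1000000L);	/* pg_usleep() takes microseconds */
    }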

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 3feee28d19..63de896cae 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -2976,7 +2976,7 @@ _bt_pendingfsm_finalize(Relation rel, BTVacState *vstate)
 	 * never be effective without some other backend concurrently consuming an
 	 * XID.
 	 */
-	pg_usleep(5000000L);
+	pg_ssleep(5);
 #endif
 
 	/*
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index f9f0f6db8d..87664045d0 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5103,7 +5103,7 @@ StartupXLOG(void)
 	/* This is just to allow attaching to startup process with a debugger */
 #ifdef XLOG_REPLAY_DELAY
 	if (ControlFile->state != DB_SHUTDOWNED)
-		pg_usleep(60000000L);
+		pg_ssleep(60);
 #endif
 
 	/*
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index ff6149a179..744adff984 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -441,7 +441,7 @@ AutoVacLauncherMain(int argc, char *argv[])
 			(errmsg_internal("autovacuum launcher started")));
 
 	if (PostAuthDelay)
-		pg_usleep(PostAuthDelay * 1000000L);
+		pg_ssleep(PostAuthDelay);
 
 	SetProcessingMode(InitProcessing);
 
@@ -561,7 +561,7 @@ AutoVacLauncherMain(int argc, char *argv[])
 		 * Sleep at least 1 second after any error.  We don't want to be
 		 * filling the error logs as fast as we can.
 		 */
-		pg_usleep(1000000L);
+		pg_ssleep(1);
 	}
 
 	/* We can now handle ereport(ERROR) */
@@ -687,7 +687,7 @@ AutoVacLauncherMain(int argc, char *argv[])
  * of a worker will continue to fail in the same way.
  */
 AutoVacuumShmem->av_signal[AutoVacForkFailed] = false;
-pg_usleep(1000000L);	/* 1s */
+pg_ssleep(1);
 SendPostmasterSignal(PMSIGNAL_START_AUTOVAC_WORKER);
 continue;
 			}
@@ -1708,7 +1708,7 @@ AutoVacWorkerMain(int argc, char *argv[])
 (errmsg_internal("autovacuum: processing database \"%s\"", dbname)));
 
 		if (PostAuthDelay)
-			pg_usleep(PostAuthDelay * 1000000L);
+			pg_ssleep(PostAuthDelay);
 
 		/* And do an appropriate amount of work */
 		recentXid = ReadNextTransactionId();
diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c
index 0dd22b2351..6d38dfeeba 100644
--- a/src/backend/postmaster/bgworker.c
+++ b/src/backend/postmaster/bgworker.c
@@ -759,7 +759,7 @@ StartBackgroundWorker(void)
 
 	/* Apply PostAuthDelay */
 	if (PostAuthDelay > 0)
-		pg_usleep(PostAuthDelay * 1000000L);
+		pg_ssleep(PostAuthDelay);
 
 	/*
 	 * Set up signal handlers.
diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c
index 9bb47da404..147f9b1e38 100644
--- a/src/backend/postmaster/bgwriter.c
+++ b/src/backend/postmaster/bgwriter.c
@@ -196,7 +196,7 @@ BackgroundWriterMain(void)
 		 * to be repeated, and we don't want to be filling the error logs as
 		 * fast as we can.
 		 */
-		pg_usleep(1000000L);
+		pg_ssleep(1);
 
 		/*
 		 * Close all open files after any error.  This is helpful on Windows,
diff --git a/src/backend/postmaster/checkpointer.c b/src/backend/postmaster/checkpointer.c
index aaad5c5228..12786229dd 100644
--- a/src/backend/postmaster/checkpointer.c
+++ b/src/backend/postmaster/checkpointer.c
@@ -310,7 +310,7 @@ CheckpointerMain(void)
 		 * to be repeated, and we don't want to be filling the error logs as
 		 * fast as we can.
 		 */
-		pg_usleep(1000000L);
+		pg_ssleep(1);
 
 		/*
 		 * Close all open files after any error.  This is helpful on Windows,
diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index e551af2905..6460c69726 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -450,7 +450,7 @@ pgarch_ArchiverCopyLoop(void)
 }
 
 /* wait a bit before retrying */
-pg_usleep(1000000L);
+pg_ssleep(1);
 continue;
 			}
 
@@ -482,7 +482,7 @@ pgarch_ArchiverCopyLoop(void)
 	xlog)));
 	return;		/* give up archiving for now */
 }
-pg_usleep(1000000L);	/* wait a bit before retrying */
+pg_ssleep(1);	/* wait a bit before retrying */
 			}
 		}
 	}
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 2552327d90..6b80f423a8 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -4292,7 +4292,7 @@ BackendInitialize(Port *port)
 	 * is not honored until after authentication.)
 	 */
 	

Re: Importing pg_bsd_indent into our source tree

2023-02-09 Thread Andres Freund
Hi,

On 2023-02-09 13:30:30 -0500, Tom Lane wrote:
> 0003 lacks meson support (anyone want to help with that?)

I'll give it a go, unless somebody else wants to.


Do we expect pg_bsd_indent to build / work on windows, right now? If it
doesn't, do we want to make that a hard requirement?

I'll have CI test that, once I added meson support.


> I'd anticipate pushing 0001-0003 shortly but holding 0004 until we are ready
> to do the post-March-CF pgindent run.  (Come to think of it, 0004 had
> probably better include a pg_bsd_indent version bump too.)

How large is the diff that creates? If it's not super-widespread, it might be
ok to do that earlier. I wouldn't mind not seeing that ugliness every time I
run pgindent on a patch... Although I guess post-March-CF isn't that far away
at this point :)

Greetings,

Andres Freund




Re: ICU locale validation / canonicalization

2023-02-09 Thread Jeff Davis
On Thu, 2023-02-09 at 15:44 +0100, Peter Eisentraut wrote:
> One use case is that if a user specifies a locale, say, of 'de-AT',
> this 
> might canonicalize to 'de' today,

Canonicalization should not lose useful information, it should just
rearrange it, so I don't see a risk here based on what I read and the
behavior I saw. In ICU, "de-AT" canonicalizes to "de_AT" and becomes
the language tag "de-AT".
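
For illustration, here is that round trip through the plain ICU C API (a
sketch with error handling omitted; these are the same uloc calls the
attached test program exercises):

    UErrorCode status = U_ZERO_ERROR;
    char       canonical[ULOC_FULLNAME_CAPACITY];
    char       langtag[ULOC_FULLNAME_CAPACITY];

    /* "de-AT" canonicalizes to the ICU locale ID "de_AT" ... */
    uloc_canonicalize("de-AT", canonical, sizeof(canonical), &status);
    /* ... and converts back to the BCP 47 language tag "de-AT" */
    uloc_toLanguageTag(canonical, langtag, sizeof(langtag), true, &status);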

> but we should still store what the 
> user specified because 1) that documents what the user wanted, and 2)
> it 
> might not canonicalize to the same thing tomorrow.

We don't want to store things with ambiguous interpretations that could
change tomorrow; that's a recipe for trouble. That's why most people
store timestamps as the offset from some epoch in UTC rather than as
"2/9/23" (Feb 9 or Sept 2? 1923 or 2023?). There are exceptions where
you would want to store something like that, but I don't see why they'd
apply in this case, where reinterpretation probably means a corrupted
index.

If the user wants to know how their ad-hoc string was interpreted, they
can look at the resulting BCP 47 language tag, and see if it's what
they meant. We can try to make this user-friendly by offering a NOTICE,
WARNING, or helper functions that allow them to explore. We can also
double check that the canonicalized form resolves to the same actual
collator to be safe, and maybe even fall back to whatever the user
specified if not. I'm open to discuss how strict we want to be and what
kind of escape hatches we need to offer.

There is still a risk that the BCP 47 language tag resolves to a
different specific ICU collator or different collator version tomorrow.
That's why we need to be careful about versioning (library versions or
collator versions or both), and we've had long discussions about that.

-- 
Jeff Davis
PostgreSQL Contributor Team - AWS






Re: pg_usleep for multisecond delays

2023-02-09 Thread Andres Freund
Hi,

On 2023-02-09 12:59:29 -0800, Nathan Bossart wrote:
> I just found myself carefully counting the zeros in a call to pg_usleep().
> Besides getting my eyes checked, perhaps there should be a wrapper called
> pg_ssleep() that can be used for multisecond sleeps.  Or maybe the
> USECS_PER_SEC macro should be used more widely.  I attached a patch for the
> former approach.  I don't have a strong opinion, but I do think it's worth
> improving readability a bit here.

pg_usleep() should pretty much never be used for sleeps that long, at least in
the backend - depending on the platform they're not interruptible. Most of the
things changed here are debugging tools, but even so, it's pretty annoying:
e.g. you can't normally shut down while a backend is in pre_auth_delay.

So I'm not sure it's the right direction to make pg_usleep() easier to use...

Greetings,

Andres Freund




Re: Importing pg_bsd_indent into our source tree

2023-02-09 Thread Tom Lane
Andres Freund  writes:
> On 2023-02-09 13:30:30 -0500, Tom Lane wrote:
>> 0003 lacks meson support (anyone want to help with that?)

> I'll give it a go, unless somebody else wants to.

Thanks.

> Do we expect pg_bsd_indent to build / work on windows, right now?

It would be desirable, for sure.  I've not noticed anything remarkably
unportable in the code, so probably it's just a matter of getting the
build infrastructure to build it.  I suppose that we aren't going to
update the src/tools/msvc scripts anymore, so getting meson to handle
it should be enough??

>> I'd anticipate pushing 0001-0003 shortly but holding 0004 until we are ready
>> to do the post-March-CF pgindent run.  (Come to think of it, 0004 had
>> probably better include a pg_bsd_indent version bump too.)

> How large is the diff that creates? If it's not super-widespread, it might be
> ok to do that earlier.

It wasn't that big; I posted it in the thread discussing that change.

I think the real issue might just be that, assuming we bump the
pg_bsd_indent version number, that in itself would force interested
committers to update their copy Right Now.  I'd rather give a little
notice.

regards, tom lane




Re: postgres_fdw, dblink, and CREATE SUBSCRIPTION security

2023-02-09 Thread Jacob Champion
On 2/6/23 08:22, Robert Haas wrote:
> I don't think that's quite the right concept. It seems to me that the
> client is responsible for informing the server of what the situation
> is, and the server is responsible for deciding whether to allow the
> connection. In your scenario, the client is not only communicating
> information ("here's the password I have got") but also making demands
> on the server ("DO NOT authenticate using anything else"). I like the
> first part fine, but not the second part.

For what it's worth, making a negative demand during authentication is
pretty standard: if you visit example.com and it tells you "I need your
OS login password and Social Security Number to authenticate you," you
have the option of saying "no thanks" and closing the tab.

It's not really about protecting the server at that point; the server
can protect itself. It's about protecting *you*. Allowing the proxy to
pin a specific set of authentication details to the connection is just a
way for it to close the tab on a server that would otherwise pull some
other piece of ambient authority out of it.

In a hypothetical world where the server presented the client with a
list of authentication options before allowing any access, this would
maybe be a little less convoluted to solve. For example, a proxy seeing
a SASL list of

- ANONYMOUS
- EXTERNAL

could understand that both methods allow the client to assume the
authority of the proxy itself. So if its client isn't allowed to do
that, the proxy realizes something is wrong (either it, or its target
server, has been misconfigured or is under attack), and it can close the
connection *before* the server runs login triggers.

> In this kind of scenario, the client has no business
> demanding that the server authenticate using the password rather than
> anything else. The server, not the client, is in charge of deciding
> which connections to accept; the client's job is only to decide which
> connections to proxy.

This sounds like a reasonable separation of responsibilities on the
surface, but I think it's subtly off. The entire confused-deputy problem
space revolves around the proxy being unable to correctly decide which
connections to allow unless it also knows why the connections are being
authorized.

You've constructed an example where that's not a concern: everything's
symmetrical, all proxies operate with the same authority, and internal
users are identical to external users. But the CVE that led to the
password requirement, as far as I can tell, dealt with asymmetry. The
proxy had the authority to connect locally to a user, and the clients
had the authority to connect to other machines' users, but those users
weren't the same and were not mutually trusting.

> And the human being is responsible for making
> sure that the combination of those two things implements the intended
> security policy.

Sure, but upthread it was illustrated how difficult it is for even the
people implementing the protocol to reason through what's safe and
what's not.

The primitives we're providing in the protocol are, IMO, difficult to
wield safely for more complex use cases. We can provide mitigations, and
demand that the DBA reason through every combination, and tell them
"don't do that" when they screw up or come across a situation that our
mitigations can't paper over. But I think we can solve the root problem
instead.

> We
> could do something that looks more like a series of if-then rules,
> e.g.
> 
> target-host 127.0.0.0/8 => reject
> authentication-method scram => accept
> reject

Yeah, I think something based on allow/deny is going to be the most
intuitive.

> But it's only a hop, skip and a jump from there to something that
> looks an awful lot like a full-blown programing language, and maybe
> that's even the right idea, but, oh, the bike-shedding!

Eh. Someone will demand Turing-completeness eventually, but you don't
have to listen. :D

--Jacob




Re: ICU locale validation / canonicalization

2023-02-09 Thread Jeff Davis
On Thu, 2023-02-09 at 10:53 -0500, Robert Haas wrote:
> Unfortunately, I have no idea whether your specific ideas about how
> to
> make that happen are any good or not. But I hope they are, because
> the
> current situation is pessimal.

It feels like BCP 47 is the right catalog representation. We are
already using it for the import of initial collations, and it's a
standard, and there seems to be good support in ICU.

There are a couple cases where canonicalization will succeed but
conversion to a BCP 47 language tag will fail. One is for unsupported
attributes, like "en_US@foo=bar". Another is a bug I found and reported
here:

https://unicode-org.atlassian.net/browse/ICU-22268

In both cases, we know that conversion has failed, and we have a choice
about how to proceed. We can fail, warn and continue with the user-
entered representation, or turn off the strictness checking and come up
with some BCP 47 tag and see if it resolves to the same collator.

I do like the ICU format locale IDs from a readability standpoint.
"en_US@colstrength=primary" is more meaningful to me than "en-US-u-ks-
level1" (the equivalent language tag). And the format is specified[1],
even though it's not an independent standard. But I think the benefits
of better validation, an independent standard, and the fact that we're
already favoring BCP47 outweigh my subjective opinion.

I also attached a simple test program that I've been using to
experiment (not intended for code review).

It's hard for me to say that I'm sure I'm right. I really just got
involved in this a few months back, and had a few off-list
conversations with Peter Eisentraut to try to learn more (I believe he
is aligned with my proposal but I will let him speak for himself).

I should also say that I'm not exactly an expert in languages or
scripts. I assume that ICU and IETF are doing sensible things to
accommodate the diversity of human language as well as they can (or at
least much better than the Postgres project could do on its own).

I'm happy to hear more input or other proposals.

[1]
https://unicode-org.github.io/icu/userguide/locale/#canonicalization

-- 
Jeff Davis
PostgreSQL Contributor Team - AWS



#include <stdio.h>
#include <unicode/uloc.h>
#include <unicode/ucol.h>

#define CAPACITY 1024

int main(int argc, char *argv[])
{
  UErrorCode	 status;
  UCollator	*collator;
  const char	*requested_locale = NULL;
  const char	*valid_locale;
  const char	*actual_locale;
  char		 canonical[CAPACITY] = {0};
  char		 variant[CAPACITY] = {0};
  char		 basename[CAPACITY] = {0};
  char		 getname[CAPACITY] = {0};
  char		 langtag[CAPACITY] = {0};
  char		 langtag_s[CAPACITY] = {0};

  if (argc > 1)
{
  requested_locale = argv[1];
  status = U_ZERO_ERROR;
  uloc_canonicalize(requested_locale, canonical, CAPACITY, &status);
  if (U_FAILURE(status))
	printf("canonicalize error: %s\n", u_errorName(status));
  status = U_ZERO_ERROR;
  uloc_getBaseName(requested_locale, basename, CAPACITY, &status);
  if (U_FAILURE(status))
	printf("basename error: %s\n", u_errorName(status));
  status = U_ZERO_ERROR;
  uloc_getName(requested_locale, getname, CAPACITY, &status);
  if (U_FAILURE(status))
	printf("getname error: %s\n", u_errorName(status));
  status = U_ZERO_ERROR;
  uloc_getVariant(requested_locale, variant, CAPACITY, &status);
  if (U_FAILURE(status))
	printf("variant error: %s\n", u_errorName(status));
  status = U_ZERO_ERROR;
  uloc_toLanguageTag(requested_locale, langtag, CAPACITY, false, &status);
  if (U_FAILURE(status))
	printf("langtag error: %s\n", u_errorName(status));
  status = U_ZERO_ERROR;
  uloc_toLanguageTag(requested_locale, langtag_s, CAPACITY, true, &status);
  if (U_FAILURE(status))
	printf("langtag strict error: %s\n", u_errorName(status));
}
  if (argc > 2)
fprintf(stderr, "too many arguments");
  
  status = U_ZERO_ERROR;
  collator = ucol_open(requested_locale, &status);
  valid_locale = ucol_getLocaleByType(collator, ULOC_VALID_LOCALE, &status);
  actual_locale = ucol_getLocaleByType(collator, ULOC_ACTUAL_LOCALE, &status);
  printf("canonicalize: %s\n", canonical);
  printf("langtag   : %s\n", langtag);
  printf("langtag strict: %s\n", langtag_s);
  printf("variant: %s\n", variant);
  printf("getBaseName: %s\n", basename);
  printf("getName: %s\n", getname);
  printf("valid locale: %s\n", valid_locale);
  printf("actual locale: %s\n", actual_locale);
  ucol_close(collator);
}


Re: Importing pg_bsd_indent into our source tree

2023-02-09 Thread Tom Lane
Andres Freund  writes:
> Did that in the attached.

Thanks.

> I didn't convert the test though, due to the duplication it'd create. Perhaps
> we should just move it to a shell script? Or maybe it just doesn't matter
> enough to bother with?

We could move it to a shell script perhaps, but that seems pretty
low-priority.

> It doesn't build as-is with msvc, but does build with mingw. Failure:
> https://cirrus-ci.com/task/6290206869946368?logs=build#L1573

Thanks, I'll take a look at these things.

> Do we really want to require users to install pg_bsd_indent into PATH? Seems
> like we ought to have a build target to invoke pgindent with a path to
> pg_bsd_indent or such? But I guess we can address that later.

For the moment I was just interested in maintaining the current workflow.
I know people muttered about having some sort of build target that'd
indent the whole tree from scratch after building pg_bsd_indent, but it's
not very clear to me how that'd work with e.g. VPATH configurations.

(I think you can already tell pgindent to use a specific pg_bsd_indent,
if your gripe is just about wanting to use a prebuilt copy that you
don't want to keep in PATH for some reason.)

> Independent of this specific patch: You seem to be generating your patch
> series by invoking git show and redirecting that to a file?

Yeah, it's pretty low-tech.  I'm not in the habit of posting multi-patch
series very often, so I haven't really bothered to use format-patch.
(I gave up on "git am" long ago as being too fragile, and always
use good ol' "patch" to apply patches, so I don't think about things
like whether it'd automatically absorb commit messages.  I pretty much
never use anyone else's commit message verbatim anyway ...)

regards, tom lane




Re: Importing pg_bsd_indent into our source tree

2023-02-09 Thread Andres Freund
Hi,

On 2023-02-09 13:55:32 -0800, Andres Freund wrote:
> ../src/tools/pg_bsd_indent/args.c(179): error C2065: 'PATH_MAX': undeclared 
> identifier
> ../src/tools/pg_bsd_indent/args.c(179): error C2057: expected constant 
> expression
> ../src/tools/pg_bsd_indent/args.c(179): error C2466: cannot allocate an array 
> of constant size 0
> ../src/tools/pg_bsd_indent/args.c(179): error C2133: 'fname': unknown size
> ../src/tools/pg_bsd_indent/args.c(183): warning C4034: sizeof returns 0
> ../src/tools/pg_bsd_indent/args.c(185): warning C4034: sizeof returns 0
> [1557/2161] Compiling C object 
> src/tools/pg_bsd_indent/pg_bsd_indent.exe.p/err.c.obj
> [1558/2161] Precompiling header 
> src/interfaces/ecpg/ecpglib/libecpg.dll.p/meson_pch-c.c
> [1559/2161] Compiling C object 
> src/tools/pg_bsd_indent/pg_bsd_indent.exe.p/indent.c.obj
> FAILED: src/tools/pg_bsd_indent/pg_bsd_indent.exe.p/indent.c.obj
> "cl" "-Isrc\tools/pg_bsd_indent\pg_bsd_indent.exe.p" 
> "-Isrc\tools/pg_bsd_indent" "-I..\src\tools\pg_bsd_indent" "-Isrc\include" 
> "-I..\src\include" "-Ic:\openssl\1.1\include" "-I..\src\include\port\win32" 
> "-I..\src\include\port\win32_msvc" "/MDd" "/nologo" "/showIncludes" "/utf-8" 
> "/W2" "/Od" "/Zi" "/DWIN32" "/DWINDOWS" "/D__WINDOWS__" "/D__WIN32__" 
> "/D_CRT_SECURE_NO_DEPRECATE" "/D_CRT_NONSTDC_NO_DEPRECATE" "/wd4018" 
> "/wd4244" "/wd4273" "/wd4101" "/wd4102" "/wd4090" "/wd4267" 
> "/Fdsrc\tools/pg_bsd_indent\pg_bsd_indent.exe.p\indent.c.pdb" 
> /Fosrc/tools/pg_bsd_indent/pg_bsd_indent.exe.p/indent.c.obj "/c" 
> ../src/tools/pg_bsd_indent/indent.c
> ../src/tools/pg_bsd_indent/indent.c(63): error C2065: 'MAXPATHLEN': 
> undeclared identifier
> ../src/tools/pg_bsd_indent/indent.c(63): error C2057: expected constant 
> expression
> ../src/tools/pg_bsd_indent/indent.c(63): error C2466: cannot allocate an 
> array of constant size 0
> 
> This specific issue at least should be easily fixable.

The trivial fix of using MAXPGPATH made it build, without warnings. That
doesn't say anything about actually working. So I guess porting the test would
make sense.

Opinions on whether it would make sense as a shell script?

Greetings,

Andres Freund




Re: Importing pg_bsd_indent into our source tree

2023-02-09 Thread Tom Lane
Andres Freund  writes:
> The trivial fix of using MAXPGPATH made it build, without warnings. That
> doesn't say anything about actually working. So I guess porting the test would
> make sense.

> Opinions on whether it would make sense as a shell script?

Hmmm .. a shell script would be fine by me, but it won't help in
testing a Windows build.  Maybe we need to make it a Perl script?

BTW, the attachments to your previous message are identical to what
I previously posted --- did you attach the wrong set of diffs?

regards, tom lane




Re: Time delayed LR (WAS Re: logical replication restrictions)

2023-02-09 Thread Peter Smith
> The comment adjustment suggested by Peter-san above
> was also included in this v33.
> Please have a look at the attached patch.

Patch v33 LGTM.

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: recovery modules

2023-02-09 Thread Nathan Bossart
On Thu, Feb 09, 2023 at 12:18:55PM -0800, Andres Freund wrote:
> On 2023-02-09 11:39:17 -0800, Nathan Bossart wrote:
> Personally I'd probably squash these into a single commit.

done

> I'd probably mention that this should typically be of server lifetime / a
> 'static const' struct. Tableam documents this as follows:

done

>> +  
>> +   
>> +archive_library is only loaded in the archiver 
>> process.
>> +   
>> +  
>>   
> 
> That's not really related to any of the changes here, right?
> 
> I'm not sure it's a good idea to document that. We e.g. probably should allow
> the library to check that the configuration is correct, at postmaster start,
> rather than later, at runtime.

removed

>> +  
>> +   Startup Callback
>> +   
>> +The startup_cb callback is called shortly after the
>> +module is loaded.  This callback can be used to perform any additional
>> +initialization required.  If the archive module needs to have a state, 
>> it
>> +can use state->private_data to store it.
> 
> s/needs to have a state/has state/?

done

>>  
>> -typedef bool (*ArchiveCheckConfiguredCB) (void);
>> +typedef bool (*ArchiveCheckConfiguredCB) (ArchiveModuleState *state);
>>  
>>  
>>  If true is returned, the server will proceed with
> 
> Hm. I wonder if ArchiveCheckConfiguredCB() should actually work without the
> state. We're not really doing anything yet, at that point, so it shouldn't
> really need state?
> 
> The reason I'm wondering is that I think we should consider calling this from
> the GUC assignment hook, at least in postmaster. Which would make it more
> convenient to not have state, I think?

I agree that this callback should typically not need the state, but I'm not
sure whether it fits into the assignment hook for archive_library.  This
callback is primarily meant for situations when you have archiving enabled,
but your module isn't configured yet (e.g., archive_command is empty).  In
this case, we keep the WAL around, but we don't try to archive it until
this hook returns true.  It's up to each module to define that criteria.  I
can imagine someone introducing a GUC in their archive module that
temporarily halts archiving via this callback.  In that case, calling it
via a GUC assignment hook probably won't work.  In fact, checking whether
archive_command is empty in that hook might not work either.
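
For illustration, the kind of thing I have in mind (a sketch only;
my_archiving_paused stands in for a hypothetical module GUC and is not part
of the patch):

    static bool
    basic_archive_configured(ArchiveModuleState *state)
    {
    	/* not configured until a target directory is set; a module GUC
    	 * (hypothetical) could additionally pause archiving via this hook */
    	return archive_directory != NULL && archive_directory[0] != '\0' &&
    		!my_archiving_paused;
    }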

>>  
>> -typedef void (*ArchiveShutdownCB) (void);
>> +typedef void (*ArchiveShutdownCB) (ArchiveModuleState *state);
>>  
>> 
>>
> 
> Perhaps mention that this needs to free state it allocated in the
> ArchiveModuleState, or it'll be leaked?

done

I left this out originally because the archiver exits shortly after calling
this.  However, if you have DSM segments or something, it's probably wise
to make sure those are cleaned up.  And I suppose we might not always exit
immediately after this callback, so establishing the habit of freeing the
state could be a good idea.  In addition to updating this part of the docs,
I wrote a shutdown callback for basic_archive that frees its state.
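
For reference, the shape of that callback (a sketch; it assumes the state
struct itself was allocated inside its own memory context, so deleting the
context releases everything at once):

    static void
    basic_archive_shutdown(ArchiveModuleState *state)
    {
    	BasicArchiveData *data = (BasicArchiveData *) state->private_data;

    	if (data == NULL)
    		return;

    	/* data lives in data->context, so this frees the struct too */
    	MemoryContextDelete(data->context);
    	state->private_data = NULL;
    }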

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From fab6305c344deb83c5b5c30ca2c09dbe1925a36c Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Thu, 9 Feb 2023 13:49:46 -0800
Subject: [PATCH v10 1/1] restructure archive modules API

---
 contrib/basic_archive/basic_archive.c   | 91 +
 doc/src/sgml/archive-modules.sgml   | 35 +++---
 src/backend/postmaster/pgarch.c | 27 +---
 src/backend/postmaster/shell_archive.c  | 34 +
 src/backend/utils/misc/guc_tables.c |  1 +
 src/include/postmaster/archive_module.h | 58 
 src/include/postmaster/pgarch.h | 39 ---
 src/include/postmaster/shell_archive.h  | 24 +++
 8 files changed, 224 insertions(+), 85 deletions(-)
 create mode 100644 src/include/postmaster/archive_module.h
 create mode 100644 src/include/postmaster/shell_archive.h

diff --git a/contrib/basic_archive/basic_archive.c b/contrib/basic_archive/basic_archive.c
index 36b7a4814a..e4742c9c94 100644
--- a/contrib/basic_archive/basic_archive.c
+++ b/contrib/basic_archive/basic_archive.c
@@ -32,7 +32,7 @@
 
 #include "common/int.h"
 #include "miscadmin.h"
-#include "postmaster/pgarch.h"
+#include "postmaster/archive_module.h"
 #include "storage/copydir.h"
 #include "storage/fd.h"
 #include "utils/guc.h"
@@ -40,14 +40,27 @@
 
 PG_MODULE_MAGIC;
 
+typedef struct BasicArchiveData
+{
+	MemoryContext context;
+} BasicArchiveData;
+
 static char *archive_directory = NULL;
-static MemoryContext basic_archive_context;
 
-static bool basic_archive_configured(void);
-static bool basic_archive_file(const char *file, const char *path);
+static void basic_archive_startup(ArchiveModuleState *state);
+static bool basic_archive_configured(ArchiveModuleState *state);
+static bool basic_archive_file(ArchiveModuleState *state, const char *file, const char *path);
 static void basic_archive_file_internal(const cha

Re: Generating code for query jumbling through gen_node_support.pl

2023-02-09 Thread Michael Paquier
On Wed, Feb 08, 2023 at 03:47:51PM +0900, Michael Paquier wrote:
> This one was intentional to let extensions play with jumbling of such
> nodes, but perhaps you are right that it makes little sense at this
> stage.  If there is an ask for it later, though..  Using
> shared_preload_libraries = pg_stat_statements and compute_query_id =
> regress shows that numbers go up to 60% for funcs.c and 30% for
> switch.c.  Removing nodes like as of the attached brings these numbers
> respectively up to 94.5% and 93.5% for a check.  With a check-world, I
> measure respectively 96.7% and 96.1% because there is more coverage
> for extensions, ALTER SYSTEM and database commands, roughly.

Tom, did you get a chance to look at what is proposed here and expand
the use of query_jumble_ignore in the definitions of the nodes rather
than have an enforced per-file policy in gen_node_support.pl?  The
attached improves the situation by as much as we can; still, the
numbers reported by coverage.postgresql.org won't go up that much until we
enforce query jumbling testing on the regression database (something
we should have done since this code moved to core, I guess).
--
Michael


signature.asc
Description: PGP signature


Re: pg_usleep for multisecond delays

2023-02-09 Thread Nathan Bossart
On Thu, Feb 09, 2023 at 01:30:27PM -0800, Andres Freund wrote:
> On 2023-02-09 12:59:29 -0800, Nathan Bossart wrote:
>> I just found myself carefully counting the zeros in a call to pg_usleep().
>> Besides getting my eyes checked, perhaps there should be a wrapper called
>> pg_ssleep() that can be used for multisecond sleeps.  Or maybe the
>> USECS_PER_SEC macro should be used more widely.  I attached a patch for the
>> former approach.  I don't have a strong opinion, but I do think it's worth
>> improving readability a bit here.
> 
> pg_usleep() should pretty much never used for sleeps that long, at least in
> the backend - depending on the platform they're not interruptible. Most of the
> things changed here are debugging tools, but even so, it's e.g. pretty
> annoying. E.g. you can't normally shut down while a backend is in
> pre_auth_delay.
> 
> So I'm not sure it's the right direction to make pg_usleep() easier to use...

Hm...  We might be able to use WaitLatch() for some of these.
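
For example, an interruptible multisecond delay could look roughly like this
(a sketch; the wait-event choice is an assumption):

    /* instead of pg_usleep(PostAuthDelay * 1000000L) */
    (void) WaitLatch(MyLatch,
    				 WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
    				 PostAuthDelay * 1000L,	/* timeout in msec */
    				 WAIT_EVENT_PG_SLEEP);
    ResetLatch(MyLatch);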

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Worth using personality(ADDR_NO_RANDOMIZE) for EXEC_BACKEND on linux?

2023-02-09 Thread Michael Paquier
On Fri, Feb 10, 2023 at 08:15:21AM +1300, Thomas Munro wrote:
> For the record, according to [1] it's not necessary to use
> --reset-author when back-patching.  (Maybe a little confusingly,
> because it's not quite clear whether our policies consider the author
> field to be meaningful or not.)
> 
> [1] 
> https://www.postgresql.org/message-id/flat/1347459696.16215.11.camel%40vanquo.pezone.net

Oops, sorry about that.  Using --reset-author is a habit when it comes
to backpatching.  It looks like my mistake when back-patching something
only to stable branches.
--
Michael


signature.asc
Description: PGP signature


Re: Importing pg_bsd_indent into our source tree

2023-02-09 Thread Andres Freund
Hi,

On 2023-02-09 17:19:22 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > The trivial fix of using MAXPGPATH made it build, without warnings. That
> > doesn't say anything about actually working. So I guess porting the test 
> > would
> > make sense.
> 
> > Opinions on whether it would make sense as a shell script?
> 
> Hmmm .. a shell script would be fine by me, but it won't help in
> testing a Windows build.  Maybe we need to make it a Perl script?

At least for casual testing a shell script actually mostly works, due to git
it's easy enough to have a sh.exe around... Not something I'd necessarily want
to make a hard dependency, but for something like this it might suffice.  Of
course perl would be more dependable...


> BTW, the attachments to your previous message are identical to what
> I previously posted --- did you attach the wrong set of diffs?

I attached an extra patch, in addition to yours. I also attached yours so that
cfbot could continue to work, if you registered this.

Greetings,

Andres Freund




Re: Generating code for query jumbling through gen_node_support.pl

2023-02-09 Thread Tom Lane
Michael Paquier  writes:
> Tom, did you get a chance to look at what is proposed here and expand
> the use of query_jumble_ignore in the definitions of the nodes rather
> than have an enforced per-file policy in gen_node_support.pl?

Sorry, didn't look at it before.

I'm okay with the pathnodes.h changes --- although surely you don't need
changes like this:

-   pg_node_attr(abstract)
+   pg_node_attr(abstract, no_query_jumble)

"abstract" should already imply "no_query_jumble".

I wonder too if you could shorten the changes by making no_query_jumble
an inheritable attribute, and then just applying it to Path and Plan.

The changes in parsenodes.h seem wrong, except for RawStmt.  Those node
types are used in parsed queries, aren't they?

regards, tom lane




Re: Importing pg_bsd_indent into our source tree

2023-02-09 Thread Tom Lane
Andres Freund  writes:
> On 2023-02-09 17:19:22 -0500, Tom Lane wrote:
>> Hmmm .. a shell script would be fine by me, but it won't help in
>> testing a Windows build.  Maybe we need to make it a Perl script?

> At least for casual testing a shell script actually mostly works, due to git
> it's easy enough to have a sh.exe around... Not something I'd necessarily want
> to make a hard dependency, but for something like this it might suffice.  Of
> course perl would be more dependable...

Yeah, also less question about whether it works on Windows.
I'll see about moving that into Perl.  It's short enough.

>> BTW, the attachments to your previous message are identical to what
>> I previously posted --- did you attach the wrong set of diffs?

> I attached an extra patch, in addition to yours.

D'oh, I didn't notice that :-(

> I also attached yours so that
> cfbot could continue to work, if you registered this.

I thought about registering it, but that won't teach us anything unless
we make it built-by-default, which was not my intention.  I guess we
could temporarily include it in the build.

regards, tom lane




Re: Minor meson gripe

2023-02-09 Thread Peter Geoghegan
On Thu, Feb 9, 2023 at 12:56 PM Andres Freund  wrote:
> > * Isn't it confusing that we have a suite by that name, given that we
> > also need to use the unrelated --setup flag for some nearby testing
> > recipes?
>
> Hm. I don't find it particularly confusing, but I don't think I'm a good judge
> of that, too close.

> It'll work, but only if you have run setup before. And it'll not use changed C
> code.

I see. It's not that confusing on its own, but it does cause confusion
once you consider how things fit together. Suppose I want to do the
equivalent of running the amcheck tests -- the tests that run when
"make check" runs from contrib/amcheck with an autoconf build. That
looks like this with our current meson workflow:

meson test -v --suite setup --suite amcheck

Now consider what I have to run to get the equivalent of a "make
installcheck" run from the contrib/amcheck directory:

meson test -v --setup running --suite amcheck-running

Notice that I have to specify "--suite setup" in the first example,
whereas I have to specify "--setup running" in the second example
instead -- at the same point in each command. Also notice the rest of the details
almost match. This makes it quite natural to wonder if "--suite setup"
is related to "--setup running" in some way, which is not the case at
all. They're two wholly unrelated concepts.

Why not change the suite name to tmp_install? That immediately reminds
me of what's really going on here, since I'm used to seeing that
directory name. And it clashes with "--suite setup" in a way that
seems useful.

--
Peter Geoghegan




Re: ICU locale validation / canonicalization

2023-02-09 Thread Andreas Karlsson

On 2/9/23 23:09, Jeff Davis wrote:

I do like the ICU format locale IDs from a readability standpoint.
"en_US@colstrength=primary" is more meaningful to me than "en-US-u-ks-
level1" (the equivalent language tag). And the format is specified[1],
even though it's not an independent standard. But I think the benefits
of better validation, an independent standard, and the fact that we're
already favoring BCP47 outweigh my subjective opinion.


I have the same feeling: one is readable and the other unreadable, but the
unreadable one is standardized. Hard call.


And in general I agree: if we are going to make ICU the default, it needs to
be more user friendly than it is now. Currently there is no nice way to
tell whether you entered the right locale or made a typo in the BCP47
syntax.


Andreas





Re: make_ctags: use -I option to ignore pg_node_attr macro

2023-02-09 Thread Tatsuo Ishii
>> Oops. Thank you for pointing it out. BTW, just out of curiosity, do
>> you have etags on you Mac?
> 
> Yes.
> 
> $ etags --version
> etags (GNU Emacs 28.2)
> Copyright (C) 2022 Free Software Foundation, Inc.
> This program is distributed under the terms in ETAGS.README

Ok. Probably that was installed with emacs. For some reason I don't
know, I don't have etags even though I already installed emacs. So I decided
to test using my old Ubuntu 18 vm, which has old ctags and etags.

> How about just applying the following into make_etags?
> 
> +if [ $# -gt 1 ] || ( [ $# -eq 1 ] && [ $1 != "-n" ] )
> +then echo "Usage: $0 [-n]"
> + exit 1
> +fi

Ok from me. Looks simple enough.

> With the patch, make_etags caused the following error messages
> on my MacOS.
> 
> $ ./src/tools/make_etags
> No such file or directory
> No such file or directory
> 
> To fix this error, probably we should get rid of double-quotes
> from "$FLAGS" "$IGNORE_IDENTIFIES" in the following command.
> 
> - xargs ctags $MODE -a -f $TAGS_FILE "$FLAGS" "$IGNORE_IDENTIFIES"
> + xargs $PROG $MODE $TAGS_OPT $TAGS_FILE "$FLAGS" "$IGNORE_IDENTIFIES"

Oh, I see.

> + else ctags --help 2>&1 | grep -- -e >/dev/null
> + # Note that "ctags --help" does not always work. Even certain ctags
> + # does not have the option.
> 
> This code seems to assume that there is non-Exuberant ctags
> supporting -e option. But does such ctags really exist?

Good question. I vaguely recalled so, but I haven't been able to find
any evidence for it.
> 
> I fixed the above issues and refactored the code.
> Attached is the updated version of the patch. Thought?

Thank you! Looks good to me.

Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp




Re: Importing pg_bsd_indent into our source tree

2023-02-09 Thread Andres Freund
Hi,

On 2023-02-09 18:19:06 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2023-02-09 17:19:22 -0500, Tom Lane wrote:
> >> Hmmm .. a shell script would be fine by me, but it won't help in
> >> testing a Windows build.  Maybe we need to make it a Perl script?
> 
> > At least for casual testing a shell script actually mostly works, due to git
> > it's easy enough to have a sh.exe around... Not something I'd necessarily 
> > want
> > to make a hard dependency, but for something like this it might suffice.  Of
> > course perl would be more dependable...
> 
> Yeah, also less question about whether it works on Windows.
> I'll see about moving that into Perl.  It's short enough.

Cool.


> > I also attached yours so that
> > cfbot could continue to work, if you registered this.
> 
> I thought about registering it, but that won't teach us anything unless
> we make it built-by-default, which was not my intention.  I guess we
> could temporarily include it in the build.

The meson patch I sent did build it by default, that's why I saw the windows
failure and the freebsd warnings. If we don't want that, we'd need to add
  build_by_default: false

I'm fine either way. It's barely noticeable compared to the rest of postgres.

Greetings,

Andres Freund




Re: Importing pg_bsd_indent into our source tree

2023-02-09 Thread Tom Lane
Andres Freund  writes:
> On 2023-02-09 18:19:06 -0500, Tom Lane wrote:
>> I thought about registering it, but that won't teach us anything unless
>> we make it built-by-default, which was not my intention.  I guess we
>> could temporarily include it in the build.

> The meson patch I sent did build it by default, that's why I saw the windows
> failure and the freebsd warnings. If we don't want that, we'd need to add
>   build_by_default: false

> I'm fine either way. It's barely noticeable compared to the rest of postgres.

Yeah, build-by-default isn't really a big deal.  Install-by-default
is more of a problem...

regards, tom lane




Re: Minor meson gripe

2023-02-09 Thread Andres Freund
Hi,

On 2023-02-09 15:34:34 -0800, Peter Geoghegan wrote:
> Why not change the suite name to tmp_install? That immediately reminds
> me of what's really going on here, since I'm used to seeing that
> directory name. And it clashes with "--suite setup" in a way that
> seems useful.

The individual test is actually named tmp_install. I thought it might be
useful to add further things to the setup "stage", hence the more general
name.

I e.g. have a not-quite-done patch that creates a "template initdb", which
pg_regress and tap tests automatically use (except if non-default options are
required), which quite noticably reduces test times (*).  But something needs to
create the template initdb, and initdb doesn't run without an installation, so
we need to run it after the temp_install.

Of course we could wrap that into one "test", but it seemed nicer to see if
you fail during installation, or during initdb. So for that I added a separate
test, that is also part of the setup suite.

Of course we could still name the suite tmp_install (or to be consistent with
the confusing make naming, have a temp-install target, which creates the
tmp_install directory :)). I guess that'd still be less confusing?


I'm not at all wedded to the "setup" name.

Greetings,

Andres Freund


* approximate test time improvements:

  local:
  688.67user 154.44system 1:08.29elapsed 1234%CPU (0avgtext+0avgdata 
138984maxresident)k
  ->
  172.37user 109.43system 1:00.12elapsed 468%CPU (0avgtext+0avgdata 
139168maxresident)k

  The 4x reduction in CPU cycles is pretty bonkers.  Too bad wall clock time
  doesn't improve that much - we end up waiting for the isolationtester and
  pg_upgrade tests to finish.

  CI freebsd: 06:30 -> 05:00
  CI linux, sanitize-address: 5:40->4:30
  CI linux, sanitize-undefined,alignment: 3:00->2:15
  CI windows 12:20 -> 10:00
  CI macos: 10:20 -> 08:00

  I expect it to very substantially speed up the valgrind buildfarm animal. By
  far the most cycles are spent below initdb.

  https://github.com/anarazel/postgres/tree/initdb-caching




Re: Importing pg_bsd_indent into our source tree

2023-02-09 Thread Andres Freund
On 2023-02-09 19:20:37 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > I'm fine either way. It's barely noticeable compared to the rest of 
> > postgres.
> 
> Yeah, build-by-default isn't really a big deal.  Install-by-default
> is more of a problem...

Perhaps we should install it, just not in bin/, but alongside pgxs/, similar
to pg_regress et al?




Re: Importing pg_bsd_indent into our source tree

2023-02-09 Thread Tom Lane
Andres Freund  writes:
> Perhaps we should install it, just not in bin/, but alongside pgxs/, similar
> to pg_regress et al?

For my own purposes, I really don't want it anywhere in the --prefix
tree.  That's not necessarily present when I'm using the program.

(Hmm, clarify: it wouldn't matter if install sticks it under pgxs,
but I don't want to be forced to use it from there.)

regards, tom lane




Re: Introduce a new view for checkpointer related stats

2023-02-09 Thread Andres Freund
Hi,

On 2023-02-09 19:00:00 +0530, Bharath Rupireddy wrote:
> On Thu, Feb 9, 2023 at 12:33 PM Andres Freund  wrote:
> > On 2023-02-09 12:21:51 +0530, Bharath Rupireddy wrote:
> > > @@ -1105,18 +1105,22 @@ CREATE VIEW pg_stat_archiver AS
> > >
> > >  CREATE VIEW pg_stat_bgwriter AS
> > >  SELECT
> > > -pg_stat_get_bgwriter_timed_checkpoints() AS checkpoints_timed,
> > > -pg_stat_get_bgwriter_requested_checkpoints() AS checkpoints_req,
> > > -pg_stat_get_checkpoint_write_time() AS checkpoint_write_time,
> > > -pg_stat_get_checkpoint_sync_time() AS checkpoint_sync_time,
> > > -pg_stat_get_bgwriter_buf_written_checkpoints() AS 
> > > buffers_checkpoint,
> > >  pg_stat_get_bgwriter_buf_written_clean() AS buffers_clean,
> > >  pg_stat_get_bgwriter_maxwritten_clean() AS maxwritten_clean,
> > > -pg_stat_get_buf_written_backend() AS buffers_backend,
> > > -pg_stat_get_buf_fsync_backend() AS buffers_backend_fsync,
> > >  pg_stat_get_buf_alloc() AS buffers_alloc,
> > >  pg_stat_get_bgwriter_stat_reset_time() AS stats_reset;
> > >
> > > +CREATE VIEW pg_stat_checkpointer AS
> > > +SELECT
> > > +pg_stat_get_timed_checkpoints() AS timed_checkpoints,
> > > +pg_stat_get_requested_checkpoints() AS requested_checkpoints,
> > > +pg_stat_get_checkpoint_write_time() AS checkpoint_write_time,
> > > +pg_stat_get_checkpoint_sync_time() AS checkpoint_sync_time,
> > > +pg_stat_get_buf_written_checkpoints() AS 
> > > buffers_written_checkpoints,
> > > +pg_stat_get_buf_written_backend() AS buffers_written_backend,
> > > +pg_stat_get_buf_fsync_backend() AS buffers_fsync_backend,
> > > +pg_stat_get_checkpointer_stat_reset_time() AS stats_reset;
> > > +
> >
> > I don't think the backend written stats belong more accurately in
> > pg_stat_checkpointer than pg_stat_bgwriter.
> 
> We accumulate buffers_written_backend and buffers_fsync_backend of all
> backends under checkpointer stats to show the aggregated results to
> the users. I think this is correct because the checkpointer is the one
> that processes fsync requests (of course, backends themselves can
> fsync when needed, that's what the buffers_fsync_backend shows),
whereas the bgwriter doesn't perform fsyncs, IIUC.

That's true for buffers_fsync_backend, but not true for
buffers_backend/buffers_written_backend.

That isn't tied to checkpointer.


I think if we end up breaking compat, we should just drop that column. The
pg_stat_io patch from Melanie, which I hope to finish committing by tomorrow,
provides that in a more useful way, in a less confusing place.

I'm not sure it's worth having buffers_fsync_backend in pg_stat_checkpointer
in that case. You can get nearly the same information from pg_stat_io as well
(except fsyncs for SLRUs that couldn't be put into the queue, which you'd not
see right now - hard to believe that ever happens at a relevant frequency).


> > I continue to be worried about breaking just about any postgres monitoring
> > setup.
> 
> Hm. Yes, it requires minimal and straightforward changes in monitoring
> scripts. Please note that we separated out bgwriter and checkpointer
> in v9.2 12 years ago but we haven't had a chance to separate the stats
so far. We might do it at some point in time; IMHO, this is that time.

> We did away with promote_trigger_file (cd4329d) very recently. The
> agreement was that the changes required to move on to other mechanisms
> of promotion are minimal, hence we didn't want it to be first
> deprecated and then removed.

That's not really comparable, because we have had pg_ctl promote for a long
time. You can use it across all supported versions. pg_promote() is nearly
there as well.  Whereas there's no way to use the same query across all versions.

IME there also exist a lot more hand-rolled monitoring setups
than hand-rolled automatic promotion setups.


> From the discussion upthread, it looks like Robert, Amit, Bertrand,
> Greg and myself are in favour of not having a deprecated version but
> moving them to the new pg_stat_checkpointer view.

Yep, and I think you are all wrong, and that this is just going to cause
unnecessary pain :). I'm not going to try to prevent the patch from going in
because of this, just to be clear.

Greetings,

Andres Freund




Re: Time delayed LR (WAS Re: logical replication restrictions)

2023-02-09 Thread Kyotaro Horiguchi
At Thu, 9 Feb 2023 13:26:19 +0530, Amit Kapila  wrote 
in 
> On Thu, Feb 9, 2023 at 12:17 AM Peter Smith wrote:
> > I understand in theory, your code is more efficient, but in practice,
> > I think the overhead of a single variable assignment every loop
> > iteration (which is doing WaitLatch anyway) is of insignificant
> > concern, whereas having one assignment is simpler than having two IMO.
> >
> 
> Yeah, that sounds better to me as well.

FWIW, I'm on board with this.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Time delayed LR (WAS Re: logical replication restrictions)

2023-02-09 Thread Kyotaro Horiguchi
At Thu, 9 Feb 2023 13:48:52 +0530, Amit Kapila  wrote 
in 
> On Thu, Feb 9, 2023 at 10:45 AM Kyotaro Horiguchi
>  wrote:
> >
> > At Wed, 8 Feb 2023 09:03:03 +, "Hayato Kuroda (Fujitsu)" 
> >  wrote in
> > > Thank you for reviewing! PSA new version.
> >
> > +   if (statusinterval_ms > 0 && diffms > statusinterval_ms)
> >
> > The next expected feedback time is measured from the last status
> > report.  Thus, it seems to me this may suppress feedbacks from being
> > sent for an unexpectedly long time especially when min_apply_delay is
> > shorter than wal_r_s_interval.
> >
> 
> I think the minimum time before we send any feedback during the wait
> is wal_r_s_interval. Now, I think if there is no transaction for a
> long time before we get a new transaction, there should be keep-alive
> messages in between which would allow us to send feedback at regular
> intervals (wal_receiver_status_interval). So, I think we should be

Right.

> able to send feedback in less than 2 * wal_receiver_status_interval
> unless wal_sender/receiver timeout is very large and there is a very
> low volume of transactions. Now, we can try to send the feedback

We have suffered this kind of feedback silence many times. Thus I
don't want to rely on luck here. I had in mind exposing last_send
itself or providing an interval-calculation function to the logic.
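
Something along these lines, for illustration (a sketch; last_send and
send_feedback() follow names used in this thread, not necessarily the
final patch):

    /* while waiting out min_apply_delay, force a feedback message once
     * wal_receiver_status_interval has elapsed since the last send */
    if (wal_receiver_status_interval > 0 &&
    	TimestampDifferenceExceeds(last_send, GetCurrentTimestamp(),
    							   wal_receiver_status_interval * 1000))
    	send_feedback(last_received, true, false);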

> before we start waiting or maybe after every
> wal_receiver_status_interval / 2 but I think that will lead to more
> spurious feedback messages than we get the benefit from them.



regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Minor meson gripe

2023-02-09 Thread Peter Geoghegan
On Thu, Feb 9, 2023 at 4:33 PM Andres Freund  wrote:
> The individual test is actually named tmp_install. I thought it might be
> useful to add further things to the setup "stage", hence the more general
> name.

I did notice that, but only after sitting with my initial confusion for a while.

> I e.g. have a not-quite-done patch that creates a "template initdb", which
> pg_regress and tap tests automatically use (except if non-default options are
> required), which quite noticeably reduces test times (*).  But something needs 
> to
> create the template initdb, and initdb doesn't run without an installation, so
> we need to run it after the temp_install.
>
> Of course we could wrap that into one "test", but it seemed nicer to see if
> you fail during installation, or during initdb. So for that I added a separate
> test, that is also part of the setup suite.

But what are the chances that the setup / tmp_install "test" will
actually fail? It's almost a test in name only.

> Of course we could still name the suite tmp_install (or to be consistent with
> the confusing make naming, have a temp-install target, which creates the
> tmp_install directory :)). I guess that'd still be less confusing?

Yes, that definitely seems like an improvement. I don't care about the
tiny inconsistency that this creates.

I wonder if this could be addressed by adding another custom test
setup, like --setup running, used whenever you want to just run one or
two tests against an ad-hoc temporary installation? Offhand it seems
as if add_test_setup() could support that requirement?

-- 
Peter Geoghegan




Re: Time delayed LR (WAS Re: logical replication restrictions)

2023-02-09 Thread Kyotaro Horiguchi
Mmm. Part of my previous mail has gone missing for some uncertain
reason, replaced by mysterious blank lines...

At Fri, 10 Feb 2023 09:57:22 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> At Thu, 9 Feb 2023 13:48:52 +0530, Amit Kapila  
> wrote in 
> > On Thu, Feb 9, 2023 at 10:45 AM Kyotaro Horiguchi
> >  wrote:
> > >
> > > At Wed, 8 Feb 2023 09:03:03 +, "Hayato Kuroda (Fujitsu)" 
> > >  wrote in
> > > > Thank you for reviewing! PSA new version.
> > >
> > > +   if (statusinterval_ms > 0 && diffms > statusinterval_ms)
> > >
> > > The next expected feedback time is measured from the last status
> > > report.  Thus, it seems to me this may suppress feedbacks from being
> > > sent for an unexpectedly long time especially when min_apply_delay is
> > > shorter than wal_r_s_interval.
> > >
> > 
> > I think the minimum time before we send any feedback during the wait
> > is wal_r_s_interval. Now, I think if there is no transaction for a
> > long time before we get a new transaction, there should be keep-alive
> > messages in between which would allow us to send feedback at regular
> > intervals (wal_receiver_status_interval). So, I think we should be
> 
> Right.
> 
> > able to send feedback in less than 2 * wal_receiver_status_interval
> > unless wal_sender/receiver timeout is very large and there is a very
> > low volume of transactions. Now, we can try to send the feedback
> 
> We have suffered this kind of feedback silence many times. Thus I
> don't want to rely on luck here. I had in mind exposing last_send
> itself or providing an interval-calculation function to the logic.
> 
> > before we start waiting or maybe after every
> > wal_receiver_status_interval / 2 but I think that will lead to more
> > spurious feedback messages than we get the benefit from them.

Agreed. I think we don't want to do that.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [PATCH] Add pretty-printed XML output option

2023-02-09 Thread Peter Smith
On Thu, Feb 9, 2023 at 7:17 PM Jim Jones  wrote:
>
> On 09.02.23 08:23, Tom Lane wrote:
> > Um ... why are you using PG_TRY here at all?  It seems like
> > you have full control of the actually likely error cases.
> > The only plausible error out of the StringInfo calls is OOM,
> > and you probably don't want to trap that at all.
>
> My intention was to catch any unexpected error from
> xmlDocDumpFormatMemory and handle it properly. But I guess you're right,
> I can control the likely error cases by checking doc and nbytes.
>
> You suggest something along these lines?
>
>  xmlDocPtr  doc;
>  xmlChar    *xmlbuf = NULL;
>  text   *arg = PG_GETARG_TEXT_PP(0);
>  StringInfoData buf;
>  int nbytes;
>
>  doc = xml_parse(arg, XMLOPTION_DOCUMENT, false,
> GetDatabaseEncoding(), NULL);
>
>  if(!doc)
>  elog(ERROR, "could not parse the given XML document");
>
>  xmlDocDumpFormatMemory(doc, &xmlbuf, &nbytes, 1);
>
>  xmlFreeDoc(doc);
>
>  if(!nbytes)
>  elog(ERROR, "could not indent the given XML document");
>
>  initStringInfo(&buf);
>  appendStringInfoString(&buf, (const char *)xmlbuf);
>
>  xmlFree(xmlbuf);
>
>  PG_RETURN_XML_P(stringinfo_to_xmltype(&buf));
>
>
> Thanks!
>

Something like that LGTM, but here are some other minor comments.
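
Before those, FWIW, here is a slightly tightened variant of the
dump/check step (a sketch only; it additionally guards against
xmlDocDumpFormatMemory() handing back a NULL buffer):

    xmlDocDumpFormatMemory(doc, &xmlbuf, &nbytes, 1);
    xmlFreeDoc(doc);

    /* both a NULL buffer and a zero length indicate failure */
    if (xmlbuf == NULL || nbytes == 0)
    {
        if (xmlbuf)
            xmlFree(xmlbuf);
        elog(ERROR, "could not indent the given XML document");
    }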

==

1.
FYI, there are some whitespace warnings when applying the v5 patch:

[postgres@CentOS7-x64 oss_postgres_misc]$ git apply
../patches_misc/v5-0001-Add-pretty-printed-XML-output-option.patch
../patches_misc/v5-0001-Add-pretty-printed-XML-output-option.patch:26:
trailing whitespace.

../patches_misc/v5-0001-Add-pretty-printed-XML-output-option.patch:29:
trailing whitespace.

../patches_misc/v5-0001-Add-pretty-printed-XML-output-option.patch:33:
trailing whitespace.

../patches_misc/v5-0001-Add-pretty-printed-XML-output-option.patch:37:
trailing whitespace.

../patches_misc/v5-0001-Add-pretty-printed-XML-output-option.patch:41:
trailing whitespace.

warning: squelched 48 whitespace errors
warning: 53 lines add whitespace errors.

==
src/test/regress/sql/xml.sql

2.
The v5 patch was already testing NULL, but it might be good to add
more tests to verify the function behaves as you want for edge
cases. For example,

+-- XML pretty print: NULL, empty string, spaces only...
SELECT xmlpretty(NULL);
SELECT xmlpretty('');
SELECT xmlpretty(' ');

~~

3.
The function is returning XML anyway, so is the '::xml' casting in
these tests necessary?

e.g.
SELECT xmlpretty(NULL)::xml; --> SELECT xmlpretty(NULL);

==
src/include/catalog/pg_proc.dat

4.

+  { oid => '4642', descr => 'Indented text from xml',
+  proname => 'xmlpretty', prorettype => 'xml',
+  proargtypes => 'xml', prosrc => 'xmlpretty' },

Spurious leading space for this new entry.

==
doc/src/sgml/func.sgml

5.
+ 
+   
+ 42
+   
+ 
+
+(1 row)
+
+]]>

There is a spurious blank line in the example after the "(1 row)".

~~~

6.
Does this function's documentation belong in section 9.15.1 "Producing
XML Content"? Or (since it is not really producing content) should it
be moved to the 9.15.3 "Processing XML" section?

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Minor meson gripe

2023-02-09 Thread Andres Freund
Hi,

On 2023-02-09 17:00:48 -0800, Peter Geoghegan wrote:
> On Thu, Feb 9, 2023 at 4:33 PM Andres Freund  wrote:
> > I e.g. have a not-quite-done patch that creates a "template initdb", which
> > pg_regress and TAP tests automatically use (except if non-default options
> > are required), which quite noticeably reduces test times (*).  But
> > something needs to create the template initdb, and initdb doesn't run
> > without an installation, so we need to run it after the temp_install.
> >
> > Of course we could wrap that into one "test", but it seemed nicer to see if
> > you fail during installation, or during initdb. So for that I added a
> > separate test that is also part of the setup suite.
> 
> But what are the chances that the setup / tmp_install "test" will
> actually fail? It's almost a test in name only.

I've seen more failures than I'd like. Permission errors, conflicting names,
and similar things. But mainly that was a reference to running initdb, which
I've broken temporarily many a time.


> > Of course we could still name the suite tmp_install (or to be consistent
> > with the confusing make naming, have a temp-install target, which creates
> > the tmp_install directory :)). I guess that'd still be less confusing?
> 
> Yes, that definitely seems like an improvement. I don't care about the
> tiny inconsistency that this creates.

Then let's do that - feel free to push something, or send something for
review. Otherwise I'll try to get to it, but I owe a few people work before
this...


> I wonder if this could be addressed by adding another custom test
> setup, like --setup running, used whenever you want to just run one or
> two tests against an ad-hoc temporary installation? Offhand it seems
> as if add_test_setup() could support that requirement?

What precisely do you mean by "ad-hoc temporary installation"?

I was wondering about adding a different setup that'd use the "real"
installation to run tests. But perhaps that's something different from what
you have in mind?

The only restriction I see wrt add_test_setup() is that it's not entirely
trivial to use a "runtime-variable" path to an installation.
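
For concreteness, roughly what such an extra setup could look like (the
environment variable name is hypothetical; something would still need to
teach pg_regress and the TAP infrastructure to honor it):

    # in the top-level meson.build
    add_test_setup('running',
      env: {'PG_TEST_USE_EXISTING_INSTALL': '1'},
      exclude_suites: ['setup'],
    )

Excluding the 'setup' suite is what would let it skip tmp_install
entirely.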

Greetings,

Andres Freund




Re: ICU locale validation / canonicalization

2023-02-09 Thread Jeff Davis
On Fri, 2023-02-10 at 01:04 +0100, Andreas Karlsson wrote:
> I have the same feeling: one is readable and the other unreadable, but
> the unreadable one is standardized. Hard call.
> 
> And in general I agree: if we are going to make ICU the default, it
> needs to be more user-friendly than it is now. Currently there is no
> nice way to tell whether you entered the right locale or made a typo
> in the BCP 47 syntax.

We will still allow the ICU-format locale IDs for input; we would just
convert them to BCP 47 before storing them in the catalog. And there's
an inverse function, so it's easy enough to offer a view that shows the
ICU-format locale IDs in addition to the BCP 47 tags.
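
For concreteness, the conversion amounts to what ICU's
uloc_toLanguageTag() already does (a sketch, error handling elided):

    #include <unicode/uloc.h>

    char        langtag[ULOC_FULLNAME_CAPACITY];
    UErrorCode  status = U_ZERO_ERROR;

    /* ICU-format locale ID -> BCP 47 language tag */
    uloc_toLanguageTag("de@collation=phonebook", langtag, sizeof(langtag),
                       true /* strict */, &status);
    /* on success, langtag is "de-u-co-phonebk";
     * uloc_forLanguageTag() goes the other way */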

I don't think it's hugely important that we use BCP 47; ICU-format
locale IDs would also make sense. But I do think we should be
consistent to simplify things where we can -- collator versioning is
hard enough without wondering how a user-entered string will be
interpreted. And if we're going to be consistent, BCP 47 seems like the
most obvious choice.

-- 
Jeff Davis
PostgreSQL Contributor Team - AWS





Re: tests against running server occasionally fail, postgres_fdw & tenk1

2023-02-09 Thread Peter Geoghegan
On Wed, Feb 8, 2023 at 7:18 PM Andres Freund  wrote:
> > This is a good thing for performance, of course, but it also makes VACUUM
> > VERBOSE show information that makes sense to users, since things actually
> > happen in a way that makes a lot more sense. I'm quite happy about the fact
> > that the new VACUUM VERBOSE allows users to mostly ignore obscure details
> > like whether an index was scanned by amvacuumcleanup() or by ambulkdelete()
> > -- stuff that basically nobody understands. That seems worth preserving.
>
> I don't mind making the messages as similar as possible, but I do mind if I,
> as a postgres hacker or an expert consultant, can't parse that detail out. We
> need to be able to debug things like amvacuumcleanup() doing too much work too
> often.

FWIW you can tell even today. You can observe that the number of index
scans is 0, and that one or more indexes have their size reported --
that indicates that an amvacuumcleanup()-only scan took place, say
because we needed to put some preexisting deleted pages in the FSM.

There is also another detail that strongly hints that VACUUM VERBOSE
had to scan an index during its call to amvacuumcleanup(), which is
atypical: it only shows details for that particular index, which is
really noticeable. It won't report anything about those indexes that
had no-op calls to amvacuumcleanup().

It kind of makes sense that we report 0 index scans when there were 0
calls to ambulkdelete(), even though there may still have been some
index scans during a call to some amvacuumcleanup() routine. The
common case is that they're no-op calls for every index, but even when
they're not, there are still probably only one or two indexes that have
to do a noticeable amount of I/O. It makes sense to "round down to 0".

Granted, there are some notable exceptions. For example,
gistvacuumcleanup() doesn't even try to avoid scanning the index. But
that's really a problem in gistvacuumcleanup() -- since it really
doesn't make very much sense, even from a GiST point of view. It can
follow exactly the same approach as B-Tree here, since its approach to
page deletion is already directly based on nbtree.

-- 
Peter Geoghegan



