Re: pgbench - rework variable management

2020-09-19 Thread Fabien COELHO


Hello Michaël,


https://www.postgresql.org/message-id/CAMN686ExUKturcWp4POaaVz3gR3hauSGBjOCd0E-Jh1zEXqf_Q%40mail.gmail.com


Since then, the patch is failing to apply.  As this got zero activity
for the last six months, I am marking the entry as returned with
feedback in the CF app.


Hmmm… I did not notice that it no longer applies. I do not have much time
to contribute this round, and probably not the next one either, so that is
fine with me.


--
Fabien.

Re: [PATCH] Add features to pg_stat_statements

2020-09-19 Thread legrand legrand
+1 !

Another way is to log evictions; it provides information about when they
happen and how much is evicted:

for (i = 0; i < nvictims; i++)
{
hash_search(pgssp_hash, &entries[i]->key, HASH_REMOVE, NULL);
}

pfree(entries);

/*
 * Trace when evicting entries; if this happens too often it can slow
 * queries ... increasing the pg_stat_sql_plans.max value could help.
 */
ereport(LOG,
		(errmsg("pg_stat_sql_plans evicting %d entries", nvictims),
		 errhidecontext(true), errhidestmt(true)));



--
Sent from: https://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html




Re: Resetting spilled txn statistics in pg_stat_replication

2020-09-19 Thread Amit Kapila
On Tue, Sep 8, 2020 at 7:02 PM Amit Kapila  wrote:
>
> On Tue, Sep 8, 2020 at 7:53 AM Masahiko Sawada
>  wrote:

I have fixed these review comments in the attached patch.

>
> Comments on the latest patch:
> =
> 1.
> +CREATE VIEW pg_stat_replication_slots AS
> +SELECT
> +s.name,
> +s.spill_txns,
> +s.spill_count,
> +s.spill_bytes,
> + s.stats_reset
> +FROM pg_stat_get_replication_slots() AS s;
>
> You forgot to update the docs for the new parameter.
>

Updated the docs for this.

> 2.
> @@ -5187,6 +5305,12 @@ pgstat_read_statsfiles(Oid onlydb, bool
> permanent, bool deep)
>   for (i = 0; i < SLRU_NUM_ELEMENTS; i++)
>   slruStats[i].stat_reset_timestamp = globalStats.stat_reset_timestamp;
>
> + /*
> + * Set the same reset timestamp for all replication slots too.
> + */
> + for (i = 0; i < max_replication_slots; i++)
> + replSlotStats[i].stat_reset_timestamp = globalStats.stat_reset_timestamp;
> +
>
> I don't understand why you have removed the above code from the new
> version of the patch?
>

Added back.

> 3.
> pgstat_recv_resetreplslotcounter()
> {
> ..
> + ts = GetCurrentTimestamp();
> + for (i = 0; i < nReplSlotStats; i++)
> + {
> + /* reset entry with the given index, or all entries */
> + if (msg->clearall || idx == i)
> + {
> + /* reset only counters. Don't clear slot name */
> + replSlotStats[i].spill_txns = 0;
> + replSlotStats[i].spill_count = 0;
> + replSlotStats[i].spill_bytes = 0;
> + replSlotStats[i].stat_reset_timestamp = ts;
> + }
> + }
> ..
>
> I don't like this coding pattern as in the worst case we need to
> traverse all the slots to reset a particular slot. This could be okay
> for a fixed number of elements as we have in SLRU but here it appears
> quite inefficient. We can move the reset of stats part to a separate
> function and then invoke it from the place where we need to reset a
> particular slot and the above place.
>

Changed the code as per the above idea.
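A minimal sketch of that refactoring (the helper name and exact placement are
illustrative assumptions, not necessarily what the attached patch does):

static void
pgstat_reset_replslot_entry(int i, TimestampTz ts)
{
	/* reset only the counters; don't clear the slot name */
	replSlotStats[i].spill_txns = 0;
	replSlotStats[i].spill_count = 0;
	replSlotStats[i].spill_bytes = 0;
	replSlotStats[i].stat_reset_timestamp = ts;
}

/* in pgstat_recv_resetreplslotcounter(): reset one slot, or all of them */
ts = GetCurrentTimestamp();
if (msg->clearall)
{
	for (i = 0; i < nReplSlotStats; i++)
		pgstat_reset_replslot_entry(i, ts);
}
else
	pgstat_reset_replslot_entry(idx, ts);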

> 4.
> +pgstat_replslot_index(const char *name, bool create_it)
> {
> ..
> + replSlotStats[nReplSlotStats].stat_reset_timestamp = GetCurrentTimestamp();
> ..
> }
>
> Why do we need to set the reset timestamp on the creation of slot entry?
>

I don't think we need to show any time if the slot is never reset. Let
me know if there is any reason to show it.


> 5.
> @@ -3170,6 +3175,13 @@ ReorderBufferSerializeTXN(ReorderBuffer *rb,
> ReorderBufferTXN *txn)
>   spilled++;
>   }
>
> + /* update the statistics */
> + rb->spillCount += 1;
> + rb->spillBytes += size;
> +
> + /* Don't consider already serialized transactions. */
> + rb->spillTxns += rbtxn_is_serialized(txn) ? 0 : 1;
>
> We can't increment the spillTxns in the above way because now
> sometimes we do serialize before streaming and in that case we clear
> the serialized flag after streaming, see ReorderBufferTruncateTXN. So,
> the count can go wrong.
>

To fix this, I have added another flag which indicates whether we have ever
serialized the txn. I couldn't find a better way; do let me know if
you can think of a better way to address this comment.
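Roughly, the shape of that change is (the flag name below is an assumption;
the patch may use a different one):

	/* in ReorderBufferSerializeTXN(), after spilling the changes */
	rb->spillCount += 1;
	rb->spillBytes += size;

	/* count a transaction only the first time it is ever serialized */
	if (!(txn->txn_flags & RBTXN_HAS_SERIALIZED))	/* hypothetical flag bit */
	{
		rb->spillTxns += 1;
		txn->txn_flags |= RBTXN_HAS_SERIALIZED;
	}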

> Another problem is that currently the patch calls
> UpdateSpillStats only from begin_cb_wrapper, which means it won't
> consider streaming transactions (streaming transactions that might
> have spilled).
>

The other problem I see with updating in begin_cb_wrapper is that it
will ignore the spilling done for transactions that get aborted. To
fix both issues, I have updated the stats in DecodeCommit and
DecodeAbort.
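For illustration only, the reporting could look roughly like this (the
signature and body of UpdateSpillStats here are assumptions made to keep the
sketch readable, not the exact code in the attached patch):

static void
UpdateSpillStats(LogicalDecodingContext *ctx)
{
	ReorderBuffer *rb = ctx->reorder;

	/* send the accumulated spill counters for this slot to the collector */
	pgstat_report_replslot(NameStr(ctx->slot->data.name),
						   rb->spillTxns, rb->spillCount, rb->spillBytes);
}

/* ... called from DecodeCommit() and DecodeAbort() once decoding finishes */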


> 6.
> @@ -322,6 +321,9 @@ ReplicationSlotCreate(const char *name, bool db_specific,
>
>   /* Let everybody know we've modified this slot */
>   ConditionVariableBroadcast(&slot->active_cv);
> +
> + /* Create statistics entry for the new slot */
> + pgstat_report_replslot(NameStr(slot->data.name), 0, 0, 0);
>  }
> ..
> ..
> @@ -683,6 +685,18 @@ ReplicationSlotDropPtr(ReplicationSlot *slot)
>   ereport(WARNING,
>   (errmsg("could not remove directory \"%s\"", tmppath)));
>
> + /*
> + * Report to drop the replication slot to stats collector.  Since there
> + * is no guarantee the order of message arrival on an UDP connection,
> + * it's possible that a message for creating a new slot arrives before a
> + * message for removing the old slot.  We send the drop message while
> + * holding ReplicationSlotAllocationLock to reduce that possibility.
> + * If the messages arrived in reverse, we would lose one statistics update
> + * message.  But the next update message will create the statistics for
> + * the replication slot.
> + */
> + pgstat_report_replslot_drop(NameStr(slot->data.name));
> +
>
> Similar to drop message, why don't we send the create message while
> holding the ReplicationSlotAllocationLock?
>

Updated code to address this comment, basically moved the create
message under lock.
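Conceptually (a sketch only, not the exact hunk from the attached patch):

	/*
	 * in ReplicationSlotCreate(): send the create message while
	 * ReplicationSlotAllocationLock is still held, mirroring the drop case
	 */
	pgstat_report_replslot(NameStr(slot->data.name), 0, 0, 0);

	LWLockRelease(ReplicationSlotAllocationLock);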

Apart from the above,
(a) fixed one bug in ReorderBufferSerializeTXN() where we were
updating the stats even when we had not spilled anything.

VACUUM PARALLEL option vs. max_parallel_maintenance_workers

2020-09-19 Thread Peter Eisentraut
If I read the code correctly, the VACUUM PARALLEL option is capped by 
the active max_parallel_maintenance_workers setting.  So if I write 
VACUUM (PARALLEL 5), it will still only do 2 by default.  Is that 
correct?  The documentation (VACUUM man page) seems to indicate a 
different behavior.


I haven't been able to set up something to test or verify this either way.

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: VACUUM PARALLEL option vs. max_parallel_maintenance_workers

2020-09-19 Thread Amit Kapila
On Sat, Sep 19, 2020 at 1:58 PM Peter Eisentraut
 wrote:
>
> If I read the code correctly, the VACUUM PARALLEL option is capped by
> the active max_parallel_maintenance_workers setting.  So if I write
> VACUUM (PARALLEL 5), it will still only do 2 by default.  Is that
> correct?

Yeah, but there is also another factor: the number of indexes
that support a parallel vacuum operation.

>  The documentation (VACUUM man page) seems to indicate a
> different behavior.
>

I think we can change the documentation for the parallel option to explain
it better. How about: "Perform the index vacuum and index cleanup phases
of VACUUM in parallel using integer background workers (for the
details of each vacuum phase, please refer to Table 27.37). The number
of workers is determined based on the number of indexes on the
relation that support parallel vacuum operation; this is limited by the
number of workers specified with the PARALLEL option, if any, which is
further limited by max_parallel_maintenance_workers." instead of what
is currently there?
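In code terms, that rule amounts to roughly the following (illustrative names,
not the actual function in the vacuum code):

static int
parallel_vacuum_worker_count(int nindexes_parallel, int nrequested)
{
	/* start from the number of indexes that support parallel vacuum */
	int			nworkers = nindexes_parallel;

	/* cap by the PARALLEL option, if the user supplied one */
	if (nrequested > 0)
		nworkers = Min(nworkers, nrequested);

	/* ... and finally by max_parallel_maintenance_workers */
	return Min(nworkers, max_parallel_maintenance_workers);
}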

-- 
With Regards,
Amit Kapila.




Re: VACUUM PARALLEL option vs. max_parallel_maintenance_workers

2020-09-19 Thread Peter Eisentraut

On 2020-09-19 11:37, Amit Kapila wrote:

> I think we can change the documentation for the parallel option to explain
> it better. How about: "Perform the index vacuum and index cleanup phases
> of VACUUM in parallel using integer background workers (for the
> details of each vacuum phase, please refer to Table 27.37). The number
> of workers is determined based on the number of indexes on the
> relation that support parallel vacuum operation; this is limited by the
> number of workers specified with the PARALLEL option, if any, which is
> further limited by max_parallel_maintenance_workers." instead of what
> is currently there?


I think the implemented behavior is wrong.  The VACUUM PARALLEL option
should override the max_parallel_maintenance_workers setting.


Otherwise, what's the point of the command option?

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: VACUUM PARALLEL option vs. max_parallel_maintenance_workers

2020-09-19 Thread Amit Kapila
On Sat, Sep 19, 2020 at 4:28 PM Peter Eisentraut
 wrote:
>
> On 2020-09-19 11:37, Amit Kapila wrote:
> > I think we can change the documentation for the parallel option to explain
> > it better. How about: "Perform the index vacuum and index cleanup phases
> > of VACUUM in parallel using integer background workers (for the
> > details of each vacuum phase, please refer to Table 27.37). The number
> > of workers is determined based on the number of indexes on the
> > relation that support parallel vacuum operation; this is limited by the
> > number of workers specified with the PARALLEL option, if any, which is
> > further limited by max_parallel_maintenance_workers." instead of what
> > is currently there?
>
> I think the implemented behavior is wrong.
>

It is the same as what we do for other parallel operations; for
example, we limit the number of parallel workers for parallel create
index by 'max_parallel_maintenance_workers', and parallel scan
operations are limited by 'max_parallel_workers_per_gather'.

>  The VACUUM PARALLEL option
> should override the max_parallel_maintenance_workers setting.
>
> Otherwise, what's the point of the command option?
>

It is for the cases where the user has a better idea of the workload. We
can launch only a limited number of parallel workers
('max_parallel_workers') in the system, so sometimes users would like to
allocate them as per their requirements. If the user omits this option,
then we internally compute the required number of workers, but again
those are limited by the max_* GUCs.

-- 
With Regards,
Amit Kapila.




Re: XversionUpgrade tests broken by postfix operator removal

2020-09-19 Thread Andrew Dunstan


On 9/18/20 6:05 PM, Tom Lane wrote:
> I wrote:
>> Andrew Dunstan  writes:
>>> Yeah, probably worth doing. It's a small enough change and it's only in
>>> the test suite.
>> OK, I'll go take care of that in a bit.
> Done, you should be able to remove @#@ (NONE, bigint) from the
> kill list.
>
>   



Done.


crake tests pg_upgrade back to 9.2, so I had to mangle those static
repos for non-live branches like this:


-- rel 9.2 on

drop operator #%# (bigint, NONE);
CREATE OPERATOR #%# (
    PROCEDURE = factorial,
    LEFTARG = bigint
);


drop operator #@# (bigint, NONE);
CREATE OPERATOR #@# (
    PROCEDURE = factorial,
    LEFTARG = bigint
);

drop operator @#@ (NONE, bigint);
CREATE OPERATOR @#@ (
    PROCEDURE = factorial,
    RIGHTARG = bigint
);

-- rel 9.4
drop operator #@%# (bigint, NONE);
CREATE OPERATOR public.#@%# (
    PROCEDURE = factorial,
    LEFTARG = bigint
);


cheers


andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: recovering from "found xmin ... from before relfrozenxid ..."

2020-09-19 Thread Amit Kapila
On Sat, Sep 19, 2020 at 4:03 AM Tom Lane  wrote:
>
> Robert Haas  writes:
> > Cool, thanks to both of you for looking. Committed.
>
> Hmph, according to whelk this is worse not better:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=whelk&dt=2020-09-18%2017%3A42%3A11
>
> I'm at a loss to understand what's going on there.
>

Our assumption that changing the tests to have temp tables
would make them safe w.r.t. concurrent activity doesn't seem to be
correct. We do set OldestXmin for temp tables aggressively enough that
it allows us to remove all dead tuples, but the test case behavior depends
on whether we are able to prune the chain. AFAICS, we are using
different cut-offs in heap_page_prune when it is called via
lazy_scan_heap. So that seems to be causing both the failures.

-- 
With Regards,
Amit Kapila.




Re: should INSERT SELECT use a BulkInsertState?

2020-09-19 Thread Justin Pryzby
On Sun, Jul 12, 2020 at 08:57:00PM -0500, Justin Pryzby wrote:
> On Thu, Jun 04, 2020 at 10:30:47AM -0700, Andres Freund wrote:
> > On 2020-05-08 02:25:45 -0500, Justin Pryzby wrote:
> > > Seems to me it should, at least conditionally.  At least if there's a 
> > > function
> > > scan or a relation or ..
> > 
> > Well, the problem is that this can cause very very significant
> > regressions. As in 10x slower or more. The ringbuffer can cause constant
> > XLogFlush() calls (due to the lsn interlock), and the eviction from
> > shared_buffers (regardless of what's actually available) will mean future vacuums
> > etc will be much slower.  I think this is likely to cause pretty
> > widespread regressions on upgrades.
> > 
> > Now, it sucks that we have this problem in the general facility that's
> > supposed to be used for this kind of bulk operation. But I don't really
> > see it realistic as expanding use of bulk insert strategies unless we
> > have some more fundamental fixes.
> 
> I made this conditional on BEGIN BULK/SET bulk, so I'll solicit comments on 
> that.

@cfbot: rebased
>From acfc6ef7b84a6753a49b7f4c9d5b77a0abbfd37c Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 8 May 2020 02:17:32 -0500
Subject: [PATCH v3] Allow INSERT SELECT to use a BulkInsertState

---
 src/backend/executor/nodeModifyTable.c | 23 +--
 src/backend/parser/gram.y  |  7 ++-
 src/backend/tcop/utility.c |  4 
 src/backend/utils/misc/guc.c   | 12 +++-
 src/include/executor/nodeModifyTable.h |  2 ++
 src/include/nodes/execnodes.h  |  2 ++
 src/include/parser/kwlist.h|  1 +
 7 files changed, 47 insertions(+), 4 deletions(-)

diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 9812089161..464ad5e346 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -75,6 +75,8 @@ static ResultRelInfo *getTargetResultRelInfo(ModifyTableState *node);
 static void ExecSetupChildParentMapForSubplan(ModifyTableState *mtstate);
 static TupleConversionMap *tupconv_map_for_subplan(ModifyTableState *node,
    int whichplan);
+/* guc */
+bool insert_in_bulk = false;
 
 /*
  * Verify that the tuples to be produced by INSERT or UPDATE match the
@@ -578,7 +580,7 @@ ExecInsert(ModifyTableState *mtstate,
 			table_tuple_insert_speculative(resultRelationDesc, slot,
 		   estate->es_output_cid,
 		   0,
-		   NULL,
+		   NULL, /* Bulk insert not supported */
 		   specToken);
 
 			/* insert index entries for tuple */
@@ -617,7 +619,7 @@ ExecInsert(ModifyTableState *mtstate,
 			/* insert the tuple normally */
 			table_tuple_insert(resultRelationDesc, slot,
 			   estate->es_output_cid,
-			   0, NULL);
+			   0, mtstate->bistate);
 
 			/* insert index entries for tuple */
 			if (resultRelInfo->ri_NumIndices > 0)
@@ -2332,6 +2334,17 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
 
 	mtstate->mt_arowmarks = (List **) palloc0(sizeof(List *) * nplans);
 	mtstate->mt_nplans = nplans;
+	mtstate->bistate = NULL;
+	if (operation == CMD_INSERT &&
+			node->onConflictAction != ONCONFLICT_UPDATE &&
+			node->rootResultRelIndex < 0)
+	{
+		// Plan *p = linitial(node->plans);
+		Assert(nplans == 1);
+
+		if (insert_in_bulk)
+			mtstate->bistate = GetBulkInsertState();
+	}
 
 	/* set up epqstate with dummy subplan data for the moment */
 	EvalPlanQualInit(&mtstate->mt_epqstate, estate, NULL, NIL, node->epqParam);
@@ -2776,6 +2789,12 @@ ExecEndModifyTable(ModifyTableState *node)
 		   resultRelInfo);
 	}
 
+	if (node->bistate)
+	{
+		FreeBulkInsertState(node->bistate);
+		table_finish_bulk_insert((getTargetResultRelInfo(node))->ri_RelationDesc, 0);
+	}
+
 	/*
 	 * Close all the partitioned tables, leaf partitions, and their indices
 	 * and release the slot used for tuple routing, if set.
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 017940bdcd..0bc2108a2b 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -631,7 +631,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 	ASSERTION ASSIGNMENT ASYMMETRIC AT ATTACH ATTRIBUTE AUTHORIZATION
 
 	BACKWARD BEFORE BEGIN_P BETWEEN BIGINT BINARY BIT
-	BOOLEAN_P BOTH BY
+	BOOLEAN_P BOTH BY BULK
 
 	CACHE CALL CALLED CASCADE CASCADED CASE CAST CATALOG_P CHAIN CHAR_P
 	CHARACTER CHARACTERISTICS CHECK CHECKPOINT CLASS CLOSE
@@ -9846,6 +9846,9 @@ transaction_mode_item:
 			| NOT DEFERRABLE
 	{ $$ = makeDefElem("transaction_deferrable",
 	   makeIntConst(false, @1), @1); }
+			| BULK
+	{ $$ = makeDefElem("bulk",
+	   makeIntConst(true, @1), @1); }
 		;
 
 /* Syntax with commas is SQL-spec, without commas is Postgres historical */
@@ -15052,6 +15055,7 @@ unreserved_keyword:
 			| BACKWARD
 			| BEFORE
 			| BEGIN_P
+			| BULK
 			| BY
 			| CACHE
 			| CALL
@@ -

Re: XversionUpgrade tests broken by postfix operator removal

2020-09-19 Thread Tom Lane
Andrew Dunstan  writes:
> On 9/18/20 6:05 PM, Tom Lane wrote:
>> Done, you should be able to remove @#@ (NONE, bigint) from the
>> kill list.

> crake tests pg_upgrade back to 9.2, so I had to mangle those static
> repos for non-live branches like this:

Oh, hm.  Now that you mention that, I see snapper is testing 9.3
and 9.4 upgrades too.

It seems like we have these non-kluge fixes available:

1. Backpatch 9ab5ed419 as far as 9.2.  It'd be unusual for us to patch
out-of-support branches, but it's certainly not without precedent.

2. Give up testing these upgrade scenarios.  Don't like this much;
even though these branches are out-of-support, they still seem like
credible upgrade scenarios in the field.

3. Put back the extra DROP in TestUpgradeXversion.pm.  The main
complaint about this is we'd then have no test of pg_upgrade'ing
prefix operators, at least not in these older branches (there is
another such operator in v10 and later).  That doesn't seem like
a huge deal really, since the same-version pg_upgrade test should
cover it pretty well.  Still, it's annoying.

4. Undo the elimination of numeric_fac() in HEAD.  I find this only
sort of not-a-kluge, but it's a reasonable alternative.

In the last two cases we might as well revert 9ab5ed419.

Personally I think #1 or #3 are the best answers, but I'm having
a hard time choosing between them.

regards, tom lane




Re: recovering from "found xmin ... from before relfrozenxid ..."

2020-09-19 Thread Tom Lane
Amit Kapila  writes:
> Our assumption that changing the tests to have temp tables
> would make them safe w.r.t. concurrent activity doesn't seem to be
> correct. We do set OldestXmin for temp tables aggressively enough that
> it allows us to remove all dead tuples, but the test case behavior depends
> on whether we are able to prune the chain. AFAICS, we are using
> different cut-offs in heap_page_prune when it is called via
> lazy_scan_heap. So that seems to be causing both the failures.

Hm, reasonable theory.

I was able to partially reproduce whelk's failure here.  I got a
couple of cases of "cannot freeze committed xmax", which then leads
to the second NOTICE diff; but I couldn't reproduce the first
NOTICE diff.  That was out of about a thousand tries :-( so it's not
looking like a promising thing to reproduce without modifying the test.

I wonder whether "cannot freeze committed xmax" doesn't represent an
actual bug, ie is a7212be8b setting the cutoff *too* aggressively?
But if so, why's it so hard to reproduce?

regards, tom lane




Re: please update ps display for recovery checkpoint

2020-09-19 Thread Justin Pryzby
On Thu, Sep 10, 2020 at 01:37:10PM +0900, Michael Paquier wrote:
> On Wed, Sep 09, 2020 at 09:00:50PM -0500, Justin Pryzby wrote:
> > What would you want the checkpointer's ps to say ?
> > 
> > Normally it just says:
> > postgres  3468  3151  0 Aug27 ?00:20:57 postgres: checkpointer  
> >   
> 
> Note that CreateCheckPoint() can also be called from the startup
> process if the bgwriter has not been launched once recovery finishes.
> 
> > Or do you mean do the same thing as now, but one layer lower, like:
> >
> > @@ -8728,6 +8725,9 @@ CreateCheckPoint(int flags)
> > +   if (flags & CHECKPOINT_END_OF_RECOVERY)
> > +   set_ps_display("recovery checkpoint");
> 
> For the use-case discussed here, that would be fine.  Now the
> difficult point is how much information we can actually display
> without bloating ps while still have something meaningful.  Showing
> all the information from LogCheckpointStart() would bloat the output a
> lot for example.  So, thinking about that, my take would be to have ps
> display the following at the beginning of CreateCheckpoint() and
> CreateRestartPoint():
> - restartpoint or checkpoint
> - shutdown
> - end-of-recovery
> 
> The output also needs to be cleared once the routines finish or if
> there is a skip, of course.

In my initial request, I *only* care about the startup process' recovery
checkpoint.  AFAIK, this exits when it's done, so there may be no need to
"revert" to the previous "ps".  However, one could argue that it's currently a
bug that the "recovering NNN" portion isn't updated after finishing the WAL
files.

StartupXLOG -> xlogreader -> XLogPageRead -> WaitForWALToBecomeAvailable -> 
XLogFileReadAnyTLI -> XLogFileRead
  -> CreateCheckPoint

Maybe it's a bad idea if the checkpointer is continuously changing its display.
I don't see the utility in it, since log_checkpoints does more than ps could
ever do.  I'm concerned that would break things for someone using something
like pgrep.
|$ ps -wwf `pgrep -f 'checkpointer *$'`
|UIDPID  PPID  C STIME TTY  STAT   TIME CMD
|postgres  9434  9418  0 Aug20 ?Ss   214:25 postgres: checkpointer   

|pryzbyj  23010 23007  0 10:33 ?00:00:00 postgres: checkpointer 
checkpoint

I think this one is by far the most common, but somewhat confusing, since it's
only one word.  This led me to put parentheses around it:

|pryzbyj  26810 26809 82 10:53 ?00:00:12 postgres: startup 
(end-of-recovery checkpoint)

Related: I have always thought that this message meant "recovery will complete
Real Soon", but I now understand it to mean "beginning the recovery checkpoint,
which is flagged CHECKPOINT_IMMEDIATE" (and may take a long time).

|2020-09-19 10:53:26.345 CDT [26810] LOG:  checkpoint starting: end-of-recovery 
immediate

-- 
Justin
>From d1d473beb4b59a93ceb7859f4776da47d9381aee Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 19 Sep 2020 10:12:53 -0500
Subject: [PATCH v2 1/4] Clear PS display after recovery

...so it doesn't continue to say 'recovering ...' during checkpoint
---
 src/backend/access/transam/xlog.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 61754312e2..0183cae9a9 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7396,6 +7396,7 @@ StartupXLOG(void)
 			/*
 			 * end of main redo apply loop
 			 */
+			set_ps_display("");
 
 			if (reachedRecoveryTarget)
 			{
-- 
2.17.0

>From 0e0bbeb6501f5b1a7a5f0e4109dbf8a8152de249 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 8 Feb 2020 09:16:14 -0600
Subject: [PATCH v2 2/4] Update PS display following replay of last xlog..

..otherwise it shows "recovering " for the duration of the recovery
checkpoint.
---
 src/backend/access/transam/xlog.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 0183cae9a9..3d8220ba78 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -8892,6 +8892,13 @@ CreateCheckPoint(int flags)
 	if (log_checkpoints)
 		LogCheckpointStart(flags, false);
 
+	if (flags & CHECKPOINT_END_OF_RECOVERY)
+		set_ps_display("(end-of-recovery checkpoint)");
+	else if (flags & CHECKPOINT_IS_SHUTDOWN)
+		set_ps_display("(shutdown checkpoint)");
+	else
+		set_ps_display("(checkpoint)");
+
 	TRACE_POSTGRESQL_CHECKPOINT_START(flags);
 
 	/*
@@ -9106,6 +9113,7 @@ CreateCheckPoint(int flags)
 
 	/* Real work is done, but log and update stats before releasing lock. */
 	LogCheckpointEnd(false);
+	set_ps_display("");
 
 	TRACE_POSTGRESQL_CHECKPOINT_DONE(CheckpointStats.ckpt_bufs_written,
 	 NBuffers,
@@ -9349,6 +9357,11 @@ CreateRestartPoint(int flags)
 	if (log_checkpoints)
 		LogCheckpointStart(flags, true);
 
+	if (flags & CHECKPOINT_IS_SHUTDOWN)
+		set_ps_display("(shutdown rest

Re: XversionUpgrade tests broken by postfix operator removal

2020-09-19 Thread Andrew Dunstan


On 9/19/20 10:43 AM, Tom Lane wrote:
> Andrew Dunstan  writes:
>> On 9/18/20 6:05 PM, Tom Lane wrote:
>>> Done, you should be able to remove @#@ (NONE, bigint) from the
>>> kill list.
>> crake tests pg_upgrade back to 9.2, so I had to mangle those static
>> repos for non-live branches like this:
> Oh, hm.  Now that you mention that, I see snapper is testing 9.3
> and 9.4 upgrades too.
>
> It seems like we have these non-kluge fixes available:
>
> 1. Backpatch 9ab5ed419 as far as 9.2.  It'd be unusual for us to patch
> out-of-support branches, but it's certainly not without precedent.
>
> 2. Give up testing these upgrade scenarios.  Don't like this much;
> even though these branches are out-of-support, they still seem like
> credible upgrade scenarios in the field.
>
> 3. Put back the extra DROP in TestUpgradeXversion.pm.  The main
> complaint about this is we'd then have no test of pg_upgrade'ing
> prefix operators, at least not in these older branches (there is
> another such operator in v10 and later).  That doesn't seem like
> a huge deal really, since the same-version pg_upgrade test should
> cover it pretty well.  Still, it's annoying.
>
> 4. Undo the elimination of numeric_fac() in HEAD.  I find this only
> sort of not-a-kluge, but it's a reasonable alternative.
>
> In the last two cases we might as well revert 9ab5ed419.
>
> Personally I think #1 or #3 are the best answers, but I'm having
> a hard time choosing between them.


Here's how cross version upgrade testing works. It uses a cached version of the 
binaries and data directory. The cache is only refreshed if there's a buildfarm 
run on that branch. If not, the cached version will just sit there till kingdom 
come. So all this should normally need for the non-live branches is a one-off 
adjustment in the cached version of the regression database along the lines I 
have indicated. My cached versions of 9.2 and 9.3 are two years old.

There are two things you need to do to enable fixing this as I suggested:


  * create the socket directory set in the postgresql.conf
  * set LD_LIBRARY_PATH to point to the lib directory

Then, after starting it up using pg_ctl, you can do something like:

bin/psql -h /tmp/buildfarm-xx -U buildfarm regression

and run the updates.

As I say this should be a one-off operation.

Of course, if you rebuild the cached version then you would need to
repeat the operation. If we think that's too onerous then we could
backpatch these changes, presumably all the way back to 8.4 in case
anyone wants to test back that far.

But another alternative would be to have the buildfarm module run (on
versions older than 9.5):

drop operator @#@ (NONE, bigint);
CREATE OPERATOR @#@ (
    PROCEDURE = factorial,
    RIGHTARG = bigint
);

On reflection I think that's probably the simplest solution. It will avoid any 
surprises if the cached version is rebuilt, and at the same time preserve 
testing the prefix operator.

cheers

andrew


-- 

Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: XversionUpgrade tests broken by postfix operator removal

2020-09-19 Thread Tom Lane
Andrew Dunstan  writes:
> Here's how cross version upgrade testing works. It uses a cached version of 
> the binaries and data directory. The cache is only refreshed if there's a 
> buildfarm run on that branch. If not, the cached version will just sit there 
> till kingdom come. So all this should normally need for the non-live branches 
> is a one-off adjustment in the cached version of the regression database 
> along the lines I have indicated. My cached versions of 9.2 and 9.3 are two 
> years old.

Hmm, okay, so patching this on gitmaster wouldn't help anyway.

> But another alternative would be to have the buildfarm module run (on
> versions older than 9.5):

> drop operator @#@ (NONE, bigint);
> CREATE OPERATOR @#@ (
>     PROCEDURE = factorial,
>     RIGHTARG = bigint
> );

> On reflection I think that's probably the simplest solution. It will avoid 
> any surprises if the cached version is rebuilt, and at the same time preserve 
> testing the prefix operator.

Works for me.

regards, tom lane




Re: speed up unicode normalization quick check

2020-09-19 Thread Mark Dilger



> On Sep 18, 2020, at 9:41 AM, John Naylor  wrote:
> 
> Attached is version 4, which excludes the output file from pgindent,
> to match recent commit 74d4608f5. Since it won't be indented again, I
> also tweaked the generator script to match pgindent for the typedef,
> since we don't want to lose what pgindent has fixed already. This last
> part isn't new to v4, but I thought I'd highlight it anyway.
> 
> --
> John Naylor                   https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
> 

0002 and 0003 look good to me.  I like the way you cleaned up a bit with the 
unicode_norm_props struct, which makes the code a bit more tidy, and on my 
compiler under -O2 it does not generate any extra runtime dereferences, as the 
compiler can see through the struct just fine.  My only concern would be if 
some other compilers might not see through the struct, resulting in a runtime 
performance cost?  I wouldn't even ask, except that qc_hash_lookup is called in 
a fairly tight loop.

To clarify, the following changes to the generated code which remove the struct 
and corresponding dereferences (not intended as a patch submission) cause zero 
bytes of change in the compiled output for me on mac/clang, which is good, and 
generate inconsequential changes on linux/gcc, which is also good, but I wonder 
if that is true for all compilers.  In your commit message for 0001 you 
mentioned testing on a multiplicity of compilers.  Did you do that for 0002 and 
0003 as well?

diff --git a/src/common/unicode_norm.c b/src/common/unicode_norm.c
index 1714837e64..976b96e332 100644
--- a/src/common/unicode_norm.c
+++ b/src/common/unicode_norm.c
@@ -476,8 +476,11 @@ qc_compare(const void *p1, const void *p2)
return (v1 - v2);
 }
 
-static const pg_unicode_normprops *
-qc_hash_lookup(pg_wchar ch, const unicode_norm_info * norminfo)
+static inline const pg_unicode_normprops *
+qc_hash_lookup(pg_wchar ch,
+  const pg_unicode_normprops *normprops,
+  qc_hash_func hash,
+  int num_normprops)
 {
int h;
uint32  hashkey;
@@ -487,21 +490,21 @@ qc_hash_lookup(pg_wchar ch, const unicode_norm_info * 
norminfo)
 * in network order.
 */
hashkey = htonl(ch);
-   h = norminfo->hash(&hashkey);
+   h = hash(&hashkey);
 
/* An out-of-range result implies no match */
-   if (h < 0 || h >= norminfo->num_normprops)
+   if (h < 0 || h >= num_normprops)
return NULL;
 
/*
 * Since it's a perfect hash, we need only match to the specific 
codepoint
 * it identifies.
 */
-   if (ch != norminfo->normprops[h].codepoint)
+   if (ch != normprops[h].codepoint)
return NULL;
 
/* Success! */
-   return &norminfo->normprops[h];
+   return &normprops[h];
 }
 
 /*
@@ -518,7 +521,10 @@ qc_is_allowed(UnicodeNormalizationForm form, pg_wchar ch)
switch (form)
{
case UNICODE_NFC:
-   found = qc_hash_lookup(ch, &UnicodeNormInfo_NFC_QC);
+   found = qc_hash_lookup(ch,
+  
UnicodeNormProps_NFC_QC,
+  
NFC_QC_hash_func,
+  
NFC_QC_num_normprops);
break;
case UNICODE_NFKC:
found = bsearch(&key,
diff --git a/src/include/common/unicode_normprops_table.h 
b/src/include/common/unicode_normprops_table.h
index 5e1e382af5..38300cfa12 100644
--- a/src/include/common/unicode_normprops_table.h
+++ b/src/include/common/unicode_normprops_table.h
@@ -13,13 +13,6 @@ typedef struct
signed int  quickcheck:4;   /* really UnicodeNormalizationQC */
 } pg_unicode_normprops;
 
-typedef struct
-{
-   const pg_unicode_normprops *normprops;
-   qc_hash_func hash;
-   int num_normprops;
-} unicode_norm_info;
-
 static const pg_unicode_normprops UnicodeNormProps_NFC_QC[] = {
{0x0300, UNICODE_NORM_QC_MAYBE},
{0x0301, UNICODE_NORM_QC_MAYBE},
@@ -1583,12 +1576,6 @@ NFC_QC_hash_func(const void *key)
return h[a % 2463] + h[b % 2463];
 }
 
-static const unicode_norm_info UnicodeNormInfo_NFC_QC = {
-   UnicodeNormProps_NFC_QC,
-   NFC_QC_hash_func,
-   1231
-};
-
 static const pg_unicode_normprops UnicodeNormProps_NFKC_QC[] = {
{0x00A0, UNICODE_NORM_QC_NO},
{0x00A8, UNICODE_NORM_QC_NO},


--
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Probable documentation errors or improvements

2020-09-19 Thread Justin Pryzby
On Thu, Sep 10, 2020 at 12:19:55PM -0700, Yaroslav wrote:
> Disclaimer: I'm not a native speaker, so not sure those are actually
> incorrect, and can't offer non-trivial suggestions.

https://www.postgresql.org/message-id/flat/CAKFQuwZh3k_CX2-%2BNcZ%3DFZss4bX6ASxDFEXJTY6u4wTH%2BG8%2BKA%40mail.gmail.com#9f9eba0cbbf9b57455503537575f5339

Most of these appear to be reasonable corrections or improvements.

Would you want to submit a patch against the master branch?
It'll be easier for people to read when it's in a consistent format,
and less work to read it than to write it.

I suggest first handling the 10+ changes which are most important and most in
need of fixing.  After a couple of rounds, see if what's left is worth patching.

-- 
Justin




Re: doc review for v13

2020-09-19 Thread Justin Pryzby
On Thu, Sep 10, 2020 at 03:58:31PM +0900, Michael Paquier wrote:
> On Wed, Sep 09, 2020 at 09:37:42AM -0500, Justin Pryzby wrote:
> > I've added a few more.
> 
> I have done an extra round of review on this patch series, and applied
> what looked obvious to me (basically the points already discussed
> upthread).  Some parts applied down to 9.6 for the docs.

Thanks.  Here's the remainder, with some new ones.

-- 
Justin
>From ff7662fb2e68257bcffd9f0281220b5d3ab9dfbc Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 30 Mar 2020 19:43:22 -0500
Subject: [PATCH v9 01/15] doc: btree deduplication

commit 0d861bbb702f8aa05c2a4e3f1650e7e8df8c8c27
Author: Peter Geoghegan 
---
 doc/src/sgml/btree.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/btree.sgml b/doc/src/sgml/btree.sgml
index 20cabe921f..bb395e6a85 100644
--- a/doc/src/sgml/btree.sgml
+++ b/doc/src/sgml/btree.sgml
@@ -642,7 +642,7 @@ options(relopts local_relopts *) returns
   
   
Deduplication works by periodically merging groups of duplicate
-   tuples together, forming a single posting list tuple for each
+   tuples together, forming a single posting list tuple for each
group.  The column key value(s) only appear once in this
representation.  This is followed by a sorted array of
TIDs that point to rows in the table.  This
-- 
2.17.0

>From 9be911f869805b2862154d0170fa89fbd6be9e10 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sun, 29 Mar 2020 21:22:43 -0500
Subject: [PATCH v9 02/15] doc: pg_stat_progress_basebackup

commit e65497df8f85ab9b9084c928ff69f384ea729b24
Author: Fujii Masao 
---
 doc/src/sgml/monitoring.sgml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 673a0e73e4..4e0193a967 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6089,8 +6089,8 @@ SELECT pg_stat_get_backend_pid(s.backendid) AS pid,
   waiting for checkpoint to finish
   
The WAL sender process is currently performing
-   pg_start_backup to set up for
-   taking a base backup, and waiting for backup start
+   pg_start_backup to prepare to
+   take a base backup, and waiting for the start-of-backup
checkpoint to finish.
   
  
-- 
2.17.0

>From 6500196ffbcc735eccb7623e728788aa3f5494bc Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Tue, 7 Apr 2020 19:56:56 -0500
Subject: [PATCH v9 03/15] doc: Allow users to limit storage reserved by
 replication slots

commit c6550776394e25c1620bc8258427c8f1d448080d
Author: Alvaro Herrera 
---
 doc/src/sgml/config.sgml | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 2c75876e32..6c1c9157d8 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3902,9 +3902,9 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"'  # Windows
 slots are allowed to retain in the pg_wal
 directory at checkpoint time.
 If max_slot_wal_keep_size is -1 (the default),
-replication slots retain unlimited amount of WAL files.  If
-restart_lsn of a replication slot gets behind more than that megabytes
-from the current LSN, the standby using the slot may no longer be able
+replication slots may retain an unlimited amount of WAL files.  Otherwise, if
+restart_lsn of a replication slot falls behind the current LSN by more
+than the given size, the standby using the slot may no longer be able
 to continue replication due to removal of required WAL files. You
 can see the WAL availability of replication slots
 in pg_replication_slots.
-- 
2.17.0

>From bf31db4f44e2d379660492a68343a1a65e10cb60 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 6 Apr 2020 17:16:07 -0500
Subject: [PATCH v9 04/15] doc: s/evade/avoid/

---
 src/backend/access/gin/README | 2 +-
 src/backend/utils/adt/jsonpath_exec.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/gin/README b/src/backend/access/gin/README
index 125a82219b..41d4e1e8a0 100644
--- a/src/backend/access/gin/README
+++ b/src/backend/access/gin/README
@@ -413,7 +413,7 @@ leftmost leaf of the tree.
 Deletion algorithm keeps exclusive locks on left siblings of pages comprising
 currently investigated path.  Thus, if current page is to be removed, all
 required pages to remove both downlink and rightlink are already locked.  That
-evades potential right to left page locking order, which could deadlock with
+avoids potential right to left page locking order, which could deadlock with
 concurrent stepping right.
 
 A search concurrent to page deletion might already have read a pointer to the
diff --git a/src/backend/utils/adt/jsonpath_exec.c b/src/backend/utils/adt/jsonpath_exec.c
index 2c0b362502..0591c9effc 100644
--- a/src/backend/utils/adt/jsonpath_exec

Re: Feature proposal for psql

2020-09-19 Thread Denis Gantsev
On Sat, 19 Sep 2020 at 19:20, Tom Lane  wrote:

> Denis Gantsev  writes:
> > I have a working proposal for a small feature, which I would describe in
> > one sentence as
> > "named parametrized queries".
>
> I can see the use of being able to insert parameters into a "macro",
> and you're right that the existing variable-interpolation feature
> can't handle that.
>
> > Basically it allows to save something like this in a file:
>
> > --psql:MyQuery1
> > SELECT 42 FROM @0
> > WHERE true
> > --psql:end
>
> ... however, that syntax seems pretty horrid.  It's unlike
> anything else in PG and it risks breaking scripts that work today.


I actually thought that would be a completely different file from .psqlrc;
hence, no risk of breaking existing scripts.
That particular file would, for example, be pointed to by a "PGNQFILE" (or
whatever) environment variable.

We don't do "comments that aren't really comments".  "@0" as a
> parameter notation is a non-starter as well, because "@" is a
> perfectly legal prefix operator.  Besides that, most stuff in
> Postgres is numbered from 1 not 0.
>

Indeed, I missed the fact that "@" is an already-used operator. I started
with "%s" (like psycopg2), but that would obviously collide too.

If I were trying to build this, I'd probably look for ways to
> extend psql's existing variable-interpolation feature rather than
> build something entirely separate.  It's not too hard to imagine
> writing a saved query like
>
> \set MyQuery1 'SELECT * FROM :param1 WHERE id = :param2'
>
> and then we need some notation for expanding a variable with
> parameters.  With one eye on the existing notations :"foo" and
> :'foo', I'm wondering about something like
>
> :(MyQuery1,table_name,id_value)
>
> which is not very pretty, but it's not commandeering any syntax
> that's likely to be in use in current applications.
>
> BTW, the reason I'm suggesting variable notation for the parameter
> references is that the way you'd really want to write the saved
> query is probably more like
>
> \set MyQuery1 'SELECT * FROM :"param1" WHERE id = :''param2'''
>
> so as to have robust quoting behavior.
>
> One limitation of this approach is that \set can't span lines, so
> writing complex queries would be kinda painful.  But that would
> be a good limitation to address separately; \set isn't the only
> metacommand where can't-span-lines is a problem sometimes.
> If you seriously want to pursue adding a feature like this,
> probably the -hackers list is a more appropriate discussion
> forum than -novice.
>
> regards, tom lane
>

The ability to save and retrieve multi-line queries would be quite nice
though; often I would like to save a query too large to type.

I think I don't know psql well enough to propose a viable syntax, so I
guess that would be up to experts here...
But I would be pretty happy to implement it.

Regards
Denis


Re: factorial function/phase out postfix operators?

2020-09-19 Thread Tom Lane
John Naylor  writes:
> I believe it's actually "lower than Op", and since POSTFIXOP is gone
> it doesn't seem to matter how low it is. In fact, I found that the
> lines with INDENT and UNBOUNDED now work as the lowest precedence
> declarations. Maybe that's worth something?

> Following on Peter E.'s example upthread, GENERATED can be removed
> from precedence, and I also found the same is true for PRESERVE and
> STRIP_P.

Now that the main patch is pushed, I went back to revisit this precedence
issue.  I'm afraid to move the precedence of IDENT as much as you suggest
here.  The comment for opt_existing_window_name says that it's expecting
the precedence of IDENT to be just below that of Op.  If there's daylight
in between, that could result in funny behavior for use of some of the
unreserved words with other precedence levels in this context.

However, I concur that we ought to be able to remove the explicit
precedences for GENERATED, NULL_P, PRESERVE, and STRIP_P, so I did that.

An interesting point is that it's actually possible to remove the
precedence declaration for IDENT itself (at least, that does not
create any bison errors; I did not do any behavioral testing).
I believe what we had that for originally was to control the precedence
behavior of the "target_el: a_expr IDENT" rule, and now that that
rule doesn't end with IDENT, its behavior isn't governed by that.
But I think we're best off to keep the precedence assignment, as
a place to hang the precedences of PARTITION etc.

regards, tom lane




Re: recovering from "found xmin ... from before relfrozenxid ..."

2020-09-19 Thread Tom Lane
I wrote:
> I was able to partially reproduce whelk's failure here.  I got a
> couple of cases of "cannot freeze committed xmax", which then leads
> to the second NOTICE diff; but I couldn't reproduce the first
> NOTICE diff.  That was out of about a thousand tries :-( so it's not
> looking like a promising thing to reproduce without modifying the test.

... however, it's trivial to reproduce via manual interference,
using the same strategy discussed recently for another case:
add a pg_sleep at the start of the heap_surgery.sql script,
run "make installcheck", and while that's running start another
session in which you begin a serializable transaction, execute
any old SELECT, and wait.  AFAICT this reproduces all of whelk's
symptoms with 100% reliability.

With a little more effort, this could be automated by putting
some long-running transaction (likely, it needn't be any more
complicated than "select pg_sleep(10)") in a second test script
launched in parallel with heap_surgery.sql.

So this confirms the suspicion that the cause of the buildfarm
failures is a concurrently-open transaction, presumably from
autovacuum.  I don't have time to poke further right now.

regards, tom lane




Re: PostmasterIsAlive() in recovery (non-USE_POST_MASTER_DEATH_SIGNAL builds)

2020-09-19 Thread Thomas Munro
On Sat, Sep 19, 2020 at 6:07 AM Fujii Masao  wrote:
> -   pgstat_report_wait_start(WAIT_EVENT_RECOVERY_PAUSE);
> -   pg_usleep(100L);/* 1000 ms */
> -   pgstat_report_wait_end();
> +   WaitLatch(NULL, WL_EXIT_ON_PM_DEATH | WL_TIMEOUT, 1000,
> + WAIT_EVENT_RECOVERY_PAUSE);
>
> This change may cause at most one second delay against the standby
> promotion request during WAL replay pause? It's only one second,
> but I'd like to avoid this (unnecessary) wait to shorten the failover time
> as much as possible basically. So what about using WL_SET_LATCH here?

Right, there is a comment saying that we could do that:

 * XXX Could also be done with shared latch, avoiding the pg_usleep loop.
 * Probably not worth the trouble though.  This state shouldn't be one that
 * anyone cares about server power consumption in.

> But when using WL_SET_LATCH, one concern is that walreceiver can
> wake up the startup process too frequently even during WAL replay pause.
> This is also problematic? I'm ok with this, but if not, using pg_usleep()
> might be better as the original code does.

You're right, at least if we used recoveryWakeupLatch.  Although we'd
react to pg_wal_replay_resume() faster, which would be nice, we
wouldn't be saving energy, we'd be using more energy due to all the
other latch wakeups that we'd be ignoring.  I believe the correct
solution to this problem is to add a ConditionVariable
"recoveryPauseChanged" into XLogCtlData, and then broadcast on it in
SetRecoveryPause().  This would be a trivial change, except for one
small problem: ConditionVariableTimedSleep() contains
CHECK_FOR_INTERRUPTS(), but startup.c has its own special interrupt
handling rather than using ProcessInterrupts() from postgres.c.  Maybe
that's OK, I'm not sure, but it requires more thought, and I propose
to keep the existing sloppy polling for now and leave precise wakeup
improvements for a separate patch.  The primary goal of this patch is
to switch to the standard treatment of postmaster death in wait loops,
so that we're free to reduce the sampling frequency in
HandleStartupProcInterrupts(), to fix a horrible performance problem.
I have at least tweaked that comment about pg_usleep(), though,
because that was out of date; I also used (void) WaitLatch(...) to
make it look like other places where we ignore the return value
(perhaps some static analyser out there somewhere cares?)

By the way, a CV could probably be used for walreceiver state changes
too, to improve ShutdownWalRcv().
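A rough sketch of that ConditionVariable approach, for future reference (the
field name follows the proposal above; everything else is illustrative and not
part of the attached patches):

	/* in XLogCtlData (ConditionVariableInit() at shmem init omitted) */
	ConditionVariable recoveryPauseChanged;

	/* in SetRecoveryPause() */
	SpinLockAcquire(&XLogCtl->info_lck);
	XLogCtl->recoveryPause = recoveryPause;
	SpinLockRelease(&XLogCtl->info_lck);
	ConditionVariableBroadcast(&XLogCtl->recoveryPauseChanged);

	/* in recoveryPausesHere(), replacing the timed WaitLatch() loop;
	 * note the CHECK_FOR_INTERRUPTS() caveat discussed above */
	ConditionVariablePrepareToSleep(&XLogCtl->recoveryPauseChanged);
	while (RecoveryIsPaused())
	{
		HandleStartupProcInterrupts();
		ConditionVariableTimedSleep(&XLogCtl->recoveryPauseChanged, 1000,
									WAIT_EVENT_RECOVERY_PAUSE);
	}
	ConditionVariableCancelSleep();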

Although I know from CI that this builds and passes "make check" on
Windows, I'm hoping to attract some review of the 0001 patch from a
Windows person, and confirmation that it passes "check-world" (or at
least src/test/recovery check) there, because I don't have CI scripts
for that yet.
From a415ad0f7733ac85f315943dd0a1de9b24d909c5 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Fri, 18 Sep 2020 00:04:56 +1200
Subject: [PATCH v3 1/3] Allow WaitLatch() to be used without a latch.

Due to flaws in commit 3347c982bab, using WaitLatch() without
WL_LATCH_SET could cause an assertion failure or crash.  Repair.

While here, also add a check that the latch we're switching to belongs
to this backend, when changing from one latch to another.

Discussion: https://postgr.es/m/CA%2BhUKGK1607VmtrDUHQXrsooU%3Dap4g4R2yaoByWOOA3m8xevUQ%40mail.gmail.com
---
 src/backend/storage/ipc/latch.c | 23 +++
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
index 4153cc8557..63c6c97536 100644
--- a/src/backend/storage/ipc/latch.c
+++ b/src/backend/storage/ipc/latch.c
@@ -924,7 +924,22 @@ ModifyWaitEvent(WaitEventSet *set, int pos, uint32 events, Latch *latch)
 
 	if (events == WL_LATCH_SET)
 	{
+		if (latch && latch->owner_pid != MyProcPid)
+			elog(ERROR, "cannot wait on a latch owned by another process");
 		set->latch = latch;
+		/*
+		 * On Unix, we don't need to modify the kernel object because the
+		 * underlying pipe is the same for all latches so we can return
+		 * immediately.  On Windows, we need to update our array of handles,
+		 * but we leave the old one in place and tolerate spurious wakeups if
+		 * the latch is disabled.
+		 */
+#if defined(WAIT_USE_WIN32)
+		if (!latch)
+			return;
+#else
+		return;
+#endif
 	}
 
 #if defined(WAIT_USE_EPOLL)
@@ -1386,7 +1401,7 @@ WaitEventSetWaitBlock(WaitEventSet *set, int cur_timeout,
 			/* There's data in the self-pipe, clear it. */
 			drainSelfPipe();
 
-			if (set->latch->is_set)
+			if (set->latch && set->latch->is_set)
 			{
 occurred_events->fd = PGINVALID_SOCKET;
 occurred_events->events = WL_LATCH_SET;
@@ -1536,7 +1551,7 @@ WaitEventSetWaitBlock(WaitEventSet *set, int cur_timeout,
 			/* There's data in the self-pipe, clear it. */
 			drainSelfPipe();
 
-			if (set->latch->is_set)
+			if (set->latch && set->latch->is_set)

Re: Collation versioning

2020-09-19 Thread Thomas Munro
On Thu, Sep 17, 2020 at 5:41 AM Julien Rouhaud  wrote:
> On Tue, Sep 15, 2020 at 2:26 PM Peter Eisentraut
>  wrote:
> > I'm confused now.  I think we had mostly agreed on the v28 patch set,
> > without this additional AM flag.  There was still some discussion on
> > what the AM flag's precise semantics should be.  Do we want to work that
> > out first?
>
> That was my understanding too, but since Michael raised a concern I
> wrote some initial implementation for that part.  I'm assuming that
> this new flag will raise some new discussion, and I hope this can be
> discussed later, or at least in parallel, without interfering with the
> rest of the patchset.

If we always record dependencies we'll have the option to invent
clever ways to ignore them during version checking in later releases.

> > Btw., I'm uneasy about the term "stable collation order".  "Stable" has
> > an established meaning for sorting.  It's really about whether the AM
> > uses collations at all, right?
>
> Well, at the AM level I guess it's only about whether it's using some
> kind of sorting or not, as the collation information is really at the
> opclass level.  It makes me realize that this approach won't be able
> to cope with an index built using (varchar|text)_pattern_ops, and
> that's probably something we should handle correctly.

Hmm.




Re: speed up unicode normalization quick check

2020-09-19 Thread John Naylor
On Sat, Sep 19, 2020 at 1:46 PM Mark Dilger
 wrote:

> 0002 and 0003 look good to me.  I like the way you cleaned up a bit with the 
> unicode_norm_props struct, which makes the code a bit more tidy, and on my 
> compiler under -O2 it does not generate any extra runtime dereferences, as 
> the compiler can see through the struct just fine.  My only concern would be 
> if some other compilers might not see through the struct, resulting in a 
> runtime performance cost?  I wouldn't even ask, except that qc_hash_lookup is 
> called in a fairly tight loop.

(I assume you mean unicode_norm_info) Yeah, that usage was copied from
the keyword list code. I believe it was done for the convenience of
the callers. That is worth something, and so is consistency. That
said, I'd be curious if there is a measurable impact for some
platforms.

>  In your commit message for 0001 you mentioned testing on a multiplicity of 
> compilers.  Did you do that for 0002 and 0003 as well?

For that, I was simply using godbolt.org to test that the multiplications
compile down to shift-and-adds. That is very widespread; I only remember
MSVC not doing it. I'm not sure a few extra cycles would have been
noticeable here, but it can't hurt to have that guarantee.
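(For reference, "compiled down to shift-and-add" just means the compiler
strength-reduces a constant multiply, e.g. for a multiplier of 31:

	uint32		h31 = (h << 5) - h;		/* same result as h * 31 */

so a compiler that doesn't do this emits an integer multiply instead, costing
a few extra cycles in the hash function.)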

-- 
John Naylor                   https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: speed up unicode normalization quick check

2020-09-19 Thread Mark Dilger



> On Sep 19, 2020, at 3:58 PM, John Naylor  wrote:
> 
> On Sat, Sep 19, 2020 at 1:46 PM Mark Dilger
>  wrote:
> 
>> 0002 and 0003 look good to me.  I like the way you cleaned up a bit with the 
>> unicode_norm_props struct, which makes the code a bit more tidy, and on my 
>> compiler under -O2 it does not generate any extra runtime dereferences, as 
>> the compiler can see through the struct just fine.  My only concern would be 
>> if some other compilers might not see through the struct, resulting in a 
>> runtime performance cost?  I wouldn't even ask, except that qc_hash_lookup 
>> is called in a fairly tight loop.
> 
> (I assume you mean unicode_norm_info) Yeah, that usage was copied from
> the keyword list code. I believe it was done for the convenience of
> the callers. That is worth something, and so is consistency. That
> said, I'd be curious if there is a measurable impact for some
> platforms.

Right, unicode_norm_info.  I'm not sure the convenience of the callers matters 
here, since the usage is restricted to just one file, but I also don't have a 
problem with the code as you have it.

>> In your commit message for 0001 you mentioned testing on a multiplicity of 
>> compilers.  Did you do that for 0002 and 0003 as well?
> 
> For that, I was simply using godbolt.org to test that the multiplications
> compile down to shift-and-adds. That is very widespread; I only remember
> MSVC not doing it. I'm not sure a few extra cycles would have been
> noticeable here, but it can't hurt to have that guarantee.

I am marking this ready for committer.  I didn't object to the whitespace 
weirdness in your patch (about which `git apply` grumbles) since you seem to 
have done that intentionally.  I have no further comments on the performance 
issue, since I don't have any other platforms at hand to test it on.  Whichever 
committer picks this up can decide if the issue matters to them enough to punt 
it back for further performance testing.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Fix inconsistency in jsonpath .datetime()

2020-09-19 Thread Nikita Glukhov

Hi!

A beta-tester of PG13 reported an inconsistency in our current jsonpath
datetime() method implementation.  By the standard, format strings in datetime()
allow only the characters "-./,':; " to be used as separators in format strings.
But our to_json[b]() serializes timestamps into the XSD format with a "T"
separator between date and time, so the serialized data cannot be parsed back
by jsonpath, which looks inconsistent:

=# SELECT to_jsonb('2020-09-19 23:45:06'::timestamp);
   to_jsonb
---
 "2020-09-19T23:45:06"
(1 row)

=# SELECT jsonb_path_query(to_jsonb('2020-09-19 23:45:06'::timestamp),
   '$.datetime()');
ERROR:  datetime format is not recognized: "2020-09-19T23:45:06"
HINT:  Use a datetime template argument to specify the input data format.

=# SELECT jsonb_path_query(to_jsonb('2020-09-19 23:45:06'::timestamp),
   '$.datetime("-mm-dd HH:MI:SS")');
ERROR:  unmatched format separator " "

=# SELECT jsonb_path_query(to_jsonb('2020-09-19 23:45:06'::timestamp),
   '$.datetime("-mm-dd\"T\"HH:MI:SS")');
ERROR:  invalid datetime format separator: """



Excerpt from the SQL:2016 standard (5.3, page 197):

 ::=


 ::=
   [  ]

 ::=
 



Attached patch #2 tries to fix this problem by enabling escaped characters in
standard mode.  I'm not sure whether it is better to enable the whole set of
text separators or only the problematic "T" character, or whether to allow
only quoted text separators or not.

Patch #1 is a simpler fix (so it comes first), removing the excess space between
the time and timezone fields in the built-in format strings used for datetime
type recognition.  (It seemed to work as expected with the extra space in an
earlier version of the patch, in which standard mode had not yet been introduced.)

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

>From e867111f4f3525b4d9e3710b7d0db530602793ef Mon Sep 17 00:00:00 2001
From: Nikita Glukhov 
Date: Sat, 19 Sep 2020 22:01:51 +0300
Subject: [PATCH 1/2] Fix excess space in built-in format strings of jsonpath
 .datetime()

---
 src/backend/utils/adt/jsonpath_exec.c|  8 +--
 src/test/regress/expected/jsonb_jsonpath.out | 76 ++--
 src/test/regress/sql/jsonb_jsonpath.sql  | 76 ++--
 3 files changed, 80 insertions(+), 80 deletions(-)

diff --git a/src/backend/utils/adt/jsonpath_exec.c b/src/backend/utils/adt/jsonpath_exec.c
index 2c0b362..d3c7fe3 100644
--- a/src/backend/utils/adt/jsonpath_exec.c
+++ b/src/backend/utils/adt/jsonpath_exec.c
@@ -1837,11 +1837,11 @@ executeDateTimeMethod(JsonPathExecContext *cxt, JsonPathItem *jsp,
 		static const char *fmt_str[] =
 		{
 			"yyyy-mm-dd",
-			"HH24:MI:SS TZH:TZM",
-			"HH24:MI:SS TZH",
+			"HH24:MI:SSTZH:TZM",
+			"HH24:MI:SSTZH",
 			"HH24:MI:SS",
-			"yyyy-mm-dd HH24:MI:SS TZH:TZM",
-			"yyyy-mm-dd HH24:MI:SS TZH",
+			"yyyy-mm-dd HH24:MI:SSTZH:TZM",
+			"yyyy-mm-dd HH24:MI:SSTZH",
 			"yyyy-mm-dd HH24:MI:SS"
 		};
 
diff --git a/src/test/regress/expected/jsonb_jsonpath.out b/src/test/regress/expected/jsonb_jsonpath.out
index c870b7f..31d6f05 100644
--- a/src/test/regress/expected/jsonb_jsonpath.out
+++ b/src/test/regress/expected/jsonb_jsonpath.out
@@ -1877,25 +1877,25 @@ select jsonb_path_query('"2017-03-10 12:34:56"', '$.datetime()');
  "2017-03-10T12:34:56"
 (1 row)
 
-select jsonb_path_query('"2017-03-10 12:34:56 +3"', '$.datetime().type()');
+select jsonb_path_query('"2017-03-10 12:34:56+3"', '$.datetime().type()');
   jsonb_path_query  
 
  "timestamp with time zone"
 (1 row)
 
-select jsonb_path_query('"2017-03-10 12:34:56 +3"', '$.datetime()');
+select jsonb_path_query('"2017-03-10 12:34:56+3"', '$.datetime()');
   jsonb_path_query   
 -
  "2017-03-10T12:34:56+03:00"
 (1 row)
 
-select jsonb_path_query('"2017-03-10 12:34:56 +3:10"', '$.datetime().type()');
+select jsonb_path_query('"2017-03-10 12:34:56+3:10"', '$.datetime().type()');
   jsonb_path_query  
 
  "timestamp with time zone"
 (1 row)
 
-select jsonb_path_query('"2017-03-10 12:34:56 +3:10"', '$.datetime()');
+select jsonb_path_query('"2017-03-10 12:34:56+3:10"', '$.datetime()');
   jsonb_path_query   
 -
  "2017-03-10T12:34:56+03:10"
@@ -1913,25 +1913,25 @@ select jsonb_path_query('"12:34:56"', '$.datetime()');
  "12:34:56"
 (1 row)
 
-select jsonb_path_query('"12:34:56 +3"', '$.datetime().type()');
+select jsonb_path_query('"12:34:56+3"', '$.datetime().type()');
jsonb_path_query
 ---
  "time with time zone"
 (1 row)
 
-select jsonb_path_query('"12:34:56 +3"', '$.datetime()');
+select jsonb_path_query('"12:34:56+3"', '$.datetime()');
  jsonb_path_query 
 --
  "12:34:56+03:00"
 (1 row)
 
-select jsonb_path_query('"12:34:56 +3:10"', '$.datetime().type()');
+select jsonb_path_query('"12

Re: Feature proposal for psql

2020-09-19 Thread Corey Huinker
>> One limitation of this approach is that \set can't span lines, so
>> writing complex queries would be kinda painful.  But that would
>> be a good limitation to address separately; \set isn't the only
>> metacommand where can't-span-lines is a problem sometimes.
>> If you seriously want to pursue adding a feature like this,
>> probably the -hackers list is a more appropriate discussion
>> forum than -novice.
>>
>> regards, tom lane
>>
>
>  The ability to save and retrieve multi-line queries would be quite nice
> though, often I would like to save a query too large to type.
>
> I think I don't know psql well enough to propose a viable syntax, so I
> guess that would be up to experts here...
> But I would be pretty happy to implement it.
>
> Regards
> Denis
>
>
Well, if you want to do it right now, you can do this:

db=> select * from foo;
 x  | y
+
  1 |  1
  2 |  2
  3 |  3
  4 |  4
  5 |  5
  6 |  6
  7 |  7
  8 |  8
  9 |  9
 10 | 10
(10 rows)
db=> select * from foo where x = :xval \w query1.sql
db=> \set xval 4
db=> \i query1.sql
 x | y
---+---
 4 | 4
(1 row)


Granted, that involves adding files to the filesystem, setting variables
rather than passing parameters, remembering what those variables were, and
having the discipline to not have overlapping uses for variable names
across multiple files.

So the key shortcomings right now seem to be:
* no way to pass in values to an \i or \ir and no way to locally scope them
* one file per query

Setting variables locally in a \ir would need to somehow push and pop
existing variable values because those vars are scoped at the session
level, and that might confuse the user when they set the var inside the
included file expecting the calling session to keep the value.

Perhaps we could add a notion of a "bag of tricks" dir in each user's home
directory, plus a slash command \wbag (better name suggestions welcome) that
behaves like \w but assumes the file will go in ~/.psql-bag-of-tricks/, and
\ibag, which includes a file from the same dir.
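
A hypothetical session with such commands (nothing here is implemented; the
command names are just the proposal above) might look like:

db=> select * from foo where x = :xval \wbag query1.sql
db=> \set xval 4
db=> \ibag query1.sql
 x | y
---+---
 4 | 4
(1 row)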


Re: XversionUpgrade tests broken by postfix operator removal

2020-09-19 Thread Andrew Dunstan

On 9/19/20 12:21 PM, Tom Lane wrote:
> Andrew Dunstan  writes:
>> Here's how cross version upgrade testing works. It uses a cached version of 
>> the binaries and data directory. The cache is only refreshed if there's a 
>> buildfarm run on that branch. If not, the cached version will just sit there 
>> till kingdom come. So all this should normally need for the non-live 
>> branches is a one-off adjustment in the cached version of the regression 
>> database along the lines I have indicated. My cached versions of 9.2 and 9.3 
>> are two years old.
> Hmm, okay, so patching this on gitmaster wouldn't help anyway.
>
>> But another alternative would be to have the buildfarm module run (on
>> versions older than 9.5):
>> drop operator @#@ (NONE, bigint);
>> CREATE OPERATOR @#@ (
>>     PROCEDURE = factorial,
>>     RIGHTARG = bigint
>> );
>> On reflection I think that's probably the simplest solution. It will avoid 
>> any surprises if the cached version is rebuilt, and at the same time 
>> preserve testing the prefix operator.
> Works for me.
>
>   



OK, I rolled back the changes I made in the legacy branch databases, and
this fix worked.


For reference, here is the complete hotfix.


cheers


andrew


-- 
Andrew Dunstan        https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

diff --git a/PGBuild/Modules/TestUpgradeXversion.pm b/PGBuild/Modules/TestUpgradeXversion.pm
index 8bc226f..bb9e560 100644
--- a/PGBuild/Modules/TestUpgradeXversion.pm
+++ b/PGBuild/Modules/TestUpgradeXversion.pm
@@ -434,6 +434,38 @@ sub test_upgrade## no critic (Subroutines::ProhibitManyArgs)
 		}
 	}
 
+	# operators not supported from release 14
+	if (   ($this_branch gt 'REL_13_STABLE' || $this_branch eq 'HEAD')
+		&& ($oversion le 'REL_13_STABLE' && $oversion ne 'HEAD'))
+	{
+		my $prstmt = join(';',
+		  'drop operator if exists #@# (bigint,NONE)',
+		  'drop operator if exists #%# (bigint,NONE)',
+		  'drop operator if exists !=- (bigint,NONE)',
+		  'drop operator if exists #@%# (bigint,NONE)');
+
+		system( "$other_branch/inst/bin/psql -X -e "
+  . " -c '$prstmt' "
+  . "regression "
+  . ">> '$upgrade_loc/$oversion-copy.log' 2>&1");
+		return if $?;
+
+		if ($oversion le 'REL9_4_STABLE')
+		{
+			# this is fixed in 9.5 and later
+			$prstmt = join(';',
+		   'drop operator @#@ (NONE, bigint)',
+		   'CREATE OPERATOR @#@ (' .
+			 'PROCEDURE = factorial, ' .
+			 'RIGHTARG = bigint )');
+			system( "$other_branch/inst/bin/psql -X -e "
+	  . " -c '$prstmt' "
+	  . "regression "
+	  . ">> '$upgrade_loc/$oversion-copy.log' 2>&1");
+			return if $?;
+		}
+	}
+
 	my $extra_digits = "";
 
 	if (   $oversion ne 'HEAD'


Re: Handing off SLRU fsyncs to the checkpointer

2020-09-19 Thread Thomas Munro
On Sat, Sep 19, 2020 at 5:06 PM Thomas Munro  wrote:
> In the meantime, from the low-hanging-fruit department, here's a new
> version of the SLRU-fsync-offload patch.  The only changes are a
> tweaked commit message, and adoption of C99 designated initialisers
> for the function table, so { [SYNC_HANDLER_CLOG] = ... } instead of
> relying on humans to make the array order match the enum values.  If
> there are no comments or objections, I'm planning to commit this quite
> soon.

... and CI told me that Windows didn't like my array syntax with the
extra trailing comma.  Here's one without.
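
For anyone unfamiliar with the idiom, here is a minimal standalone sketch of
the designated-initialiser style (the enum, callback type and function names
are invented for illustration; they are not the ones used in the patch):

#include <stdio.h>

typedef enum DemoSyncHandler
{
	DEMO_HANDLER_CLOG_LIKE,
	DEMO_HANDLER_COMMIT_TS_LIKE,
	DEMO_HANDLER_MULTIXACT_LIKE
} DemoSyncHandler;

typedef int (*demo_sync_cb) (const char *path);

static int demo_sync_clog(const char *path) { return printf("clog: %s\n", path); }
static int demo_sync_committs(const char *path) { return printf("commit_ts: %s\n", path); }
static int demo_sync_multixact(const char *path) { return printf("multixact: %s\n", path); }

/*
 * Each slot is tied to its enum value explicitly, so reordering the enum
 * cannot silently misalign the table.  Note that there is no trailing comma
 * after the last entry, which is the detail the Windows build objected to.
 */
static const demo_sync_cb demo_sync_table[] = {
	[DEMO_HANDLER_CLOG_LIKE] = demo_sync_clog,
	[DEMO_HANDLER_COMMIT_TS_LIKE] = demo_sync_committs,
	[DEMO_HANDLER_MULTIXACT_LIKE] = demo_sync_multixact
};

int
main(void)
{
	return demo_sync_table[DEMO_HANDLER_COMMIT_TS_LIKE]("pg_commit_ts/0000") > 0 ? 0 : 1;
}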
From 89bc77827a73c93914f018fc862439f53ff54a39 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Sat, 19 Sep 2020 16:22:06 +1200
Subject: [PATCH v4 1/2] Defer flushing of SLRU files.

Previously, we called fsync() after writing out pg_xact, multixact and
commit_ts pages due to cache pressure, leading to an I/O stall in user
backends and recovery.  Collapse requests for the same file into a
single system call as part of the next checkpoint, as we already do for
relation files.  This causes a significant improvement to recovery
times.

Tested-by: Jakub Wartak 
Discussion: https://postgr.es/m/CA+hUKGLJ=84yt+nvhkeedauutvhmfq9i-n7k_o50jmq6rpj...@mail.gmail.com
---
 src/backend/access/transam/clog.c  |  13 +++-
 src/backend/access/transam/commit_ts.c |  12 ++-
 src/backend/access/transam/multixact.c |  24 +-
 src/backend/access/transam/slru.c  | 101 +++--
 src/backend/access/transam/subtrans.c  |   4 +-
 src/backend/commands/async.c   |   5 +-
 src/backend/storage/lmgr/predicate.c   |   4 +-
 src/backend/storage/sync/sync.c|  28 ++-
 src/include/access/clog.h  |   3 +
 src/include/access/commit_ts.h |   3 +
 src/include/access/multixact.h |   4 +
 src/include/access/slru.h  |  12 ++-
 src/include/storage/sync.h |   7 +-
 13 files changed, 174 insertions(+), 46 deletions(-)

diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index 65aa8841f7..304612c159 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -42,6 +42,7 @@
 #include "pg_trace.h"
 #include "pgstat.h"
 #include "storage/proc.h"
+#include "storage/sync.h"
 
 /*
  * Defines for CLOG page sizes.  A page is the same BLCKSZ as is used
@@ -691,7 +692,8 @@ CLOGShmemInit(void)
 {
 	XactCtl->PagePrecedes = CLOGPagePrecedes;
 	SimpleLruInit(XactCtl, "Xact", CLOGShmemBuffers(), CLOG_LSNS_PER_PAGE,
-  XactSLRULock, "pg_xact", LWTRANCHE_XACT_BUFFER);
+  XactSLRULock, "pg_xact", LWTRANCHE_XACT_BUFFER,
+  SYNC_HANDLER_CLOG);
 }
 
 /*
@@ -1033,3 +1035,12 @@ clog_redo(XLogReaderState *record)
 	else
 		elog(PANIC, "clog_redo: unknown op code %u", info);
 }
+
+/*
+ * Entrypoint for sync.c to sync clog files.
+ */
+int
+clogsyncfiletag(const FileTag *ftag, char *path)
+{
+	return slrusyncfiletag(XactCtl, ftag, path);
+}
diff --git a/src/backend/access/transam/commit_ts.c b/src/backend/access/transam/commit_ts.c
index 5244b06a2b..913ec9e48d 100644
--- a/src/backend/access/transam/commit_ts.c
+++ b/src/backend/access/transam/commit_ts.c
@@ -555,7 +555,8 @@ CommitTsShmemInit(void)
 	CommitTsCtl->PagePrecedes = CommitTsPagePrecedes;
 	SimpleLruInit(CommitTsCtl, "CommitTs", CommitTsShmemBuffers(), 0,
   CommitTsSLRULock, "pg_commit_ts",
-  LWTRANCHE_COMMITTS_BUFFER);
+  LWTRANCHE_COMMITTS_BUFFER,
+  SYNC_HANDLER_COMMIT_TS);
 
 	commitTsShared = ShmemInitStruct("CommitTs shared",
 	 sizeof(CommitTimestampShared),
@@ -1083,3 +1084,12 @@ commit_ts_redo(XLogReaderState *record)
 	else
 		elog(PANIC, "commit_ts_redo: unknown op code %u", info);
 }
+
+/*
+ * Entrypoint for sync.c to sync commit_ts files.
+ */
+int
+committssyncfiletag(const FileTag *ftag, char *path)
+{
+	return slrusyncfiletag(CommitTsCtl, ftag, path);
+}
diff --git a/src/backend/access/transam/multixact.c b/src/backend/access/transam/multixact.c
index b8bedca04a..344006b0f5 100644
--- a/src/backend/access/transam/multixact.c
+++ b/src/backend/access/transam/multixact.c
@@ -1831,11 +1831,13 @@ MultiXactShmemInit(void)
 	SimpleLruInit(MultiXactOffsetCtl,
   "MultiXactOffset", NUM_MULTIXACTOFFSET_BUFFERS, 0,
   MultiXactOffsetSLRULock, "pg_multixact/offsets",
-  LWTRANCHE_MULTIXACTOFFSET_BUFFER);
+  LWTRANCHE_MULTIXACTOFFSET_BUFFER,
+  SYNC_HANDLER_MULTIXACT_OFFSET);
 	SimpleLruInit(MultiXactMemberCtl,
   "MultiXactMember", NUM_MULTIXACTMEMBER_BUFFERS, 0,
   MultiXactMemberSLRULock, "pg_multixact/members",
-  LWTRANCHE_MULTIXACTMEMBER_BUFFER);
+  LWTRANCHE_MULTIXACTMEMBER_BUFFER,
+  SYNC_HANDLER_MULTIXACT_MEMBER);
 
 	/* Initialize our shared state struct */
 	MultiXactState = ShmemInitStruct("Shared MultiXact State",
@@ -3386,3 +3388,21 @@ pg_get_multixact_members(PG_FUNCTION_ARGS)
 
 	SRF_RETURN_DONE(funccxt);
 }
+
+/*
+ * Entrypoint for sync.c to sync offsets files.
+ */
+int
+multix

Re: [PATCH] Remove useless distinct clauses

2020-09-19 Thread Bruce Momjian
On Tue, Sep 15, 2020 at 10:57:04PM +1200, David Rowley wrote:
> On Fri, 31 Jul 2020 at 20:41, Pierre Ducroquet  wrote:
> >
> > In a recent audit, I noticed that application developers have a tendency to
> > abuse the distinct clause. For instance they use an ORM and add a distinct 
> > at
> > the top level just because they don't know the cost it has, or they don't 
> > know
> > that using EXISTS is a better way to express their queries than doing JOINs
> > (or worse, they can't do better).
> >
> > They thus have this kind of queries (considering tbl1 has a PK of course):
> > SELECT DISTINCT * FROM tbl1;
> > SELECT DISTINCT * FROM tbl1 ORDER BY a;
> > SELECT DISTINCT tbl1.* FROM tbl1
> > JOIN tbl2 ON tbl2.a = tbl1.id;
> 
> This is a common anti-pattern that I used to see a couple of jobs ago.
> What seemed to happen was that someone would modify some query or a
> view to join in an additional table to fetch some information that was
> now required.  At some later time, there'd be a bug report to say that
> the query is returning certain records more than once.  The
> developer's solution was to add DISTINCT, instead of figuring out that
> the join that was previously added missed some column from the join
> clause.

I can 100% imagine that happening.  :-(

-- 
  Bruce Momjian  https://momjian.us
  EnterpriseDB https://enterprisedb.com

  The usefulness of a cup is in its emptiness, Bruce Lee





Re: Collation versioning

2020-09-19 Thread Julien Rouhaud
On Sun, Sep 20, 2020 at 6:36 AM Thomas Munro  wrote:
>
> On Thu, Sep 17, 2020 at 5:41 AM Julien Rouhaud  wrote:
> > On Tue, Sep 15, 2020 at 2:26 PM Peter Eisentraut
> >  wrote:
> > > I'm confused now.  I think we had mostly agreed on the v28 patch set,
> > > without this additional AM flag.  There was still some discussion on
> > > what the AM flag's precise semantics should be.  Do we want to work that
> > > out first?
> >
> > That was my understanding too, but since Michael raised a concern I
> > wrote some initial implementation for that part.  I'm assuming that
> > this new flag will raise some new discussion, and I hope this can be
> > discussed later, or at least in parallel, without interfering with the
> > rest of the patchset.
>
> If we always record dependencies we'll have the option to invent
> clever ways to ignore them during version checking in later releases.

But in any case we need to record the dependencies for all collations,
right?  The only difference is that we shouldn't record the collation
version if there's no risk of corruption when the underlying sort order
changes.
So while I want to address this part in pg14, even if we didn't, the
problem would eventually be fixed automatically in a later version by
doing a reindex, I think, as the version would be cleared.

There could still be a false positive warning in that case if the lib is
updated, but users could clear it with the proposed infrastructure.  Or
alternatively, if we add a new backend filter, say
REINDEX (COLLATION NOT CURRENT), we could add additional knowledge there
to ignore such cases.

> > > Btw., I'm uneasy about the term "stable collation order".  "Stable" has
> > > an established meaning for sorting.  It's really about whether the AM
> > > uses collations at all, right?
> >
> > Well, at the AM level I guess it's only about whether it's using some
> > kind of sorting or not, as the collation information is really at the
> > opclass level.  It makes me realize that this approach won't be able
> > to cope with an index built using (varchar|text)_pattern_ops, and
> > that's probably something we should handle correctly.
>
> Hmm.

On the other hand the *_pattern_ops are entirely hardcoded, and I
don't think that we'll ever have an extensible way to have this kind
of magic exception.  So maybe having a flag at the am level is
acceptable?




Re: Range checks of pg_test_fsync --secs-per-test and pg_test_timing --duration

2020-09-19 Thread Michael Paquier
On Fri, Sep 18, 2020 at 05:22:15PM +0900, Michael Paquier wrote:
> Okay, after looking at that, here is v3.  This includes range checks
> as well as errno checks based on strtol().  What do you think?

This fails in the CF bot on Linux because of pg_logging_init()
returning with errno=ENOTTY in the TAP tests, for which I began a new
thread:
https://www.postgresql.org/message-id/20200918095713.ga20...@paquier.xyz

Not sure if this will lead anywhere, but we can also address the
failure by enforcing errno=0 for the new calls of strtol() introduced
in this patch.  So here is an updated patch doing so.
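
Distilled to its essentials, the strtol() validation pattern being added looks
like this (a standalone sketch, not the exact hunk below):

#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

static int
parse_positive_int32(const char *arg, int32_t *result)
{
	char	   *endptr;
	long		val;

	errno = 0;					/* so a later ERANGE can be detected reliably */
	val = strtol(arg, &endptr, 10);

	/* reject empty input, trailing garbage, overflow, and values outside int32 */
	if (endptr == arg || *endptr != '\0' || errno != 0 || val != (int32_t) val)
		return -1;
	/* enforce the lower bound of the documented 1..INT_MAX range */
	if (val <= 0)
		return -1;

	*result = (int32_t) val;
	return 0;
}

int
main(void)
{
	int32_t		secs;

	printf("\"5\" -> %d\n", parse_positive_int32("5", &secs));	/* accepted */
	printf("\"a\" -> %d\n", parse_positive_int32("a", &secs));	/* rejected */
	printf("\"0\" -> %d\n", parse_positive_int32("0", &secs));	/* rejected */
	return 0;
}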
--
Michael
diff --git a/src/bin/pg_test_fsync/.gitignore b/src/bin/pg_test_fsync/.gitignore
index f3b5932498..5eb5085f45 100644
--- a/src/bin/pg_test_fsync/.gitignore
+++ b/src/bin/pg_test_fsync/.gitignore
@@ -1 +1,3 @@
 /pg_test_fsync
+
+/tmp_check/
diff --git a/src/bin/pg_test_fsync/Makefile b/src/bin/pg_test_fsync/Makefile
index 7632c94eb7..c4f9ae0664 100644
--- a/src/bin/pg_test_fsync/Makefile
+++ b/src/bin/pg_test_fsync/Makefile
@@ -22,6 +22,12 @@ install: all installdirs
 installdirs:
 	$(MKDIR_P) '$(DESTDIR)$(bindir)'
 
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
+
 uninstall:
 	rm -f '$(DESTDIR)$(bindir)/pg_test_fsync$(X)'
 
diff --git a/src/bin/pg_test_fsync/pg_test_fsync.c b/src/bin/pg_test_fsync/pg_test_fsync.c
index 6e47293123..1a09c36960 100644
--- a/src/bin/pg_test_fsync/pg_test_fsync.c
+++ b/src/bin/pg_test_fsync/pg_test_fsync.c
@@ -5,6 +5,7 @@
 
 #include "postgres_fe.h"
 
+#include 
 #include 
 #include 
 #include 
@@ -62,7 +63,7 @@ do { \
 
 static const char *progname;
 
-static int	secs_per_test = 5;
+static int32 secs_per_test = 5;
 static int	needs_unlink = 0;
 static char full_buf[DEFAULT_XLOG_SEG_SIZE],
 		   *buf,
@@ -148,6 +149,8 @@ handle_args(int argc, char *argv[])
 
 	int			option;			/* Command line option */
 	int			optindex = 0;	/* used by getopt_long */
+	long		optval;			/* used for option parsing */
+	char	   *endptr;
 
 	if (argc > 1)
 	{
@@ -173,7 +176,24 @@ handle_args(int argc, char *argv[])
 break;
 
 			case 's':
-secs_per_test = atoi(optarg);
+errno = 0;
+optval = strtol(optarg, &endptr, 10);
+
+if (endptr == optarg || *endptr != '\0' ||
+	errno != 0 || optval != (int32) optval)
+{
+	pg_log_error("invalid argument for option %s", "--secs-per-test");
+	fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
+	exit(1);
+}
+
+secs_per_test = (int32) optval;
+if (secs_per_test <= 0)
+{
+	pg_log_error("%s must be in range %d..%d",
+ "--secs-per-test", 1, INT_MAX);
+	exit(1);
+}
 break;
 
 			default:
diff --git a/src/bin/pg_test_fsync/t/001_basic.pl b/src/bin/pg_test_fsync/t/001_basic.pl
new file mode 100644
index 00..4b615c263d
--- /dev/null
+++ b/src/bin/pg_test_fsync/t/001_basic.pl
@@ -0,0 +1,25 @@
+use strict;
+use warnings;
+
+use Config;
+use TestLib;
+use Test::More tests => 12;
+
+#
+# Basic checks
+
+program_help_ok('pg_test_fsync');
+program_version_ok('pg_test_fsync');
+program_options_handling_ok('pg_test_fsync');
+
+#
+# Test invalid option combinations
+
+command_fails_like(
+	[ 'pg_test_fsync', '--secs-per-test', 'a' ],
+	qr/\Qpg_test_fsync: error: invalid argument for option --secs-per-test\E/,
+	'pg_test_fsync: invalid argument for option --secs-per-test');
+command_fails_like(
+	[ 'pg_test_fsync', '--secs-per-test', '0' ],
+	qr/\Qpg_test_fsync: error: --secs-per-test must be in range 1..2147483647\E/,
+	'pg_test_fsync: --secs-per-test must be in range');
diff --git a/src/bin/pg_test_timing/.gitignore b/src/bin/pg_test_timing/.gitignore
index f6c664c765..e5aac2ab12 100644
--- a/src/bin/pg_test_timing/.gitignore
+++ b/src/bin/pg_test_timing/.gitignore
@@ -1 +1,3 @@
 /pg_test_timing
+
+/tmp_check/
diff --git a/src/bin/pg_test_timing/Makefile b/src/bin/pg_test_timing/Makefile
index 334d6ff5c0..52994b4103 100644
--- a/src/bin/pg_test_timing/Makefile
+++ b/src/bin/pg_test_timing/Makefile
@@ -22,6 +22,12 @@ install: all installdirs
 installdirs:
 	$(MKDIR_P) '$(DESTDIR)$(bindir)'
 
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
+
 uninstall:
 	rm -f '$(DESTDIR)$(bindir)/pg_test_timing$(X)'
 
diff --git a/src/bin/pg_test_timing/pg_test_timing.c b/src/bin/pg_test_timing/pg_test_timing.c
index e14802372b..cd16fc2f83 100644
--- a/src/bin/pg_test_timing/pg_test_timing.c
+++ b/src/bin/pg_test_timing/pg_test_timing.c
@@ -6,6 +6,8 @@
 
 #include "postgres_fe.h"
 
+#include 
+
 #include "getopt_long.h"
 #include "portability/instr_time.h"
 
@@ -14,7 +16,7 @@ static const char *progname;
 static int32 test_duration = 3;
 
 static void handle_args(int argc, char *argv[]);
-static uint64 test_timing(int32);
+static uint64 test_timing(int32 duration);
 static void output(uint64 loop_count);
 
 /* record duration in powers of 2 microseconds */
@@ -47,6 +49

Re: [HACKERS] logical decoding of two-phase transactions

2020-09-19 Thread Dilip Kumar
On Fri, Sep 18, 2020 at 6:02 PM Ajin Cherian  wrote:
>

I have reviewed the v4-0001 patch and have a few comments.  I haven't
yet reviewed the patch completely.

1.
+ /*
+ * Process invalidation messages, even if we're not interested in the
+ * transaction's contents, since the various caches need to always be
+ * consistent.
+ */
+ if (parsed->nmsgs > 0)
+ {
+ if (!ctx->fast_forward)
+ ReorderBufferAddInvalidations(ctx->reorder, xid, buf->origptr,
+   parsed->nmsgs, parsed->msgs);
+ ReorderBufferXidSetCatalogChanges(ctx->reorder, xid, buf->origptr);
+ }
+

I think we don't need to add prepare-time invalidation messages, as we are now
already logging the invalidations at the command level and adding them to the
reorder buffer.

2.

+ /*
+ * Tell the reorderbuffer about the surviving subtransactions. We need to
+ * do this because the main transaction itself has not committed since we
+ * are in the prepare phase right now. So we need to be sure the snapshot
+ * is setup correctly for the main transaction in case all changes
+ * happened in subtransanctions
+ */
+ for (i = 0; i < parsed->nsubxacts; i++)
+ {
+ ReorderBufferCommitChild(ctx->reorder, xid, parsed->subxacts[i],
+ buf->origptr, buf->endptr);
+ }
+
+ if (SnapBuildXactNeedsSkip(ctx->snapshot_builder, buf->origptr) ||
+ (parsed->dbId != InvalidOid && parsed->dbId != ctx->slot->data.database) ||
+ ctx->fast_forward || FilterByOrigin(ctx, origin_id))
+ return;

Do we need to call ReorderBufferCommitChild if we are skipping this transaction?
I think the check below should come before calling ReorderBufferCommitChild.
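
In other words, something along these lines (just a sketch, reusing the code
quoted above):

	if (SnapBuildXactNeedsSkip(ctx->snapshot_builder, buf->origptr) ||
		(parsed->dbId != InvalidOid && parsed->dbId != ctx->slot->data.database) ||
		ctx->fast_forward || FilterByOrigin(ctx, origin_id))
		return;

	/*
	 * Only tell the reorderbuffer about surviving subtransactions when we
	 * are actually going to decode this prepared transaction.
	 */
	for (i = 0; i < parsed->nsubxacts; i++)
		ReorderBufferCommitChild(ctx->reorder, xid, parsed->subxacts[i],
								 buf->origptr, buf->endptr);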

3.

+ /*
+ * If it's ROLLBACK PREPARED then handle it via callbacks.
+ */
+ if (TransactionIdIsValid(xid) &&
+ !SnapBuildXactNeedsSkip(ctx->snapshot_builder, buf->origptr) &&
+ parsed->dbId == ctx->slot->data.database &&
+ !FilterByOrigin(ctx, origin_id) &&
+ ReorderBufferTxnIsPrepared(ctx->reorder, xid, parsed->twophase_gid))
+ {
+ ReorderBufferFinishPrepared(ctx->reorder, xid, buf->origptr, buf->endptr,
+ commit_time, origin_id, origin_lsn,
+ parsed->twophase_gid, false);
+ return;
+ }


I think we have already checked !SnapBuildXactNeedsSkip, parsed->dbId
== ctx->slot->data.database and !FilterByOrigin in DecodePrepare, so if
those did not hold we wouldn't have prepared this transaction, i.e.
ReorderBufferTxnIsPrepared will be false.  So why do we need to recheck
these conditions?


4.

+ /* If streaming, reset the TXN so that it is allowed to stream
remaining data. */
+ if (streaming && stream_started)
+ {
+ ReorderBufferResetTXN(rb, txn, snapshot_now,
+   command_id, prev_lsn,
+   specinsert);
+ }
+ else
+ {
+ elog(LOG, "stopping decoding of %s (%u)",
+ txn->gid[0] != '\0'? txn->gid:"", txn->xid);
+ ReorderBufferTruncateTXN(rb, txn, true);
+ }

Why isn't checking only if (streaming) enough?  I agree that if we get here
in streaming mode then stream_started must be true, but we already have an
assert for that.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com