Re: Transactions involving multiple postgres foreign servers, take 2

2020-09-13 Thread Masahiko Sawada
On Fri, 11 Sep 2020 at 18:24, tsunakawa.ta...@fujitsu.com
 wrote:
>
> From: Masahiko Sawada 
> > On Tue, 8 Sep 2020 at 13:00, tsunakawa.ta...@fujitsu.com
> >  wrote:
> > > 2. 2PC processing is queued and serialized in one background worker.  That
> > severely subdues transaction throughput.  Each backend should perform
> > 2PC.
> >
> > Not sure it's safe that each backend perform PREPARE and COMMIT
> > PREPARED since the current design is for not leading an inconsistency
> > between the actual transaction result and the result the user sees.
>
> As Fujii-san is asking, I also would like to know what situation you think is 
> not safe.  Are you worried that the FDW's commit function might call 
> ereport(ERROR | FATAL | PANIC)?

Yes.

> If so, can't we stipulate that the FDW implementor should ensure that the 
> commit function always returns control to the caller?

How can the FDW implementor ensure that? Since even palloc could call
ereport(ERROR), I guess it's hard to require that of all FDW
implementors.
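
For illustration, even a seemingly trivial commit routine can throw,
because anything palloc-based reports ERROR on out-of-memory. A rough
sketch only -- the connection-entry type, its fields, and the helper
names here are hypothetical, not actual postgres_fdw code:

static void
my_fdw_commit_prepared(MyConnEntry *entry)
{
    /* psprintf() pallocs, so it can ereport(ERROR) on OOM */
    char       *sql = psprintf("COMMIT PREPARED '%s'", entry->prepared_gid);

    /* and sending the command to the remote server can error out too */
    my_exec_remote_command(entry->conn, sql);
}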

>
>
> > But in the future, I think we can have multiple background workers per
> > database for better performance.
>
> Does the database in "per database" mean the local database (that 
> applications connect to), or the remote database accessed via FDW?

I meant the local database. In the current patch, we launch the
resolver process per local database. My idea is to allow launching
multiple resolver processes for one local database as long as the
number of workers doesn't exceed the limit.

>
> I'm wondering how the FDW and background worker(s) can realize parallel 
> prepare and parallel commit.  That is, the coordinator transaction performs:
>
> 1. Issue prepare to all participant nodes, but doesn't wait for the reply for 
> each issue.
> 2. Waits for replies from all participants.
> 3. Issue commit to all participant nodes, but doesn't wait for the reply for 
> each issue.
> 4. Waits for replies from all participants.
>
> If we just consider PostgreSQL and don't think about FDW, we can use libpq 
> async functions -- PQsendQuery, PQconsumeInput, and PQgetResult.  pgbench 
> uses them so that one thread can issue SQL statements on multiple connections 
> in parallel.
>
> But when we consider the FDW interface, plus other DBMSs, how can we achieve 
> the parallelism?

It's still a rough idea, but I think we can use the TMASYNC flag and
xa_complete explained in the XA specification. The core transaction
manager calls the prepare, commit, and rollback APIs with the flag,
requiring them to execute the operation asynchronously and to return a
handle (e.g., a socket obtained by PQsocket in the postgres_fdw case)
to the transaction manager. The transaction manager then keeps polling
the handle until it becomes readable and tests for completion using
xa_complete() with no wait, until all foreign servers return OK on the
xa_complete check.
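
To illustrate the idea with postgres_fdw, the polling loop could look
roughly like the sketch below, using only libpq's asynchronous APIs
(PQsendQuery, PQsocket, PQconsumeInput, PQisBusy, PQgetResult). The
conns array, nconns, FDW_MAX_CONNS, the GID string, and the error
handling are simplified placeholders, not real code from the patch:

bool        done[FDW_MAX_CONNS] = {false};
int         remaining = nconns;

/* issue PREPARE TRANSACTION on every connection, without waiting */
for (int i = 0; i < nconns; i++)
    PQsendQuery(conns[i], "PREPARE TRANSACTION 'fx_1_100_200'");

while (remaining > 0)
{
    fd_set      rfds;
    int         maxfd = 0;

    FD_ZERO(&rfds);
    for (int i = 0; i < nconns; i++)
    {
        if (done[i])
            continue;
        FD_SET(PQsocket(conns[i]), &rfds);
        maxfd = Max(maxfd, PQsocket(conns[i]));
    }
    select(maxfd + 1, &rfds, NULL, NULL, NULL);

    for (int i = 0; i < nconns; i++)
    {
        PGresult   *res;

        if (done[i] || !FD_ISSET(PQsocket(conns[i]), &rfds))
            continue;
        PQconsumeInput(conns[i]);
        if (PQisBusy(conns[i]))
            continue;           /* reply has not fully arrived yet */
        while ((res = PQgetResult(conns[i])) != NULL)
            PQclear(res);       /* a real implementation checks the status */
        done[i] = true;
        remaining--;
    }
}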

>
>
> > > 3. postgres_fdw cannot detect remote updates when the UDF executed on a
> > remote node updates data.
> >
> > I assume that you mean the pushing the UDF down to a foreign server.
> > If so, I think we can do this by improving postgres_fdw. In the current 
> > patch,
> > registering and unregistering a foreign server to a group of 2PC and 
> > marking a
> > foreign server as updated is FDW responsible. So perhaps if we had a way to
> > tell postgres_fdw that the UDF might update the data on the foreign server,
> > postgres_fdw could mark the foreign server as updated if the UDF is 
> > shippable.
>
> Maybe we can consider VOLATILE functions update data.  That may be 
> overreaction, though.

Sorry, I don't understand that. Volatile functions are not pushed down
to the foreign servers in the first place, are they?

Regards,

--
Masahiko Sawada
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [PATCH] Finally split StdRdOptions into HeapOptions and ToastOptions

2020-09-13 Thread Nikolay Shaplov
In a message of Monday, 20 July 2020 18:36:44 MSK, Georgios
Kokolatos wrote:

Hi! Sorry for the really long delay; I was on my summer vacation, and then had
urgent things to finish first. :-( Now I hope we can continue... 


> thank you for the patch. It applies cleanly, compiles and passes check,
> check-world.

Thank you for your reviewing efforts. 
 
> I feel as per the discussion, this is a step to the right direction yet it
> does not get far enough. From experience, I can confirm that dealing with
> reloptions in a new table AM is somewhat of a pain. Ultimately, reloptions
> should be handled by the table AM specific code. The current patch does not
> address the issue. Yet it does make the issue easier to address by clearing
> up the current state.

Moving reloptions to AM code is the goal I am slowly working toward. I started 
some time ago with the big patch https://commitfest.postgresql.org/14/992/ and 
was told to split it into smaller parts. So I did, and this patch is the last 
step that cleans up the options-related code; after that, the actual move can 
be done.
 
> If you allow me, I have a couple of comments.
> 
> - saveFreeSpace = RelationGetTargetPageFreeSpace(relation,
> - 
>HEAP_DEFAULT_FILLFACTOR);
> + if (IsToastRelation(relation))
> + saveFreeSpace = ToastGetTargetPageFreeSpace();
> + else
> + saveFreeSpace = HeapGetTargetPageFreeSpace(relation);
> 
> For balance, it does make some sense for ToastGetTargetPageFreeSpace() to
> get relation as an argument, similarly to HeapGetTargetPageFreeSpace().

ToastGetTargetPageFreeSpace returns a constant value. I've changed the code so 
that it takes a relation argument that is not used, the way you suggested. But 
I am not sure whether that is a good or a bad idea. Maybe we will get an 
"unused variable" warning on some compilers. I like consistency... but I am not 
sure we need it here. 

> -   /* Retrieve the parallel_workers reloption, or -1 if not set. */
> -   rel->rel_parallel_workers = RelationGetParallelWorkers(relation, -1);
> +   /*
> +    * Retrieve the parallel_workers for heap and mat.view relations.
> +    * Use -1 if not set, or if we are dealing with other relation kinds.
> +    */
> +   if (relation->rd_rel->relkind == RELKIND_RELATION ||
> +       relation->rd_rel->relkind == RELKIND_MATVIEW)
> +       rel->rel_parallel_workers = RelationGetParallelWorkers(relation, -1);
> +   else
> +       rel->rel_parallel_workers = -1;
>
> Also, this pattern is repeated in four places, maybe the branch can be
> moved inside a macro or static inline instead?

> If the comment above is agreed upon, then it makes a bit of sense to apply
> the same here. The expression in the branch is already asserted for in
> macro, why not switch there and remove the responsibility from the caller?

I guess you are right here, because the logic is the following: for heap 
relations, take the option from the reloptions; for _all_ others, use -1. This 
can be moved to a macro.

So I changed it to 

/*
 * HeapGetParallelWorkers
 *      Returns the heap's parallel_workers reloption setting.
 *      Note multiple eval of argument!
 */
#define HeapGetParallelWorkers(relation, defaultpw) \
    (AssertMacro(relation->rd_rel->relkind == RELKIND_RELATION || \
                 relation->rd_rel->relkind == RELKIND_MATVIEW), \
     (relation)->rd_options ? \
      ((HeapOptions *) (relation)->rd_options)->parallel_workers : \
      (defaultpw))

/*
 * RelationGetParallelWorkers
 *      Returns the relation's parallel_workers reloption setting.
 *      Note multiple eval of argument!
 */
#define RelationGetParallelWorkers(relation, defaultpw) \
    (((relation)->rd_rel->relkind == RELKIND_RELATION || \
      (relation)->rd_rel->relkind == RELKIND_MATVIEW) ? \
      HeapGetParallelWorkers(relation, defaultpw) : defaultpw)
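
(If we instead went the static-inline route you mention, it would
presumably look something like the sketch below -- I have not tried
this, it is only to show the shape:)

static inline int
RelationGetParallelWorkers(Relation relation, int defaultpw)
{
    if (relation->rd_rel->relkind != RELKIND_RELATION &&
        relation->rd_rel->relkind != RELKIND_MATVIEW)
        return defaultpw;

    return relation->rd_options ?
        ((HeapOptions *) relation->rd_options)->parallel_workers :
        defaultpw;
}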


But I would not like to move

if (IsToastRelation(relation))

Re: Sometimes the output to the stdout in Windows disappears

2020-09-13 Thread Tom Lane
I happened to try googling for other similar reports, and I found
a very interesting recent thread here:

https://github.com/nodejs/node/issues/33166

It might not have the same underlying cause, of course, but it sure
sounds familiar.  If Node.js are really seeing the same effect,
that would point to an underlying Windows bug rather than anything
Postgres is doing wrong.

It doesn't look like the Node.js crew got any closer to
understanding the issue than we have, unfortunately.  They made
their problem mostly go away by reverting a seemingly-unrelated
patch.  But I can't help thinking that it's a timing-related bug,
and that patch was just unlucky enough to change the timing of
their tests so that they saw the failure frequently.

regards, tom lane




Re: Probable documentation errors or improvements

2020-09-13 Thread David G. Johnston
On Thu, Sep 10, 2020 at 12:20 PM Yaroslav  wrote:

> Disclaimer: I'm not a native speaker, so not sure those are actually
> incorrect, and can't offer non-trivial suggestions.


I skimmed about two-thirds of these, and while I agree on the surface that
probably three-quarters of them are improvements, they aren't clear enough
wins to methodically go through and write up one or more patches.

There are a few areas that seem outdated that could use a good once-over in
the sgml sources to clean up.  For those I'd expect that I or someone else
may go and write up actual patches.

This run-on presentation is off-putting.  Even if actual patches are not
forthcoming I would at least suggest one email per topic with the
suggestions broken down into at least bugs (at the top) and
non-bugs/improvements.

David J.


Gripes about walsender command processing

2020-09-13 Thread Tom Lane
While trying to understand a recent buildfarm failure [1], I got annoyed
by the fact that a walsender exposes its last SQL command in
pg_stat_activity even when that was a long time ago and what it's been
doing since then is replication commands.  This leads to *totally*
misleading reports, for instance in [1] we read

2020-09-13 19:10:09.632 CEST [5f5e526d.242415:4] LOG:  server process (PID 
2368853) was terminated by signal 11: Segmentation fault
2020-09-13 19:10:09.632 CEST [5f5e526d.242415:5] DETAIL:  Failed process was 
running: SELECT pg_catalog.set_config('search_path', '', false);

even though the process's own log messages paint a much better picture of
reality:

2020-09-13 19:10:07.302 CEST [5f5e526f.242555:1] LOG:  connection received: 
host=[local]
2020-09-13 19:10:07.303 CEST [5f5e526f.242555:2] LOG:  replication connection 
authorized: user=bf application_name=sub2
2020-09-13 19:10:07.303 CEST [5f5e526f.242555:3] LOG:  statement: SELECT 
pg_catalog.set_config('search_path', '', false);
2020-09-13 19:10:07.334 CEST [5f5e526f.242555:4] LOG:  received replication 
command: IDENTIFY_SYSTEM
2020-09-13 19:10:07.335 CEST [5f5e526f.242555:5] LOG:  received replication 
command: START_REPLICATION SLOT "sub2" LOGICAL 0/0 (proto_version '2', 
publication_names '"pub2"')
2020-09-13 19:10:07.335 CEST [5f5e526f.242555:6] LOG:  starting logical 
decoding for slot "sub2"
2020-09-13 19:10:07.335 CEST [5f5e526f.242555:7] DETAIL:  Streaming 
transactions committing after 0/159EB38, reading WAL from 0/159EB00.
2020-09-13 19:10:07.335 CEST [5f5e526f.242555:8] LOG:  logical decoding found 
consistent point at 0/159EB00
2020-09-13 19:10:07.335 CEST [5f5e526f.242555:9] DETAIL:  There are no running 
transactions.

I think that walsenders ought to report their current replication command
as though it were a SQL command.  They oughta set debug_query_string, too.

While trying to fix this, I also observed that exec_replication_command
fails to clean up the temp context it made for parsing the command string,
if that turns out to be a SQL command.  This very accidentally fails to
lead to a long-term memory leak, because that context will be a child of
MessageContext so it'll be cleaned out in the next iteration of
PostgresMain's main loop.  But it's still unacceptably sloppy coding
IMHO, and it's closely related to a lot of other randomness in the
function; it'd be better to have a separate early-exit path for SQL
commands.

Attached find a proposed fix for these things.

What I'd really like to do is adjust pgstat_report_activity so that
callers are *required* to provide some kind of command string when
transitioning into STATE_RUNNING state, ie something like

Assert(state == STATE_RUNNING ? cmd_str != NULL : cmd_str == NULL);

However, before we could do that, we'd have to clean up the rat's nest
of seemingly-inserted-with-the-aid-of-a-dartboard pgstat_report_activity
calls in replication/logical/worker.c.  I couldn't make heads or tails
of what is going on there, so I did not try.

BTW, while I didn't change it here, isn't the second
SnapBuildClearExportedSnapshot call in exec_replication_command just
useless?  We already did that before parsing the command.

regards, tom lane

[1] 
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=idiacanthus&dt=2020-09-13%2016%3A54%3A03

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 3f756b470a..9a2b154797 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1545,6 +1545,9 @@ exec_replication_command(const char *cmd_string)
 
 	CHECK_FOR_INTERRUPTS();
 
+	/*
+	 * Parse the command.
+	 */
 	cmd_context = AllocSetContextCreate(CurrentMemoryContext,
 		"Replication command context",
 		ALLOCSET_DEFAULT_SIZES);
@@ -1560,15 +1563,38 @@ exec_replication_command(const char *cmd_string)
 
 	cmd_node = replication_parse_result;
 
+	/*
+	 * If it's a SQL command, just clean up our mess and return false; the
+	 * caller will take care of executing it.
+	 */
+	if (IsA(cmd_node, SQLCmd))
+	{
+		if (MyDatabaseId == InvalidOid)
+			ereport(ERROR,
+	(errmsg("cannot execute SQL commands in WAL sender for physical replication")));
+
+		MemoryContextSwitchTo(old_context);
+		MemoryContextDelete(cmd_context);
+
+		/* Tell the caller that this wasn't a WalSender command. */
+		return false;
+	}
+
+	/*
+	 * Report query to various monitoring facilities.  For this purpose, we
+	 * report replication commands just like SQL commands.
+	 */
+	debug_query_string = cmd_string;
+
+	pgstat_report_activity(STATE_RUNNING, cmd_string);
+
 	/*
 	 * Log replication command if log_replication_commands is enabled. Even
 	 * when it's disabled, log the command with DEBUG1 level for backward
-	 * compatibility. Note that SQL commands are not logged here, and will be
-	 * logged later if log_statement is enabled.
+	 * compatibility.
 	 */
-	if (cmd_node->type != T_SQLCmd)
-		

Minor documentation error regarding streaming replication protocol

2020-09-13 Thread Brar Piening
While implementing streaming replication client functionality for Npgsql
I stumbled upon a minor documentation error at
https://www.postgresql.org/docs/current/protocol-replication.html

The "content" return value for the TIMELINE_HISTORYcommand is advertised
as bytea while it is in fact raw ASCII bytes.

How did I find out?
Since the value I get doesn't start with "\x" and contains ASCII text,
although I have bytea_output set to hex, I first thought that the streaming
replication protocol simply doesn't honor bytea_output, but then I
realized that I also get unencoded tabs and newlines, which wouldn't be
possible if the value were passed through byteaout.

This is certainly a minor problem, since the timeline history file only
contains generated strings that are ASCII-only, so just using the
unencoded bytes is actually easier than decoding bytea.
OTOH, it did cost me a few hours (writing a bytea decoder and figuring
out why it doesn't work by looking at varlena.c and proving the docs
wrong), so I want to point this out here since it is possibly
unintended behavior or at least a documentation error.
Also, I'm wary of taking a dependency on an undocumented implementation
detail that could change at any point.
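
For what it's worth, the behavior is easy to reproduce with plain libpq
as well. I'm actually using Npgsql, so treat this only as an
illustrative sketch; the connection string and the timeline id are made
up:

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    /* walsender connection; adjust conninfo for your setup */
    PGconn     *conn = PQconnectdb("replication=true user=postgres");
    PGresult   *res;

    if (PQstatus(conn) != CONNECTION_OK)
        return 1;

    res = PQexec(conn, "TIMELINE_HISTORY 2");
    if (PQresultStatus(res) == PGRES_TUPLES_OK)
    {
        /*
         * Column 1 ("content") prints as plain text with raw tabs and
         * newlines -- it is not hex- or escape-encoded as bytea would be.
         */
        printf("%s\n", PQgetvalue(res, 0, 1));
    }
    PQclear(res);
    PQfinish(conn);
    return 0;
}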

Regards,
Brar





Re: Minor documentation error regarding streaming replication protocol

2020-09-13 Thread Brar Piening
Brar Piening wrote:
> This is certainly a minor problem

After thinking about it a little more, I think that at least the fact
that even the row description message advertises the content as bytea
could be considered a bug - no?






Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2020-09-13 Thread Tom Lane
Amit Kapila  writes:
> Pushed.

Observe the following reports:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=idiacanthus&dt=2020-09-13%2016%3A54%3A03
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=desmoxytes&dt=2020-09-10%2009%3A08%3A03
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=komodoensis&dt=2020-09-05%2020%3A22%3A02
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2020-09-04%2001%3A52%3A03
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2020-09-03%2020%3A54%3A04

These are all on HEAD, and all within the last ten days, and I see
nothing comparable in any branch before that.  So it's hard to avoid
the conclusion that somebody broke something about ten days ago.

None of these animals provided gdb backtraces; but we do have a built-in
trace from several, and they all look like pgoutput.so is trying to
list_free() garbage, somewhere inside a relcache invalidation/rebuild
scenario:

TRAP: FailedAssertion("list->length > 0", File: 
"/home/bf/build/buildfarm-idiacanthus/HEAD/pgsql.build/../pgsql/src/backend/nodes/list.c",
 Line: 68)
postgres: publisher: walsender bf [local] 
idle(ExceptionalCondition+0x57)[0x9081f7]
postgres: publisher: walsender bf [local] idle[0x6bcc70]
postgres: publisher: walsender bf [local] idle(list_free+0x11)[0x6bdc01]
/home/bf/build/buildfarm-idiacanthus/HEAD/pgsql.build/tmp_install/home/bf/build/buildfarm-idiacanthus/HEAD/inst/lib/postgresql/pgoutput.so(+0x35d8)[0x7fa4c5a6f5d8]
postgres: publisher: walsender bf [local] 
idle(LocalExecuteInvalidationMessage+0x15b)[0x8f0cdb]
postgres: publisher: walsender bf [local] 
idle(ReceiveSharedInvalidMessages+0x4b)[0x7bca0b]
postgres: publisher: walsender bf [local] idle(LockRelationOid+0x56)[0x7c19e6]
postgres: publisher: walsender bf [local] idle(relation_open+0x1c)[0x4a2d0c]
postgres: publisher: walsender bf [local] idle(table_open+0x6)[0x524486]
postgres: publisher: walsender bf [local] idle[0x9017f2]
postgres: publisher: walsender bf [local] idle[0x8fabd4]
postgres: publisher: walsender bf [local] idle[0x8fa58a]
postgres: publisher: walsender bf [local] 
idle(RelationCacheInvalidateEntry+0xaf)[0x8fbdbf]
postgres: publisher: walsender bf [local] 
idle(LocalExecuteInvalidationMessage+0xec)[0x8f0c6c]
postgres: publisher: walsender bf [local] 
idle(ReceiveSharedInvalidMessages+0xcb)[0x7bca8b]
postgres: publisher: walsender bf [local] idle(LockRelationOid+0x56)[0x7c19e6]
postgres: publisher: walsender bf [local] idle(relation_open+0x1c)[0x4a2d0c]
postgres: publisher: walsender bf [local] idle(table_open+0x6)[0x524486]
postgres: publisher: walsender bf [local] idle[0x8ee8b0]

010_truncate.pl itself hasn't changed meaningfully in a good long time.
However, I see that 464824323 added a whole boatload of code to
pgoutput.c, and the timing is right for that commit to be the culprit,
so that's what I'm betting on.

Probably this requires a relcache inval at the wrong time;
although we have recent passes from CLOBBER_CACHE_ALWAYS animals,
so that can't be the whole triggering condition.  I wonder whether
it is relevant that all of the complaining animals are JIT-enabled.

regards, tom lane




Re: A micro-optimisation for walkdir()

2020-09-13 Thread Thomas Munro
On Tue, Sep 8, 2020 at 10:53 AM Tom Lane  wrote:
> Thomas Munro  writes:
> > FWIW I just spotted a couple of very suspicious looking failures for
> > build farm animal "walleye", a "MinGW64 8.1.0" system, that say:
>
> walleye's been kind of unstable since the get-go, so I wouldn't put
> too much faith in reports just from it.

CC'ing animal owner.

I suspect that someone who knows about PostgreSQL
on Windows would recognise the above symptom, but my guess is the
Windows "indexing" service is on, or an antivirus thing, or some other
kind of automatically-open-and-sniff-every-file-on-certain-file-events
thing.  It looks like nothing of ours is even running at that moment
("waiting for server to shut down done"), and it's the RMDIR /s
shell command that is reporting the error.  The other low probability
error seen on this host is this one:

+ERROR:  could not stat file "pg_wal/00010007":
Permission denied




Re: Gripes about walsender command processing

2020-09-13 Thread Tom Lane
I wrote:
> I think that walsenders ought to report their current replication command
> as though it were a SQL command.  They oughta set debug_query_string, too.

I also notice that a walsender does not maintain its ps status:

postgres  921109  100  0.0  32568 11880 ?Rs   18:57   0:51 postgres: 
publisher: walsender postgres [local] idle

I don't mind if we decide we don't need to reflect the walsender's
true activity, but let's not have it showing an outright lie.
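
(Presumably keeping it honest would just be a matter of calling
set_ps_display() at the right spots in the walsender command loop;
roughly the sketch below, though I haven't tried it:)

/* while processing a replication command */
set_ps_display("START_REPLICATION");
...
/* and back to idle once the command is done */
set_ps_display("idle");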

regards, tom lane




Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2020-09-13 Thread Tom Lane
I wrote:
> Probably this requires a relcache inval at the wrong time;
> although we have recent passes from CLOBBER_CACHE_ALWAYS animals,
> so that can't be the whole triggering condition.  I wonder whether
> it is relevant that all of the complaining animals are JIT-enabled.

Hmmm ... I take that back.  hyrax has indeed passed since this went
in, but *it doesn't run any TAP tests*.  So the buildfarm offers no
information about whether the replication tests work under
CLOBBER_CACHE_ALWAYS.

Realizing that, I built an installation that way and tried to run
the subscription tests.  Results so far:

* Running 010_truncate.pl by itself passed for me.  So there's still
some unexplained factor needed to trigger the buildfarm failures.
(I'm wondering about concurrent autovacuum activity now...)

* Starting over, it appears that 001_rep_changes.pl almost immediately
gets into an infinite loop.  It does not complete the third test step,
rather infinitely waiting for progress to be made.  The publisher log
shows a repeating loop like

2020-09-13 21:16:05.734 EDT [928529] tap_sub LOG:  could not send data to 
client: Broken pipe
2020-09-13 21:16:05.734 EDT [928529] tap_sub CONTEXT:  slot "tap_sub", output 
plugin "pgoutput", in the commit callback, associated LSN 0/1660628
2020-09-13 21:16:05.843 EDT [928581] 001_rep_changes.pl LOG:  statement: SELECT 
pg_current_wal_lsn() <= replay_lsn AND state = 'streaming' FROM 
pg_catalog.pg_stat_replication WHERE application_name = 'tap_sub';
2020-09-13 21:16:05.861 EDT [928582] tap_sub LOG:  statement: SELECT 
pg_catalog.set_config('search_path', '', false);
2020-09-13 21:16:05.929 EDT [928582] tap_sub LOG:  received replication 
command: IDENTIFY_SYSTEM
2020-09-13 21:16:05.930 EDT [928582] tap_sub LOG:  received replication 
command: START_REPLICATION SLOT "tap_sub" LOGICAL 0/1652820 (proto_version '2', 
publication_names '"tap_pub","tap_pub_ins_only"')
2020-09-13 21:16:05.930 EDT [928582] tap_sub LOG:  starting logical decoding 
for slot "tap_sub"
2020-09-13 21:16:05.930 EDT [928582] tap_sub DETAIL:  Streaming transactions 
committing after 0/1652820, reading WAL from 0/1651B20.
2020-09-13 21:16:05.930 EDT [928582] tap_sub LOG:  logical decoding found 
consistent point at 0/1651B20
2020-09-13 21:16:05.930 EDT [928582] tap_sub DETAIL:  There are no running 
transactions.
2020-09-13 21:16:21.560 EDT [928600] 001_rep_changes.pl LOG:  statement: SELECT 
pg_current_wal_lsn() <= replay_lsn AND state = 'streaming' FROM 
pg_catalog.pg_stat_replication WHERE application_name = 'tap_sub';
2020-09-13 21:16:37.291 EDT [928610] 001_rep_changes.pl LOG:  statement: SELECT 
pg_current_wal_lsn() <= replay_lsn AND state = 'streaming' FROM 
pg_catalog.pg_stat_replication WHERE application_name = 'tap_sub';
2020-09-13 21:16:52.959 EDT [928627] 001_rep_changes.pl LOG:  statement: SELECT 
pg_current_wal_lsn() <= replay_lsn AND state = 'streaming' FROM 
pg_catalog.pg_stat_replication WHERE application_name = 'tap_sub';
2020-09-13 21:17:06.866 EDT [928636] tap_sub LOG:  statement: SELECT 
pg_catalog.set_config('search_path', '', false);
2020-09-13 21:17:06.934 EDT [928636] tap_sub LOG:  received replication 
command: IDENTIFY_SYSTEM
2020-09-13 21:17:06.934 EDT [928636] tap_sub LOG:  received replication 
command: START_REPLICATION SLOT "tap_sub" LOGICAL 0/1652820 (proto_version '2', 
publication_names '"tap_pub","tap_pub_ins_only"')
2020-09-13 21:17:06.934 EDT [928636] tap_sub ERROR:  replication slot "tap_sub" 
is active for PID 928582
2020-09-13 21:17:07.811 EDT [928638] tap_sub LOG:  statement: SELECT 
pg_catalog.set_config('search_path', '', false);
2020-09-13 21:17:07.880 EDT [928638] tap_sub LOG:  received replication 
command: IDENTIFY_SYSTEM
2020-09-13 21:17:07.881 EDT [928638] tap_sub LOG:  received replication 
command: START_REPLICATION SLOT "tap_sub" LOGICAL 0/1652820 (proto_version '2', 
publication_names '"tap_pub","tap_pub_ins_only"')
2020-09-13 21:17:07.881 EDT [928638] tap_sub ERROR:  replication slot "tap_sub" 
is active for PID 928582
2020-09-13 21:17:08.618 EDT [928641] 001_rep_changes.pl LOG:  statement: SELECT 
pg_current_wal_lsn() <= replay_lsn AND state = 'streaming' FROM 
pg_catalog.pg_stat_replication WHERE application_name = 'tap_sub';
2020-09-13 21:17:08.753 EDT [928642] tap_sub LOG:  statement: SELECT 
pg_catalog.set_config('search_path', '', false);
2020-09-13 21:17:08.821 EDT [928642] tap_sub LOG:  received replication 
command: IDENTIFY_SYSTEM
2020-09-13 21:17:08.821 EDT [928642] tap_sub LOG:  received replication 
command: START_REPLICATION SLOT "tap_sub" LOGICAL 0/1652820 (proto_version '2', 
publication_names '"tap_pub","tap_pub_ins_only"')
2020-09-13 21:17:08.821 EDT [928642] tap_sub ERROR:  replication slot "tap_sub" 
is active for PID 928582
2020-09-13 21:17:09.689 EDT [928645] tap_sub LOG:  statement: SELECT 
pg_catalog.set_config('search_path', '', false);
2020-09-13 21:17:09.756 EDT [928645] tap_sub LOG:  received replication 
command: IDENTIFY_SYSTEM
2020-0

Re: Avoid incorrect allocation in buildIndexArray

2020-09-13 Thread Michael Paquier
On Sat, Sep 12, 2020 at 12:40:49PM +0200, Julien Rouhaud wrote:
> agreed.

Ok, done as ac673a1 then.
--
Michael




typo in snapmgr.c and procarray.c

2020-09-13 Thread btnakamichin

Hi,

Attached fixes $subject.

Thanks,
Naoki Nakamichi

diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 1c0cd6b248..672d2083cc 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -4106,7 +4106,7 @@ GlobalVisCheckRemovableXid(Relation rel, TransactionId xid)
  *
  * Be very careful about when to use this function. It can only safely be used
  * when there is a guarantee that xid is within MaxTransactionId / 2 xids of
- * rel. That e.g. can be guaranteed if the the caller assures a snapshot is
+ * rel. That e.g. can be guaranteed if the caller assures a snapshot is
  * held by the backend and xid is from a table (where vacuum/freezing ensures
  * the xid has to be within that range), or if xid is from the procarray and
  * prevents xid wraparound that way.
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index 22cf3ebaf4..ed7f5239a0 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -1855,7 +1855,7 @@ TransactionIdLimitedForOldSnapshots(TransactionId recentXmin,
 		if (ts == threshold_timestamp)
 		{
 			/*
-			 * Current timestamp is in same bucket as the the last limit that
+			 * Current timestamp is in same bucket as the last limit that
 			 * was applied. Reuse.
 			 */
 			xlimit = threshold_xid;


Re: typo in snapmgr.c and procarray.c

2020-09-13 Thread Fujii Masao




On 2020/09/14 11:06, btnakamichin wrote:

Hi,

Attached fixes $subject.


Thanks for the patch! LGTM. I will commit it.


Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2020-09-13 Thread Amit Kapila
On Mon, Sep 14, 2020 at 3:08 AM Tom Lane  wrote:
>
> Amit Kapila  writes:
> > Pushed.
>
> Observe the following reports:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=idiacanthus&dt=2020-09-13%2016%3A54%3A03
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=desmoxytes&dt=2020-09-10%2009%3A08%3A03
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=komodoensis&dt=2020-09-05%2020%3A22%3A02
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2020-09-04%2001%3A52%3A03
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2020-09-03%2020%3A54%3A04
>
> These are all on HEAD, and all within the last ten days, and I see
> nothing comparable in any branch before that.  So it's hard to avoid
> the conclusion that somebody broke something about ten days ago.
>

I'll analyze these reports.

-- 
With Regards,
Amit Kapila.




Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2020-09-13 Thread Tom Lane
I wrote:
> * Starting over, it appears that 001_rep_changes.pl almost immediately
> gets into an infinite loop.  It does not complete the third test step,
> rather infinitely waiting for progress to be made.

Ah, looking closer, the problem is that wal_receiver_timeout = 60s
is too short when the sender is using CCA.  It times out before we
can get through the needed data transmission.

regards, tom lane




Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2020-09-13 Thread Amit Kapila
On Mon, Sep 14, 2020 at 3:08 AM Tom Lane  wrote:
>
> Amit Kapila  writes:
> > Pushed.
>
> Observe the following reports:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=idiacanthus&dt=2020-09-13%2016%3A54%3A03
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=desmoxytes&dt=2020-09-10%2009%3A08%3A03
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=komodoensis&dt=2020-09-05%2020%3A22%3A02
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2020-09-04%2001%3A52%3A03
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2020-09-03%2020%3A54%3A04
>
> These are all on HEAD, and all within the last ten days, and I see
> nothing comparable in any branch before that.  So it's hard to avoid
> the conclusion that somebody broke something about ten days ago.
>
> None of these animals provided gdb backtraces; but we do have a built-in
> trace from several, and they all look like pgoutput.so is trying to
> list_free() garbage, somewhere inside a relcache invalidation/rebuild
> scenario:
>

Yeah, this is right, and here is some initial analysis. It seems to be
failing in the code below:

rel_sync_cache_relation_cb() { ... list_free(entry->streamed_txns); ... }

This list can have elements only in 'streaming' mode (which needs to be
enabled with the CREATE SUBSCRIPTION command), whereas none of the
tests in 010_truncate.pl uses 'streaming', so this list should be
empty (NULL). The two different assertion failures shown in the BF
reports in the list_free code are as below:
Assert(list->length > 0);
Assert(list->length <= list->max_length);

It seems to me that this list is not initialized properly when it is
not used, or maybe that is true only in some special circumstances,
because we do initialize it in get_rel_sync_entry(). I am not sure if
the CCI build is impacting this in some way.
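
If it is indeed an initialization hole, the defensive fix would
presumably be along these lines (just a sketch to show the idea, not a
tested patch): make sure streamed_txns starts out as NIL whenever a
cache entry is built in get_rel_sync_entry(), and reset it after
freeing, e.g.

    /* in rel_sync_cache_relation_cb(), after releasing the list */
    list_free(entry->streamed_txns);
    entry->streamed_txns = NIL;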

--
With Regards,
Amit Kapila.




Re: ModifyTable overheads in generic plans

2020-09-13 Thread Amit Langote
Hello,

On Fri, Aug 7, 2020 at 9:26 PM Amit Langote  wrote:
> On Tue, Aug 4, 2020 at 3:15 PM Amit Langote  wrote:
> > On Sat, Aug 1, 2020 at 4:46 AM Robert Haas  wrote:
> > > On Fri, Jun 26, 2020 at 8:36 AM Amit Langote  
> > > wrote:
> > > > 0001 and 0002 are preparatory patches.
> > >
> > > I read through these patches a bit but it's really unclear what the
> > > point of them is. I think they need better commit messages, or better
> > > comments, or both.
> >
> > Thanks for taking a look.  Sorry about the lack of good commentary,
> > which I have tried to address in the attached updated version. I
> > extracted one more part as preparatory from the earlier 0003 patch, so
> > there are 4 patches now.
> >
> > Also as discussed with Daniel, I have changed the patches so that they
> > can be applied on plain HEAD instead of having to first apply the
> > patches at [1].  Without runtime pruning for UPDATE/DELETE proposed in
> > [1], optimizing ResultRelInfo creation by itself does not improve the
> > performance/scalability by that much, but the benefit of lazily
> > creating ResultRelInfos seems clear so I think maybe it's okay to
> > pursue this independently.
>
> Per cfbot's automatic patch tester, there were some issues in the 0004 patch:
>
> nodeModifyTable.c: In function ‘ExecModifyTable’:
> nodeModifyTable.c:2484:24: error: ‘junkfilter’ may be used
> uninitialized in this function [-Werror=maybe-uninitialized]
>   junkfilter->jf_junkAttNo,
>   ^
> nodeModifyTable.c:2309:14: note: ‘junkfilter’ was declared here
>   JunkFilter *junkfilter;
>   ^
> cc1: all warnings being treated as errors
> : recipe for target 'nodeModifyTable.o' failed
> make[3]: *** [nodeModifyTable.o] Error 1
>
> Fixed in the attached updated version

Needed a rebase due to f481d28232.  Attached updated patches.

-- 
Amit Langote
EnterpriseDB: http://www.enterprisedb.com


v5-0004-Initialize-result-relation-information-lazily.patch
Description: Binary data


v5-0001-Revise-how-FDWs-obtain-result-relation-informatio.patch
Description: Binary data


v5-0002-Don-t-make-root-ResultRelInfo-for-insert-queries.patch
Description: Binary data


v5-0003-Revise-child-to-root-tuple-conversion-map-managem.patch
Description: Binary data


Re: Print logical WAL message content

2020-09-13 Thread Ashutosh Bapat
On Fri, 11 Sep 2020 at 04:08, Alvaro Herrera 
wrote:

>
> Pushed, thanks!
>

Thanks Alvaro.

-- 
Best Wishes,
Ashutosh


Re: Making index_set_state_flags() transactional

2020-09-13 Thread Michael Paquier
On Fri, Sep 11, 2020 at 06:42:09PM +0300, Anastasia Lubennikova wrote:
> I agree that transactional update in index_set_state_flags() is good for
> both cases, that you mentioned and don't see any restrictions. It seems that
> not doing this before was just a loose end, as the comment from 813fb03
> patch clearly stated, that it is safe to make this update transactional.
> 
> The patch itself looks clear and committable.

Thanks for double-checking, Anastasia.  Committed.
--
Michael




Re: typo in snapmgr.c and procarray.c

2020-09-13 Thread Fujii Masao




On 2020/09/14 11:14, Fujii Masao wrote:



On 2020/09/14 11:06, btnakamichin wrote:

Hi,

Attached fixes $subject.


Thanks for the patch! LGTM. I will commit it.


Pushed. Thanks!

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Optimising compactify_tuples()

2020-09-13 Thread David Rowley
On Fri, 11 Sep 2020 at 17:48, Thomas Munro  wrote:
>
> On Fri, Sep 11, 2020 at 3:53 AM David Rowley  wrote:
> > That gets my benchmark down to 60.8 seconds, so 2.2 seconds better than v4b.
>
> . o O ( I wonder if there are opportunities to squeeze some more out
> of this with __builtin_prefetch... )

I'd be tempted to go down that route if we had macros already defined
for that, but it looks like we don't.
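
(For reference, the sort of wrapper I mean would presumably be
something like the sketch below; nothing like it exists in the tree
today, as far as I can see, and the name is made up:)

/* hypothetical portability wrapper, sketch only */
#if defined(__GNUC__) || defined(__clang__)
#define pg_prefetch_read(addr)  __builtin_prefetch((addr), 0, 3)
#else
#define pg_prefetch_read(addr)  ((void) 0)
#endif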

> > I've attached v6b and an updated chart showing the results of the 10
> > runs I did of it.
>
> One failure seen like this while running check word (cfbot):
>
> #2 0x0091f93f in ExceptionalCondition
> (conditionName=conditionName@entry=0x987284 "nitems > 0",
> errorType=errorType@entry=0x97531d "FailedAssertion",
> fileName=fileName@entry=0xa9df0d "bufpage.c",
> lineNumber=lineNumber@entry=442) at assert.c:67

Thanks.  I neglected to check that the other call site properly checked
for nitems > 0.  It looks like PageIndexMultiDelete() relied on
compactify_tuples() to set pd_upper to pd_special when nitems == 0.
That's not what PageRepairFragmentation() did, so I've now aligned the
two so they work the same way.
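
In other words, both callers now end up doing roughly this (a sketch of
the shape only, with the argument list abbreviated; the real change is
in the attached patches):

    if (nitems > 0)
        compactify_tuples(itemidbase, nitems, page);    /* asserts nitems > 0 */
    else
        ((PageHeader) page)->pd_upper = ((PageHeader) page)->pd_special;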

I've attached patches in git format-patch format. I'm proposing to
commit these in about 48 hours time unless there's some sort of
objection before then.

Thanks for reviewing this.

David


v8-0001-Optimize-compactify_tuples-function.patch
Description: Binary data


v8-0002-Report-resource-usage-at-the-end-of-recovery.patch
Description: Binary data


Re: problem with RETURNING and update row movement

2020-09-13 Thread Amit Langote
Hi Alvaro,

On Sat, Sep 12, 2020 at 5:42 AM Alvaro Herrera  wrote:
>
> I noticed that this bugfix has stalled, probably because the other
> bugfix has also stalled.
>
> It seems that cleanly returning system columns from table AM is not
> going to be a simple task -- whatever fix we get for that is likely not
> going to make it all the way to PG 12.  So I suggest that we should fix
> this bug in 11-13 without depending on that.

Although I would be reversing course on what I said upthread, I tend
to agree with that, because the core idea behind the fix for this
issue does not seem likely to be invalidated by any conclusion
regarding the other issue.  That is, as far as the issue here is
concerned, we can't avoid falling back to using the source partition's
RETURNING projection whose scan tuple is provided using the source
partition's tuple slot.

However, given that we have to pass the *new* tuple to the projection,
not the old one, we will need a "clean" way to transfer its (the new
tuple's) system columns into the source partition's tuple slot.  The
sketch I gave upthread of what that could look like was not liked by
Fujita-san much.



--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com