Re: Transactions involving multiple postgres foreign servers, take 2
On Fri, 11 Sep 2020 at 18:24, tsunakawa.ta...@fujitsu.com wrote:
>
> From: Masahiko Sawada
> > On Tue, 8 Sep 2020 at 13:00, tsunakawa.ta...@fujitsu.com wrote:
> > > 2. 2PC processing is queued and serialized in one background worker. That
> > > severely subdues transaction throughput. Each backend should perform 2PC.
> >
> > Not sure it's safe for each backend to perform PREPARE and COMMIT
> > PREPARED, since the current design aims to avoid an inconsistency
> > between the actual transaction result and the result the user sees.
>
> As Fujii-san is asking, I also would like to know what situation you think is
> not safe. Are you worried that the FDW's commit function might call
> ereport(ERROR | FATAL | PANIC)?

Yes.

> If so, can't we stipulate that the FDW implementor should ensure that the
> commit function always returns control to the caller?

How can the FDW implementor ensure that? Since even palloc could call
ereport(ERROR), I guess it's hard to require that of all FDW implementors.

> > But in the future, I think we can have multiple background workers per
> > database for better performance.
>
> Does the database in "per database" mean the local database (that
> applications connect to), or the remote database accessed via FDW?

I meant the local database. In the current patch, we launch the resolver
process per local database. My idea is to allow launching multiple resolver
processes for one local database as long as the number of workers doesn't
exceed the limit.

> I'm wondering how the FDW and background worker(s) can realize parallel
> prepare and parallel commit. That is, the coordinator transaction performs:
>
> 1. Issue prepare to all participant nodes, but doesn't wait for the reply for
> each issue.
> 2. Waits for replies from all participants.
> 3. Issue commit to all participant nodes, but doesn't wait for the reply for
> each issue.
> 4. Waits for replies from all participants.
>
> If we just consider PostgreSQL and don't think about FDW, we can use libpq
> async functions -- PQsendQuery, PQconsumeInput, and PQgetResult. pgbench
> uses them so that one thread can issue SQL statements on multiple connections
> in parallel.
>
> But when we consider the FDW interface, plus other DBMSs, how can we achieve
> the parallelism?

It's still a rough idea, but I think we can use the TMASYNC flag and
xa_complete explained in the XA specification. The core transaction manager
calls the prepare, commit, and rollback APIs with that flag, requiring the
operation to be executed asynchronously and a handle (e.g., a socket obtained
by PQsocket in the postgres_fdw case) to be returned to the transaction
manager. The transaction manager then keeps polling the handle until it
becomes readable, and tests for completion with a non-blocking xa_complete(),
until all foreign servers report OK.

> > > 3. postgres_fdw cannot detect remote updates when the UDF executed on a
> > > remote node updates data.
> >
> > I assume that you mean pushing the UDF down to a foreign server.
> > If so, I think we can do this by improving postgres_fdw. In the current
> > patch, registering and unregistering a foreign server to a group of 2PC and
> > marking a foreign server as updated is the FDW's responsibility. So perhaps
> > if we had a way to tell postgres_fdw that the UDF might update the data on
> > the foreign server, postgres_fdw could mark the foreign server as updated
> > if the UDF is shippable.
>
> Maybe we can consider that VOLATILE functions update data. That may be an
> overreaction, though.
Sorry, I don't understand that. Volatile functions are not pushed down to the foreign servers in the first place, no?

Regards,

--
Masahiko Sawada
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
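For illustration, here is a rough sketch (not part of the patch under discussion) of the libpq-level "issue prepare everywhere, then wait" pattern the quoted mail refers to, using only documented libpq calls (PQsendQuery, PQsocket, PQconsumeInput, PQisBusy, PQgetResult); the conns[]/gid inputs are assumptions for the example only:

/*
 * Hypothetical illustration: issue PREPARE TRANSACTION on every participant
 * connection without waiting, then poll the sockets until all replies have
 * arrived.  Error handling is minimal.
 */
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/select.h>
#include "libpq-fe.h"

static int
prepare_all_participants(PGconn **conns, int nconns, const char *gid)
{
	char		cmd[256];
	bool	   *done;
	int			rc = 0;

	snprintf(cmd, sizeof(cmd), "PREPARE TRANSACTION '%s'", gid);

	done = calloc(nconns, sizeof(bool));
	if (done == NULL)
		return -1;

	/* Step 1: send the command everywhere, without waiting for replies. */
	for (int i = 0; i < nconns; i++)
	{
		if (!PQsendQuery(conns[i], cmd))
		{
			free(done);
			return -1;
		}
	}

	/* Step 2: poll sockets and drain replies as they become available. */
	for (;;)
	{
		fd_set		rfds;
		int			maxfd = -1;

		FD_ZERO(&rfds);

		for (int i = 0; i < nconns; i++)
		{
			if (done[i])
				continue;

			/* Absorb whatever data has arrived on this connection. */
			if (!PQconsumeInput(conns[i]))
			{
				rc = -1;
				done[i] = true;
				continue;
			}

			if (!PQisBusy(conns[i]))
			{
				PGresult   *res;

				/* The whole reply is in; read all results for the command. */
				while ((res = PQgetResult(conns[i])) != NULL)
				{
					if (PQresultStatus(res) != PGRES_COMMAND_OK)
						rc = -1;
					PQclear(res);
				}
				done[i] = true;
			}
			else
			{
				int			sock = PQsocket(conns[i]);

				FD_SET(sock, &rfds);
				if (sock > maxfd)
					maxfd = sock;
			}
		}

		if (maxfd < 0)
			break;				/* every connection has answered */

		if (select(maxfd + 1, &rfds, NULL, NULL, NULL) < 0)
		{
			rc = -1;
			break;
		}
	}

	free(done);
	return rc;
}

The same shape would repeat for COMMIT PREPARED; an FDW-level API would have to return something like the socket above as the "handle" to poll.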
Re: [PATCH] Finally split StdRdOptions into HeapOptions and ToastOptions
In a message of Monday, July 20, 2020, 18:36:44 MSK, Georgios Kokolatos wrote:

Hi! Sorry for the really long delay; I was on my summer vacation and then had urgent things to finish first. :-( Now I hope we can continue...

> thank you for the patch. It applies cleanly, compiles and passes check,
> check-world.

Thank you for your reviewing efforts.

> I feel as per the discussion, this is a step to the right direction yet it
> does not get far enough. From experience, I can confirm that dealing with
> reloptions in a new table AM is somewhat of a pain. Ultimately, reloptions
> should be handled by the table AM specific code. The current patch does not
> address the issue. Yet it does make the issue easier to address by clearing
> up the current state.

Moving reloptions to the AM code is the goal I am slowly moving towards. I started some time ago with a big patch https://commitfest.postgresql.org/14/992/ and was told to split it into smaller parts. So I did, and this patch is the last step that cleans up the options-related things; after that, the actual moving can be done.

> If you allow me, I have a couple of comments.
>
> -	saveFreeSpace = RelationGetTargetPageFreeSpace(relation,
> -												   HEAP_DEFAULT_FILLFACTOR);
> +	if (IsToastRelation(relation))
> +		saveFreeSpace = ToastGetTargetPageFreeSpace();
> +	else
> +		saveFreeSpace = HeapGetTargetPageFreeSpace(relation);
>
> For balance, it does make some sense for ToastGetTargetPageFreeSpace() to
> get relation as an argument, similarly to HeapGetTargetPageFreeSpace().

ToastGetTargetPageFreeSpace() returns a constant value. I've changed the code the way you suggested, so it now gets a relation argument that is not used. But I am not sure whether that is a good or a bad idea; we might get an "unused variable" warning on some compilers. I like consistency... but I'm not sure we need it here.

> -	/* Retrieve the parallel_workers reloption, or -1 if not set. */
> -	rel->rel_parallel_workers = RelationGetParallelWorkers(relation, -1);
> +	/*
> +	 * Retrieve the parallel_workers for heap and mat.view relations.
> +	 * Use -1 if not set, or if we are dealing with other relation kinds
> +	 */
> +	if (relation->rd_rel->relkind == RELKIND_RELATION ||
> +		relation->rd_rel->relkind == RELKIND_MATVIEW)
> +		rel->rel_parallel_workers = RelationGetParallelWorkers(relation, -1);
> +	else
> +		rel->rel_parallel_workers = -1;
>
> Also, this pattern is repeated in four places, maybe the branch can be
> moved inside a macro or static inline instead?
> If the comment above is agreed upon, then it makes a bit of sense to apply
> the same here. The expression in the branch is already asserted for in the
> macro, why not switch there and remove the responsibility from the caller?

I guess you are right here, because the logic is the following: for a heap relation, take the option from the options; for _all_ others, use -1. This can be moved to a macro. So I changed it to

/*
 * HeapGetParallelWorkers
 *		Returns the heap's parallel_workers reloption setting.
 *		Note multiple eval of argument!
 */
#define HeapGetParallelWorkers(relation, defaultpw) \
	(AssertMacro(relation->rd_rel->relkind == RELKIND_RELATION || \
				 relation->rd_rel->relkind == RELKIND_MATVIEW), \
	 (relation)->rd_options ? \
		((HeapOptions *) (relation)->rd_options)->parallel_workers : \
		(defaultpw))

/*
 * RelationGetParallelWorkers
 *		Returns the relation's parallel_workers reloption setting.
 *		Note multiple eval of argument!
 */
#define RelationGetParallelWorkers(relation, defaultpw) \
	(((relation)->rd_rel->relkind == RELKIND_RELATION || \
	  (relation)->rd_rel->relkind == RELKIND_MATVIEW) ? \
		HeapGetParallelWorkers(relation, defaultpw) : defaultpw)

But I would not like to move if (IsToastRelation(relation))
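As an aside, if the multiple-evaluation caveat ever becomes a problem, the same logic could be written as a static inline in the same header. The following is only a sketch of the reviewer's "static inline" suggestion, not code from the patch, and it assumes the HeapOptions struct shown above:

/*
 * Sketch only: same behaviour as the macros above, but as a static inline,
 * so the arguments are evaluated exactly once.
 */
static inline int
RelationGetParallelWorkers(Relation relation, int defaultpw)
{
	/* Only plain tables and matviews carry this heap reloption. */
	if (relation->rd_rel->relkind != RELKIND_RELATION &&
		relation->rd_rel->relkind != RELKIND_MATVIEW)
		return defaultpw;

	if (relation->rd_options == NULL)
		return defaultpw;

	return ((HeapOptions *) relation->rd_options)->parallel_workers;
}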
Re: Sometimes the output to the stdout in Windows disappears
I happened to try googling for other similar reports, and I found a very interesting recent thread here:

https://github.com/nodejs/node/issues/33166

It might not have the same underlying cause, of course, but it sure sounds familiar. If the Node.js folks are really seeing the same effect, that would point to an underlying Windows bug rather than anything Postgres is doing wrong.

It doesn't look like the Node.js crew got any closer to understanding the issue than we have, unfortunately. They made their problem mostly go away by reverting a seemingly-unrelated patch. But I can't help thinking that it's a timing-related bug, and that patch was just unlucky enough to change the timing of their tests so that they saw the failure frequently.

regards, tom lane
Re: Probable documentation errors or improvements
On Thu, Sep 10, 2020 at 12:20 PM Yaroslav wrote:
> Disclaimer: I'm not a native speaker, so not sure those are actually
> incorrect, and can't offer non-trivial suggestions.

I skimmed about two-thirds of these, and while I agree on the surface that probably three-quarters of them are improvements, they aren't clear enough wins to methodically go through and write up one or more patches.

There are a few areas that seem outdated and could use a good once-over in the sgml sources to clean up. For those I'd expect that I or someone else may go and write up actual patches.

This run-on presentation is off-putting. Even if actual patches are not forthcoming, I would at least suggest one email per topic, with the suggestions broken down into at least bugs (at the top) and non-bugs/improvements.

David J.
Gripes about walsender command processing
While trying to understand a recent buildfarm failure [1], I got annoyed by the fact that a walsender exposes its last SQL command in pg_stat_activity even when that was a long time ago and what it's been doing since then is replication commands. This leads to *totally* misleading reports; for instance, in [1] we read

2020-09-13 19:10:09.632 CEST [5f5e526d.242415:4] LOG:  server process (PID 2368853) was terminated by signal 11: Segmentation fault
2020-09-13 19:10:09.632 CEST [5f5e526d.242415:5] DETAIL:  Failed process was running: SELECT pg_catalog.set_config('search_path', '', false);

even though the process's own log messages paint a much better picture of reality:

2020-09-13 19:10:07.302 CEST [5f5e526f.242555:1] LOG:  connection received: host=[local]
2020-09-13 19:10:07.303 CEST [5f5e526f.242555:2] LOG:  replication connection authorized: user=bf application_name=sub2
2020-09-13 19:10:07.303 CEST [5f5e526f.242555:3] LOG:  statement: SELECT pg_catalog.set_config('search_path', '', false);
2020-09-13 19:10:07.334 CEST [5f5e526f.242555:4] LOG:  received replication command: IDENTIFY_SYSTEM
2020-09-13 19:10:07.335 CEST [5f5e526f.242555:5] LOG:  received replication command: START_REPLICATION SLOT "sub2" LOGICAL 0/0 (proto_version '2', publication_names '"pub2"')
2020-09-13 19:10:07.335 CEST [5f5e526f.242555:6] LOG:  starting logical decoding for slot "sub2"
2020-09-13 19:10:07.335 CEST [5f5e526f.242555:7] DETAIL:  Streaming transactions committing after 0/159EB38, reading WAL from 0/159EB00.
2020-09-13 19:10:07.335 CEST [5f5e526f.242555:8] LOG:  logical decoding found consistent point at 0/159EB00
2020-09-13 19:10:07.335 CEST [5f5e526f.242555:9] DETAIL:  There are no running transactions.

I think that walsenders ought to report their current replication command as though it were a SQL command. They oughta set debug_query_string, too.

While trying to fix this, I also observed that exec_replication_command fails to clean up the temp context it made for parsing the command string, if that turns out to be a SQL command. This very accidentally fails to lead to a long-term memory leak, because that context will be a child of MessageContext so it'll be cleaned out in the next iteration of PostgresMain's main loop. But it's still unacceptably sloppy coding IMHO, and it's closely related to a lot of other randomness in the function; it'd be better to have a separate early-exit path for SQL commands. Attached find a proposed fix for these things.

What I'd really like to do is adjust pgstat_report_activity so that callers are *required* to provide some kind of command string when transitioning into STATE_RUNNING state, ie something like

	Assert(state == STATE_RUNNING ? cmd_str != NULL : cmd_str == NULL);

However, before we could do that, we'd have to clean up the rat's nest of seemingly-inserted-with-the-aid-of-a-dartboard pgstat_report_activity calls in replication/logical/worker.c. I couldn't make heads or tails of what is going on there, so I did not try.

BTW, while I didn't change it here, isn't the second SnapBuildClearExportedSnapshot call in exec_replication_command just useless? We already did that before parsing the command.
regards, tom lane

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=idiacanthus&dt=2020-09-13%2016%3A54%3A03

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 3f756b470a..9a2b154797 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1545,6 +1545,9 @@ exec_replication_command(const char *cmd_string)
 
 	CHECK_FOR_INTERRUPTS();
 
+	/*
+	 * Parse the command.
+	 */
 	cmd_context = AllocSetContextCreate(CurrentMemoryContext,
 										"Replication command context",
 										ALLOCSET_DEFAULT_SIZES);
@@ -1560,15 +1563,38 @@ exec_replication_command(const char *cmd_string)
 
 	cmd_node = replication_parse_result;
 
+	/*
+	 * If it's a SQL command, just clean up our mess and return false; the
+	 * caller will take care of executing it.
+	 */
+	if (IsA(cmd_node, SQLCmd))
+	{
+		if (MyDatabaseId == InvalidOid)
+			ereport(ERROR,
+					(errmsg("cannot execute SQL commands in WAL sender for physical replication")));
+
+		MemoryContextSwitchTo(old_context);
+		MemoryContextDelete(cmd_context);
+
+		/* Tell the caller that this wasn't a WalSender command. */
+		return false;
+	}
+
+	/*
+	 * Report query to various monitoring facilities.  For this purpose, we
+	 * report replication commands just like SQL commands.
+	 */
+	debug_query_string = cmd_string;
+
+	pgstat_report_activity(STATE_RUNNING, cmd_string);
+
 	/*
 	 * Log replication command if log_replication_commands is enabled. Even
 	 * when it's disabled, log the command with DEBUG1 level for backward
-	 * compatibility. Note that SQL commands are not logged here, and will be
-	 * logged later if log_statement is enabled.
+	 * compatibility.
 	 */
-	if (cmd_node->type != T_SQLCmd)
-
Minor documentation error regarding streaming replication protocol
While implementing streaming replication client functionality for Npgsql, I stumbled upon a minor documentation error at https://www.postgresql.org/docs/current/protocol-replication.html

The "content" return value for the TIMELINE_HISTORY command is advertised as bytea while it is in fact raw ASCII bytes.

How did I find out? Since the value I get doesn't start with "\x" and contains ASCII text, although I have bytea_output set to hex, I first thought that the streaming replication protocol simply doesn't honor bytea_output, but then I realized that I also get unencoded tabs and newlines, which wouldn't be possible if the value were passed through byteaout.

This is certainly a minor problem, since the timeline history file only contains generated strings that are ASCII-only, so just using the unencoded bytes is actually easier than decoding bytea. OTOH it did cost me a few hours (writing a bytea decoder and figuring out why it doesn't work by looking at varlena.c and proving the docs wrong), so I want to point this out here since it is possibly unintended behavior, or at least a documentation error. Also I'm wary of taking a dependency on an undocumented implementation detail that could possibly change at any point.

Regards,
Brar
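For other client authors who run into this, a minimal libpq sketch of the behavior described above: fetch the TIMELINE_HISTORY result over a replication connection and use the content column as plain text rather than hex-decoding it. This is only an illustration (not Npgsql code); the connection string and timeline id are placeholders:

/*
 * Illustration only: read a timeline history file over the replication
 * protocol and treat "content" as raw text, as described above.
 */
#include <stdio.h>
#include "libpq-fe.h"

int
main(void)
{
	/* "replication=true" requests a physical replication connection. */
	PGconn	   *conn = PQconnectdb("host=localhost user=postgres replication=true");
	PGresult   *res;

	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
		return 1;
	}

	res = PQexec(conn, "TIMELINE_HISTORY 2");
	if (PQresultStatus(res) == PGRES_TUPLES_OK && PQntuples(res) == 1)
	{
		/* Column 0 is the file name, column 1 the file content. */
		printf("file: %s\n", PQgetvalue(res, 0, 0));

		/* The content arrives as plain ASCII text, not hex-encoded bytea. */
		printf("%s\n", PQgetvalue(res, 0, 1));
	}

	PQclear(res);
	PQfinish(conn);
	return 0;
}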
Re: Minor documentation error regarding streaming replication protocol
Brar Piening wrote:
> This is certainly a minor problem

After thinking about it a little more, I think that at least the fact that even the row description message advertises the content as bytea could be considered a bug, no?
Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Amit Kapila writes:
> Pushed.

Observe the following reports:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=idiacanthus&dt=2020-09-13%2016%3A54%3A03
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=desmoxytes&dt=2020-09-10%2009%3A08%3A03
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=komodoensis&dt=2020-09-05%2020%3A22%3A02
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2020-09-04%2001%3A52%3A03
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2020-09-03%2020%3A54%3A04

These are all on HEAD, and all within the last ten days, and I see nothing comparable in any branch before that. So it's hard to avoid the conclusion that somebody broke something about ten days ago.

None of these animals provided gdb backtraces; but we do have a built-in trace from several, and they all look like pgoutput.so is trying to list_free() garbage, somewhere inside a relcache invalidation/rebuild scenario:

TRAP: FailedAssertion("list->length > 0", File: "/home/bf/build/buildfarm-idiacanthus/HEAD/pgsql.build/../pgsql/src/backend/nodes/list.c", Line: 68)
postgres: publisher: walsender bf [local] idle(ExceptionalCondition+0x57)[0x9081f7]
postgres: publisher: walsender bf [local] idle[0x6bcc70]
postgres: publisher: walsender bf [local] idle(list_free+0x11)[0x6bdc01]
/home/bf/build/buildfarm-idiacanthus/HEAD/pgsql.build/tmp_install/home/bf/build/buildfarm-idiacanthus/HEAD/inst/lib/postgresql/pgoutput.so(+0x35d8)[0x7fa4c5a6f5d8]
postgres: publisher: walsender bf [local] idle(LocalExecuteInvalidationMessage+0x15b)[0x8f0cdb]
postgres: publisher: walsender bf [local] idle(ReceiveSharedInvalidMessages+0x4b)[0x7bca0b]
postgres: publisher: walsender bf [local] idle(LockRelationOid+0x56)[0x7c19e6]
postgres: publisher: walsender bf [local] idle(relation_open+0x1c)[0x4a2d0c]
postgres: publisher: walsender bf [local] idle(table_open+0x6)[0x524486]
postgres: publisher: walsender bf [local] idle[0x9017f2]
postgres: publisher: walsender bf [local] idle[0x8fabd4]
postgres: publisher: walsender bf [local] idle[0x8fa58a]
postgres: publisher: walsender bf [local] idle(RelationCacheInvalidateEntry+0xaf)[0x8fbdbf]
postgres: publisher: walsender bf [local] idle(LocalExecuteInvalidationMessage+0xec)[0x8f0c6c]
postgres: publisher: walsender bf [local] idle(ReceiveSharedInvalidMessages+0xcb)[0x7bca8b]
postgres: publisher: walsender bf [local] idle(LockRelationOid+0x56)[0x7c19e6]
postgres: publisher: walsender bf [local] idle(relation_open+0x1c)[0x4a2d0c]
postgres: publisher: walsender bf [local] idle(table_open+0x6)[0x524486]
postgres: publisher: walsender bf [local] idle[0x8ee8b0]

010_truncate.pl itself hasn't changed meaningfully in a good long time. However, I see that 464824323 added a whole boatload of code to pgoutput.c, and the timing is right for that commit to be the culprit, so that's what I'm betting on. Probably this requires a relcache inval at the wrong time; although we have recent passes from CLOBBER_CACHE_ALWAYS animals, so that can't be the whole triggering condition. I wonder whether it is relevant that all of the complaining animals are JIT-enabled.

regards, tom lane
Re: A micro-optimisation for walkdir()
On Tue, Sep 8, 2020 at 10:53 AM Tom Lane wrote:
> Thomas Munro writes:
> > FWIW I just spotted a couple of very suspicious looking failures for
> > build farm animal "walleye", a "MinGW64 8.1.0" system, that say:
>
> walleye's been kind of unstable since the get-go, so I wouldn't put
> too much faith in reports just from it.

CC'ing the animal owner.

I suspect that someone who knows about PostgreSQL on Windows would recognise the above symptom, but my guess is that the Windows "indexing" service is on, or an antivirus thing, or some other kind of automatically-open-and-sniff-every-file-on-certain-file-events thing. It looks like nothing of ours is even running at that moment ("waiting for server to shut down done"), and it's the RMDIR /s shell command that is reporting the error.

The other low-probability error seen on this host is this one:

+ERROR: could not stat file "pg_wal/00010007": Permission denied
Re: Gripes about walsender command processing
I wrote:
> I think that walsenders ought to report their current replication command
> as though it were a SQL command. They oughta set debug_query_string, too.

I also notice that a walsender does not maintain its ps status:

postgres  921109  100  0.0  32568 11880 ?  Rs   18:57   0:51 postgres: publisher: walsender postgres [local] idle

I don't mind if we decide we don't need to reflect the walsender's true activity, but let's not have it showing an outright lie.

regards, tom lane
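A sketch of what fixing that could look like, purely for illustration: set_ps_display() is the existing facility regular backends use, but exactly where such calls would go in walsender.c is an assumption here, not something from the above mail.

/*
 * Hypothetical helper: update the walsender's ps title around each
 * replication command, so "ps" reflects what it is actually doing.
 */
static void
WalSndSetPsActivity(const char *cmd_tag)
{
	set_ps_display(cmd_tag != NULL ? cmd_tag : "idle");
}

For example, WalSndSetPsActivity("START_REPLICATION") could be called before running a command and WalSndSetPsActivity(NULL) once it completes.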
Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
I wrote:
> Probably this requires a relcache inval at the wrong time;
> although we have recent passes from CLOBBER_CACHE_ALWAYS animals,
> so that can't be the whole triggering condition. I wonder whether
> it is relevant that all of the complaining animals are JIT-enabled.

Hmmm ... I take that back. hyrax has indeed passed since this went in, but *it doesn't run any TAP tests*. So the buildfarm offers no information about whether the replication tests work under CLOBBER_CACHE_ALWAYS.

Realizing that, I built an installation that way and tried to run the subscription tests. Results so far:

* Running 010_truncate.pl by itself passed for me. So there's still some unexplained factor needed to trigger the buildfarm failures. (I'm wondering about concurrent autovacuum activity now...)

* Starting over, it appears that 001_rep_changes.pl almost immediately gets into an infinite loop. It does not complete the third test step, rather infinitely waiting for progress to be made. The publisher log shows a repeating loop like

2020-09-13 21:16:05.734 EDT [928529] tap_sub LOG:  could not send data to client: Broken pipe
2020-09-13 21:16:05.734 EDT [928529] tap_sub CONTEXT:  slot "tap_sub", output plugin "pgoutput", in the commit callback, associated LSN 0/1660628
2020-09-13 21:16:05.843 EDT [928581] 001_rep_changes.pl LOG:  statement: SELECT pg_current_wal_lsn() <= replay_lsn AND state = 'streaming' FROM pg_catalog.pg_stat_replication WHERE application_name = 'tap_sub';
2020-09-13 21:16:05.861 EDT [928582] tap_sub LOG:  statement: SELECT pg_catalog.set_config('search_path', '', false);
2020-09-13 21:16:05.929 EDT [928582] tap_sub LOG:  received replication command: IDENTIFY_SYSTEM
2020-09-13 21:16:05.930 EDT [928582] tap_sub LOG:  received replication command: START_REPLICATION SLOT "tap_sub" LOGICAL 0/1652820 (proto_version '2', publication_names '"tap_pub","tap_pub_ins_only"')
2020-09-13 21:16:05.930 EDT [928582] tap_sub LOG:  starting logical decoding for slot "tap_sub"
2020-09-13 21:16:05.930 EDT [928582] tap_sub DETAIL:  Streaming transactions committing after 0/1652820, reading WAL from 0/1651B20.
2020-09-13 21:16:05.930 EDT [928582] tap_sub LOG:  logical decoding found consistent point at 0/1651B20
2020-09-13 21:16:05.930 EDT [928582] tap_sub DETAIL:  There are no running transactions.
2020-09-13 21:16:21.560 EDT [928600] 001_rep_changes.pl LOG:  statement: SELECT pg_current_wal_lsn() <= replay_lsn AND state = 'streaming' FROM pg_catalog.pg_stat_replication WHERE application_name = 'tap_sub';
2020-09-13 21:16:37.291 EDT [928610] 001_rep_changes.pl LOG:  statement: SELECT pg_current_wal_lsn() <= replay_lsn AND state = 'streaming' FROM pg_catalog.pg_stat_replication WHERE application_name = 'tap_sub';
2020-09-13 21:16:52.959 EDT [928627] 001_rep_changes.pl LOG:  statement: SELECT pg_current_wal_lsn() <= replay_lsn AND state = 'streaming' FROM pg_catalog.pg_stat_replication WHERE application_name = 'tap_sub';
2020-09-13 21:17:06.866 EDT [928636] tap_sub LOG:  statement: SELECT pg_catalog.set_config('search_path', '', false);
2020-09-13 21:17:06.934 EDT [928636] tap_sub LOG:  received replication command: IDENTIFY_SYSTEM
2020-09-13 21:17:06.934 EDT [928636] tap_sub LOG:  received replication command: START_REPLICATION SLOT "tap_sub" LOGICAL 0/1652820 (proto_version '2', publication_names '"tap_pub","tap_pub_ins_only"')
2020-09-13 21:17:06.934 EDT [928636] tap_sub ERROR:  replication slot "tap_sub" is active for PID 928582
2020-09-13 21:17:07.811 EDT [928638] tap_sub LOG:  statement: SELECT pg_catalog.set_config('search_path', '', false);
2020-09-13 21:17:07.880 EDT [928638] tap_sub LOG:  received replication command: IDENTIFY_SYSTEM
2020-09-13 21:17:07.881 EDT [928638] tap_sub LOG:  received replication command: START_REPLICATION SLOT "tap_sub" LOGICAL 0/1652820 (proto_version '2', publication_names '"tap_pub","tap_pub_ins_only"')
2020-09-13 21:17:07.881 EDT [928638] tap_sub ERROR:  replication slot "tap_sub" is active for PID 928582
2020-09-13 21:17:08.618 EDT [928641] 001_rep_changes.pl LOG:  statement: SELECT pg_current_wal_lsn() <= replay_lsn AND state = 'streaming' FROM pg_catalog.pg_stat_replication WHERE application_name = 'tap_sub';
2020-09-13 21:17:08.753 EDT [928642] tap_sub LOG:  statement: SELECT pg_catalog.set_config('search_path', '', false);
2020-09-13 21:17:08.821 EDT [928642] tap_sub LOG:  received replication command: IDENTIFY_SYSTEM
2020-09-13 21:17:08.821 EDT [928642] tap_sub LOG:  received replication command: START_REPLICATION SLOT "tap_sub" LOGICAL 0/1652820 (proto_version '2', publication_names '"tap_pub","tap_pub_ins_only"')
2020-09-13 21:17:08.821 EDT [928642] tap_sub ERROR:  replication slot "tap_sub" is active for PID 928582
2020-09-13 21:17:09.689 EDT [928645] tap_sub LOG:  statement: SELECT pg_catalog.set_config('search_path', '', false);
2020-09-13 21:17:09.756 EDT [928645] tap_sub LOG:  received replication command: IDENTIFY_SYSTEM
2020-0
Re: Avoid incorrect allocation in buildIndexArray
On Sat, Sep 12, 2020 at 12:40:49PM +0200, Julien Rouhaud wrote:
> agreed.

Ok, done as ac673a1 then.
--
Michael
typo in snapmgr.c and procarray.c
Hi,

Attached fixes $subject.

Thanks,
Naoki Nakamichi

diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 1c0cd6b248..672d2083cc 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -4106,7 +4106,7 @@ GlobalVisCheckRemovableXid(Relation rel, TransactionId xid)
  *
  * Be very careful about when to use this function. It can only safely be used
  * when there is a guarantee that xid is within MaxTransactionId / 2 xids of
- * rel. That e.g. can be guaranteed if the the caller assures a snapshot is
+ * rel. That e.g. can be guaranteed if the caller assures a snapshot is
  * held by the backend and xid is from a table (where vacuum/freezing ensures
  * the xid has to be within that range), or if xid is from the procarray and
  * prevents xid wraparound that way.
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index 22cf3ebaf4..ed7f5239a0 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -1855,7 +1855,7 @@ TransactionIdLimitedForOldSnapshots(TransactionId recentXmin,
 	if (ts == threshold_timestamp)
 	{
 		/*
-		 * Current timestamp is in same bucket as the the last limit that
+		 * Current timestamp is in same bucket as the last limit that
 		 * was applied.  Reuse.
 		 */
 		xlimit = threshold_xid;
Re: typo in snapmgr.c and procarray.c
On 2020/09/14 11:06, btnakamichin wrote:
> Hi,
>
> Attached fixes $subject.

Thanks for the patch! LGTM. I will commit it.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
On Mon, Sep 14, 2020 at 3:08 AM Tom Lane wrote:
>
> Amit Kapila writes:
> > Pushed.
>
> Observe the following reports:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=idiacanthus&dt=2020-09-13%2016%3A54%3A03
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=desmoxytes&dt=2020-09-10%2009%3A08%3A03
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=komodoensis&dt=2020-09-05%2020%3A22%3A02
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2020-09-04%2001%3A52%3A03
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2020-09-03%2020%3A54%3A04
>
> These are all on HEAD, and all within the last ten days, and I see
> nothing comparable in any branch before that. So it's hard to avoid
> the conclusion that somebody broke something about ten days ago.

I'll analyze these reports.

--
With Regards,
Amit Kapila.
Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
I wrote: > * Starting over, it appears that 001_rep_changes.pl almost immediately > gets into an infinite loop. It does not complete the third test step, > rather infinitely waiting for progress to be made. Ah, looking closer, the problem is that wal_receiver_timeout = 60s is too short when the sender is using CCA. It times out before we can get through the needed data transmission. regards, tom lane
Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
On Mon, Sep 14, 2020 at 3:08 AM Tom Lane wrote:
>
> Amit Kapila writes:
> > Pushed.
>
> Observe the following reports:
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=idiacanthus&dt=2020-09-13%2016%3A54%3A03
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=desmoxytes&dt=2020-09-10%2009%3A08%3A03
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=komodoensis&dt=2020-09-05%2020%3A22%3A02
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2020-09-04%2001%3A52%3A03
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2020-09-03%2020%3A54%3A04
>
> These are all on HEAD, and all within the last ten days, and I see
> nothing comparable in any branch before that. So it's hard to avoid
> the conclusion that somebody broke something about ten days ago.
>
> None of these animals provided gdb backtraces; but we do have a built-in
> trace from several, and they all look like pgoutput.so is trying to
> list_free() garbage, somewhere inside a relcache invalidation/rebuild
> scenario:

Yeah, this is right, and here is some initial analysis. It seems to be failing in the below code:

rel_sync_cache_relation_cb()
{
	...
	list_free(entry->streamed_txns);
	...
}

This list can have elements only in 'streaming' mode (which needs to be enabled with the 'streaming' option of CREATE SUBSCRIPTION), whereas none of the tests in 010_truncate.pl uses 'streaming', so this list should be empty (NULL). The two different assertion failures shown in the BF reports in the list_free code are as below:

Assert(list->length > 0);
Assert(list->length <= list->max_length);

It seems to me that this list is not initialized properly when it is not used, or maybe that is only true in some special circumstances, because we initialize it in get_rel_sync_entry(). I am not sure if the CCA build is impacting this in some way.

--
With Regards,
Amit Kapila.
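To illustrate the kind of initialization being talked about, here is a sketch (not the actual code and not a proposed fix) of what a fully initialized new entry in get_rel_sync_entry() would look like; the field names are assumptions taken from pgoutput.c after the streaming commit, and the surrounding code is abbreviated:

	entry = (RelationSyncEntry *) hash_search(RelationSyncCache,
											  (void *) &relid,
											  HASH_ENTER, &found);
	if (!found)
	{
		/*
		 * A brand-new hash entry: every field must get a sane starting
		 * value, including the list that rel_sync_cache_relation_cb()
		 * later hands to list_free().
		 */
		entry->schema_sent = false;
		entry->streamed_txns = NIL;
		entry->replicate_valid = false;
		entry->pubactions.pubinsert = false;
		entry->pubactions.pubupdate = false;
		entry->pubactions.pubdelete = false;
		entry->pubactions.pubtruncate = false;
		entry->publish_as_relid = InvalidOid;
		entry->map = NULL;
	}

If any field is skipped here, the invalidation callback can see whatever garbage dynahash left in the entry, which would match the two assertion failures above.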
Re: ModifyTable overheads in generic plans
Hello,

On Fri, Aug 7, 2020 at 9:26 PM Amit Langote wrote:
> On Tue, Aug 4, 2020 at 3:15 PM Amit Langote wrote:
> > On Sat, Aug 1, 2020 at 4:46 AM Robert Haas wrote:
> > > On Fri, Jun 26, 2020 at 8:36 AM Amit Langote wrote:
> > > > 0001 and 0002 are preparatory patches.
> > >
> > > I read through these patches a bit but it's really unclear what the
> > > point of them is. I think they need better commit messages, or better
> > > comments, or both.
> >
> > Thanks for taking a look. Sorry about the lack of good commentary,
> > which I have tried to address in the attached updated version. I
> > extracted one more part as preparatory from the earlier 0003 patch, so
> > there are 4 patches now.
> >
> > Also as discussed with Daniel, I have changed the patches so that they
> > can be applied on plain HEAD instead of having to first apply the
> > patches at [1]. Without runtime pruning for UPDATE/DELETE proposed in
> > [1], optimizing ResultRelInfo creation by itself does not improve the
> > performance/scalability by that much, but the benefit of lazily
> > creating ResultRelInfos seems clear so I think maybe it's okay to
> > pursue this independently.
>
> Per cfbot's automatic patch tester, there were some issues in the 0004 patch:
>
> nodeModifyTable.c: In function ‘ExecModifyTable’:
> 1529nodeModifyTable.c:2484:24: error: ‘junkfilter’ may be used
> uninitialized in this function [-Werror=maybe-uninitialized]
> 1530 junkfilter->jf_junkAttNo,
> 1531^
> 1532nodeModifyTable.c:2309:14: note: ‘junkfilter’ was declared here
> 1533 JunkFilter *junkfilter;
> 1534 ^
> 1535cc1: all warnings being treated as errors
> 1536: recipe for target 'nodeModifyTable.o' failed
> 1537make[3]: *** [nodeModifyTable.o] Error 1
>
> Fixed in the attached updated version

Needed a rebase due to f481d28232. Attached updated patches.

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

v5-0004-Initialize-result-relation-information-lazily.patch
Description: Binary data

v5-0001-Revise-how-FDWs-obtain-result-relation-informatio.patch
Description: Binary data

v5-0002-Don-t-make-root-ResultRelInfo-for-insert-queries.patch
Description: Binary data

v5-0003-Revise-child-to-root-tuple-conversion-map-managem.patch
Description: Binary data
Re: Print logical WAL message content
On Fri, 11 Sep 2020 at 04:08, Alvaro Herrera wrote: > > Pushed, thanks! > Thanks Alvaro. -- Best Wishes, Ashutosh
Re: Making index_set_state_flags() transactional
On Fri, Sep 11, 2020 at 06:42:09PM +0300, Anastasia Lubennikova wrote:
> I agree that a transactional update in index_set_state_flags() is good for
> both cases that you mentioned, and I don't see any restrictions. It seems
> that not doing this before was just a loose end, as the comment from the
> 813fb03 patch clearly stated that it is safe to make this update
> transactional.
>
> The patch itself looks clear and committable.

Thanks for double-checking, Anastasia. Committed.
--
Michael
Re: typo in snapmgr.c and procarray.c
On 2020/09/14 11:14, Fujii Masao wrote:
> On 2020/09/14 11:06, btnakamichin wrote:
>> Hi,
>>
>> Attached fixes $subject.
>
> Thanks for the patch! LGTM. I will commit it.

Pushed. Thanks!

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
Re: Optimising compactify_tuples()
On Fri, 11 Sep 2020 at 17:48, Thomas Munro wrote:
>
> On Fri, Sep 11, 2020 at 3:53 AM David Rowley wrote:
> > That gets my benchmark down to 60.8 seconds, so 2.2 seconds better than v4b.
>
> . o O ( I wonder if there are opportunities to squeeze some more out
> of this with __builtin_prefetch... )

I'd be tempted to go down that route if we had macros already defined for that, but it looks like we don't.

> > I've attached v6b and an updated chart showing the results of the 10
> > runs I did of it.
>
> One failure seen like this while running check-world (cfbot):
>
> #2 0x0091f93f in ExceptionalCondition
> (conditionName=conditionName@entry=0x987284 "nitems > 0",
> errorType=errorType@entry=0x97531d "FailedAssertion",
> fileName=fileName@entry=0xa9df0d "bufpage.c",
> lineNumber=lineNumber@entry=442) at assert.c:67

Thanks. I neglected to check that the other call site properly checked for nitems > 0. It looks like PageIndexMultiDelete() relied on compactify_tuples() to set pd_upper to pd_special when nitems == 0. That's not what PageRepairFragmentation() did, so I've now aligned the two so they work the same way.

I've attached the patches in git format-patch format. I'm proposing to commit these in about 48 hours time unless there's some sort of objection before then.

Thanks for reviewing this.

David

v8-0001-Optimize-compactify_tuples-function.patch
Description: Binary data

v8-0002-Report-resource-usage-at-the-end-of-recovery.patch
Description: Binary data
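On the __builtin_prefetch musing, a sketch of the kind of wrapper that could be experimented with: pg_prefetch_read is a made-up name and nothing like it exists in the tree; GCC and Clang provide __builtin_prefetch, other compilers would get a no-op, and whether it helps compactify_tuples() at all would need benchmarking. The copy loop below only mirrors the shape of the tuple-copy loop, it is not the committed code:

#include <string.h>

/* Hypothetical wrapper; not something that exists in the tree today. */
#if defined(__GNUC__) || defined(__clang__)
#define pg_prefetch_read(addr)	__builtin_prefetch((addr), 0, 3)
#else
#define pg_prefetch_read(addr)	((void) 0)
#endif

/* Describes one chunk to copy: analogous to an (itemoff, alignedlen) pair. */
typedef struct
{
	unsigned	offset;
	unsigned	length;
} chunk_desc;

/*
 * Copy a set of variable-length chunks, hinting the next source chunk while
 * copying the current one so it is (hopefully) already in cache when we get
 * to it.
 */
static void
copy_chunks(char *dst, const char *src, const chunk_desc *chunks, int nchunks)
{
	for (int i = 0; i < nchunks; i++)
	{
		if (i + 1 < nchunks)
			pg_prefetch_read(src + chunks[i + 1].offset);

		memcpy(dst + chunks[i].offset,
			   src + chunks[i].offset,
			   chunks[i].length);
	}
}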
Re: problem with RETURNING and update row movement
Hi Alvaro,

On Sat, Sep 12, 2020 at 5:42 AM Alvaro Herrera wrote:
>
> I noticed that this bugfix has stalled, probably because the other
> bugfix has also stalled.
>
> It seems that cleanly returning system columns from table AM is not
> going to be a simple task -- whatever fix we get for that is likely not
> going to make it all the way to PG 12. So I suggest that we should fix
> this bug in 11-13 without depending on that.

Although I would be reversing course on what I said upthread, I tend to agree with that, because the core idea behind the fix for this issue does not seem likely to be invalidated by any conclusion regarding the other issue. That is, as far as the issue here is concerned, we can't avoid falling back to using the source partition's RETURNING projection, whose scan tuple is provided using the source partition's tuple slot. However, given that we have to pass the *new* tuple to the projection, not the old one, we will need a "clean" way to transfer its (the new tuple's) system columns into the source partition's tuple slot. The sketch I gave upthread of what that could look like was not much liked by Fujita-san.

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com