Re: pg_basebackup, walreceiver and wal_sender_timeout
On Mon, Jan 28, 2019, at 10:25, Michael Paquier wrote:
> On Mon, Jan 28, 2019 at 09:05:26AM +0100, Magnus Hagander wrote:
> > And for plain format, we'd do the same -- sync after each file segment, and
> > then a final one of the directory when done, right?
>
> Well, the code is doing a double amount of work in its current shape
> as we call fsync_pgdata() for the plain format, which cascades to
> pg_wal and all its files, so it seems to me that there is little point
> in issuing a sync when each segment is finished streaming if that's
> what you mean.

Agreed.

While reading the doc page for pg_basebackup, I was confused by the fact that it says WAL files will be written to tarballs (either base.tar or pg_wal.tar) when pg_basebackup is instructed to stream WAL alongside the backup itself. I think it makes sense to clarify that this only happens when the tar format is specified (doc patch attached).

Cheers,
Oleksii

diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 57dc83b620..c4f3950e5b 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -306,8 +306,8 @@ PostgreSQL documentation
         backup will fail and be unusable.
-        The write-ahead log files will be written to
-        the base.tar file.
+        When tar format mode is used, the write-ahead log files will be
+        written to the base.tar file.
@@ -326,9 +326,10 @@ PostgreSQL documentation
         requires no extra write-ahead logs to be saved on the master.
-        The write-ahead log files are written to a separate file
-        named pg_wal.tar (if the server is a version
-        earlier than 10, the file will be named pg_xlog.tar).
+        When tar format mode is used, the write-ahead log files will be
+        written to a separate file named pg_wal.tar
+        (if the server is a version earlier than 10, the file will be named
+        pg_xlog.tar).
         This value is the default.
'tuple concurrently updated' error w/o visible catalog updates
Hello,

Earlier this week we split our Postgres 9.6.8 shards, each having two databases, into a one-database-per-shard setup. This was done by promoting replicas and subsequently removing the unused databases. Immediately afterwards we discovered repeated 'tuple concurrently updated' errors on most of those new shards. The error always appears at the same statement:

ERROR,XX000,"tuple concurrently updated","SQL statement ""UPDATE config_content SET cc_content = l_config_content WHERE cc_config_content_id = l_ccm_content_id""

By searching the archives (i.e. https://www.postgresql.org/messageid/flat/CAB7nPqSZCkVfibTvx9TYmHYhVtV_vOMNwOpLHnRU85qeiimUaQ%40mail.gmail.com#cab7npqszckvfibtvx9tymhyhvtv_vomnwoplhnru85qeiim...@mail.gmail.com) I got the impression that this error manifests itself when system catalog tuples are updated concurrently; however, I see none of that in the query that leads to the error. There are no triggers on the 'config_content' table, nor are there any views referring to it.

The errors stopped when we disabled a call to the 'upsert_foo_content' function (here and below I obfuscated the real names). This is a fairly simple pl/pgsql function that does a few selects and an upsert. The block inside that function that contains the statement at fault is:

--
SELECT ccm_content_id, ccm_simple_update_received_at
  INTO l_ccm_content_id, l_ccm_simple_update_received_at
  FROM config_content_metadata
 WHERE ccm_config_id = l_c_id
   AND ccm_sales_channel_id = l_sales_channel_id;

IF (l_ccm_content_id IS NULL) THEN
    -- insert config content
    INSERT INTO config_content_metadata(ccm_config_id, ccm_sales_channel_id,
                                        ccm_update_caused_by, ccm_simple_update_eid,
                                        ccm_simple_update_received_at)
    VALUES (l_c_id, l_sales_channel_id, l_rp_id, l_content_update_eid, l_content_update_received_at)
    RETURNING ccm_content_id INTO l_ccm_content_id;

    INSERT INTO config_content(cc_config_content_id, cc_content)
    VALUES (l_ccm_content_id, l_config_content);
ELSIF (l_ccm_simple_update_received_at < l_content_update_received_at) THEN
    UPDATE config_content_metadata
       SET ccm_update_caused_by = l_rp_id,
           ccm_simple_update_eid = l_content_update_eid,
           ccm_simple_update_received_at = l_content_update_received_at,
           ccm_updated_at = now()
     WHERE ccm_content_id = l_ccm_content_id;

    -- XXX problematic statement XXX
    UPDATE config_content
       SET cc_content = l_config_content
     WHERE cc_config_content_id = l_ccm_content_id;
END IF;
--

Note that config_content references config_content_metadata with a foreign key; however, the referenced column is not updated.

That 'upsert_foo_content' is called by another function, 'upsert_foo_content_batch', in a loop over the elements of a JSON array, something like:

--
CREATE OR REPLACE FUNCTION upsert_foo_content_batch(p_batch jsonb)
RETURNS void
LANGUAGE plpgsql
AS $function$
DECLARE
    ...
BEGIN
    FOR item IN SELECT * FROM jsonb_array_elements(p_batch)
    LOOP
        -- some unpacking of fields from json into the local variables
        PERFORM upsert_foo_content(..);  -- called with the unpacked variables
    END LOOP;
END;
$function$;
--

'upsert_foo_content_batch' is called, in turn, at the end of a long pl/pgsql function 'upsert_foo_event_batch', which consists of a very long CTE that extracts individual fields from a JSON argument and then performs a number of inserts into some tables, doing ON CONFLICT DO NOTHING, afterwards performing more inserts into the tables that reference the previous ones, doing ON CONFLICT DO UPDATE. However, it modifies neither the 'config_content' nor the 'config_content_metadata' table.
So the chain of calls is 'upsert_foo_event_batch' -> 'upsert_foo_content_batch' -> 'upsert_foo_content' (the last one contains the statement that leads to the "tuple concurrently updated" error).

It is possible that 'upsert_foo_content' is called with the same data multiple times in different processes; however, I'd expect it either to complete successfully or to throw an error because the PK already exists (this is running in read committed mode, so ISTM it is not immune to the case where the row in the metadata table is inserted after another session does the check, but before the insert), but not the error mentioned at the beginning of this message.

Are there any signs in this description that the queries might be doing something unexpected to PostgreSQL, or that something went wrong during the split? I am running out of ideas about what could cause the issue, so any pointers or help in debugging it are appreciated (note that this is a production database, I cannot just stop it at will).

Cheers,
Oleksii
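PS: to illustrate the kind of failure I would have expected instead, here is a minimal sketch (a toy table, not our real schema) of two sessions racing through the INSERT branch under read committed; the loser blocks on the first session's in-progress insert and then fails with a duplicate-key error once the first session commits, rather than with "tuple concurrently updated":

--
CREATE TABLE metadata_demo (
    id      bigint PRIMARY KEY,
    payload text
);

-- session A:
BEGIN;
SELECT id FROM metadata_demo WHERE id = 1;   -- no row, take the INSERT branch
INSERT INTO metadata_demo VALUES (1, 'a');

-- session B, before A commits:
BEGIN;
SELECT id FROM metadata_demo WHERE id = 1;   -- still no row (A has not committed yet)
INSERT INTO metadata_demo VALUES (1, 'b');   -- blocks on A's in-progress insert

-- session A:
COMMIT;

-- session B now fails with:
--   ERROR: duplicate key value violates unique constraint "metadata_demo_pkey"
--

The UPDATE branch, by contrast, should simply make the second session wait for the row lock and proceed on the committed version of the row, which is why an error that normally indicates concurrent updates of catalog tuples surprises me here.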
Re: GSoC 2018
Hello,

On Fri, Dec 15, 2017, at 14:30, Stephen Frost wrote:
> Aleksander,
>
> * Aleksander Alekseev (a.aleks...@postgrespro.ru) wrote:
>
> > Regarding the difficulty of the project - in fact it's not that
> > difficult. Particularly this project can rely on external tools, e.g.
> > use Consul for service discovery and leader election based on
> > leader-lease approach (implementation [1]). Having this the only thing
> > is missing is automatic replica promoting and (optionally)
> > re-configuring of HAProxy / pgbouncer / whatever. Yes, and lots of
> > Jepsen-like test of course. I believe it's not such a complicated
> > project.

Does it make sense to address the limitations of logical replication first, i.e. the inability to replicate DDL, sequences and so on?

> What you're talking about is rebuilding Patroni, but adding more into it
> than even Patroni tries to do, building it on Logical Replication
> instead of physical replication, and calling it simple and something
> that could get into core PostgreSQL over a 12 week GSoC project. I've
> certainly got doubts about that, even if we decide that it'd be an
> external-to-PG project (like Patroni).
>
> What might be interesting is seeing if Logical Replication could be
> added to Patroni as an option and then building on that.. Having
> someone involved in the Patroni project would be the right way to go
> about proposing that though to see what they think of it. That would
> also be much more sensible as a GSoC project, since it'd be an addition
> to an existing project and not more-or-less starting a whole new
> project.

Right now logical replication and physical replication-based HA tools don't work together nicely, since the logical replication position is not propagated to the promoted replica (I think Craig Ringer has been tackling this issue for a few releases already; the latest set of patches I could find is https://commitfest.postgresql.org/15/788/). Perhaps there is an opportunity for a GSoC student to help fix it. Until then we cannot use logical replication for HA, and even doing something simpler, like automating the creation of logical replicas in Patroni, makes little sense, as they are doomed to be reinitialized on every failover.

--
Sincerely,
Alex
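PS: for anyone unfamiliar with the limitation mentioned above, here is a minimal sketch (hypothetical table name and connection string) of a PostgreSQL 10+ logical replication setup; schema changes are not carried over by it, which is exactly the DDL limitation in question:

--
-- on the publisher:
CREATE PUBLICATION app_pub FOR TABLE app_table;

-- on the subscriber (the table must already exist there with a matching schema):
CREATE SUBSCRIPTION app_sub
    CONNECTION 'host=publisher.example.com dbname=appdb user=replicator'
    PUBLICATION app_pub;

-- this DDL is not replicated; it has to be applied on the subscriber manually,
-- otherwise apply of new rows on the subscriber will start failing:
ALTER TABLE app_table ADD COLUMN note text;
--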
Re: GSoC 2018
On Fri, Dec 15, 2017, at 14:52, Aleksander Alekseev wrote:
> Completely agree, this project can be an improvement for Stolon (or
> Patroni, but I personally never tested or used it, also I got a feeling
> that Google guys will prefer a project that is written in Go). This
> would make much more sense.

I don't believe Google will reject a project based on the fact that it is written in Python (in fact, the Python Software Foundation has successfully participated in GSoC for many years). Based on the GitHub statistics, Patroni started earlier and has more contributors than Stolon (including those who contributed more than one patch/pull request).

--
Sincerely,
Alex