Re: Taking into account syncrep position in flush_lsn reported by apply worker

2024-08-14 Thread Arseny Sher
On 8/13/24 06:35, Amit Kapila wrote: On Mon, Aug 12, 2024 at 3:43 PM Arseny Sher wrote: Sorry for the poor formatting of the message above, this should be better: Hey. Currently synchronous_commit is disabled for logical apply worker on the ground that reported flush_lsn includes only

Re: Taking into account syncrep position in flush_lsn reported by apply worker

2024-08-12 Thread Arseny Sher
Sorry for the poor formatting of the message above, this should be better: Hey. Currently synchronous_commit is disabled for logical apply worker on the ground that reported flush_lsn includes only locally flushed data so slot (publisher) preserves everything higher than this, and so in case of s

Taking into account syncrep position in flush_lsn reported by apply worker

2024-08-12 Thread Arseny Sher
Hey. Currently synchronous_commit is by default disabled for logical apply worker on the ground that reported flush_lsn includes only locally flushed data so slot (publisher) preserves everything higher than this, and so in case of subscriber restart no data is lost. However, imagine that subsc

Re: Flaky vacuum truncate test in reloptions.sql

2021-04-04 Thread Arseny Sher
On Fri, Apr 2, 2021 at 9:46 AM Michael Paquier wrote: > Okay, applied and back-patched down to 12 then. Thank you both. Unfortunately and surprisingly, the test still fails (perhaps even rarer, once in several hundred runs) under multimaster. After scratching the head for some more time, it see

Re: Flaky vacuum truncate test in reloptions.sql

2021-04-01 Thread Arseny Sher
Michael Paquier writes: > On Thu, Apr 01, 2021 at 12:52:21PM +0900, Masahiko Sawada wrote: >> Just to be clear the context, I’m mentioning the following test case: Sorry, I misremembered the test and assumed the table is non-empty there while it is empty but vacuum_truncate is disabled. Still,

Re: Flaky vacuum truncate test in reloptions.sql

2021-03-31 Thread Arseny Sher
Arseny Sher writes: > as currently the chance of its failure is close to 1. A typo, to 0 too, of course.

Re: Flaky vacuum truncate test in reloptions.sql

2021-03-31 Thread Arseny Sher
Masahiko Sawada writes: >> I don't think this matters much, as it tests the contrary and the >> probability of >> successful test passing (in case of theoretical bug making vacuum to >> truncate >> non-empty relation) becomes stunningly small. But adding it wouldn't hurt >> either. > > I was co

Re: Flaky vacuum truncate test in reloptions.sql

2021-03-31 Thread Arseny Sher
On 3/31/21 4:17 PM, Masahiko Sawada wrote: > Is it better to add FREEZE to the first "VACUUM reloptions_test;" as well? I don't think this matters much, as it tests the contrary and the probability of successful test passing (in case of theoretical bug making vacuum to truncate non-empty

Re: Flaky vacuum truncate test in reloptions.sql

2021-03-30 Thread Arseny Sher
On 3/30/21 10:12 AM, Michael Paquier wrote: > Yep, this is the same problem as the one discussed for c2dc1a7, where > a concurrent checkpoint may cause a page to be skipped, breaking the > test. Indeed, Alexander Lakhin pointed me to that commit after I wrote the message. > Why not just using

Flaky vacuum truncate test in reloptions.sql

2021-03-29 Thread Arseny Sher
've ever seen this only when running regression tests under our multimaster. While multimaster contains a fair amount of C code, I don't see how any of it can interfere with the vacuuming business here. I can't say I did my best to create the reproduction though -- the explanation above seems to

Enlarge IOS vm cache

2021-03-09 Thread Arseny Sher
struct IndexOnlyScanState Relation ioss_RelationDesc; struct IndexScanDescData *ioss_ScanDesc; TupleTableSlot *ioss_TableSlot; - Buffer ioss_VMBuffer; + Buffer ioss_VMBuffer[VMBUF_SIZE]; Size ioss_PscanLen; } IndexOnlyScanState; -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

Use-after-free in 12- EventTriggerAlterTableEnd

2020-10-27 Thread Arseny Sher
ndList, currentEventTriggerState->currentCommand); + + MemoryContextSwitchTo(oldcxt); } else pfree(currentEventTriggerState->currentCommand); -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

Re: Parallel query hangs after a smart shutdown is issued

2020-08-14 Thread Arseny Sher
Tom Lane writes: > Thomas Munro writes: >> On Fri, Aug 14, 2020 at 4:45 AM Tom Lane wrote: >>> After some more rethinking and testing, here's a v5 that feels >>> fairly final to me. I realized that the logic in canAcceptConnections >>> was kind of backwards: it's better to check the main pmS

Re: logical copy_replication_slot issues

2020-03-09 Thread Arseny Sher
Masahiko Sawada writes: > /* > -* Create logical decoding context, to build the initial snapshot. > +* Create logical decoding context to find start point or, if we don't > +* need it, to 1) bump slot's restart_lsn and xmin 2) check plugin sanity. > */ > > Do we need to num

Re: logical copy_replication_slot issues

2020-03-06 Thread Arseny Sher
I wrote: > It looks good to me now. After lying for some time in my head it reminded me that CreateInitDecodingContext not only pegs the LSN, but also xmin, so attached makes a minor comment correction. While taking a look at the nearby code it seemed weird to me that GetOldestSafeDecodingTransa

Re: logical copy_replication_slot issues

2020-03-04 Thread Arseny Sher
Masahiko Sawada writes: > I've attached the updated version patch that incorporated your > comments. I believe we're going in the right direction for fixing this > bug. I'll register this item to the next commit fest so as not to > forget. I've moved confirmed_flush check to the second lookup o

Re: logical copy_replication_slot issues

2020-02-10 Thread Arseny Sher
Masahiko Sawada writes: > I've attached the draft patch fixing this issue but I'll continue > investigating it more deeply. There also should be a check that source slot itself has consistent snapshot (valid confirmed_flush) -- otherwise it might be possible to create not initialized slot whic

logical copy_replication_slot issues

2020-02-09 Thread Arseny Sher
it as well. Was this considered? [1] https://www.postgresql.org/message-id/flat/AB5978B2-1772-4FEE-A245-74C91704ECB0%40amazon.com -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

Re: Too rigorous assert in reorderbuffer.c

2019-12-19 Thread Arseny Sher
Arseny Sher writes: > I'm sorry to bother you with this again, but due to new test our > internal buildfarm revealed that adjacent assert on cmin is also a lie. You > see, we can't assume cmin is stable because the same key (relnode, tid) > might refer to completely differe

Re: (Re)building index using itself or another index of the same table

2019-09-16 Thread Arseny Sher
> > It's possible the columns referenced by the index expression are not > changing, but some additional columns are updated. Yeah. Also table can be CLUSTERed without VACUUM FULL. -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company >From

(Re)building index using itself or another index of the same table

2019-09-12 Thread Arseny Sher
ossiblity of triggering "could not read block" error with plain SQL is definitely not nice. >From 5942a3a5b2c90056119b9873c81f30dfa9e003af Mon Sep 17 00:00:00 2001 From: Arseny Sher Date: Thu, 12 Sep 2019 17:35:16 +0300 Subject: [PATCH] Avoid touching user indexes while they are being (r

Re: Parallel query vs smart shutdown and Postmaster death

2019-03-18 Thread Arseny Sher
parameter "target" to > "include_type_mask" to make it super clear what's going on. I thought that this is a bit too complicated for single use-case, but if you like it better, here is an updated version. -- Arseny Sher Postgres Professional: http://www.p

Re: Parallel query vs smart shutdown and Postmaster death

2019-03-16 Thread Arseny Sher
wever, I think there is a problem in your patch: we might be in post PM_RUN states due to FatalError, not because of shutdown. In this case, we shouldn't refuse to run bgws in the future. I would also merge the check into bgworker_should_start_now. -- Arseny Sher Postgres Professional: http://www.po

Re: Too rigorous assert in reorderbuffer.c

2019-02-15 Thread Arseny Sher
n. Maybe I just haven't tried hard enough though. Attached 'aborted_subxact_test.patch' is an illustration of such wrong cmin visibility on pg_attribute. It triggers assertion failure, but otherwise ok (no user-facing issues), as I said earlier, so I am disinclined to include it in the

Re: Too rigorous assert in reorderbuffer.c

2019-02-12 Thread Arseny Sher
, it works. I thought for a moment that some obscure cases where cmax on a single tuple is not strictly monotonic might exist, but looks like they don't. So your change is ok for me, reshaping assert is better than removing. make check is also good on all supported branches. -- Arseny Sher P

Re: Too rigorous assert in reorderbuffer.c

2019-02-07 Thread Arseny Sher
Alvaro Herrera writes: > On 2019-Feb-06, Arseny Sher wrote: > >> >> Alvaro Herrera writes: >> >> > note the additional pg_temp_XYZ row in the middle. This is caused by >> > the rewrite in ALTER TABLE. Peter E fixed that in Pg11 in commit >>

Re: Too rigorous assert in reorderbuffer.c

2019-02-06 Thread Arseny Sher
d it breaking the test. Oh, I see. Let's just remove the first insertion then, as in attached. I've tested it on master and on 9.4. -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company >From b34726d9b7565df73319a44664d9cd04de5f514f Mon Sep

Too rigorous assert in reorderbuffer.c

2019-01-30 Thread Arseny Sher
nyway and wasn't covered before. >From c5cd30f9e23c96390cafad82f832c918dfd3397f Mon Sep 17 00:00:00 2001 From: Arseny Sher Date: Wed, 30 Jan 2019 23:31:47 +0300 Subject: [PATCH] Remove assertion in reorderbuffer that cmax is stable. Since it can be rewritten arbitrary number of times if

Re: [HACKERS] logical decoding of two-phase transactions

2019-01-14 Thread Arseny Sher
n has been decoded. The gid field, a commit prepared transaction *record* has been decoded? Fourth patch: Applying: Teach test_decoding plugin to work with 2PC .git/rebase-apply/patch:347: trailing whitespace. -- test savepoints .git/rebase-apply/patch:424: trailing whitespace. # get XID of t

Re: [HACKERS] logical decoding of two-phase transactions

2018-12-18 Thread Arseny Sher
here addressed in the latest version though. -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

Re: Global snapshots

2018-09-26 Thread Arseny Sher
forward with arbitrary speed (e.g. clocks can be stopped), but it must never go backwards. So if leap second correction is implemented by doubling the duration of a certain second (as it usually seems to be), we are fine. > Also, I could not understand some notes from Arseny: > >> July 25, 201
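
The clock requirement stated in the message above (time may advance at any speed, or even stall, but must never go backwards) can be illustrated with a minimal sketch. This is not code from the global snapshots patch; `MonotonicClock` and `monotonic_clock_read` are hypothetical names, and the raw reading is passed in by the caller so the example makes no assumption about which time source is used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrapper state: the largest timestamp handed out so far. */
typedef struct MonotonicClock
{
	uint64_t	last;
} MonotonicClock;

/*
 * Return a timestamp that never decreases.
 *
 * raw_now is whatever the underlying (possibly adjustable) clock reports.
 * If it is behind the last value we returned, the clock appears "stopped"
 * at that value until the raw clock catches up; it never goes backwards.
 */
static uint64_t
monotonic_clock_read(MonotonicClock *clock, uint64_t raw_now)
{
	assert(clock != NULL);

	if (raw_now > clock->last)
		clock->last = raw_now;

	return clock->last;
}
```

With this wrapper a backward step of the raw clock simply repeats the previous value, which matches the "clocks can be stopped" case; a leap second implemented by stretching one second needs no special handling at all.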

Re: [HACKERS] logical decoding of two-phase transactions

2018-08-12 Thread Arseny Sher
ted) transaction or decoding of an uncommitted transaction, >> this >> + change callback is ensured sane access to catalog tables regardless of >> + simultaneous rollback by another backend of this very same >> transaction. >> >> I don't think we should explain this, at least in such words. As >> mentioned upthread, we should warn about allowed systable_* accesses >> instead. Same for message_cb. >> > > Looks like you are looking at an earlier patchset. The latest patchset > has removed the above. I see, sorry. -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

Re: [HACKERS] logical decoding of two-phase transactions

2018-08-07 Thread Arseny Sher
s (ABORT PREPARED) for any reason; * Decoding process notices this on catalog scan and calls abort() callback; * Later decoding process reads abort record and calls abort_prepared callback. -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

Re: [HACKERS] logical decoding of two-phase transactions

2018-08-06 Thread Arseny Sher
), +errmsg("transaction aborted during system catalog scan"))); Probably centralize checks in one function? As well as 'We don't expect direct calls to heap_fetch...' ones. P.S. Looks like you have torn the thread chain: In-Reply-To header of mail [1] is missing. Please don't do that. [1] https://www.postgresql.org/message-id/CAMGcDxeqEpWj3fTXwqhSwBdXd2RS9jzwWscO-XbeCfso6ts3%2BQ%40mail.gmail.com -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

Re: Global snapshots

2018-07-25 Thread Arseny Sher
urrently we export snap only once in xact. * Remove assertion that circular buffer entries are monotonic, as GetOldestXmin *can* go backwards. [1] https://www.cockroachlabs.com/blog/living-without-atomic-clocks/ -- Arseny Sher Postgres Professional: http://

Re: Possible bug in logical replication.

2018-07-09 Thread Arseny Sher
Michael Paquier writes: > Okay, let's do as you suggest then. Do you find the attached adapted? Yes, thanks! -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

Re: pgsql: Fix "base" snapshot handling in logical decoding

2018-07-06 Thread Arseny Sher
heap_tuple does the job. > Thanks for the detective work! I pushed this test change. Thank you, I appreciate this. -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

Re: Possible bug in logical replication.

2018-07-02 Thread Arseny Sher
that? I'm practically happy with this. > * while confirmed_lsn is used as base point for the decoding context. This line is excessive as now we have comment below saying it doesn't matter. -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

Re: pgsql: Fix "base" snapshot handling in logical decoding

2018-07-02 Thread Arseny Sher
Arseny Sher writes: > There is also one thing that puzzles me as I don't know much about > vacuum internals. If I do plain VACUUM of pg_attribute in the test, it > shouts "catalog is missing 1 attribute(s) for relid" error (which is > quite expected), while with

Re: pgsql: Fix "base" snapshot handling in logical decoding

2018-06-30 Thread Arseny Sher
ess -- that is, tuple is successfully decoded with all three columns, as though VACUUM was not actually executed. All this is without the main patch, of course. I think I will look into this soon. -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff -

Re: Fix slot's xmin advancement and subxact's lost snapshots in decoding.

2018-06-26 Thread Arseny Sher
.. Your v3 patch fails for me on freshest master (4d54543efa) in exactly this assert, it looks like there is something wrong in your setup. -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

Re: Fix slot's xmin advancement and subxact's lost snapshots in decoding.

2018-06-26 Thread Arseny Sher
_push_tail(&txn->changes, &change->node); txn->nentries++; txn->nentries_mem++; Since we do that, probably we should replace all lsn validity checks with XLogRecPtrIsInvalid? -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

Re: Fix slot's xmin advancement and subxact's lost snapshots in decoding.

2018-06-21 Thread Arseny Sher
lit the diff into two, although both >> fixes use the by_base_snapshot_lsn field. > > Please don't. Yeah, I don't think we should bother with that. -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company >From 61b4b0a89c95f8912729c54c013b6403

Re: Possible bug in logical replication.

2018-06-20 Thread Arseny Sher
Michael Paquier writes: > On Mon, Jun 18, 2018 at 09:42:36PM +0900, Michael Paquier wrote: >> On Fri, Jun 15, 2018 at 06:27:56PM +0300, Arseny Sher wrote: >> It seems to me that we still want to have the slot forwarding finish in >> this case even if this is interrupted.

Re: Possible bug in logical replication.

2018-06-15 Thread Arseny Sher
heck in its main loop. * Copy-paste comment fix. [1] https://www.postgresql.org/message-id/5f85bf41-098e-c4e1-7332-9171fef57a0a%40enterprisedb.com -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company >From d8ed8ae3eec54b716d7dbb35379d0047a96c6c75 Mon S

Re: Fix slot's xmin advancement and subxact's lost snapshots in decoding.

2018-05-25 Thread Arseny Sher
he section for older bugs: > https://wiki.postgresql.org/wiki/PostgreSQL_11_Open_Items#Older_Bugs > However this list tends to be... Er... Ignored. > Thank you, Michael. I have created the commitfest entry. -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

Re: Possible bug in logical replication.

2018-05-24 Thread Arseny Sher
ning of the next record: Same problem should be handled at pg_logical_slot_get_changes_guts and apply worker feedback; and there is a convention that all commits since confirmed_flush must be decoded, so we risk decoding such boundary commit twice. -- Arseny Sher Postgres Professional: http://www.post

Re: Possible bug in logical replication.

2018-05-24 Thread Arseny Sher
h also fixes this. Indeed, but we have these problems only if we are trying to read WAL since confirmed_flush. [1] https://www.postgresql.org/message-id/873720e4hf.fsf%40ars-thinkpad -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

Re: Possible bug in logical replication.

2018-05-17 Thread Arseny Sher
nt on the valid record position (by just adding page header > size?). Well, restart_lsn is always available on live slot: it is initially set in ReplicationSlotReserveWal during slot creation. -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

Re: Indexes on partitioned tables and foreign partitions

2018-05-09 Thread Arseny Sher
't guarantee uniqueness this way; that's fully the user's responsibility. -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

Re: Indexes on partitioned tables and foreign partitions

2018-05-09 Thread Arseny Sher
just not creating them. [1] https://www.postgresql.org/message-id/flat/4F62FD69.2060007%40lab.ntt.co.jp#4f62fd69.2060...@lab.ntt.co.jp -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

Indexes on partitioned tables and foreign partitions

2018-05-09 Thread Arseny Sher
ttachrel->rd_rel->relkind == RELKIND_FOREIGN_TABLE) + return; + cxt = AllocSetContextCreate(CurrentMemoryContext, "AttachPartitionEnsureIndexes", ALLOCSET_DEFAULT_SIZES); -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

Re: Fix slot's xmin advancement and subxact's lost snapshots in decoding.

2018-04-16 Thread Arseny Sher
(delicate ping)

Fix slot's xmin advancement and subxact's lost snapshots in decoding.

2018-04-07 Thread Arseny Sher
them. Another problem is that new snapshots are never queued to known subxacts. It means decoding results can be wrong if toplevel doesn't write anything while subxact does. Please see detailed description of the issues, tests which reproduce them and fixes in the attached patch. -- Arseny She

Re: Why chain of snapshots is used in ReorderBufferCommit?

2018-03-01 Thread Arseny Sher
ver mind the last paragraph. -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

Two-phase update of restart_lsn in LogicalConfirmReceivedLocation

2018-02-28 Thread Arseny Sher
es it is possible that we will start decoding from segment which was already recycled, making the slot broken. Shouldn't this be fixed? -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company

Why chain of snapshots is used in ReorderBufferCommit?

2018-02-28 Thread Arseny Sher
'base_snapshot' concept, why first snapshots aren't logged to the change queue as any subsequent ones? -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company >From 2a02c117e538f5085e82f211b41c3b14c37ce3ff Mon Sep 17 00:00:00 2001 From: Ars

Re: Server Crash while executing pg_replication_slot_advance (second time)

2018-02-16 Thread Arseny Sher
other approach is to notice pointer to page start and replace it with ptr to first record, but that doesn't sound elegant. -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company >From d3db37bca231beb0081567ef818ac1ec852cbb1a Mon Sep 17 00:00:00 2001 From

Re: GSoC 2018

2017-12-15 Thread Arseny Sher
For logical replication, this can be set in the connection information of the subscription, and it defaults to the subscription name. [1] https://www.postgresql.org/docs/current/static/runtime-config-replication.html -- Arseny Sher Postgres Professional: http://www.postgrespro.com The Russian Postgres Company