Hi,

On 01/29/2018 11:23 AM, Simon Riggs wrote:
> On 29 January 2018 at 07:15, Nikhil Sontakke <nikh...@2ndquadrant.com> wrote:
>
>>> Having this as responsibility of plugin sounds interesting. It
>>> certainly narrows the scope for which we need to solve the abort
>>> issue. For 2PC that may be okay as we need to somehow interact
>>> with transaction manager as Simon noted. I am not sure if this
>>> helps streaming use-case though as there is not going to be any
>>> external transaction management involved there.
>
> I think we should recognize that the two cases are different.
>
> 2PC decoding patch is looking at what happens AFTER a PREPARE has
> occurred on a transaction.
>
> Streaming looks to have streaming occur right in the middle of a
> transaction, so it has other concerns and likely other solutions as a
> result.
>
I don't quite see how these cases are so different, or why they would
require different handling. Can you explain?

We need to deal with decoding a transaction that aborts while we're
decoding it - we must not continue decoding it (in the sense of passing
the changes to the plugin). I don't see any major difference between
ROLLBACK and ROLLBACK PREPARED.

So I think the pre-abort hook solution should work for the streaming
case too. At least, I don't see why it wouldn't.

While discussing this with Peter Eisentraut off-list, I think we came
up with an alternative idea for the streaming case that does not
require the pre-abort hook.

The important detail is that we only really care about aborts in
transactions that modified catalogs in some way (e.g. by doing DDL).
But we can safely decode (and stream) changes up to the point when the
catalogs get modified, so we can do one of two things at that point:

(1) Stop streaming changes from that transaction, and instead start
spilling it to disk (and then continue with the replay only when it
actually commits). A trivial sketch of this decision point is appended
at the end of this mail.

(2) Note the transaction ID somewhere and restart the decoding (that
is, notify the downstream to throw away the data and go back in WAL to
read all the data from scratch), but spill that one transaction to
disk instead of streaming it.

Neither of these solutions is currently implemented. Both would
require changes to ReorderBuffer (which currently does not support
mixing spill-to-disk and streaming), and possibly to the logical
decoding infrastructure (e.g. to stash the XIDs somewhere and allow
restarting from a previous LSN).

The good thing is that this approach does not need to know about
aborted transactions, and so does not require communication with the
transaction manager using pre-abort hooks etc. I think that would be a
massive advantage.

The main question (at least for me) is whether it's actually cheaper
compared to the pre-abort hook. In my experience, aborts tend to be
fairly rare in practice - maybe 1:1000 to commits. On the other hand,
temporary tables are a fairly common thing, and they count as catalog
changes, of course.

Maybe this is not that bad, though, as it only really matters for
large transactions, which is when we start to spill to disk or stream.
And that should not happen very often - if it does, you probably need
a higher limit (similarly to work_mem). The cases where this would
matter are large ETL jobs, upgrade scripts and so on - these tend to
be large and mix in DDL (temporary tables, ALTER TABLE, ...). That's
unfortunate, as it's one of the cases streaming was supposed to help
with.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
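
PS: As promised, here is a trivial standalone sketch of the decision
point in (1). This is plain C, not actual ReorderBuffer code - all of
the struct and function names are made up - but it shows the idea: the
first catalog change flips the transaction from streaming to
spill-to-disk, and it stays there until commit, so a later abort just
means discarding the spill file.

/*
 * Toy model only - none of this is PostgreSQL code.
 */
#include <stdbool.h>
#include <stdio.h>

typedef enum
{
	TXN_STREAMING,				/* changes are sent downstream immediately */
	TXN_SPILLING				/* changes go to disk until commit */
} TxnMode;

typedef struct ToyTxn
{
	unsigned int xid;
	TxnMode		mode;
	bool		has_catalog_changes;
} ToyTxn;

/* Called for every decoded change belonging to the transaction. */
static void
apply_change(ToyTxn *txn, bool change_touches_catalog)
{
	if (change_touches_catalog)
		txn->has_catalog_changes = true;

	/*
	 * The moment the transaction modifies the catalog, stop streaming it
	 * and start spilling to disk. Replay then happens only at commit, so
	 * a later abort just throws the spilled data away.
	 */
	if (txn->mode == TXN_STREAMING && txn->has_catalog_changes)
	{
		printf("xid %u: catalog change seen, switching to spill-to-disk\n",
			   txn->xid);
		txn->mode = TXN_SPILLING;
	}

	if (txn->mode == TXN_STREAMING)
		printf("xid %u: streaming change downstream\n", txn->xid);
	else
		printf("xid %u: spilling change to disk\n", txn->xid);
}

int
main(void)
{
	ToyTxn		txn = {1234, TXN_STREAMING, false};

	apply_change(&txn, false);	/* plain DML - streamed */
	apply_change(&txn, true);	/* e.g. CREATE TEMP TABLE - triggers switch */
	apply_change(&txn, false);	/* subsequent changes go to the spill file */

	return 0;
}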