Hi,

On 2018-12-31 09:56:48 +0530, Amit Kapila wrote:
> To support logical decoding for zheap operations, we need a way to
> ensure zheap tuples can be registered as change streams.   One idea
> could be that we make ReorderBufferChange aware of another kind of
> tuples as well, something like this:
> 
> @@ -100,6 +123,20 @@ typedef struct ReorderBufferChange
>   ReorderBufferTupleBuf *newtuple;
>   } tp;
> + struct
> + {
> + /* relation that has been changed */
> + RelFileNode relnode;
> +
> + /* no previously reassembled toast chunks are necessary anymore */
> + bool clear_toast_afterwards;
> +
> + /* valid for DELETE || UPDATE */
> + ReorderBufferZTupleBuf *oldtuple;
> + /* valid for INSERT || UPDATE */
> + ReorderBufferZTupleBuf *newtuple;
> + } ztp;
> +
> 
> 
> +/* an individual zheap tuple, stored in one chunk of memory */
> +typedef struct ReorderBufferZTupleBuf
> +{
> ..
> + /* tuple header, the interesting bit for users of logical decoding */
> + ZHeapTupleData tuple;
> ..
> +} ReorderBufferZTupleBuf;
> 
> Apart from this, we need to define different decode functions for
> zheap operations as the WAL data is different for heap and zheap, so
> same functions can't be used to decode.

I'm very strongly opposed to that. We shouldn't have expose every
possible storage method to output plugins, that'll make extensibility
a farce.  I think we'll either have to re-form a HeapTuple or decide
to bite the bullet and start exposing tuples via slots.


> This email is primarily to discuss about how the logical decoding for
> basic DML operations (Insert/Update/Delete) will work in zheap.  We
> might need some special mechanism to deal with sub-transactions as
> zheap doesn't generate a transaction id for sub-transactions, but we
> can discuss that separately.

Subtransactions seems to be the hardest part besides the tuple format
issue, so I think we should discuss that very soon.

> +/*
> + * Write zheap's INSERT to the output stream.
> + */
> +void
> +logicalrep_write_zinsert(StringInfo out, Relation rel, ZHeapTuple newtuple)
> +{
> +     pq_sendbyte(out, 'I');          /* action INSERT */
> +
> +     Assert(rel->rd_rel->relreplident == REPLICA_IDENTITY_DEFAULT ||
> +                rel->rd_rel->relreplident == REPLICA_IDENTITY_FULL ||
> +                rel->rd_rel->relreplident == REPLICA_IDENTITY_INDEX);
> +
> +     /* use Oid as relation identifier */
> +     pq_sendint32(out, RelationGetRelid(rel));
> +
> +     pq_sendbyte(out, 'N');          /* new tuple follows */
> +     //logicalrep_write_tuple(out, rel, newtuple);
> +}

Obviously we need to do better - I don't think we should have
tuple-specific replication messages.


>  /*
>   * Write relation description to the output stream.
>   */
> diff --git a/src/backend/replication/logical/reorderbuffer.c 
> b/src/backend/replication/logical/reorderbuffer.c
> index 23466bade2..70fb5e2934 100644
> --- a/src/backend/replication/logical/reorderbuffer.c
> +++ b/src/backend/replication/logical/reorderbuffer.c
> @@ -393,6 +393,19 @@ ReorderBufferReturnChange(ReorderBuffer *rb, 
> ReorderBufferChange *change)
>                               change->data.tp.oldtuple = NULL;
>                       }
>                       break;
> +             case REORDER_BUFFER_CHANGE_ZINSERT:

This really needs to be undistinguishable from normal CHANGE_INSERT...

Greetings,

Andres Freund

Reply via email to