Re: flink cdc postgres connector 1.1 - heartbeat.interval.ms - WAL consumption

Till Rohrmann Thu, 08 Apr 2021 03:22:49 -0700

Hi Hemant,

I am pulling in Jark who is most familiar with Flink's cdc connector. He
might also be able to tell whether the fix can be backported.


Cheers,
Till

On Thu, Apr 8, 2021 at 10:42 AM bat man <tintin0...@gmail.com> wrote:

> Anyone who has faced similar issues with cdc with Postgres.
>
> I see the restart_lsn and confirmed_flush_lsn constant since the snapshot
> replication records were streamed even though I have tried inserting
> a record in the whitelisted table.
>
> select * from pg_replication_slots;
>   slot_name  |  plugin  | slot_type | datoid | database | temporary |
> active | active_pid | xmin | catalog_xmin | restart_lsn |
> confirmed_flush_lsn
>
> -------------+----------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------
>  stream_cdc3 | pgoutput | logical   |  16411 | test_cdc | f         | t
>    |       1146 |      |         6872 | 62/34000828 | 62/34000860
>
> I have passed the  heartbeat.interval.ms = 1000 and could see
> the heartbeat events streamed to flink however the transaction log disk
> usage and oldest replication slot lag consistently increasing. From [1] I
> have also tried this -
>
> For other decoder plug-ins, it is recommended to create a supplementary
> table that is not monitored by Debezium.
>
> A separate process would then periodically update the table (either
> inserting a new event or updating the same row all over). PostgreSQL then
> will invoke Debezium which will confirm the latest LSN and allow the
> database to reclaim the WAL space.
>
> [image: Screenshot 2021-04-08 at 2.07.18 PM.png]
>
> [image: Screenshot 2021-04-08 at 2.07.52 PM.png]
>
> [1] -
> https://debezium.io/documentation/reference/1.0/connectors/postgresql.html#wal-disk-space
>
> Thanks.
>
> On Wed, Apr 7, 2021 at 12:51 PM bat man <tintin0...@gmail.com> wrote:
>
>> Hi there,
>>
>> I am using flink 1.11 and cdc connector 1.1 to stream changes from a
>> postgres table. I see the WAL consumption is increasing gradually even
>> though the writes to tables are very less.
>>
>> I am using AWS RDS, from [1] I understand that setting the parameter
>> heartbeat.interval.ms solves this WAL consumption issue. However, I
>> tried setting this with no success.
>>
>> I found a bug [2] which seems to be taking care of committing the lsn for
>> the db to release the wal. however this seems to be only fixed in 1.3 which
>> is compatible with flink 1.12.1. Is there any way this can be fixed in
>> 1.11.1. Since I am using EMR and the latest flink version available is 1.11.
>>
>>
>> [1] -
>> https://debezium.io/documentation/reference/connectors/postgresql.html
>> [2] - https://github.com/ververica/flink-cdc-connectors/issues/97
>>
>> Thanks.
>> Hemant
>>
>

Re: flink cdc postgres connector 1.1 - heartbeat.interval.ms - WAL consumption

Reply via email to