Re: flink cdc postgres connector 1.1 - heartbeat.interval.ms - WAL consumption

bat man Thu, 08 Apr 2021 08:24:02 -0700

Thanks Till.

Hi Jark,


Any inputs, going through the code of 1.1 and 1.3 in the meantime.

Thanks,
Hemant

On Thu, Apr 8, 2021 at 3:52 PM Till Rohrmann <trohrm...@apache.org> wrote:

> Hi Hemant,
>
> I am pulling in Jark who is most familiar with Flink's cdc connector. He
> might also be able to tell whether the fix can be backported.
>
> Cheers,
> Till
>
> On Thu, Apr 8, 2021 at 10:42 AM bat man <tintin0...@gmail.com> wrote:
>
>> Anyone who has faced similar issues with cdc with Postgres.
>>
>> I see the restart_lsn and confirmed_flush_lsn constant since the snapshot
>> replication records were streamed even though I have tried inserting
>> a record in the whitelisted table.
>>
>> select * from pg_replication_slots;
>>   slot_name  |  plugin  | slot_type | datoid | database | temporary |
>> active | active_pid | xmin | catalog_xmin | restart_lsn |
>> confirmed_flush_lsn
>>
>> -------------+----------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------
>>  stream_cdc3 | pgoutput | logical   |  16411 | test_cdc | f         | t
>>    |       1146 |      |         6872 | 62/34000828 | 62/34000860
>>
>> I have passed the  heartbeat.interval.ms = 1000 and could see
>> the heartbeat events streamed to flink however the transaction log disk
>> usage and oldest replication slot lag consistently increasing. From [1] I
>> have also tried this -
>>
>> For other decoder plug-ins, it is recommended to create a supplementary
>> table that is not monitored by Debezium.
>>
>> A separate process would then periodically update the table (either
>> inserting a new event or updating the same row all over). PostgreSQL then
>> will invoke Debezium which will confirm the latest LSN and allow the
>> database to reclaim the WAL space.
>>
>> [image: Screenshot 2021-04-08 at 2.07.18 PM.png]
>>
>> [image: Screenshot 2021-04-08 at 2.07.52 PM.png]
>>
>> [1] -
>> https://debezium.io/documentation/reference/1.0/connectors/postgresql.html#wal-disk-space
>>
>> Thanks.
>>
>> On Wed, Apr 7, 2021 at 12:51 PM bat man <tintin0...@gmail.com> wrote:
>>
>>> Hi there,
>>>
>>> I am using flink 1.11 and cdc connector 1.1 to stream changes from a
>>> postgres table. I see the WAL consumption is increasing gradually even
>>> though the writes to tables are very less.
>>>
>>> I am using AWS RDS, from [1] I understand that setting the parameter
>>> heartbeat.interval.ms solves this WAL consumption issue. However, I
>>> tried setting this with no success.
>>>
>>> I found a bug [2] which seems to be taking care of committing the lsn
>>> for the db to release the wal. however this seems to be only fixed in 1.3
>>> which is compatible with flink 1.12.1. Is there any way this can be fixed
>>> in 1.11.1. Since I am using EMR and the latest flink version available is
>>> 1.11.
>>>
>>>
>>> [1] -
>>> https://debezium.io/documentation/reference/connectors/postgresql.html
>>> [2] - https://github.com/ververica/flink-cdc-connectors/issues/97
>>>
>>> Thanks.
>>> Hemant
>>>
>>

Re: flink cdc postgres connector 1.1 - heartbeat.interval.ms - WAL consumption

Reply via email to