Hi everyone!
I'm using Flink *1.12.0* with SQL API.
My streaming job is basically a join of two dynamic tables (from kafka
topics) and insertion the result into PostgreSQL table.
I have enabled watermarking based on kafka topic timestamp column for each
table in join:
CREATE TABLE table1 (
....,
kafka_time TIMESTAMP(3) METADATA FROM 'timestamp',
WATERMARK FOR kafka_time AS kafka_time - INTERVAL '2' HOUR
)
WITH ('format' = 'json',
...
But checkpoint data size for join task is permanently increasing despite the
watermarks on the tables and "Low watermark" mark in UI.
As far as I understand outdated records from both tables must be dropped
from checkpoint after 2 hours, but looks like it holds all job state since
the beginning.
Should I enable this behavior for outdated records explicitly of modify join
query?
Thank you!
--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/