Hi,

My Beam 2.10-SNAPSHOT pipeline reads from KafkaIO and writes to BigQueryIO
configured with FILE_LOADS. What bothers me is that even though my
Flink 1.6 configuration contains

state.backend: rocksdb
state.backend.incremental: true

I still see checkpoint state as large as 230 MiB, checkpoint timeouts, and
checkpoints that take longer than 10 minutes to complete (I just saw one
that took more than 30 minutes).

Am I missing something? Is there room for improvement? Should I use a
different storage backend for the checkpoints? (Currently they are stored
on GCS.)

Best,
Tobi
