Hi,

We have updated our Flink applications to 1.12.2, along with the
following modifications to improve their performance:

- Use unaligned checkpoints
- Change the following fs config
  - state.backend.fs.memory-threshold: 1048576
  - state.backend.fs.write-buffer-size: 4194304
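
For reference, a minimal sketch of how the settings above might look in
flink-conf.yaml. It assumes unaligned checkpoints are enabled through
configuration (the execution.checkpointing.unaligned key) rather than in
application code; only the two state.backend.fs values are taken from
the list above.

```yaml
# Assumption: unaligned checkpoints enabled via config, not via
# CheckpointConfig in the job code.
execution.checkpointing.unaligned: true

# State smaller than this (in bytes) is stored inline in the checkpoint
# metadata instead of in a separate file. 1048576 bytes = 1 MiB.
state.backend.fs.memory-threshold: 1048576

# Write buffer size (in bytes) for checkpoint streams.
# 4194304 bytes = 4 MiB.
state.backend.fs.write-buffer-size: 4194304
```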

However, our Flink applications now occasionally get stuck when taking
an unaligned checkpoint or a savepoint. The following operators are the
ones that get stuck in our cases:

- Kafka source connector.
- BroadcastProcessFunction with a data input and a broadcast
  configuration input.

Also, when an application is stuck, it stops consuming any data.

Since these operators do not have much data to store in a
checkpoint/savepoint, we wonder: how can we debug this problem?


-- 
ChangZhuo Chen (陳昌倬) czchen@{czchen,debconf,debian}.org
http://czchen.info/
Key fingerprint = BA04 346D C2E1 FE63 C790  8793 CC65 B0CD EC27 5D5B
