Thanks, Yun. Let me try the options provided below.

Thanks,
Vijay
> On Jun 23, 2021, at 4:51 AM, Yun Tang <myas...@live.com> wrote:
>
> Hi Vijay,
>
> To be honest, an 18MB checkpoint size in total is not necessarily a
> problem. If you really want to dig into what is inside, you can use
> Checkpoints#loadCheckpointMetadata [1] to load the _metadata file and
> look for anything unexpected.
>
> You can also compare the operator state stored by the Kafka and Kinesis
> connectors: see FlinkKafkaConsumerBase#unionOffsetStates [2] and
> FlinkKinesisConsumer#sequenceNumsToRestore [3].
>
> [1] https://github.com/apache/flink/blob/10146366bec7feca85acedb23184b99517059bc6/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/Checkpoints.java#L99
> [2] https://github.com/apache/flink/blob/10146366bec7feca85acedb23184b99517059bc6/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumerBase.java#L201
> [3] https://github.com/apache/flink/blob/10146366bec7feca85acedb23184b99517059bc6/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L158-L159
>
> Best,
> Yun Tang
>
> From: Vijayendra Yadav <contact....@gmail.com>
> Sent: Wednesday, June 23, 2021 11:02
> To: user <user@flink.apache.org>
> Subject: High Flink checkpoint Size
>
> Hi Team,
>
> I have two Flink streaming jobs:
> 1) Flink streaming from Kafka, writing to S3
> 2) Flink streaming from Kinesis (KDS), writing to S3
>
> Both jobs have similar checkpoint durations.
>
> Job #1 (Kafka) checkpoint size is only 85KB.
> Job #2 (Kinesis) checkpoint size is 18MB.
>
> There are no checkpoint failures, but I want to understand why the Kinesis
> job has such a large checkpoint size. Is there a way to handle it
> differently and reduce the size?
>
> Thanks,
> Vijay
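For anyone following along, loading the _metadata file with Checkpoints#loadCheckpointMetadata might look roughly like the sketch below. This is a hedged example only: Checkpoints is an internal Flink class whose signature has varied across releases (the three-argument form matches the 1.13-era source linked in [1]), and the path and class name here are placeholders, not anything from the thread.

```java
import java.io.DataInputStream;
import java.io.FileInputStream;

import org.apache.flink.runtime.checkpoint.Checkpoints;
import org.apache.flink.runtime.checkpoint.OperatorState;
import org.apache.flink.runtime.checkpoint.metadata.CheckpointMetadata;

public class InspectCheckpoint {
    public static void main(String[] args) throws Exception {
        // Placeholder: point this at an actual checkpoint's _metadata file.
        String metadataPath = "/path/to/checkpoints/chk-42/_metadata";

        try (DataInputStream in =
                 new DataInputStream(new FileInputStream(metadataPath))) {
            // Internal API; signature taken from the Flink commit linked in [1].
            CheckpointMetadata metadata = Checkpoints.loadCheckpointMetadata(
                    in, InspectCheckpoint.class.getClassLoader(), metadataPath);

            // Print per-operator state sizes to see where the bytes go;
            // for the Kinesis job, the source operator's union state
            // (per-shard sequence numbers) is a likely contributor.
            for (OperatorState opState : metadata.getOperatorStates()) {
                System.out.println(opState.getOperatorID()
                        + " -> " + opState.getStateSize() + " bytes");
            }
        }
    }
}
```

Compiling this requires flink-runtime on the classpath, at the same version as the job that wrote the checkpoint.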