Thanks, Yun. Let me try the options provided below.
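
For reference, a small inspection tool along the lines you suggest might look roughly like this. This is only a sketch assuming Flink 1.13-era APIs (the version your links point at) — the exact signature of `Checkpoints.loadCheckpointMetadata` has changed between Flink versions, so it may need adjusting:

```java
import java.io.DataInputStream;
import java.io.FileInputStream;

import org.apache.flink.runtime.checkpoint.Checkpoints;
import org.apache.flink.runtime.checkpoint.OperatorState;
import org.apache.flink.runtime.checkpoint.metadata.CheckpointMetadata;

public class InspectCheckpoint {

    public static void main(String[] args) throws Exception {
        // Path to the checkpoint's _metadata file, e.g.
        // s3 files downloaded locally: /tmp/chk-42/_metadata
        String metadataPath = args[0];

        try (DataInputStream in =
                new DataInputStream(new FileInputStream(metadataPath))) {

            // In Flink 1.13 this overload takes the stream, a class loader,
            // and the external pointer (used only for error messages).
            CheckpointMetadata metadata =
                    Checkpoints.loadCheckpointMetadata(
                            in,
                            InspectCheckpoint.class.getClassLoader(),
                            metadataPath);

            // Print the per-operator state sizes to see which operator
            // accounts for the bulk of the 18MB.
            for (OperatorState op : metadata.getOperatorStates()) {
                System.out.println(
                        op.getOperatorID()
                                + " -> " + op.getStateSize() + " bytes, "
                                + op.getNumberCollectedStates() + " subtask states");
            }
        }
    }
}
```

If most of the size is attributed to the Kinesis source operator, that would point at the per-shard sequence-number state you mention.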

Thanks,
Vijay

> On Jun 23, 2021, at 4:51 AM, Yun Tang <myas...@live.com> wrote:
> 
> 
> Hi Vijay,
> 
> To be honest, an 18MB checkpoint size in total might not be anything 
> serious. If you really want to dig into what is inside, you could use 
> Checkpoints#loadCheckpointMetadata [1] to load the _metadata and see whether 
> anything looks unexpected.
> 
> And you could refer to FlinkKafkaConsumerBase#unionOffsetStates [2] and 
> FlinkKinesisConsumer#sequenceNumsToRestore [3] to compare the operator 
> state stored by the Kafka and Kinesis connectors.
> 
> [1] 
> https://github.com/apache/flink/blob/10146366bec7feca85acedb23184b99517059bc6/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/Checkpoints.java#L99
> [2] 
> https://github.com/apache/flink/blob/10146366bec7feca85acedb23184b99517059bc6/flink-connectors/flink-connector-kafka/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaConsumerBase.java#L201
> [3] 
> https://github.com/apache/flink/blob/10146366bec7feca85acedb23184b99517059bc6/flink-connectors/flink-connector-kinesis/src/main/java/org/apache/flink/streaming/connectors/kinesis/FlinkKinesisConsumer.java#L158-L159
> 
> Best,
> Yun Tang
> From: Vijayendra Yadav <contact....@gmail.com>
> Sent: Wednesday, June 23, 2021 11:02
> To: user <user@flink.apache.org>
> Subject: High Flink checkpoint Size
>  
> Hi Team,
> 
> I have two Flink streaming jobs:
> 1) Flink streaming from Kafka, writing to S3
> 2) Flink streaming from Kinesis (KDS), writing to S3
> 
> Both jobs have a similar checkpoint duration.
> 
> Job #1 (Kafka) checkpoint size is only 85KB.
> Job #2 (Kinesis) checkpoint size is 18MB.
> 
> There are no checkpoint failures, but I want to understand why the Kinesis 
> job has such a large checkpoint size. Is there a way to handle it 
> differently and reduce the size?
> 
> Thanks,
> Vijay
