Apologies for the calculation mistake. The correct figure is 120 * 6 * 2 KB = 1440 KB ≈ 1.4 MB.
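
For reference, here is a rough sketch of the corrected estimate. The figures (6 events per second, about 2 KB per event, 2-minute checkpoint interval) come from my original message below; the function and constant names are only illustrative. It counts raw event payload only and assumes every event is stored in CEP state, so it does not account for any metadata or state backend overhead.

```python
# Back-of-the-envelope estimate of the raw state delta per checkpoint.
# Figures are taken from the quoted message; names are illustrative only.
EVENTS_PER_SECOND = 6              # average ingest rate
EVENT_SIZE_KB = 2                  # approximate size of one event
CHECKPOINT_INTERVAL_SECONDS = 120  # 2-minute checkpoint interval


def raw_delta_kb(events_per_second, event_size_kb, interval_seconds):
    """Raw payload (in KB) accumulated between two checkpoints,
    assuming every event is kept in CEP state."""
    return events_per_second * event_size_kb * interval_seconds


delta_kb = raw_delta_kb(
    EVENTS_PER_SECOND, EVENT_SIZE_KB, CHECKPOINT_INTERVAL_SECONDS
)
print(f"Estimated raw delta per checkpoint: {delta_kb} KB "
      f"(~{delta_kb / 1024:.1f} MB)")
```

Even with this corrected number, the estimate is still far below the 25-50 MB shown for the incremental checkpoints.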

> On 18-Oct-2022, at 1:35 AM, Puneet Duggal <puneetduggal1...@gmail.com> wrote:
>
> Hi,
>
> I am working on a use case which uses Flink CEP for pattern detection.
>
> Flink Version - 1.12.1
> Deployment Mode - Session Mode (Highly Available)
> State Backend - RocksDB
> Checkpoint Interval - 2 mins
> Checkpoint Mode - Exactly Once
>
> CEP pattern looks something like - A not_followed_by B within (40mins)
> After Match Skip Strategy - Skip Past Last Event
>
> In order to test out incremental checkpointing and its size, I deployed a job
> on a cluster (let's say cluster A, hence job name J(aa)) and that same job on
> cluster B 1 week later (Job Name J(ab)). Basically at any given point in
> time, both jobs (J(aa) and J(ab)) process exactly the same records. After 1
> week of deployment of J(ab), I found out that in spite of working on the same
> records and window time of 40mins (after which unmatched patterns should
> expire), the incremental checkpoint size of J(aa) is around 40-50MB whereas
> that of J(ab) is 25-30MB. My assumption of the incremental checkpoint is that
> it only contains delta state change after the last checkpoint which is same
> for both jobs. Attached screenshots for J(ab) and J(aa) respectively.
>
> J(ab)
>
> <Screenshot 2022-10-18 at 1.25.18 AM.png>
>
> J(aa)
>
> <Screenshot 2022-10-18 at 1.26.25 AM.png>
>
> Checkpoint Configuration
>
> <Screenshot 2022-10-18 at 1.29.10 AM.png>
>
>
> One more doubt on the same lines is that these jobs consume on an average 6
> events per second with one event of the size around 2KB. Assuming a
> checkpoint interval of 2 mins and each event getting stored in CEP state,
> total delta size of the state should be 2*60*6*1.32 = 316KB which is nowhere
> near to size shown in the incremental checkpoint for both the jobs. Even
> including meta info for these records, not sure what am i missing which is
> causing incremental checkpoints to be so huge.
>
>
> Regards,
> Puneet