Hi Matt,

could you give us a bit more information about the windows you are using? They are tumbling windows. What's the size of the windows? Do you allow lateness of events? What's your checkpoint interval?
Are you using event time? If yes, how is the watermark generated?

You said that the number of events per window is more or less constant. Does this also apply to the size of the individual events?

Cheers,
Till

On Wed, May 27, 2020 at 1:21 AM Guowei Ma <guowei....@gmail.com> wrote:

> Hi, Matt
> The total size of the state of the window operator is related to the
> number of windows. For example, if you use keyBy plus a tumbling window,
> there would be as many windows as there are keys.
> Hope this helps.
> Best,
> Guowei
>
> Wissman, Matt <matt.wiss...@here.com> wrote on Wed, May 27, 2020 at 3:35 AM:
>
> > Hello Flink Community,
> >
> > I’m running a Flink pipeline that uses a tumbling window and incremental
> > checkpoints with RocksDB backed by S3. The number of objects in the window
> > is stable, but over time the checkpoint size grows, seemingly unbounded.
> > Within the first few hours after bringing the Flink pipeline up, the
> > checkpoint size is around 100K, but after a week of operation it grows to
> > around 100MB. The pipeline isn’t using any other Flink state besides the
> > state that the window uses. I think this has something to do with RocksDB’s
> > compaction, but shouldn’t the tumbling window state expire and be purged
> > from the checkpoint?
> >
> > Flink Version 1.7.1
> >
> > Thanks!
> >
> > -Matt