Hi Hangxiang,

We are using flink 1.14, the state backend is EmbeddedRocksDBStateBackend, and the Checkpoint Storage is filesystem. This is the checkpoint configuration from our running jobs:

Checkpointing Mode: Exactly Once
Checkpoint Storage: FileSystemCheckpointStorage
State Backend: EmbeddedRocksDBStateBackend
Interval: 10m 0s
Timeout: 20m 0s
Minimum Pause Between Checkpoints: 3m 0s
Maximum Concurrent Checkpoints: 1
Unaligned Checkpoints: Enabled
Aligned checkpoint timeout: 0ms
Persist Checkpoints Externally: Enabled (retain on cancellation)
Tolerable Failed Checkpoints: 5
Checkpoints With Finished Tasks: Disabled

Thanks,
Yifan

On 2023/09/07 06:16:41 Hangxiang Yu wrote:
> Hi, Yifan.
> Which flink version are you using ?
> You are using filesystem instead of rocksdb so that your checkpoint size
> may not be incremental IIUC.
>
> On Thu, Sep 7, 2023 at 10:52 AM Yifan He via user <us...@flink.apache.org>
> wrote:
>
> > Hi Shammon,
> >
> > We are using RocksDB, and the configuration is below:
> > execution.checkpointing.externalized-checkpoint-retention: RETAIN_ON_CANCELLATION
> > execution.checkpointing.max-concurrent-checkpoints: 1
> > execution.checkpointing.min-pause: 0
> > execution.checkpointing.mode: EXACTLY_ONCE
> > execution.checkpointing.snapshot-compression: true
> > execution.checkpointing.timeout: 60000
> > state.backend: FILESYSTEM
> > state.backend.incremental: true
> > state.backend.local-recovery: true
> > state.backend.rocksdb.memory.high-prio-pool-ratio: 0.1
> > state.backend.rocksdb.memory.managed: true
> > state.backend.rocksdb.memory.write-buffer-ratio: 0.5
> > state.backend.rocksdb.predefined-options: DEFAULT
> > state.backend.rocksdb.timer-service.factory: ROCKSDB
> > state.checkpoints.num-retained: 3
> >
> > Thanks,
> > Yifan
> >
> > On 2023/09/06 08:00:31 Shammon FY wrote:
> > > Hi Yifan,
> > >
> > > Besides reading job state, I would like to know what statebackend are you
> > > using? Can you give the configurations about state and checkpoint for your
> > > job? Maybe you can check these configuration items to confirm if they are
> > > correct first.
> > >
> > > Best,
> > > Shammon FY
> > >
> > > On Wed, Sep 6, 2023 at 3:17 PM Hang Ruan <ru...@gmail.com> wrote:
> > >
> > > > Hi, Yifan.
> > > >
> > > > I think the document[1] means to let us convert the DataStream to the
> > > > Table[2]. Then we could handle the state with the Table API & SQL.
> > > >
> > > > Best,
> > > > Hang
> > > >
> > > > [1] https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/libs/state_processor_api/
> > > > [2] https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/dev/table/data_stream_api/#converting-between-datastream-and-table
> > > >
> > > > Yifan He via user <us...@flink.apache.org> wrote on Wed, Sep 6, 2023 at 13:10:
> > > >
> > > >> Hi team,
> > > >>
> > > >> We are investigating why the checkpoint size of our FlinkSQL jobs keeps
> > > >> growing and we want to look into the checkpoint file to know what is
> > > >> causing the problem. I know we can use the state processor api to read the
> > > >> state of jobs using datastream api, but how can I read the state of jobs
> > > >> using table api & sql?
> > > >>
> > > >> Thanks,
> > > >> Yifan
> > > >>
>
> --
> Best,
> Hangxiang.
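
Editor's note: Hangxiang's point in the thread is that the posted flink-conf sets `state.backend: FILESYSTEM` alongside `state.backend.incremental: true` and several `state.backend.rocksdb.*` options, which only take effect with the RocksDB backend. The following is a minimal sketch, not part of the original thread, of how such a mismatch could be spotted in a flat `key: value` config. The function names `parse_flink_conf` and `check_state_backend` are hypothetical helpers, and the check only looks at the config text: a backend set in job code (as Yifan's later reply suggests may be the case) would override it. It also assumes `state.backend` holds a short name rather than a factory class name.

```python
# Sample taken from the configuration quoted in the thread (abridged).
SAMPLE_CONF = """
state.backend: FILESYSTEM
state.backend.incremental: true
state.backend.local-recovery: true
state.backend.rocksdb.memory.managed: true
state.backend.rocksdb.timer-service.factory: ROCKSDB
state.checkpoints.num-retained: 3
"""


def parse_flink_conf(text):
    """Parse flat 'key: value' lines into a dict, skipping blanks and '#' comments."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            conf[key.strip()] = value.strip()
    return conf


def check_state_backend(conf):
    """Return warnings for RocksDB-only settings used with a non-RocksDB backend."""
    warnings = []
    backend = conf.get("state.backend", "").lower()
    incremental = conf.get("state.backend.incremental", "false").lower() == "true"
    rocksdb_keys = sorted(k for k in conf if k.startswith("state.backend.rocksdb."))

    if backend != "rocksdb":
        if incremental:
            warnings.append(
                "state.backend.incremental is true, but state.backend is "
                f"'{backend or 'unset'}'; incremental checkpoints need RocksDB."
            )
        if rocksdb_keys:
            warnings.append(
                "RocksDB options are set but state.backend is "
                f"'{backend or 'unset'}': " + ", ".join(rocksdb_keys)
            )
    return warnings


for message in check_state_backend(parse_flink_conf(SAMPLE_CONF)):
    print("WARNING:", message)
```

Run against the configuration from the thread, this flags both the incremental setting and the RocksDB options. With `state.backend: rocksdb` it reports nothing.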