Re: Task Manager restart and RocksDB incremental checkpoints issue.

Yanfei Lei Thu, 10 Nov 2022 19:53:06 -0800

Hi Vidya Sagar,

Could you please share the reason for the TaskManager restart? If the machine
or the TaskManager's JVM process crashes, the `RocksDBKeyedStateBackend` can't
be disposed/closed normally, so the existing RocksDB instance directory
remains on disk.


BTW, if you use Application Mode on k8s and a TaskManager (pod) crashes, the
RocksDB directory is deleted when the pod is released.
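
For setups where the local directory does survive a crash, here is a minimal
sketch of how leftover instance directories could be spotted. It is not part
of Flink. It assumes each instance directory name ends in `_uuid_<UUID>` and
that a restarted task reuses the same prefix with a new UUID; the marker
string and the helper name `find_stale_instance_dirs` are hypothetical, so
please check the naming against your Flink version first. The sketch only
reports candidates and deletes nothing.

```python
from collections import defaultdict
from pathlib import Path

# Assumed separator between the job/operator prefix and the random UUID.
UUID_MARKER = "_uuid_"


def find_stale_instance_dirs(local_dir):
    """Return instance dirs that share a prefix with a newer instance dir."""
    groups = defaultdict(list)
    for entry in Path(local_dir).iterdir():
        if entry.is_dir() and UUID_MARKER in entry.name:
            # Everything before the last "_uuid_" identifies job + operator.
            prefix = entry.name.rsplit(UUID_MARKER, 1)[0]
            groups[prefix].append(entry)

    stale = []
    for dirs in groups.values():
        # Oldest first; the most recently modified dir is treated as live.
        dirs.sort(key=lambda p: p.stat().st_mtime)
        stale.extend(dirs[:-1])
    return sorted(stale)
```

You could call `find_stale_instance_dirs(...)` with the TaskManager's local
RocksDB directory and review the result by hand before removing anything,
ideally while no job is starting up on that TaskManager.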

Best,
Yanfei

On Fri, Nov 11, 2022 at 01:39, Vidya Sagar Mula <mulasa...@gmail.com> wrote:

> Hi,
>
> I am using the RocksDB state backend with incremental checkpointing on
> Flink 1.11.
>
> Question:
> ----------
> For a given Job ID, Intermediate RocksDB checkpoints are stored under the
> path defined with ""
>
> The files are stored with "_jobID+ random UUID" prefixed to the location.
>
> Case : 1
> ---------
> - When I cancel the job, all the RocksDB checkpoints are deleted
> properly from the location corresponding to that Job ID
> (based on the "instanceBasePath" variable stored in the
> RocksDBKeyedStateBackend object).
> No issue here. Working as expected.
>
> Case : 2
> ---------
> - When my TaskManager is restarted, the existing RocksDB checkpoints are
> not deleted.
> A new "instanceBasePath" is constructed with a new random UUID appended to
> the directory name.
> The old checkpoint directories are still there.
>
> Questions:
> - Is it expected behaviour that the existing checkpoint dirs under the
> RocksDB local directory are not deleted?
> - I see "StreamTaskStateInitializerImpl.java", where new StateBackend
> objects are created. In this case, a new directory is created for this
> Job ID, appended with a new random UUID.
> What happens to the old directories? Are they going to be purged later on?
> If not, the disk is going to fill up with the older checkpoints. Please
> clarify this.
>
> Thanks,
> Vidya Sagar.
>
