Sorry for the late reply. If you restart the job, can it safely restore from the checkpoint and return to the checkpointed state?
Regards,
Ram

On Thu, Sep 28, 2023 at 4:46 PM Alexis Sarda-Espinosa <sarda.espin...@gmail.com> wrote:

> Hi Surendra,
>
> there are no exceptions in the logs, nor anything salient with
> INFO/WARN/ERROR levels. The checkpoints are definitely completing, we even
> set the config
>
> execution.checkpointing.tolerable-failed-checkpoints: 1
>
> Regards,
> Alexis.
>
> On Thu, Sep 28, 2023 at 9:32 AM Surendra Singh Lilhore <surendralilh...@gmail.com> wrote:
>
>> Hi Alexis,
>>
>> Could you please check the TaskManager log for any exceptions?
>>
>> Thanks
>> Surendra
>>
>> On Thu, Sep 28, 2023 at 7:06 AM Alexis Sarda-Espinosa <sarda.espin...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> We are using ABFSS for RocksDB's backend as well as the storage dir
>>> required for Kubernetes HA. In the Azure Portal's monitoring insights I see
>>> that every single operation contains failing transactions for the
>>> GetPathStatus API. Unfortunately I don't see any additional details, but I
>>> know the storage account is only used by Flink. Checkpointing isn't
>>> failing, but I wonder if this could be an issue in the long term?
>>>
>>> Regards,
>>> Alexis.