Thank you, Yang and Alexander.
It happened while testing restore from a k8s cluster backup; likely the job was 
upgraded after the "backup" but before the simulated disaster.
Thanks,
Alexey
________________________________
From: Yang Wang <danrtsey...@gmail.com>
Sent: Tuesday, November 16, 2021 8:00:13 PM
To: Alexander Preuß <alexanderpre...@ververica.com>
Cc: Alexey Trenikhun <yen...@msn.com>; Flink User Mail List 
<user@flink.apache.org>
Subject: Re: Could not retrieve submitted JobGraph from state handle

Hi Alexey,

If you delete the HA data stored in S3 manually, or perhaps you have configured an 
automatic clean-up rule, then it can happen that
the ConfigMap still holds the pointers while the concrete data in S3 is missing.


> How to clean the state handle store?
Since the handle is stored in the ConfigMap, I think you could use the 
following command to do the cleanup manually.
An easier way is to use a different cluster id.

kubectl delete cm --selector='app=<ClusterID>,configmap-type=high-availability'
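For reference, it may be safer to list the matching HA ConfigMaps first and only then delete them; a small sketch, where `my-flink-cluster` is a placeholder for your actual cluster id:

```shell
# Inspect which HA ConfigMaps exist for the cluster
# (replace my-flink-cluster with your real cluster id)
kubectl get cm --selector='app=my-flink-cluster,configmap-type=high-availability'

# After confirming they are the ones to remove,
# delete them to reset the state handle store
kubectl delete cm --selector='app=my-flink-cluster,configmap-type=high-availability'
```

Note that this wipes the HA metadata (leader/JobGraph pointers), so it should only be done when the cluster is stopped or when you intend to start fresh.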

Best,
Yang


Alexander Preuß 
<alexanderpre...@ververica.com<mailto:alexanderpre...@ververica.com>> 
wrote on Tue, Nov 16, 2021 at 10:30 PM:
Hi Alexey,

Are you maybe reusing the cluster-id?

Also, could you provide some more information on your setup and a more complete 
stacktrace?
The ConfigMap contains pointers to the actual files on Azure.

Best,
Alexander

On Tue, Nov 16, 2021 at 6:14 AM Alexey Trenikhun 
<yen...@msn.com<mailto:yen...@msn.com>> wrote:
Hello,
We are using Kubernetes HA and Azure Blob storage, and in rare cases I see the 
following error:

Could not retrieve submitted JobGraph from state handle under 
jobGraph-00000000000000000000000000000000. This indicates that the retrieved 
state handle is broken. Try cleaning the state handle store.

The question is, how exactly can I clean the "state handle store"? Delete the 
fsp-dispatcher-leader ConfigMap? Or some files (which ones) in Azure Blob 
storage?

Thanks,
Alexey
