Could you please share the full JobManager logs?

AFAIK, the exceptions you attached are normal log output when the JobManager
is trying to acquire the ConfigMap lock.
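
For context, here is a minimal sketch of why a 409 "the object has been modified" response is usually harmless. Kubernetes uses optimistic concurrency: a PUT carrying a stale resourceVersion is rejected with a Conflict, and the caller is expected to re-read the object and try again. The store class, function names, and the `before_write` hook below are hypothetical stand-ins for illustration only; this is not the fabric8 or Flink implementation.

```python
class Conflict(Exception):
    """Stands in for the HTTP 409 Conflict returned by the API server."""


class FakeConfigMapStore:
    """Hypothetical in-memory stand-in for a single ConfigMap (not a real client)."""

    def __init__(self):
        self.resource_version = 1
        self.data = {}

    def get(self):
        # Return the current version together with a copy of the data.
        return self.resource_version, dict(self.data)

    def replace(self, expected_version, new_data):
        # Reject the write if the caller read an older version of the object.
        if expected_version != self.resource_version:
            raise Conflict("the object has been modified")
        self.data = dict(new_data)
        self.resource_version += 1
        return self.resource_version


def renew_lock(store, holder, max_attempts=3, before_write=None):
    """Read-modify-write with retry on conflict; returns the attempt that succeeded."""
    for attempt in range(1, max_attempts + 1):
        version, data = store.get()
        if before_write is not None:
            # Test hook: lets another writer sneak in between read and write.
            before_write(attempt)
        data["holder"] = holder
        try:
            store.replace(version, data)
            return attempt
        except Conflict:
            # Stale read: loop around, re-read the latest version, try again.
            continue
    raise Conflict(f"gave up after {max_attempts} attempts")
```

In this sketch a conflict on one attempt is simply followed by a fresh read and another write, which is why a single such message logged at DEBUG level is not by itself a sign of a failure.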

Best,
Yang

On Tue, Aug 31, 2021 at 4:36 AM, houssem <mejrihousse...@gmail.com> wrote:

> Hello, thanks for the response.
>
> I am using Kubernetes standalone application mode, not the native one.
>
> This error happens randomly at some point while running the job.
>
> Also, I am using just one replica of the JobManager.
>
> Here are some other logs:
>
>
> {"@timestamp":"2021-08-30T15:43:44.970+02:00","@version":"1","message":"Exception occurred while renewing lock: Unable to update ConfigMapLock","logger_name":"io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector","thread_name":"pool-685-thread-1","level":"DEBUG","level_value":10000,"stack_trace":"io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.LockException: Unable to update ConfigMapLock
>     at io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.ConfigMapLock.update(ConfigMapLock.java:108)
>     at io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.tryAcquireOrRenew(LeaderElector.java:156)
>     at io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.renew(LeaderElector.java:120)
>     at io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.lambda$null$1(LeaderElector.java:104)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: PUT at: https://172.31.64.1/api/v1/namespaces/flink-pushavoo-flink-rec/configmaps/elifibre-00000000000000000000000000000000-jobmanager-leader. Message: Operation cannot be fulfilled on configmaps \"elifibre-00000000000000000000000000000000-jobmanager-leader\": the object has been modified; please apply your changes to the latest version and try again. Received status: Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=null, kind=configmaps, name=elifibre-00000000000000000000000000000000-jobmanager-leader, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Operation cannot be fulfilled on configmaps \"elifibre-00000000000000000000000000000000-jobmanager-leader\": the object has been modified; please apply your changes to the latest version and try again, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Conflict, status=Failure, additionalProperties={}).
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:568)
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:507)
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:471)
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:430)
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleReplace(OperationSupport.java:289)
>     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleReplace(OperationSupport.java:269)
>     at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleReplace(BaseOperation.java:820)
>     at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.lambda$replace$1(HasMetadataOperation.java:86)
>     at io.fabric8.kubernetes.api.model.DoneableConfigMap.done(DoneableConfigMap.java:26)
>     at io.fabric8.kubernetes.api.model.DoneableConfigMap.done(DoneableConfigMap.java:5)
>     at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.replace(HasMetadataOperation.java:92)
>     at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.replace(HasMetadataOperation.java:36)
>     at io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.ConfigMapLock.update(ConfigMapLock.java:106)
>     ... 10 common frames omitted\n"}
>
>
> **********************************************************************************************************
>
>
>
>
>
> On 2021/08/30 10:53:10, Roman Khachatryan <ro...@apache.org> wrote:
> > Hello,
> >
> > Do I understand correctly that you are using native Kubernetes
> > deployment in application mode;
> > and the issue *only* happens if you set kubernetes-jobmanager-replicas
> > [1] to a value greater than 1?
> >
> > Does it happen during deployment or at some point while running the job?
> >
> > Could you share Flink and Kubernetes versions and HA configuration
> > [2]? (I'm assuming you're using Kubernetes for HA, not ZK).
> >
> > [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/config/#kubernetes-jobmanager-replicas
> > [2] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/ha/kubernetes_ha/
> >
> > Regards,
> > Roman
> >
> > On Fri, Aug 27, 2021 at 2:31 PM mejri houssem <mejrihousse...@gmail.com> wrote:
> > >
> > > Hello, I am deploying a Flink application cluster with Kubernetes HA mode, but I am facing this recurring problem and I don't know how to solve it.
> > >
> > > Any help would be appreciated.
> > >
> > >
> > >
> > > This is the log of the JobManager:
> > >
> > >
> > > {"@timestamp":"2021-08-27T14:19:42.447+02:00","@version":"1","message":"Exception occurred while renewing lock: Unable to update ConfigMapLock","logger_name":"io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector","thread_name":"pool-4092-thread-1","level":"DEBUG","level_value":10000,"stack_trace":"io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.LockException: Unable to update ConfigMapLock
> > >     at io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.ConfigMapLock.update(ConfigMapLock.java:108)
> > >     at io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.tryAcquireOrRenew(LeaderElector.java:156)
> > >     at io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.renew(LeaderElector.java:120)
> > >     at io.fabric8.kubernetes.client.extended.leaderelection.LeaderElector.lambda$null$1(LeaderElector.java:104)
> > >     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> > >     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> > >     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
> > >     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
> > >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> > >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> > >     at java.lang.Thread.run(Thread.java:748)
> > > Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: PUT at: https://172.31.64.1/api/v1/namespaces/flink-pushavoo-flink-rec/configmaps/elifibre-00000000000000000000000000000000-jobmanager-leader. Message: Operation cannot be fulfilled on configmaps \"elifibre-00000000000000000000000000000000-jobmanager-leader\": the object has been modified; please apply your changes to the latest version and try again. Received status: Status(apiVersion=v1, code=409, details=StatusDetails(causes=[], group=null, kind=configmaps, name=elifibre-00000000000000000000000000000000-jobmanager-leader, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=Operation cannot be fulfilled on configmaps \"elifibre-00000000000000000000000000000000-jobmanager-leader\": the object has been modified; please apply your changes to the latest version and try again, metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Conflict, status=Failure, additionalProperties={}).
> > >     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:568)
> > >     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:507)
> > >     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:471)
> > >     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:430)
> > >     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleReplace(OperationSupport.java:289)
> > >     at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleReplace(OperationSupport.java:269)
> > >     at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleReplace(BaseOperation.java:820)
> > >     at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.lambda$replace$1(HasMetadataOperation.java:86)
> > >     at io.fabric8.kubernetes.api.model.DoneableConfigMap.done(DoneableConfigMap.java:26)
> > >     at io.fabric8.kubernetes.api.model.DoneableConfigMap.done(DoneableConfigMap.java:5)
> > >     at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.replace(HasMetadataOperation.java:92)
> > >     at io.fabric8.kubernetes.client.dsl.base.HasMetadataOperation.replace(HasMetadataOperation.java:36)
> > >     at io.fabric8.kubernetes.client.extended.leaderelection.resourcelock.ConfigMapLock.update(ConfigMapLock.java:106)
> > >     ... 10 common frames omitted\n"}
> > >
> >
>
