Job failed to restart on time

2025-01-06 Thread Liting Liu (litiliu) via user
I'm using Flink 1.15.4. On one occasion, the job failed to restart on time. Here is part of my JobManager log: ``` 2024-12-30 18:50:32,089 INFO org.apache.flink.runtime.jobmaster.JobMaster [] - Using restart back off time strategy FixedDelayRestartBackoffTimeStrategy(maxNum

Re: Jobmanager restart after it has been requested to stop

2024-02-04 Thread Liting Liu (litiliu) via user
Time: 2024-02-02 17:56 To: Liting Liu (litiliu) Cc: user Subject: Re: Jobmanager restart after it has been requested to stop If you can find the "Deregistering Flink Kubernetes cluster, clusterId" in the JobManager log, then it is not the expected behavior. Having the full logs of JobMan

Jobmanager restart after it has been requested to stop

2024-02-01 Thread Liting Liu (litiliu) via user
Hi, community: I'm running a Flink 1.14.3 job with flink-kubernetes-operator 1.6.0 on AWS. I found that my Flink JobManager container's thread restarted after this FlinkDeployment had been requested to stop. Here is the JobManager log: 2024-02-01 21:57:48,977 tn="flink-akka.actor.defaul

Encounter library registration references a different set of library BLOBs after jobManager restarted

2023-07-13 Thread Liting Liu (litiliu)
Hi, Community. One of our Flink streaming jobs, running 1.14.3 without JobManager HA enabled, hit an issue. After the flink-main-container of the only JobManager pod restarted, some of the TaskManager pods kept throwing the exception below: INFO org.apa

Fail to run flink 1.17 job with flink-operator 1.5.0 version

2023-06-12 Thread Liting Liu (litiliu)
Hi, I was trying to submit a Flink 1.17 job with flink-kubernetes-operator v1.5.0, but encountered the exception below: The FlinkDeployment "test-scale-z6t4cd" is invalid: spec.flinkVersion: Unsupported value: "v1_17": supported values: "v1_13", "v1_14", "v1_15", "v1_16" I think

Re: How to specify both the resource limit and resource request for JM/TM in flink-operator

2023-01-12 Thread Liting Liu (litiliu)
It seems I can achieve this by specifying "kubernetes.jobmanager.cpu.limit-factor" and "kubernetes.taskmanager.cpu.limit-factor" in the Flink properties. These parameters are supported since Flink 1.15. ____________ From: Liting Liu (litiliu) Sent: 2023-01-12

How to specify both the resource limit and resource request for JM/TM in flink-operator

2023-01-12 Thread Liting Liu (litiliu)
Hi, community. How can I specify both the resource request and the resource limit for JM/TM in the podTemplate using flink-operator? We need to set the requested resources and the resource limits to different values. For example: jobManager: limits: cpu: 500m memory: 500Mi reques

configMap value error when using flink-operator?

2022-10-25 Thread Liting Liu (litiliu)
Hi: I'm trying to deploy a Flink job with flink-operator, version 1.2.0. The YAML I use is here: apiVersion: flink.apache.org/v1beta1 kind: FlinkDeployment metadata: name: basic-example spec: image: flink:1.15 flinkVersion: v1_15 flinkConfiguration:

Status not clear when deploying batch job with flink-k8s-operator

2022-10-25 Thread Liting Liu (litiliu)
Hi, I'm deploying a Flink batch job with flink-k8s-operator. My flink-k8s-operator version is 1.2.0 and my Flink version is 1.14.6. I found that after the batch job finished executing, the jobManagerDeploymentStatus field became "MISSING" in the FlinkDeployment CRD, and the error field became "Missin

Re: Does kubernetes operator support manually triggering savepoint with canceling the job?

2022-10-19 Thread Liting Liu (litiliu)
9de925-b9ead1c58e7b timeStamp: 1666163606426 triggerType: MANUAL triggerId: '' triggerTimestamp: 0 triggerType: MANUAL startTime: '1666161791058' state: RUNNING From: Geng Biao Sent: 2022-10-04 13:5

fail to mount hadoop-config-volume when using flink-k8s-operator

2022-10-12 Thread Liting Liu (litiliu)
Hi, community: I'm using flink-k8s-operator v1.2.0 to deploy a Flink job. The "HADOOP_CONF_DIR" environment variable was set in the image that I built from flink:1.15. I found that the TaskManager pod was trying to mount a volume named "hadoop-config-volume" from a configMap. But the config

Does kubernetes operator support manually triggering savepoint with canceling the job?

2022-10-03 Thread Liting Liu (litiliu)
Hello Flink community: I want to manually trigger a savepoint with the help of the Kubernetes operator. But it seems the Kubernetes operator doesn't provide an option for whether to cancel the job when triggering a savepoint, because the `cancelJob` parameter is hard-coded to false in the latest code Abst

Re: Re: Re: get NoSuchMethodError when using flink flink-sql-connector-hive-2.2.0_2.11-1.14.4.jar

2022-09-01 Thread Liting Liu (litiliu)
tter way is to relocate the class. [1] https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#classloader-parent-first-patterns-default Best regards, Yuxia ____________ From: "Liting Liu (litiliu)" To: "User" Sent: Wednesday, 2022-08-31

get NoSuchMethodError when using flink flink-sql-connector-hive-2.2.0_2.11-1.14.4.jar

2022-08-31 Thread Liting Liu (litiliu)
Hi, I got a NoSuchMethodError when using Flink's flink-sql-connector-hive-2.2.0_2.11-1.14.4.jar. Exception in thread "main" org.apache.flink.table.client.SqlClientException: Unexpected exception. This is a bug. Please consider filing an issue. at org.apache.flink.table.client.SqlClient.start

Exception when calculating throughputEMA in 1.14.3

2022-08-22 Thread Liting Liu (litiliu)
Hi, we are using 1.14.3, but got "Time should be non negative" after the job had been running for days. What should I do to get rid of this exception? Do I have to disable the network-debloating feature? Is it caused by System.currentTimeMillis not always returning a value bigger than befor

Failed to restore from ck, because of KryoException

2022-05-05 Thread Liting Liu (litiliu)
Hi, we are using Flink 1.14.3. When the job tries to restart from a checkpoint, the following exception occurs. What's wrong, and how can I avoid it? Caused by: TimerException{com.esotericsoftware.kryo.KryoException: java.lang.IndexOutOfBoundsException: Index: 99, Size: 9 Serialization trace: