Hi Austin, Yang, Matthias,
I am following up to see if you guys have any idea regarding this problem.
I also created an issue here to describe the problem. [1]
After looking into the source code[1], I believe for native k8s, three
configuration files should be added to the flink-config-<cluster-id> configmap
automatically. However, it just have the flink-conf.yaml in the operator
created flink application. And that is also causing the start command
difference mentioned in the issue.
Native k8s using Flink CLI: Args:
native-k8s
$JAVA_HOME/bin/java -classpath $FLINK_CLASSPATH -Xmx1073741824
-Xms1073741824 -XX:MaxMetaspaceSize=268435456
-Dlog.file=/opt/flink/log/jobmanager.log
-Dlogback.configurationFile=file:/opt/flink/conf/logback-console.xml
-Dlog4j.configuration=file:/opt/flink/conf/log4j-console.properties
-Dlog4j.configurationFile=file:/opt/flink/conf/log4j-console.properties
org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint
-D jobmanager.memory.off-heap.size=134217728b -D
jobmanager.memory.jvm-overhead.min=201326592b -D
jobmanager.memory.jvm-metaspace.size=268435456b -D
jobmanager.memory.heap.size=1073741824b -D
jobmanager.memory.jvm-overhead.max=201326592b
Operator flink app:
Args:
native-k8s
$JAVA_HOME/bin/java -classpath $FLINK_CLASSPATH -Xmx3462817376
-Xms3462817376 -XX:MaxMetaspaceSize=268435456
org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint
-D jobmanager.memory.off-heap.size=134217728b -D
jobmanager.memory.jvm-overhead.min=429496736b -D
jobmanager.memory.jvm-metaspace.size=268435456b -D
jobmanager.memory.heap.size=3462817376b -D
jobmanager.memory.jvm-overhead.max=429496736b
Please share your opinion on this. Thanks!
[1] https://github.com/wangyang0918/flink-native-k8s-operator/issues/4
[2]
https://github.com/apache/flink/blob/release-1.12/flink-kubernetes/src/main/java/org/apache/flink/kubernetes/kubeclient/decorators/FlinkConfMountDecorator.java
Have a good weekend!
Best,
Fuyao
From: Fuyao Li <[email protected]>
Date: Tuesday, May 4, 2021 at 19:52
To: Austin Cawley-Edwards <[email protected]>, [email protected]
<[email protected]>, Yang Wang <[email protected]>
Cc: user <[email protected]>
Subject: Re: [External] : Re: StopWithSavepoint() method doesn't work in Java
based flink native k8s operator
Hello All,
I also checked the native-k8s’s automatically generated configmap. It only
contains the flink-conf.yaml, but no log4j.properties. I am not very familiar
with the implementation details behind native k8s.
That should be the root cause, could you check the implementation and help me
to locate the potential problem.
Yang’s initial code:
https://github.com/wangyang0918/flink-native-k8s-operator/blob/master/src/main/java/org/apache/flink/kubernetes/operator/controller/FlinkApplicationController.java<https://urldefense.com/v3/__https:/github.com/wangyang0918/flink-native-k8s-operator/blob/master/src/main/java/org/apache/flink/kubernetes/operator/controller/FlinkApplicationController.java__;!!GqivPVa7Brio!K_YcBE0y2rd7zQtMtIKmSNNUpMmUTUSA8VdkNCJ8i9w2tH2nwNWoq3j7UZwZEh4$>
My modified version:
https://github.com/FuyaoLi2017/flink-native-kubernetes-operator/blob/master/src/main/java/org/apache/flink/kubernetes/operator/controller/FlinkApplicationController.java<https://urldefense.com/v3/__https:/github.com/FuyaoLi2017/flink-native-kubernetes-operator/blob/master/src/main/java/org/apache/flink/kubernetes/operator/controller/FlinkApplicationController.java__;!!GqivPVa7Brio!K_YcBE0y2rd7zQtMtIKmSNNUpMmUTUSA8VdkNCJ8i9w2tH2nwNWoq3j78oKszhY$>
Thank you so much.
Best,
Fuyao
From: Fuyao Li <[email protected]>
Date: Tuesday, May 4, 2021 at 19:34
To: Austin Cawley-Edwards <[email protected]>, [email protected]
<[email protected]>, Yang Wang <[email protected]>
Cc: user <[email protected]>
Subject: Re: [External] : Re: StopWithSavepoint() method doesn't work in Java
based flink native k8s operator
Hello Austin, Yang,
For the logging issue, I think I have found something worth to notice.
They are all based on base image flink:1.12.1-scala_2.11-java11
Dockerfile:
https://pastebin.ubuntu.com/p/JTsHygsTP6/<https://urldefense.com/v3/__https:/pastebin.ubuntu.com/p/JTsHygsTP6/__;!!GqivPVa7Brio!OVpkpwiox1P7d8b34Nmo9xfzrPlbO4cW65uPc-xLTxaPavtVFQ1GVzVs7J-Fd4Q$>
In the JM and TM provisioned by the k8s operator. There is only flink-conf.yaml
in the $FLINK_HOME/conf directory. Even if I tried to add these configurations
to the image in advance. It seems the operator is seems overriding it and
removing all other log4j configurations. This is causing the logs can’t be
printed correctly.
root@flink-demo-5fc78c8cf-hgvcj:/opt/flink/conf# ls
flink-conf.yaml
However, for the pods that is provisioned by flink native k8s CLI. There exists
some log4j related configurations.
root@test-application-cluster-79c7f9dcf7-44bq8:/opt/flink/conf# ls
flink-conf.yaml log4j-console.properties logback-console.xml
The native Kubernetes operator pod can print logs correctly because it has the
log4j.properties file mounted to the opt/flink/conf/ directory. [1]
For the Flink pods, it seems that it only have a flink-conf.yaml injected
there. [2][3] No log4j related configmap is configured. That makes the logs in
those pods no available.
I am not sure how to inject something similar to the flink pods? Maybe adding
some similar structure that exists in [1], into the cr.yaml ? So that such
configmap will make the log4j.properties available for flink CRD?
I am kind of confused at how to implement this. The deployment is a one-step
operation in [4]. I don’t know how to make a configmap available to it? Maybe I
can only use the new feature – pod template in Flink 1.13 to do this?
[1]
https://github.com/FuyaoLi2017/flink-native-kubernetes-operator/blob/master/deploy/flink-native-k8s-operator.yaml#L58<https://urldefense.com/v3/__https:/github.com/FuyaoLi2017/flink-native-kubernetes-operator/blob/master/deploy/flink-native-k8s-operator.yaml*L58__;Iw!!GqivPVa7Brio!OVpkpwiox1P7d8b34Nmo9xfzrPlbO4cW65uPc-xLTxaPavtVFQ1GVzVsYsKSwdA$>
[2]
https://github.com/FuyaoLi2017/flink-native-kubernetes-operator/blob/master/deploy/cr.yaml#L21<https://urldefense.com/v3/__https:/github.com/FuyaoLi2017/flink-native-kubernetes-operator/blob/master/deploy/cr.yaml*L21__;Iw!!GqivPVa7Brio!OVpkpwiox1P7d8b34Nmo9xfzrPlbO4cW65uPc-xLTxaPavtVFQ1GVzVs-oWyvPk$>
[3]
https://github.com/FuyaoLi2017/flink-native-kubernetes-operator/blob/master/src/main/java/org/apache/flink/kubernetes/operator/Utils/FlinkUtils.java#L83<https://urldefense.com/v3/__https:/github.com/FuyaoLi2017/flink-native-kubernetes-operator/blob/master/src/main/java/org/apache/flink/kubernetes/operator/Utils/FlinkUtils.java*L83__;Iw!!GqivPVa7Brio!OVpkpwiox1P7d8b34Nmo9xfzrPlbO4cW65uPc-xLTxaPavtVFQ1GVzVsk28At-8$>
[4]
https://github.com/FuyaoLi2017/flink-native-kubernetes-operator/blob/master/src/main/java/org/apache/flink/kubernetes/operator/controller/FlinkApplicationController.java#L176<https://urldefense.com/v3/__https:/github.com/FuyaoLi2017/flink-native-kubernetes-operator/blob/master/src/main/java/org/apache/flink/kubernetes/operator/controller/FlinkApplicationController.java*L176__;Iw!!GqivPVa7Brio!OVpkpwiox1P7d8b34Nmo9xfzrPlbO4cW65uPc-xLTxaPavtVFQ1GVzVsc49StKI$>
Best,
Fuyao
From: Fuyao Li <[email protected]>
Date: Tuesday, May 4, 2021 at 15:23
To: Austin Cawley-Edwards <[email protected]>, [email protected]
<[email protected]>
Cc: user <[email protected]>, Yang Wang <[email protected]>, Austin
Cawley-Edwards <[email protected]>
Subject: Re: [External] : Re: StopWithSavepoint() method doesn't work in Java
based flink native k8s operator
Hello All,
For Logging issue:
Hi Austin,
This is the my full code implementation link [1], I just removed the credential
related things. The operator definition can be found here.[2] You can also
check other parts if you find any problem.
The operator uses log4j2 and you can see there is a log appender for operator
[3] and the operator pod can indeed print out logs. But for the flink
application JM and TM pod, I can see the errors mentioned earlier. Sed error
and ERROR StatusLogger No Log4j 2 configuration file found.
I used to use log4j for flink application, to avoid potential incompatible
issue, I have already upgraded the POM for flink application to use log4j2. But
the logging problem still exists.
This is my log4j2.properties file in flink application. [6] This is the loggin
related pom dependencies for flink application [7].
The logs can be printed during normal native k8s deployment and IDE debugging.
When it comes to the operator, it seems not working. Could this be caused by
class namespace conflict? Since I introduced the presto jar in the flink
distribution. This is my Dockerfile to build the flink application jar [5].
Please share your idea on this.
For the stopWithSavepoint issue,
Just to note, I understand cancel command (cancelWithSavepoint() ) is a
deprecated feature and it may not guarantee exactly once semantic and get
inconsistent result, like Timer related things? Please correct me if I am
wrong. The code that works with cancelWithSavepoint() is shared in [4] below.
[1]
https://github.com/FuyaoLi2017/flink-native-kubernetes-operator<https://urldefense.com/v3/__https:/github.com/FuyaoLi2017/flink-native-kubernetes-operator__;!!GqivPVa7Brio!M43rnv3I8fCMsbDbdQLrgzWtsF6rSJJlebU1hGD9w0OMn702aTN6N-zS0pc8xyg$>
[2]
https://github.com/FuyaoLi2017/flink-native-kubernetes-operator/blob/master/deploy/flink-native-k8s-operator.yaml<https://urldefense.com/v3/__https:/github.com/FuyaoLi2017/flink-native-kubernetes-operator/blob/master/deploy/flink-native-k8s-operator.yaml__;!!GqivPVa7Brio!M43rnv3I8fCMsbDbdQLrgzWtsF6rSJJlebU1hGD9w0OMn702aTN6N-zSxwZ7y0c$>
[3]
https://github.com/FuyaoLi2017/flink-native-kubernetes-operator/blob/master/src/main/resources/log4j2.properties<https://urldefense.com/v3/__https:/github.com/FuyaoLi2017/flink-native-kubernetes-operator/blob/master/src/main/resources/log4j2.properties__;!!GqivPVa7Brio!M43rnv3I8fCMsbDbdQLrgzWtsF6rSJJlebU1hGD9w0OMn702aTN6N-zSmcVTvfY$>
[4]
https://pastebin.ubuntu.com/p/tcxT2FwPRS/<https://urldefense.com/v3/__https:/pastebin.ubuntu.com/p/tcxT2FwPRS/__;!!GqivPVa7Brio!M43rnv3I8fCMsbDbdQLrgzWtsF6rSJJlebU1hGD9w0OMn702aTN6N-zSpACgSC8$>
[5]
https://pastebin.ubuntu.com/p/JTsHygsTP6/<https://urldefense.com/v3/__https:/pastebin.ubuntu.com/p/JTsHygsTP6/__;!!GqivPVa7Brio!M43rnv3I8fCMsbDbdQLrgzWtsF6rSJJlebU1hGD9w0OMn702aTN6N-zSFEpFx6Q$>
[6]
https://pastebin.ubuntu.com/p/2wgdcxVfSy/<https://urldefense.com/v3/__https:/pastebin.ubuntu.com/p/2wgdcxVfSy/__;!!GqivPVa7Brio!M43rnv3I8fCMsbDbdQLrgzWtsF6rSJJlebU1hGD9w0OMn702aTN6N-zSZDmRV7I$>
[7]
https://pastebin.ubuntu.com/p/Sq8xRjQyVY/<https://urldefense.com/v3/__https:/pastebin.ubuntu.com/p/Sq8xRjQyVY/__;!!GqivPVa7Brio!M43rnv3I8fCMsbDbdQLrgzWtsF6rSJJlebU1hGD9w0OMn702aTN6N-zSDxCbn6M$>
Best,
Fuyao
From: Austin Cawley-Edwards <[email protected]>
Date: Tuesday, May 4, 2021 at 14:47
To: [email protected] <[email protected]>
Cc: Fuyao Li <[email protected]>, user <[email protected]>, Yang Wang
<[email protected]>, Austin Cawley-Edwards <[email protected]>
Subject: [External] : Re: StopWithSavepoint() method doesn't work in Java based
flink native k8s operator
Hey all,
Thanks for the ping, Matthias. I'm not super familiar with the details of @Yang
Wang<mailto:[email protected]>'s operator, to be honest :(. Can you share
some of your FlinkApplication specs?
For the `kubectl logs` command, I believe that just reads stdout from the
container. Which logging framework are you using in your application and how
have you configured it? There's a good guide for configuring the popular ones
in the Flink docs[1]. For instance, if you're using the default Log4j 2
framework you should configure a ConsoleAppender[2].
Hope that helps a bit,
Austin
[1]:
https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/advanced/logging/<https://urldefense.com/v3/__https:/ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/advanced/logging/__;!!GqivPVa7Brio!IkcTZZ5rY-669_XS8ldTeXg0NeH1nsQkupDh_zuUHAC4yqDOoiJ6f2EvCjPpPPQ$>
[2]:
https://logging.apache.org/log4j/2.x/manual/appenders.html#ConsoleAppender<https://urldefense.com/v3/__https:/logging.apache.org/log4j/2.x/manual/appenders.html*ConsoleAppender__;Iw!!GqivPVa7Brio!IkcTZZ5rY-669_XS8ldTeXg0NeH1nsQkupDh_zuUHAC4yqDOoiJ6f2EvHl40sbA$>
On Tue, May 4, 2021 at 1:59 AM Matthias Pohl
<[email protected]<mailto:[email protected]>> wrote:
Hi Fuyao,
sorry for not replying earlier. The stop-with-savepoint operation shouldn't
only suspend but terminate the job. Is it that you might have a larger state
that makes creating the savepoint take longer? Even though, considering that
you don't experience this behavior with your 2nd solution, I'd assume that we
could ignore this possibility.
I'm gonna add Austin to the conversation as he worked with k8s operators as
well already. Maybe, he can also give you more insights on the logging issue
which would enable us to dig deeper into what's going on with
stop-with-savepoint.
Best,
Matthias
On Tue, May 4, 2021 at 4:33 AM Fuyao Li
<[email protected]<mailto:[email protected]>> wrote:
Hello,
Update:
I think stopWithSavepoint() only suspend the job. It doesn’t actually terminate
(./bin/flink cancel) the job. I switched to cancelWithSavepoint() and it works
here.
Maybe stopWithSavepoint() should only be used to update the configurations like
parallelism? For updating the image, this seems to be not suitable, please
correct me if I am wrong.
For the log issue, I am still a bit confused. Why it is not available in
kubectl logs. How should I get access to it?
Thanks.
Best,
Fuyao
From: Fuyao Li <[email protected]<mailto:[email protected]>>
Date: Sunday, May 2, 2021 at 00:36
To: user <[email protected]<mailto:[email protected]>>, Yang Wang
<[email protected]<mailto:[email protected]>>
Subject: [External] : Re: StopWithSavepoint() method doesn't work in Java based
flink native k8s operator
Hello,
I noticed that first trigger a savepoint and then delete the deployment might
cause the duplicate data issue. That could pose a bad influence to the semantic
correctness. Please give me some hints on how to make the stopWithSavepoint()
work correctly with Fabric8io Java k8s client to perform this image update
operation. Thanks!
Best,
Fuyao
From: Fuyao Li <[email protected]<mailto:[email protected]>>
Date: Friday, April 30, 2021 at 18:03
To: user <[email protected]<mailto:[email protected]>>, Yang Wang
<[email protected]<mailto:[email protected]>>
Subject: [External] : Re: StopWithSavepoint() method doesn't work in Java based
flink native k8s operator
Hello Community, Yang,
I have one more question for logging. I also noticed that if I execute kubectl
logs command to the JM. The pods provisioned by the operator can’t print out
the internal Flink logs in the kubectl logs. I can only get something like the
logs below. No actual flink logs is printed here… Where can I find the path to
the logs? Maybe use a sidecar container to get it out? How can I get the logs
without checking the Flink WebUI? Also, the sed error makes me confused here.
In fact, the application is already up and running correctly if I access the
WebUI through Ingress.
Reference:
https://github.com/wangyang0918/flink-native-k8s-operator/issues/4<https://urldefense.com/v3/__https:/github.com/wangyang0918/flink-native-k8s-operator/issues/4__;!!GqivPVa7Brio!PZPkOj4s7du8ItEG-AxKGR2EN6pWDuKfwcjZNKbpLfhXHRD3IoaH6zptEJWo5vM$>
[root@bastion deploy]# kubectl logs -f flink-demo-594946fd7b-822xk
sed: couldn't open temporary file /opt/flink/conf/sedh1M3oO: Read-only file
system
sed: couldn't open temporary file /opt/flink/conf/sed8TqlNR: Read-only file
system
/docker-entrypoint.sh: line 75: /opt/flink/conf/flink-conf.yaml: Read-only file
system
sed: couldn't open temporary file /opt/flink/conf/sedvO2DFU: Read-only file
system
/docker-entrypoint.sh: line 88: /opt/flink/conf/flink-conf.yaml: Read-only file
system
/docker-entrypoint.sh: line 90: /opt/flink/conf/flink-conf.yaml.tmp: Read-only
file system
Start command: $JAVA_HOME/bin/java -classpath $FLINK_CLASSPATH -Xmx3462817376
-Xms3462817376 -XX:MaxMetaspaceSize=268435456
org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint
-D jobmanager.memory.off-heap.size=134217728b -D
jobmanager.memory.jvm-overhead.min=429496736b -D
jobmanager.memory.jvm-metaspace.size=268435456b -D
jobmanager.memory.heap.size=3462817376b -D
jobmanager.memory.jvm-overhead.max=429496736b
ERROR StatusLogger No Log4j 2 configuration file found. Using default
configuration (logging only errors to the console), or user programmatically
provided configurations. Set system property 'log4j2.debug' to show Log4j 2
internal initialization logging. See
https://logging.apache.org/log4j/2.x/manual/configuration.html<https://urldefense.com/v3/__https:/logging.apache.org/log4j/2.x/manual/configuration.html__;!!GqivPVa7Brio!PZPkOj4s7du8ItEG-AxKGR2EN6pWDuKfwcjZNKbpLfhXHRD3IoaH6zptpRoiZsE$>
for instructions on how to configure Log4j 2
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.flink.api.java.ClosureCleaner
(file:/opt/flink/lib/flink-dist_2.11-1.12.1.jar) to field
java.util.Properties.serialVersionUID
WARNING: Please consider reporting this to the maintainers of
org.apache.flink.api.java.ClosureCleaner
WARNING: Use --illegal-access=warn to enable warnings of further illegal
reflective access operations
WARNING: All illegal access operations will be denied in a future release
-------- The logs stops here, flink applications logs doesn’t get printed here
anymore---------
^C
[root@bastion deploy]# kubectl logs -f flink-demo-taskmanager-1-1
sed: couldn't open temporary file /opt/flink/conf/sedaNDoNR: Read-only file
system
sed: couldn't open temporary file /opt/flink/conf/seddze7tQ: Read-only file
system
/docker-entrypoint.sh: line 75: /opt/flink/conf/flink-conf.yaml: Read-only file
system
sed: couldn't open temporary file /opt/flink/conf/sedYveZoT: Read-only file
system
/docker-entrypoint.sh: line 88: /opt/flink/conf/flink-conf.yaml: Read-only file
system
/docker-entrypoint.sh: line 90: /opt/flink/conf/flink-conf.yaml.tmp: Read-only
file system
Start command: $JAVA_HOME/bin/java -classpath $FLINK_CLASSPATH -Xmx697932173
-Xms697932173 -XX:MaxDirectMemorySize=300647712 -XX:MaxMetaspaceSize=268435456
org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner -D
taskmanager.memory.framework.off-heap.size=134217728b -D
taskmanager.memory.network.max=166429984b -D
taskmanager.memory.network.min=166429984b -D
taskmanager.memory.framework.heap.size=134217728b -D
taskmanager.memory.managed.size=665719939b -D taskmanager.cpu.cores=1.0 -D
taskmanager.memory.task.heap.size=563714445b -D
taskmanager.memory.task.off-heap.size=0b --configDir /opt/flink/conf
-Djobmanager.memory.jvm-overhead.min='429496736b'
-Dpipeline.classpaths='file:usrlib/quickstart-0.1.jar'
-Dtaskmanager.resource-id='flink-demo-taskmanager-1-1'
-Djobmanager.memory.off-heap.size='134217728b' -Dexecution.target='embedded'
-Dweb.tmpdir='/tmp/flink-web-d7691661-fac5-494e-8154-896b4fe30692'
-Dpipeline.jars='file:/opt/flink/usrlib/quickstart-0.1.jar'
-Djobmanager.memory.jvm-metaspace.size='268435456b'
-Djobmanager.memory.heap.size='3462817376b'
-Djobmanager.memory.jvm-overhead.max='429496736b'
ERROR StatusLogger No Log4j 2 configuration file found. Using default
configuration (logging only errors to the console), or user programmatically
provided configurations. Set system property 'log4j2.debug' to show Log4j 2
internal initialization logging. See
https://logging.apache.org/log4j/2.x/manual/configuration.html<https://urldefense.com/v3/__https:/logging.apache.org/log4j/2.x/manual/configuration.html__;!!GqivPVa7Brio!PZPkOj4s7du8ItEG-AxKGR2EN6pWDuKfwcjZNKbpLfhXHRD3IoaH6zptpRoiZsE$>
for instructions on how to configure Log4j 2
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by
org.apache.flink.shaded.akka.org.jboss.netty.util.internal.ByteBufferUtil
(file:/opt/flink/lib/flink-dist_2.11-1.12.1.jar) to method
java.nio.DirectByteBuffer.cleaner()
WARNING: Please consider reporting this to the maintainers of
org.apache.flink.shaded.akka.org.jboss.netty.util.internal.ByteBufferUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal
reflective access operations
WARNING: All illegal access operations will be denied in a future release
Apr 29, 2021 12:58:34 AM oracle.simplefan.impl.FanManager configure
SEVERE: attempt to configure ONS in FanManager failed with
oracle.ons.NoServersAvailable: Subscription time out
-------- The logs stops here, flink applications logs doesn’t get printed here
anymore---------
Best,
Fuyao
From: Fuyao Li <[email protected]<mailto:[email protected]>>
Date: Friday, April 30, 2021 at 16:50
To: user <[email protected]<mailto:[email protected]>>, Yang Wang
<[email protected]<mailto:[email protected]>>
Subject: [External] : StopWithSavepoint() method doesn't work in Java based
flink native k8s operator
Hello Community, Yang,
I am trying to extend the flink native Kubernetes operator by adding some new
features based on the repo [1]. I wrote a method to release the image update
functionality. [2] I added the
triggerImageUpdate(oldFlinkApp, flinkApp, effectiveConfig);
under the existing method.
triggerSavepoint(oldFlinkApp, flinkApp, effectiveConfig);
I wrote a function to accommodate the image change behavior.[2]
Solution1:
I want to use stopWithSavepoint() method to complete the task. However, I found
it will get stuck and never get completed. Even if I use get() for the
completeableFuture. It will always timeout and throw exceptions. See solution 1
logs [3]
Solution2:
I tried to trigger a savepoint, then delete the deployment in the code and then
create a new application with new image. This seems to work fine. Log link: [4]
My questions:
1. Why solution 1 will get stuck? triggerSavepoint() CompleteableFuture
could work here… Why stopWithSavepoint() will always get stuck or timeout? Very
confused.
2. For Fabric8io library, I am still new to it, did I do anything wrong in
the implementation, maybe I should update the jobStatus? Please give me some
suggestions.
3. For work around solution 2, is there any bad influence I didn’t notice?
[1]
https://github.com/wangyang0918/flink-native-k8s-operator<https://urldefense.com/v3/__https:/github.com/wangyang0918/flink-native-k8s-operator__;!!GqivPVa7Brio!PJIKFBi86alhx1DCxiWp8FkWKToD8XC8tNHFFrYSZj3AKM3zqyiNRjijNSMY0DI$>
[2]
https://pastebin.ubuntu.com/p/tQShjmdcJt/<https://urldefense.com/v3/__https:/pastebin.ubuntu.com/p/tQShjmdcJt/__;!!GqivPVa7Brio!PJIKFBi86alhx1DCxiWp8FkWKToD8XC8tNHFFrYSZj3AKM3zqyiNRjijoiwPw-I$>
[3]
https://pastebin.ubuntu.com/p/YHSPpK4W4Z/<https://urldefense.com/v3/__https:/pastebin.ubuntu.com/p/YHSPpK4W4Z/__;!!GqivPVa7Brio!PJIKFBi86alhx1DCxiWp8FkWKToD8XC8tNHFFrYSZj3AKM3zqyiNRjijmgfSmqs$>
[4]
https://pastebin.ubuntu.com/p/3VG7TtXXfh/<https://urldefense.com/v3/__https:/pastebin.ubuntu.com/p/3VG7TtXXfh/__;!!GqivPVa7Brio!PJIKFBi86alhx1DCxiWp8FkWKToD8XC8tNHFFrYSZj3AKM3zqyiNRjijr_tizPo$>
Best,
Fuyao