Hello Community, Yang,

I have one more question, about logging. I noticed that when I run kubectl logs against the JM, the pods provisioned by the operator do not print the internal Flink logs. I only get output like the logs below; no actual Flink logs are printed. Where can I find the path to the logs? Should I use a sidecar container to get them out? How can I get the logs without checking the Flink WebUI?

The sed errors also confuse me, because the application is in fact up and running correctly when I access the WebUI through Ingress.
Reference: https://github.com/wangyang0918/flink-native-k8s-operator/issues/4

[root@bastion deploy]# kubectl logs -f flink-demo-594946fd7b-822xk
sed: couldn't open temporary file /opt/flink/conf/sedh1M3oO: Read-only file system
sed: couldn't open temporary file /opt/flink/conf/sed8TqlNR: Read-only file system
/docker-entrypoint.sh: line 75: /opt/flink/conf/flink-conf.yaml: Read-only file system
sed: couldn't open temporary file /opt/flink/conf/sedvO2DFU: Read-only file system
/docker-entrypoint.sh: line 88: /opt/flink/conf/flink-conf.yaml: Read-only file system
/docker-entrypoint.sh: line 90: /opt/flink/conf/flink-conf.yaml.tmp: Read-only file system
Start command: $JAVA_HOME/bin/java -classpath $FLINK_CLASSPATH -Xmx3462817376 -Xms3462817376 -XX:MaxMetaspaceSize=268435456 org.apache.flink.kubernetes.entrypoint.KubernetesApplicationClusterEntrypoint -D jobmanager.memory.off-heap.size=134217728b -D jobmanager.memory.jvm-overhead.min=429496736b -D jobmanager.memory.jvm-metaspace.size=268435456b -D jobmanager.memory.heap.size=3462817376b -D jobmanager.memory.jvm-overhead.max=429496736b
ERROR StatusLogger No Log4j 2 configuration file found. Using default configuration (logging only errors to the console), or user programmatically provided configurations. Set system property 'log4j2.debug' to show Log4j 2 internal initialization logging. See https://logging.apache.org/log4j/2.x/manual/configuration.html for instructions on how to configure Log4j 2
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.flink.api.java.ClosureCleaner (file:/opt/flink/lib/flink-dist_2.11-1.12.1.jar) to field java.util.Properties.serialVersionUID
WARNING: Please consider reporting this to the maintainers of org.apache.flink.api.java.ClosureCleaner
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
-------- The logs stop here; the Flink application logs are not printed here anymore ---------
^C

[root@bastion deploy]# kubectl logs -f flink-demo-taskmanager-1-1
sed: couldn't open temporary file /opt/flink/conf/sedaNDoNR: Read-only file system
sed: couldn't open temporary file /opt/flink/conf/seddze7tQ: Read-only file system
/docker-entrypoint.sh: line 75: /opt/flink/conf/flink-conf.yaml: Read-only file system
sed: couldn't open temporary file /opt/flink/conf/sedYveZoT: Read-only file system
/docker-entrypoint.sh: line 88: /opt/flink/conf/flink-conf.yaml: Read-only file system
/docker-entrypoint.sh: line 90: /opt/flink/conf/flink-conf.yaml.tmp: Read-only file system
Start command: $JAVA_HOME/bin/java -classpath $FLINK_CLASSPATH -Xmx697932173 -Xms697932173 -XX:MaxDirectMemorySize=300647712 -XX:MaxMetaspaceSize=268435456 org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner -D taskmanager.memory.framework.off-heap.size=134217728b -D taskmanager.memory.network.max=166429984b -D taskmanager.memory.network.min=166429984b -D taskmanager.memory.framework.heap.size=134217728b -D taskmanager.memory.managed.size=665719939b -D taskmanager.cpu.cores=1.0 -D taskmanager.memory.task.heap.size=563714445b -D taskmanager.memory.task.off-heap.size=0b --configDir /opt/flink/conf -Djobmanager.memory.jvm-overhead.min='429496736b' -Dpipeline.classpaths='file:usrlib/quickstart-0.1.jar' -Dtaskmanager.resource-id='flink-demo-taskmanager-1-1' -Djobmanager.memory.off-heap.size='134217728b' -Dexecution.target='embedded' -Dweb.tmpdir='/tmp/flink-web-d7691661-fac5-494e-8154-896b4fe30692' -Dpipeline.jars='file:/opt/flink/usrlib/quickstart-0.1.jar' -Djobmanager.memory.jvm-metaspace.size='268435456b' -Djobmanager.memory.heap.size='3462817376b' -Djobmanager.memory.jvm-overhead.max='429496736b'
ERROR StatusLogger No Log4j 2 configuration file found. Using default configuration (logging only errors to the console), or user programmatically provided configurations. Set system property 'log4j2.debug' to show Log4j 2 internal initialization logging. See https://logging.apache.org/log4j/2.x/manual/configuration.html for instructions on how to configure Log4j 2
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.flink.shaded.akka.org.jboss.netty.util.internal.ByteBufferUtil (file:/opt/flink/lib/flink-dist_2.11-1.12.1.jar) to method java.nio.DirectByteBuffer.cleaner()
WARNING: Please consider reporting this to the maintainers of org.apache.flink.shaded.akka.org.jboss.netty.util.internal.ByteBufferUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Apr 29, 2021 12:58:34 AM oracle.simplefan.impl.FanManager configure
SEVERE: attempt to configure ONS in FanManager failed with oracle.ons.NoServersAvailable: Subscription time out
-------- The logs stop here; the Flink application logs are not printed here anymore ---------

Best,
Fuyao

From: Fuyao Li <fuyao...@oracle.com>
Date: Friday, April 30, 2021 at 16:50
To: user <user@flink.apache.org>, Yang Wang <danrtsey...@gmail.com>
Subject: [External] : StopWithSavepoint() method doesn't work in Java based flink native k8s operator

Hello Community, Yang,

I am trying to extend the Flink
native Kubernetes operator by adding some new features based on the repo [1]. I wrote a method to implement the image update functionality [2]. I added the call triggerImageUpdate(oldFlinkApp, flinkApp, effectiveConfig); right after the existing call triggerSavepoint(oldFlinkApp, flinkApp, effectiveConfig); so that the operator can handle an image change.

Solution 1: I want to use the stopWithSavepoint() method to complete the task. However, it gets stuck and never completes. Even if I call get() on the CompletableFuture, it always times out and throws an exception. See the solution 1 logs [3].

Solution 2: I trigger a savepoint, then delete the deployment in the code, and then create a new application with the new image. This seems to work fine. Log link: [4]

My questions:
1. Why does solution 1 get stuck? The CompletableFuture returned by triggerSavepoint() works here, so why does stopWithSavepoint() always get stuck or time out? I am very confused.
2. I am still new to the Fabric8io library. Did I do anything wrong in the implementation? Maybe I should update the jobStatus? Please give me some suggestions.
3. For the workaround in solution 2, are there any negative side effects that I didn't notice?
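
For readers following along, the blocking wait described in solution 1 can be sketched with plain java.util.concurrent. This is only an illustration, not the operator code from [2]: the class name BoundedSavepointWait, the method awaitSavepoint, and the never-completing future that stands in for the result of stopWithSavepoint() are all hypothetical. It shows how get(timeout, unit) behaves when the future never completes, so that the caller can fall back (for example to the approach in solution 2) instead of blocking forever.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedSavepointWait {

    /**
     * Waits for a savepoint path with an upper bound on the wait time.
     * Returns the path if the future completes in time, otherwise empty.
     * (Hypothetical helper; the future is assumed to carry the savepoint path.)
     */
    public static Optional<String> awaitSavepoint(
            CompletableFuture<String> savepointFuture, long timeout, TimeUnit unit) {
        try {
            return Optional.of(savepointFuture.get(timeout, unit));
        } catch (TimeoutException e) {
            // The future did not complete in time: the "stuck" case from solution 1.
            return Optional.empty();
        } catch (ExecutionException e) {
            // The operation itself failed; e.getCause() holds the real error.
            return Optional.empty();
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still observe it.
            Thread.currentThread().interrupt();
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Stand-in for a stop-with-savepoint call that never finishes.
        CompletableFuture<String> stuck = new CompletableFuture<>();
        Optional<String> path = awaitSavepoint(stuck, 200, TimeUnit.MILLISECONDS);
        System.out.println("savepoint: " + path.orElse("none, falling back"));
    }
}
```

An empty result here only says that the wait gave up; it does not explain why the underlying operation hangs, which is the open question above.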
[1] https://github.com/wangyang0918/flink-native-k8s-operator
[2] https://pastebin.ubuntu.com/p/tQShjmdcJt/
[3] https://pastebin.ubuntu.com/p/YHSPpK4W4Z/
[4] https://pastebin.ubuntu.com/p/3VG7TtXXfh/

Best,
Fuyao