Flink Java17-based incompatibility with Kubernetes Operator

2025-07-23 Thread Nikola Milutinovic
Hello all. Just wanted to give a heads-up that Flink 1.20 images based on Java 17 have problems with launching jobs via Flink Kubernetes Operator. Those based on Java 11 work. We are running a Flink Session cluster on Kubernetes, deploying it using Flink K8s Operator. Our session cluster was r

Re: [PyFlink] Issue Deploying FlinkSessionJob with PVC and Python Script Access

2025-07-07 Thread Nikola Milutinovic
Hi Pachara. We are also struggling to get a PyFlink job to run. Hitting some other wall, but I would like to share some thoughts here. At the moment, we are launching our fleet of jobs using a specially crafted Docker image: Flink 1.20.1 + plugins + PyFlink + our Python libs + our jobs. From t

Re: Problems running Flink job via K8s operator

2025-06-11 Thread Nikola Milutinovic
some hints and reassurances. Nikola. From: Gunnar Morling Date: Wednesday, June 11, 2025 at 2:19 PM To: Nikola Milutinovic Subject: Re: Problems running Flink job via K8s operator Hey Nikola, > Problem 1: The first problem is that the operator doesn’t know about “local” > URI schema u

Problems running Flink job via K8s operator

2025-06-11 Thread Nikola Milutinovic
Hello. I have problems trying to run a Flink session job using Flink Kubernetes operator. Two problems, so far. This is the Spec I am trying to use: apiVersion: flink.apache.org/v1beta1 kind: FlinkSessionJob metadata: name: nix-test spec: deploymentName: flink-cluster-session-cluster job:

Re: Savepoints and Checkpoints missing files

2025-06-10 Thread Nikola Milutinovic
small amount of data? Nikola. From: Nikola Milutinovic Date: Tuesday, June 10, 2025 at 4:01 PM To: Cc: Flink Users Subject: Re: Savepoints and Checkpoints missing files Hi Gabor. Thanks for chiming in. I think it is failing but I could be mistaken. There are no errors in the log, everything looks

Re: Savepoints and Checkpoints missing files

2025-06-10 Thread Nikola Milutinovic
ile, as well? Or can I do a better job of analyzing it? Nikola. From: Gabor Somogyi Date: Tuesday, June 10, 2025 at 10:52 AM To: Nikola Milutinovic Cc: Flink Users Subject: Re: Savepoints and Checkpoints missing files Hi Nikola, Fails on how? Some stack trace or error would be beneficial. G

Savepoints and Checkpoints missing files

2025-06-10 Thread Nikola Milutinovic
Hello. We are running Flink 1.20.1 on Kubernetes (AKS). We have observed a consistent error situation: both checkpoints and savepoints only save “_metadata” file and nothing else. Sometimes this is OK, where all data is in that one file. But sometimes “_metadata” holds references to other files

Re: Kubernetes Operator Flink version null for FlinkSessionsJob

2025-06-06 Thread Nikola Milutinovic
What does your YAML for Job submission look like? And the YAML for Session cluster, for that matter. It is hard to tell without those. Nix, From: dominik.buen...@swisscom.com Date: Thursday, June 5, 2025 at 3:40 PM To: user@flink.apache.org Subject: Kubernetes Operator Flink version null for

Graceful stopping of Flink on K8s

2025-05-15 Thread Nikola Milutinovic
Hello all. We are running Flink 1.20 on Kubernetes cluster. We deploy using Flink K8s Operator. I was wandering, when Kubernets decides to kill a running Flink cluster, is it using some regular graceful method or does it just kill the pod? Just for the reference, Docker has a way to specify a

Re: Python based User defined function on Flink 1.19.1

2025-05-14 Thread Nikola Milutinovic
Hmm, lemme see… Oh, yes, we also had to link Python bin: ldconfig /usr/lib ln -s /usr/bin/python3 /usr/bin/python Classic trickery. Nix, From: George Date: Wednesday, May 14, 2025 at 1:09 PM To: Nikola Milutinovic , user@flink.apache.org Subject: Re: Python based User defined function on

Re: Python based User defined function on Flink 1.19.1

2025-05-14 Thread Nikola Milutinovic
Hi George. We saw the same problem, running Apache Flink 1.19 and 1.20 images. The cause is that Flink image provides a JRE and you need JDK to build/install PyFlink. And, oddly enough, I think it was only on ARM64 images. Amd64 was OK, I think. So, Mac M1, M2, M3… Our Docker file for building

Re: Flink task manager PODs autoscaling - K8s installation

2025-05-13 Thread Nikola Milutinovic
You can definitely run Flink in native Kubernetes mode without K8s Operator. Although, it should be easier to deploy such a cluster with the operator. Operator will just make the process easer. Nix. From: Kamal Mittal via user Date: Tuesday, May 13, 2025 at 3:59 AM To: Kamal Mittal , user@flin

Re: flink on native kubernetes does not rely on deployment to start Pod

2025-05-12 Thread Nikola Milutinovic
Hi Melin. Flink integration with Kubernetes is really just about resource provisioning. The behavior that you describe is caused by Flink’s resilience feature, implemented as Job restart policy. And it would be similar even without K8s integration. If the job is configured to restart on errors,

Flink checkpoints, is this OK?

2025-03-20 Thread Nikola Milutinovic
Hi all. I would like to ask if what I am seeing is good or not. We are running Flink as Kubernetes session cluster and have checkpoints enabled. When I inspect a checkpoint, I can see only one file: „_metadata“. As I understand it, that is OK, if the state in question is sufficiently small to f

Re: Kubernetes FlinkDeployments and Batch Jobs

2025-03-07 Thread Nikola Milutinovic
I think your idea is OK. You have 2 options, generally. You can launch application cluster(s), which do linger around when the application is finished. And you get the idea: to read the status, logs, etc. That is what you are doing right now. The other option is to have a session cluster which

Re: Implement HPA with Flink operator

2025-03-01 Thread Nikola Milutinovic
Hi Bianca. I think you are missing the point here. HPA can scale any resource based on the given metric. And, in the example you point to, they are scaling FlinkDeployments. Basically, they would be launching multiple instances of the same pipeline, working on the same target(s). If such parall

Re: How can we read checkpoint data for debugging state

2025-02-21 Thread Nikola Milutinovic
Hi Sachin. When you define your pipeline, there is an option to assign a UID to each task in the job graph. If you do not, Flink will auto-generate UID for you – each time you run the pipeline. It is highly recommended to define your own UIDs, since any re-run would assign new UIDs and effecti

Re: Please help with Flink + InfluxDB

2025-02-18 Thread Nikola Milutinovic
helps. Nix. From: Siva Krishna Date: Thursday, February 13, 2025 at 11:25 AM To: Nikola Milutinovic Cc: user@flink.apache.org Subject: Re: Please help with Flink + InfluxDB Can you please provide how you are submitting the Python job via JM. Siva,. On Thu, Feb 13, 2025 at 3:44 PM Nikola

Re: Bug in Flink 'run' for Python Job ?

2025-02-18 Thread Nikola Milutinovic
Hi Siva. I believe, when we built our Flink image (Flink 1.20.0 + several connectors + our Python library code), we had to make a symbolic link: ln -s /usr/bin/python3 /usr/bin/python Try that. Or put a full path in -pyexec argument. Nix. From: Siva Krishna Date: Monday, February 17, 2025 at

Re: Please help with Flink + InfluxDB

2025-02-13 Thread Nikola Milutinovic
successfully launching PyFlink jobs from JM, in Docker Compose. Nix,. From: Siva Krishna Date: Thursday, February 13, 2025 at 5:20 AM To: Nikola Milutinovic Cc: user@flink.apache.org Subject: Re: Please help with Flink + InfluxDB Hi Nix, I tried the python approach and successfully ran it with command

Re: Flink kafka source with OAuth authentication token refresh

2025-02-11 Thread Nikola Milutinovic
So, Flink is going to act as an OAuth2 Client. You would need to have fetch access token and cache it, either in Flink or externally. I you wish to cache in Flink, look into Broadcast State pattern. These are URLs. https://flink.apache.org/2019/06/26/a-practical-guide-to-broadcast-state-in-apach

Re: table.exec.source.idle-timeout support

2025-02-06 Thread Nikola Milutinovic
resulting set of configs. BTW, how does it work when using Kubernetes operator? Nix From: Nikola Milutinovic Date: Tuesday, January 28, 2025 at 6:10 PM To: user@flink.apache.org Subject: Re: table.exec.source.idle-timeout support Hi Zhao. Yes, I am deploying our solution as a Session Cluster on

Re: Please help with Flink + InfluxDB

2025-01-31 Thread Nikola Milutinovic
we set on its constructor. So, log, log and log. Nix. From: Siva Krishna Date: Friday, January 31, 2025 at 4:41 AM To: Nikola Milutinovic Cc: user@flink.apache.org Subject: Re: Please help with Flink + InfluxDB Yes Nix. I can confirm that I am able to connect to InfluxDB from Flink(taskmanager

Re: Please help with Flink + InfluxDB

2025-01-30 Thread Nikola Milutinovic
mistake to target “localhost”, which stays inside the container. Nix. From: Siva Krishna Date: Thursday, January 30, 2025 at 11:07 AM To: Nikola Milutinovic Cc: user@flink.apache.org Subject: Re: Please help with Flink + InfluxDB Hi Nix, Thank you for your reply. I have small update. For

Re: Please help with Flink + InfluxDB

2025-01-30 Thread Nikola Milutinovic
Hi Siva. What is the InfluxDB URL you are using? I assume it is something like “influxdb://influxdb:8086/…”. If it is “localhost:8086…” that would be a problem. Do you see a connection on InfluxDB? Anything in the logs? Anything in the logs of Job Mnager? Nix, From: Siva Krishna Date: Thurs

Re: table.exec.source.idle-timeout support

2025-01-28 Thread Nikola Milutinovic
same version of Flink 1.20. And then it stopped, after a redeploy. I do not recall changing anything in the config of the cluster, so it felt like a “gremlin”. Nikola. From: Zhanghao Chen Date: Friday, January 24, 2025 at 2:50 AM To: Nikola Milutinovic , user@flink.apache.org Subject: Re

Re: [Flink kubernetes operator] custom dependent resources?

2025-01-14 Thread Nikola Milutinovic
I’d go for a Helm chart, where I would define FlinkDeployment, with desired pod templates and all other stuff. That other stuff can be your headless service, PVCs, CofigMaps,… In the pod template you can define also sidecars and init containers, should you need them. Nix. From: Gyula Fóra Dat

Re: Using data classes in pyflink

2025-01-13 Thread Nikola Milutinovic
Hi Oleksii. The core error is (as you have seen): Caused by: java.lang.ClassCastException: pemja.core.object.PyIterator cannot be cast to pemja.core.object.PyIterator Now, since the name of the class is the same, the only way a cast can fail is if the versions of that class are different. How i

Re: table.exec.source.idle-timeout support

2025-01-10 Thread Nikola Milutinovic
Hi Nic. I do not have a solution (soy), but have seen something similar. And have complained about it, already. Look up my mail in the archives. I am also deploying our session cluster using Flink Kubernetes operator. I was setting “execution.checkpointing.interval: 30”. This should set

Re: Flink CDC real-time synchronization of Oracle data reports errors

2025-01-10 Thread Nikola Milutinovic
Looks like your Oracle DB is not configured for CDC. You should follow instructions on Debezium site: https://debezium.io/blog/2022/09/30/debezium-oracle-series-part-1/ Nix. From: 秋天的叹息 <1341317...@qq.com> Date: Friday, January 10, 2025 at 9:06 AM To: user@flink.apache.org Subject: Flink CDC r

Re: Java 17 support in Flink

2025-01-06 Thread Nikola Milutinovic
On Java 17 support, what does that actually mean? We are using the official Flink 1.20 image, which is based on Java JRE 17 and in the config file we must specify a ton of those module exports, like “--add-exports=java.base/sun.net.util=ALL-UNNAMED”. That tells me that Flink codebase does not re

Flink Kubernetes Operator with Flink 1.20

2024-12-05 Thread Nikola Milutinovic
Hello. I am trying to make a deployment of Flink 1.20 session cluster using Kubernetes Operator v1.10. It looks like the operator is setting up incorrect configuration for 1.20. Namely, it is providing custom Flink configuration file as ${FLINK_HOME}/conf/flink-conf.yaml. That file path has be