Hello all.
Just wanted to give a heads-up that Flink 1.20 images based on Java 17 have
problems launching jobs via the Flink Kubernetes Operator. Those based on Java
11 work.
We are running a Flink Session cluster on Kubernetes, deploying it using Flink
K8s Operator. Our session cluster was r
Hi Pachara.
We are also struggling to get a PyFlink job to run. Hitting some other wall,
but I would like to share some thoughts here.
At the moment, we are launching our fleet of jobs using a specially crafted
Docker image: Flink 1.20.1 + plugins + PyFlink + our Python libs + our jobs.
From t
some hints and reassurances.
Nikola.
From: Gunnar Morling
Date: Wednesday, June 11, 2025 at 2:19 PM
To: Nikola Milutinovic
Subject: Re: Problems running Flink job via K8s operator
Hey Nikola,
> Problem 1: The first problem is that the operator doesn’t know about “local”
> URI schema u
Hello.
I have problems trying to run a Flink session job using Flink Kubernetes
operator. Two problems, so far. This is the Spec I am trying to use:
apiVersion: flink.apache.org/v1beta1
kind: FlinkSessionJob
metadata:
name: nix-test
spec:
deploymentName: flink-cluster-session-cluster
job:
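For reference, a complete FlinkSessionJob spec typically looks something like this (a sketch only; the jar location, parallelism, and upgrade mode are assumptions, not taken from the actual deployment):

```yaml
apiVersion: flink.apache.org/v1beta1
kind: FlinkSessionJob
metadata:
  name: nix-test
spec:
  deploymentName: flink-cluster-session-cluster
  job:
    # jarURI must be fetchable by the operator itself; for session jobs,
    # "local://" paths inside the Flink image are generally not usable.
    jarURI: https://example.com/artifacts/my-job.jar   # hypothetical location
    parallelism: 2
    upgradeMode: stateless
```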
small amount of data?
Nikola.
From: Nikola Milutinovic
Date: Tuesday, June 10, 2025 at 4:01 PM
To:
Cc: Flink Users
Subject: Re: Savepoints and Checkpoints missing files
Hi Gabor.
Thanks for chiming in. I think it is failing but I could be mistaken. There are
no errors in the log, everything looks
ile,
as well? Or can I do a better job of analyzing it?
Nikola.
From: Gabor Somogyi
Date: Tuesday, June 10, 2025 at 10:52 AM
To: Nikola Milutinovic
Cc: Flink Users
Subject: Re: Savepoints and Checkpoints missing files
Hi Nikola,
Fails on how? Some stack trace or error would be beneficial.
G
Hello.
We are running Flink 1.20.1 on Kubernetes (AKS). We have observed a consistent
error situation: both checkpoints and savepoints only save “_metadata” file and
nothing else. Sometimes this is OK, where all data is in that one file. But
sometimes “_metadata” holds references to other files
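For what it's worth, whether state ends up inlined in "_metadata" or written out as separate files is governed by a size threshold: state handles smaller than the threshold are embedded directly into the metadata file. A sketch of the relevant setting (key name and default as I recall them; verify against your Flink version's docs):

```yaml
# State chunks smaller than this are inlined into the _metadata file
# instead of being written as separate checkpoint files.
state.storage.fs.memory-threshold: 20kb
```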
What does your YAML for Job submission look like? And the YAML for Session
cluster, for that matter.
It is hard to tell without those.
Nix,
From: dominik.buen...@swisscom.com
Date: Thursday, June 5, 2025 at 3:40 PM
To: user@flink.apache.org
Subject: Kubernetes Operator Flink version null for
Hello all.
We are running Flink 1.20 on Kubernetes cluster. We deploy using Flink K8s
Operator.
I was wondering: when Kubernetes decides to kill a running Flink cluster, does
it use some regular graceful method or does it just kill the pod?
Just for the reference, Docker has a way to specify a
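For what it's worth, Kubernetes normally terminates a pod gracefully: it sends SIGTERM to the containers, waits for a configurable grace period, and only then sends SIGKILL. The grace period can be set in the pod template (a minimal sketch):

```yaml
spec:
  # Default is 30 seconds; after this, the container gets SIGKILL.
  terminationGracePeriodSeconds: 60
```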
Hmm, lemme see…
Oh, yes, we also had to link Python bin:
ldconfig /usr/lib
ln -s /usr/bin/python3 /usr/bin/python
Classic trickery.
Nix,
From: George
Date: Wednesday, May 14, 2025 at 1:09 PM
To: Nikola Milutinovic , user@flink.apache.org
Subject: Re: Python based User defined function on
Hi George.
We saw the same problem, running Apache Flink 1.19 and 1.20 images. The cause
is that Flink image provides a JRE and you need JDK to build/install PyFlink.
And, oddly enough, I think it was only on the ARM64 images; amd64 was OK.
So, Mac M1, M2, M3…
Our Docker file for building
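For comparison, a minimal sketch of such a Dockerfile (package names, versions, and the base tag are assumptions and depend on the base distro; the key point is installing a full JDK and Python headers so PyFlink's native dependencies can build on top of the JRE-only official image):

```dockerfile
# Sketch: PyFlink on top of the official (JRE-only) Flink image.
FROM flink:1.20.1

USER root
# The base image ships only a JRE; building/installing PyFlink's
# native pieces needs a JDK plus Python headers and a compiler.
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        openjdk-11-jdk-headless python3 python3-dev python3-pip build-essential && \
    ln -s /usr/bin/python3 /usr/bin/python && \
    rm -rf /var/lib/apt/lists/*
RUN pip3 install apache-flink==1.20.1
USER flink
```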
You can definitely run Flink in native Kubernetes mode without K8s Operator.
Although, it should be easier to deploy such a cluster with the operator; it
simply streamlines the process.
Nix.
From: Kamal Mittal via user
Date: Tuesday, May 13, 2025 at 3:59 AM
To: Kamal Mittal , user@flin
Hi Melin.
Flink integration with Kubernetes is really just about resource provisioning.
The behavior you describe is caused by Flink’s resilience feature, implemented
as the job restart policy, and it would be similar even without K8s
integration. If the job is configured to restart on errors,
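That restart behavior is controlled by the restart strategy in the Flink configuration, along the lines of the following sketch (check the exact keys against your Flink version; older releases used `restart-strategy` instead of `restart-strategy.type`):

```yaml
restart-strategy.type: fixed-delay
restart-strategy.fixed-delay.attempts: 3
restart-strategy.fixed-delay.delay: 10 s
```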
Hi all.
I would like to ask if what I am seeing is good or not. We are running Flink as
Kubernetes session cluster and have checkpoints enabled. When I inspect a
checkpoint, I can see only one file: “_metadata”. As I understand it, that is
OK, if the state in question is sufficiently small to f
I think your idea is OK. You have 2 options, generally. You can launch
application cluster(s), which do linger around when the application is
finished. And you get the idea: to read the status, logs, etc. That is what you
are doing right now.
The other option is to have a session cluster which
Hi Bianca.
I think you are missing the point here. HPA can scale any resource based on the
given metric. And, in the example you point to, they are scaling
FlinkDeployments. Basically, they would be launching multiple instances of the
same pipeline, working on the same target(s). If such parall
Hi Sachin.
When you define your pipeline, there is an option to assign a UID to each task
in the job graph. If you do not, Flink will auto-generate UIDs for you – each
time you run the pipeline.
It is highly recommended to define your own UIDs, since any re-run would assign
new UIDs and effecti
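A toy illustration of the principle (this is NOT Flink's actual UID algorithm, just the idea): auto-generated UIDs are derived from the job graph structure, so inserting or reordering an operator shifts the generated UIDs of everything after it, while explicitly assigned UIDs stay stable across re-runs.

```python
import hashlib

def auto_uid(position: int, op_name: str) -> str:
    """Toy stand-in for structure-derived UID generation (not Flink's algorithm)."""
    return hashlib.sha256(f"{position}:{op_name}".encode()).hexdigest()[:8]

def assign_uids(ops, explicit=None):
    """Return {op: uid}, preferring an explicit UID when one was set."""
    explicit = explicit or {}
    return {op: explicit.get(op, auto_uid(i, op)) for i, op in enumerate(ops)}

# Original pipeline vs. the same pipeline with a filter inserted.
v1 = assign_uids(["source", "map", "sink"])
v2 = assign_uids(["source", "filter", "map", "sink"])
# The auto UIDs of "map" and "sink" changed, so their state could not
# be matched up on restore.
print(v1["map"] != v2["map"])

# With an explicit UID, the state mapping survives graph changes.
e1 = assign_uids(["source", "map", "sink"], explicit={"map": "my-map"})
e2 = assign_uids(["source", "filter", "map", "sink"], explicit={"map": "my-map"})
print(e1["map"] == e2["map"])
```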
helps.
Nix.
From: Siva Krishna
Date: Thursday, February 13, 2025 at 11:25 AM
To: Nikola Milutinovic
Cc: user@flink.apache.org
Subject: Re: Please help with Flink + InfluxDB
Can you please describe how you are submitting the Python job via the JM?
Siva.
On Thu, Feb 13, 2025 at 3:44 PM Nikola
Hi Siva.
I believe, when we built our Flink image (Flink 1.20.0 + several connectors +
our Python library code), we had to make a symbolic link:
ln -s /usr/bin/python3 /usr/bin/python
Try that. Or put a full path in -pyexec argument.
Nix.
From: Siva Krishna
Date: Monday, February 17, 2025 at
successfully launching PyFlink jobs
from JM, in Docker Compose.
Nix.
From: Siva Krishna
Date: Thursday, February 13, 2025 at 5:20 AM
To: Nikola Milutinovic
Cc: user@flink.apache.org
Subject: Re: Please help with Flink + InfluxDB
Hi Nix,
I tried the python approach and successfully ran it with command
So, Flink is going to act as an OAuth2 client. You would need to fetch the
access token and cache it, either in Flink or externally. If you wish to cache
it in Flink, look into the Broadcast State pattern. Here are the URLs:
https://flink.apache.org/2019/06/26/a-practical-guide-to-broadcast-state-in-apach
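On the caching side, the core logic is small wherever it lives. A minimal sketch (all names here are made up, and the fetch function stands in for whatever call your OAuth2 provider requires):

```python
import time

class TokenCache:
    """Caches an access token until shortly before it expires (hypothetical helper)."""

    def __init__(self, fetch_token, safety_margin_s: float = 30.0):
        # fetch_token() must return (token, expires_in_seconds).
        self._fetch = fetch_token
        self._margin = safety_margin_s
        self._token = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = time.monotonic()
        if self._token is None or now >= self._expires_at:
            token, expires_in = self._fetch()
            self._token = token
            # Refresh a bit early so we never hand out a nearly-expired token.
            self._expires_at = now + expires_in - self._margin
        return self._token

# Usage with a stand-in fetcher:
calls = []
def fake_fetch():
    calls.append(1)
    return ("token-%d" % len(calls), 3600)

cache = TokenCache(fake_fetch)
print(cache.get())  # first call fetches
print(cache.get())  # second call is served from the cache
```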
resulting
set of configs.
BTW, how does it work when using Kubernetes operator?
Nix
From: Nikola Milutinovic
Date: Tuesday, January 28, 2025 at 6:10 PM
To: user@flink.apache.org
Subject: Re: table.exec.source.idle-timeout support
Hi Zhao.
Yes, I am deploying our solution as a Session Cluster on
we set on its
constructor.
So, log, log and log.
Nix.
From: Siva Krishna
Date: Friday, January 31, 2025 at 4:41 AM
To: Nikola Milutinovic
Cc: user@flink.apache.org
Subject: Re: Please help with Flink + InfluxDB
Yes Nix.
I can confirm that I am able to connect to InfluxDB from Flink (taskmanager
mistake to
target “localhost”, which stays inside the container.
Nix.
From: Siva Krishna
Date: Thursday, January 30, 2025 at 11:07 AM
To: Nikola Milutinovic
Cc: user@flink.apache.org
Subject: Re: Please help with Flink + InfluxDB
Hi Nix,
Thank you for your reply.
I have a small update. For
Hi Siva.
What is the InfluxDB URL you are using? I assume it is something like
“influxdb://influxdb:8086/…”. If it is “localhost:8086…” that would be a
problem.
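For reference, in Docker Compose the service name doubles as the hostname on the shared network, so the job should target the service name rather than localhost. A sketch with made-up service names and versions:

```yaml
services:
  jobmanager:
    image: flink:1.20.1
  influxdb:
    image: influxdb:2.7
    ports:
      - "8086:8086"
# From inside the jobmanager/taskmanager containers, InfluxDB is reachable
# at http://influxdb:8086 -- "localhost:8086" stays inside the container.
```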
Do you see a connection on InfluxDB? Anything in the logs? Anything in the logs
of Job Manager?
Nix,
From: Siva Krishna
Date: Thurs
same version of Flink 1.20. And then it
stopped, after a redeploy. I do not recall changing anything in the config of
the cluster, so it felt like a “gremlin”.
Nikola.
From: Zhanghao Chen
Date: Friday, January 24, 2025 at 2:50 AM
To: Nikola Milutinovic , user@flink.apache.org
Subject: Re
I’d go for a Helm chart, where I would define FlinkDeployment, with desired pod
templates and all other stuff. That other stuff can be your headless service,
PVCs, ConfigMaps… In the pod template you can also define sidecars and init
containers, should you need them.
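For illustration, the FlinkDeployment part of such a chart might look like this (a sketch; every name, size, and image tag is a placeholder):

```yaml
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: my-session-cluster        # placeholder
spec:
  image: flink:1.20.1
  flinkVersion: v1_20
  jobManager:
    resource: { memory: "2048m", cpu: 1 }
  taskManager:
    resource: { memory: "2048m", cpu: 1 }
  podTemplate:
    spec:
      initContainers:
        - name: fetch-artifacts   # hypothetical init container
          image: busybox
          command: ["sh", "-c", "echo fetch jars here"]
      containers:
        - name: flink-main-container
          volumeMounts:
            - name: data
              mountPath: /data
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: my-pvc     # placeholder
```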
Nix.
From: Gyula Fóra
Dat
Hi Oleksii.
The core error is (as you have seen): Caused by: java.lang.ClassCastException:
pemja.core.object.PyIterator cannot be cast to pemja.core.object.PyIterator
Now, since the name of the class is the same, the only way such a cast can fail
is if the two sides loaded the class through different classloaders (for
example, different versions of it on the classpath). How i
Hi Nic.
I do not have a solution (sorry), but I have seen something similar, and I have
already complained about it. Look up my mail in the archives.
I am also deploying our session cluster using Flink Kubernetes operator. I was
setting “execution.checkpointing.interval: 30”. This should set
Looks like your Oracle DB is not configured for CDC. You should follow
instructions on Debezium site:
https://debezium.io/blog/2022/09/30/debezium-oracle-series-part-1/
Nix.
From: 秋天的叹息 <1341317...@qq.com>
Date: Friday, January 10, 2025 at 9:06 AM
To: user@flink.apache.org
Subject: Flink CDC r
On Java 17 support, what does that actually mean?
We are using the official Flink 1.20 image, which is based on Java JRE 17 and
in the config file we must specify a ton of those module exports,
like “--add-exports=java.base/sun.net.util=ALL-UNNAMED”.
That tells me that Flink codebase does not re
Hello.
I am trying to make a deployment of Flink 1.20 session cluster using Kubernetes
Operator v1.10.
It looks like the operator is setting up incorrect configuration for 1.20.
Namely, it is providing custom Flink configuration file as
${FLINK_HOME}/conf/flink-conf.yaml. That file path has be