Hello Flink group,
I am currently evaluating the Flink K8s operator to manage our Flink
applications, but there are some concerns raised by our K8s cluster admins
on which I hope to get answers from this group:
1. What is the webhook used for? The documentation of the webhook is very
limited. Am I risking
Hi Arthur,
In your initial mail, an explicit job id was set: $internal.pipeline.job-id:
044d28b712536c1d1feed3475f2b8111. This might be the reason for the
DuplicateJobSubmission exception. In the job config in your last reply, I could
not see such a setting. You could verify from the JM logs that when
Hi Naci,
Thanks for your answer.
We do not explicitly define the job-id. As we are using
the flink-kubernetes-operator, I suppose it's the operator handling this ID.
The job is defined in the FlinkDeployment charts, where we have specs for
jobmanager, taskmanager and the job :
job:
jarURI: local:/
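(For context, a job section in a FlinkDeployment typically looks roughly like the sketch below; the jar path and values here are illustrative placeholders, not the actual values from this thread:)

```yaml
# Sketch of a FlinkDeployment job spec; path and values are placeholders.
job:
  jarURI: local:///opt/flink/usrlib/my-job.jar   # assumed path
  parallelism: 2
  upgradeMode: savepoint      # or last-state / stateless
  state: running
```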
Hi Arthur,
How do you submit your job? Are you explicitly setting a job id when submitting
it? Have you also tried without HA to see the behavior? It looks like the job is
submitted with the same ID as the previous job, and the job result stored in HA
does not let you submit it again with the same job_id.
Hello,
We are running into issues when deploying flink on kubernetes using the
flink-kubernetes-operator with a FlinkDeployment
Occasionally, when a *JobManager* gets rotated out (by Karpenter in our
case), the next JobManager is incapable of reaching a stable state and
is stuck in a crash loo
Hi all,
we're running a session cluster and I submit around 20 jobs to it at the
same time by creating FlinkSessionJob Kubernetes resources. After
sufficient time there are 20 jobs created and running healthy. However, it
appears that some jobs are started with the wrong config. As a result some
j
Hi Gyula,
Thanks for getting back and explaining the difference in the
responsibilities of the autoscaler and the operator.
I figured out what the issue was.
Here is what I was trying to do: the autoscaler had initially down-scaled
(2->1) the flinkDeployment so there was
pipeline.jobvertex-parall
Hi Chetas,
The operator logic itself would normally call the rescale api during the
upgrade process, not the autoscaler module. The autoscaler module sets the
correct config with the parallelism overrides, and then the operator
performs the regular upgrade cycle (as when you yourself change someth
Hello,
We recently upgraded the operator to 1.8.0 to leverage the new autoscaling
features (
https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.8/docs/custom-resource/autoscaler/).
The FlinkDeployment (application cluster) is set to flink v1_18 as well. I
am able to observ
Hi! I'm deploying a job via the Flink K8s Operator with these settings in
the FlinkDeployment resource:
```
spec:
flinkConfiguration:
taskmanager.host: 0.0.0.0 <-- ignored / not applied
```
When I look into the flink-conf.yaml in the TM the setting is not there. Is
there an
/finalizers
>> verbs:
>> - '*'
>> ---
>> apiVersion: rbac.authorization.k8s.io/v1
>> kind: RoleBinding
>> metadata:
>> labels:
>> app.kubernetes.io/name: flink-kubernetes-operator
>> app.kubernetes.io/version: 1.5.0
>> name
> name: flink-role-binding
> roleRef:
> apiGroup: rbac.authorization.k8s.io
> kind: Role
> name: flink
> subjects:
> - kind: ServiceAccount
> name: flink
> EOF
>
> Hopefully that helps.
>
>
> On Tue, Sep 19, 2023 at 5:40 PM Krzysztof Chmielewski <
>
want to have Flink
deployments in.
kubectl apply -f - <<EOF
Hi community,
I was wondering if anyone tried to deploy Flink using Flink k8s operator on
machine where OKD [1] is installed?
We have tried to install Flink k8s operator version 1.6 which seems to
succeed, however when we try to deploy simple Flink deployment we are
getting an error.
2023-09-19
FYI, adding the environment variable
`KUBERNETES_DISABLE_HOSTNAME_VERIFICATION=true` works for me.
This env variable needs to be added to both the Flink operator and the
Flink job definition.
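For anyone hitting the same SSL issue, a minimal sketch of where the variable can go on the job side, via the FlinkDeployment pod template (the deployment name is an assumption):

```yaml
# Sketch: set the env var in the FlinkDeployment pod template (names illustrative).
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: my-job                   # assumed name
spec:
  podTemplate:
    spec:
      containers:
        - name: flink-main-container
          env:
            - name: KUBERNETES_DISABLE_HOSTNAME_VERIFICATION
              value: "true"
```

On the operator side, the Helm chart exposes pod-level env settings; check your chart version for the exact values key.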
On Tue, Aug 8, 2023 at 12:03 PM Xiaolong Wang
wrote:
How to add TM to Flink Session cluster via Java K8s client if Session
>>> Cluster has running jobs?
>>>
>>> Thanks,
>>> Krzysztof
>>>
>>> On Fri, Aug 25, 2023 at 23:48 Krzysztof Chmielewski <
>>> krzysiek.chmielew...@gmail.com> wrote:
Hi community,
I have a use case where I would like to add an extra TM to a running Flink
session cluster that has Flink jobs deployed. Session cluster creation, job
submission and cluster patching is done using the flink k8s operator Java API.
The details of this are presented here [1]
I would like
Hi,
I have a use case where I would like to run Flink jobs using the Apache Flink
k8s operator, where actions like job submission (new and from savepoint), job
cancellation with savepoint, and cluster creation will be triggered from a
Java-based microservice.
Is there any recommended/dedicated Java API for
Ok, thank you.
On Tue, Aug 8, 2023 at 11:22 AM Peter Huang
wrote:
We will handle it asap. Please check the status of this jira
https://issues.apache.org/jira/browse/FLINK-32777
On Mon, Aug 7, 2023 at 8:08 PM Xiaolong Wang
wrote:
Hi,
I was testing flink-kubernetes-operator in an IPv6 cluster and found out
the below issues:
*Caused by: javax.net.ssl.SSLPeerUnverifiedException: Hostname
> fd70:e66a:970d::1 not verified:*
>
> *certificate: sha256/EmX0EhNn75iJO353Pi+1rClwZyVLe55HN3l5goaneKQ=*
>
> *DN: CN=kube-apiserve
I have fixed the issue by increasing the CPU and memory for my JM and TM
pods.
Make sure your instance type can accommodate the required resources.
On Wed, 19 Jul 2023 at 13:35, Orkhan Dadashov
wrote:
Hi Flink users,
I'm following up on this guide to try the Flink K8S operator (1.5.0 version
):
https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.5/docs/try-flink-kubernetes-operator/quick-start/
When I try to deploy a basic example, JM and TM start, but TM fai
Maybe you have inconsistent operator / CRD versions? In any case I highly
recommend upgrading to the latest operator version to get all the bug /
security fixes and improvements.
Gyula
On Wed, 12 Jul 2023 at 10:58, Paul Lam wrote:
Hi,
I’m using K8s operator 1.3.1 with Flink 1.15.2 on 2 K8s clusters. Weirdly enough,
on one K8s cluster the flink deployments would show a savepoint trigger nonce,
while the flink deployments on the other cluster wouldn’t.
The normal output is as follows:
```
Last Savepoint:
Format
Hi Gyula,
Thank you and sorry for the late response.
My use case is that users may run finite jobs (either batch jobs or finite
stream jobs), leaving a lot of deprecated flink deployments around. I’ve filed
a ticket[1].
[1] https://issues.apache.org/jira/browse/FLINK-32143
Best,
Paul Lam
>
There is no such feature currently, Kubernetes resources usually do not
delete themselves :)
The problem I see here is by deleting the resource you lose all information
about what happened, you won't know if it failed or completed etc.
What is the use-case you are thinking about?
If this is someth
Hi all,
Currently, if a job turns into terminated status (e.g. FINISHED or FAILED),
the flinkdeployment remains until a manual cleanup is performed. I went
through the docs but did not find any way to clean them up automatically.
Am I missing something? Thanks!
Best,
Paul Lam
Yep!
Simple oversight, it was :/
Cheers,
Matyas
On Thu, Feb 23, 2023 at 10:54 PM Gyula Fóra wrote:
Hey!
You are right, these fields could have been of the PodTemplate /
PodTemplateSpec type (probably PodTemplateSpec is actually better).
I think the reason why we used it is two fold:
- Simple oversight :)
- Flink itself "expects" the podtemplate in this form for the native
integration as you c
Hi all,
Why does the FlinkDeployment CRD refer to the Pod class instead of the
PodTemplate class from the fabric8 library? As far as I can tell, the only
difference is that the Pod class exposes the PodStatus, which doesn't seem
mutable. Thanks in advance!
Best,
Mason
lifecycleState enum if you want a single condensed status view.
>
> Cheers
> Gyula
>
> On Tue, 6 Dec 2022 at 04:12, Paul Lam <paullin3...@gmail.com> wrote:
> Hi all,
>
> I’m trying out Flink K8s operator 1.2 with K8s 1.25 and Flink 1.15.
>
> I found kub
Hi all,
I’m trying out Flink K8s operator 1.2 with K8s 1.25 and Flink 1.15.
I found kubectl shows that flinkdeployments stay in DEPLOYED seemingly forever
(the Flink job status is RUNNING), but the operator logs show that the
flinkdeployments already turned into STABLE.
Is that a known bug or
should eliminate most of these cases.
Cheers,
Gyula
On Tue, Nov 22, 2022 at 9:43 AM Dongwon Kim wrote:
> Hi,
>
> While using a last-state upgrade mode on flink-k8s-operator-1.2.0 and
> flink-1.14.3, we're occasionally facing the following error:
>
> Status:
>> Clu
Hi,
While using a last-state upgrade mode on flink-k8s-operator-1.2.0 and
flink-1.14.3, we're occasionally facing the following error:
Status:
> Cluster Info:
> Flink - Revision: 98997ea @ 2022-01-08T23:23:54+01:00
> Flink - Version: 1.
Hi, I'm deploying a flink batch job with the flink-kubernetes-operator. My
flink-k8s-operator's version is 1.2.0 and flink's version is 1.14.6. I found
that after the batch job finishes executing, the jobManagerDeploymentStatus
field becomes "MISSING" in the FlinkDeployment CRD. And the err
Hadoop conf directory exists in the image.
For the flink-k8s-operator, another feasible solution is to create a
hadoop-config configmap manually and then use
"kubernetes.hadoop.conf.config-map.name" to mount it to the JobManager
and TaskManager
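A minimal sketch of that configmap approach (the deployment and configmap names are assumptions; the option name is the Flink setting mentioned above):

```yaml
# Sketch: point Flink at a manually created Hadoop configmap (names illustrative).
# The configmap would be created beforehand, e.g.:
#   kubectl create configmap my-hadoop-config \
#     --from-file=core-site.xml --from-file=hdfs-site.xml
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: my-job                     # assumed name
spec:
  flinkConfiguration:
    kubernetes.hadoop.conf.config-map.name: my-hadoop-config
```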
Hi, community:
I'm using flink-k8s-operator v1.2.0 to deploy a flink job. The
"HADOOP_CONF_DIR" environment variable was set in the image that I built
from flink:1.15. I found the taskmanager pod was trying to mount a volume
named "hadoop-config-volume"
. I
chalk all that up to just me lacking a bit of experience with k8s.
That being said... It's all working now and I documented the deployment
over here:
https://hop.apache.org/manual/next/pipeline/beam/flink-k8s-operator-running-hop-pipeline.html
A big thank you to everyone that helped m
Could you please share the JobManager logs of failed deployment? It will
also help a lot if you could show the pending pod status via "kubectl
describe ".
Given that the current Flink Kubernetes Operator is built on top of native
K8s integration[1], the Flink ResourceManager should allocate enough
Yes, of course. I already feel a bit less intelligent for having asked the
question ;-)
The status now is that I managed to have it all puzzled together. Copying
the files from s3 to an ephemeral volume takes all of 2 seconds so it's
really not an issue. The cluster starts and our fat jar and Ap
Hi Matt,
Yes. There are several official Flink images with various JVMs including
Java 11.
https://hub.docker.com/_/flink
Cheers,
Matyas
On Fri, Jun 24, 2022 at 2:06 PM Matt Casters
wrote:
> Hi Mátyás & all,
>
> Thanks again for the advice so far. On a related note I noticed Java 8
> being us
Hi Mátyás & all,
Thanks again for the advice so far. On a related note I noticed Java 8
being used, indicated in the log.
org.apache.flink.runtime.entrypoint.ClusterEntrypoint[] -
JAVA_HOME: /usr/local/openjdk-8
Is there a way to start Flink with Java 11?
Kind regards,
Matt
On
Thanks for your valuable inputs.
Making deploying Flink on K8s as easy as a normal Java application
is certainly the mission of the Flink Kubernetes Operator. Obviously, we are
still some way from this mission.
Back to the user jars download, I think it makes sense to introduce the
artifact fetcher
Hi Yang,
Thanks for the suggestion! I looked into this volume sharing on EKS
yesterday but I couldn't figure it out right away.
The way that people come into the Apache Hop project is often with very
little technical knowledge since that's sort of the goal of the project:
make things easy. Follo
Hi Matyas,
Again, thank you very much for the information. I'm a beginner and all
the help is really appreciated. After some diving into the script
behind s3-artifact-fetcher I kind of figured it out: have a folder
synced into the pod container of the task manager. Then I guess we should
be
Matyas and Gyula have shared a lot of great information about how to make the
Flink Kubernetes Operator work on EKS.
One more input about how to prepare the user jars: if you are more familiar
with K8s, you could use a persistent volume to provide the user jars and then
mount the volume to the JobManag
Hi Matt,
I believe an artifact fetcher (e.g
https://hub.docker.com/r/agiledigital/s3-artifact-fetcher ) + the pod
template (
https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/custom-resource/pod-template/#pod-template)
is an elegant way to solve your problem.
The operato
Thank you very much for the help Matyas and Gyula!
I just saw a video today where you were presenting the FKO. Really nice
stuff!
So I'm guessing we're executing "flink run" at some point on the master and
that this is when we need the jar file to be local?
Am I right in assuming that this happe
A small addition to what Matyas has said:
The limitation of only supporting local scheme is coming from the Flink
Kubernetes Application mode directly and is not related to the operator
itself.
Once this feature is added to Flink itself the operator can also support it
for newer Flink versions.
G
Hi Matt,
- In FlinkDeployments you can utilize an init container to download your
artifact onto a shared volume, then you can refer to it as local:/.. from
the main container. FlinkDeployments comes with pod template support
https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/do
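A rough sketch of the init-container approach described above (the fetcher image, bucket path, and volume name are illustrative assumptions, not values from this thread):

```yaml
# Sketch: an init container downloads the job jar into a shared emptyDir,
# then the main container reads it via local:// (all names illustrative).
spec:
  podTemplate:
    spec:
      initContainers:
        - name: fetch-jar
          image: amazon/aws-cli              # assumed fetcher image
          command: ["aws", "s3", "cp",
                    "s3://my-bucket/my-job.jar",
                    "/opt/flink/usrlib/my-job.jar"]
          volumeMounts:
            - name: job-artifacts
              mountPath: /opt/flink/usrlib
      containers:
        - name: flink-main-container
          volumeMounts:
            - name: job-artifacts
              mountPath: /opt/flink/usrlib
      volumes:
        - name: job-artifacts
          emptyDir: {}
  job:
    jarURI: local:///opt/flink/usrlib/my-job.jar
```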
Hi Flink team!
I'm interested in getting the new Flink Kubernetes Operator to work on AWS
EKS. Following the documentation I got pretty far. However, when trying
to run a job I got the following error:
Only "local" is supported as schema for application mode. This assumes that
the jar is loc
Best,
Yang
On Wed, Jun 1, 2022 at 14:54, Sigalit Eliazov wrote:
Hi all,
we just started using the flink k8s operator to deploy our flink cluster.
From what we understand we are only able to start a flink cluster per job,
so in our case when we have 2 jobs we have to create 2 different clusters.
Obviously we would prefer to deploy these 2 jobs, which relate