Query: Geo-Redundancy with Apache Flink on Kubernetes & Replicated Checkpoints !

2025-07-03 Thread Sachin
Dear Apache Flink Community, I hope you're doing well. We are currently operating a Flink deployment on *Kubernetes*, with *high availability (HA) configured using Kubernetes-based HA services*. We're exploring approaches for *geo-redundancy (GR)* to ensure disaster recovery and fault tolerance a

RE: Trying to read a file from S3 with flink on kubernetes

2024-07-17 Thread gwenael . lebarzic
"s3a://mybucket") val fs = FileSystem.get(s3uri, hadoopConfig) ### Best regards. [Logo Orange]<http://www.orange.com/> Gwenael Le Barzic From: LE BARZIC Gwenael DTSI/SI Sent: Thursday, 11 July 2024 16:24 To: user@flink.apache.org Subject: Trying to read a file from S3 with flink on

Trying to read a file from S3 with flink on kubernetes

2024-07-11 Thread gwenael . lebarzic
Hey guys. I'm trying to read a file from an internal S3 with flink on Kubernetes, but get a strange blocking error. Here is the code : MyFlinkJob.scala : ### package com.example.flink import org.apache.flink.api.common.serialization.SimpleStringSchema i

Aw: Re: Advice Needed: Setting Up Prometheus and Grafana Monitoring for Apache Flink on Kubernetes

2024-05-19 Thread Oliver Schmied
"Biao Geng" To: "Oliver Schmied" Cc: user@flink.apache.org Subject: Re: Advice Needed: Setting Up Prometheus and Grafana Monitoring for Apache Flink on Kubernetes Hi Oliver, I believe you are almost there. One thing I found could improve is that in your job ya

Re: Advice Needed: Setting Up Prometheus and Grafana Monitoring for Apache Flink on Kubernetes

2024-05-19 Thread Biao Geng
Hi Oliver, I believe you are almost there. One thing I found could improve is that in your job yaml, instead of using: kubernetes.operator.metrics.reporter.prommetrics.reporters: prom kubernetes.operator.metrics.reporter.prommetrics.reporter.prom.factory.class: org.apache.flink.metrics.promet

Advice Needed: Setting Up Prometheus and Grafana Monitoring for Apache Flink on Kubernetes

2024-05-18 Thread Oliver Schmied
Dear Apache Flink Community, I am currently trying to monitor an Apache Flink cluster deployed on Kubernetes using Prometheus and Grafana. Despite following the official guide (https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.8/docs/operations/metrics-logging/)  on how t

Re: Apache Beam MinimalWordCount on Flink on Kubernetes using Flink Kubernetes Operator on GCP

2023-01-17 Thread Yang Wang
The "JAR file does not exist" exception comes from the JobManager side, not on the client. Please be aware that the local:// scheme in the jarURI means the path in the JobManager pod. You could use an init-container to download your user jar and mount it to the JobManager main-container. Refer to

Apache Beam MinimalWordCount on Flink on Kubernetes using Flink Kubernetes Operator on GCP

2023-01-17 Thread Lee Parayno
I have a Kubernetes cluster in GCP running the Flink Kubernetes Operator. I'm trying to package a project with the Apache Beam MinimalWordCount using the Flink Runner as a FlinkDeployment to the Kubernetes Cluster Job Docker image created with this Dockerfile: FROM flink ENV FLINK_CLASSPATH /op

Re: Flink on kubernetes HA ::Renew deadline reached

2021-10-25 Thread marco
Hello thanks for your response, I just want to know why you suggest that I should use the native Kubernetes HA. In my case I need to use the standalone application mode. On 2021/10/25 09:25:21, Xintong Song wrote: > You should be using the native Kubernetes HA. This error message suggests > Fl

Re: Flink on kubernetes HA ::Renew deadline reached

2021-10-25 Thread marco
Any suggestions would be appreciated. On 2021/10/20 16:18:39, marco wrote: > > > Hello flink community:: > > I am deploying flink application cluster standalone mode on kubernetes, but i > am facing some problems > > the job starts normally and it continues to run but at some point in time

Flink on kubernetes HA ::Renew deadline reached

2021-10-20 Thread marco
Hello flink community:: I am deploying a Flink application cluster in standalone mode on Kubernetes, but I am facing some problems: the job starts normally and continues to run, but at some point it crashes and gets restarted. Is anyone facing the same problem, or does anyone know how to resolve

RE: [External] Re: Flink on Kubernetes

2021-09-03 Thread Julian Cardarelli
From: Guowei Ma Sent: Thursday, September 2, 2021 11:32 PM To: Julian Cardarelli Cc: user Subject: [External] Re: Flink on Kubernetes Hi, Julian I notice that your configuration includes "restart-strategy.fixed-delay.attempt

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-09-03 Thread Yang Wang
* Friday, 3 September 2021 08:09 > *To:* Alexis Sarda-Espinosa > *Cc:* spoon_lz ; Denis Cosmin NUTIU < > dnu...@bitdefender.com>; matth...@ververica.com; user@flink.apache.org > *Subject:* Re: Deploying Flink on Kubernetes with fractional CPU and > different limits and request

RE: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-09-03 Thread Alexis Sarda-Espinosa
. September 2021 08:09 To: Alexis Sarda-Espinosa Cc: spoon_lz ; Denis Cosmin NUTIU ; matth...@ververica.com; user@flink.apache.org Subject: Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests Hi Alexis Thanks for your valuable inputs. First, I want to share

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-09-02 Thread Yang Wang
5GiB. > > > > Regards, > > Alexis. > > > > *From:* spoon_lz > *Sent:* Thursday, 2 September 2021 14:12 > *To:* Yang Wang > *Cc:* Denis Cosmin NUTIU ; Alexis Sarda-Espinosa < > alexis.sarda-espin...@microfocus.com>; matth...@ververica.com; > user@fli

Re: Flink on Kubernetes

2021-09-02 Thread Guowei Ma
So if this does not work I think you could share the log at that time and the flink version you use. Best, Guowei On Fri, Sep 3, 2021 at 2:00 AM Julian Cardarelli wrote: > Hello – > > > > We have implemented Flink on Kubernetes with Google Cloud Storage in high > availability co

Flink on Kubernetes

2021-09-02 Thread Julian Cardarelli
Hello - We have implemented Flink on Kubernetes with Google Cloud Storage in high availability configuration as per the below configmap. Everything appears to be working normally, state is being saved to GCS. However, every now and then - perhaps weekly or every other week, all of the

RE: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-09-02 Thread Alexis Sarda-Espinosa
; Alexis Sarda-Espinosa ; matth...@ververica.com; user@flink.apache.org Subject: Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests Hi Yang, I agree with you, but I think the limit-factor should be greater than or equal to 1, and default to 1 is a better

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-09-02 Thread spoon_lz
Kubernetes | Apache Flink [2] [FLINK-15648] Support to configure limit for CPU and memory requirement - ASF JIRA (apache.org) From: Yang Wang Sent: Tuesday, August 31, 2021 6:04 AM To: Alexis Sarda-Espinosa Cc: Denis Cosmin NUTIU ; matth...@ververica.com ; user@flink.apache.org Subject: Re: Deploying

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-09-01 Thread Yang Wang
u! I have limited knowledge of Flink internals but the >> kubernetes.jobmanager.limit-factor and kubernetes.taskmanager.limit-factor >> seems to be the right way to do it. >> >> [1] Native Kubernetes | Apache Flink >> <https://ci.apache.org/projects/flink/flink-docs-master/doc

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-09-01 Thread spoon_lz
gust 31, 2021 6:04 AM To: Alexis Sarda-Espinosa Cc: Denis Cosmin NUTIU ; matth...@ververica.com ; user@flink.apache.org Subject: Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests Hi all, I think it is a good improvement to support different resource requ

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-09-01 Thread Yang Wang
ink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#pod-template> > [2] [FLINK-15648] Support to configure limit for CPU and memory > requirement - ASF JIRA (apache.org) > <https://issues.apache.org/jira/browse/FLINK-15648> > > --

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-09-01 Thread spoon_lz
t: Tuesday, August 31, 2021 6:04 AM To: Alexis Sarda-Espinosa Cc: Denis Cosmin NUTIU ; matth...@ververica.com ; user@flink.apache.org Subject: Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests Hi all, I think it is a good improvement to support different

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-09-01 Thread 971066723
com>; matth...@ververica.com <matth...@ververica.com>; user@flink.apache.org <user@flink.apache.org> Subject: Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests   Hi all, I think it is a good improvement to support different resource requests

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-09-01 Thread Denis Cosmin NUTIU
pache.org<mailto:user@flink.apache.org> mailto:user@flink.apache.org>> Subject: Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests Hi all, I think it is a good improvement to support different resource requests and limits. And it is very useful

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-08-31 Thread Yang Wang
-- > *From:* Yang Wang > *Sent:* Tuesday, August 31, 2021 6:04 AM > *To:* Alexis Sarda-Espinosa > *Cc:* Denis Cosmin NUTIU ; matth...@ververica.com > ; user@flink.apache.org > *Subject:* Re: Deploying Flink on Kubernetes with fractional CPU and > d

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-08-31 Thread Denis Cosmin NUTIU
___ From: Yang Wang Sent: Tuesday, August 31, 2021 6:04 AM To: Alexis Sarda-Espinosa Cc: Denis Cosmin NUTIU ; matth...@ververica.com ; user@flink.apache.org Subject: Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests Hi all,

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-08-30 Thread Yang Wang
etting the configuration options get bloated. > > > > Regards, > > Alexis. > > > > *From:* Denis Cosmin NUTIU > *Sent:* Thursday, 26 August 2021 15:55 > *To:* matth...@ververica.com > *Cc:* user@flink.apache.org; danrtsey...@gmail.com > *Subject:* Re: De

RE: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-08-26 Thread Alexis Sarda-Espinosa
. Regards, Alexis. From: Denis Cosmin NUTIU Sent: Thursday, 26 August 2021 15:55 To: matth...@ververica.com Cc: user@flink.apache.org; danrtsey...@gmail.com Subject: Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests Hi Matthias, Thanks for getting back to me

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-08-26 Thread Denis Cosmin NUTIU
Hi Matthias, Thanks for getting back to me and for your time! We have some Flink jobs deployed on Kubernetes and running kubectl top pod gives the following result: NAME CPU(cores) MEMORY(bytes) aa-78c8cb77d4-zlmpg 8

Re: Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-08-26 Thread Matthias Pohl
Hi Denis, I did a bit of digging: It looks like there is no way to specify them independently. You can find documentation about pod templates for TaskManager and JobManager [1]. But even there it states that for cpu and memory, the resource specs are overwritten by the Flink configuration. The code

Deploying Flink on Kubernetes with fractional CPU and different limits and requests

2021-08-26 Thread Denis Cosmin NUTIU
Hello, I've developed a Flink job and I'm trying to deploy it on a Kubernetes cluster using Flink Native. Setting kubernetes.taskmanager.cpu=0.5 and kubernetes.jobmanager.cpu=0.5 sets the requests and limits to 500m, which is correct, but I'd like to set the requests and limits to different value
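
To illustrate the behaviour described in this thread: a single CPU value becomes both the request and the limit, and the limit factor proposed in the replies would scale only the limit. The sketch below is a hypothetical Python helper, not Flink code; the function name `cpu_resources` and its parameters are assumptions made for illustration, and the rule that the factor must be at least 1 follows the suggestion made in the replies.

```python
def cpu_resources(cpu: float, limit_factor: float = 1.0) -> dict:
    """Return Kubernetes-style CPU request and limit in millicores.

    cpu          -- the configured CPU value, e.g. 0.5
    limit_factor -- hypothetical multiplier applied to the limit only
    """
    if limit_factor < 1.0:
        # A limit below the request would be rejected by Kubernetes.
        raise ValueError("limit_factor must be >= 1")
    request_m = round(cpu * 1000)
    limit_m = round(cpu * limit_factor * 1000)
    return {"requests": f"{request_m}m", "limits": f"{limit_m}m"}


# With the default factor, request and limit are identical.
print(cpu_resources(0.5))
# With a factor of 2, only the limit grows.
print(cpu_resources(0.5, 2.0))
```

With a factor of 1 the pod gets the same request and limit, which matches the 500m/500m result reported above; a larger factor leaves the request untouched and raises only the limit.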

Re: Flink on Kubernetes, Task/Job Manager Recycles

2021-01-29 Thread Yang Wang
> I am running some testing with flink on Kubernetes. Every let’s say five > to ten days, all the jobs disappear from running jobs. There’s nothing > under completed jobs, and there’s no record of the submitted jar files in > the cluster. > > > > In some manner or another, it is

Flink on Kubernetes, Task/Job Manager Recycles

2021-01-28 Thread Julian Cardarelli (CA)
Hello - I am running some testing with flink on Kubernetes. Every let's say five to ten days, all the jobs disappear from running jobs. There's nothing under completed jobs, and there's no record of the submitted jar files in the cluster. In some manner or another, it is almost

Re: Concise example of how to deploy flink on Kubernetes

2020-11-26 Thread Igal Shilman
Hi George, Specifically for StateFun, we have the following Helm charts [1] to help you deploy Stateful Functions on k8s. The greeter example's docker-compose file also includes Kafka (and hence Zookeeper). Indeed the Flink cluster is "included" in the master/worker stateful functions docker image

Re: Concise example of how to deploy flink on Kubernetes

2020-11-24 Thread George Costea
Thank you. This is very helpful. On Mon, Nov 23, 2020 at 9:46 AM Till Rohrmann wrote: > Hi George, > > Here is some documentation about how to deploy a stateful function job > [1]. In a nutshell, you need to deploy a Flink cluster on which you can run > the stateful function job. This can either

Re: Concise example of how to deploy flink on Kubernetes

2020-11-23 Thread Till Rohrmann
Hi George, Here is some documentation about how to deploy a stateful function job [1]. In a nutshell, you need to deploy a Flink cluster on which you can run the stateful function job. This can either happen before (e.g. by spawning a session cluster on K8s [2]) or you can combine your job into a

Re: Concise example of how to deploy flink on Kubernetes

2020-11-23 Thread George Costea
Sorry. Forgot to reply to all. On Sun, Nov 22, 2020 at 9:24 PM George Costea wrote: > > Hi Xingbo, > > I’m interested in using stateful functions to build an application on > Kubernetes. Don’t I need to deploy the flink cluster on Kubernetes first > before deploying my stateful functions? > >

Re: Concise example of how to deploy flink on Kubernetes

2020-11-20 Thread Xingbo Huang
Hi George, Have you referred to the official document[1]? [1] https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/kubernetes.html Best, Xingbo On Saturday, November 21, 2020, George Costea wrote: > Hi there, > > Is there an example of how to deploy a flink cluster on Kubernetes? > I'd lik

Concise example of how to deploy flink on Kubernetes

2020-11-20 Thread George Costea
Hi there, Is there an example of how to deploy a flink cluster on Kubernetes? I'd like to deploy the flink cluster, a kafka-broker, and then the greeter example to give it a try. Thanks, George

Re: Difference between flink on kubernetes operator vs native kubernetes

2020-06-20 Thread Yang Wang
Inline comments for your questions. Is it similar to the first? How the second is different? The first leverages the K8s operator to make standalone job cluster running on K8s easier. However, the second is more Flink native. We have an embedded K8s client in Flink and could use "kubernetes-sessi

Difference between flink on kubernetes operator vs native kubernetes

2020-06-19 Thread SAMPAD SAHA
I was trying to deploy Flink in Kubernetes environment and came across two things: 1. Kubernetes Flink control plane developed by google and Lyft - https://github.com/lyft/flinkk8soperator - https://github.com/GoogleCloudPlatform/flink-on-k8s-operator 2. Deploying Kubernetes natively.

Re: Flink on Kubernetes

2020-05-21 Thread Yang Wang
Hi Ivan Yang, #1. If a TaskManager crashed exceptionally and there are some running tasks on it, it could not join back gracefully. Whether the full job will be restarted depends on the failover strategies[1]. #2. Currently, when new TaskManagers join to the Flink cluster, the running Flink job co

Flink on Kubernetes

2020-05-21 Thread Ivan Yang
Hi, I have set up Flink 1.9.1 on Kubernetes on AWS EKS with one job manager pod, 10 task manager pods, one pod per EC2 instance. Job runs fine. After a while, for some reason, one pod (task manager) crashed, then the pod restarted. After that, the job got into a bad state. All the parallelisms a

Re: Flink on Kubernetes unable to Recover from failure

2020-05-08 Thread Yun Tang
user Subject: Re: Flink on Kubernetes unable to Recover from failure Hey Morgan, Is it possible for you to provide us with the full logs of the JobManager and the affected TaskManager? This might give us a hint why the number of task slots is zero. Best, Robert On Tue, May 5, 2020 at 1

Re: Flink on Kubernetes unable to Recover from failure

2020-05-08 Thread Robert Metzger
Hey Morgan, Is it possible for you to provide us with the full logs of the JobManager and the affected TaskManager? This might give us a hint why the number of task slots is zero. Best, Robert On Tue, May 5, 2020 at 11:41 AM Morgan Geldenhuys < morgan.geldenh...@tu-berlin.de> wrote: > > Commun

Flink on Kubernetes unable to Recover from failure

2020-05-05 Thread Morgan Geldenhuys
Community, I am currently doing some fault tolerance testing for Flink (1.10) running on Kubernetes (1.18) and am encountering an error where after a running job experiences a failure, the job fails completely. A Flink session cluster has been created according to the documentation containe

Re: Flink on Kubernetes Vs Flink Natively on Kubernetes

2020-03-17 Thread Pankaj Chand
Thank you, Yang and Xintong! Best, Pankaj On Mon, Mar 16, 2020, 9:27 PM Yang Wang wrote: > Hi Pankaj, > > Just like Xintong has said, the biggest difference of Flink on Kubernetes > and native > integration is dynamic resource allocation. Since the latter has an > embedde

Re: Flink on Kubernetes Vs Flink Natively on Kubernetes

2020-03-16 Thread Yang Wang
Hi Pankaj, Just like Xintong has said, the biggest difference of Flink on Kubernetes and native integration is dynamic resource allocation. Since the latter has an embedded K8s client and will communicate with K8s Api server directly to allocate/release JM/TM pods. Both for the two ways to run

Re: Flink on Kubernetes Vs Flink Natively on Kubernetes

2020-03-16 Thread Pankaj Chand
ong wrote: > Forgot to mention that "running Flink natively on Kubernetes" is newly > introduced and is only available for Flink 1.10 and above. > > > Thank you~ > > Xintong Song > > > > On Mon, Mar 16, 2020 at 5:40 PM Xintong Song > wrote: > >> Hi

Re: Flink on Kubernetes Vs Flink Natively on Kubernetes

2020-03-16 Thread Xintong Song
Hi Pankaj, "Running Flink on Kubernetes" refers to the old way that basically deploys a Flink standalone cluster on Kubernetes. We leverage scripts to run Flink Master and TaskManager processes inside Kubernetes container. In this way, Flink is not aware of whether it's running i

Re: Flink on Kubernetes Vs Flink Natively on Kubernetes

2020-03-16 Thread Xintong Song
Forgot to mention that "running Flink natively on Kubernetes" is newly introduced and is only available for Flink 1.10 and above. Thank you~ Xintong Song On Mon, Mar 16, 2020 at 5:40 PM Xintong Song wrote: > Hi Pankaj, > > "Running Flink on Kubernetes" refers

Flink on Kubernetes Vs Flink Natively on Kubernetes

2020-03-16 Thread Pankaj Chand
Hi all, I want to run Flink, Spark and other processing engines on a single Kubernetes cluster. From the Flink documentation, I did not understand the difference between: (1) Running Flink on Kubernetes, Versus (2) Running Flink natively on Kubernetes. Could someone please explain

Re: scaling issue Running Flink on Kubernetes

2020-03-11 Thread Eleanore Jin
Hi Flavio, We have implemented our own flink operator, the operator will start a flink job cluster (the application jar is already packaged together with flink in the docker image). I believe Google's flink operator will start a session cluster, and user can submit the flink job via REST. Not look

Re: scaling issue Running Flink on Kubernetes

2020-03-11 Thread Flavio Pompermaier
Sorry I wanted to mention https://github.com/lyft/flinkk8soperator (I don't know which one of the 2 is better) On Wed, Mar 11, 2020 at 10:19 AM Flavio Pompermaier wrote: > Have you tried to use existing operators such as > https://github.com/GoogleCloudPlatform/flink-on-k8s-operator or > https:/

Re: scaling issue Running Flink on Kubernetes

2020-03-11 Thread Flavio Pompermaier
Have you tried to use existing operators such as https://github.com/GoogleCloudPlatform/flink-on-k8s-operator or https://github.com/GoogleCloudPlatform/flink-on-k8s-operator? On Wed, Mar 11, 2020 at 4:46 AM Xintong Song wrote: > Hi Eleanore, > > That doesn't sound like a scaling issue. It's proba

Re: scaling issue Running Flink on Kubernetes

2020-03-10 Thread Xintong Song
Hi Eleanore, That doesn't sound like a scaling issue. It's probably data skew: the data volume on some of the keys is significantly higher than on others. I'm not familiar with this area though, and have copied Jark for you, who is one of the community experts in this area. Thank you~ Xinton
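
The data-skew explanation can be checked with a small simulation. The sketch below assumes that keys are mapped to a key group by hashing modulo the max parallelism, and that key groups are mapped to subtasks as key_group * parallelism / max_parallelism, which is, as far as I know, how Flink distributes keyed state. CRC32 is only a stand-in for Flink's own hash, and all names and the sample data are hypothetical.

```python
import zlib
from collections import Counter


def subtask_for_key(key: str, parallelism: int, max_parallelism: int) -> int:
    """Map a key to a subtask index via an intermediate key group."""
    # Stand-in hash (CRC32); Flink applies its own hash to the key.
    key_group = zlib.crc32(key.encode("utf-8")) % max_parallelism
    return key_group * parallelism // max_parallelism


def records_per_subtask(keys, parallelism: int, max_parallelism: int = 128):
    """Count how many records each subtask would receive."""
    counts = Counter(
        subtask_for_key(k, parallelism, max_parallelism) for k in keys
    )
    return [counts.get(i, 0) for i in range(parallelism)]


# Hypothetical stream in which one hot key dominates the traffic.
keys = ["hot"] * 900 + [f"user-{i}" for i in range(100)]
load = records_per_subtask(keys, parallelism=4)
print(load, "max/mean =", max(load) / (sum(load) / len(load)))
```

Because every record with the same key goes to the same subtask, raising the parallelism does not spread a hot key: the busiest subtask keeps at least all 900 "hot" records however many subtasks are added.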

Re: scaling issue Running Flink on Kubernetes

2020-03-10 Thread Eleanore Jin
Hi Xintong, Thanks for the prompt reply! To answer your question: - Which Flink version are you using? v1.8.2 - Is this skew observed only after a scaling-up? What happens if the parallelism is initially set to the scaled-up value? I also tried this, it

Re: scaling issue Running Flink on Kubernetes

2020-03-10 Thread Xintong Song
Hi Eleanore, I have a few more questions regarding your issue. - Which Flink version are you using? - Is this skew observed only after a scaling-up? What happens if the parallelism is initially set to the scaled-up value? - Keeping the job running a while after the scale-up, does the

scaling issue Running Flink on Kubernetes

2020-03-10 Thread Eleanore Jin
Hi Experts, I have my flink application running on Kubernetes, initially with 1 Job Manager, and 2 Task Managers. Then we have the custom operator that watches for the CRD, when the CRD replicas changed, it will patch the Flink Job Manager deployment parallelism and max parallelism according to th

Re: Flink on Kubernetes - Session vs Job cluster mode and storage

2020-02-28 Thread Hao Sun
e >>>>>> case of failed JM - perhaps we need to resubmit all jobs. >>>>>> Let me know if I have misunderstood anything. >>>>>> >>>>>> 3. Restarting jobs >>>>>> For the session cluster, you could directly cancel the job a

Re: Flink on Kubernetes - Session vs Job cluster mode and storage

2020-02-27 Thread Yang Wang
>>>>> per-job >>>>> cluster from the latest savepoint. >>>>> >>>>> 4. Managing the flink jobs >>>>> The rest api and flink command line could be used to managing the >>>>> jobs(e.g. >>>>>

Re: Flink on Kubernetes - Session vs Job cluster mode and storage

2020-02-27 Thread Yang Wang
on cluster, if one taskmanager crashed, then all the jobs which >>> have tasks >>> on this taskmanager will fail. >>> Both session and per-job could be configured with high availability and >>> recover >>> from the latest checkpoint. >

Re: Flink on Kubernetes - Session vs Job cluster mode and storage

2020-02-26 Thread Jin Yi
>> recover >> from the latest checkpoint. >> >> Mans - Does a task manager failure cause the job to fail ? My >> understanding is the JM failure are catastrophic while TM failures are >> recoverable. >> >> > Is there any need for specifying volume for the

Re: Flink on Kubernetes - Session vs Job cluster mode and storage

2020-02-26 Thread Yang Wang
failure are catastrophic while TM failures are > recoverable. > > > Is there any need for specifying volume for the pods? > No, you do not need to specify a volume for pod. All the data in the pod > local directory is temporary. When a pod crashed and relaunched, the > task

Re: Flink on Kubernetes - Session vs Job cluster mode and storage

2020-02-26 Thread M Singh
sume from the latest checkpoint. Mans - So if we are saving checkpoint in S3 then there is no need for disks - should we use emptyDir ? [1]. https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/native_kubernetes.html On Sunday, February 23, 2020 at 2:28 AM, M Singh wrote: Hey Folks: I am trying to figure out

Re: Flink on Kubernetes - Session vs Job cluster mode and storage

2020-02-26 Thread M Singh
org/projects/flink/flink-docs-master/ops/deployment/native_kubernetes.html On Sunday, February 23, 2020 at 2:28 AM, M Singh wrote: Hey Folks: I am trying to figure out the options for running Flink on Kubernetes and am trying to find out the pros and cons of running in Flink Session vs Flink Cluster mode (https://ci.apache.

Re: Flink on Kubernetes - Session vs Job cluster mode and storage

2020-02-26 Thread Arvid Heise
c while TM failures are >> recoverable. >> >> > Is there any need for specifying volume for the pods? >> No, you do not need to specify a volume for pod. All the data in the pod >> local directory is temporary. When a pod crashed and relaunched, the >> task

Re: Flink on Kubernetes - Session vs Job cluster mode and storage

2020-02-24 Thread Yang Wang
pecify a volume for pod. All the data in the pod > local directory is temporary. When a pod crashed and relaunched, the > taskmanager will retrieve the checkpoint from zookeeper + S3 and resume > from the latest checkpoint. > > Mans - So if we are saving checkpoint in S3 then there is no

Re: Flink on Kubernetes - Session vs Job cluster mode and storage

2020-02-24 Thread M Singh
should we use emptyDir ? [1]. https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/native_kubernetes.html On Sunday, February 23, 2020 at 2:28 AM, M Singh wrote: Hey Folks: I am trying to figure out the options for running Flink on Kubernetes and am trying to find out the pros and cons of running in F

Re: Flink on Kubernetes - Session vs Job cluster mode and storage

2020-02-23 Thread Yang Wang
. [1]. https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/native_kubernetes.html On Sunday, February 23, 2020 at 2:28 AM, M Singh wrote: > Hey Folks: > > I am trying to figure out the options for running Flink on Kubernetes and > am trying to find out the pros and cons of running in Flink

Flink on Kubernetes - Session vs Job cluster mode and storage

2020-02-22 Thread M Singh
Hey Folks: I am trying to figure out the options for running Flink on Kubernetes and am trying to find out the pros and cons of running in Flink Session vs Flink Cluster mode (https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/kubernetes.html#flink-session-cluster-on

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-12 Thread Li Peng
>>>> >>>> On Wednesday, December 11, 2019 at 1:24 PM, Li Peng wrote: >>>> >>>>> Ah I see. I think the Flink app is reading files from >>>>> /opt/flink/conf correctly as it is, since changes I make to flink-conf are >>>>> picked up as expected, it

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-12 Thread Li Peng
used, or don't apply to stdout or whatever source k8 uses for its >>>> logs? Given that the pods don't seem to have logs written to file >>>> anywhere, contrary to the properties, I'm inclined to say it's the former >>>> and that the log4j pr

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-12 Thread ouywl
have no idea why though. On Tue, Dec 10, 2019 at 6:56 PM Yun Tang <myas...@live.com> wrote: Sure, /opt/flink/conf is mounted as a volume from the configmap. Best Yun Tang From: Li Peng <li.p...@doordash.com> Date: Wednesday, December 11, 2019 at 9:37 AM To: Yang Wang <danr

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-11 Thread ouywl
't being picked up. Still have no idea why though. On Tue, Dec 10, 2019 at 6:56 PM Yun Tang <myas...@live.com> wrote: Sure, /opt/flink/conf is mounted as a volume from the configmap. Best Yun Tang From: Li Peng <li.p...@doordash.com> Date: Wednesday, December 11, 2019 at 9:

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-11 Thread Yang Wang
ds don't seem to have logs written to file >>> anywhere, contrary to the properties, I'm inclined to say it's the former >>> and that the log4j properties just aren't being picked up. Still have no >>> idea why though. >>> >>> On Tue,

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-11 Thread Li Peng
't being picked up. Still have no >> idea why though. >> >> On Tue, Dec 10, 2019 at 6:56 PM Yun Tang wrote: >> >>> Sure, /opt/flink/conf is mounted as a volume from the configmap. >>> >>> >>> >>> Best >>> >>> Yun

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-10 Thread Yang Wang
ue, Dec 10, 2019 at 6:56 PM Yun Tang wrote: > >> Sure, /opt/flink/conf is mounted as a volume from the configmap. >> >> >> >> Best >> >> Yun Tang >> >> >> >> *From: *Li Peng >> *Date: *Wednesday, December 11, 2019 at 9:37

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-10 Thread Li Peng
, /opt/flink/conf is mounted as a volume from the configmap. > > > > Best > > Yun Tang > > > > *From: *Li Peng > *Date: *Wednesday, December 11, 2019 at 9:37 AM > *To: *Yang Wang > *Cc: *vino yang , user > *Subject: *Re: Flink on Kubernetes seems to ignore

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-10 Thread Yun Tang
Sure, /opt/flink/conf is mounted as a volume from the configmap. Best Yun Tang From: Li Peng Date: Wednesday, December 11, 2019 at 9:37 AM To: Yang Wang Cc: vino yang , user Subject: Re: Flink on Kubernetes seems to ignore log4j.properties 1. Hey Yun, I'm calling /opt/flink/bin/stand

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-10 Thread Li Peng
1. Hey Yun, I'm calling /opt/flink/bin/standalone-job.sh and /opt/flink/bin/taskmanager.sh on my job and task managers respectively. It's based on the setup described here: http://shzhangji.com/blog/2019/08/24/deploy-flink-job-cluster-on-kubernetes/ . I haven't tried the configmap approach yet, doe

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-09 Thread Yang Wang
Hi Li Peng, You are running standalone session cluster or per-job cluster on kubernetes. Right? If so, i think you need to check your log4j.properties in the image, not local. The log is stored to /opt/flink/log/jobmanager.log by default. If you are running active Kubernetes integration for a fre

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-09 Thread Yun Tang
AM To: user Subject: Flink on Kubernetes seems to ignore log4j.properties Hey folks, I noticed that my kubernetes flink logs (reached via kubectl logs ) completely ignore any of the configurations I put into /flink/conf/. I set the logger level to WARN, yet I still see INFO level logging from

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-09 Thread vino yang
Hi Li, A potential reason could be conflicting logging frameworks. Can you share the log in your .out file and let us know if the print format of the log is the same as the configuration file you gave. Best, Vino On Tuesday, December 10, 2019 at 10:09 AM, Li Peng wrote: > Hey folks, I noticed that my kubernetes flink

Flink on Kubernetes seems to ignore log4j.properties

2019-12-09 Thread Li Peng
Hey folks, I noticed that my kubernetes flink logs (reached via *kubectl logs *) completely ignore any of the configurations I put into /flink/conf/. I set the logger level to WARN, yet I still see INFO level logging from flink loggers like org.apache.flink.runtime.checkpoint.CheckpointCoordinator.

Re: [DISCUSS] Best practice to run flink on kubernetes

2019-10-11 Thread Till Rohrmann
Thanks for bumping this thread Yang. I've left some minor comments on your design document. All in all it looks very good to me. I think the next step could be to open the missing PRs to complete FLINK-9953. That would allow people to already try your implementation out and maybe someone from the c

Re: Re:Memory constrains running Flink on Kubernetes

2019-10-10 Thread Yun Tang
t Yun Tang From: shengjk1 Sent: Thursday, October 10, 2019 20:37 To: wvl Cc: user@flink.apache.org Subject: Re:Memory constrains running Flink on Kubernetes +1 I also encountered a similar problem, but I run flink application that uses state in RocksDB on yarn. Yarn con

Re:Memory constrains running Flink on Kubernetes

2019-10-10 Thread shengjk1
+1 I also encountered a similar problem, but I run a Flink application that uses state in RocksDB on YARN. The YARN container was killed because of OOM. I also read the RocksDB tuning guide[1] and tuned some parameters, but it did not help, such as: class MyOptions1 implements OptionsFactory { @Override public DB

Re: [DISCUSS] Best practice to run flink on kubernetes

2019-09-29 Thread 星沉
Yang Wang wrote: By running mixed with online services, they could get better resource utilization and reduce cost. For Flink, as an important case, dynamic resource allocation is the basic requirement. That's why we want to move faster. Vote +1. If flink migrati

Re: [DISCUSS] Best practice to run flink on kubernetes

2019-09-29 Thread Yang Wang
Hi dev and users, I just want to revive this discussion because we have some meaningful progress about kubernetes native integration. I have made a draft implementation to complete the poc. Cli and submission are both working as expected. The design doc[1] has been updated, including the detailed

Re: Memory constrains running Flink on Kubernetes

2019-08-05 Thread Yun Tang
Subject: Re: Memory constrains running Flink on Kubernetes Btw, with regard to: > The default writer-buffer-number is 2 at most for each column family, and the > default write-buffer-memory size is 4MB. This isn't what I see when looking at the OPTIONS-XX file in the rocksdb dir

Re: Memory constrains running Flink on Kubernetes

2019-08-05 Thread wvl
from >>> my experience this part of memory would not occupy too much only if you >>> have many open files. >>> >>> Last but not least, Flink would enable slot sharing by default, and even >>> if you only one slot per taskmanager, there might exists m

Re: Memory constrains running Flink on Kubernetes

2019-07-29 Thread wvl
g by default, and even >> if you only one slot per taskmanager, there might exists many RocksDB >> within that TM due to many operator with keyed state running. >> >> Apart from the theoretical analysis, you'd better to open RocksDB native >> metrics or track the memory

Re: Memory constrains running Flink on Kubernetes

2019-07-29 Thread Yu Li
Best > Yun Tang > -- > *From:* wvl > *Sent:* Thursday, July 25, 2019 17:50 > *To:* Yang Wang > *Cc:* Yun Tang ; Xintong Song ; > user > *Subject:* Re: Memory constrains running Flink on Kubernetes > > Thanks for all the answers so far. > >

Re: Memory constrains running Flink on Kubernetes

2019-07-25 Thread Yun Tang
the memory usage of pods through Prometheus with k8s. Best Yun Tang From: wvl Sent: Thursday, July 25, 2019 17:50 To: Yang Wang Cc: Yun Tang ; Xintong Song ; user Subject: Re: Memory constrains running Flink on Kubernetes Thanks for all the answers so far. Espec

Re: Memory constrains running Flink on Kubernetes

2019-07-25 Thread wvl
e-oom-behavior >> [2] >> https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB#indexes-and-filter-blocks >> [3] >> https://ci.apache.org/projects/flink/flink-docs-release-1.8/ops/config.html#rocksdb-native-metrics >> >> Best >> Yun Tang >>

Re: Memory constrains running Flink on Kubernetes

2019-07-24 Thread Yang Wang
i/Memory-usage-in-RocksDB#indexes-and-filter-blocks > [3] > https://ci.apache.org/projects/flink/flink-docs-release-1.8/ops/config.html#rocksdb-native-metrics > > Best > Yun Tang > -- > *From:* Xintong Song > *Sent:* Wednesday, July 24, 2019 11:59 >

Re: Memory constrains running Flink on Kubernetes

2019-07-24 Thread Yun Tang
___ From: Xintong Song Sent: Wednesday, July 24, 2019 11:59 To: wvl Cc: user Subject: Re: Memory constrains running Flink on Kubernetes Hi, Flink acquires these 'Status_JVM_Memory' metrics through the MXBean library. According to MXBean document, non-heap is "the Java virtual machin

Re: Memory constrains running Flink on Kubernetes

2019-07-23 Thread Xintong Song
Hi, Flink acquires these 'Status_JVM_Memory' metrics through the MXBean library. According to MXBean document, non-heap is "the Java virtual machine manages memory other than the heap (referred as non-heap memory)". Not sure whether that is equivalent to the metaspace. If the '-XX:MaxMetaspaceSize
