Dear Apache Flink Community,
I hope you're doing well.
We are currently operating a Flink deployment on *Kubernetes*, with *high
availability (HA) configured using Kubernetes-based HA services*. We're
exploring approaches for *geo-redundancy (GR)* to ensure disaster recovery
and fault tolerance a
val s3uri = new URI("s3a://mybucket")
val fs = FileSystem.get(s3uri, hadoopConfig)
###
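For an internal (non-AWS) S3 service, the s3a connector usually also needs to be pointed at the internal endpoint; a sketch of the Hadoop configuration keys that are typically required (the endpoint and credentials below are made-up placeholders, not values from this thread):

```yaml
# Hypothetical values - adjust to your internal S3 service
fs.s3a.endpoint: https://s3.internal.example.com
fs.s3a.path.style.access: "true"
fs.s3a.access.key: <access-key>
fs.s3a.secret.key: <secret-key>
```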
Best regards.
Gwenael Le Barzic
From: LE BARZIC Gwenael DTSI/SI
Sent: Thursday, July 11, 2024 16:24
To: user@flink.apache.org
Subject: Trying to read a file from S3 with flink on
Hey guys.
I'm trying to read a file from an internal S3 with flink on Kubernetes, but get
a strange blocking error.
Here is the code :
MyFlinkJob.scala :
###
package com.example.flink
import org.apache.flink.api.common.serialization.SimpleStringSchema
i
From: "Biao Geng"
To: "Oliver Schmied"
Cc: user@flink.apache.org
Subject: Re: Advice Needed: Setting Up Prometheus and Grafana Monitoring for Apache Flink on Kubernetes
Hi Oliver,
I believe you are almost there. One thing I found could improve is that in
your job yaml, instead of using:
kubernetes.operator.metrics.reporter.prommetrics.reporters: prom
kubernetes.operator.metrics.reporter.prommetrics.reporter.prom.factory.class:
org.apache.flink.metrics.promet
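For reference, job-level reporter options (as opposed to the operator-level kubernetes.operator.* ones) set in the FlinkDeployment's flinkConfiguration typically look like the sketch below; this is an illustration only and should be checked against the metric reporter docs for your Flink version:

```yaml
flinkConfiguration:
  metrics.reporter.prom.factory.class: org.apache.flink.metrics.prometheus.PrometheusReporterFactory
  metrics.reporter.prom.port: "9249"
```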
Dear Apache Flink Community,
I am currently trying to monitor an Apache Flink cluster deployed on Kubernetes using Prometheus and Grafana. Despite following the official guide (https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-release-1.8/docs/operations/metrics-logging/) on how t
The "JAR file does not exist" exception comes from the JobManager side, not
on the client.
Please be aware that the local:// scheme in the jarURI means the path in
the JobManager pod.
You could use an init-container to download your user jar and mount it to
the JobManager main-container.
Refer to
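The init-container approach described above can be sketched in a pod template roughly as follows (the helper image, download URL, and paths are hypothetical, not from this thread):

```yaml
# Sketch: download the user jar before the JobManager container starts,
# so that a local:///opt/flink/usrlib/job.jar jarURI resolves inside the pod.
spec:
  initContainers:
    - name: fetch-user-jar
      image: curlimages/curl:latest        # hypothetical helper image
      command: ["curl", "-L", "-o", "/opt/flink/usrlib/job.jar",
                "https://example.com/artifacts/job.jar"]   # hypothetical URL
      volumeMounts:
        - name: usrlib
          mountPath: /opt/flink/usrlib
  containers:
    - name: flink-main-container
      volumeMounts:
        - name: usrlib
          mountPath: /opt/flink/usrlib
  volumes:
    - name: usrlib
      emptyDir: {}
```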
I have a Kubernetes cluster in GCP running the Flink Kubernetes Operator.
I'm trying to package a project with the Apache Beam MinimalWordCount using
the Flink Runner as a FlinkDeployment to the Kubernetes Cluster
Job Docker image created with this Dockerfile:
FROM flink
ENV FLINK_CLASSPATH /op
Hello, thanks for your response.
I just want to know why you suggest that I should use the native Kubernetes HA.
In my case I need to use the standalone application mode.
On 2021/10/25 09:25:21, Xintong Song wrote:
> You should be using the native Kubernetes HA. This error message suggests
> Fl
Any suggestions would be appreciated.
On 2021/10/20 16:18:39, marco wrote:
Hello flink community:
I am deploying a Flink application cluster in standalone mode on Kubernetes, but I
am facing some problems.
The job starts normally and continues to run, but at some point in time it
crashes and gets restarted.
Is anyone facing the same problem, or does anyone know how to resolve
From: Guowei Ma
Sent: Thursday, September 2, 2021 11:32 PM
To: Julian Cardarelli
Cc: user
Subject: [External] Re: Flink on Kubernetes
Hi, Julian
I notice that your configuration includes
"restart-strategy.fixed-delay.attempt
Sent: Friday, September 3, 2021 08:09
To: Alexis Sarda-Espinosa
Cc: spoon_lz ; Denis Cosmin NUTIU ;
matth...@ververica.com; user@flink.apache.org
Subject: Re: Deploying Flink on Kubernetes with fractional CPU and different
limits and requests
Hi Alexis
Thanks for your valuable inputs.
First, I want to share
5GiB.
>
>
>
> Regards,
>
> Alexis.
>
>
>
> *From:* spoon_lz
> *Sent:* Thursday, September 2, 2021 14:12
> *To:* Yang Wang
> *Cc:* Denis Cosmin NUTIU ; Alexis Sarda-Espinosa <
> alexis.sarda-espin...@microfocus.com>; matth...@ververica.com;
> user@fli
So if this does not work I
think you could share the log at that time and the flink version you use.
Best,
Guowei
On Fri, Sep 3, 2021 at 2:00 AM Julian Cardarelli wrote:
Hello -
We have implemented Flink on Kubernetes with Google Cloud Storage in high
availability configuration as per the below configmap. Everything appears to be
working normally, state is being saved to GCS.
However, every now and then - perhaps weekly or every other week, all of the
; Alexis Sarda-Espinosa
; matth...@ververica.com;
user@flink.apache.org
Subject: Re: Deploying Flink on Kubernetes with fractional CPU and different
limits and requests
Hi Yang,
I agree with you, but I think the limit-factor should be greater than or equal
to 1, and defaulting to 1 is a better
Kubernetes | Apache Flink
[2] [FLINK-15648] Support to configure limit for CPU and memory requirement -
ASF JIRA (apache.org)
From: Yang Wang
Sent: Tuesday, August 31, 2021 6:04 AM
To: Alexis Sarda-Espinosa
Cc: Denis Cosmin NUTIU ; matth...@ververica.com
; user@flink.apache.org
Subject: Re: Deploying
u! I have limited knowledge of Flink internals but the
>> kubernetes.jobmanager.limit-factor and kubernetes.taskmanager.limit-factor
>> seems to be the right way to do it.
>>
>> [1] Native Kubernetes | Apache Flink
>> <https://ci.apache.org/projects/flink/flink-docs-master/doc
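As a concrete sketch of that suggestion (option names as given above; verify them against the docs for your Flink version, since the final option names may differ), requests and limits could then diverge like so:

```yaml
kubernetes.taskmanager.cpu: "0.5"            # request: 500m
kubernetes.taskmanager.limit-factor: "2.0"   # limit = 0.5 * 2.0 = 1000m
kubernetes.jobmanager.cpu: "0.5"
kubernetes.jobmanager.limit-factor: "2.0"
```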
From: Yang Wang
Sent: Tuesday, August 31, 2021 6:04 AM
To: Alexis Sarda-Espinosa
Cc: Denis Cosmin NUTIU ; matth...@ververica.com
; user@flink.apache.org
Subject: Re: Deploying Flink on Kubernetes with fractional CPU and different
limits and requests
Hi all,
I think it is a good improvement to support different resource requ
ink/flink-docs-master/docs/deployment/resource-providers/native_kubernetes/#pod-template>
> [2] [FLINK-15648] Support to configure limit for CPU and memory
> requirement - ASF JIRA (apache.org)
> <https://issues.apache.org/jira/browse/FLINK-15648>
>
> --
--
etting the configuration options get bloated.
>
>
>
> Regards,
>
> Alexis.
>
>
>
> *From:* Denis Cosmin NUTIU
> *Sent:* Thursday, August 26, 2021 15:55
> *To:* matth...@ververica.com
> *Cc:* user@flink.apache.org; danrtsey...@gmail.com
> *Subject:* Re: De
.
Regards,
Alexis.
From: Denis Cosmin NUTIU
Sent: Thursday, August 26, 2021 15:55
To: matth...@ververica.com
Cc: user@flink.apache.org; danrtsey...@gmail.com
Subject: Re: Deploying Flink on Kubernetes with fractional CPU and different
limits and requests
Hi Matthias,
Thanks for getting back to me and for your time!
We have some Flink jobs deployed on Kubernetes and running kubectl top pod
gives the following result:
NAME                  CPU(cores)   MEMORY(bytes)
aa-78c8cb77d4-zlmpg   8
Hi Denis,
I did a bit of digging: It looks like there is no way to specify them
independently. You can find documentation about pod templates for
TaskManager and JobManager [1]. But even there it states that for cpu and
memory, the resource specs are overwritten by the Flink configuration. The
code
Hello,
I've developed a Flink job and I'm trying to deploy it on a Kubernetes
cluster using Flink Native.
Setting kubernetes.taskmanager.cpu=0.5 and
kubernetes.jobmanager.cpu=0.5 sets the requests and limits to 500m,
which is correct, but I'd like to set the requests and limits to
different value
Hello -
I am running some testing with flink on Kubernetes. Every let's say five to ten
days, all the jobs disappear from running jobs. There's nothing under completed
jobs, and there's no record of the submitted jar files in the cluster.
In some manner or another, it is almost
Hi George,
Specifically for StateFun, we have the following Helm charts [1] to help
you deploy Stateful Functions on k8s.
The greeter example's docker-compose file also includes Kafka (and hence
Zookeeper).
Indeed the Flink cluster is "included" in the master/worker stateful
functions docker image
Thank you. This is very helpful.
On Mon, Nov 23, 2020 at 9:46 AM Till Rohrmann wrote:
Hi George,
Here is some documentation about how to deploy a stateful function job [1].
In a nutshell, you need to deploy a Flink cluster on which you can run the
stateful function job. This can either happen before (e.g. by spawning a
session cluster on K8s [2]) or you can combine your job into a
Sorry. Forgot to reply to all.
On Sun, Nov 22, 2020 at 9:24 PM George Costea wrote:
>
> Hi Xingbo,
>
> I’m interested in using stateful functions to build an application on
> Kubernetes. Don’t I need to deploy the flink cluster on Kubernetes first
> before deploying my stateful functions?
>
>
Hi George,
Have you referred to the official document[1]?
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/kubernetes.html
Best,
Xingbo
On Saturday, November 21, 2020, George Costea wrote:
Hi there,
Is there an example of how to deploy a flink cluster on Kubernetes?
I'd like to deploy the flink cluster, a kafka-broker, and then the
greeter example to give it a try.
Thanks,
George
Inline comments for your questions.
Is it similar to the first? How the second is different?
The first leverages a K8s operator to make running a standalone job cluster
on K8s easier. However,
the second is more Flink native. We have an embedded K8s client in Flink
and could use "kubernetes-sessi
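The "more Flink native" path described above amounts to letting Flink itself talk to the K8s API server. A sketch of what that looks like in practice (the cluster-id and resource values are made-up examples):

```shell
# Start a native Kubernetes session cluster (Flink 1.10+)
./bin/kubernetes-session.sh \
  -Dkubernetes.cluster-id=my-flink-session \
  -Dtaskmanager.memory.process.size=2048m \
  -Dkubernetes.taskmanager.cpu=1

# Submit a job to that session cluster
./bin/flink run --target kubernetes-session \
  -Dkubernetes.cluster-id=my-flink-session \
  examples/streaming/TopSpeedWindowing.jar
```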
I was trying to deploy Flink in Kubernetes environment and came across two
things:
1. Kubernetes Flink control plane developed by google and Lyft
- https://github.com/lyft/flinkk8soperator
- https://github.com/GoogleCloudPlatform/flink-on-k8s-operator
2. Deploying Kubernetes natively.
Hi Ivan Yang,
#1. If a TaskManager crashed exceptionally and there were some running tasks
on it, it
could not join back gracefully. Whether the full job will be restarted
depends on the
failover strategies[1].
#2. Currently, when new TaskManagers join the Flink cluster, the running
Flink
job co
Hi,
I have set up Flink 1.9.1 on Kubernetes on AWS EKS with one job manager pod, 10
task manager pods, one pod per EC2 instance. The job runs fine. After a while, for
some reason, one pod (task manager) crashed, then the pod restarted. After
that, the job got into a bad state. All the parallelisms a
user
Subject: Re: Flink on Kubernetes unable to Recover from failure
Hey Morgan,
Is it possible for you to provide us with the full logs of the JobManager
and the affected TaskManager?
This might give us a hint why the number of task slots is zero.
Best,
Robert
On Tue, May 5, 2020 at 11:41 AM Morgan Geldenhuys <
morgan.geldenh...@tu-berlin.de> wrote:
Community,
I am currently doing some fault tolerance testing for Flink (1.10)
running on Kubernetes (1.18) and am encountering an error where after a
running job experiences a failure, the job fails completely.
A Flink session cluster has been created according to the documentation
containe
Thank you, Yang and Xintong!
Best,
Pankaj
On Mon, Mar 16, 2020, 9:27 PM Yang Wang wrote:
Hi Pankaj,
Just like Xintong has said, the biggest difference between Flink on Kubernetes
and the native
integration is dynamic resource allocation. The latter has an embedded K8s
client and will communicate with the K8s API server directly to
allocate/release JM/TM
pods.
Both for the two ways to run
Both for the two ways to run
Hi Pankaj,
"Running Flink on Kubernetes" refers to the old way that basically deploys
a Flink standalone cluster on Kubernetes. We leverage scripts to run the Flink
Master and TaskManager processes inside Kubernetes containers. In this way,
Flink is not aware of whether it's running i
Forgot to mention that "running Flink natively on Kubernetes" is newly
introduced and is only available for Flink 1.10 and above.
Thank you~
Xintong Song
On Mon, Mar 16, 2020 at 5:40 PM Xintong Song wrote:
Hi all,
I want to run Flink, Spark and other processing engines on a single
Kubernetes cluster.
From the Flink documentation, I did not understand the difference between:
(1) Running Flink on Kubernetes, Versus (2) Running Flink natively on
Kubernetes.
Could someone please explain
Hi Flavio,
We have implemented our own flink operator, the operator will start a flink
job cluster (the application jar is already packaged together with flink in
the docker image). I believe Google's flink operator will start a session
cluster, and user can submit the flink job via REST. Not look
Sorry I wanted to mention https://github.com/lyft/flinkk8soperator (I don't
know which one of the 2 is better)
On Wed, Mar 11, 2020 at 10:19 AM Flavio Pompermaier
wrote:
> Have you tried to use existing operators such as
> https://github.com/GoogleCloudPlatform/flink-on-k8s-operator or
> https:/
Have you tried to use existing operators such as
https://github.com/GoogleCloudPlatform/flink-on-k8s-operator or
https://github.com/GoogleCloudPlatform/flink-on-k8s-operator?
On Wed, Mar 11, 2020 at 4:46 AM Xintong Song wrote:
Hi Eleanore,
That doesn't sound like a scaling issue. It's probably data skew: the
data volume on some of the keys is significantly higher than on others. I'm
not familiar with this area though, and have copied Jark for you, who is
one of the community experts in this area.
Thank you~
Xintong
Hi Xintong,
Thanks for the prompt reply! To answer your question:
- Which Flink version are you using?
v1.8.2
- Is this skew observed only after a scaling-up? What happens if the
parallelism is initially set to the scaled-up value?
I also tried this, it
Hi Eleanore,
I have a few more questions regarding your issue.
- Which Flink version are you using?
- Is this skew observed only after a scaling-up? What happens if the
parallelism is initially set to the scaled-up value?
- Keeping the job running a while after the scale-up, does the
Hi Experts,
I have my flink application running on Kubernetes, initially with 1 Job
Manager, and 2 Task Managers.
Then we have the custom operator that watches for the CRD, when the CRD
replicas changed, it will patch the Flink Job Manager deployment
parallelism and max parallelism according to th
e
>>>>>> case of failed JM - perhaps we need to resubmit all jobs.
>>>>>> Let me know if I have misunderstood anything.
>>>>>>
>>>>>> 3. Restarting jobs
>>>>>> For the session cluster, you could directly cancel the job a
t;>>>> per-job
>>>>> cluster from the latest savepoint.
>>>>>
>>>>> 4. Managing the flink jobs
>>>>> The rest api and flink command line could be used to managing the
>>>>> jobs(e.g.
>>>>>
on cluster, if one taskmanager crashed, then all the jobs which
>>> have tasks
>>> on this taskmanager will failed.
>>> Both session and per-job could be configured with high availability and
>>> recover
>>> from the latest checkpoint.
>
t;> recover
>> from the latest checkpoint.
>>
>> Mans - Does a task manager failure cause the job to fail? My
>> understanding is that JM failures are catastrophic while TM failures are
>> recoverable.
>>
>> > Is there any need for specifying volume for the
failure are catastrophic while TM failures are
> recoverable.
>
> > Is there any need for specifying volume for the pods?
> No, you do not need to specify a volume for pod. All the data in the pod
> local directory is temporary. When a pod crashed and relaunched, the
> task
sumefrom the latest checkpoint.
Mans - So if we are saving checkpoint in S3 then there is no need for disks -
should we use emptyDir ?
[1].
https://ci.apache.org/projects/flink/flink-docs-master/ops/deployment/native_kubernetes.html
M Singh wrote on Sunday, February 23, 2020 at 2:28 AM:
Hey Folks:
I am trying to figure out
pecify a volume for pod. All the data in the pod
> local directory is temporary. When a pod crashed and relaunched, the
> taskmanager will retrieve the checkpoint from zookeeper + S3 and resume
> from the latest checkpoint.
>
> Mans - So if we are saving checkpoint in S3 then there is no
Hey Folks:
I am trying to figure out the options for running Flink on Kubernetes and am
trying to find out the pros and cons of running in Flink Session vs Flink
Cluster mode
(https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/kubernetes.html#flink-session-cluster-on
>>>>
>>>> On Wednesday, December 11, 2019 at 1:24 PM, Li Peng wrote:
>>>>
>>>>> Ah I see. I think the Flink app is reading files from
>>>>> /opt/flink/conf correctly as it is, since changes I make to flink-conf are
>>>>> picked up as expected, it's
used, or don't apply to stdout or whatever source k8 uses for its
>>>> logs? Given that the pods don't seem to have logs written to file
>>>> anywhere, contrary to the properties, I'm inclined to say it's the former
>>>> and that the log4j pr
have no idea why though.On Tue, Dec 10, 2019 at 6:56 PM Yun Tang <myas...@live.com> wrote:
Sure, /opt/flink/conf is mounted as a volume from the configmap.
Best
Yun Tang
From: Li Peng <li.p...@doordash.com>
Date: Wednesday, December 11, 2019 at 9:37 AM
To: Yang Wang <danr
Sure, /opt/flink/conf is mounted as a volume from the configmap.
Best
Yun Tang
From: Li Peng
Date: Wednesday, December 11, 2019 at 9:37 AM
To: Yang Wang
Cc: vino yang , user
Subject: Re: Flink on Kubernetes seems to ignore log4j.properties
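A minimal sketch of the ConfigMap mount described above (the ConfigMap and container names are illustrative, not taken from this thread):

```yaml
# Mount a ConfigMap containing flink-conf.yaml and log4j.properties
# over /opt/flink/conf so the JVM picks up the log4j settings.
spec:
  containers:
    - name: flink-main-container
      volumeMounts:
        - name: flink-config
          mountPath: /opt/flink/conf
  volumes:
    - name: flink-config
      configMap:
        name: flink-config    # hypothetical ConfigMap name
        items:
          - key: flink-conf.yaml
            path: flink-conf.yaml
          - key: log4j.properties
            path: log4j.properties
```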
1. Hey Yun, I'm calling /opt/flink/bin/standalone-job.sh and
/opt/flink/bin/taskmanager.sh on my job and task managers respectively.
It's based on the setup described here:
http://shzhangji.com/blog/2019/08/24/deploy-flink-job-cluster-on-kubernetes/ .
I haven't tried the configmap approach yet, doe
Hi Li Peng,
You are running a standalone session cluster or per-job cluster on
Kubernetes, right?
If so, I think you need to check the log4j.properties in the image, not
locally. The log is
stored to /opt/flink/log/jobmanager.log by default.
If you are running the active Kubernetes integration for a fre
AM
To: user
Subject: Flink on Kubernetes seems to ignore log4j.properties
Hi Li,
A potential reason could be conflicting logging frameworks. Can you share
the log in your .out file and let us know if the print format of the log is
the same as the configuration file you gave.
Best,
Vino
On Tuesday, December 10, 2019 at 10:09 AM, Li Peng wrote:
> Hey folks, I noticed that my kubernetes flink
Hey folks, I noticed that my kubernetes flink logs (reached via *kubectl
logs *) completely ignore any of the configurations I put into
/flink/conf/. I set the logger level to WARN, yet I still see INFO level
logging from flink loggers
like org.apache.flink.runtime.checkpoint.CheckpointCoordinator.
Thanks for bumping this thread Yang. I've left some minor comments on your
design document. All in all it looks very good to me. I think the next step
could be to open the missing PRs to complete FLINK-9953. That would allow
people to already try your implementation out and maybe someone from the
c
t
Yun Tang
From: shengjk1
Sent: Thursday, October 10, 2019 20:37
To: wvl
Cc: user@flink.apache.org
Subject: Re:Memory constrains running Flink on Kubernetes
+1
I also encountered a similar problem, but I run a Flink application that uses
state in RocksDB on YARN. The YARN container was killed because of OOM.
I also saw the RocksDB tuning guide[1] and tuned some parameters, but it was
useless, such as:
class MyOptions1 implements OptionsFactory {
@Override
public DB
Yang Wang wrote:
Through mixed-run with online services, they could get better
resource utilization and reduce the cost. For Flink, as an important case,
dynamic resource
allocation is the basic requirement. That's why we want to move the
progress faster.
vote +1. If flink migrati
Hi dev and users,
I just want to revive this discussion because we have some meaningful
progress on
the Kubernetes native integration. I have made a draft implementation to
complete the PoC.
Cli and submission are both working as expected. The design doc[1] has been
updated,
including the detailed
Subject: Re: Memory constrains running Flink on Kubernetes
Btw, with regard to:
> The default write-buffer-number is 2 at most for each column family, and the
> default write-buffer-memory size is 4MB.
This isn't what I see when looking at the OPTIONS-XX file in the rocksdb
dir
from
>>> my experience this part of memory would not occupy too much unless you
>>> have many open files.
>>>
>>> Last but not least, Flink would enable slot sharing by default, and even
>>> if you only one slot per taskmanager, there might exists m
g by default, and even
>> if you have only one slot per taskmanager, there might exist many RocksDB
>> instances within that TM due to many operators with keyed state running.
>>
>> Apart from the theoretical analysis, you'd better to open RocksDB native
>> metrics or track the memory
Best
> Yun Tang
> --
> *From:* wvl
> *Sent:* Thursday, July 25, 2019 17:50
> *To:* Yang Wang
> *Cc:* Yun Tang ; Xintong Song ;
> user
> *Subject:* Re: Memory constrains running Flink on Kubernetes
>
> Thanks for all the answers so far.
>
>
the memory usage of pods through Prometheus with k8s.
Best
Yun Tang
From: wvl
Sent: Thursday, July 25, 2019 17:50
To: Yang Wang
Cc: Yun Tang ; Xintong Song ; user
Subject: Re: Memory constrains running Flink on Kubernetes
Thanks for all the answers so far.
Espec
e-oom-behavior
>> [2]
>> https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB#indexes-and-filter-blocks
>> [3]
>> https://ci.apache.org/projects/flink/flink-docs-release-1.8/ops/config.html#rocksdb-native-metrics
>>
>> Best
>> Yun Tang
>>
i/Memory-usage-in-RocksDB#indexes-and-filter-blocks
> [3]
> https://ci.apache.org/projects/flink/flink-docs-release-1.8/ops/config.html#rocksdb-native-metrics
>
> Best
> Yun Tang
> --
> *From:* Xintong Song
> *Sent:* Wednesday, July 24, 2019 11:59
&g
___
From: Xintong Song
Sent: Wednesday, July 24, 2019 11:59
To: wvl
Cc: user
Subject: Re: Memory constrains running Flink on Kubernetes
Hi,
Flink acquires these 'Status_JVM_Memory' metrics through the MXBean library.
According to MXBean document, non-heap is "the Java virtual machin
Hi,
Flink acquires these 'Status_JVM_Memory' metrics through the MXBean
library. According to MXBean document, non-heap is "the Java virtual
machine manages memory other than the heap (referred as non-heap memory)".
Not sure whether that is equivalent to the metaspace. If the
'-XX:MaxMetaspaceSize
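The MXBean lookup described here is easy to reproduce outside Flink. A small standalone probe (plain JDK only, not Flink code) that prints the same aggregate non-heap figure together with its per-pool breakdown, which shows why non-heap and metaspace are not the same thing:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

public class NonHeapProbe {
    public static void main(String[] args) {
        // Aggregate non-heap usage, as reported via MemoryMXBean
        MemoryUsage nonHeap = ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage();
        System.out.println("non-heap used: " + nonHeap.getUsed() + " bytes");

        // Per-pool breakdown: Metaspace is only one of several non-heap pools
        // (code cache, compressed class space, ...), so the aggregate non-heap
        // figure is generally larger than metaspace alone.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.NON_HEAP) {
                System.out.println(pool.getName() + ": " + pool.getUsage().getUsed() + " bytes");
            }
        }
    }
}
```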