Re: Flink HA with Zookeeper and Docker Compose: unable to startup a working setup.

2024-01-15 Thread Yang Wang
wrote: > Hello, > I'm trying to set up a testing environment using: > > - Flink HA with Zookeeper > - Docker Compose > > While starting, the TaskManager generates an exception and then, after some > restarts, it fails. > > The exception is: > "Caused by: org.apache.f

Flink HA with Zookeeper and Docker Compose: unable to startup a working setup.

2023-12-29 Thread Alessio Bernesco Làvore
Hello, I'm trying to set up a testing environment using: - Flink HA with Zookeeper - Docker Compose While starting, the TaskManager generates an exception and then, after some restarts, it fails. The exception is: "Caused by: org.apache.flink.runtime.rpc.exceptions.FencingTokenExceptio

Flink HA on Kubernetes - RPC port

2023-01-20 Thread bastien dine
Hello, We are migrating our HA setup from ZK to K8S, and we have a question regarding the RPC port. Previously, with ZK, the RPC connection config was high-availability.jobmanager.port. We were expecting the config to be the same with K8S HA, as the doc says: "The port (range) used b

Re: Activate Flink HA without checkpoints on k8S

2022-10-19 Thread Yang Wang
Adding some more information to Gyula's comment. For application mode without checkpoints, you do not need to activate HA since it will not take effect, and the Flink job will be submitted again after the JobManager restarts, because the job submission happens on the JobManager side. For sess

Re: Activate Flink HA without checkpoints on k8S

2022-10-13 Thread Gyula Fóra
Without HA, if the jobmanager goes down, job information is lost so the job won’t be restarted after the JM comes back up. Gyula On Thu, 13 Oct 2022 at 19:07, marco andreas wrote: > > > Hello, > > Can someone explain to me what is the point of using HA when deploying an > application cluster wi

Activate Flink HA without checkpoints on k8S

2022-10-13 Thread marco andreas
Hello, Can someone explain to me what the point of using HA is when deploying an application cluster with a single JM and checkpoints are not activated? AFAIK, when the JM pod goes down, Kubernetes will restart it anyway, so we don't need to activate HA in this case. Maybe there's som

Re: Need help of deploying Flink HA on kubernetes cluster

2021-08-02 Thread Yang Wang
Could you please check that the allocated load balancer can be accessed locally (on the Flink client side)? Best, Yang Fabian Paul wrote on Thursday, July 29, 2021 at 7:45 PM: > Hi Dhiru, > > Sorry for the late reply. Once the cluster is successfully started the web > UI should be reachable if you somehow forward

Re: Need help of deploying Flink HA on kubernetes cluster

2021-07-29 Thread Fabian Paul
Hi Dhiru, Sorry for the late reply. Once the cluster is successfully started the web UI should be reachable if you somehow forward the port of the running pod. Although with the exception you have shared I suspect the cluster never fully runs (or not long enough). Can you share the full stacktra

Re: Need help of deploying Flink HA on kubernetes cluster

2021-07-22 Thread Fabian Paul
Hi Dhiru, No worries I completely understand your point. Usually all the executable scripts from Flink can be found in the main repository [1]. We also provide a community edition of our commercial product [2] which manages the lifecycle of the cluster and you do not have to use these scripts an

Re: Need help of deploying Flink HA on kubernetes cluster

2021-07-22 Thread Fabian Paul
Hi Dhirendra, Thanks for reaching out. A good way to start is to have a look at [1] and [2]. Once you have everything set up, it should be possible to delete the pod of the JobManager while an application is running and have the job recover successfully. You can use one of the example Flink applicati

Need help of deploying Flink HA on kubernetes cluster

2021-07-21 Thread Dhiru
Hi, I am very new to Flink. I am planning to install a Flink HA setup on an EKS cluster with 5 worker nodes. Could someone please point me to the right materials or direction on how to install it, as well as any sample job that I can run just for testing to confirm that everything is working as expected

Re: Uploading job jar via web UI in flink HA mode

2020-12-02 Thread sidhant gupta
loadHandler.channelRead0(FileUploadHandler.java:159) >> [flink-dist_2.11-1.11.2.jar:1.11.2] >> >> at >> org.apache.flink.runtime.rest.FileUploadHandler.channelRead0(FileUploadHandler.java:68) >> [flink-dist_2.11-1.11.2.jar:1.11.2] >> >> at >> org.apach

Re: Uploading job jar via web UI in flink HA mode

2020-12-02 Thread Till Rohrmann
ation-for-aws Cheers, Till On Wed, Dec 2, 2020 at 11:31 AM sidhant gupta wrote: > Hi All, > > I have 2 job managers in flink HA mode cluster setup. I have a load > balancer forwarding request to both (leader and stand by) the job managers > in default round-robin fashion. While uploa

Uploading job jar via web UI in flink HA mode

2020-12-02 Thread sidhant gupta
Hi All, I have 2 job managers in a Flink HA mode cluster setup. I have a load balancer forwarding requests to both job managers (leader and standby) in default round-robin fashion. While uploading the job jar, the Web UI fluctuates between the leader and standby pages. It's difficult to upload

Re: Flink HA for Job Cluster

2020-02-10 Thread KristoffSC
Thank you both for the answers. I just want to get this right: can I achieve HA for the Job Cluster Docker config by having the zookeeper quorum configured as mentioned in [1] (with s3 and zookeeper)? I assume I need to modify the default Job Cluster config to match the [1] setup. [1] https://ci.apach

Re: Flink HA for Job Cluster

2020-02-09 Thread KristoffSC
Thank you both for the answers. I just want to get this right: can I achieve HA for the Job Cluster Docker config by having the zookeeper quorum configured as mentioned in [1] (with s3 and zookeeper)? I assume I need to modify the default Job Cluster config to match the [1] setup. [1] https://ci.apach

Re: Flink HA for Job Cluster

2020-02-09 Thread Yang Wang
Just as tison said, you could use a deployment to restart the jobmanager pod. However, if you want all jobs to be able to recover from the checkpoint, you also need to use ZooKeeper and HDFS/S3 to store the high-availability data. Also, some Kubernetes-native HA support is planned [1].

Re: Flink HA for Job Cluster

2020-02-09 Thread tison
Hi Krzysztof, Flink doesn't provide JM HA itself yet. For YARN deployment, you can rely on yarn.application-attempts configuration[1]; for Kubernetes deployment, Flink uses Kubernetes deployment to restart a failed JM. Though, such standalone mode doesn't tolerate JM failure and strategies above

Flink HA for Job Cluster

2020-02-07 Thread KristoffSC
Hi, In [1] we can find the setup for Standalone and YARN clusters to achieve JobManager HA. Is Standalone Cluster High Availability with ZooKeeper the same approach as for Docker's Job Cluster with Kubernetes? [1] https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/jobma

Re: Multiple Job Managers in Flink HA Setup

2019-09-26 Thread Yang Wang
e41d80ec74f09a356c73938@%3Cdev.flink.apache.org%3E > > On Fri, Sep 20, 2019 at 10:57 PM Steven Nelson > wrote: > >> Hello! >> >> I am having some difficulty with multiple job managers in an HA setup >> using Flink 1.9.0. >> >> I have 2 job managers and

Re: Multiple Job Managers in Flink HA Setup

2019-09-25 Thread Gary Yao
; high-availability.cluster-id: /imet-enhance > high-availability.storageDir: hdfs:///flink/ha/ > high-availability.zookeeper.quorum: > flink-state-hdfs-zookeeper-1.flink-state-hdfs-zookeeper-headless.default.svc.cluster.local:2181,flink-state-hdfs-zookeeper-2.flink-state-hdfs-zookeeper-headles

Multiple Job Managers in Flink HA Setup

2019-09-20 Thread Steven Nelson
Hello! I am having some difficulty with multiple job managers in an HA setup using Flink 1.9.0. I have 2 job managers and have set up HA with the following config: high-availability: zookeeper high-availability.cluster-id: /imet-enhance high-availability.storageDir: hdfs:///flink/ha
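
A minimal sketch, assuming the settings quoted in this thread live in a flat `key: value` file such as flink-conf.yaml: the Python below checks that the ZooKeeper HA keys named here are all present. The helper names (`parse_flat_conf`, `missing_ha_keys`) are hypothetical and not part of Flink, and the check says nothing about whether the values themselves are correct.

```python
# Keys taken from the configuration quoted in this thread.
REQUIRED_HA_KEYS = (
    "high-availability",
    "high-availability.cluster-id",
    "high-availability.storageDir",
    "high-availability.zookeeper.quorum",
)


def parse_flat_conf(text):
    """Parse flat 'key: value' lines, skipping blanks and '#' comments."""
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first colon only, so values like hdfs:///... survive.
        key, sep, value = line.partition(":")
        if sep:
            conf[key.strip()] = value.strip()
    return conf


def missing_ha_keys(conf):
    """Return the required HA keys that are absent or empty."""
    return [key for key in REQUIRED_HA_KEYS if not conf.get(key)]


# Sample input: the three settings visible in the message above.
SAMPLE = """
high-availability: zookeeper
high-availability.cluster-id: /imet-enhance
high-availability.storageDir: hdfs:///flink/ha/
"""

if __name__ == "__main__":
    print(missing_ha_keys(parse_flat_conf(SAMPLE)))
```

Run against the sample, the helper reports the quorum key as missing, since the sample deliberately leaves it out.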

Re: Flink HA cluster on YARN is restarted more than yarn.application-attempts value

2019-06-02 Thread Kazunori Shinhira
>> >> In that test, I set “yarn.application-attempts” to 5, but Flink cluster >> was recovered more than 5 times. >> >> >> Does anyone know what “yarn.application-attempts” mean, and when Flink >> cluster’s attempts time will be incremented ? >> >

Re: Flink HA cluster on YARN is restarted more than yarn.application-attempts value

2019-06-02 Thread Shuyi Chen
ut I still don’t get it. > > > > https://stackoverflow.com/questions/56225088/why-is-flink-ha-cluster-on-yarn-recovered-more-than-the-maximum-number-of-attemp > > > > Best, > -- > Kazunori Shinhira > Mail : k.shinhira.1...@gmail.com >

Flink HA cluster on YARN is restarted more than yarn.application-attempts value

2019-06-02 Thread 新平和礼
tions/56225088/why-is-flink-ha-cluster-on-yarn-recovered-more-than-the-maximum-number-of-attemp Best, -- Kazunori Shinhira Mail : k.shinhira.1...@gmail.com

Flink HA setup on Kubernetes

2018-12-31 Thread Steven Nelson
-availability.cluster-id: /cluster1 high-availability.storageDir: /flink/ha/ high-availability.zookeeper.quorum: flink-state-hdfs-zookeeper-1.flink-state-hdfs-zookeeper-headless.default.svc.cluster.local:2181,flink-state-hdfs-zookeeper-2.flink-state-hdfs-zookeeper-headless.default.svc.cluster.local:2181
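
As a rough illustration of the quorum format shown above (comma-separated `host:port` entries), the following Python sketch splits such a string into pairs. `parse_quorum` is a hypothetical helper, not a Flink or ZooKeeper API, and requiring an explicit port on every entry is an assumption of this sketch rather than a rule taken from the thread.

```python
def parse_quorum(quorum):
    """Split 'host:port,host:port' into a list of (host, port) tuples."""
    servers = []
    for entry in quorum.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing comma
        # Split on the last colon so the host part is kept whole.
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        servers.append((host, int(port)))
    return servers
```

A quick way to use it is to call `parse_quorum` on the value of `high-availability.zookeeper.quorum` before starting the cluster, so that a malformed entry fails early instead of surfacing later as a connection problem.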

Re: Unable to start Flink HA cluster with Zookeeper

2018-08-22 Thread mozer
Thanks for the info. I have managed to launch an HA cluster by adding rpc.address for all job managers. But it did not work with start-cluster.sh; I had to add them one by one. -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Unable to start Flink HA cluster with Zookeeper

2018-08-22 Thread Dawid Wysakowicz
Hi, It will use the HA settings as long as you specify high-availability: zookeeper. The jobmanager.rpc.address is used by the jobmanager as a binding address. You can verify it by starting two jobmanagers and then killing the leader. Best, Dawid On Tue, 21 Aug 2018 at 17:46, mozer wrote: > Yeah,

Re: Unable to start Flink HA cluster with Zookeeper

2018-08-21 Thread mozer
Yeah, you are right. I have already tried to set jobmanager.rpc.address and it works in that case, but if I use this setting I will not be able to use HA, am I right? How can the job manager register with ZooKeeper using the right address instead of localhost?

Re: Unable to start Flink HA cluster with Zookeeper

2018-08-21 Thread Dawid Wysakowicz
Hi, In your case the jobmanager binds itself to localhost and that's what it writes to zookeeper. Try starting the jobmanager manually with jobmanager.rpc.address set to the IP of the machine on which you are running the jobmanager. In other words, make sure the jobmanager binds itself to the right IP. Regards
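
To make the advice above concrete, here is a small Python sketch that flags a `jobmanager.rpc.address` value that would only be reachable from the same machine. `is_loopback_address` is a hypothetical helper written for illustration; it does no DNS resolution, so a hostname that resolves to a loopback address would not be caught.

```python
import ipaddress


def is_loopback_address(address):
    """Return True if the address is 'localhost' or a loopback IP literal."""
    candidate = address.strip()
    if candidate.lower() == "localhost":
        return True
    try:
        return ipaddress.ip_address(candidate).is_loopback
    except ValueError:
        # Not an IP literal (e.g. a hostname); this sketch does no DNS lookup.
        return False
```

If the check returns True for the configured value, remote TaskManagers reading that address from ZooKeeper would be pointed at themselves, which matches the symptom described in this thread.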

Re: Unable to start Flink HA cluster with Zookeeper

2018-08-21 Thread mozer
FQDN or full IP; I tried all of them, still no change... For the ssh connection, I can connect to each machine without passwords. Do you think that the problem could come from *high-availability.storageDir: file:///shareflink/recovery*? I don't use HDFS storage but a NAS file system which is co

Re: Unable to start Flink HA cluster with Zookeeper

2018-08-21 Thread miki haiat
First of all, try with the FQDN or full IP. Also, in order to run an HA cluster you need to make sure that you have passwordless ssh access between your master and slaves. On Tue, Aug 21, 2018 at 4:15 PM mozer wrote: > I am trying to install a Flink HA cluster (Zookeeper mode) but the t

Unable to start Flink HA cluster with Zookeeper

2018-08-21 Thread mozer
I am trying to install a Flink HA cluster (Zookeeper mode) but the task manager cannot find the job manager. Here I give you the architecture; - Machine 1 : Job Manager + Zookeeper - Machine 2 : Task Manager masters: Machine1 slaves : Machine2 flink-conf.yaml

Zookeeper DR backup needed for Flink HA mode?

2018-05-15 Thread David Corley
We're looking at DR scenarios for our Flink cluster. We already use Zookeeper for JM HA. We use an HDFS cluster that's replicated off-site, and our high-availability.zookeeper.storageDir property is configured to use HDFS. However, in the event of a site-failure, is it also essential that we have a

Re: Retaining uploaded job jars on Flink HA restarts on Kubernetes

2018-05-07 Thread Rohil Surana
erridden. By default, the temporary >>>directory is used. >>>- >>> >>>jobmanager.web.upload.dir: The config parameter defining the >>>directory for uploading the job jars. If not specified a dynamic >>> directory >>>

Re: Retaining uploaded job jars on Flink HA restarts on Kubernetes

2018-05-07 Thread Sampath Bhat
nfig parameter defining the >>directory for uploading the job jars. If not specified a dynamic directory >>will be used under the directory specified by jobmanager.web.tmpdir. >> >> >> Regards, >> >> Chirag >> >> >> >> On Su

Re: Retaining uploaded job jars on Flink HA restarts on Kubernetes

2018-05-07 Thread Rohil Surana
cified a dynamic directory will be >used under the directory specified by jobmanager.web.tmpdir. > > > Regards, > > Chirag > > > > On Sunday, 6 May, 2018, 12:29:43 AM IST, Rohil Surana > wrote: > > > Hi, > > I have a very basic Flink HA setup on Kubern

Re: Retaining uploaded job jars on Flink HA restarts on Kubernetes

2018-05-06 Thread Chesnay Schepler
jobmanager.web.tmpdir. Regards, Chirag On Sunday, 6 May, 2018, 12:29:43 AM IST, Rohil Surana wrote: Hi, I have a very basic Flink HA setup on Kubernetes and wanted to retain job jars on JobManager Restarts. For HA I am using a Zookeeper and a NFS drive mounted on all pods

Re: Retaining uploaded job jars on Flink HA restarts on Kubernetes

2018-05-06 Thread Rohil Surana
ter defining the directory >for uploading the job jars. If not specified a dynamic directory will be >used under the directory specified by jobmanager.web.tmpdir. > > > Regards, > > Chirag > > > > On Sunday, 6 May, 2018, 12:29:43 AM IST, Rohil Surana < >

Re: Retaining uploaded job jars on Flink HA restarts on Kubernetes

2018-05-06 Thread Chirag Dewan
the directory specified by jobmanager.web.tmpdir. Regards, Chirag On Sunday, 6 May, 2018, 12:29:43 AM IST, Rohil Surana wrote: Hi, I have a very basic Flink HA setup on Kubernetes and wanted to retain job jars on JobManager Restarts. For HA I am using a Zookeeper and a NFS drive mounted

Retaining uploaded job jars on Flink HA restarts on Kubernetes

2018-05-05 Thread Rohil Surana
Hi, I have a very basic Flink HA setup on Kubernetes and wanted to retain job jars on JobManager restarts. For HA I am using Zookeeper and an NFS drive mounted on all pods (JobManager and TaskManagers), which is being used for checkpoints, and I have also set the `web.upload.dir: /data/flink

Re: accessing flink HA cluster with scala shell/zeppelin notebook

2018-03-01 Thread santoshg
Hi Alexis, Were you able to make this work? I am also looking for Zeppelin integration with Flink and this might be helpful. Thanks, Santosh

Re: Flink HA Zookeeper Connection Timeout

2017-11-13 Thread Nico Kruber
hypra) wrote: > Hi – We’re currently testing Flink HA and running into a zookeeper timeout > issue. Error log below. > Is there a production checklist or any information on parameters that are > related to flink HA that I need to pay attention to? > Any pointers would really help

Re: Flink HA with Kubernetes, without Zookeeper

2017-08-23 Thread Ufuk Celebi
>> >> If you don’t want to actually rip way into the code for the Job Manager >> the ETCD Operator would be a good way to bring up an ETCD cluster that is >> separate from the core Kubernetes ETCD database. Combined with zetcd you >> could probably have that up and running qu

Re: Flink HA with Kubernetes, without Zookeeper

2017-08-22 Thread Hao Sun
> If you don’t want to actually rip way into the code for the Job Manager > the ETCD Operator <https://github.com/coreos/etcd-operator> would > be a good way to bring up an ETCD cluster that is separate from the core > Kubernetes ETCD database. Combined with zetcd you could probably hav

Re: Flink HA with Kubernetes, without Zookeeper

2017-08-22 Thread James Bucher
kly. Thanks, James Bucher From: Hao Sun <ha...@zendesk.com> Date: Monday, August 21, 2017 at 9:45 AM To: Stephan Ewen <se...@apache.org>, Shannon Carey <sca...@expedia.com> Cc: "user@flink.apache.org"

Re: Flink HA with Kubernetes, without Zookeeper

2017-08-21 Thread Hao Sun
>> where the JobManager stores information which needs to be recovered after >> the JobManager fails. >> >> We're eyeing https://github.com/coreos/zetcd >> <https://github.com/coreos/zetcd> as a way to run >> Zookeeper on top of Kubernetes' etcd cl

Re: Flink HA with Kubernetes, without Zookeeper

2017-08-21 Thread Stephan Ewen
https://github.com/coreos/zetcd as a way to run Zookeeper on > top of Kubernetes' etcd cluster so that we don't have to rely on a separate > Zookeeper cluster. However, we haven't tried it yet. > > -Shannon > > From: Hao Sun > Date: Sunday, August 20, 2017 at 9:04 PM

Re: Flink HA with Kubernetes, without Zookeeper

2017-08-21 Thread Shannon Carey
don't have to rely on a separate Zookeeper cluster. However, we haven't tried it yet. -Shannon From: Hao Sun <ha...@zendesk.com> Date: Sunday, August 20, 2017 at 9:04 PM To: "user@flink.apache.org" <user@flink.apache.org>

Flink HA with Kubernetes, without Zookeeper

2017-08-20 Thread Hao Sun
Hi, I am new to Flink and trying to bring up a Flink cluster on top of Kubernetes. For the HA setup with Kubernetes, I think I just need one job manager and do not need Zookeeper? I will store all state in S3 buckets, so in case of failure, Kubernetes can just bring up a new job manager without losi

Re: accessing flink HA cluster with scala shell/zeppelin notebook

2017-03-27 Thread Alexis Gendronneau
Hi Robert, Hi Till, I tried to set up the high-availability options in Zeppelin, but I guess it's just a matter of Flink version compatibility on the Zeppelin side. I'll try to compile Zeppelin with 1.2 and add the needed parameter to see if it's better. Thanks for your help! 2017-03-27 15:09 GMT+02:00 Till Rohr

Re: accessing flink HA cluster with scala shell/zeppelin notebook

2017-03-27 Thread Till Rohrmann
Hi Maciek and Alexis, as far as I can tell, I think it is currently not possible to use Zeppelin with a Flink cluster running in HA mode. In order to make it work, it would be necessary to specify either a Flink configuration for the Flink interpreter (this is probably the most general solution) o

Re: accessing flink HA cluster with scala shell/zeppelin notebook

2017-03-23 Thread Robert Metzger
Hi Alexis, did you set the Zookeeper configuration for Flink in Zeppelin? On Mon, Mar 20, 2017 at 11:37 AM, Alexis Gendronneau < a.gendronn...@gmail.com> wrote: > Hello users, > > As Maciek, I'm currently trying to make apache Zeppelin 0.7 working with > Flink. I have two versions of flink avail

Re: accessing flink HA cluster with scala shell/zeppelin notebook

2017-03-20 Thread Alexis Gendronneau
Hello users, Like Maciek, I'm currently trying to make Apache Zeppelin 0.7 work with Flink. I have two versions of Flink available (1.1.2 and 1.2.0). Each one is running in high-availability mode. When running jobs from Zeppelin in Flink local mode, everything works fine. But when trying to subm

Re: Starting flink HA cluster with start-cluster.sh

2017-03-08 Thread Ufuk Celebi
Shouldn't the else branch
```
else
    HIGH_AVAILABILITY=${DEPRECATED_HA}
fi
```
set it to `zookeeper`? Of course, the truth is whatever the script execution prints out. ;-) PS Emails like this should either go to the dev list or it's also fine to open an issue and discuss there (and potentially

Starting flink HA cluster with start-cluster.sh

2017-03-08 Thread Dawid Wysakowicz
Hi, I tried to start a cluster in HA mode as described in the docs, but it failed with the current state of bin/config.sh. I think there is a bug in how the HIGH_AVAILABILITY variable is configured in this block (bin/config.sh): if [ -z "${HIGH_AVAILABILITY}" ]; then HIGH_AVAILABILITY=$(readFromConfi

Re: accessing flink HA cluster with scala shell/zeppelin notebook

2017-01-24 Thread Aljoscha Krettek
+Till Rohrmann, do you know what can be used to access an HA cluster from that setting? Adding Till since he probably knows the HA stuff best. On Sun, 22 Jan 2017 at 15:58 Maciek Próchniak wrote: > Hi, > > I have standalone Flink cluster configured with HA setting (i.e. with > zookeeper recover

accessing flink HA cluster with scala shell/zeppelin notebook

2017-01-22 Thread Maciek Próchniak
Hi, I have a standalone Flink cluster configured with HA settings (i.e. with zookeeper recovery). How should I access it remotely, e.g. with a Zeppelin notebook or the scala shell? There are settings for host/port, but with the HA setting they are not fixed - if I check which is *current leader* host and

Re: Flink HA

2016-02-22 Thread Robert Metzger
Hi Thomas, To avoid having jobs forever restarting, you have to cancel them manually (from the web interface or the /bin/flink client). Also, you can set an appropriate restart strategy (in 1.0-SNAPSHOT), which limits the number of retries. This way the retrying will eventually stop. On Fri, Feb
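
The idea of a restart strategy that limits the number of retries can be sketched generically in Python. This is only an illustration of bounded retrying under assumed names (`run_with_restarts`, `max_restarts`); it is not Flink's restart strategy implementation or its configuration API.

```python
def run_with_restarts(job, max_restarts):
    """Run job(); on failure retry up to max_restarts times, then re-raise."""
    restarts = 0
    while True:
        try:
            return job()
        except Exception:
            if restarts >= max_restarts:
                # Retry budget exhausted: surface the last failure
                # instead of restarting forever.
                raise
            restarts += 1
```

With `max_restarts` set to 0 the job runs exactly once, so a failing job stops immediately rather than looping.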

Re: Flink HA

2016-02-18 Thread Ufuk Celebi
On Thu, Feb 18, 2016 at 6:59 PM, Thomas Lamirault wrote: > We are trying flink in HA mode. Great to hear! > We set in the flink yaml : > > state.backend: filesystem > > recovery.mode: zookeeper > recovery.zookeeper.quorum: > > recovery.zookeeper.path.root: > > recovery.zookeeper.storageDir: >

Flink HA

2016-02-18 Thread Thomas Lamirault
Hi ! We are trying flink in HA mode. Our application is a streaming application with windowing mechanism. We set in the flink yaml : state.backend: filesystem recovery.mode: zookeeper recovery.zookeeper.quorum: recovery.zookeeper.path.root: recovery.zookeeper.storageDir: recovery.back

Re: Issues testing Flink HA w/ ZooKeeper

2016-02-16 Thread Stephan Ewen
>> >>> Hi Stefano, >>> >>> >>> >>> The Job should stop temporarily but then be resumed by the new >>> >>> JobManager. Have you increased the number of execution retries? >>> AFAIK, >>> >>> it is set to 0 by de

Re: Issues testing Flink HA w/ ZooKeeper

2016-02-16 Thread Stefano Baghino
>> JobManager. Have you increased the number of execution retries? AFAIK, >> >>> it is set to 0 by default. This will not re-run the job, even in HA >> >>> mode. You can enable it on the StreamExecutionEnvironment. >> >>> >> >>> Otherwis

Re: Issues testing Flink HA w/ ZooKeeper

2016-02-15 Thread Stefano Baghino
> >>> mode. You can enable it on the StreamExecutionEnvironment. > >>> > >>> Otherwise, you have probably already found the documentation: > >>> > >>> > https://ci.apache.org/projects/flink/flink-docs-master/setup/jobmanager_high_availabil

Re: Issues testing Flink HA w/ ZooKeeper

2016-02-15 Thread Maximilian Michels
e, you have probably already found the documentation: >>> >>> https://ci.apache.org/projects/flink/flink-docs-master/setup/jobmanager_high_availability.html#configuration >>> >>> Cheers, >>> Max >>> >>> On Mon, Feb 15, 2016 at 12:35 PM,

Re: Issues testing Flink HA w/ ZooKeeper

2016-02-15 Thread Ufuk Celebi
> On 15 Feb 2016, at 13:40, Stefano Baghino > wrote: > > Hi Ufuk, thanks for replying. > > Regarding the masters file: yes, I've specified all the masters and checked > out that they were actually running after the start-cluster.sh. I'll gladly > share the logs as soon as I get to see them.

Re: Issues testing Flink HA w/ ZooKeeper

2016-02-15 Thread Maximilian Michels
have probably already found the documentation: >> >> https://ci.apache.org/projects/flink/flink-docs-master/setup/jobmanager_high_availability.html#configuration >> >> Cheers, >> Max >> >> On Mon, Feb 15, 2016 at 12:35 PM, Stefano Baghino >> wrote

Re: Issues testing Flink HA w/ ZooKeeper

2016-02-15 Thread Stefano Baghino
35 PM, Stefano Baghino > wrote: > > Hello everyone, > > > > last week I've ran some tests with Apache ZooKeeper to get a grip on > Flink > > HA features. My tests went bad so far and I can't sort out the reason. > > > > My latest tests involved F

Re: Issues testing Flink HA w/ ZooKeeper

2016-02-15 Thread Stefano Baghino
On Mon, Feb 15, 2016 at 12:35 PM, Stefano Baghino > wrote: > > Hello everyone, > > > > last week I've ran some tests with Apache ZooKeeper to get a grip on > Flink > > HA features. My tests went bad so far and I can't sort out the reason. > > > > My latest te

Re: Issues testing Flink HA w/ ZooKeeper

2016-02-15 Thread Maximilian Michels
oKeeper to get a grip on Flink > HA features. My tests went bad so far and I can't sort out the reason. > > My latest tests involved Flink 0.10.2, ran as a standalone cluster with 3 > masters and 4 slaves. The 3 masters are also the ZooKeeper (3.4.6) ensemble. > I've started

Re: Issues testing Flink HA w/ ZooKeeper

2016-02-15 Thread Ufuk Celebi
. Can you please share the job manager logs of all started job managers? – Ufuk On Mon, Feb 15, 2016 at 12:35 PM, Stefano Baghino wrote: > Hello everyone, > > last week I've ran some tests with Apache ZooKeeper to get a grip on Flink > HA features. My tests went bad so far and

Issues testing Flink HA w/ ZooKeeper

2016-02-15 Thread Stefano Baghino
Hello everyone, last week I ran some tests with Apache ZooKeeper to get a grip on Flink HA features. My tests have gone badly so far and I can't sort out the reason. My latest tests involved Flink 0.10.2, run as a standalone cluster with 3 masters and 4 slaves. The 3 masters are also the

Re: Flink HA mode

2015-09-10 Thread Ufuk Celebi
done yet. > > Best, Fabian > On Sep 10, 2015 01:29, "Emmanuel" wrote: > >> is this a 0.10 snapshot feature only? I'm using 0.9.1 right now >> >> >> -- >> From: ele...@msn.com >> To: user@flink.apache.org >>

RE: Flink HA mode

2015-09-09 Thread Fabian Hueske
ng 0.9.1 right now > > > -- > From: ele...@msn.com > To: user@flink.apache.org > Subject: RE: Flink HA mode > Date: Wed, 9 Sep 2015 16:11:38 -0700 > > Been playing with the HA... > I find the UIs confusing here: > in the dashboard on one side I see 0 slots 0 taskmanag

RE: Flink HA mode

2015-09-09 Thread Emmanuel
is this a 0.10 snapshot feature only? I'm using 0.9.1 right now From: ele...@msn.com To: user@flink.apache.org Subject: RE: Flink HA mode Date: Wed, 9 Sep 2015 16:11:38 -0700 Been playing with the HA...I find the UIs confusing here: in the dashboard on one side I see 0 slots 0 taskman

RE: Flink HA mode

2015-09-09 Thread Emmanuel
e multiple JMs IPs in the jobmanager.rpc.address? Thanks Date: Wed, 9 Sep 2015 10:19:36 +0200 Subject: Re: Flink HA mode From: trohrm...@apache.org To: user@flink.apache.org The only necessary information for the JobManager and TaskManager is to know where to find the ZooKeeper quorum to do leader election

Re: Flink HA mode

2015-09-09 Thread Till Rohrmann
The only necessary information for the JobManager and TaskManager is to know where to find the ZooKeeper quorum to do leader election and retrieve the leader address from. This will be configured via the config parameter `ha.zookeeper.quorum`. On Wed, Sep 9, 2015 at 10:15 AM, Stephan Ewen wrote:

Re: Flink HA mode

2015-09-09 Thread Stephan Ewen
TL;DR is that you are right, it is only the initial list. If a JobManager comes back with a new IP address, it will be available. On Wed, Sep 9, 2015 at 8:35 AM, Ufuk Celebi wrote: > > > On 09 Sep 2015, at 04:48, Emmanuel wrote: > > > > my questions is: how critical is the bootstrap ip list in

Re: Flink HA mode

2015-09-08 Thread Ufuk Celebi
> On 09 Sep 2015, at 04:48, Emmanuel wrote: > > my questions is: how critical is the bootstrap ip list in masters? Hey Emmanuel, good questions. I read over the docs for this again [1] and you are right that we should make this clearer. The "masters" file is only relevant for the start/stop

RE: Flink HA mode

2015-09-08 Thread Emmanuel
My question is: how critical is the bootstrap IP list in masters? Does this get updated, or does it have to be updated by some other service? From: zhangruc...@huawei.com To: user@flink.apache.org Subject: re: Flink HA mode Date: Wed, 9 Sep 2015 00:48:42 + In order to discover new

re: Flink HA mode

2015-09-08 Thread Zhangrucong
[mailto:ele...@msn.com] Sent: September 9, 2015 7:59 To: user@flink.apache.org Subject: Flink HA mode Looking at Flink HA mode. Why do you need to have the list of masters in the config if zookeeper is used to keep track of them? In an environment like Google Cloud or Container Engine, the JM may come back up

Flink HA mode

2015-09-08 Thread Emmanuel
Looking at Flink HA mode. Why do you need to have the list of masters in the config if zookeeper is used to keep track of them? In an environment like Google Cloud or Container Engine, the JM may come back up but will likely have another IP address. Is the masters config file only for