Re: strange behavior with jobmanager.rpc.address on standalone HA cluster

2018-05-23 Thread Till Rohrmann
Alright, try to grab the logs if you see this problem reoccurring. I would be interested in understanding why this happens. Cheers, Till On Fri, May 18, 2018 at 9:45 PM, Derek VerLee wrote: > Till, > > Thanks for the response. Sorry for the delayed reply. > > The flink version is 1.3.2, in sta

Re: strange behavior with jobmanager.rpc.address on standalone HA cluster

2018-05-13 Thread Till Rohrmann
Hi Derek, given that you've started the different Flink cluster components all with the same HA-enabled configuration, the TMs should be able to connect to jm1 after you've killed jm0. The jobmanager.rpc.address should not be used when HA mode is enabled. In order to get to the bottom of the desc

Re: strange behavior with jobmanager.rpc.address on standalone HA cluster

2018-05-07 Thread Fabian Hueske
Hi Derek, 1. I've created a JIRA issue to improve the docs as you recommended [1]. 2. This discussion goes quite a bit into the internals of the HA setup. Let me pull in Till (in CC) who knows the details of HA. Best, Fabian [1] https://issues.apache.org/jira/browse/FLINK-9309 2018-05-05 15:34

strange behavior with jobmanager.rpc.address on standalone HA cluster

2018-05-05 Thread Derek VerLee
Two things: 1. It would be beneficial, I think, to drop a line somewhere in the docs (probably on the production-ready checklist as well as the HA page) explaining that enabling zookeeper "highavailability" allows for your jobs to restart automatically after a jo