John, we ran Samza in HA. One key thing to point out is that the failover is not graceful. Once ZooKeeper (assuming that's your provider) elects the standby ResourceManager, that ResourceManager will tear down and restart all applications, so depending on how long it takes your jobs to bootstrap you will probably have minutes of downtime. Just a heads-up.

-Jordan
On Tue, Nov 3, 2015 at 8:04 AM, John Tipper <john_tip...@hotmail.com> wrote:

> Thanks Rick, much appreciated. Did you have to set
> yarn.resourcemanager.hostname or can you leave this out?
>
> Thanks,
>
> John
>
> ________________________________________
> From: Rick Mangi <r...@chartbeat.com>
> Sent: 03 November 2015 15:46
> To: dev@samza.apache.org
> Subject: Re: Does Samza work with ResourceManager in HA?
>
> Hi John,
>
> We just got this set up last week. I haven’t fully tested the failover but
> it certainly works for testing out our samza jobs on a pretty large cluster.
>
> All of the server names are fqdn in our live yarn-site.xml
>
> Hope this helps,
>
> Rick
>
> <property>
>   <name>yarn.resourcemanager.ha.enabled</name>
>   <value>true</value>
> </property>
> <property>
>   <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
>   <value>true</value>
> </property>
> <property>
>   <name>yarn.resourcemanager.cluster-id</name>
>   <value>ops02</value>
> </property>
> <property>
>   <name>yarn.resourcemanager.ha.rm-ids</name>
>   <value>rm1,rm2</value>
> </property>
> <property>
>   <name>yarn.resourcemanager.hostname.rm1</name>
>   <value>yarnmaster01</value>
> </property>
> <property>
>   <name>yarn.resourcemanager.hostname.rm2</name>
>   <value>yarnmaster02</value>
> </property>
> <property>
>   <name>yarn.resourcemanager.webapp.address.rm1</name>
>   <value>yarnmaster01:8088</value>
> </property>
> <property>
>   <name>yarn.resourcemanager.webapp.address.rm2</name>
>   <value>yarnmaster02:8088</value>
> </property>
> <property>
>   <name>yarn.resourcemanager.zk-address</name>
>   <value>zk01:2181,zk02:2181,zk03:2181,zk04:2181,zk06:2181</value>
> </property>
> <property>
>   <name>yarn.resourcemanager.store.class</name>
>   <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
> </property>
> <property>
>   <name>yarn.resourcemanager.recovery.enabled</name>
>   <value>true</value>
> </property>
> <property>
>   <name>yarn.resourcemanager.zk-state-store.parent-path</name>
>   <value>/yarn_ops02_rmstore</value>
> </property>
>
> > On Nov 3, 2015, at 6:10 AM, John Tipper <john_tip...@hotmail.com> wrote:
> >
> > > Does anyone have Samza working with resource manager in HA? If so, what
> > > do I set yarn.resourcemanager.hostname to in yarn-site.xml? Can anyone
> > > share a working yarn-site.xml please?
> > >
> > > If I set it to the first of my RMs, the job submission works ok if I
> > > submit the job from that RM and the RM is the active one. If the RM machine
> > > that I run the job submission from is not active, I get connection refused
> > > errors on port 8032. If I don't set it, I get errors where run-job.sh
> > > tries to submit to 0.0.0.0:8032
> > >
> > > Many thanks,
> > >
> > > John

--
Jordan Shaw
Full Stack Software Engineer
PubNub Inc
1045 17th St
San Francisco, CA 94107
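
For anyone comparing their own yarn-site.xml against the one Rick posted, here is a minimal Python sketch that checks whether every id in yarn.resourcemanager.ha.rm-ids has a matching yarn.resourcemanager.hostname.<id> entry. It only looks at property names that appear in this thread and is not a complete list of what YARN needs for HA. The function names (load_properties, check_rm_ha) and the SAMPLE config are made up for illustration, and the sketch assumes the properties sit inside the usual <configuration> root element.

```python
import xml.etree.ElementTree as ET


def load_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop-style config."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def check_rm_ha(props):
    """Return a list of problems found; an empty list means none were found.

    The checks mirror the properties quoted in this thread only.
    """
    problems = []
    if props.get("yarn.resourcemanager.ha.enabled", "").lower() != "true":
        problems.append("yarn.resourcemanager.ha.enabled is not true")

    raw_ids = props.get("yarn.resourcemanager.ha.rm-ids", "")
    rm_ids = [rm_id.strip() for rm_id in raw_ids.split(",") if rm_id.strip()]
    if len(rm_ids) < 2:
        problems.append("yarn.resourcemanager.ha.rm-ids should list at least two ids")

    # Each RM id needs its own hostname entry, as in Rick's rm1/rm2 example.
    for rm_id in rm_ids:
        key = "yarn.resourcemanager.hostname." + rm_id
        if not props.get(key):
            problems.append("missing " + key)

    if not props.get("yarn.resourcemanager.zk-address"):
        problems.append("yarn.resourcemanager.zk-address is not set")
    return problems


# Hypothetical config with the rm2 hostname deliberately left out.
SAMPLE = """<configuration>
  <property><name>yarn.resourcemanager.ha.enabled</name><value>true</value></property>
  <property><name>yarn.resourcemanager.ha.rm-ids</name><value>rm1,rm2</value></property>
  <property><name>yarn.resourcemanager.hostname.rm1</name><value>yarnmaster01</value></property>
  <property><name>yarn.resourcemanager.zk-address</name><value>zk01:2181,zk02:2181,zk03:2181</value></property>
</configuration>"""

if __name__ == "__main__":
    for problem in check_rm_ha(load_properties(SAMPLE)):
        print(problem)
```

Run against the sample, this reports the missing yarn.resourcemanager.hostname.rm2 entry. To check a real file, read its contents and pass the text to load_properties instead of SAMPLE.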