Re: Yarn redundancy

2015-05-15 Thread Shekar Tippur
I think I figured it out. stop-yarn.sh and start-yarn.sh restart the resourcemanager on the node where they are run, but not on the other nodes. The workaround is to go to each rm node and start the rm individually. I see that these scripts invoke another command, yarn-daemon.sh, which takes an argument --hosts. I tried to
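
A rough sketch of that workaround, in case it helps someone script it. It only builds and prints one ssh command per ResourceManager node so the commands can be reviewed before running them. The host names and the script location under $HADOOP_YARN_HOME are assumptions, not taken from this thread; `yarn-daemon.sh start resourcemanager` is the usual Hadoop 2.x form.

```python
import shlex

# Hypothetical host names; replace with the real ResourceManager nodes.
RM_HOSTS = ["rm-node-1", "rm-node-2"]

# Assumed location of the Hadoop 2.x daemon script on every node.
YARN_DAEMON = "$HADOOP_YARN_HOME/sbin/yarn-daemon.sh"


def rm_start_commands(hosts, daemon_script=YARN_DAEMON):
    """Return one ssh argument list per host that starts the ResourceManager there."""
    remote = f"{daemon_script} start resourcemanager"
    return [["ssh", host, remote] for host in hosts]


if __name__ == "__main__":
    for cmd in rm_start_commands(RM_HOSTS):
        # Print instead of executing so the commands can be checked first.
        print(" ".join(shlex.quote(part) for part in cmd))
```

The remote command is kept as a single quoted argument so that $HADOOP_YARN_HOME is expanded on the remote node rather than locally.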

Re: Yarn redundancy

2015-05-15 Thread Gustavo Anatoly
Hi, Shekar. The failure happens here: *sprdargas403t/10.180.195.33 to sprdargas403:8031* I suggest that you verify: 1) *$ nmap -sT sprdargas* to check whether port 8031 is open; 2) use *traceroute* to check whether the machine name is being resolved correctly; 3) check your /etc/hosts whethe
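
If nmap is not installed on the node, the first two checks can be approximated with a few lines of Python. This is only a sketch: the host name in the example is a placeholder, and port 8031 is taken from the log line above.

```python
import socket


def resolve(host):
    """Return the sorted IP addresses the host name resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    host, port = "rm-node-1", 8031  # hypothetical host name; port from the log
    print(host, "resolves to", resolve(host))
    print(f"{host}:{port} reachable:", port_open(host, port))
```

An empty result from `resolve` points at DNS or /etc/hosts, while a resolvable name with a closed port points at the ResourceManager not listening on that node.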

Re: Yarn redundancy

2015-05-15 Thread Yan Fang
Hi Shekar, I do not have much experience in setting up HA. If I were you, I would check: 1) when you take the RM down, does the backup RM run successfully? 2) if the backup RM runs successfully, can you see the Samza application running in the YARN UI (for example, localhost:8088)? 3) if can not see

Re: Yarn redundancy

2015-05-14 Thread Shekar Tippur
Yan, I have followed the doc. Here is what was done ...
1. Set up the yarn-site.xml:
   yarn.resourcemanager.ha.enabled = true
   yarn.resourcemanager.cluster-id = cluster1
   yarn.resourcemanager.ha.rm-ids = rm1,rm2
   yarn.resourcemanager.hostname.rm1 = sprdargas402.
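
For reference, a small sketch of how these properties map onto the `<configuration>` / `<property>` layout that yarn-site.xml uses. The property names are the ones listed above; both host names are placeholders because the real ones are cut off in this message, and the rm2 entry is an assumption implied by `rm-ids = rm1,rm2`.

```python
import xml.etree.ElementTree as ET

HA_PROPERTIES = {
    "yarn.resourcemanager.ha.enabled": "true",
    "yarn.resourcemanager.cluster-id": "cluster1",
    "yarn.resourcemanager.ha.rm-ids": "rm1,rm2",
    # Hypothetical host names; the real values are truncated in the message.
    "yarn.resourcemanager.hostname.rm1": "rm-node-1",
    "yarn.resourcemanager.hostname.rm2": "rm-node-2",
}


def to_yarn_site(properties):
    """Render a dict of properties as a yarn-site.xml style XML string."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_yarn_site(HA_PROPERTIES))
```

This covers only the properties mentioned here; a working HA setup may need further settings that this message does not show.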

Re: Yarn redundancy

2015-05-14 Thread Yan Fang
Is the HA set up correctly? From the log, the problem looks like it is on the YARN configuration side. Fang, Yan yanfang...@gmail.com On Thu, May 14, 2015 at 12:29 PM, Shekar Tippur wrote: > Another observation I forgot to mention is that if I kill the rm and nm > processes, the samza job seems to run properly. Only when 01 server

Re: Yarn redundancy

2015-05-14 Thread Shekar Tippur
Another observation I forgot to mention is that if I kill the rm and nm processes, the samza job seems to run properly. Only when 01 server is rebooted do I seem to encounter this error, and as a result, no jobs get processed. - Shekar On Thu, May 14, 2015 at 12:14 PM, Shekar Tippur wrote: > Hello, > > I h