One other observation I forgot to mention: if I kill the RM and NM
processes, the Samza job seems to run properly. I only encounter this error
when the 01 server is rebooted, and as a result no jobs get processed.
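
For anyone trying to reproduce this, a quick way to see which ResourceManager
host is actually accepting connections on the resource tracker port is a plain
TCP probe. This is only a rough sketch: port 8031 and the host name
sprdargas403 are taken from the log quoted below, while the second host name
is a made-up placeholder, and a successful connect only shows that something
is listening, not that the RM is in active state.

```python
import socket


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


def check_resource_managers(hosts, port=8031):
    """Probe each host on the given port and return {host: reachable}."""
    return {host: is_port_open(host, port) for host in hosts}


if __name__ == "__main__":
    # "sprdargas403" appears in the log; the second name is a hypothetical
    # placeholder for the other ResourceManager host.
    rm_hosts = ["sprdargas403", "second-rm-host.example.invalid"]
    for host, reachable in check_resource_managers(rm_hosts).items():
        state = "accepting connections" if reachable else "not reachable"
        print(f"{host}:8031 is {state}")
```

Running this from the NodeManager host while 01 is down should at least show
whether the NM has any RM it can reach on that port.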

- Shekar

On Thu, May 14, 2015 at 12:14 PM, Shekar Tippur <ctip...@gmail.com> wrote:

> Hello,
>
> I have set up ResourceManager redundancy based on this doc:
> https://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html
> I then shut down server 01, expecting that server 02 would take over.
>
> Instead I see this error. I am not sure if I am missing something.
>
> 2015-05-14 11:55:01,820 INFO  [Node Status Updater]
> retry.RetryInvocationHandler (RetryInvocationHandler.java:invoke(140)) -
> Exception while invoking nodeHeartbeat of class ResourceTrackerPBClientImpl
> over rm2 after 19 fail over attempts. Trying to fail over after sleeping
> for 24180ms.
>
> java.net.ConnectException: Call From sprdargas403t/10.180.195.33 to
> sprdargas403:8031 failed on connection exception:
> java.net.ConnectException: Connection refused; For more details see:
> http://wiki.apache.org/hadoop/ConnectionRefused
>
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>     at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>     at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:783)
>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:730)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1415)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1364)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>     at com.sun.proxy.$Proxy27.nodeHeartbeat(Unknown Source)
>     at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.nodeHeartbeat(ResourceTrackerPBClientImpl.java:80)
>     at sun.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>     at com.sun.proxy.$Proxy28.nodeHeartbeat(Unknown Source)
>     at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl$1.run(NodeStatusUpdaterImpl.java:512)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:529)
>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:493)
>     at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:606)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:700)
>     at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1463)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1382)
>     ... 12 more
>
> 2015-05-14 11:55:01,965 INFO  [Container Monitor]
> monitor.ContainersMonitorImpl (ContainersMonitorImpl.java:run(408)) -
> Memory usage of ProcessTree 21428 for container-id
> container_1431628855028_0001_01_000001: 369.7 MB of 1 GB physical memory
> used; 1.4 GB of 2.1 GB virtual memory used
>
