Hi Marcin,

On Wed, May 14, 2014 at 7:22 AM, Marcin Cylke <marcin.cy...@ext.allegro.pl> wrote:
> - This looks like some problems with HA - but I've checked namenodes during
> the job was running, and there
> was no switch between master and slave namenode.
>
> 14/05/14 15:25:44 ERROR security.UserGroupInformation:
> PriviledgedActionException as:hc_client_reco_dev (auth:SIMPLE)
> cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
> Operation category READ is not supported in state standby
> 14/05/14 15:25:44 WARN ipc.Client: Exception encountered while connecting to
> the server :
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
> Operation category READ is not supported in state standby
> 14/05/14 15:25:44 ERROR security.UserGroupInformation:
> PriviledgedActionException as:hc_client_reco_dev (auth:SIMPLE)
> cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
> Operation category READ is not supported in state standby

These are actually not worrisome; that's just the HDFS client doing its own thing to support HA. It probably picked the "wrong" NN to try first, and got the "NN in standby" exception, which it logs. Then it tries the other NN and things just work as expected. Business as usual.

Not sure about the other exceptions you mention. I've seen the second one before, but it didn't seem to affect my jobs - maybe some race during cleanup.

-- 
Marcelo
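
P.S. In case it helps to picture it, here is a toy sketch of the "try one NN, log the standby error, try the other" behaviour described above. It is not Hadoop code: `FakeNameNode`, `read_with_failover`, and the local `StandbyException` class are hypothetical stand-ins made up for illustration, and the real client's retry logic is more involved than this.

```python
class StandbyException(Exception):
    """Hypothetical stand-in for org.apache.hadoop.ipc.StandbyException."""


class FakeNameNode:
    """Hypothetical NameNode stub: only the active one serves reads."""

    def __init__(self, name, active):
        self.name = name
        self.active = active

    def read(self, path):
        if not self.active:
            raise StandbyException(
                "Operation category READ is not supported in state standby"
            )
        return f"{self.name} served {path}"


def read_with_failover(namenodes, path, log):
    """Try each NameNode in order; log standby errors and move on."""
    for nn in namenodes:
        try:
            return nn.read(path)
        except StandbyException as exc:
            # This is the noisy-but-harmless part: the failure is logged,
            # then the next NameNode is tried.
            log.append(f"WARN {nn.name}: {exc}")
    raise RuntimeError("no active NameNode found")


log = []
# The client happens to try the standby NameNode first.
namenodes = [FakeNameNode("nn1", active=False), FakeNameNode("nn2", active=True)]
result = read_with_failover(namenodes, "/some/path", log)
print(result)
for line in log:
    print(line)
```

The read still succeeds against the second NameNode; the only trace of the first attempt is the logged standby message, which matches what shows up in your job output.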