Re: confirm expected HA behavior

2014-02-09 Thread Azuryy Yu
Hi Todd, I think Arpit's test method is incorrect. We cannot block port 8020 to simulate the active NN going down, because the ZK session is still alive and the NN process is still running. So when port 8020 is unblocked, NN1 still thinks it is active. On Sat, Feb 8, 2014 at 3:47 AM, Todd Lipcon wrote: > Hi Ar

Re: confirm expected HA behavior

2014-02-07 Thread Todd Lipcon
Hi Arpit, The issue here is that our transaction log is not a proper "write-ahead log". In fact, it is a "write-behind" log of sorts -- our general operations look something like:
- lock namespace
- make a change to namespace
- write to log
- unlock namespace
- sync log
In the case of an active
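
The ordering Todd lists can be sketched as a toy model. This is only an illustration under stated assumptions, not NameNode code: the class `WriteBehindNamespace`, its methods, and the in-memory "log" lists are all hypothetical names. It shows the window in which a change is already visible in the namespace but not yet durable in the log.

```python
import threading


class WriteBehindNamespace:
    """Toy model (hypothetical, not HDFS code) of a write-behind edit log."""

    def __init__(self):
        self._lock = threading.Lock()
        self.namespace = {}   # in-memory state visible to readers
        self.buffered = []    # edits written to the log buffer, not yet synced
        self.durable = []     # edits that have been synced

    def apply_change(self, path):
        with self._lock:                            # lock namespace
            self.namespace[path] = "dir"            # make a change to namespace
            self.buffered.append(("mkdir", path))   # write to log (buffer only)
        # namespace is unlocked here; the edit is visible but not yet durable

    def sync(self):
        # sync log: buffered edits become durable only at this point
        self.durable.extend(self.buffered)
        self.buffered.clear()


ns = WriteBehindNamespace()
ns.apply_change("/a")
visible_before_sync = "/a" in ns.namespace and not ns.durable
ns.sync()
```

Between `apply_change` and `sync`, `visible_before_sync` is true: the namespace already reflects the edit while the durable log does not, which is the sense in which the log is "write-behind" rather than "write-ahead".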

RE: confirm expected HA behavior

2014-02-05 Thread Vinayakumar B
Hi Arpit, In your case, you blocked requests only on port 8020, but ssh was still reachable, right? Have you configured a fencing method, such as "sshfence"? If you have, then the previous active NN should be killed before the next one is made active. Else shared storage needs to handle single writer m
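
The rule Vinayakumar describes (fence the previous active before promoting the next one) can be sketched roughly as follows. This is a minimal sketch and not Hadoop's failover controller: `try_failover`, the fencer callables, and `promote` are hypothetical names, and trying the fencers one after another until one succeeds is an assumption made for the illustration.

```python
from typing import Callable, Iterable


def try_failover(fencers: Iterable[Callable[[], bool]],
                 promote: Callable[[], None]) -> bool:
    """Promote the standby only after some fencer reports success.

    Hypothetical helper: each fencer returns True if the old active
    was fenced (for example, killed over ssh).
    """
    for fence in fencers:
        try:
            if fence():
                promote()      # safe: the old active can no longer write
                return True
        except Exception:
            continue           # treat a crashing fencer as a failed attempt
    return False               # nothing fenced the old active: do not promote


events = []
ok = try_failover([lambda: False, lambda: True],
                  lambda: events.append("promoted"))
```

If no fencer succeeds, the function returns `False` without calling `promote`, so the toy model never ends up with two active nodes.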