For the benefit of Google and/or future me, and with huge thanks to Ed
Coleman, here’s a quick summary of an issue we hit with Accumulo 1.7.0 and
the fix. Details are in Slack but with a few red herrings (thanks to me).
Some of this is fat-fingered so apologies for any typos:



We recently needed to bounce our moderately sized (19-node) cluster (log4j
updates on other software), but Accumulo failed to restart. Four of the
nodes had been down for some time (root cause unknown).



Symptoms



1) The Accumulo monitor showed the list of tables but "-" against every entry

2) Accumulo files looked ok in HDFS

3) scan -t accumulo.root (with debug on) in the Accumulo shell gave “Failed
to locate tablet for table : +r row :”

4) There were some ZooKeeper warnings in some logs (I forget precisely
which) but they weren't hugely informative - ConnectionLoss for
/accumulo/{uuid}/root_tablet/walogs. This turned out to be critical, but I
didn't realise it at the time.

5) The ZooKeeper znodes showed that a tserver should be hosting the root
tablet (/accumulo/{id}/root_tablet/location), but that tserver did not hold
its lock
(/accumulo/{id}/tservers/mytservername.domain:9997/zlock-00000000)

6) Using the ZooKeeper CLI, "ls /accumulo/{id}/root_tablet/walogs" bombed
out with a familiar-looking ConnectionLoss, although with some more helpful
info: "Packet len is out of range" (rough transcripts of these checks are
sketched after this list)


Cause


ZooKeeper clients (whether the CLI or an Accumulo tserver) fail to list a
znode with a large number of children because there is insufficient buffer
space for the response. See the docs on jute.maxbuffer here -
https://zookeeper.apache.org/doc/r3.7.0/zookeeperAdmin.html#Unsafe+Options
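
The default jute.maxbuffer is just under 1 MB (0xfffff bytes), so a walogs
node with tens of thousands of children easily blows past it once all the
child names are packed into a single response. One way to see the child
count without fetching the list (and so without hitting the limit) is
"stat", which only returns the node's metadata - the numbers below are
made up:

  [zk] stat /accumulo/{id}/root_tablet/walogs
  ...
  dataLength = 0
  numChildren = 20000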

Quite why there were so many children of the walogs node is unknown, but it
may have been due to the four inactive tservers.


Fix


Set "-Djute.maxbuffer=big_value" for all Accumulo processes seemed to fix
things. For me, big_value was around 8000000 (i.e. 8MB). Accumulo came back
slowly, found all its data files and then the number of children of the zk
walogs node dropped substantially.
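
In case the specifics help anyone: we applied it via accumulo-env.sh, along
these lines (a sketch - on our 1.7 install ACCUMULO_GENERAL_OPTS is picked
up by all the Accumulo processes, but check your own env script, and your
existing value of the variable will differ):

  # accumulo-env.sh - appended to the existing general JVM opts
  export ACCUMULO_GENERAL_OPTS="${ACCUMULO_GENERAL_OPTS} -Djute.maxbuffer=8000000"

and then restarted the Accumulo processes so they picked the property up.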
