We had a small outage of our 12-node Riak cluster yesterday, the first
so far: four nodes went down because they ran out of disk space.
What happened: our backup crashed, leaving one node down. Because of a
misconfiguration in our monitoring, this node stayed down for 9 days. It
was then started and recovered after about 7 hours of running.
But then the disk space usage on six of our nodes started to explode
until the disks were full on four of them. This seems to have been
caused by AAE. We have about 320GB of real user data stored in LevelDB,
and last night AAE consumed about 150GB on each of these nodes. Is this
normal, or what should I expect?
I deleted the "anti_entropy" directory on these six nodes and started
the nodes again; they have been running fine since then. The AAE data is
being rebuilt, but at a much lower level so far. I would be happy if
someone could shed some light on this. I didn't find anything helpful in
the logs or in the docs on docs.basho.com.
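
In case it helps anyone watch for the same growth, here is a minimal
sketch of how the AAE disk usage on a node could be compared with the
LevelDB data. It is not taken from the Riak docs. It assumes the data
root is /var/lib/riak with "anti_entropy" and "leveldb" subdirectories,
which may differ depending on app.config, and the function names are
made up for illustration.

```python
import os
import sys


def dir_size_bytes(path):
    """Sum the sizes of all regular files below path (0 if path is missing)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                # Files can vanish while the node is compacting; skip them.
                pass
    return total


def aae_report(data_root, aae_dir="anti_entropy", backend_dir="leveldb"):
    """Compare AAE disk usage with backend disk usage under data_root.

    The directory names are assumptions about the node's layout.
    """
    aae = dir_size_bytes(os.path.join(data_root, aae_dir))
    backend = dir_size_bytes(os.path.join(data_root, backend_dir))
    ratio = aae / backend if backend > 0 else None
    return {
        "anti_entropy_bytes": aae,
        "leveldb_bytes": backend,
        "ratio": ratio,
    }


if __name__ == "__main__":
    # Assumed default data root; pass another path as the first argument.
    root = sys.argv[1] if len(sys.argv) > 1 else "/var/lib/riak"
    report = aae_report(root)
    gib = 1024 ** 3
    print("anti_entropy: %.1f GiB" % (report["anti_entropy_bytes"] / gib))
    print("leveldb:      %.1f GiB" % (report["leveldb_bytes"] / gib))
    if report["ratio"] is not None:
        print("ratio:        %.2f" % report["ratio"])
```

Run from cron on each node, something like this could feed an alert when
the ratio climbs well above what the node normally shows.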
Software Architect
Blue Lion mobile GmbH
Tel. +49 (0) 221 788 797 14
Fax. +49 (0) 221 788 797 19
Mob. +49 (0) 176 24 87 30 89
>>> qeep: Hefferwolf
riak-users mailing list