Hi

I have a 3-node SSD-based cluster with around 1 TB of data, RF 3, C* v1.2.0,
vnodes, and one large CF using LCS. Everything was running smoothly until one
of the nodes crashed and was restarted.

During normal operation there was 800 GB of free space on each node.
After the crash, C* started using a lot more, resulting in an
out-of-disk-space situation on 2 nodes, i.e. C* used up the 800 GB in just 2
days, giving us very little time to do anything about it, since
repairs/joins take a considerable amount of time.
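
To put that growth in perspective, here is a rough back-of-the-envelope
sketch. It assumes decimal units (1 GB = 1000 MB) and that the 800 GB filled
up evenly over exactly 48 hours, which is only an approximation:

```python
# Approximate rate at which the free space was consumed on one node.
# Assumptions: 800 GB consumed evenly over exactly 2 days, decimal units.
free_space_gb = 800
elapsed_hours = 2 * 24

rate_gb_per_hour = free_space_gb / elapsed_hours
rate_mb_per_second = free_space_gb * 1000 / (elapsed_hours * 3600)

print(f"{rate_gb_per_hour:.1f} GB/hour, {rate_mb_per_second:.1f} MB/s")
# prints "16.7 GB/hour, 4.6 MB/s"
```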

What can make C* suddenly use this amount of disk space? We did see a lot
of pending compactions on one node (7k).

Any tips on recovering from an out-of-disk-space situation on multiple
nodes? I've tried moving some SSTables away, but C* seems to use
whatever space I free up in no time. I'm not sure whether any of the nodes is
fully up to date, as 'nodetool status' reports 3 different loads:

--  Address         Load     Tokens  Owns (effective)  Host ID                               Rack
UN  10.146.145.26   1.4 TB   256     100.0%            1261717d-ddc1-457e-9c93-431b3d3b5c5b  rack1
UN  10.148.149.141  1.03 TB  256     100.0%            f80bfa31-e19d-4346-9a14-86ae87f06356  rack1
DN  10.146.146.4    1.11 TB  256     100.0%            85d4cd28-93f4-4b96-8140-3605302e90a9  rack1


-- 

Sincerely,

*Nicolai Gylling*
