to update, i seem to be having luck with some combination of "cleanup"
followed by triggering a garbage collection over jmx (all on each node),
using jmxterm:
echo -e 'open localhost:8080\nrun -b java.lang:type=Memory gc' | java -jar jmxterm-1.0-alpha-4-uber.jar
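in case it helps, the whole per-node sequence i've been running looks roughly
like the loop below. this is just a sketch: it assumes nodetool is on the path,
jmx is reachable remotely on the 0.6 default port 8080 (the same port the
jmxterm line above uses), and the host list is the four nodes from the ring
output further down.

for h in 10.3.0.84 10.3.0.85 10.3.0.114 10.3.0.115; do
    # drop data the node no longer owns, then force a JVM gc over jmx
    nodetool -h $h -p 8080 cleanup
    echo -e "open $h:8080\nrun -b java.lang:type=Memory gc" | \
        java -jar jmxterm-1.0-alpha-4-uber.jar
done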
-scott
On Mon, 16 Aug 2010, Scott Dworkis wrote:
i followed the alternative approach for handling a failed node here:
http://wiki.apache.org/cassandra/Operations
i.e. bringing up a replacement node with the same ip, bootstrapping it into
the same token used by the failed node (using the InitialToken config
parameter), then doing a repair. at the end of this process i had a data
directory almost 3x the size of the failed node's directory at the time of
failure... i expected around 2x from copies moving around, but 3x seems a bit
high for the headroom i should expect to need for recovery.
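for reference, the recovery steps were roughly the following (sketched against
the 0.6-style storage-conf.xml and the default jmx port of 8080; the token
value is just a placeholder for whatever the dead node owned):

# on the replacement node (same ip as the failed one), before starting
# cassandra, point it at the failed node's token in storage-conf.xml:
#   <InitialToken>token-of-the-failed-node</InitialToken>
# start cassandra, let it finish bootstrapping, then repair it:
nodetool -h localhost -p 8080 repair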
the data i inserted here was 100 copies of an almost 10M file, random
partitioner, no overwriting or anything, replication factor of 2. so i'd
expect to be using around 2G.
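(back of the envelope: 100 x ~10 MB is roughly 1 GB of raw data, and with a
replication factor of 2 there are two copies of everything, so about 2 GB
across the cluster.)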
here is what ring and du looked like after the initial data load:
Address       Status     Load          Range                                     Ring
                                       170141183460469231731687303715884105728
10.3.0.84     Up         448.8 MB      42535295865117307932921825928971026432   |<--|
10.3.0.85     Up         374 MB        85070591730234615865843651857942052864   |   |
10.3.0.114    Up         495 bytes     127605887595351923798765477786913079296  |   |
10.3.0.115    Up         496 bytes     170141183460469231731687303715884105728  |-->|
655M /data/cassandra/
655M /data/cassandra
655M /data/cassandra
1001M /data/cassandra
so far so good... now after the bootstrap:
Address       Status     Load          Range                                     Ring
                                       170141183460469231731687303715884105728
10.3.0.84     Up         467.5 MB      42535295865117307932921825928971026432   |<--|
10.3.0.85     Up         205.7 MB      85070591730234615865843651857942052864   |   |
10.3.0.114    Up         448.8 MB      127605887595351923798765477786913079296  |   |
10.3.0.115    Up         514.25 MB     170141183460469231731687303715884105728  |-->|
674M /data/cassandra
206M /data/cassandra/
655M /data/cassandra
767M /data/cassandra
also reasonable, now after the repair:
Address       Status     Load          Range                                     Ring
                                       170141183460469231731687303715884105728
10.3.0.84     Up         467.5 MB      42535295865117307932921825928971026432   |<--|
10.3.0.85     Up         916.3 MB      85070591730234615865843651857942052864   |   |
10.3.0.114    Up         654.5 MB      127605887595351923798765477786913079296  |   |
10.3.0.115    Up         514.25 MB     170141183460469231731687303715884105728  |-->|
674M /data/cassandra
1.4G /data/cassandra/
655M /data/cassandra
767M /data/cassandra
so would i need 3x headroom if i were to try this on a huge production data set?
after 3 or 4 rounds of nodetool cleanup, the ring looks ok, but the data
directories have bloated:
Address       Status     Load          Range                                     Ring
                                       170141183460469231731687303715884105728
10.3.0.84     Up         467.5 MB      42535295865117307932921825928971026432   |<--|
10.3.0.85     Up         420.75 MB     85070591730234615865843651857942052864   |   |
10.3.0.114    Up         448.8 MB      127605887595351923798765477786913079296  |   |
10.3.0.115    Up         514.25 MB     170141183460469231731687303715884105728  |-->|
1.2G /data/cassandra
842M /data/cassandra/
1.1G /data/cassandra
1.3G /data/cassandra
so the question is, should i plan on needing 3x headroom for node recoveries?
-scott