I currently have a cluster of 51 hosts on Cassandra 1.2.15, 17 in each EC2 AZ (us-east-1a, 1d, 1e). These are m2.4xlarge machines, so they have basically a 10G partition on /, and then two ~800G partitions on /dev/sdb and /dev/sdc.
When I first started, I was expecting the commitlog to take up significantly more space than it does, so I mounted one of the two drives (/dev/sdb) for commitlog, saved_caches, etc., and the second drive (/dev/sdc) for data. However, now that I have had the cluster running for a week or so, I'm realizing that the space on sdb is much more necessary for my data (it will basically allow me to double the space).

My question is twofold.

1. If I change data_file_directories and add a second data directory, will this affect my data? What will I need to do once I change it--just run a repair? Upgrade sstables?

2. Right now, the average size of my commitlog seems to hover around 1G with the defaults. My config settings are as follows:

    commitlog_sync: periodic
    commitlog_sync_period_in_ms: 10000
    commitlog_segment_size_in_mb: 32
    commitlog_total_space_in_mb: 4096

Am I correct in assuming that commitlog_total_space_in_mb will restrict the MAXIMUM commitlog directory size to 4GB? I.e., it should never grow more than that? I'm concerned that, if I move the commitlog directory to the root partition, it will fill it up and cause system instability.

Furthermore, if I move the commitlog (and saved_caches) directory to another partition, would I just need to drain the node before shutting down Cassandra and moving it?

Thanks for the help, folks!

Andrew