I'm testing various scenarios in a multi-data-center configuration. The setup is 10 Cassandra 1.1.5 nodes configured as two data centers with 5 nodes each (RF DC1:3, DC2:3, write consistency LOCAL_QUORUM); the keyspace definition is sketched below. I have a synthetic random data generator, and each run adds roughly 1GiB of data to each node.
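For reference, the keyspace is created along these lines (the keyspace name is a placeholder and the exact cassandra-cli statement is reproduced from memory, so it may be slightly off):

create keyspace testks
  with placement_strategy = 'NetworkTopologyStrategy'
  and strategy_options = {DC1:3, DC2:3};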
After one run of the generator, the ring looks like this:

DC    Rack    Status  State    Load        Effective-Ownership
DC1   RAC1    Up      Normal   1010.71 MB  60.00%
DC2   RAC1    Up      Normal   1009.08 MB  60.00%
DC1   RAC1    Up      Normal   1.01 GB     60.00%
DC2   RAC1    Up      Normal   1 GB        60.00%
DC1   RAC1    Up      Normal   1.01 GB     60.00%
DC2   RAC1    Up      Normal   1014.45 MB  60.00%
DC1   RAC1    Up      Normal   1.01 GB     60.00%
DC2   RAC1    Up      Normal   1.01 GB     60.00%
DC1   RAC1    Up      Normal   1.01 GB     60.00%
DC2   RAC1    Up      Normal   1.01 GB     60.00%

Now, if I kill all the nodes in DC2 and run the data generator again, I would expect roughly 2GiB to be added to each node in DC1 (local replicas plus hints destined for the other data center). Instead I get this:

DC    Rack    Status  State    Load        Effective-Ownership
DC1   RAC1    Up      Normal   17.56 GB    60.00%
DC2   RAC1    Down    Normal   1009.08 MB  60.00%
DC1   RAC1    Up      Normal   17.47 GB    60.00%
DC2   RAC1    Down    Normal   1 GB        60.00%
DC1   RAC1    Up      Normal   17.22 GB    60.00%
DC2   RAC1    Down    Normal   1014.45 MB  60.00%
DC1   RAC1    Up      Normal   16.94 GB    60.00%
DC2   RAC1    Down    Normal   1.01 GB     60.00%
DC1   RAC1    Up      Normal   17.26 GB    60.00%
DC2   RAC1    Down    Normal   1.01 GB     60.00%

Checking the sstables on one of the DC1 nodes reveals this:

-bash-3.2$ du -hs HintsColumnFamily/
16G     HintsColumnFamily/
-bash-3.2$

So what I would have expected to be roughly 1GiB of hints is in reality 15x-16x larger, and this has a huge impact on write performance as well. If I bring DC2 back up, the load eventually drops and evens out to about 2GiB across the entire cluster.

Is this inflation intended, or is it possibly a bug (or something I'm doing wrong)? Assuming the inflation is correct, what is the best way to deal with temporary connectivity issues to a second data center? Write performance is paramount in my use case; a 2x-3x overhead is doable, but 15x-16x is not.

Thanks,
/dml
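P.S. Hinted handoff on all nodes is left at what I believe are the stock 1.1 settings; from memory, the relevant cassandra.yaml entries look roughly like this (please correct me if the defaults differ):

hinted_handoff_enabled: true
max_hint_window_in_ms: 10800000    # 3 hours, if I recall the default correctly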