I submitted this issue, which could (in theory) bring a serious performance benefit when using JBOD: https://issues.apache.org/jira/browse/CASSANDRA-8868
However, it was pointed out to me that https://issues.apache.org/jira/browse/CASSANDRA-6696 will be a better solution in a lot of cases.

On Fri, Apr 10, 2015 at 4:13 PM, Robert Coli <rc...@eventbrite.com> wrote:
> On Fri, Apr 10, 2015 at 4:00 PM, Roman Tkachenko <ro...@mailgunhq.com>
> wrote:
>>
>> * Can I just move some SSTables data files from "sstables2" to "sstables1"
>> which has much more free disk space? Will Cassandra start fine after that
>> and not lose any data?
>
> Cassandra generally discovers files in its data directories and treats them
> as legitimate files. I do not have specific knowledge of JBOD behavior here,
> but I would presume it would be the same.
>
>>
>> * Provided multiple data dirs, should Cassandra distribute data equally
>> between them? In what I'm observing this is almost always not true. On that
>> particular node I mentioned above the difference is huge: 4% occupied disk
>> space for "sstables1" and 87% for "sstables2"; on other nodes the situation
>> is a little better but still not 50/50.
>
> No, and especially not when using Size Tiered Compaction.
>
> I honestly wonder why people think JBOD is a useful feature for Cassandra.
> You don't really want to continue to operate a node that has lost half of
> its data, and managing multiple data directories seems relatively likely to
> be more trouble than it's worth. You have a distributed, replicated
> database... just replace nodes when they fail. Anyone care to set me
> straight about the amazing benefits they see which make the costs
> worthwhile?
>
> =Rob
>

--
Jon Haddad
http://www.rustyrazorblade.com
twitter: rustyrazorblade