I just wanted to confirm whether my understanding of how JBOD allocates
device space is correct or not...

Pre-3.2:
On each memtable flush Cassandra selects the directory (device) with the
most available space as a percentage of the total available space across
all of the listed directories/devices. A weighted random value is used so
it won't always pick the directory/device with the most space; the goal is
to balance writes for performance. A sketch of that idea follows below.
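Here is a minimal sketch of that weighted-random idea as I understand it -
this is NOT the actual Cassandra code (which, as I read it, lives in
Directories.getWriteableLocation), and all names here are illustrative:

import java.io.File;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

public class WeightedDirPickSketch
{
    // Pick a data directory at random, weighted by its share of the
    // total free space across all configured directories.
    public static File pickDataDirectory(List<File> dataDirs)
    {
        long totalFree = dataDirs.stream().mapToLong(File::getUsableSpace).sum();
        if (totalFree <= 0)
            throw new IllegalStateException("no usable space on any data directory");

        // Draw a point in [0, totalFree) and walk the directories until
        // the cumulative free space passes it; emptier disks win more often.
        long r = ThreadLocalRandom.current().nextLong(totalFree);
        for (File dir : dataDirs)
        {
            r -= dir.getUsableSpace();
            if (r < 0)
                return dir;
        }
        return dataDirs.get(dataDirs.size() - 1); // guard against rounding/races
    }
}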

As of 3.2:
The ranges of tokens stored on the local node will be evenly distributed
among the configured storage devices - even by token range, even though
that may be uneven in terms of actual partition sizes. The code presumes
that each of the configured local storage devices has the same capacity.
A toy illustration follows below.
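For illustration, here is a toy computation of what "even by token range"
would mean for the Murmur3 token space split across three disks - my own
sketch, not Cassandra's Splitter code, which also accounts for the node's
actual local ranges:

import java.math.BigInteger;

public class TokenSplitSketch
{
    public static void main(String[] args)
    {
        // Murmur3 tokens span the full signed 64-bit range.
        BigInteger min = BigInteger.valueOf(Long.MIN_VALUE);
        BigInteger max = BigInteger.valueOf(Long.MAX_VALUE);
        int disks = 3; // hypothetical JBOD with three data directories

        BigInteger span = max.subtract(min);
        BigInteger lower = min;
        for (int i = 1; i <= disks; i++)
        {
            // Upper bound of the i-th disk's share of the token space
            BigInteger upper = min.add(span.multiply(BigInteger.valueOf(i))
                                           .divide(BigInteger.valueOf(disks)));
            System.out.println("disk " + i + ": (" + lower + ", " + upper + "]");
            lower = upper;
        }
    }
}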

The relevant change in 3.2 appears to be:
Make sure tokens don't exist in several data directories (CASSANDRA-6696)

The code for the pre-3.2 model is still in 3.x - is there some other code
path which will cause the pre-3.2 behavior even when running 3.2 or later?

I see this code which seems to allow for at least some cases where the
pre-3.2 behavior would still be invoked, but I'm not sure which user-level
cases those might be:

if (!cfs.getPartitioner().splitter().isPresent() || localRanges.isEmpty())
    return Collections.singletonList(new FlushRunnable(lastReplayPosition.get(), txn));

return createFlushRunnables(localRanges, txn);

IOW, the pre-3.2 path is taken if the partitioner does not provide a
splitter or the localRanges for the node cannot be determined. But... what
exactly would a user do to cause that? (See the sketch below for my guess.)
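My unverified guess is that only Murmur3Partitioner and RandomPartitioner
provide a splitter, so e.g. ByteOrderedPartitioner would still get the
pre-3.2 flush behavior - a committer would need to confirm. A quick check
one could compile against the 3.x tree:

import org.apache.cassandra.dht.ByteOrderedPartitioner;
import org.apache.cassandra.dht.Murmur3Partitioner;
import org.apache.cassandra.dht.RandomPartitioner;

public class SplitterPresenceCheck
{
    public static void main(String[] args)
    {
        // Assumption to verify: only the first two print "true".
        System.out.println("Murmur3Partitioner:     " + Murmur3Partitioner.instance.splitter().isPresent());
        System.out.println("RandomPartitioner:      " + RandomPartitioner.instance.splitter().isPresent());
        System.out.println("ByteOrderedPartitioner: " + ByteOrderedPartitioner.instance.splitter().isPresent());
    }
}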

There is no doc for this stuff - can a committer (or adventurous user!)
confirm what is actually implemented, both pre and post 3.2? (I already
pinged docs on this.)

Or, if anybody is actually using JBOD, please share what behavior you are
seeing for device space utilization.

Thanks!

-- Jack Krupansky
