On 07/13/2012 08:00 PM, Michael Theroux wrote:
Hello,
I've been trying to understand in greater detail how SStables are stored, and
how information is transferred between Cassandra nodes, especially when a new
node is joining a cluster.
Specifically, is information stored to SStables ordered by rowkeys? Some of
the articles I've read suggest this is the case (although it's a little vague
whether they actually mean that the columns are stored in order, not the rowkeys).
However, if data is stored in rowkey order, how is this achieved, as sstables
are immutable?
Thanks for any insights,
-Mike
It depends on what partitioner you use. You should be using the
RandomPartitioner, and if so, the rows are sorted by the hash of the row
key. There are partitioners that sort based on the raw key value, but
these partitioners shouldn't be used because they tend to spread data
unevenly across the nodes.
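
To make the difference between the two orderings concrete, here is a rough
sketch in Python. It is not Cassandra's code: the token() helper and the
sample keys are made up for illustration, and the token is only an
approximation of what RandomPartitioner does (an MD5 hash of the key read
as a non-negative integer).

```python
import hashlib

def token(row_key: bytes) -> int:
    """Hypothetical helper: derive an integer token from the MD5 of the key."""
    digest = hashlib.md5(row_key).digest()
    # Read the 16-byte digest as a signed integer and drop the sign.
    return abs(int.from_bytes(digest, "big", signed=True))

# Sample row keys, invented for this example.
keys = [b"alice", b"bob", b"carol", b"dave"]

# Order a raw-key (order-preserving) partitioner would use.
by_raw_key = sorted(keys)

# Order a hash-based partitioner would use: sort by token, not by key.
by_token = sorted(keys, key=token)

print("raw key order:", by_raw_key)
print("token order:  ", by_token)
```

The hashed order has no relation to the order of the keys themselves, which
is why similar keys end up scattered around the ring rather than clustered
on one node.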
As for how this is done, remember an sstable doesn't hold all the data
for a column family. Not only does the data for a column family exist on
multiple servers, there are usually multiple sstable files on disk that
represent data from one column family on one machine. So at the time the
sstable is written, the rows that are to be put in the sstable are
sorted, and written in sorted order. In fact the same rowkey may be
written in multiple sstables, one sstable having one set of columns for
the key, the other sstable having other columns for the same key.
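
A toy model of the above, in Python. This is only a sketch under
simplifying assumptions: the in-memory dict, the flush() and token()
helpers, and the sample rows are all hypothetical, and a real sstable is a
set of files on disk rather than a tuple. It shows how immutable files can
still be sorted: the sorting happens once, when the rows are written out.

```python
import hashlib

def token(row_key: str) -> int:
    """Hypothetical helper: MD5-based token used as the sort key."""
    digest = hashlib.md5(row_key.encode()).digest()
    return abs(int.from_bytes(digest, "big", signed=True))

def flush(buffered_rows: dict) -> tuple:
    """Write buffered rows out as an immutable, token-sorted 'sstable'."""
    rows = sorted(buffered_rows.items(), key=lambda item: token(item[0]))
    # Tuples stand in for an on-disk file that is never modified again.
    return tuple((key, tuple(sorted(cols.items()))) for key, cols in rows)

# First batch of writes, buffered in memory and then flushed.
buffered = {}
buffered.setdefault("user42", {})["name"] = "Mike"
buffered.setdefault("user07", {})["name"] = "Ann"
sstable_1 = flush(buffered)

# A later write to the same row key lands in a different sstable.
buffered = {}
buffered.setdefault("user42", {})["email"] = "mike@example.com"
sstable_2 = flush(buffered)

print(sstable_1)
print(sstable_2)
```

Here "user42" appears in both sstables, with one column in the first and a
different column in the second, and neither sstable is touched after it is
written.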
On a query for a row by key, Cassandra is responsible for finding which
sstables (potentially several) hold columns for that key and merging the
results.
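
The read-side merge can be sketched like this, again in Python and again
as a simplification. The read_row() function and the dict-based sstables
are hypothetical. The sketch also assumes that each column carries a write
timestamp and that the newest timestamp wins when the same column shows up
in more than one sstable, which goes a bit beyond what is described above.

```python
def read_row(row_key, sstables):
    """Collect the columns for row_key from every sstable and merge them."""
    merged = {}
    for sstable in sstables:
        for column, (value, timestamp) in sstable.get(row_key, {}).items():
            # Keep the most recently written version of each column.
            if column not in merged or timestamp > merged[column][1]:
                merged[column] = (value, timestamp)
    return {column: value for column, (value, _) in merged.items()}

# Two sstables holding different (and one overlapping) columns for one key.
# Each column maps to a (value, timestamp) pair; all data is invented.
sstable_1 = {"user42": {"name": ("Mike", 1), "city": ("Boston", 1)}}
sstable_2 = {"user42": {"email": ("mike@example.com", 2), "city": ("Denver", 3)}}

row = read_row("user42", [sstable_1, sstable_2])
print(row)
```

The result combines "name" from the first sstable with "email" from the
second, and takes "city" from whichever sstable has the later timestamp.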