Hi

I am researching various hash-tables and b-trees on disk.

While researching, I had some thoughts about Cassandra sstables that I would
like to verify here.

1. A Cassandra sstable is written with sequential disk I/O when it is
created, i.e. the disk head writes it from beginning to end. Assuming the
disk is not fragmented, the sstable is placed on consecutive disk sectors.

2. When Cassandra looks up a key in an sstable (assuming the bloom filter and
other "stuff" did not short-circuit the lookup, and also assuming the key is
located in this single sstable), Cassandra DOES NOT USE sequential I/O. It
will probably read the hash-table slot or a similar structure, and then do
another disk seek to get the value (and probably the key). One more seek
will probably be needed as well, and if there is a key collision, additional
seeks will be needed.

3. Once the data (e.g. the row) is located, a sequential read of the entire
row will occur. (Once again, I assume there is a single, well-compacted
sstable.) Also, if the disk is not fragmented, the row's data will sit on
consecutive disk sectors.
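
To make the three points concrete, here is a toy sketch of the access
pattern I have in mind. It is NOT Cassandra's real file format: the record
layout (two length prefixes, then key, then value), the names write_table
and read_row, and the in-memory dict used as the index are all my own
assumptions for illustration. In the real thing I assume the index lookup
itself would cost a disk read as well.

```python
import io
import struct

def write_table(f, rows):
    """Point 1: append rows in key order, front to back. Returns {key: offset}."""
    index = {}
    for key, value in sorted(rows.items()):
        index[key] = f.tell()          # remember where this row starts
        k, v = key.encode(), value.encode()
        f.write(struct.pack(">II", len(k), len(v)))  # hypothetical header
        f.write(k)
        f.write(v)
    return index

def read_row(f, index, key):
    """Points 2 and 3: one seek to the row, then a sequential read of it."""
    offset = index.get(key)            # stand-in for the on-disk index lookup
    if offset is None:
        return None
    f.seek(offset)                     # the random I/O step
    klen, vlen = struct.unpack(">II", f.read(8))
    stored_key = f.read(klen).decode()
    if stored_key != key:
        return None
    return f.read(vlen).decode()       # contiguous read of the whole row

# BytesIO stands in for the data file on disk.
table = io.BytesIO()
index = write_table(table, {"nick": "row-1", "alice": "row-2", "bob": "row-3"})
print(read_row(table, index, "bob"))
```

In this toy model the write never seeks, while each lookup costs one seek
followed by a contiguous read. My question is whether the real sstable
behaves roughly like this.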

Am I wrong?

Nick.
