Bloom filters can pick the right SSTables to read with high probability (false-positive rate < 0.1%). In reality, even if you are using size-based compaction and have about 300 SSTables, reads are fast unless the row is fragmented across many SSTables and you are reading the entire row.

Then for each read, Cassandra will go through all the SSTables (or one SSTable in each level for the leveled compaction strategy)? How to deal with this problem?
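To illustrate the point about Bloom filters, here is a minimal sketch in Python of how a per-SSTable filter lets a read skip most tables. It is a toy model, not Cassandra's code: the `BloomFilter` and `SSTable` classes, the `read` function, and the sizes (65536 bits, 7 hashes, 300 tables of 100 keys each) are all assumptions made up for the example.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: may return false positives, never false negatives."""

    def __init__(self, num_bits=1 << 16, num_hashes=7):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))


class SSTable:
    """Hypothetical stand-in for an on-disk SSTable with its own filter."""

    def __init__(self, rows):
        self.rows = dict(rows)
        self.bloom = BloomFilter()
        for key in self.rows:
            self.bloom.add(key)


def read(sstables, key):
    # Only tables whose filter says "maybe" are actually opened.
    candidates = [t for t in sstables if t.bloom.might_contain(key)]
    values = [t.rows[key] for t in candidates if key in t.rows]
    return values, len(candidates)


# 300 tables with random-looking, non-overlapping keys.
sstables = [SSTable({f"key-{n}-{i}": n for i in range(100)}) for n in range(300)]
values, touched = read(sstables, "key-42-7")
print(values, touched)
```

With a low false-positive rate, `touched` stays close to the number of tables that really hold the key, even with 300 tables. A row fragmented across many tables is the bad case: every filter correctly answers "maybe", so all of those tables have to be read.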
- random keys and overlapping key ranges in SSTables Kent Tong
- Re: random keys and overlapping key ranges in SSTables Radim Kolar
- Re: random keys and overlapping key ranges in SSTables Kent Tong