Oh duh, RACS does this already. It would still be nice to get some education on how bloom filter memory use relates to the number of sstables.

On Wed, Jul 25, 2018 at 10:41 AM Carl Mueller <carl.muel...@smartthings.com> wrote:

> It would seem to me that if the replicated data managed by a node is in
> separate sstables from the "main" data it manages, when a new node came
> online it would be easier to discard the data it no longer is responsible
> for since it was shifted a slot down the ring.
>
> Generally speaking I've been asking lots of questions about sstables that
> would increase the number of them. It is my impression that the size of
> bloom filters is linearly proportional to the number of hash keys
> contained in the sstables of a particular node. Is that true?
>
> We also want to avoid massive numbers of sstables mostly due to
> filesystem/inode problems? Because the end state of me suggesting sstables
> be segmented by RACS, primary/replicated, and possibly application-specific
> separations would impose say 5-10x more sstables, even though the absolute
> amount of data and partition keys wouldn't change.
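
For what it's worth, the linear-size question can be checked with the textbook Bloom filter sizing formula, m = -n * ln(p) / (ln 2)^2 bits for n keys at false-positive rate p. The sketch below is only that generic formula, not Cassandra's actual implementation, and it assumes each partition key lives in exactly one sstable and that keys are split evenly. The helper names `bloom_bits` and `total_bloom_bytes` and the 0.01 false-positive rate are made up for illustration.

```python
import math


def bloom_bits(n_keys: int, fp_chance: float) -> int:
    """Bits needed by an optimally sized standard Bloom filter.

    Textbook formula: m = -n * ln(p) / (ln 2)^2.
    This is NOT taken from Cassandra's source; it is the generic estimate.
    """
    if n_keys <= 0:
        return 0
    return math.ceil(-n_keys * math.log(fp_chance) / (math.log(2) ** 2))


def total_bloom_bytes(total_keys: int, n_sstables: int, fp_chance: float = 0.01) -> int:
    """Total filter size when total_keys are split evenly over n_sstables.

    Assumption: every key appears in exactly one sstable. A partition that
    is spread over several sstables would be counted once per sstable.
    """
    base, remainder = divmod(total_keys, n_sstables)
    bits = 0
    for i in range(n_sstables):
        # The first `remainder` sstables take one extra key each.
        keys_here = base + (1 if i < remainder else 0)
        bits += bloom_bits(keys_here, fp_chance)
    return math.ceil(bits / 8)


if __name__ == "__main__":
    keys = 10_000_000
    for count in (1, 5, 10, 100):
        size_mib = total_bloom_bytes(keys, count) / (1024 * 1024)
        print(f"{count:>4} sstables: {size_mib:.2f} MiB of bloom filter")
```

Under those assumptions the total stays essentially flat as the sstable count grows, because the per-filter size is linear in the keys it holds. Where more sstables would really cost extra filter memory is when the same partition ends up in several of them, plus whatever fixed per-sstable overhead the real implementation carries.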