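Hi Will,

You can run 'nodetool upgradesstables'. This rewrites the SSTables and regenerates the bloom filters for those tables, which should reduce their memory usage. Depending on your Cassandra version, you may need to add the -a flag so that SSTables already on the current format version are rewritten too.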
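As a rough sanity check on how much you might expect to get back, here is a minimal sketch using the textbook Bloom filter sizing formula, m = -n * ln(p) / (ln 2)^2. This is not Cassandra's exact implementation, and the FP chance of 0.01 is an assumption (check the value actually set on your CF). The row counts are the approximate ones from your earlier message.

```python
import math


def bloom_filter_bits(n_keys: int, fp_chance: float) -> float:
    """Textbook optimal Bloom filter size in bits for n_keys at fp_chance.

    Uses m = -n * ln(p) / (ln 2)^2. Cassandra's real sizing may differ.
    """
    if n_keys < 0:
        raise ValueError("n_keys must be non-negative")
    if not 0.0 < fp_chance < 1.0:
        raise ValueError("fp_chance must be strictly between 0 and 1")
    return -n_keys * math.log(fp_chance) / (math.log(2) ** 2)


def bits_to_gib(bits: float) -> float:
    """Convert a size in bits to GiB."""
    return bits / 8 / 2**30


FP_CHANCE = 0.01              # assumption, not taken from the thread
ROWS_BEFORE = 1_000_000_000   # ~1G rows before the delete
ROWS_AFTER = 100_000_000      # ~100M rows left after removing ~900M

before_bits = bloom_filter_bits(ROWS_BEFORE, FP_CHANCE)
after_bits = bloom_filter_bits(ROWS_AFTER, FP_CHANCE)

print(f"before: {bits_to_gib(before_bits):.2f} GiB")
print(f"after:  {bits_to_gib(after_bits):.2f} GiB")
print(f"ratio:  {before_bits / after_bits:.1f}x")
```

The point is only that, for a fixed FP chance, the filter size scales linearly with the number of keys, so rebuilt filters should end up roughly in proportion to the rows that remain.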
Mark

On Mon, Apr 14, 2014 at 3:16 PM, William Oberman <ober...@civicscience.com> wrote:

> Ah, so I could change the chance value to "poke it". Good to know!
>
> On Mon, Apr 14, 2014 at 10:12 AM, Michal Michalski <michal.michal...@boxever.com> wrote:
>
>> Sorry, I misread the question - I thought you've also changed FP chance
>> value, not only removed the data.
>>
>> Kind regards,
>> Michał Michalski,
>> michal.michal...@boxever.com
>>
>> On 14 April 2014 15:07, Michal Michalski <michal.michal...@boxever.com> wrote:
>>
>>> Did you set Bloom Filter's FP chance before or after the step 3) above?
>>> If you did it before, C* should build Bloom Filters properly. If not -
>>> that's the reason.
>>>
>>> Kind regards,
>>> Michał Michalski,
>>> michal.michal...@boxever.com
>>>
>>> On 14 April 2014 15:04, William Oberman <ober...@civicscience.com> wrote:
>>>
>>>> I didn't cross link my thread, but the basic idea is I've done:
>>>>
>>>> 1.) Process that deleted ~900M of ~1G rows from a CF
>>>> 2.) Set GCGraceSeconds to 0 on CF
>>>> 3.) Run nodetool compact on all N nodes
>>>>
>>>> And I checked, and all N nodes have bloom filters using 1.5 +/- .2 GB
>>>> of RAM (I didn't explicitly write down the before numbers, but they seem
>>>> about the same). So, compaction didn't change the BF's (unless cassandra
>>>> needs a 2nd compaction to see all of the data cleared by the 1st
>>>> compaction).
>>>>
>>>> will
>>>>
>>>> On Mon, Apr 14, 2014 at 9:52 AM, Michal Michalski <michal.michal...@boxever.com> wrote:
>>>>
>>>>> Bloom filters are built on creation / rebuild of SSTable. If you
>>>>> removed the data, but the old SSTables weren't compacted or you didn't
>>>>> rebuild them manually, bloom filters will stay the same size.
>>>>>
>>>>> M.
>>>>>
>>>>> Kind regards,
>>>>> Michał Michalski,
>>>>> michal.michal...@boxever.com
>>>>>
>>>>> On 14 April 2014 14:44, William Oberman <ober...@civicscience.com> wrote:
>>>>>
>>>>>> I had a thread on this forum about clearing junk from a CF. In my
>>>>>> case, it's ~90% of ~1 billion rows.
>>>>>>
>>>>>> One side effect I had hoped for was a reduction in the size of the
>>>>>> bloom filter. But, according to nodetool cfstats, it's still fairly
>>>>>> large (~1.5GB of RAM).
>>>>>>
>>>>>> Do bloom filters ever resize themselves when the CF suddenly gets
>>>>>> smaller?
>>>>>>
>>>>>> My next test will be restarting one of the instances, though I'll
>>>>>> have to wait on that operation so I thought I'd ask in the meantime.
>>>>>>
>>>>>> will