On Mon, Nov 21, 2011 at 11:47 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:

> On Mon, Nov 21, 2011 at 3:30 AM, Philippe <watche...@gmail.com> wrote:
>>
>> I don't remember your exact situation, but could it be your network
>> connectivity? I know I've been upgrading mine because I'm maxing out
>> Fast Ethernet on a 12-node cluster.
>>
>> On Nov 20, 2011, 22:54, "Jahangir Mohammed" <md.jahangi...@gmail.com>
>> wrote:
>>>
>>> Mostly, they are I/O and CPU intensive during major compaction. If
>>> Ganglia doesn't show anything suspicious there, then what is the
>>> performance loss? Read or write?
>>>
>>> On Nov 17, 2011 1:01 PM, "Maxim Potekhin" <potek...@bnl.gov> wrote:
>>>>
>>>> In view of my unpleasant discovery last week that deletions in
>>>> Cassandra lead to a very real and serious performance loss, I'm
>>>> working on a strategy for moving forward.
>>>>
>>>> If the tombstones do cause such a problem, where should I be looking
>>>> for performance bottlenecks? Is it disk, CPU or something else? Thing
>>>> is, I don't see anything outstanding in my Ganglia plots.
>>>>
>>>> TIA,
>>>>
>>>> Maxim
>>>>
>
> Tombstones do have a performance impact, particularly in cases where there
> is a lot of data turnover and you are using the standard (non-LevelDB)
> compaction. Tombstones live on disk for gc_grace_seconds. First, the
> tombstone takes up some small amount of space, which has an effect on disk
> caching. Second, tombstones have an effect on the read path through the
> bloom filters, since a read for a row key will now match multiple bloom
> filters.
>
> If you are constantly adding and removing data and you have a long
> gc_grace_seconds (10 days is pretty long if your dataset is new every day,
> for example), this is more pronounced than in a use case that rarely
> deletes. This is why you will notice some use cases call for 'major
> compaction' while other people believe you should never need it.
>
> I force major compactions on some column families because there is high
> turnover and the data needs to be read often, and the difference in data
> size is the difference between 20 GB on disk that fits in the VFS cache
> and 35 GB on disk that doesn't (and that also may 'randomly' trigger a
> large compaction at peak time).
>
> I am pretty excited about LevelDB-style compaction because it looks to be
> more space efficient.
Have you had a chance to benchmark the LevelDB compaction?
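To put rough numbers on the turnover point above, here is a minimal back-of-envelope sketch. The per-day data volume and tombstone overhead fraction are made-up assumptions for illustration, not figures from this thread; only the 10-day gc_grace_seconds default is real.

    # Rough sketch: how much dead data can sit on disk when a dataset
    # turns over daily but tombstones are kept for gc_grace_seconds.
    # daily_data_gb and tombstone_overhead are hypothetical values.

    daily_data_gb = 2.0        # assumed: ~2 GB of fresh data written per day
    gc_grace_days = 10         # default gc_grace_seconds = 864000 s = 10 days
    tombstone_overhead = 0.05  # assumed fraction of row size a tombstone occupies

    # Live data is only the most recent day's worth...
    live_gb = daily_data_gb

    # ...but rows from every previous day linger as tombstones (plus any
    # shadowed data not yet compacted away) until gc_grace has passed.
    dead_gb_floor = daily_data_gb * gc_grace_days * tombstone_overhead
    dead_gb_ceiling = daily_data_gb * gc_grace_days  # if old SSTables never compact

    print(f"live data:            {live_gb:.1f} GB")
    print(f"tombstone-only floor: {dead_gb_floor:.1f} GB")
    print(f"uncompacted ceiling:  {dead_gb_ceiling:.1f} GB")

Even at the optimistic floor, a daily-rewrite workload with the default gc_grace_seconds carries far more dead weight relative to live data than one that rarely deletes, which is why shortening gc_grace_seconds or forcing majors helps so much in that case.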