Adi, just to make sure my calculation is correct: the configured ops threshold is ~2m and we have 6 nodes, so does that mean each node's threshold is around 300k? I do see that when flushing happens, ops is about 300k, with several flushes at 500k. It seems the ops threshold is throttling us.
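A quick sanity check of the arithmetic above — a hypothetical sketch that assumes, as the question does, that the configured ~2M operations threshold is split evenly across the 6 nodes (this is the questioner's model, not confirmed Cassandra behavior):

```python
# Hypothetical sanity check: divide the configured ops threshold by node count.
# Assumption (from the question, not confirmed): the ~2M threshold is
# spread evenly across the 6 nodes.
configured_ops_threshold = 2_000_000
nodes = 6

per_node = configured_ops_threshold / nodes
print(f"~{per_node:,.0f} ops per node")  # ~333,333 ops per node
```

~333k per node is indeed "around 300k", consistent with the flush sizes observed in the logs.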
On Sep 7, 2011, at 11:31 AM, Adi wrote:

> On Wed, Sep 7, 2011 at 2:09 PM, Hefeng Yuan <hfy...@rhapsody.com> wrote:
> We didn't change MemtableThroughputInMB/min/maxCompactionThreshold; they're 499/4/32.
> As for why we're flushing at ~9m, I guess it has to do with this:
> http://thelastpickle.com/2011/05/04/How-are-Memtables-measured/
> The only parameter I tried to play with is compaction_throughput_mb_per_sec; I tried cutting it in half and doubling it, but neither helped avoid the simultaneous compactions on nodes.
>
> I agree that we don't necessarily need to add nodes, as long as we have a way to avoid simultaneous compaction on 4+ nodes.
>
> Thanks,
> Hefeng
>
> Can you check in the logs for something like this:
> ...... Memtable.java (line 157) Writing Memtable-<ColumnFamilyName>@1151031968(67138588 bytes, 47430 operations)
> to see the bytes/operations at which the column family gets flushed. In case you are hitting the operations threshold, you can try increasing it to a high number. The operations threshold is getting hit at less than 2% of the size threshold. I would try bumping up memtable_operations substantially. The default is 1.1624999999999999 (in millions). Try 10 or 20 and see if your CF flushes at a higher size. Keep adjusting it until the frequency/size of flushing becomes satisfactory and hopefully reduces the compaction overhead.
>
> -Adi
>
> On Sep 7, 2011, at 10:51 AM, Adi wrote:
>
>> On Wed, Sep 7, 2011 at 1:09 PM, Hefeng Yuan <hfy...@rhapsody.com> wrote:
>> Adi,
>>
>> The reason we're attempting to add more nodes is to address the long/simultaneous compactions, i.e. the performance issue, not a storage issue yet.
>> We have RF 5 and CL QUORUM for reads and writes, and currently 6 nodes; when 4 nodes are compacting in the same period we're screwed, especially on reads, since a read will cover one of the compacting nodes anyway.
>> My assumption is that if we add more nodes, each node will have less load, and therefore need less compaction, and will probably compact faster, eventually avoiding 4+ nodes compacting simultaneously.
>>
>> Any suggestion on how to calculate how many more nodes to add? Or, more generally, how to plan the number of nodes required from a performance perspective?
>>
>> Thanks,
>> Hefeng
>>
>> Adding nodes to delay and reduce compaction is an interesting performance use case :-) I am thinking you can find a smarter/cheaper way to manage that.
>> Have you looked at:
>>
>> a) Increasing memtable throughput
>> What is the nature of your writes? Is it mostly inserts, or also a lot of quick updates of recently inserted data? Increasing memtable_throughput can delay and maybe reduce the compaction cost if you have lots of updates to the same data. You will have to provide for memory if you try this.
>> When you mentioned "with ~9m serialized bytes", is that the memtable throughput? That is quite a low threshold, and it will result in a large number of SSTables needing to be compacted. I think the default is 256 MB, and on the lower end the values I have seen are 64 MB or maybe 32 MB.
>>
>> b) Tweaking min_compaction_threshold and max_compaction_threshold
>> - Increasing min_compaction_threshold will delay compactions.
>> - Decreasing max_compaction_threshold will reduce the number of SSTables per compaction cycle.
>> Are you using the defaults (4/32), or are you trying different values?
>>
>> c) Splitting column families
>> Splitting column families can also help, because compactions occur serially, one CF at a time, and that spreads your compaction cost over time and column families. It requires a change in app logic, though.
>>
>> -Adi
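To make the dual-threshold behavior in this thread concrete, here is a minimal illustrative model (not Cassandra source; the function and parameter names are hypothetical) of the flush logic Adi describes: a memtable flushes when either the size threshold or the operations threshold is reached, whichever comes first — so a low ops threshold can force flushes long before the size threshold matters.

```python
# Illustrative model of the memtable flush triggers discussed above
# (hypothetical helper, not actual Cassandra code).
def flush_trigger(serialized_bytes, operations,
                  throughput_mb=499, ops_millions=1.1625):
    """Return which threshold (if any) would trigger a flush."""
    size_hit = serialized_bytes >= throughput_mb * 1024 * 1024
    ops_hit = operations >= ops_millions * 1_000_000

    if ops_hit and size_hit:
        return "both"
    if ops_hit:
        return "operations"
    if size_hit:
        return "size"
    return None

# With the thread's numbers (499 MB throughput, ~333k ops per node),
# a memtable at ~9 MB serialized bytes flushes on operations, far below
# the size threshold -- matching what the logs show:
print(flush_trigger(9_000_000, 333_000, ops_millions=0.333))  # operations
```

This is why bumping memtable_operations (to 10 or 20 million, as suggested above) lets memtables grow until the size threshold, rather than the ops threshold, governs flushing.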