> I side-tracked some punctual benchmarks and stumbled on the observations of unlogged inserts being *A LOT* faster than the async counterparts.

My own testing agrees very strongly with this.

When this topic came up on this list before, there was a concern that batch coordination produces GC pressure in your cluster, because you're involving nodes that, *strictly speaking*, don't need to be involved. Our own testing shows a small impact on this front, but lightweight GC tuning mitigated it: we gave Xmn a little more room (relevant if you're still on the CMS garbage collector). On G1GC (which is what we run in production) we weren't able to measure a difference.

Our testing shows data loads running 5x to 8x faster with small concurrent batches than with single statements issued concurrently. We tried three different concurrency models.

To save on coordinator overhead, we group the statements in our "batch" by replica (using the functionality exposed by the DataStax Java driver), which is essentially token-aware batching. This still has a *small* amount of additional coordinator overhead, since the unit of work is larger and sits in memory on the coordinator longer. We've been running this way successfully for months with *sustained* rates north of 50,000 mutations per second. We burst *much* higher.

Through trial and error we found diminishing returns at around 100 statements per token-aware batch. It looks like your own data bears that out as well. I'm sure that's workload dependent, though.

People on this list have disagreed with me on this topic in the past, despite the numbers I was able to post. Nobody has shown me numbers (or anything else concrete) that contradict my position, though, so I stand by it.

There's no question in my mind: if your mutations are of any significant volume and you care about their performance, token-aware unlogged batching is the right strategy. When we reduce our batch sizes or switch to single async statements, we fall over immediately.
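
For anyone who wants to try the grouping step, here is a rough sketch of the idea rather than our production code. It is driver-free Python: `replicas_for` is a hypothetical stand-in for whatever replica lookup your driver exposes, and `max_batch_size` defaults to the roughly 100 statements mentioned above. Each resulting group would then be sent as one unlogged batch.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, List, Tuple, TypeVar

S = TypeVar("S")  # whatever object represents a single statement


def token_aware_batches(
    statements: Iterable[S],
    replicas_for: Callable[[S], Hashable],  # hypothetical replica lookup
    max_batch_size: int = 100,
) -> List[Tuple[Hashable, List[S]]]:
    """Group statements by replica set, then cap each group at max_batch_size.

    Returns (replica_set, statements) pairs. Every statement in a pair maps
    to the same replica set, so one coordinator that owns the data can
    handle the whole batch.
    """
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")

    # Bucket statements by the replica set that owns their partition.
    groups: Dict[Hashable, List[S]] = defaultdict(list)
    for stmt in statements:
        groups[replicas_for(stmt)].append(stmt)

    # Split each bucket into chunks no larger than max_batch_size.
    batches: List[Tuple[Hashable, List[S]]] = []
    for replica_set, stmts in groups.items():
        for start in range(0, len(stmts), max_batch_size):
            batches.append((replica_set, stmts[start:start + max_batch_size]))
    return batches


# Toy usage: integer "statements" and a made-up three-node ring in which
# the owning replica is simply key % 3.
example = token_aware_batches(range(10), lambda key: (key % 3,), max_batch_size=2)
```

The batch size cap is the knob to tune for your own workload; the grouping itself is what keeps the extra coordinator work small.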

On Tue, Sep 22, 2015 at 7:54 AM, Gerard Maas <gerard.m...@gmail.com> wrote:

> General advice advocates for individual async inserts as the fastest way
> to insert data into Cassandra. Our insertion mechanism is based on that
> model and recently we have been evaluating performance, looking to measure
> and optimize our ingestion rate.
>
> I side-tracked some punctual benchmarks and stumbled on the observations
> of unlogged inserts being *A LOT* faster than the async counterparts.
>
> In our tests, unlogged batch shows increased throughput and lower cluster
> CPU usage, so I'm wondering where the tradeoff might be.
>
> I compiled those observations in this document that I'm sharing and
> opening up for comments. Are we observing some artifact or should we set
> the record straight for unlogged batches to achieve better insertion
> throughput?
>
> https://docs.google.com/document/d/1qSIJ46cmjKggxm1yxboI-KhYJh1gnA6RK-FkfUg6FrI
>
> Let me know.
>
> Kind regards,
>
> Gerard.