Generally this is all correct, but I cannot emphasize enough how much this
"just depends" on the workload. Today I generally move people to async
inserts first before trying to micro-optimize. Some things to keep in mind:

Compaction is usually the limiter for most clusters, so the difference
between async and unlogged batch ends up being minor or, worse,
non-existent, because the hardware and data model combination results in
compaction being the main throttle.
If you add token awareness to your batches, you've basically eliminated the
primary complaint about unlogged batches, so why not do that? When I was at
DataStax I made similar suggestions for token-aware batching after seeing
the perf improvements with Spark writes using unlogged batch. Several others
did as well, so I'm not the first one with this idea.
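
If it helps make that concrete, here is a minimal, self-contained sketch of
the grouping step. It deliberately avoids the driver: the replicaFor() hash
below is a stand-in assumption for asking the driver's token metadata (e.g.
Metadata.getReplicas() in the DataStax Java driver) which host owns a key.

```java
import java.util.*;

public class TokenAwareGrouping {
    // Stand-in for real token metadata: map a partition key to one of
    // numNodes owners. A real implementation would ask the driver which
    // host owns the key's token range instead of hashing like this.
    static int replicaFor(String partitionKey, int numNodes) {
        return Math.floorMod(partitionKey.hashCode(), numNodes);
    }

    // Group writes by owning replica so each unlogged batch contains
    // only rows its coordinator can serve locally (no extra hops).
    static Map<Integer, List<String>> groupByReplica(List<String> keys,
                                                     int numNodes) {
        Map<Integer, List<String>> groups = new HashMap<>();
        for (String key : keys) {
            groups.computeIfAbsent(replicaFor(key, numNodes),
                                   r -> new ArrayList<>()).add(key);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("user:1", "user:2", "user:3",
                                          "user:4", "user:5", "user:6");
        Map<Integer, List<String>> groups = groupByReplica(keys, 3);
        int total = groups.values().stream().mapToInt(List::size).sum();
        System.out.println(total == keys.size()); // prints true
        // Each group would then become one unlogged batch sent to the
        // replica that owns those keys.
        for (Map.Entry<Integer, List<String>> e : groups.entrySet()) {
            System.out.println("replica " + e.getKey() + " -> "
                               + e.getValue().size() + " writes");
        }
    }
}
```

Each resulting group would then be sent as one unlogged batch, which is the
"token aware batch" idea in a nutshell.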
Write size makes, in my experience, BY FAR the largest difference in which
approach is faster, and the statement count is largely irrelevant compared
to the total payload size. Depending on the hardware and so on, a good rule
of thumb is that writes below 1k bytes tend to get really inefficient and
writes over 100k bytes tend to slow down total throughput. I'll re-emphasize
that this magic number has been different on almost every cluster I've
tuned.
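
In other words, size your batches by accumulated payload rather than by
statement count. A hedged sketch of that (the 5 KB cap below is purely
illustrative; as noted, the right number differs per cluster):

```java
import java.util.*;

public class SizeBoundedBatcher {
    // Illustrative cap only: treat this as a tunable you benchmark per
    // cluster, not a recommendation.
    static final int MAX_BATCH_BYTES = 5_000;

    // Split serialized writes into batches by accumulated payload size
    // rather than by a fixed statement count.
    static List<List<byte[]>> batchBySize(List<byte[]> writes, int maxBytes) {
        List<List<byte[]>> batches = new ArrayList<>();
        List<byte[]> current = new ArrayList<>();
        int currentBytes = 0;
        for (byte[] w : writes) {
            // Flush the current batch when adding this write would
            // push it past the payload cap.
            if (!current.isEmpty() && currentBytes + w.length > maxBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(w);
            currentBytes += w.length;
        }
        if (!current.isEmpty()) batches.add(current);
        return batches;
    }

    public static void main(String[] args) {
        List<byte[]> writes = new ArrayList<>();
        for (int i = 0; i < 10; i++) writes.add(new byte[1_200]); // ~1.2 KB rows
        List<List<byte[]>> batches = batchBySize(writes, MAX_BATCH_BYTES);
        System.out.println(batches.size()); // prints 3 (4 + 4 + 2 rows)
    }
}
```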

In summary, all this means is: writes that are too small or too large are
slow, and unlogged batches may involve some extra hops. If you eliminate the
extra hops through token awareness, then it just comes down to write-size
optimization.

> On Sep 24, 2015, at 5:18 PM, Eric Stevens <migh...@gmail.com> wrote:
> 
> > I side-tracked into some ad-hoc benchmarks and stumbled on the observation 
> > of unlogged inserts being *A LOT* faster than the async counterparts.
> 
> My own testing agrees very strongly with this.  When this topic came up on 
> this list before, there was a concern that batch coordination produces GC 
> pressure in your cluster because you're involving nodes which aren't strictly 
> speaking necessary to be involved.  
> 
> Our own testing shows some small impact on this front, but really lightweight 
> GC tuning mitigated the effects by putting a little more room in Xmn (if 
> you're still on CMS garbage collector).  On G1GC (which is what we run in 
> production) we weren't able to measure a difference. 
> 
> Our testing shows data loads being as much as 5x to 8x faster when using 
> small concurrent batches over using single statements concurrently.  We tried 
> three different concurrency models.
> 
> To save on coordinator overhead, we group the statements in our "batch" by 
> replica (using the functionality exposed by the DataStax Java driver), and do 
> essentially token aware batching.  This still has a small amount of 
> additional coordinator overhead (since the data size of the unit of work is 
> larger, and sits in memory in the coordinator longer).  We've been running 
> this way successfully for months with sustained rates north of 50,000 mutates 
> per second.  We burst much higher.
> 
> Through trial and error we determined we got diminishing returns in the realm 
> of 100 statements per token-aware batch.  It looks like your own data bears 
> that out as well.  I'm sure that's workload dependent though.
> 
> I've been disagreed with on this topic in this list in the past despite the 
> numbers I was able to post.  Nobody has shown me numbers (nor anything else 
> concrete) that contradict my position though, so I stand by it.  There's no 
> question in my mind, if your mutates are of any significant volume and you 
> care about the performance of them, token aware unlogged batching is the 
> right strategy.  When we reduce our batch sizes or switch to single async 
> statements, we fall over immediately.  
> 
> On Tue, Sep 22, 2015 at 7:54 AM, Gerard Maas <gerard.m...@gmail.com> wrote:
> General advice advocates for individual async inserts as the fastest way to 
> insert data into Cassandra. Our insertion mechanism is based on that model 
> and recently we have been evaluating performance, looking to measure and 
> optimize our ingestion rate.
> 
> I side-tracked into some ad-hoc benchmarks and stumbled on the observation 
> of unlogged inserts being *A LOT* faster than the async counterparts.
> 
> In our tests, unlogged batch shows increased throughput and lower cluster CPU 
> usage, so I'm wondering where the tradeoff might be.
> 
> I compiled those observations in this document that I'm sharing and opening 
> up for comments.  Are we observing some artifact or should we set the record 
> straight for unlogged batches to achieve better insertion throughput?
> 
> https://docs.google.com/document/d/1qSIJ46cmjKggxm1yxboI-KhYJh1gnA6RK-FkfUg6FrI
> 
> Let me know.
> 
> Kind regards, 
> 
> Gerard.
> 

Regards,

Ryan Svihla
