Reviewing patch: CASSANDRA-7882

2014-12-01 Thread Jay Patel
Hi All, Can anyone help review this patch? It has been more than 2 months and we're going live soon. It's currently assigned to Benedict for review. Let me know if I should reassign it to someone else. Thanks for your help! Thanks, Jay

Re: Performance Difference between Batch Insert and Bulk Load

2014-12-01 Thread Dong Dai
Thanks Ryan, and also thanks for your great blog post. However, this makes me more confused, mainly about the coordinators. Based on my understanding, whether it is a batch insert, an ordinary sync insert, or an async insert, the coordinator is selected only once for the whole session by calling

Re: Performance Difference between Batch Insert and Bulk Load

2014-12-01 Thread Ryan Svihla
So there is a bit of a misunderstanding about the role of the coordinator in all this. If you use an UNLOGGED BATCH and all of those writes share the same partition key, then yes, it's a saving and acts as one mutation. If they don't, however, you're asking the coordinator node to do work the clie
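
The single-partition case described above can be sketched in Python. This is a minimal illustration, not code from the thread: it groups rows by partition key on the client and builds one UNLOGGED BATCH per partition, so no batch spans partitions. The table name `readings`, the column names, and both helper functions are hypothetical, and the sketch only produces CQL text with `?` placeholders; it does not talk to a cluster.

```python
from collections import defaultdict


def group_by_partition(rows, partition_key):
    """Group rows so that every group shares one partition key value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[partition_key]].append(row)
    return dict(groups)


def build_unlogged_batches(rows, table, partition_key):
    """Return one (key, cql, params) tuple per partition.

    Each CQL string is an UNLOGGED BATCH whose inserts all target the
    same partition, which is the case the message describes as acting
    as one mutation.
    """
    batches = []
    for key, group in group_by_partition(rows, partition_key).items():
        statements = []
        for row in group:
            columns = ", ".join(row)
            placeholders = ", ".join("?" for _ in row)
            statements.append(
                f"INSERT INTO {table} ({columns}) VALUES ({placeholders});"
            )
        cql = (
            "BEGIN UNLOGGED BATCH\n"
            + "\n".join("  " + s for s in statements)
            + "\nAPPLY BATCH;"
        )
        params = [tuple(row.values()) for row in group]
        batches.append((key, cql, params))
    return batches


if __name__ == "__main__":
    # Hypothetical sample data: sensor_id is assumed to be the partition key.
    sample = [
        {"sensor_id": "a", "ts": 1, "value": 10},
        {"sensor_id": "b", "ts": 1, "value": 20},
        {"sensor_id": "a", "ts": 2, "value": 11},
    ]
    for key, cql, params in build_unlogged_batches(sample, "readings", "sensor_id"):
        print(key, params)
        print(cql)
```

Each resulting batch could then be sent as its own request, so rows bound for different partitions are never packed into one batch for a single coordinator to fan out.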

Re: Performance Difference between Batch Insert and Bulk Load

2014-12-01 Thread Dong Dai
Thanks a lot for the reply, Raj. I understand they are different. But if we define a batch as UNLOGGED, it no longer guarantees an atomic transaction and becomes more like a data import tool. As far as I know, the BATCH statement packs several mutations into one RPC to save time. Similarly

Re: Performance Difference between Batch Insert and Bulk Load

2014-12-01 Thread Rajanarayanan Thottuvaikkatumana
The BATCH statement and bulk load are totally different things. The BATCH statement belongs to the atomic transaction space and provides a way to make more than one statement into an atomic unit, while the bulk loader provides the ability to bulk load external data into a cluster. Two are totally differen