Guys, please move this discussion to the users mailing list. This one is for Cassandra committers and other contributors to discuss the development of Cassandra itself.
-- AY

> On Dec 2, 2014, at 16:17, Ryan Svihla <rsvi...@datastax.com> wrote:
>
> I misspoke:
>
> "That's all correct but what you're not accounting for is if you use a
> token aware client then the coordinator will likely not own all the data
> in a batch"
>
> should just be
>
> "That's all correct but what you're not accounting for is the coordinator
> will likely not own all the data in a batch"
>
> Token awareness has no effect on that fact.
>
>> On Tue, Dec 2, 2014 at 9:13 AM, Ryan Svihla <rsvi...@datastax.com> wrote:
>>
>>> On Mon, Dec 1, 2014 at 1:52 PM, Dong Dai <daidon...@gmail.com> wrote:
>>>
>>> Thanks Ryan, and also thanks for your great blog post.
>>>
>>> However, this makes me more confused, mainly about the coordinators.
>>> Based on my understanding, whether it is a batch insert, an ordinary
>>> sync insert, or an async insert, the coordinator is selected only once
>>> for the whole session, by calling cluster.connect(), and after that all
>>> inserts go through that coordinator.
>>
>> That's all correct, but what you're not accounting for is that the
>> coordinator will likely not own all the data in a batch, ESPECIALLY as
>> you scale up to more nodes. If you are using executeAsync and a single
>> row, then the coordinator node will always be an owner of the data,
>> thereby minimizing network hops. Some people stop me at this point and
>> say "but the client is making those hops!", and that's when I point out
>> "what do you think the coordinator has to do": only you've introduced
>> something in the middle and prevented token awareness from doing its
>> job. The savings in latency are particularly huge if you write at more
>> than consistency level ONE.
>>
>>> If this is not the case, and the clients do more work, like
>>> distributing each insert to a different coordinator based on its
>>> partition key, then it is understandable that a large volume of
>>> UNLOGGED BATCH would cause a bottleneck on the coordinator server.
>>> However, this should not be hard to solve by distributing the inserts
>>> in one batch to different coordinators based on their partition keys.
>>> I am curious why this is not supported.
>>
>> The coordinator node does this today, of course, but that is the very
>> bottleneck to which you refer. To make what you're proposing work, you'd
>> have to enhance the CLIENT to make sure that all the objects in that
>> batch were actually owned by the coordinator itself, and if you're
>> talking about parsing a CQL BATCH on the client and splitting it out to
>> the appropriate nodes in some sort of hyper token awareness, then you're
>> taking a server-side responsibility (CQL parsing) and moving it to the
>> client. Worse, you're asking for a number of bugs to occur by moving CQL
>> parsing to the client: do all clients handle this the same way? What
>> happens to older Thrift clients with batches? Etc.
>>
>> Final point: every time you do a batch, you're adding extra load on the
>> heap of the coordinator node that could instead be on the client. This
>> cannot be stated strongly enough. In production, doing large batches
>> (say over 5k statements) is a wonderful way to make your node spend a
>> lot of its time handling batches and the overhead of that process.
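[Editor's note: to make the executeAsync pattern discussed above concrete, here is a minimal sketch using the DataStax Java Driver 2.x with a token-aware load balancing policy. The contact point, "demo" keyspace, and users table are hypothetical stand-ins, not anything from this thread.]

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;
import com.datastax.driver.core.policies.TokenAwarePolicy;

public class AsyncInsertExample {
    public static void main(String[] args) {
        // Token awareness routes each request to a replica that owns the
        // partition key, so most writes skip the extra coordinator hop.
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")
                .withLoadBalancingPolicy(
                        new TokenAwarePolicy(new DCAwareRoundRobinPolicy()))
                .build();
        Session session = cluster.connect("demo");

        PreparedStatement insert = session.prepare(
                "INSERT INTO users (id, name) VALUES (?, ?)");

        // Fire the writes asynchronously and keep the futures; forgetting
        // to store them is the classic mistake that makes async look
        // slower than batch (see Ryan's PS below).
        List<ResultSetFuture> futures = new ArrayList<ResultSetFuture>();
        for (int i = 0; i < 1000; i++) {
            BoundStatement bound = insert.bind(UUID.randomUUID(), "user-" + i);
            futures.add(session.executeAsync(bound));
        }

        // Wait for everything to complete and surface any errors.
        for (ResultSetFuture f : futures) {
            f.getUninterruptibly();
        }

        cluster.close();
    }
}
```

One caveat: this sketch buffers an unbounded list of futures; real loading code would typically cap the number of in-flight requests, for example with a java.util.concurrent.Semaphore.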
>>> P.S. I have tested asynchronous inserts; probably because my dataset
>>> is small, batch inserts always perform much better than async inserts.
>>> Do you have a general idea of how large the dataset should be to
>>> reverse this performance comparison?
>>
>> You could be in a situation where the node owns all the data and so can
>> respond quickly, so it's hard to say. As the cluster scales, however,
>> there is no way a given node will own everything in the batch unless
>> you've designed it to be that way, either by some token-aware batch
>> generation in the client or by only batching on the same partition key
>> (a strategy covered in that blog post).
>>
>> PS: Every time I've had a customer tell me batch is faster than async,
>> it's been a code problem, such as not storing futures for later, or in
>> Python not using libev; in all cases I've gotten at least a 2x speedup
>> and often far more.
>>
>>> - Dong
>>>
>>>> On Dec 1, 2014, at 9:57 AM, Ryan Svihla <rsvi...@datastax.com> wrote:
>>>>
>>>> So there is a bit of a misunderstanding about the role of the
>>>> coordinator in all this. If you use an UNLOGGED BATCH and all of
>>>> those writes are in the same partition key, then yes, it's a savings
>>>> and acts as one mutation. If they're not, however, you're asking the
>>>> coordinator node to do work the client could do, and you're
>>>> potentially adding an extra network hop on several of those
>>>> transactions if the coordinator node does not happen to own that
>>>> partition key (assuming your client driver is using token awareness,
>>>> as recent versions of the DataStax Java Driver do). This also says
>>>> nothing of heap pressure; the measurable effect of large batches on
>>>> node performance is in practice a problem in production clusters.
>>>>
>>>> I frequently have had to switch people off using BATCH for bulk
>>>> loading style processes, and in _every_ single case it's been faster
>>>> to use executeAsync, not to mention the cluster was healthier as a
>>>> result.
>>>>
>>>> As for the sstableloader options: since they all use the streaming
>>>> protocol, and as of today the streaming protocol streams one copy to
>>>> each remote node, they tend to be slower than even executeAsync in
>>>> multi-data-center scenarios (though in a single data center they are
>>>> the faster option; that said, the executeAsync approach is often fast
>>>> enough).
>>>>
>>>> This is all covered in a blog post
>>>> https://medium.com/@foundev/cassandra-batch-loading-without-the-batch-keyword-40f00e35e23e
>>>> and the DataStax CQL docs also note that BATCH is not a performance
>>>> optimization:
>>>> http://www.datastax.com/documentation/cql/3.1/cql/cql_using/useBatch.html
>>>>
>>>> In summary, the only way an UNLOGGED BATCH is a performance
>>>> improvement over using async with the driver is if it is within a
>>>> certain reasonable size and everything in it goes to the same
>>>> partition.
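[Editor's note: as a sketch of the one case where Ryan says UNLOGGED BATCH does help, the following assumes a hypothetical time-series table demo.events with sensor_id as the partition key; every statement in the batch targets the same partition, so the batch collapses into a single mutation on one replica set.]

```java
import java.util.Date;

import com.datastax.driver.core.BatchStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

public class SamePartitionBatchExample {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")
                .build();
        Session session = cluster.connect("demo");

        PreparedStatement insert = session.prepare(
                "INSERT INTO events (sensor_id, ts, value) VALUES (?, ?, ?)");

        // Every statement shares the partition key 'sensor-42', so the
        // whole batch is applied as one mutation; this is the case where
        // UNLOGGED BATCH is actually a savings rather than coordinator load.
        BatchStatement batch = new BatchStatement(BatchStatement.Type.UNLOGGED);
        for (int i = 0; i < 100; i++) {
            batch.add(insert.bind("sensor-42", new Date(), (double) i));
        }
        session.execute(batch);

        cluster.close();
    }
}
```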
>>>>> On Mon, Dec 1, 2014 at 9:43 AM, Dong Dai <daidon...@gmail.com> wrote:
>>>>>
>>>>> Thanks a lot for the reply, Raj.
>>>>>
>>>>> I understand they are different. But if we define a batch as
>>>>> UNLOGGED, it no longer guarantees an atomic transaction and becomes
>>>>> more like a data import tool. To my knowledge, a BATCH statement
>>>>> packs several mutations into one RPC to save time. Similarly, the
>>>>> bulk loader packs all the mutations into an SSTable file and (I
>>>>> think) may be able to save a lot of time too.
>>>>>
>>>>> I am interested in whether, on the coordinator server, batch insert
>>>>> and bulk load are similar things. I mean, are they implemented in a
>>>>> similar way?
>>>>>
>>>>> P.S. I tried randomly inserting 1000 rows into a simple table on my
>>>>> laptop as a test. Sync inserts take almost 2s to finish, but a sync
>>>>> batch insert takes only about 900ms. That is a huge performance
>>>>> improvement; I wonder, is this expected?
>>>>>
>>>>> Also, I used CQLSSTableWriter to put these 1000 inserts into a
>>>>> single SSTable file; it took around 2s on my laptop, which seems
>>>>> pretty slow.
>>>>>
>>>>> thanks!
>>>>> - Dong
>>>>>
>>>>>> On Dec 1, 2014, at 2:33 AM, Rajanarayanan Thottuvaikkatumana
>>>>>> <rnambood...@gmail.com> wrote:
>>>>>>
>>>>>> The BATCH statement and bulk load are totally different things. The
>>>>>> BATCH statement belongs in the atomic transaction space: it
>>>>>> provides a way to make more than one statement into an atomic unit,
>>>>>> while the bulk loader provides the ability to load external data
>>>>>> into a cluster. The two are totally different and cannot be
>>>>>> compared.
>>>>>>
>>>>>> Thanks
>>>>>> -Raj
>>>>>>
>>>>>>> On 01-Dec-2014, at 4:32 am, Dong Dai <daidon...@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi, all,
>>>>>>>
>>>>>>> I have a performance question about batch insert and bulk load.
>>>>>>>
>>>>>>> According to the documentation, to import a large volume of data
>>>>>>> into Cassandra, batch insert and bulk load can both be an option.
>>>>>>> Using batch insert is pretty straightforward, but there has not
>>>>>>> been an 'official' way to use bulk load to import data (in this
>>>>>>> case, I mean data that was generated online).
>>>>>>>
>>>>>>> So I am thinking that clients first use CQLSSTableWriter to create
>>>>>>> the SSTable files, then use "org.apache.cassandra.tools.BulkLoader"
>>>>>>> to import these SSTables into Cassandra directly.
>>>>>>>
>>>>>>> The question is: can I expect better performance using the
>>>>>>> BulkLoader this way compared with using batch inserts?
>>>>>>>
>>>>>>> I am not so familiar with the implementation of bulk load, but I
>>>>>>> do see a huge performance improvement using batch insert, and I
>>>>>>> really want to know the upper limits of write performance. Any
>>>>>>> comment will be helpful. Thanks!
>>>>>>>
>>>>>>> - Dong
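[Editor's note: a minimal sketch of the CQLSSTableWriter approach Dong describes, assuming Cassandra 2.x with the default Murmur3 partitioner; the demo.users schema and output directory are hypothetical. The output directory must already exist before the writer is built.]

```java
import java.util.UUID;

import org.apache.cassandra.io.sstable.CQLSSTableWriter;

public class SSTableWriterExample {
    public static void main(String[] args) throws Exception {
        // The schema must be a fully qualified CREATE TABLE statement.
        String schema = "CREATE TABLE demo.users (id uuid PRIMARY KEY, name text)";
        String insert = "INSERT INTO demo.users (id, name) VALUES (?, ?)";

        // sstableloader expects the <keyspace>/<table> directory layout,
        // so the SSTables are written under /tmp/demo/users here.
        CQLSSTableWriter writer = CQLSSTableWriter.builder()
                .inDirectory("/tmp/demo/users")
                .forTable(schema)
                .using(insert)
                .build();

        for (int i = 0; i < 1000; i++) {
            writer.addRow(UUID.randomUUID(), "user-" + i);
        }
        writer.close();
    }
}
```

The resulting files could then be streamed into a running cluster with the bulk loader CLI, e.g. `sstableloader -d 127.0.0.1 /tmp/demo/users`, which infers the keyspace and table from the last two path components.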