The BATCH statement and bulk loading are entirely different things. The BATCH statement belongs to the atomic-transaction space: it groups several statements into a single atomic unit. The bulk loader, by contrast, loads external data into a cluster in bulk. The two serve different purposes and cannot be compared directly.
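To make the BATCH side concrete, here is a minimal sketch in Python of what "several statements as one atomic unit" looks like as CQL text. The helper `build_logged_batch` and the table `demo.users` are made up for illustration. The code only renders the statement and its bind parameters, and does not talk to a cluster or use any driver API.

```python
from typing import Any, List, Sequence, Tuple


def build_logged_batch(
    table: str, columns: Sequence[str], rows: Sequence[Sequence[Any]]
) -> Tuple[str, List[Any]]:
    """Render one CQL logged batch with '?' bind markers and a flat parameter list.

    Hypothetical helper: table and column names are assumed to be trusted
    identifiers; row values are never interpolated into the CQL text.
    """
    if not rows:
        raise ValueError("a batch needs at least one row")
    markers = ", ".join("?" for _ in columns)
    insert = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({markers});"
    lines = ["BEGIN BATCH"]
    params: List[Any] = []
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("row length does not match column count")
        # Every INSERT between BEGIN BATCH and APPLY BATCH belongs to the same unit.
        lines.append("  " + insert)
        params.extend(row)
    lines.append("APPLY BATCH;")
    return "\n".join(lines), params


# Hypothetical table and rows, for illustration only.
cql, params = build_logged_batch(
    "demo.users", ["id", "name"], [(1, "alice"), (2, "bob")]
)
print(cql)
```

The point of the sketch is that a batch is still ordinary CQL going through the normal write path, just grouped; the bulk loader instead streams pre-built SSTable files into the cluster.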
Thanks
-Raj

On 01-Dec-2014, at 4:32 am, Dong Dai <daidon...@gmail.com> wrote:

> Hi, all,
>
> I have a performance question about the batch insert and bulk load.
>
> According to the documents, to import large volume of data into Cassandra,
> Batch Insert and Bulk Load can both be an option. Using batch insert is
> pretty straightforwards, but there have not been an ‘official’ way to use
> Bulk Load to import the data (in this case, i mean the data was generated
> online).
>
> So, i am thinking first clients use CQLSSTableWriter to create the SSTable
> files, then use “org.apache.cassandra.tools.BulkLoader” to import these
> SSTables into Cassandra directly.
>
> The question is can I expect a better performance using the BulkLoader this
> way comparing with using Batch insert?
>
> I am not so familiar with the implementation of Bulk Load. But i do see a
> huge performance improvement using Batch Insert. Really want to know the
> upper limits of the write performance. Any comment will be helpful, Thanks!
>
> - Dong