I'm using a queue of 100 executeAsync calls * 1000 statements in each batch = 
100K inserts, the same size as the insert queue in the non-batch scenario.

Using more than 1000 statements per batch throws a batch limit exception, and 
some documents recommend not changing batch_size_limit?!


Sent using Zoho Mail






---- On Sun, 18 Mar 2018 13:14:54 +0330 Ben Slater 
<ben.sla...@instaclustr.com> wrote ----




When you say batch was worse than async in terms of throughput, are you 
comparing throughput with the same number of threads or something? I would have 
thought that if you have much less CPU usage on the client with batching, and 
your Cassandra cluster doesn't sound terribly stressed, then there is room to 
increase threads on the client to up throughput (unless you're bottlenecked on 
IO or something)? 



On Sun, 18 Mar 2018 at 20:27 onmstester onmstester <onmstes...@zoho.com> 
wrote:




-- 

Ben Slater
Chief Product Officer

    

Read our latest technical blog posts here.

This email has been sent on behalf of Instaclustr Pty. Limited (Australia) and 
Instaclustr Inc (USA).

This email and any attachments may contain confidential and legally privileged 
information.  If you are not the intended recipient, do not copy or disclose 
its content, but please reply to this email immediately and highlight the error 
to the sender and then immediately delete the message.






The input data does not preserve good locality, and I've already tested batch 
inserts: throughput was worse than with executeAsync, but CPU usage on the 
client side was much lower.









---- On Sun, 18 Mar 2018 12:46:02 +0330 Ben Slater 
<ben.sla...@instaclustr.com> wrote ----









You will probably find grouping writes into small batches improves overall 
performance (if you are not doing it already). See the following presentation 
for some more info: 
https://www.slideshare.net/Instaclustr/microbatching-highperformance-writes
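
For illustration only, here is a minimal sketch of what grouping writes into 
small batches could look like on the client. It is not taken from the linked 
presentation and does not use the driver API: MicroBatcher, group, and 
maxBatchSize are hypothetical names, rows are plain Java objects, and the 
partition key is whatever function you pass in. Each resulting list would then 
be turned into one small batch by your own code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: groups rows by partition key, then cuts each group
// into chunks of at most maxBatchSize rows.
public final class MicroBatcher {
    private MicroBatcher() {}

    public static <K, R> List<List<R>> group(List<R> rows,
                                             Function<? super R, ? extends K> partitionKey,
                                             int maxBatchSize) {
        if (maxBatchSize < 1) {
            throw new IllegalArgumentException("maxBatchSize must be >= 1");
        }
        // LinkedHashMap keeps the groups in first-seen order of their keys.
        Map<K, List<R>> byKey = new LinkedHashMap<>();
        for (R row : rows) {
            byKey.computeIfAbsent(partitionKey.apply(row), k -> new ArrayList<>()).add(row);
        }
        List<List<R>> batches = new ArrayList<>();
        for (List<R> sameKey : byKey.values()) {
            // Split one partition's rows into chunks no larger than maxBatchSize.
            for (int i = 0; i < sameKey.size(); i += maxBatchSize) {
                int end = Math.min(i + maxBatchSize, sameKey.size());
                batches.add(new ArrayList<>(sameKey.subList(i, end)));
            }
        }
        return batches;
    }
}
```

With data that has poor partition locality, most groups will hold only one or 
two rows, so the gain from this kind of grouping may be small.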



Cheers

Ben




On Sun, 18 Mar 2018 at 19:23 onmstester onmstester <onmstes...@zoho.com> 
wrote:













I need to insert a few million records within seconds into Cassandra. I'm 
using one client with executeAsync and the following configs:

maxConnectionsPerHost = 5

maxRequestsPerHost = 32K

maxAsyncQueue at client side = 100K



I could achieve only 25% of the throughput I need. Client CPU is above 80%, and 
increasing the number of threads causes some executeAsync calls to fail, so the 
configs above are the best the client can handle. CPU on the Cassandra nodes is 
below 30% on average. The data has no locality with respect to partition keys, 
and I can't use the createSStable mechanism. Is there any client-side tuning I'm 
missing? The server side is already tuned according to the DataStax 
recommendations.
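
To make the queueing setup above concrete, here is a minimal sketch of capping 
the number of in-flight async requests with a semaphore, so that producer 
threads block instead of overflowing the queue. It uses only the JDK: 
InFlightLimiter, submit, and availableSlots are hypothetical names, and the 
async call is assumed to return a CompletableFuture, so a driver future of 
another type would need an adapter.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical helper: allows at most maxInFlight async calls at a time.
public final class InFlightLimiter {
    private final Semaphore permits;

    public InFlightLimiter(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    // Blocks the caller until a slot is free, then starts the async call.
    public <T> CompletableFuture<T> submit(Supplier<CompletableFuture<T>> asyncCall) {
        permits.acquireUninterruptibly();
        CompletableFuture<T> future;
        try {
            future = asyncCall.get();
        } catch (RuntimeException e) {
            // The call never started, so give the slot back right away.
            permits.release();
            throw e;
        }
        // Free the slot when the request completes, successfully or not.
        return future.whenComplete((result, error) -> permits.release());
    }

    public int availableSlots() {
        return permits.availablePermits();
    }
}
```

A caller would wrap each insert, for example 
limiter.submit(() -> myAsyncInsert(row)), where myAsyncInsert is a placeholder 
for your own code that issues the write.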













