Re: IN versus multiple asynchronous queries

2014-10-07 Thread Tyler Hobbs
Also note that with an IN clause, if there is a failure fetching one of the partitions, the entire request will fail and will need to be retried. If you use concurrent async queries, you'll only need to retry one small request. On Mon, Oct 6, 2014 at 1:14 PM, DuyHai Doan wrote: > "Definitely be
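
To illustrate the retry point above, here is a minimal sketch, not taken from the thread, of retrying only the one small request that failed rather than the whole group. The names `TransientError`, `fetch_with_retry`, `fetch_all` and `flaky_fetch` are hypothetical, and `flaky_fetch` merely simulates a single-partition read that fails once; a real client would call its driver there instead.

```python
import asyncio


class TransientError(Exception):
    """Hypothetical error type standing in for a timeout or unavailable replica."""


async def fetch_with_retry(fetch_one, key, attempts=3):
    # Retry a single small request; the other keys are not affected.
    for attempt in range(1, attempts + 1):
        try:
            return await fetch_one(key)
        except TransientError:
            if attempt == attempts:
                raise  # give up on this key only after the last attempt


async def fetch_all(fetch_one, keys):
    # One independent request per key, all in flight at the same time.
    results = await asyncio.gather(
        *(fetch_with_retry(fetch_one, key) for key in keys)
    )
    return dict(zip(keys, results))


# Simulated backend (an assumption for the demo): key 3 fails on its first call.
calls = {}


async def flaky_fetch(key):
    calls[key] = calls.get(key, 0) + 1
    if key == 3 and calls[key] == 1:
        raise TransientError("simulated timeout")
    return f"doc-{key}"


result = asyncio.run(fetch_all(flaky_fetch, [1, 2, 3, 4]))
print(result)
print(calls)
```

In this simulation only key 3 is requested a second time; with a single IN query the equivalent failure would mean re-reading every partition in the group.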

Re: IN versus multiple asynchronous queries

2014-10-06 Thread DuyHai Doan
"Definitely better to not make the coordinator hold on to that memory while it waits for other requests to come back" --> You get it. When loading big documents, you risk starving the heap quickly, triggering long GC cycle on the coordinator etc... On Mon, Oct 6, 2014 at 6:22 PM, Robert Wille wro

Re: IN versus multiple asynchronous queries

2014-10-06 Thread Robert Wille
As far as latency is concerned, it seems like it wouldn't matter very much if the coordinator has to wait for all the responses to come back, or the client waits for all the responses to come back. I’ve got the same latency either way. I would assume that 50 coordinations is more expensive than

Re: IN versus multiple asynchronous queries

2014-10-04 Thread DuyHai Doan
Definitely 50 concurrent queries, possibly in async mode. If you're using the IN clause with 50 values, the coordinator will block, waiting for 50 partitions to be fetched from different nodes (worst case = 50 nodes) before responding to the client. In addition to the very high latency, you'll put th
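
As a rough sketch of the "50 concurrent queries" approach, assuming an async client: the function `fetch_document` below is a hypothetical stand-in for one single-partition query and only simulates a network round trip, so this shows the fan-out pattern rather than any particular driver's API.

```python
import asyncio


async def fetch_document(doc_id):
    # Hypothetical stand-in for one single-partition SELECT;
    # the sleep simulates a network round trip.
    await asyncio.sleep(0.01)
    return {"id": doc_id, "body": f"document {doc_id}"}


async def fetch_group(doc_ids):
    # Issue one small query per key. All of them run concurrently, so the
    # total wait is roughly that of the slowest query, not the sum of all.
    results = await asyncio.gather(*(fetch_document(d) for d in doc_ids))
    return dict(zip(doc_ids, results))


docs = asyncio.run(fetch_group(list(range(50))))
print(len(docs))
```

Here the client, not a single coordinator, collects the 50 responses, which is the trade-off discussed in the replies.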

IN versus multiple asynchronous queries

2014-10-04 Thread Robert Wille
I have a table of small documents (less than 1K) that are often accessed together as a group. The group size is always less than 50. Which produces less load on the server: one query using an IN clause to get all 50 back together, or 50 concurrent queries? Which one is faster? Thanks, Robert