Also note that with an IN clause, if there is a failure fetching one of the
partitions, the entire request will fail and will need to be retried.  If
you use concurrent async queries, you'll only need to retry one small
request.
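
To make the retry point concrete, here is a rough sketch in Python using only asyncio. It is not tied to any particular driver: fetch_one is a hypothetical stand-in for whatever async single-partition query your driver offers, and TransientReadError and make_flaky_fetch are made up purely for illustration. The idea is just that each key is retried on its own, so one failed partition costs one extra small request rather than a repeat of the whole group.

```python
import asyncio


class TransientReadError(Exception):
    """Hypothetical stand-in for a driver timeout/unavailable error."""


async def fetch_with_retry(fetch_one, key, max_attempts=3):
    """Fetch one partition, retrying only this key on a transient failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await fetch_one(key)
        except TransientReadError:
            if attempt == max_attempts:
                raise  # give up on this key only


async def fetch_group(fetch_one, keys, max_attempts=3):
    """Issue one query per key concurrently and return {key: row}."""
    rows = await asyncio.gather(
        *(fetch_with_retry(fetch_one, k, max_attempts) for k in keys)
    )
    # gather() returns results in the same order as the awaitables passed in.
    return dict(zip(keys, rows))


def make_flaky_fetch(fail_once_for):
    """Fake data source (illustration only): the first read of each key in
    fail_once_for raises, every other read succeeds. Records all calls."""
    calls = []
    pending = set(fail_once_for)

    async def fetch_one(key):
        calls.append(key)
        if key in pending:
            pending.discard(key)
            raise TransientReadError(key)
        return f"doc-{key}"

    return fetch_one, calls


if __name__ == "__main__":
    fetch_one, calls = make_flaky_fetch({7})
    docs = asyncio.run(fetch_group(fetch_one, list(range(50))))
    print(len(docs), "documents fetched in", len(calls), "requests")
```

With one flaky partition out of 50, only that key is requested a second time. With a single IN query, the same failure would mean re-sending the whole 50-value request.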

On Mon, Oct 6, 2014 at 1:14 PM, DuyHai Doan <doanduy...@gmail.com> wrote:

> "Definitely better to not make the coordinator hold on to that memory
> while it waits for other requests to come back" --> You get it. When
> loading big documents, you risk starving the heap quickly, triggering long
> GC cycles on the coordinator, etc.
>
> On Mon, Oct 6, 2014 at 6:22 PM, Robert Wille <rwi...@fold3.com> wrote:
>
>>  As far as latency is concerned, it seems like it wouldn't matter very
>> much if the coordinator has to wait for all the responses to come back, or
>> the client waits for all the responses to come back. I’ve got the same
>> latency either way.
>>
>>  I would assume that 50 coordinations is more expensive than one
>> coordination that does 50 times the work, but that’s probably insignificant
>> when compared to the actual fetching of the data from the SSTables.
>>
>>  I do see the point about putting stress on coordinator memory. In
>> general, the documents will be very small, but there will occasionally be
>> some rather large ones, potentially several megabytes in size. Definitely
>> better to not make the coordinator hold on to that memory while it waits
>> for other requests to come back.
>>
>>  Robert
>>
>>  On Oct 4, 2014, at 8:34 AM, DuyHai Doan <doanduy...@gmail.com> wrote:
>>
>>  Definitely 50 concurrent queries, possibly in async mode.
>>
>>  If you're using the IN clause with 50 values, the coordinator will
>> block, waiting for 50 partitions to be fetched from different nodes (worst
>> case = 50 nodes) before responding to the client. In addition to the very
>> high latency, you'll put stress on the coordinator's memory.
>>
>>
>>
>> On Sat, Oct 4, 2014 at 3:09 PM, Robert Wille <rwi...@fold3.com> wrote:
>>
>>> I have a table of small documents (less than 1K) that are often accessed
>>> together as a group. The group size is always less than 50. Which produces
>>> less load on the server, one query using an IN clause to get all 50 back
>>> together, or 50 concurrent queries? Which one is fastest?
>>>
>>> Thanks
>>>
>>> Robert
>>>
>>>
>>
>>
>


-- 
Tyler Hobbs
DataStax <http://datastax.com/>
