Thank you for your answer, however I am pretty sure that's not it. I have
a small-two node cluster for development testing, and I have loaded it
with data in a way that responses to my queries usually have about 50000
short rows (which I think is not very much).

First of all, if I do it like this, it works:

for (AbstractBounds range : ranges)
{
    <create handler for this range>

    for (InetAddress endpoint : entry.getValue())
    {
        MessagingService.instance.sendRR(message, endpoint, handler);
    }

    List<ByteBuffer> response = handler.get();
}

So it's not a matter or rpc_timeout. However, just in case, I increased
rpc_timeout to 40000 ms (which is longer than any of my queries take with
the above method), and the problem still occurred.

My problem is that I don't want to wait for all the results for one range
before sending the request to another one.

I also added some debugging to QuorumResponseHandler.get() and I found out
where exactly the problem occurs. This (QRH, line 64 in rc1):

success = condition.await(timeout, TimeUnit.MILLISECONDS);

is never true with my way.

The puzzling thing is that I noticed strongRead does exactly the same
thing that I want to do, but it obviously works, and I can't see what I am
doing differently. One thing to note, maybe, is that I even send a request
to the same node that sends the requests, and I am wondering if there is
some kind of deadlock between the different threads that handle the verb
and the requests.

Alexander

> I would guess that sending a bunch of range requests simultaneously
> overwhelms the targets (range scans are expensive), so you're timing
> out simply because it couldn't finish all of them within rpc_timeout.
>
> Solution: don't do that, or increase rpc_timeout.
>
> On Wed, Jan 12, 2011 at 3:03 AM,  <alta...@ceid.upatras.gr> wrote:
>> Hello,
>>
>> I am reading through getRangeSlice() in StorageProxy, and I am trying to
>> do roughly the same thing for a join operation I am trying to implement
>> in
>> Cassandra.
>>
>> I see that getRangeSlice() loops through all available ranges, and for
>> each range, it sends a request to the applicable nodes and then handles
>> their answer *before* advancing to the next range. That may be fine for
>> getRangeSlice(), but I'd like to send requests for all ranges at once
>> and
>> collect and handle the responses asynchronoysly (when they arrive).
>>
>> I tried the following (pseudocode mixed with code):
>>
>> <create a list of handlers>
>>
>> for (AbstractBounds range : ranges)
>> {
>>    <create handler for this range>
>>
>>    for (InetAddress endpoint : entry.getValue())
>>    {
>>        MessagingService.instance.sendRR(message, endpoint, handler);
>>    }
>>
>>    handlers.add(handler);
>> }
>>
>> for(QRH handler : handlers)
>> {
>>    List<ByteBuffer> response = handler.get();
>> }
>>
>> However my client gets a TimedOutException and I think Cassandra blocks
>> during the resolve() in get().
>>
>> I am using 0.7.0rc1, however I don't think this code would have changed
>> much.
>>
>> Any ideas?
>>
>> Alexander
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>
>

Reply via email to