HBase has the same problem: sequential writes to ordered keys hot-spot
on one node there too.

Your choices are basically (a) figure out a way to not do all your
writes sequentially, or (b) figure out a way to model your data
without OPP.

Most Cassandra users go with option (b).
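
The usual way to do (b): columns within a row are always sorted by the
comparator, so you can keep your range queries by putting ordered
column names inside "bucket" rows instead of relying on ordered row
keys.  A minimal sketch against the 0.6 Thrift API (the keyspace/CF
names, bucketing scheme, and host here are illustrative only):

    import org.apache.cassandra.thrift.*;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TSocket;
    import org.apache.thrift.transport.TTransport;

    public class BucketedWrite {
        public static void main(String[] args) throws Exception {
            TTransport tr = new TSocket("localhost", 9160);
            Cassandra.Client client =
                new Cassandra.Client(new TBinaryProtocol(tr));
            tr.open();

            // RP hashes the row key, so bucket rows spread evenly
            // around the ring; a LongType comparator keeps the columns
            // inside each bucket sorted by timestamp.
            long ts = System.currentTimeMillis();
            byte[] colName =
                java.nio.ByteBuffer.allocate(8).putLong(ts).array();

            ColumnPath path = new ColumnPath("Events");  // CF name
            path.setColumn(colName);
            client.insert("Keyspace1", "events-2010-05-20", path,
                          "payload".getBytes(), ts, ConsistencyLevel.ONE);
            tr.close();
        }
    }

Time-range reads then become a get_slice over one bucket row instead
of a key-range scan.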

On Thu, May 20, 2010 at 8:21 AM, Sonny Heer <sonnyh...@gmail.com> wrote:
> Yes, I'm using OPP, because of the way we modeled our data.  Does
> Cassandra not handle write-intensive workloads under OPP?  Is HBase a
> better approach if one must use OPP?
>
>
> On Thu, May 20, 2010 at 7:41 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
>> Are you using OPP (the OrderPreservingPartitioner)?  That will tend
>> to create hot spots like this, which is why most people deploy on RP
>> (the RandomPartitioner).
>>
>> If you are using RP you may simply need to add C* capacity, or take
>> TimedOutException as a signal to throttle your writes.
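>>
>> Throttling can be as simple as backing off and retrying when the
>> timeout happens.  A sketch (imports omitted; the retry count and
>> sleep values are arbitrary, and client/keyspace/mutations are
>> whatever your writer already has in hand):
>>
>>     // retry a batch with exponential backoff when the node is busy
>>     void mutateWithBackoff(Cassandra.Client client, String keyspace,
>>             Map<String, Map<String, List<Mutation>>> mutations)
>>             throws Exception {
>>         long backoffMs = 100;                  // arbitrary start
>>         for (int attempt = 0; ; attempt++) {
>>             try {
>>                 client.batch_mutate(keyspace, mutations,
>>                                     ConsistencyLevel.ONE);
>>                 return;                        // write accepted
>>             } catch (TimedOutException e) {
>>                 if (attempt >= 4) throw e;     // give up after 5 tries
>>                 Thread.sleep(backoffMs);       // let the node catch up
>>                 backoffMs *= 2;                // exponential backoff
>>             }
>>         }
>>     }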
>>
>> On Tue, May 18, 2010 at 4:37 PM, Sonny Heer <sonnyh...@gmail.com> wrote:
>>> Yeah, there are many writes happening at the same time to any given
>>> Cassandra node.
>>>
>>> e.g. assume 10 machines, all running Hadoop and Cassandra.  The
>>> Hadoop tasks randomly pick a Cassandra node and write to it directly
>>> using batch_mutate.
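>>>
>>> Each task picks its connection roughly like this (fragment; the
>>> host list is hard-coded here just for illustration):
>>>
>>>     List<String> hosts = Arrays.asList("node1", "node2", /* ... */ "node10");
>>>     String host = hosts.get(new Random().nextInt(hosts.size()));
>>>     TTransport tr = new TSocket(host, 9160);   // default Thrift port
>>>     Cassandra.Client client =
>>>         new Cassandra.Client(new TBinaryProtocol(tr));
>>>     tr.open();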
>>>
>>> After increasing the timeout even more, I don't get that exception
>>> anymore, but now I'm getting UnavailableException.
>>>
>>> The wiki states this happens when not all of the replicas required
>>> could be created and/or read.  How do we resolve this problem?  Our
>>> write consistency level is ONE.
>>>
>>> thanks
>>>
>>>
>>> On Sat, May 15, 2010 at 8:02 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
>>>> RpcTimeoutInMillis should be sufficient.
>>>>
>>>> You can turn on debug logging to see how long the destination node
>>>> is actually taking to do the write (or look at cfstats, if no other
>>>> writes are going on).
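>>>>
>>>> e.g., assuming the stock 0.6 layout (file locations may differ in
>>>> your install):
>>>>
>>>>     # conf/log4j.properties -- raise the root logger to DEBUG
>>>>     log4j.rootLogger=DEBUG,stdout,R
>>>>
>>>>     # per-CF latency stats from the command line
>>>>     bin/nodetool -host <destination-node> cfstats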
>>>>
>>>> On Fri, May 14, 2010 at 11:55 AM, Sonny Heer <sonnyh...@gmail.com> wrote:
>>>>> Hey,
>>>>>
>>>>> I'm running a map/reduce job, reading from an HDFS directory and
>>>>> reducing to Cassandra using the batch_mutate method.
>>>>>
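>>>>> Each reduce call ends up doing roughly this (simplified; the
>>>>> keyspace/CF names are placeholders for our real ones, and
>>>>> columnsForRow stands in for the map output):
>>>>>
>>>>>     List<Mutation> mutations = new ArrayList<Mutation>();
>>>>>     for (Map.Entry<byte[], byte[]> e : columnsForRow.entrySet()) {
>>>>>         Column c = new Column(e.getKey(), e.getValue(),
>>>>>                               System.currentTimeMillis());
>>>>>         ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
>>>>>         cosc.setColumn(c);
>>>>>         Mutation m = new Mutation();
>>>>>         m.setColumn_or_supercolumn(cosc);
>>>>>         mutations.add(m);
>>>>>     }
>>>>>     Map<String, Map<String, List<Mutation>>> mutationMap =
>>>>>         Collections.singletonMap(rowKey,
>>>>>             Collections.singletonMap("MyCF", mutations));
>>>>>     client.batch_mutate("MyKeyspace", mutationMap,
>>>>>                         ConsistencyLevel.ONE);
>>>>>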
>>>>> That is, the reducer builds the full list of mutations for a single
>>>>> row and calls batch_mutate once at the end.  As I move to a larger
>>>>> dataset, I'm seeing the following exception:
>>>>>
>>>>> Caused by: TimedOutException()
>>>>>        at 
>>>>> org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:15361)
>>>>>        at 
>>>>> org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:796)
>>>>>        at 
>>>>> org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:772)
>>>>>
>>>>> I changed RpcTimeoutInMillis to 60 seconds with no change in
>>>>> behavior.  What configuration changes should I make when doing
>>>>> intensive write operations with batch_mutate?
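>>>>>
>>>>> (For reference, the change was in storage-conf.xml, assuming the
>>>>> stock 0.6 config file:
>>>>>
>>>>>     <RpcTimeoutInMillis>60000</RpcTimeoutInMillis>
>>>>> )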
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Jonathan Ellis
>>>> Project Chair, Apache Cassandra
>>>> co-founder of Riptano, the source for professional Cassandra support
>>>> http://riptano.com
>>>>
>>>
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of Riptano, the source for professional Cassandra support
>> http://riptano.com
>>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
