> (and if the message is being decoded on the server side as a complete 
> message, then presumably the same resident memory consumption applies there 
> too).
Yerp. 
And every row mutation in your batch becomes a task in the Mutation thread 
pool. If one replica gets 500 row mutations from one client request, it will 
take a while for the (default) 32 threads to chew through them. While this is 
going on, other client requests will be effectively blocked. 

Depending on the number of clients, I would start with say 50 rows per mutation 
and keep an eye on the *request* latency. 
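The chunking side of this is client-library agnostic; a minimal sketch in Python of splitting a long event stream into fixed-size batches (the 50-row starting point above), where the actual send call would be whatever your Cassandra client provides:

```python
def chunked(stream, batch_size=50):
    """Yield lists of up to batch_size items from an iterable stream.

    Each yielded list would become one batch mutation, so nothing
    larger than batch_size rows accumulates in memory at once.
    """
    batch = []
    for item in stream:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final partial batch
        yield batch

# Example: a stream of 120 events becomes batches of 50, 50 and 20.
sizes = [len(b) for b in chunked(range(120), batch_size=50)]
```

You can then tune `batch_size` up or down while watching request latency, without changing the streaming code itself.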

Hope that helps. 


-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 9/12/2012, at 7:18 AM, Ben Hood <0x6e6...@gmail.com> wrote:

> Thanks for the clarification Andrey. If that is the case, I had better ensure 
> that I don't put the entire contents of a very long input stream into a 
> single batch, since that is presumably going to cause a very large message to 
> accumulate on the client side (and if the message is being decoded on the 
> server side as a complete message, then presumably the same resident memory 
> consumption applies there too).
> 
> Cheers,
> 
> 
> Ben
> 
> On Dec 7, 2012, at 17:24, Andrey Ilinykh <ailin...@gmail.com> wrote:
> 
>> Cassandra uses Thrift messages to pass data to and from the server. A batch is 
>> just a convenient way to create such a message. Nothing happens until you send 
>> this message. This is probably what you mean by "close the batch".
>> 
>> Thank you,
>>   Andrey
>> 
>> 
>> On Fri, Dec 7, 2012 at 5:34 AM, Ben Hood <0x6e6...@gmail.com> wrote:
>> Hi,
>> 
>> I'd like my app to stream a large number of events into Cassandra that 
>> originate from the same network input stream. If I create one batch 
>> mutation, can I just keep appending events to the Cassandra batch until I'm 
>> done, or are there some practical considerations about doing this (e.g. too 
>> much stuff buffering up on the client or server side, visibility of the data 
>> within the batch that hasn't been closed by the client yet)? Barring any 
>> discussion about atomicity, if I were able to stream a largish source into 
>> Cassandra, what would happen if the client crashed and didn't close the 
>> batch? Or is this kind of thing just a normal occurrence that Cassandra has 
>> to be aware of anyway?
>> 
>> Cheers,
>> 
>> Ben
>> 
