Hey Aaron,

That sounds sensible - thanks for the heads up.

Cheers,

Ben

On Dec 10, 2012, at 0:47, aaron morton <aa...@thelastpickle.com> wrote:

>> (and if the message is being decoded on the server side as a complete 
>> message, then presumably the same resident memory consumption applies there 
>> too).
> Yerp. 
> And every row mutation in your batch becomes a task in the mutation thread 
> pool. If one replica gets 500 row mutations from one client request, it will 
> take a while for the (default) 32 threads to chew through them. While this is 
> going on, other client requests will be effectively blocked. 
> 
> Depending on the number of clients, I would start with, say, 50 rows per 
> mutation and keep an eye on the *request* latency. 
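> 
> For example, here is a minimal sketch of that chunking with the pycassa 
> client (the keyspace, column family, and event_stream() names are 
> illustrative, not from your setup):
> 
>     import pycassa
> 
>     pool = pycassa.ConnectionPool('my_keyspace', server_list=['localhost:9160'])
>     events = pycassa.ColumnFamily(pool, 'events')
> 
>     # queue_size=50 sends a batch_mutate message to the server after
>     # every 50 inserts, so no single thrift message grows without bound
>     with events.batch(queue_size=50) as batch:
>         for key, columns in event_stream():  # stand-in for your input stream
>             batch.insert(key, columns)
>     # leaving the with block flushes any mutations still queued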
> 
> Hope that helps. 
> 
> 
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 9/12/2012, at 7:18 AM, Ben Hood <0x6e6...@gmail.com> wrote:
> 
>> Thanks for the clarification, Andrey. If that is the case, I had better 
>> ensure that I don't put the entire contents of a very long input stream into 
>> a single batch, since that is presumably going to cause a very large message 
>> to accumulate on the client side (and if the message is being decoded on the 
>> server side as a complete message, then presumably the same resident memory 
>> consumption applies there too).
>> 
>> Cheers,
>> 
>> 
>> Ben
>> 
>> On Dec 7, 2012, at 17:24, Andrey Ilinykh <ailin...@gmail.com> wrote:
>> 
>>> Cassandra uses thrift messages to pass data to and from the server. A batch is 
>>> just a convenient way to build up such a message. Nothing happens until you 
>>> send this message. That is probably what you mean by "closing the batch".
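>>> 
>>> For instance, with the pycassa client the unsent batch is just state held 
>>> in the client process (a rough sketch; the keyspace and column family 
>>> names are illustrative):
>>> 
>>>     import pycassa
>>>     from pycassa.batch import Mutator
>>> 
>>>     pool = pycassa.ConnectionPool('my_keyspace')
>>>     events = pycassa.ColumnFamily(pool, 'events')
>>> 
>>>     b = Mutator(pool)            # mutations accumulate in memory only
>>>     b.insert(events, 'row1', {'col': 'a'})
>>>     b.insert(events, 'row2', {'col': 'b'})
>>>     b.send()                     # one thrift batch_mutate call goes out here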
>>> 
>>> Thank you,
>>>   Andrey
>>> 
>>> 
>>> On Fri, Dec 7, 2012 at 5:34 AM, Ben Hood <0x6e6...@gmail.com> wrote:
>>>> Hi,
>>>> 
>>>> I'd like my app to stream a large number of events into Cassandra that 
>>>> originate from the same network input stream. If I create one batch 
>>>> mutation, can I just keep appending events to the Cassandra batch until 
>>>> I'm done, or are there some practical considerations about doing this 
>>>> (e.g. too much stuff buffering up on the client or server side, visibility 
>>>> of the data within the batch that hasn't been closed by the client yet)? 
>>>> Barring any discussion about atomicity, if I were able to stream a largish 
>>>> source into Cassandra, what would happen if the client crashed and didn't 
>>>> close the batch? Or is this kind of thing just a normal occurrence that 
>>>> Cassandra has to be aware of anyway?
>>>> 
>>>> Cheers,
>>>> 
>>>> Ben
> 
