> From: aaron morton [mailto:aa...@thelastpickle.com]
> Sent: Sunday, 28 April 2013 22:54
> To: user@cassandra.apache.org
> Subject: Re: cost estimate about some Cassandra patchs
>
> > Does anyone know enough of the inner working of Cassandra to tell me how 
> > much work is needed to patch Cassandra to enable such communication 
> > vectorization/batch ?
>

> Assuming you mean "have the coordinator send multiple row read/write requests 
> in a single message to replicas"
>
> Pretty sure this has been raised as a ticket before but I cannot find one now.
>
> It would be a significant change and I'm not sure how big the benefit is. To 
> send the messages, the coordinator places them in a queue, so there is little 
> delay in sending. It then waits on them asynchronously. So there may be some 
> saving on networking, but from the coordinator's point of view I think the 
> impact is minimal.
>
> What is your use case?

Use case: rows whose row key has the form (folder id, file id), with 
operations that read or write multiple rows sharing the same folder id. So it 
could make sense to have a partitioner that places rows with the same 
"folder id" on the same replicas.
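
To make the placement idea concrete, here is a minimal Python sketch (not 
Cassandra code). It assumes a token derived from the folder id alone and a 
simple clockwise walk of the ring to pick replicas. The names folder_token, 
replicas_for and RING_SIZE, the MD5 hash and the 4-node ring are all 
hypothetical choices made for illustration.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # hypothetical token space, chosen for this sketch


def folder_token(row_key):
    """Derive the token from the folder id only, ignoring the file id."""
    folder_id, _file_id = row_key
    digest = hashlib.md5(folder_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def replicas_for(token, ring, rf):
    """Walk the ring clockwise from the token and take rf nodes.

    `ring` is a list of (token, node) pairs sorted by token,
    and rf must not exceed the number of nodes.
    """
    tokens = [t for t, _node in ring]
    start = bisect.bisect_left(tokens, token) % len(ring)
    return tuple(ring[(start + i) % len(ring)][1] for i in range(rf))


# Hypothetical 4-node ring with evenly spaced tokens.
ring = [(i * RING_SIZE // 4, "node%d" % i) for i in range(4)]
a = replicas_for(folder_token(("folder-1", "file-a")), ring, 3)
b = replicas_for(folder_token(("folder-1", "file-b")), ring, 3)
print(a == b)  # rows of the same folder share their replicas
```

Because the file id never enters the hash, every row of a folder lands on the 
same replica set, whatever that set turns out to be.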

But so far, Cassandra cannot exploit this locality, because the batching 
effect ends at the coordinator node: communications between the coordinator 
and the replicas are not batched.
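
As a rough illustration of what is at stake, the Python sketch below compares 
message counts under a deliberately simplified model: it assumes one message 
per row per replica today, versus one message per replica for each group of 
rows sharing a replica set. The functions group_by_replicas and 
message_counts, and the toy replicas_of lookup, are hypothetical and do not 
correspond to any Cassandra API.

```python
from collections import defaultdict


def group_by_replicas(row_keys, replicas_of):
    """Group row keys by the replica set that owns them."""
    groups = defaultdict(list)
    for key in row_keys:
        groups[replicas_of(key)].append(key)
    return dict(groups)


def message_counts(row_keys, replicas_of):
    """Return (per_row, batched) message counts under the simplified model."""
    # One message per row per replica.
    per_row = sum(len(replicas_of(key)) for key in row_keys)
    # One message per replica for each distinct replica set.
    groups = group_by_replicas(row_keys, replicas_of)
    batched = sum(len(replicas) for replicas in groups)
    return per_row, batched


def replicas_of(row_key):
    """Toy lookup: the folder id alone decides the replica set."""
    folder_id, _file_id = row_key
    if folder_id == "folder-1":
        return ("n1", "n2", "n3")
    return ("n2", "n3", "n4")


keys = [("folder-1", "a"), ("folder-1", "b"), ("folder-1", "c"),
        ("folder-2", "x"), ("folder-2", "y")]
per_row, batched = message_counts(keys, replicas_of)
print(per_row, batched)
```

The model ignores message size and consistency-level details, so it only 
bounds the number of messages, not the actual latency gain.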

Hence my question about the cost estimate for patching Cassandra.

The closest JIRA entries I have found so far (they may or may not correspond 
exactly to my need) are:

CASSANDRA-166: Support batch inserts for more than one key at once
https://issues.apache.org/jira/browse/CASSANDRA-166
=> "WON'T FIX" status

CASSANDRA-5034: Refactor to introduce Mutation Container in write path
https://issues.apache.org/jira/browse/CASSANDRA-5034
=> I am not sure whether it is related to my topic

Thanks.

Dominique



>
> Cheers
>
>
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com

On 27/04/2013, at 4:04 AM, DE VITO Dominique 
<dominique.dev...@thalesgroup.com> wrote:


Hi,

We have created a new partitioner that groups some rows with **different** row 
keys on the same replicas.

But neither batch_mutate nor multiget_slice is able to take advantage of this 
partitioner-defined placement to vectorize/batch communications between the 
coordinator and the replicas.

Does anyone know enough of the inner working of Cassandra to tell me how much 
work is needed to patch Cassandra to enable such communication 
vectorization/batch ?

Thanks.

Regards,
Dominique


