> Use case = rows with rowkey like (folder id, file id)
> And operations read/write multiple rows with same folder id => so, it could
> make sense to have a partitioner putting rows with same "folder id" on the
> same replicas.

The entire row key is what we use to make the token, and that token is used both to locate the replicas and to place the row within the node. I don't see that changing.
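As a rough illustration of the point about tokens (a sketch only, not Cassandra's partitioner code): the functions below hash either the whole row key or just the folder id, then look up an owner on a toy ring. The names `full_key_token`, `folder_token`, `owner` and the four-node ring are made up for this example, and MD5 is used simply as a convenient hash.

```python
import hashlib
from bisect import bisect_left
from typing import Sequence


def token(data: bytes) -> int:
    """Hash bytes to an integer token (MD5 chosen only for illustration)."""
    return int.from_bytes(hashlib.md5(data).digest(), "big")


def full_key_token(folder_id: str, file_id: str) -> int:
    # Token derived from the entire row key: rows sharing a folder id
    # get unrelated tokens and so can land on different replicas.
    return token(f"{folder_id}:{file_id}".encode())


def folder_token(folder_id: str, file_id: str) -> int:
    # Hypothetical custom partitioner: only the folder id feeds the hash,
    # so every row of a folder gets the same token.
    return token(folder_id.encode())


def owner(tok: int, ring: Sequence[int]) -> int:
    """Index of the first node whose token is >= tok, wrapping around."""
    return bisect_left(ring, tok) % len(ring)


# Toy ring: four node tokens evenly spaced over the 128-bit hash space.
RING = [(i + 1) * (2 ** 128 // 4) - 1 for i in range(4)]

if __name__ == "__main__":
    for file_id in ("a", "b", "c"):
        print(
            file_id,
            owner(full_key_token("folder1", file_id), RING),
            owner(folder_token("folder1", file_id), RING),
        )
```

With the folder-only token every row of a folder resolves to the same owner, which is the locality the custom partitioner is after. The trade-off is that the same token also decides where the row sits within the node, so rows of one folder are no longer distinguished by token.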
Have you done any performance testing to see if this is a problem?

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 7/05/2013, at 5:27 AM, DE VITO Dominique <dominique.dev...@thalesgroup.com> wrote:

> > From: aaron morton [mailto:aa...@thelastpickle.com]
> > Sent: Sunday, 28 April 2013 22:54
> > To: user@cassandra.apache.org
> > Subject: Re: cost estimate about some Cassandra patchs
> >
> > > Does anyone know enough of the inner working of Cassandra to tell me how
> > > much work is needed to patch Cassandra to enable such communication
> > > vectorization/batch ?
> >
> > Assuming you mean "have the coordinator send multiple row read/write
> > requests in a single message to replicas"
> >
> > Pretty sure this has been raised as a ticket before but I cannot find one
> > now.
> >
> > It would be a significant change and I'm not sure how big the benefit is.
> > To send the messages the coordinator places them in a queue, so there is
> > little delay sending. Then it waits on them async. So there may be some
> > saving on networking but from the coordinator's point of view I think the
> > impact is minimal.
> >
> > What is your use case?
>
> Use case = rows with rowkey like (folder id, file id)
> And operations read/write multiple rows with same folder id => so, it could
> make sense to have a partitioner putting rows with same "folder id" on the
> same replicas.
>
> But so far, Cassandra is not able to exploit this locality, as the batch
> effect ends at the coordinator node.
>
> Hence my question about the cost estimate for patching Cassandra.
>
> The closest (or exactly corresponding to my need ?) JIRA entries I have found
> so far are:
>
> CASSANDRA-166: Support batch inserts for more than one key at once
> https://issues.apache.org/jira/browse/CASSANDRA-166
> => "WON'T FIX" status
>
> CASSANDRA-5034: Refactor to introduce Mutation Container in write path
> https://issues.apache.org/jira/browse/CASSANDRA-5034
> => I am not very sure if it's related to my topic
>
> Thanks.
>
> Dominique
>
> > Cheers
> >
> > -----------------
> > Aaron Morton
> > Freelance Cassandra Consultant
> > New Zealand
> >
> > @aaronmorton
> > http://www.thelastpickle.com
> >
> > On 27/04/2013, at 4:04 AM, DE VITO Dominique
> > <dominique.dev...@thalesgroup.com> wrote:
> >
> > > Hi,
> > >
> > > We have created a new partitioner that groups some rows with **different**
> > > row keys on the same replicas.
> > >
> > > But neither batch_mutate nor multiget_slice is able to take advantage of
> > > this partitioner-defined placement to vectorize/batch communications
> > > between the coordinator and the replicas.
> > >
> > > Does anyone know enough of the inner working of Cassandra to tell me how
> > > much work is needed to patch Cassandra to enable such communication
> > > vectorization/batch ?
> > >
> > > Thanks.
> > >
> > > Regards,
> > > Dominique
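
For readers following the thread, the "vectorization/batch" idea asked about in the original message can be pictured with the sketch below. It only illustrates the concept: per the discussion above, Cassandra's coordinator does not do this. `group_by_replicas` and the `PLACEMENT` table are hypothetical names invented for the example, standing in for a partitioner that places rows by folder id.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

RowKey = Tuple[str, str]          # (folder id, file id)
ReplicaSet = Tuple[str, ...]      # node names, purely illustrative


def group_by_replicas(
    keys: Iterable[RowKey],
    replicas_for: Callable[[RowKey], ReplicaSet],
) -> Dict[ReplicaSet, List[RowKey]]:
    """Bucket row keys by the replica set that owns them.

    A coordinator batching per replica set would send one message per
    bucket instead of one message per row.
    """
    batches: Dict[ReplicaSet, List[RowKey]] = defaultdict(list)
    for key in keys:
        batches[replicas_for(key)].append(key)
    return dict(batches)


# Hypothetical placement produced by a folder-aware partitioner.
PLACEMENT: Dict[str, ReplicaSet] = {
    "folderA": ("node1", "node2", "node3"),
    "folderB": ("node2", "node3", "node4"),
}

if __name__ == "__main__":
    keys = [("folderA", "f1"), ("folderA", "f2"), ("folderB", "f9")]
    for replicas, rows in group_by_replicas(keys, lambda k: PLACEMENT[k[0]]).items():
        print(replicas, rows)
```

In this toy case three row requests collapse into two batches. Whether that saving matters in practice is exactly the performance question raised at the top of the thread.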