Re: indexing question related to playOrm on github

Hiller, Dean Thu, 16 Aug 2012 05:30:07 -0700

Yes, the synch may work, and no, I do "not" want a transaction…I want a 
different kind of eventual consistency.


That might work.
Let's say server 1 sends a mutation (65 is the pk)
Remove: <bill><65>  Add <tim><65>
Server 2 also sends a mutation (65 is the pk)
Remove: <bill><65> Add <mike><65>

What nobody wants is to end up with a row that has both <tim><65> and 
<mike><65>.  With the wide row pattern, we would like to have ONE or the other. 
I am not sure synchronization fixes that…  It would be kind of nice if the 
column <bill><65> were not actually removed until after all servers are 
eventually consistent AND kept a reference to the add that was happening, 
so that when it goes to resolve eventual consistency between the servers, it 
would see that <mike><65> is newer and decide to drop the first add 
completely.

I.e., a full process might look like this:
Cassandra node 1 receives remove <bill><65>, add <tim><65> AND in the remove 
column stores info about the add <tim><65> until eventual consistency is 
completed
Cassandra node 2 one ms later receives remove <bill><65>, add <mike><65> AND in 
the remove column stores info about the add <mike><65> until eventual 
consistency is completed
Eventual consistency starts comparing node 1 and node 2 and finds <bill><65> is 
being removed by different servers and finds add info attached to each remove.  
ONLY THE LAST add info is acknowledged and it makes the row consistent across 
the cluster.
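
The process above can be sketched roughly as follows. This is only an 
illustration of the idea in Python, not something Cassandra provides: 
LinkedRemove and reconcile are hypothetical names, and the timestamps (10 and 
11) are made-up values standing in for the write times of the two mutations.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LinkedRemove:
    """Hypothetical tombstone that remembers the add it was paired with."""
    removed: tuple    # index column being removed, e.g. ("bill", 65)
    added: tuple      # index column added by the same mutation
    timestamp: int    # write timestamp of the mutation

def reconcile(live, removes):
    """Return the live index columns after resolving linked removes.

    For each removed column, only the add attached to the newest remove
    survives; adds attached to older removes of that column are dropped.
    On a timestamp tie the remove seen first wins (an arbitrary choice).
    """
    newest = {}
    for r in removes:
        best = newest.get(r.removed)
        if best is None or r.timestamp > best.timestamp:
            newest[r.removed] = r

    result = set(live)
    for r in removes:
        result.discard(r.removed)   # the old index entry always goes away
        result.discard(r.added)     # provisionally drop every competing add
    for winner in newest.values():
        result.add(winner.added)    # keep only the last add
    return result

node1 = LinkedRemove(("bill", 65), ("tim", 65), timestamp=10)
node2 = LinkedRemove(("bill", 65), ("mike", 65), timestamp=11)

live = {("bill", 65), ("tim", 65), ("mike", 65)}
print(reconcile(live, [node1, node2]))  # {('mike', 65)}
```

The result does not depend on the order in which the two removes are compared, 
which is the property the resolution step needs.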

That would make everyone's wide row indexes tend to get less corrupt over 
time.

Thanks,
Dean


From: aaron morton <aa...@thelastpickle.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, August 15, 2012 8:26 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: indexing question related to playOrm on github

1.  Can playOrm be listed on cassandra's list of ORMs?  It supports a JQL/HQL 
query on a trillion rows in under 100ms (partitioning is the trick so you can 
JQL a partition)
Not sure if we have an ORM-specific page. If it's a client then feel free to 
add it to http://wiki.apache.org/cassandra/ClientOptions

I was wondering if cassandra has or will ever support eventual consistency 
where it keeps both the REMOVE AND the ADD together until it is on all 3 
replicated nodes and in resolving the consistency would end up with an index 
that only has the very last one in the index.
Not sure I fully understand but it sounds like you want a transaction, which is 
not going to happen.

Internally when Cassandra updates a secondary index it does the same thing. But 
it synchronises updates around the same row so one thread will apply the 
changes at a time.
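
A minimal single-process sketch of what synchronising updates around one row 
could look like. This is an illustration in Python, not Cassandra's actual 
code; lock_for, rename, base and index are hypothetical names.

```python
import threading

base = {65: "bill"}       # pk -> current value of the indexed column
index = {("bill", 65)}    # index entries as (indexedValue, primaryKey)

_locks = {}
_locks_guard = threading.Lock()

def lock_for(pk):
    """Return the lock dedicated to one row, creating it on first use."""
    with _locks_guard:
        return _locks.setdefault(pk, threading.Lock())

def rename(pk, new_name):
    """Swap the index entry for a row while holding that row's lock."""
    with lock_for(pk):
        old_name = base[pk]
        index.discard((old_name, pk))   # remove the stale index entry
        index.add((new_name, pk))       # add the new one
        base[pk] = new_name

threads = [threading.Thread(target=rename, args=(65, name))
           for name in ("tim", "mike")]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(index)  # exactly one entry, matching base[65]
```

Because the second writer reads the value the first writer left behind, it 
removes that entry rather than <bill><65>. A lock like this only covers 
writers inside one process, so it does not by itself cover two clients racing 
against different nodes.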

Hope that helps.
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 16/08/2012, at 12:34 PM, "Hiller, Dean" <dean.hil...@nrel.gov> wrote:

1.  Can playOrm be listed on cassandra's list of ORMs?  It supports a JQL/HQL 
query on a trillion rows in under 100ms (partitioning is the trick so you can 
JQL a partition)
2.  Many applications have a common indexing problem and I was wondering if 
cassandra has or could have any support for this in the future….

When using wide row indexes, you frequently have <indexedValue>.<primaryKey> as 
the composite key.  This means when you have your object like so in the database

Activity {
  pk: 65
  name: bill
}

And then two servers want to save it as

Activity {
  pk:65
  name:tim
}
Activity {
  pk:65
  name:mike
}

Each server will remove <bill><65>, one server will add <tim><65> and the 
other will add <mike><65>, so the index ends up with BOTH, BUT one of them will 
really be a lie!!!!!  I was wondering if cassandra has or will ever support 
eventual consistency where it keeps both the REMOVE AND the ADD together until 
it is on all 3 replicated nodes and in resolving the consistency would end up 
with an index that only has the very last one in the index.
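
The race can be reproduced with a toy model of one wide index row. This is a 
simplified sketch in Python, assuming each column is reconciled on its own with 
the newest timestamp winning; apply_mutation and live_columns are hypothetical 
helper names and the timestamps are made up.

```python
# Toy model of one wide index row: column name -> (timestamp, is_live).
# Each column is reconciled independently; the newest timestamp wins.

def apply_mutation(row, mutation):
    """Apply a list of (column, timestamp, is_live) writes to a row."""
    for column, ts, live in mutation:
        current = row.get(column)
        if current is None or ts > current[0]:
            row[column] = (ts, live)

def live_columns(row):
    """Return the columns that are still live (not deleted), sorted."""
    return sorted(col for col, (ts, live) in row.items() if live)

index_row = {("bill", 65): (1, True)}

# Server 1 renames bill -> tim at t=10; server 2 renames bill -> mike at t=11.
server1 = [(("bill", 65), 10, False), (("tim", 65), 10, True)]
server2 = [(("bill", 65), 11, False), (("mike", 65), 11, True)]

apply_mutation(index_row, server1)
apply_mutation(index_row, server2)

print(live_columns(index_row))  # [('mike', 65), ('tim', 65)]
```

Both adds survive because <tim><65> and <mike><65> are different columns, so 
nothing ties either of them to the remove of <bill><65>.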

Thanks,
Dean
