Re: Deployment on AWS and replication strategies

2010-04-04 Thread Benjamin Black
On Sat, Apr 3, 2010 at 8:23 PM, Mike Gallamore wrote: >> > I didn't mean a real time determination, more of if the nodes aren't > identical. For example if you have a cluster made up of a bunch of EC2 light > instances and decide to add a large instance, it would be nice if the new > node would

Re: Bug in Cassandra that occurs when removing a supercolumn.

2010-04-04 Thread Arash Bazrafshan
Vijay, I know I've written a ridiculously long bug-specification, but heck at least I've mentioned all the important stuff. :-) Look under PREREQUISITES and you see I've mentioned that I use 0.5.0-1 under ubuntu. I agree that CASSANDRA-703 is exactly the same bug as the one I've observed, and yet

Re: Bug in Cassandra that occurs when removing a supercolumn.

2010-04-04 Thread Jonathan Ellis
We do appreciate the effort, though. :) On Sun, Apr 4, 2010 at 3:42 AM, Arash Bazrafshan wrote: > Vijay, I know I've written a ridiculously long bug-specification, but heck > at least I've mentioned all the important stuff. :-) > > Look under PREREQUISITES and you see I've mentioned that I use 0.

Re: Deployment on AWS and replication strategies

2010-04-04 Thread Mike Gallamore
Pluggable placement: that is cool. It wasn't obvious to me from the documentation I read that it was available. I thought maybe the rackaware and rackunaware were hard coded in somewhere. I'm not a Java developer so I haven't looked at the code much. That said I'll take a lo

Memcached protocol?

2010-04-04 Thread Paul Prescod
Many Cassandra implementations seem to be memcached+X migrations, and some might be replacing memcached alone. Has anyone considered making a protocol handler or proxy that would allow Cassandra to talk the memcached binary protocol? jmemcached + Cassandra = easy migration? I have barely started t

Re: Memcached protocol?

2010-04-04 Thread Joe Stump
Seems like this would be pretty easy to build on top of the proxy stuff that was recently mentioned. I don't see a reason why you couldn't just store key/blob-in-column to get running quickly. Might make for a pretty interesting clustered queue system as well, which has been mentioned before on

Re: Memcached protocol?

2010-04-04 Thread Ryan Daum
I'm the author/maintainer of jmemcached; I'd be willing to do this and it'd be quite easy to do, but Cassandra is missing a number of things which would make it so we could only support a subset of the memcache protocol. Memcache has: set-if-not-present ("add") atomic increment / decrement compare

Re: Memcached protocol?

2010-04-04 Thread Paul Prescod
On Sun, Apr 4, 2010 at 1:16 PM, Joe Stump wrote: > Seems like this would be pretty easy to build on top of the proxy stuff > that was recently mentioned. > I'm new to the list: is it easy for you to dig up a subject line or message-id that I can google? I was poking around to see how Avro was h

Re: Bug regarding removing and retrieving entire supercolumn.

2010-04-04 Thread Eric Evans
On Sat, 2010-04-03 at 02:55 +0200, Arash Bazrafshan wrote: > Think I got a bug in Cassandra. Do you also think it's a bug? > > It should be noted that I experience this bug when using cassandra > through thrift's php api (the low-level one generated by thrift, not > some high-level from the cassan

how to paginate through CF

2010-04-04 Thread AJ Chen
Pagination is by numbers, e.g. get 10 rows starting from number 100 or getRows(100, 10). Column family uses KeyRange to get a section of the table. This assumes the key is always sorted. Is it true? Secondly, the key is normally a string. How do you translate the starting row number to a string key?

Re: Memcached protocol?

2010-04-04 Thread Paul Prescod
On Sun, Apr 4, 2010 at 2:13 PM, Ryan Daum wrote: > > I'm the author/maintainer of jmemcached; I'd be willing to do this and it'd > be quite easy to do, but Cassandra is missing a number of things which would > make it so we could only support a subset of the memcache protocol. Yes, I had presum

Re: Memcached protocol?

2010-04-04 Thread Benjamin Black
On Sun, Apr 4, 2010 at 4:52 PM, Paul Prescod wrote: > > In order to strictly implement Memcached behaviour (where the result > is returned immediately), you'd need to do a READ just after your > WRITE, to force the conflict engine to detect and resolve the > conflict. > Are you suggesting this wo

Re: how to paginate through CF

2010-04-04 Thread Vijay
>> This assumes the key is always sorted. Is it true? yes, keys are always sorted within a column family, >> How do you translate the starting row number to a string key? handled via client, you might need to store the last key which was returned and start from there to fetch the next set Fir
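
The "store the last key and start from there" approach described above can be sketched roughly as follows. This is only an illustration, not Cassandra client code: `SortedRows` is a hypothetical in-memory stand-in for a column family whose row keys come back in sorted string order, and `get_range` and `paginate` are made-up names rather than Thrift API calls. The sketch assumes the range start key is inclusive, so every page after the first asks for one extra row and drops the repeated key.

```python
import bisect


class SortedRows:
    """Hypothetical in-memory stand-in for a column family with sorted row keys."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self._keys = sorted(self._rows)

    def get_range(self, start_key, count):
        """Return up to `count` (key, value) pairs with key >= start_key."""
        i = bisect.bisect_left(self._keys, start_key)
        return [(k, self._rows[k]) for k in self._keys[i:i + count]]


def paginate(store, page_size):
    """Yield pages of rows, resuming each page from the last key seen."""
    start = ""      # the empty string sorts before every other key
    first = True
    while True:
        # The start key is inclusive, so later pages fetch one extra row
        # and discard the first one (the last key of the previous page).
        want = page_size if first else page_size + 1
        rows = store.get_range(start, want)
        if not first:
            rows = rows[1:]
        if not rows:
            return
        yield rows
        start = rows[-1][0]
        first = False


store = SortedRows({"a": 1, "b": 2, "c": 3, "d": 4, "e": 5})
pages = list(paginate(store, 2))
```

Note that this walks pages in order; jumping straight to "row number 100" is not possible without scanning, because the client only ever knows keys, not row offsets.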

Re: Bug regarding removing and retrieving entire supercolumn.

2010-04-04 Thread Arash Bazrafshan
Please consider this thread closed. This was a sloppy description of the bug, written by me at a time when I should actually be sleeping. I wrote a more detailed specification at a later time, and the result of it was that the bug was confirmed and in fact it had already been solved (see CASSANDR

Re: Deployment on AWS

2010-04-04 Thread Dan Di Spaltro
A little off-topic, but is an availability zone in a separate physical datacenter? On Sat, Apr 3, 2010 at 5:08 PM, Benjamin Black wrote: > Right, you determine AZ by looking at the metadata.  us-east-1a is a > different AZ from us-east-1b.  You can't infer anything beyond that, > either with the

Re: Deployment on AWS

2010-04-04 Thread Masood Mortazavi
See here: http://docs.amazonwebservices.com/AWSEC2/latest/UserGuide/index.html?concepts-regions-availability-zones.html (My question remains. I'm interested in seed configuration practice/recipe when deploying on AWS. In the scenario, assume Cassandra sits behind some other part of the service --

Re: Deployment on AWS

2010-04-04 Thread Michael Russo
On 2010-04-04, at 10:18 PM, Masood Mortazavi wrote: > > (My question remains. I'm interested in seed configuration practice/recipe > when deploying on AWS. In the scenario, assume Cassandra sits behind some > other part of the service -- say, web container -- that are then exposed > publicly. C

Cassandra Design or another solution

2010-04-04 Thread JKnight JKnight
Dear all, I want to design the data storage to store user's mark for a large amount of user. When system run, user's mark changes frequently. I want to list top 10 user have largest mark. Could we use Cassandra for store this data? Ex, here my Cassandra data model design: Mark{ userId{

Re: Deployment on AWS

2010-04-04 Thread Benjamin Black
Not guaranteed within the same region. On Sun, Apr 4, 2010 at 6:48 PM, Dan Di Spaltro wrote: > A little off-topic, but is an availability zone in a separate physical > datacenter? >

Re: Memcached protocol?

2010-04-04 Thread Paul Prescod
On Sun, Apr 4, 2010 at 5:06 PM, Benjamin Black wrote: > ... > > Are you suggesting this would give you counter semantics? Yes: My understanding of cassandra-580 is that it gives you increment and decrement which are the basis of counters. Paul Prescod

Re: Cassandra Design or another solution

2010-04-04 Thread David Strauss
On 2010-04-05 02:48, JKnight JKnight wrote: > I want to design the data storage to store user's mark for a large > amount of user. When system run, user's mark changes frequently. What is a "mark"? > I want to list top 10 user have largest mark. Do the "marks" increase monotonically? What other

Re: Deployment on AWS

2010-04-04 Thread Krishna Sankar
Dan, AFAIK, AZ gives you infrastructure redundancy but not necessarily geographical dispersion. Regions are meant for that (as well as other characteristics). An interesting blog on this topic http://alestic.com/2009/07/ec2-availability-zones Cheers On 4/4/10 Sun Apr 4, 10, "Dan Di Spalt

Re: Memcached protocol?

2010-04-04 Thread Benjamin Black
On Sun, Apr 4, 2010 at 8:42 PM, Paul Prescod wrote: > On Sun, Apr 4, 2010 at 5:06 PM, Benjamin Black wrote: >> ... >> >> Are you suggesting this would give you counter semantics? > > Yes: My understanding of cassandra-580 is that it gives you increment > and decrement which are the basis of count

Re: Memcached protocol?

2010-04-04 Thread Paul Prescod
On Sun, Apr 4, 2010 at 8:48 PM, Benjamin Black wrote: > ... > > It gives vector clocks, but that does not mean you have a global > counter you can use as you are describing.  In particular, the "read > after write to trigger read repair" in cases where read repair is > actually required is most li