Re: Pyramid Organization of Data

2011-04-14 Thread Patrick Julien
On Thu, Apr 14, 2011 at 4:47 PM, Adrian Cockcroft wrote: > What you are asking for breaks the eventual consistency model, so you need to > create a separate cluster in NYC that collects the same updates but has a > much longer setting to timeout the data for deletion, or doesn't get the > delet

Re: Pyramid Organization of Data

2011-04-14 Thread Adrian Cockcroft
tore from our EC2 hosted Cassandra clusters >> to/from S3. Full backup/restore works fine, incremental and per-keyspace >> restore is being worked on. >> Adrian >> From: Patrick Julien >> Reply-To: "user@cassandra.apache.org" >> Date: Thu, 14 Apr 2011 05

Re: Pyramid Organization of Data

2011-04-14 Thread Patrick Julien
> From: Patrick Julien > Reply-To: "user@cassandra.apache.org" > Date: Thu, 14 Apr 2011 05:38:54 -0700 > To: "user@cassandra.apache.org" > Subject: Re: Pyramid Organization of Data > > Thanks,  I'm still working the problem so anything I find out

Re: Pyramid Organization of Data

2011-04-14 Thread Adrian Cockcroft
Apr 2011 05:38:54 -0700 To: "user@cassandra.apache.org" <user@cassandra.apache.org> Subject: Re: Pyramid Organization of Data Thanks, I'm still working the problem so anything I find out I will post here. Yes, you're right, th

Re: Pyramid Organization of Data

2011-04-14 Thread Patrick Julien
Thanks, I'm still working the problem so anything I find out I will post here. Yes, you're right, that is the question I am asking. No, adding more storage is not a solution since New York would have several hundred times more storage. On Apr 14, 2011 6:38 AM, "aaron morton" wrote: > I think yo

Re: Pyramid Organization of Data

2011-04-14 Thread aaron morton
I think your question is "NY is the archive, after a certain amount of time we want to delete the row from the original DC but keep it in the archive in NY." Once you delete a row, it's deleted as far as the client is concerned. GCGraceSeconds is only concerned with when the tombstone marker can

Re: Pyramid Organization of Data

2011-04-13 Thread Patrick Julien
We have been successful in implementing, at scale, the comments you posted here. I'm wondering what we can do about deleting data however. The way I see it, we have considerably more storage capacity in NY, but not in the other sites. Using this technique here, it occurs to me that we would repl

Re: Pyramid Organization of Data

2011-04-08 Thread Patrick Julien
thank you, I get it now. On Fri, Apr 8, 2011 at 7:15 PM, Jonathan Ellis wrote: > No, I'm suggesting you have a Tokyo keyspace that gets replicated as > {Tokyo: 2, NYC:1}, a London keyspace that gets replicated to {London: > 2, NYC: 1}, for example. > > On Fri, Apr 8, 2011 at 5:59 PM, Patrick Juli

Re: Pyramid Organization of Data

2011-04-08 Thread Jonathan Ellis
No, I'm suggesting you have a Tokyo keyspace that gets replicated as {Tokyo: 2, NYC:1}, a London keyspace that gets replicated to {London: 2, NYC: 1}, for example. On Fri, Apr 8, 2011 at 5:59 PM, Patrick Julien wrote: > I'm familiar with this material.  I hadn't thought of it from this > angle bu
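
As a rough illustration of the per-keyspace layout suggested in this message, the Python sketch below models each keyspace with its own replica count per data center. The keyspace names (`tokyo_history`, `london_history`), the dictionary layout, and both helper functions are hypothetical and are not Cassandra APIs; only the replica counts {Tokyo: 2, NYC: 1} and {London: 2, NYC: 1} come from the thread.

```python
# Hypothetical model of per-keyspace replication; this is not a Cassandra API.
# Replica counts per data center are taken from the message above;
# the keyspace names are assumptions made for illustration.
REPLICATION = {
    "tokyo_history": {"Tokyo": 2, "NYC": 1},
    "london_history": {"London": 2, "NYC": 1},
}


def datacenters_for(keyspace):
    """Return the sorted data centers holding at least one replica of the keyspace."""
    return sorted(dc for dc, replicas in REPLICATION[keyspace].items() if replicas > 0)


def keyspaces_in(datacenter):
    """Return the sorted keyspaces that place at least one replica in the data center."""
    return sorted(
        ks for ks, placement in REPLICATION.items() if placement.get(datacenter, 0) > 0
    )


if __name__ == "__main__":
    for ks in REPLICATION:
        print(ks, "->", datacenters_for(ks))
    print("NYC holds:", keyspaces_in("NYC"))
```

In this model each regional keyspace stays in its own data center plus NYC, so NYC ends up holding every keyspace while Tokyo and London never hold each other's data.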

Re: Pyramid Organization of Data

2011-04-08 Thread Patrick Julien
I'm familiar with this material. I hadn't thought of it from this angle but I believe what you're suggesting is that the different data centers would hold a different properties file for node discovery instead of using auto-discovery. So Tokyo, and others, would have a configuration that makes it

Re: Pyramid Organization of Data

2011-04-08 Thread Jonathan Ellis
On Fri, Apr 8, 2011 at 12:17 PM, Patrick Julien wrote: > The problem is this: we would like the historical data from Tokyo to > stay in Tokyo and only be replicated to New York.  The one in London > to be in London and only be replicated to New York and so on for all > data centers. > > Is this cu

Re: Pyramid Organization of Data

2011-04-08 Thread Joe Stump
A few lines of Java in a partitioning or rack aware strategy might be able to achieve this. --Joe -- Typed with big fingers on a small keyboard. On Apr 8, 2011, at 13:17, Patrick Julien wrote: > We have a pilot project running where all our historical data > worldwide would be stored using

Pyramid Organization of Data

2011-04-08 Thread Patrick Julien
We have a pilot project running where all our historical data worldwide would be stored using Cassandra. So far, we have been successful at getting the write and read throughput we need, in fact, coming in more than 27% above our needed capacity and well beyond what we were able to achieve with MySQL, v