> * In the design discussed it is perfectly reasonable for data not to be on the archive node.
>
> You mean when having the 2 DC setup I mentioned and using TTL? If I have the 2 DC setup but don't use TTL, I don't understand why data wouldn't be on the archive node.

Originally you were talking about taking the archive node down and then having HH replay the stored hints to it when it comes back. HH is not considered a reliable mechanism for achieving consistency; it is better in 1.0, but repair is AFAIK still considered the way to achieve consistency. For example, HH only collects hints for a down node for 1 hour. Also, a read operation will check consistency and may repair it; snapshots do not do that.
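To make the read repair point concrete, here is a minimal sketch assuming the Thrift-era pycassa client; the keyspace, column family, row key and host names are placeholders rather than anything from this thread:

    # Hypothetical sketch with pycassa (the Thrift client commonly used with
    # Cassandra 0.8/1.0). All names below are illustrative assumptions.
    from pycassa import ConnectionPool, ColumnFamily, ConsistencyLevel

    pool = ConnectionPool('monitoring', server_list=['node1:9160', 'node3:9160'])
    data_cf = ColumnFamily(pool, 'data')

    # A QUORUM read makes the coordinator compare replicas and write the newest
    # version back where they disagree (read repair). Taking a snapshot performs
    # no such comparison, so an unrepaired archive node can simply be missing data.
    row = data_cf.get('some_row_key',
                      read_consistency_level=ConsistencyLevel.QUORUM)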
Finally, if you write into the DC with 2 nodes at a CL other than QUORUM or EACH_QUORUM, there is no guarantee that the write will be committed in the other DC.

> So what data format should I use for historical archiving?

Plain text files, with documentation, so that anyone who follows you can work with the data.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/01/2012, at 12:31 AM, Alexandru Sicoe wrote:

> Hi,
>
> On Wed, Jan 4, 2012 at 9:54 PM, aaron morton <aa...@thelastpickle.com> wrote:
> Some thoughts on the plan:
>
> * You are monkeying around with things; do not be surprised when surprising things happen.
>
> I am just trying to explore different solutions for solving my problem.
>
> * Deliberately unbalancing the cluster may lead to Bad Things happening.
>
> I will take your advice on this. I would have liked to have an extra node so as to have 2 nodes in each DC.
>
> * In the design discussed it is perfectly reasonable for data not to be on the archive node.
>
> You mean when having the 2 DC setup I mentioned and using TTL? If I have the 2 DC setup but don't use TTL, I don't understand why data wouldn't be on the archive node.
>
> * Truncate is a cluster-wide operation and all nodes must be online before it will start.
> * Truncate will snapshot before deleting data; you could use this snapshot.
> * TTL for a column applies to that column no matter which node it is on.
>
> Thanks for clarifying these!
>
> * IMHO Cassandra data files (sstables or JSON dumps) are not a good format for a historical archive, nothing against Cassandra. You need the lowest common format.
>
> So what data format should I use for historical archiving?
>
>
> If you have the resources for a second cluster, could you put the two together and just have one cluster with a very large retention policy? One cluster is easier than two.
>
> I am constrained to have limited retention on the Cassandra cluster that is collecting the data. Once I archive the data for long term storage, I cannot bring it back into the same Cassandra cluster that collected it in the first place, because that cluster is in an enclosed network with strict rules. I have to load it into another cluster outside the enclosed network. It's not that I have the resources for a second cluster; I am forced to use a second cluster.
>
>
> Assuming there is no business case for this, consider either:
>
> * Dumping the historical data into a Hadoop (with or without HDFS) cluster with high compression. If needed you could then run Hive / Pig to fill a companion Cassandra cluster with data on demand. Or just query using Hadoop.
> * Dumping the historical data to files with high compression and a roll-your-own solution to fill a cluster.
>
> Ok, thanks for these suggestions, I will have to investigate further.
>
> Also consider talking to DataStax about DSE.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 5/01/2012, at 1:41 AM, Alexandru Sicoe wrote:
>
>
> Cheers,
> Alex
>> Hi,
>>
>> On Tue, Jan 3, 2012 at 8:19 PM, aaron morton <aa...@thelastpickle.com> wrote:
>> Running a time based rolling window of data can be done using TTL. Backing up the nodes for disaster recovery can be done using snapshots. Restoring any point in time will be tricky because it may restore columns where the TTL has expired.
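A minimal sketch of the TTL-based rolling window just described, again assuming pycassa; the keyspace, CF, host and retention period are illustrative assumptions:

    # Hypothetical sketch: every column is written with a TTL, so the online
    # cluster keeps only a rolling window and expired data is dropped during
    # compaction. Names and the retention period are placeholders.
    import time
    from pycassa import ConnectionPool, ColumnFamily

    RETENTION_SECONDS = 14 * 24 * 3600   # e.g. a two-week online window

    pool = ConnectionPool('monitoring', server_list=['node1:9160'])
    data_cf = ColumnFamily(pool, 'data')

    def store_sample(row_key, sample_ts, value):
        # The column expires RETENTION_SECONDS after it is written; a snapshot
        # taken before expiry is then the only durable copy, and restoring it
        # later can bring back columns whose TTL has since lapsed, as noted above.
        data_cf.insert(row_key, {str(sample_ts): str(value)}, ttl=RETENTION_SECONDS)

    store_sample('some_row_key', time.time(), 3.14)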
>>
>> Yeah, that's the thing... if I want to use the system as I explain further below, I cannot back up the data (for later restoration) if I'm using TTLs.
>>
>>
>>> Will I get a single copy of the data in the remote storage or will it be twice the data (data + replica)?
>> You will get RF copies of the data. (By the way, there is no original copy.)
>>
>> Well, if I organize the cluster as I mentioned in the first email, I will get one copy of each row at a certain point in time on node2 if I take it offline and perform a major compaction and GC, won't I? I don't want to send duplicated data to the mass storage!
>>
>>
>> Can you share a bit more about the use case? How much data and what sort of read patterns?
>>
>>
>> I have several applications that feed into Cassandra about 2 million different variables (each representing a different monitoring value/channel). The system receives updates for each of these monitoring values at different rates. For each new update, the timestamp and value are recorded in a Cassandra name-value pair. The Cassandra schema is built using one CF for data and 4 other CFs for metadata (the metadata CFs are static - they barely grow at all once they've been loaded). The data CF uses a row for each variable. Each row acts as a 4 hour time bin. I achieve this by creating the row key as a concatenation of the first 6 digits of the timestamp at which the data is inserted + the unique ID of the variable. After the time bin expires, a new row will be created for the same variable ID.
>>
>> The system can currently sustain the insertion load. Now I'm looking into organizing the flow of data out of the cluster and into retrieval performance for random queries:
>>
>> Why do I need to organize the data out? Well, my requirement is to keep all the data coming into the system at the highest granularity for the long term (several years). The 3 node cluster I mentioned is the online cluster, which is supposed to be able to absorb the input load for a relatively short period of time, a few weeks (I am constrained to do this). After this period the data has to be shipped out of the cluster into a mass storage facility and the cluster needs to be emptied to make room for more data. Also, the online cluster will serve reads while it takes in data. For older data I am planning to have another cluster that gets loaded with data from the storage facility on demand and will serve reads from there.
>>
>> Why random queries? There is no specific use case for them; that's why I want to rely only on the built-in Cassandra indexes for now. Generally the client will ask for sets of values within a time range up to 8-10 hours in the past. Apart from some sets of variables that will almost always be asked for together, any combination is possible, because this system will feed into a web dashboard used for debugging purposes - to correlate and aggregate streams of variables. Depending on the problem, different variable combinations could be investigated.
>>
>> Can you split the data stream into a permanent log record and also into Cassandra for a rolling window of queryable data?
>>
>> In the end, essentially that's what I've been meaning to do by organizing the cluster in a 2 DC setup: I wanted to have 2 nodes in DC1 taking the data and the reads (the rolling window) and replicating to the node in DC2 (the permanent log - a single copy of the data).
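A minimal sketch of the time-binned row key scheme described above, in plain Python; the ':' separator and the exact formatting are assumptions for illustration, not details given in the thread:

    # Illustrative reconstruction of the row key layout described above.
    import time

    def row_key(variable_id, ts=None):
        """Row key = first 6 digits of the epoch timestamp + the variable ID.

        With a 10-digit epoch-seconds timestamp the prefix changes roughly
        every 10,000 seconds, giving the coarse multi-hour time bins the
        thread describes; all samples for a variable land in the same row
        until the bin rolls over.
        """
        ts = int(ts if ts is not None else time.time())
        return '%s:%s' % (str(ts)[:6], variable_id)

    def column_for_sample(sample_ts, value):
        # Within a row, each update is a name-value pair keyed by its full
        # timestamp, so a time-range query becomes a column slice on one row.
        return {str(sample_ts): str(value)}

    now = time.time()
    key = row_key('variable_0042', now)
    cols = column_for_sample(now, 3.14)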
>> I was thinking of implementing the rolling window by emptying the nodes in DC1 using truncate, instead of what you propose now with the rolling window using TTL.
>>
>> Ok, so I can do what you are saying easily if Cassandra allows me to have a TTL only on the first copy of the data and have the second replica without a TTL. Is this possible? I think it would solve my problem, as long as I can back up and empty the node in DC2 before the TTLs expire on the other 2 nodes.
>>
>> Cheers,
>> Alex
>>
>>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 3/01/2012, at 11:41 PM, Alexandru Sicoe wrote:
>>
>>> Hi,
>>>
>>> I need to build a system that stores data for years, so yes, I am backing up data in another mass storage system from where it can be accessed later. The data that I successfully back up has to be deleted from my cluster to make space for new data coming in.
>>>
>>> I was aware of snapshotting, which I will use for getting the data out of node2: it creates hard links to the SSTables of a CF, and then I can copy the files pointed to by those hard links to another location. After that I get rid of the snapshot (the hard links) and then I can truncate my CFs. It's clear that snapshotting will give me a single copy of the data if I have a unique copy of the data on one node. It's not clear to me what happens if I have, let's say, a cluster with 3 nodes and RF=2 and I take a snapshot on every node and copy those snapshots to remote storage. Will I get a single copy of the data in the remote storage or will it be twice the data (data + replica)?
>>>
>>> I've started reading about TTL and I think I can use it, but it's not clear to me how it would work in conjunction with the snapshotting/backing up I need to do. I mean, it will impose a deadline by which I need to perform a backup in order not to miss any data. Also, I might duplicate data if some columns don't expire fully between 2 backups. Any clarifications on this?
>>>
>>> Cheers,
>>> Alex
>>>
>>> On Tue, Jan 3, 2012 at 9:44 AM, aaron morton <aa...@thelastpickle.com> wrote:
>>> That sounds a little complicated.
>>>
>>> Do you want to get the data out for an off-node backup, or is it for processing in another system?
>>>
>>> You may get by using:
>>>
>>> * TTL to expire data via compaction
>>> * snapshots for backups
>>>
>>> Cheers
>>>
>>> -----------------
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 3/01/2012, at 11:00 AM, Alexandru Sicoe wrote:
>>>
>>>> Hi everyone and Happy New Year!
>>>>
>>>> I need advice on organizing the flow of data out of my 3-node Cassandra 0.8.6 cluster. I am configuring my keyspace to use the NetworkTopologyStrategy. I have 2 data centers, each with a replication factor of 1 (i.e. DC1:1; DC2:1). The configuration of the PropertyFileSnitch is:
>>>>
>>>> ip_node1=DC1:RAC1
>>>> ip_node2=DC2:RAC1
>>>> ip_node3=DC1:RAC1
>>>>
>>>> I assign tokens like this:
>>>> node1 = 0
>>>> node2 = 1
>>>> node3 = 85070591730234615865843651857942052864
>>>>
>>>> My write consistency level is ANY.
>>>>
>>>> My data sources are only inserting data into node1 & node3. Essentially what happens is that a replica of every input value will end up on node2. Node2 thus has a copy of the entire data written to the cluster.
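For reference, a minimal sketch of creating a keyspace with this DC1:1 / DC2:1 layout, assuming pycassa's SystemManager; the keyspace name, CF name and host are placeholders:

    # Hypothetical sketch: a NetworkTopologyStrategy keyspace with one replica
    # per data centre, matching the layout described above. With only one
    # replica per DC, a write at CL ANY or ONE carries no guarantee that the
    # other DC has the data, which is the point made earlier in the thread.
    from pycassa.system_manager import SystemManager, NETWORK_TOPOLOGY_STRATEGY

    sys_mgr = SystemManager('node1:9160')
    sys_mgr.create_keyspace('monitoring',
                            replication_strategy=NETWORK_TOPOLOGY_STRATEGY,
                            strategy_options={'DC1': '1', 'DC2': '1'})
    sys_mgr.create_column_family('monitoring', 'data')
    sys_mgr.close()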
>>>> When Node2 starts getting full, I want to have a script which pulls it off-line and does a sequence of operations (compaction / snapshotting / exporting / truncating the CFs) in order to back up the data to a remote place and to free the node up so that it can take more data. When it comes back on-line it will take hints from the other 2 nodes.
>>>>
>>>> This is how I plan on shipping data out of my cluster without any downtime or any major performance penalty. The problem is that I also want to truncate the CFs on node1 & node3 to free them of data as well. I don't know whether I can do this without any downtime or without any serious performance penalties. Is anyone using truncate to free CFs of data? How efficient is it?
>>>>
>>>> Any observations or suggestions are much appreciated!
>>>>
>>>> Cheers,
>>>> Alex
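A rough sketch of the snapshot-and-archive cycle described above, assuming nodetool is on the PATH and pycassa is available for the truncate step; the host, keyspace, CF, tag and path names are placeholders, and both the nodetool argument syntax and the snapshot directory layout vary between Cassandra versions:

    # Hypothetical per-node archive cycle: flush, snapshot, copy the snapshot to
    # mass storage, clear it, then truncate. All names and paths are assumptions.
    import shutil
    import subprocess
    from pycassa import ConnectionPool, ColumnFamily

    NODE = 'node2'
    KEYSPACE = 'monitoring'
    CF = 'data'
    SNAPSHOT_TAG = 'archive_2012_01'
    DATA_DIR = '/var/lib/cassandra/data'
    ARCHIVE_DIR = '/mass_storage/cassandra_archive'

    def nodetool(*args):
        # Shell out to nodetool against the archive node.
        subprocess.check_call(['nodetool', '-h', NODE] + list(args))

    # 1. Flush memtables so the snapshot captures everything written so far.
    nodetool('flush', KEYSPACE)

    # 2. Snapshot: hard links to the current SSTables (cheap and fast). Older
    #    nodetool versions take the snapshot name positionally; newer ones use -t.
    nodetool('snapshot', '-t', SNAPSHOT_TAG, KEYSPACE)

    # 3. Copy the snapshot to mass storage. In 0.8-era layouts snapshots live
    #    under <data_dir>/<keyspace>/snapshots/<tag>; later versions nest them
    #    under each CF directory instead.
    snapshot_path = '%s/%s/snapshots/%s' % (DATA_DIR, KEYSPACE, SNAPSHOT_TAG)
    shutil.copytree(snapshot_path, '%s/%s' % (ARCHIVE_DIR, SNAPSHOT_TAG))

    # 4. Drop the local hard links once the copy is safely off the node.
    nodetool('clearsnapshot', KEYSPACE)

    # 5. Truncate the CF. Remember the caveats above: truncate is cluster-wide,
    #    needs every node online, and snapshots again before deleting.
    pool = ConnectionPool(KEYSPACE, server_list=['%s:9160' % NODE])
    ColumnFamily(pool, CF).truncate()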