Re: batch dump of data from cassandra?

2011-05-22 Thread Adrian Cockcroft
Hi Yang, You could also use Hadoop (i.e. Brisk), and run a MapReduce job or Hive query to extract and summarize/renormalize the data into whatever format you like. If you use sstable2json, you have to run it on every file on every node and then deduplicate/merge all the output across machines, which is what
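The deduplicate/merge step mentioned above can be sketched roughly as follows. This is only an illustration, not how sstable2json or Brisk works internally: it assumes each per-file dump has already been parsed into a dict mapping a row key to a list of [column name, value, timestamp] entries (a simplified layout chosen for this example), and it ignores tombstones and TTLs. `merge_dumps` is a hypothetical helper name.

```python
def merge_dumps(dumps):
    """Merge per-file dumps, keeping the newest value for each (row, column).

    Assumed input layout (simplified for illustration):
        {row_key: [[column_name, value, timestamp], ...], ...}
    """
    merged = {}
    for dump in dumps:
        for row_key, columns in dump.items():
            row = merged.setdefault(row_key, {})
            for name, value, timestamp in columns:
                current = row.get(name)
                # Last write wins: keep the entry with the highest timestamp.
                if current is None or timestamp > current[1]:
                    row[name] = (value, timestamp)
    return merged


# Two overlapping dumps, e.g. from two replicas of the same row.
node_a = {"user1": [["name", "yang", 100], ["city", "sf", 100]]}
node_b = {"user1": [["name", "yang2", 200]], "user2": [["name", "bob", 50]]}
merged = merge_dumps([node_a, node_b])
```

Holding everything in one dict only works for small data sets; for a full cluster dump the same reconciliation would have to be done in a streaming or distributed fashion, which is the work a MapReduce job takes care of.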

Re: batch dump of data from cassandra?

2011-05-22 Thread Yang
Thanks Jonathan.
On Sun, May 22, 2011 at 9:56 PM, Jonathan Ellis wrote:
> I'd modify SSTableExport.serializeRow (the sstable2json class) to
> output to whatever system you are targeting.
>
> On Sun, May 22, 2011 at 11:19 PM, Yang wrote:
>> let's say periodically (daily) I need to dump out the co

What's the valid name format of the column family in cassandra?

2011-05-22 Thread Dikang Gu
What's the naming convention of the column family in cassandra? I did not find this in the wiki yet... Thanks.

Re: How to reduce the Read Latency.

2011-05-22 Thread Dikang Gu
Thanks Aaron! From the "top" and "iostat" commands, I can see that the IO system is not overloaded yet, so I will check the data model. How can I get the row size for a specific key? Is there an API for that yet? Thanks. On Sunday, May 22, 2011 at 6:15 PM, aaron morton w

Re: batch dump of data from cassandra?

2011-05-22 Thread Jonathan Ellis
I'd modify SSTableExport.serializeRow (the sstable2json class) to output to whatever system you are targeting.
On Sun, May 22, 2011 at 11:19 PM, Yang wrote:
> let's say periodically (daily) I need to dump out the contents of my
> Cassandra DB, and do an import into oracle, or some other custom da

batch dump of data from cassandra?

2011-05-22 Thread Yang
Let's say that periodically (daily) I need to dump out the contents of my Cassandra DB and do an import into Oracle or some other custom data store. Is there a way to do it? I checked that you can do multi-get(), but you probably can't pass the entire key domain into the API, because the entire db would

Re: rainbird question (why is the 1minute buffer needed?)

2011-05-22 Thread Milind Parikh
I believe that the key reason is souped-up performance for the most recent data. And yes, "an intelligent flush" leaves you vulnerable to some data loss. On May

Re: rainbird question (why is the 1minute buffer needed?)

2011-05-22 Thread Yang
Thanks. I did read through that PDF doc and went through the counters code in 0.8-rc2, and I think I understand the logic in that code. In my hypothetical implementation, I am not suggesting to overstep the complicated logic in the counters code, since the extra module will still need to enter the incre

Re: Inconsistent results using secondary indexes between two DC

2011-05-22 Thread Wojciech Pietrzok
I've already tried running nodetool repair several times before, but it didn't seem to help. Now I've upgraded Cassandra to 0.7.6 and run nodetool scrub and nodetool repair (twice). One of the problematic nodes seems to return correct results now, but the second one still returns inconsistent data.

Re: rainbird question (why is the 1minute buffer needed?)

2011-05-22 Thread aaron morton
The implementation of distributed counters is more complicated than your example. There is a design doc attached to the ticket here: https://issues.apache.org/jira/browse/CASSANDRA-1072 By collapsing some of those +1 increments together at the application level there is less work for the clu
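As a rough illustration of collapsing +1 increments at the application level, here is a minimal sketch. It is not the Rainbird or Cassandra counters implementation; `IncrementBuffer` and `flush_fn` are hypothetical names, and `flush_fn` stands in for whatever client call actually writes a counter increment.

```python
from collections import Counter


class IncrementBuffer:
    """Accumulate increments in memory and send one write per key on flush."""

    def __init__(self, flush_fn):
        # flush_fn(key, total) is a placeholder for the real counter write.
        self._pending = Counter()
        self._flush_fn = flush_fn

    def incr(self, key, amount=1):
        self._pending[key] += amount

    def flush(self):
        # Swap the buffer first so new increments go into a fresh Counter.
        pending, self._pending = self._pending, Counter()
        for key, total in pending.items():
            self._flush_fn(key, total)
        return len(pending)


sent = []
buf = IncrementBuffer(lambda key, total: sent.append((key, total)))
for _ in range(5):
    buf.incr("page:home")
buf.incr("page:about", 2)
writes = buf.flush()  # six incr() calls collapse into two writes
```

A timer (for example once a minute) would normally call flush(). Anything still sitting in the buffer when the process dies is lost, which is the data-loss trade-off raised elsewhere in this thread.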

Re: How to reduce the Read Latency.

2011-05-22 Thread aaron morton
It's hard to say the latency is too high without knowing how many columns and how many bytes you are asking for. It's also handy to know what the query looks like, i.e. is it a slice or a get by name, and the CF level Latency reported at the CF or KS level are for local read / write operations.

Re: Re : Re : selecting data

2011-05-22 Thread aaron morton
You should be using Cassandra 0.7+; 0.6 is basically end of life. 0.7 is stable and 0.8 is in release candidate. There is limited support for using 0.6. Grab a copy of 0.7, run bin/cassandra-cli, and look at the help in there. Cheers - Aaron Morton Freelance Cassandra De