Hi Yang,
You could also use Hadoop (e.g. Brisk) and run a MapReduce job or Hive
query to extract and summarize/renormalize the data into whatever
format you like.
If you use sstable2json, you have to run it against every sstable file
on every node and then deduplicate/merge all the output across machines,
which is what the MapReduce route handles for you.
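Roughly, the job setup looks like this; just an untested sketch against
the 0.8 Hadoop integration (the ConfigHelper method names moved around a
bit between 0.7 and 0.8, and the keyspace/CF names are placeholders):

import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.SortedMap;

import org.apache.cassandra.db.IColumn;
import org.apache.cassandra.hadoop.ColumnFamilyInputFormat;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.cassandra.thrift.SliceRange;
import org.apache.cassandra.utils.ByteBufferUtil;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class CassandraExport
{
    // Each map() call receives one row: its key plus the columns matched
    // by the slice predicate set up below.
    public static class ExportMapper
        extends Mapper<ByteBuffer, SortedMap<ByteBuffer, IColumn>, Text, Text>
    {
        @Override
        protected void map(ByteBuffer key, SortedMap<ByteBuffer, IColumn> columns, Context context)
            throws IOException, InterruptedException
        {
            // Reformat the row however the target system wants it; here, tab-separated.
            StringBuilder line = new StringBuilder();
            for (IColumn col : columns.values())
                line.append(ByteBufferUtil.string(col.value())).append('\t');
            context.write(new Text(ByteBufferUtil.string(key)), new Text(line.toString()));
        }
    }

    public static void main(String[] args) throws Exception
    {
        Job job = new Job(new Configuration(), "cassandra-export");
        job.setJarByClass(CassandraExport.class);
        job.setMapperClass(ExportMapper.class);
        job.setNumReduceTasks(0); // map-only: just dump the rows
        job.setInputFormatClass(ColumnFamilyInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileOutputFormat.setOutputPath(job, new Path(args[0]));

        Configuration conf = job.getConfiguration();
        ConfigHelper.setRpcPort(conf, "9160");
        ConfigHelper.setInitialAddress(conf, "localhost");
        ConfigHelper.setPartitioner(conf, "org.apache.cassandra.dht.RandomPartitioner");
        ConfigHelper.setInputColumnFamily(conf, "MyKeyspace", "MyCF"); // placeholders
        // Pull up to 1000 columns per row.
        SlicePredicate predicate = new SlicePredicate().setSlice_range(new SliceRange(
            ByteBufferUtil.EMPTY_BYTE_BUFFER, ByteBufferUtil.EMPTY_BYTE_BUFFER, false, 1000));
        ConfigHelper.setInputSlicePredicate(conf, predicate);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

ColumnFamilyInputFormat builds its splits from the ring's token ranges,
so the mappers cover every node in parallel and the dedup/merge falls
out of the job itself.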
Thanks Jonathan.
On Sun, May 22, 2011 at 9:56 PM, Jonathan Ellis wrote:
> I'd modify SSTableExport.serializeRow (the sstable2json class) to
> output to whatever system you are targeting.
>
> On Sun, May 22, 2011 at 11:19 PM, Yang wrote:
>> let's say periodically (daily) I need to dump out the contents of my
>> Cassandra DB, and do an import into Oracle, or some other custom data
>> store.
What's the naming convention for column families in Cassandra? I did not
find this in the wiki yet...
Thanks.
--
Dikang Gu
0086 - 18611140205
Thanks Aaron!
Through the commands "top" and "iostat", I find the IO system is not
overloaded yet, so I will check the data model.
And how can I get the row size for a specific key? Is there an API for
that yet?
Thanks.
--
Dikang Gu
0086 - 18611140205
On Sunday, May 22, 2011 at 6:15 PM, aaron morton wrote:
I'd modify SSTableExport.serializeRow (the sstable2json class) to
output to whatever system you are targeting.
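The shape of the change is just swapping the JSON writer for a sink
aimed at your target system. Purely as illustration, a hypothetical JDBC
sink you could call as serializeRow walks each row (not the real
serializeRow signature, which varies by version; table and column names
are made up, and the Oracle JDBC driver has to be on the classpath):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

// Hypothetical sink: instead of printing JSON, push each decoded row
// into a relational table over JDBC.
public class OracleSink
{
    private final Connection conn;
    private final PreparedStatement insert;

    public OracleSink(String jdbcUrl, String user, String pass) throws Exception
    {
        conn = DriverManager.getConnection(jdbcUrl, user, pass);
        insert = conn.prepareStatement(
            "INSERT INTO cassandra_dump (row_key, col_name, col_value) VALUES (?, ?, ?)");
    }

    // Call once per column as the row is walked.
    public void write(byte[] rowKey, byte[] name, byte[] value) throws Exception
    {
        insert.setBytes(1, rowKey);
        insert.setBytes(2, name);
        insert.setBytes(3, value);
        insert.executeUpdate();
    }

    public void close() throws Exception
    {
        insert.close();
        conn.close();
    }
}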
On Sun, May 22, 2011 at 11:19 PM, Yang wrote:
> let's say periodically (daily) I need to dump out the contents of my
> Cassandra DB, and do an import into Oracle, or some other custom data
> store.
Let's say periodically (daily) I need to dump out the contents of my
Cassandra DB and do an import into Oracle, or some other custom data
store.
Is there a way to do it?
I checked that you can do a multi-get(), but you probably can't pass the
entire key domain into the API, because the entire DB would be far too
large to return in one call.
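The closest thing I've found so far is paging over the whole token range
with get_range_slices, using the last key of each page as the start of
the next. A rough, untested sketch (keyspace/CF names are made up):

import java.nio.ByteBuffer;
import java.util.List;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.thrift.KeyRange;
import org.apache.cassandra.thrift.KeySlice;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.cassandra.thrift.SliceRange;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;

public class DumpAllRows
{
    private static final int PAGE_SIZE = 100;
    private static final ByteBuffer EMPTY = ByteBuffer.wrap(new byte[0]);

    public static void main(String[] args) throws Exception
    {
        TFramedTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        transport.open();
        client.set_keyspace("MyKeyspace"); // placeholder

        // Up to 1000 columns per row.
        SlicePredicate predicate = new SlicePredicate().setSlice_range(
            new SliceRange(EMPTY, EMPTY, false, 1000));
        ColumnParent parent = new ColumnParent("MyCF"); // placeholder

        ByteBuffer start = EMPTY;
        while (true)
        {
            KeyRange range = new KeyRange().setCount(PAGE_SIZE)
                                           .setStart_key(start)
                                           .setEnd_key(EMPTY);
            List<KeySlice> page =
                client.get_range_slices(parent, predicate, range, ConsistencyLevel.ONE);
            for (KeySlice row : page)
            {
                if (row.key.equals(start))
                    continue; // pages overlap by one key, skip the repeat
                // hand row.key / row.columns off to oracle or whatever here
            }
            if (page.size() < PAGE_SIZE)
                break; // last page
            start = page.get(page.size() - 1).key; // resume from the last key seen
        }
        transport.close();
    }
}

Rows come back in token order (effectively random under
RandomPartitioner) rather than key order, but for a full dump that
shouldn't matter.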
I believe that the key reason is souped-up performance for the most
recent data.
And yes, an "intelligent flush" leaves you vulnerable to some data loss.
/***
sent from my android...please pardon occasional typos as I respond @ the
speed of thought
***/
Thanks,
I did read through that PDF doc and went through the counters code in
0.8-rc2; I think I understand the logic in that code.
In my hypothetical implementation, I am not suggesting overstepping the
complicated logic in the counters code, since the extra module will still
need to enter the increment path.
I've already tried running nodetool repair several times before, but it
didn't seem to help.
Now I've upgraded Cassandra to 0.7.6 and run nodetool scrub and nodetool
repair (twice). One of the problematic nodes seems to return correct
results now, but the second one still returns inconsistent data.
The implementation of distributed counters is more complicated than your
example; there is a design doc attached to the ticket here:
https://issues.apache.org/jira/browse/CASSANDRA-1072
By collapsing some of those +1 increments together at the application
level there is less work for the cluster.
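For example, a small client-side buffer that turns many +1s for the same
key into a single add() per flush. A minimal, untested sketch against the
0.8 thrift counter API (the CF and column names are made up):

import java.nio.ByteBuffer;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.thrift.CounterColumn;

// Coalesces many +1s for the same key into one add() per flush interval.
public class IncrementBuffer
{
    private final Map<String, AtomicLong> pending = new ConcurrentHashMap<String, AtomicLong>();

    public void increment(String key)
    {
        AtomicLong n = pending.get(key);
        if (n == null)
        {
            AtomicLong fresh = new AtomicLong();
            n = pending.putIfAbsent(key, fresh);
            if (n == null)
                n = fresh;
        }
        n.incrementAndGet();
    }

    // Call periodically: one add() per key instead of one per event.
    public void flush(Cassandra.Client client) throws Exception
    {
        ColumnParent parent = new ColumnParent("Counters"); // made-up CF
        ByteBuffer colName = ByteBuffer.wrap("hits".getBytes("UTF-8")); // made-up column
        for (Map.Entry<String, AtomicLong> entry : pending.entrySet())
        {
            long delta = entry.getValue().getAndSet(0);
            if (delta == 0)
                continue;
            client.add(ByteBuffer.wrap(entry.getKey().getBytes("UTF-8")),
                       parent, new CounterColumn(colName, delta), ConsistencyLevel.ONE);
        }
    }
}

The usual trade-off applies: any increments still sitting in the buffer
when the client dies are lost.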
It's hard to say the latency is too high without knowing how many columns
and how many bytes you are asking for. It's also handy to know what the
query looks like, i.e. is it a slice or a get by name, and what the CF
definition looks like.
Latency reported at the CF or KS level is for local read / write
operations.
You should be using Cassandra 0.7+; 0.6 is basically end of life. 0.7 is
stable and 0.8 is at release candidate.
There is limited support for using 0.6.
Grab a copy of 0.7, run bin/cassandra-cli, and look at the help in there.
Cheers
-
Aaron Morton
Freelance Cassandra Developer