On Tue, May 25, 2010 at 6:35 PM, Jeremy Hanna wrote:
> What is the use case?
we end up with messed-up data in the database, so we run a MapReduce job
from time to time to find irregular data.
> Why are you using Cassandra versus using data stored in HDFS or HBase?
as of now our mapreduce task i
We're seeing RAM usage continually climb until, eventually, Cassandra becomes
unresponsive.
The JVM isn't OOM'ing. It has only committed 14/24GB of memory. So, I am
assuming that the memory usage is related to mmap'd IO. Fair assumption?
I tried setting the IO mode to standard, but it seemed to be
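For reference, the knob in question is DiskAccessMode in 0.6's
storage-conf.xml; a sketch, assuming the stock config layout:

    <DiskAccessMode>standard</DiskAccessMode>  <!-- default is "auto", which mmaps on 64-bit JVMs -->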
so the row cache contains both rows and keys, and if I have a large enough row
cache (in particular, if the row cache size equals the key cache size) then it's
just wasteful to keep another key cache and I should eliminate the key cache,
correct?
On Thu, May 27, 2010 at 1:21 AM, Jonathan Ellis wrote:
> It s
2010/5/26 Utku Can Topçu
> Hi Jeremy,
>
>
> > Why are you using Cassandra versus using data stored in HDFS or HBase?
> - I'm thinking of using it for realtime streaming of user data. While
> streaming the requests, I'm also using Lucandra for indexing the data in
> realtime. It's a better option
The example is a little confusing... but:
1) "sharding"
You can square the capacity by having a 2-level map.
CF1->row->value->CF2->row->value
This means finding some natural subgrouping or hash that provides a
good distribution.
2) "hashing"
You can also use some additional key hashing to sp
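A minimal Java sketch of the key-hashing idea (the bucket scheme and helper
name are mine, not from the thread):

    // Spread rows across N buckets by prefixing a deterministic hash of
    // the natural key; readers must use the same bucket count.
    static String bucketedKey(String naturalKey, int buckets) {
        int bucket = Math.abs(naturalKey.hashCode() % buckets);
        return bucket + ":" + naturalKey;
    }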
Hi all,
I'm currently looking at new database options for a URL shortener in order
to scale well with increased traffic as we add new features. Cassandra seems
to be a good fit for many of our requirements, but I'm struggling a bit to
find ways of designing certain indexes in Cassandra due to its
I'm very curious about this topic as well. Mainly, I'd like to know: is this
functionality handled through Hadoop MapReduce operations?
Nick
From: Jeremy Davis [mailto:jerdavis.cassan...@gmail.com]
Sent: Wednesday, May 26, 2010 3:31 PM
To: user@cassandra.apache.org
Subject: Thoughts on addi
Are there any thoughts on adding a more complex query to Cassandra?
At a high level, what I'm wondering is: would it be possible/desirable/in
keeping with the Cassandra plan to add something like a JavaScript blob onto
a get_range_slice etc. that does some further filtering on the results
before
It sure sounds like you're seeing the "my row cache contains the
entire hot data set, so the key cache only gets the cold reads"
effect.
On Wed, May 26, 2010 at 2:54 PM, Ran Tavory wrote:
> If I disable row cache the numbers look good - key cache hit rate is > 0, so
> it seems to be related to ro
On 26 May 2010 22:56, Miguel Verde wrote:
> Right, in C# this would be (not the most efficient way, but you get the
> idea):
> long timestamp = (DateTime.UtcNow.Ticks - new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc).Ticks) / 10;
>
>
> Yeah, you're fine provided:
a) All your client applications (which perform writes) are c
Right, in C# this would be (not the most efficient way, but you get the
idea):
long timestamp = (DateTime.UtcNow.Ticks - new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc).Ticks) / 10;
On Wed, May 26, 2010 at 4:50 PM, Mark Robson wrote:
> On 26 May 2010 22:42, Steven Haar wrote:
>
>> What is the best timestamp to use while
On 26 May 2010 22:42, Steven Haar wrote:
> What is the best timestamp to use while using Cassandra with C#? I have
> been using DateTime.Now.Ticks, but I have seen others using different
> things.
>
The standard that most clients seem to use is epoch microseconds, i.e.
microseconds since midnight UTC, 1 January 1970.
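For comparison, a minimal Java sketch of the same convention (assuming a
client that takes a plain long timestamp):

    // Epoch microseconds; millisecond resolution, which is fine as long
    // as every writer uses the same convention.
    long timestamp = System.currentTimeMillis() * 1000L;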
What is the best timestamp to use while using Cassandra with C#? I have been
using DateTime.Now.Ticks, but I have seen others using different things.
Thanks.
I wrote some Iterable<*> methods to do this for column families that
share key structure with OPP.
It is on the Hector examples page. Caveat emptor.
It does iterative chunking of the working set for each column family,
so that you can set the nominal transfer size when you construct the
Iterator/I
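The general shape of that chunked scan, sketched against the 0.6 Thrift
interface (structure and names are mine; the real code is on the Hector
examples page):

    import java.util.List;
    import org.apache.cassandra.thrift.*;

    // Page through a column family, using the last key of each batch as
    // the start of the next; with OPP the batches come back in key order.
    void scan(Cassandra.Client client, String ks, String cf, int chunk) throws Exception {
        SlicePredicate pred = new SlicePredicate();
        pred.setSlice_range(new SliceRange(new byte[0], new byte[0], false, 1000));
        String start = "";
        while (true) {
            KeyRange range = new KeyRange();
            range.setStart_key(start);
            range.setEnd_key("");
            range.setCount(chunk);
            List<KeySlice> rows = client.get_range_slices(ks,
                    new ColumnParent(cf), pred, range, ConsistencyLevel.QUORUM);
            for (KeySlice row : rows) {
                // process row.getKey() / row.getColumns(); note that each
                // batch after the first starts with the previous last key.
            }
            if (rows.size() < chunk) break;
            start = rows.get(rows.size() - 1).getKey();
        }
    }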
If I disable row cache the numbers look good - key cache hit rate is > 0, so
it seems to be related to row cache.
Interestingly, after running for a really long time with both row and key
caches, I do start to see a key cache hit rate > 0, but the numbers are so
small that it doesn't make sense.
On Wed, May 26, 2010 at 7:45 PM, Dodong Juan wrote:
>
> So I am not sure if you guys are familiar with OCM. Basically it is an ORM
> for Cassandra. Been testing it
>
In case anyone is interested I have posted a reply on the OCM issue
tracker where this was also raised.
http://github.com/charlie
I don't think that queries on a key range are valid unless you are using OPP.
As far as hashing the key for OPP goes, I take it to be the same as not
using OPP. It's really a matter of where it gets done, but it has much
the same effect.
(I think)
Jonathan
On Wed, May 26, 2010 at 12:51 PM, Peter H
Are there any exceptions in the log like the one in
https://issues.apache.org/jira/browse/CASSANDRA-1019 ?
If so you'll need to restart the moving node and try again.
On Wed, May 26, 2010 at 3:54 AM, Ran Tavory wrote:
> I ran nodetool move on one of the nodes and it seems stuck for a few hours
>
So I am not sure if you guys are familiar with OCM. Basically it is
an ORM for Cassandra. Been testing it
So I have created a model that has the following object relationship.
OCM generates the code from this that allows me to do easy
programmatic query from Java to Cassandra.
Object1-(M
So after CASSANDRA-579, anti-compaction won't be done on the source node,
and we can use more than 50% of the disk space if we use multiple column
families?
Thanks,
Sean
On Wed, May 26, 2010 at 10:01 AM, Stu Hood wrote:
> See https://issues.apache.org/jira/browse/CASSANDRA-579 for some
> backg
Correct me if I'm wrong here. Even though you can get your results with Random
Partitioner, it's a lot less efficient if you're going across different
machines to get your results. If you're doing a lot of range queries, it makes
sense to have things ordered sequentially so that if you do need
Hi,
I'm seeing a problem with inserting columns into one key using multiple
threads and I'm not sure if it's a bug or if it's my misunderstanding of how
insert/get_slice should work.
My setup is that I have two separate client processes, each with a single
thread, writing concurrently to Cassandra
See https://issues.apache.org/jira/browse/CASSANDRA-579 for some background
here: I was just about to start working on this one, but it won't make it in
until 0.7.
-----Original Message-----
From: "Sean Bridges"
Sent: Wednesday, May 26, 2010 11:50am
To: user@cassandra.apache.org
Subject: using
We're investigating Cassandra, and we are looking for a way to get Cassandra
to use more than 50% of its data disks. Is this possible?
For major compactions, it looks like we can use more than 50% of the disk if
we use multiple similarly sized column families. If we had 10 column
families of the s
Fantastic! Thank you.
On May 26, 2010, at 8:38 AM, Jeff Hammerbacher wrote:
> I've got a mostly working Avro server and client for HBase at
> http://github.com/hammer/hbase-trunk-with-avro and
> http://github.com/hammer/pyhbase. If you replace "scan" with "slice", it
> shouldn't be too much di
I've got a mostly working Avro server and client for HBase at
http://github.com/hammer/hbase-trunk-with-avro and
http://github.com/hammer/pyhbase. If you replace "scan" with "slice", it
shouldn't be too much different for Cassandra...
On Mon, May 17, 2010 at 10:31 AM, Wellman, David wrote:
> I s
In the Thrift API, I guess you need to use read/insert and then delete to
implement the move action.
If you can shut Cassandra down, maybe you can try sstable2json to export the
data, and json2sstable to import it back into a different column family file? I
haven't done it before, but I guess it
Sorry, I now realize that I used the wrong terminology.
What I really meant was moving or copying the ROWS defined by a KeyRange
between ColumnFamilies.
Do you think it's doable in an efficient way?
On Wed, May 26, 2010 at 3:14 PM, Dop Sun wrote:
> There are no single API call to achieve
There is no single API call to achieve this.
It's a read and a write, plus a delete (if moving), I guess.
From: Utku Can Topçu [mailto:u...@topcu.gen.tr]
Sent: Wednesday, May 26, 2010 9:09 PM
To: user@cassandra.apache.org
Subject: Moving/copying columns in between ColumnFamilies
He
Hey All,
Assume I have two ColumnFamilies in the same keyspace and I want to move or
copy a range of columns (defined by a keyrange) into another columnfamily.
Do you think it's somehow possible and doable with the current support of
the API, if so how?
Best Regards,
Utku
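A rough Java sketch of that read/write/delete approach against the 0.6 Thrift
interface (no batching or error handling; method and variable names are mine):

    import org.apache.cassandra.thrift.*;

    // Copy every column of each row into the destination CF, then
    // optionally delete the whole source row.
    void moveRows(Cassandra.Client client, String ks, String srcCf, String dstCf,
                  Iterable<KeySlice> rows, boolean move) throws Exception {
        for (KeySlice row : rows) { // e.g. fetched with get_range_slices
            for (ColumnOrSuperColumn cosc : row.getColumns()) {
                Column c = cosc.getColumn();
                ColumnPath path = new ColumnPath(dstCf);
                path.setColumn(c.getName());
                client.insert(ks, row.getKey(), path, c.getValue(),
                              c.getTimestamp(), ConsistencyLevel.QUORUM);
            }
            if (move) {
                // A ColumnPath naming only the CF deletes the entire row.
                client.remove(ks, row.getKey(), new ColumnPath(srcCf),
                              System.currentTimeMillis() * 1000L,
                              ConsistencyLevel.QUORUM);
            }
        }
    }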
The summary of your question is: is batch_mutate atomic in the general
sense, meaning when used with multiple keys, multiple column families etc,
correct?
On Wed, May 26, 2010 at 12:45 PM, Todd Nine wrote:
> Hey guys,
> I originally asked this on the Hector group, but no one was sure of the
>
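For reference, the shape in question, sketched against the 0.6 Thrift
interface (keyspace, CF, and key names are illustrative; client is an open
Cassandra.Client):

    import java.util.*;
    import org.apache.cassandra.thrift.*;

    // mutation_map: row key -> column family -> list of mutations. One
    // call can touch multiple keys and CFs; the atomicity question is
    // whether the whole map applies together.
    long ts = System.currentTimeMillis() * 1000L;
    Mutation m = new Mutation();
    ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
    cosc.setColumn(new Column("col".getBytes(), "val".getBytes(), ts));
    m.setColumn_or_supercolumn(cosc);

    Map<String, Map<String, List<Mutation>>> mutationMap =
            new HashMap<String, Map<String, List<Mutation>>>();
    mutationMap.put("key1", Collections.singletonMap("CF1", Arrays.asList(m)));
    mutationMap.put("key2", Collections.singletonMap("CF2", Arrays.asList(m)));
    client.batch_mutate("Keyspace1", mutationMap, ConsistencyLevel.QUORUM);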
I ran nodetool move on one of the nodes and it seems stuck for a few hours
now.
I've been able to run it successfully in the past, but this time it looks
stuck.
Streams shows as if there's work in progress, but the same files have been
at the same position for a few hours.
I've also checked the c
Hey guys,
I originally asked this on the Hector group, but no one was sure of the
answer. Can I get some feedback on this? I'd prefer to avoid having to use
something like Cages if I can for most of our use cases. Long term I can
see we'll need to use something like Cages, especially when it c
Just in case you don't know: You can do range searches on keys even with
Random Partitioner, you just won't get the results in order. If this is good
enough for you (e.g. if you can order the results on the client, or if you
just need to get the right answer, but not the right order), then you shou
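e.g. a trivial sketch of ordering on the client (0.6 Thrift types assumed;
rows is the List<KeySlice> returned by get_range_slices):

    import java.util.*;

    // Keys come back in token order under RandomPartitioner; sort them
    // ourselves before presenting the results.
    List<String> keys = new ArrayList<String>();
    for (KeySlice ks : rows) keys.add(ks.getKey());
    Collections.sort(keys);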
On Wed, May 26, 2010 at 9:54 AM, Mishail wrote:
> You could either use one remove(keyspace, key, column_path, timestamp,
> consistency_level) call per key, or wait until
> https://issues.apache.org/jira/browse/CASSANDRA-494 is fixed (to use
> SliceRange in the Deletion)
thanks, I'm already doing that b
You could either use one remove(keyspace, key, column_path, timestamp,
consistency_level) call per key, or wait until
https://issues.apache.org/jira/browse/CASSANDRA-494 is fixed (to use
SliceRange in the Deletion)
gabriele renzi wrote:
>
> Is it correct that I cannot perform a row delete via batchMuta
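A minimal sketch of the first option, one remove() per key (0.6 Thrift
interface; keyspace, CF, and variable names are illustrative):

    // A ColumnPath naming only the column family deletes the whole row.
    long ts = System.currentTimeMillis() * 1000L;
    for (String key : keysToClear) {
        client.remove("Keyspace1", key, new ColumnPath("MyCF"), ts,
                      ConsistencyLevel.QUORUM);
    }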
This has been fixed in 0.7
(https://issues.apache.org/jira/browse/CASSANDRA-1027).
Not sure this has been merged in 0.6 though.
On Wed, May 26, 2010 at 9:05 AM, gabriele renzi wrote:
> Hi everyone,
>
> in our test code we perform a dummy "clear" by reading all the rows
> and deleting them (while
Hi everyone,
in our test code we perform a dummy "clear" by reading all the rows
and deleting them (while waiting for Cassandra 0.7 & CASSANDRA-531).
A couple of days ago I updated our code to perform this operation
using batchMutate, but there seems to be no way to perform a deletion
of the whole