If you store only the key mappings in a column family, for custom ordering
of rows etc. for things like:
friends = {
user_id : { friendid1, friendid2, }
}
or
topForumPosts = {
forum_id1 : { post2343, post32343, post32223, ...}
}
Now on friends page or on the top_forum_posts page
Hello,
We want to use cassandra to store and retrieve time related data. Storing
the time-value pairs is easy and works perfectly. The problem arises when
retrieving the data. We do not only want to retrieve data from within a time
range, but also be able to get the previous and/or next data sample
You want to use 'reversed' in SliceRange (and a start with whatever
you want and a count of 2).
--
Sylvain
On Tue, Jun 15, 2010 at 12:01 PM, Bram van der Waaij
wrote:
> Hello,
>
> We want to use cassandra to store and retrieve time related data. Storing
> the time-value pairs is easy and works p
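To illustrate the 'reversed' suggestion above, here is a minimal sketch that imitates the slice semantics on a plain sorted Python list. It does not call the Thrift API; `slice_columns` and the sample timestamps are hypothetical names made up for this example. The assumption is that a reversed slice treats `start` as the upper bound and returns columns in descending order.

```python
from bisect import bisect_left, bisect_right


def slice_columns(names, start, count, reverse=False):
    """Imitate a column slice over sorted column names (illustration only).

    Forward: up to `count` names >= start, in ascending order.
    Reversed: up to `count` names <= start, in descending order.
    """
    if reverse:
        end = bisect_right(names, start)
        return names[max(0, end - count):end][::-1]
    begin = bisect_left(names, start)
    return names[begin:begin + count]


# Hypothetical column names: sample timestamps kept in sorted order.
timestamps = [100, 200, 300, 400, 500]

# The sample at 300 plus the one before it (reversed slice, count of 2).
print(slice_columns(timestamps, 300, 2, reverse=True))  # [300, 200]

# The sample at 300 plus the one after it (forward slice, count of 2).
print(slice_columns(timestamps, 300, 2))  # [300, 400]
```

With a count of 2 the first element is the sample at `start` (if one exists) and the second is its neighbour in the chosen direction.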
On Tue, Jun 15, 2010 at 04:29, S Ahmed wrote:
> If you store only the key mappings in a column family, for custom ordering
> of rows etc. for things like:
> friends = {
>
> user_id : { friendid1, friendid2, }
> }
> or
> topForumPosts = {
>
> forum_id1 : { post2343, post32343, post32223,
On Mon, 14 Jun 2010 16:01:57 -0700 Anthony Molinaro
wrote:
AM> Now I would assume that for 'production' you want to remove
AM>-ea
AM> and
AM>-XX:+HeapDumpOnOutOfMemoryError
AM> as well as adjust -Xms and Xmx accordingly, but are there any others
AM> which should be tweaked? Is there a
Well, it won't be a range; it will be random key lookups.
On Tue, Jun 15, 2010 at 8:44 AM, Gary Dusbabek wrote:
> On Tue, Jun 15, 2010 at 04:29, S Ahmed wrote:
> > If you store only the key mappings in a column family, for custom
> ordering
> > of rows etc. for things like:
> > friends = {
> >
>
In a read-mostly workload it will be better to denormalize the post
contents into subcolumns of the top_posts rows.
On Tue, Jun 15, 2010 at 2:29 AM, S Ahmed wrote:
> If you store only the key mappings in a column family, for custom ordering
> of rows etc. for things like:
> friends = {
>
> use
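A rough sketch of the trade-off behind the denormalization advice above, using plain Python dicts in place of column families. All names and sample values here (`posts`, `top_post_ids`, `denormalize`, the post titles) are hypothetical and only illustrate the idea; this is not Cassandra client code.

```python
# Normalized layout: the index row holds only post ids, so rendering the
# page needs one read for the index row plus one key lookup per post.
posts = {
    "post2343": {"title": "First", "body": "..."},
    "post32343": {"title": "Second", "body": "..."},
}
top_post_ids = {"forum_id1": ["post2343", "post32343"]}


def render_normalized(forum_id):
    return [posts[post_id]["title"] for post_id in top_post_ids[forum_id]]


def denormalize(index, source):
    """Copy each post's contents into the index row, as subcolumns would."""
    return {
        row_key: {post_id: dict(source[post_id]) for post_id in post_ids}
        for row_key, post_ids in index.items()
    }


# Denormalized layout: the whole page comes from a single row read.
top_posts = denormalize(top_post_ids, posts)


def render_denormalized(forum_id):
    return [columns["title"] for columns in top_posts[forum_id].values()]


print(render_normalized("forum_id1"))
print(render_denormalized("forum_id1"))
```

The cost is that every edit to a post has to be written to each row that carries a copy, which is why this suits a read-mostly workload.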
Perfect! Thanks :-)
2010/6/15 Sylvain Lebresne
> You want to use 'reversed' in SliceRange (and a start with whatever
> you want and a count of 2).
>
> --
> Sylvain
>
> On Tue, Jun 15, 2010 at 12:01 PM, Bram van der Waaij
> wrote:
> > Hello,
> >
> > We want to use cassandra to store and retrieve
If you are reading 500MB per thrift request from each of 3 threads,
then yes, simple arithmetic indicates that 1GB heap is not enough.
On Mon, Jun 14, 2010 at 6:13 PM, Caribbean410 wrote:
> Hi,
>
> I wrote 200k records to db with each record 5MB. Get this error when I use
> 3 threads (each threa
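The arithmetic in the reply above, spelled out as a small sketch. The figures (500MB per request, 3 threads, 1GB heap) come from that reply; treating the product as a lower bound on heap use is an assumption, since it ignores serialization copies and everything else the server keeps on the heap.

```python
MB = 1024 * 1024

request_bytes = 500 * MB   # data read per thrift request
threads = 3                # concurrent client threads
heap_bytes = 1024 * MB     # 1GB heap

# Lower bound on memory needed to hold all in-flight responses at once.
in_flight_bytes = threads * request_bytes

print(in_flight_bytes // MB)         # 1500
print(in_flight_bytes > heap_bytes)  # True
```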
Hi, I'm running Cassandra 0.6.2 on a dedicated 4 node cluster and I
also have a dedicated 4 node hadoop cluster. I'm trying to run a
simple map reduce job against a single column family and it only takes
32 map tasks before I get floods of thrift timeouts. That would make
sense to me if the cassandr
Sorry, the record size should be 5KB, not 5MB, because 4KB is still OK. I will
try Benjamin's suggestion.
-----Original Message-----
From: Jonathan Ellis [mailto:jbel...@gmail.com]
Sent: Tuesday, June 15, 2010 8:09 AM
To: user@cassandra.apache.org
Subject: Re: java.lang.OutofMemoryerror: Java heap spa
Today I retried with the 2GB heap and now it's working. No more out of memory
error. Looks like I have to restart Cassandra several times before the new
changes take effect.
-----Original Message-----
From: Benjamin Black [mailto:b...@b3k.us]
Sent: Monday, June 14, 2010 7:46 PM
To: user@cassandra.apache.or
You should only have to restart once per node to pick up config changes.
On Tue, Jun 15, 2010 at 9:41 AM, caribbean410 wrote:
> Today I retried with the 2GB heap and now it's working. No more out of memory
> error. Looks like I have to restart Cassandra several times before the new
> changes take effect.
>
(moving to user@)
On Mon, Jun 14, 2010 at 10:43 PM, Masood Mortazavi
wrote:
> Is a clearer interpretation of this statement (in
> conf/datacenters.properties) given anywhere else?
>
> # The sum of all the datacenter replication factor values should equal
> # the replication factor of the keyspa
http://wiki.apache.org/cassandra/ArticlesAndPresentations might help.
On Mon, Jun 14, 2010 at 1:13 PM, Johannes Weissensel
wrote:
> Hi everyone,
> I am new to NoSQL databases and especially column-oriented databases
> like Cassandra.
> I am a student of information systems and I evaluate a fittin
I am running a 10 node cassandra 0.6.1 cluster with a replication factor of 3.
To populate the database to perform my read benchmarking, I have 8 applications
using Thrift, each connecting to a different cassandra server and writing
100,000 rows of data (100 KB each row), using a consistencyLev
You are likely exhausting your heap space (probably still at the very
small 1G default?), and maximizing the amount of resource consumption
by using CL.ALL. Why are you using ALL?
On Tue, Jun 15, 2010 at 11:58 AM, Julie wrote:
> I am running a 10 node cassandra 0.6.1 cluster with a replication f
Benjamin Black writes:
>
> You are likely exhausting your heap space (probably still at the very
> small 1G default?), and maximizing the amount of resource consumption
> by using CL.ALL. Why are you using ALL?
>
> On Tue, Jun 15, 2010 at 11:58 AM, Julie wrote:
...
>
How are you doing your inserts?
I draw a clear line between 1) bootstrapping a cluster with data and 2)
simulating expected/projected read/write behavior.
If you are bootstrapping then I would look into the batch_mutate APIs. They
allow you to improve your performance on writes dramatically.
I
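A minimal sketch of the batching idea mentioned above: group rows into fixed-size batches and send each batch in one call instead of one call per row. `send_batch` is a hypothetical callback; in a real loader it would wrap the client's batch_mutate call, whose exact arguments are not shown here. The batch size of 100 is an arbitrary assumption, not a recommendation.

```python
def chunked(rows, batch_size):
    """Yield consecutive slices of `rows`, each at most `batch_size` long."""
    for i in range(0, len(rows), batch_size):
        yield rows[i:i + batch_size]


def bulk_load(rows, send_batch, batch_size=100):
    """Send rows in batches; return the number of calls made."""
    calls = 0
    for batch in chunked(rows, batch_size):
        send_batch(batch)  # hypothetical: one round trip per batch
        calls += 1
    return calls


# Usage with a stand-in sender that just records what it was given.
sent = []
calls = bulk_load(list(range(250)), sent.append, batch_size=100)
print(calls, [len(batch) for batch in sent])
```

Keeping each batch small also bounds how much a single request asks the server to hold in memory at once.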
On Tue, Jun 15, 2010 at 1:40 PM, Julie wrote:
> Thanks for your reply. Yes, my heap space is 1G. My vms have only 1.7G of
> memory so I hesitate to use more.
Then write slower. There is no free lunch.
b
On Tue, Jun 15, 2010 at 1:58 PM, Julie wrote:
> Coinciding with my write timeouts, all 10 of my cassandra servers are getting
> the following exception written to system.log:
"Value too large for defined data type" looks like a bug found in
older JREs. Upgrade to u19 or later.
> Another thing t
Hello,
Phil Stanhope writes:
>
> How are you doing your inserts?
>
> I draw a clear line between 1) bootstrapping a cluster with data and 2)
> simulating expected/projected read/write behavior.
>
> If you are bootstrapping then I would look into the batch_mutate APIs. They
> allow you to imp
On Tue, Jun 15, 2010 at 5:15 PM, Julie wrote:
> I'm also baffled that after all compactions are done on every one of the 10
> servers, about 5 out of 10 servers are still at 40% CPU usage, although they
> are doing 0 disk IO. I am not running anything else on these server
> nodes except fo
Firstly, my apologies for the off-topic message,
but I thought most people on this list would be knowledgeable and
interested in this kind of thing.
We are looking to find an open source, scalable solution to do RT
aggregation and stream processing (similar to what the 'hop' project
http://cod
hello,
I have a 4 node Cassandra cluster with 0.6.1 installed. We've been running
a mixed read / write workload to test how it works in our environment; we run
about 4M batch mutations and 40M get_range_slice requests over 6 to 8 hours
that load about 10 to 15 GB of data.
Yesterday while there was
Known bug, fixed in latest 0.6 release.
On Tue, Jun 15, 2010 at 3:29 PM, aaron wrote:
> hello,
>
> I have a 4 node Cassandra cluster with 0.6.1 installed. We've been running
> a mixed read / write workload to test how it works in our environment; we run
> about 4M batch mutations and 40M get_range_sl
Benjamin Black writes:
>
> Then write slower. There is no free lunch.
>
> b
Are you implying that clients need to throttle their collective load on the
server to avoid causing the server to fail? That seems undesirable. Is this a
side effect of a server bug, or is it part of the int
On Tue, Jun 15, 2010 at 3:55 PM, Charles Butterfield
wrote:
> Benjamin Black writes:
>
>>
>> Then write slower. There is no free lunch.
>>
>> b
>
> Are you implying that clients need to throttle their collective load on the
> server to avoid causing the server to fail? That seems undesi
Thanks, will move to 0.6.2.
Aaron
On Tue, 15 Jun 2010 15:55:46 -0700, Benjamin Black wrote:
> Known bug, fixed in latest 0.6 release.
>
> On Tue, Jun 15, 2010 at 3:29 PM, aaron wrote:
>> hello,
>>
>> I have a 4 node cassandra cluster with 0.6.1 installed. We've been
>> running
>> a mixed read
Thanks for your updates, good to know that your performance is better now.
Actually, if users ask for one record at a time, it will usually be done with
multiple threads, since the requests most likely come from different users.
If a single user wants 200k, and there is no difference to get 1
Benjamin Black writes:
>
> I am only saying something obvious: if you don't have sufficient
> resources to handle the demand, you should reduce demand, increase
> resources, or expect errors. Doing lots of writes without much heap
> space is such a situation (whether or not it is happen
I've setup a launchpad project, team, and a PPA (https://launchpad.net/ppa) for
Cassandra packages on Ubuntu here:
https://launchpad.net/cassandra-packages
https://launchpad.net/~cassandra-ubuntu
This team is currently made up of a few members of the Ubuntu Server Team. We'd
like to appeal to y
Actually, you shouldn't expect errors in the general case, unless you
are simply trying to use data that can't fit in available heap. There
are some practical limitations, as always.
If there aren't enough resources on the server side to service the
clients, the expectation should be that the serv
We are currently looking at a distributed database option and so far
Cassandra ticks all the boxes. However, I still have some questions.
Is there any need for archiving of Cassandra and what backup options are
available? As it is a no-data-loss system I'm guessing archiving is not
exactly rele
There is JSON import and export, of you want a form of external backup.
No, you can't hook event subscribers into the storage engine. You can modify
it to do this, however. It may not be trivial.
An easier way to do this would be to have a boundary system (or dedicated
thread, for example) consum
Doh! Replace "of" with "if" in the top line.
On Tue, Jun 15, 2010 at 7:57 PM, Jonathan Shook wrote:
> There is JSON import and export, of you want a form of external backup.
>
> No, you can't hook event subscribers into the storage engine. You can
> modify it to do this, however. It may not be t
On Tue, 15 Jun 2010 15:55:46 -0700, Benjamin Black wrote:
Known bug, fixed in latest 0.6 release.
>>> On 6/15/10 4:06 PM, aaron wrote:
>>> Thanks, will move to 0.6.2.
I believe that this thread refers to CASSANDRA-1169, and fix version for
that is the (unreleased) cassandra 0.6.3, not ("the
Thanks Jonathan, I was only asking about the event listeners because an
alternative we are considering is TIBCO Active Spaces which draws quite
a lot of parallels to Cassandra.
I guess it would be interesting to find out how other people use
Cassandra, i.e., is it your one stop shop for data st
This is not the bug to which I was referring. I don't recall the
number, perhaps someone else can assist on that front? I just know I
specifically upgraded to 0.6 trunk a bit before 0.6.2 to pick up the
fix (and it worked).
b
On Tue, Jun 15, 2010 at 6:07 PM, Rob Coli wrote:
>
>>> On Tue, 15 J
On Tue, Jun 15, 2010 at 4:44 PM, Charles Butterfield
wrote:
>
> I guess my point is that I have rarely run across database servers that die
> from either too many client connections, or too rapid client requests. They
> generally stop accepting incoming connections when there are too many
> conn
On Tue, Jun 15, 2010 at 4:44 PM, Charles Butterfield
wrote:
> To clarify the history here -- initially we were writing with CL=0 and had
> great performance but ended up killing the server. It was pointed out that
> we were really asking the server to accept and acknowledge an unbounded
> number
On Tue, Jun 15, 2010 at 4:58 PM, Jonathan Shook wrote:
> If there aren't enough resources on the server side to service the
> clients, the expectation should be that the servers have a graceful
> performance degradation, or in the worst case throw an error specific
> to resource exhaustion or expl
On Tue, Jun 15, 2010 at 6:07 PM, Anthony Ikeda
wrote:
>
> Thanks Jonathan, I was only asking about the event listeners because an
> alternative we are considering is TIBCO Active Spaces which draws quite a lot
> of parallels to Cassandra.
>
>
Based on painful production experience, I would not
https://issues.apache.org/jira/browse/CASSANDRA-1016
'Plugins', excuse me.
b
On 6/15/10 6:35 PM, Benjamin Black wrote:
jmhodges contributed a patch (I remain incompetent at Jira searches)
for 'coprocessors' to do what you want. That'd be where I'd start
looking.
https://issues.apache.org/jira/browse/CASSANDRA-1016
=Rob
Thanks Benjamin. Looking at the 'plugins' now :)
-----Original Message-----
From: Benjamin Black [mailto:b...@b3k.us]
Sent: Wednesday, 16 June 2010 11:35 AM
To: user@cassandra.apache.org
Subject: Re: Some questions about using Cassandra
On Tue, Jun 15, 2010 at 6:07 PM, Anthony Ikeda
wrote:
>
>
I think the one you're referring to is
https://issues.apache.org/jira/browse/CASSANDRA-1076
On Tue, Jun 15, 2010 at 8:16 PM, Benjamin Black wrote:
> This is not the bug to which I was referring. I don't recall the
> number, perhaps someone else can assist on that front? I just know I
> specific
Yes!
On Tue, Jun 15, 2010 at 6:44 PM, Jonathan Ellis wrote:
> I think the one you're referring to is
> https://issues.apache.org/jira/browse/CASSANDRA-1076
>
> On Tue, Jun 15, 2010 at 8:16 PM, Benjamin Black wrote:
>> This is not the bug to which I was referring. I don't recall the
>> number, p
The main change you'd commonly make is decreasing the max new gen size
on large heaps (say to 2GB) from the default of 1/3 of the heap.
IMO keeping heap dump on OOM around is a good idea in production; it
doesn't cost much (you're already screwed at the point where it starts
writing a dump, so why
Thank you for the update. For the select issue, right now we are focusing only
on reads and writes; later we may test the delete operation, which needs to
query all keys.
From: Dop Sun [mailto:su...@dopsun.com]
Sent: Tuesday, June 15, 2010 4:14 PM
To: user@cassandra.apache.org
Subject: RE: read operation i
51 matches