Re: RE: Multiple Keyspaces and memtables

2010-08-22 Thread Aaron Morton
Thanks Stu, I'll put my thinking cap on. Aaron On 23 Aug, 2010, at 04:15 PM, Stu Hood wrote: Hey Aaron, We are thinking a lot about multi-tenancy, but features to support multiple tenants on a cluster are only beginning to make their way into Cassandra. See http://wiki.apache.org/cassandra/MultiTena

Re: Poor performance; PHP & Thrift to blame

2010-08-22 Thread Juho Mäkinen
Beware that the native thrift php bindings have a bug which might change provided argument types. Check out the bug report which I filed: https://issues.apache.org/jira/browse/THRIFT-796 - Garo On Fri, Aug 20, 2010 at 10:35 AM, sasha wrote: > Julian Simon jules.com.au> writes: > >> >> Hi, >> >

Re: Thrift + PHP: help!

2010-08-22 Thread Juho Mäkinen
I have had to build a wrapper around php thrift calls which automatically retries the cassandra thrift operation in case there was a failure. It's not a proper solution, but it has worked well enough in our case to be reliable. Of course it would be nice if I didn't need such an ugly hack. - Garo
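A minimal sketch of the kind of retry wrapper described above, written in Python rather than PHP for illustration. The name call_with_retry and its parameters are hypothetical and are not taken from Garo's code or from the Thrift bindings; the operation is any zero-argument callable that performs the client call.

```python
import time


def call_with_retry(operation, attempts=3, delay_seconds=0.1, retry_on=(Exception,)):
    """Run operation(), retrying on failure up to `attempts` times in total.

    Hypothetical helper: `operation` is any zero-argument callable, for
    example a lambda wrapping a single client call.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on as error:
            last_error = error
            if attempt < attempts:
                # Linear backoff: wait a little longer after each failure.
                time.sleep(delay_seconds * attempt)
    # Every attempt failed, so surface the last error to the caller.
    raise last_error
```

A caller would wrap each operation, e.g. call_with_retry(lambda: client_call(args)), where client_call stands in for whatever the application invokes. Blind retries are only safe when repeating the operation is harmless, so this is a stopgap in the same spirit as the wrapper described above, not a fix for the underlying failures.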

Re: [RELEASE] 0.7.0 beta1

2010-08-22 Thread Peter Harrison
On Thu, Aug 19, 2010 at 12:34 AM, Ran Tavory wrote: > Happy to announce hector's support in 0.7.0. Hector is a java client for > cassandra which wraps the low level thrift interface with a nicer API, adds > monitoring, connection pooling and more. > I didn't do anything... The amazing 0.7.0 work

script to manually generate tokens

2010-08-22 Thread Artie Copeland
I created a simple python script to ask for a cluster size and then generate tokens for each node http://github.com/yestech/yessandbox/blob/master/cassandra-gen-tokens.py it is derived from Ben Black's cassandra talk: http://www.slideshare.net/benjaminblack/cassandra-summit-2010-operations-troub
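A rough sketch of what such a token generator can look like. This is not the linked script; it assumes RandomPartitioner, whose token space runs from 0 to 2**127, and spaces the nodes evenly across it. The function name generate_tokens is made up for this example.

```python
def generate_tokens(node_count, ring_size=2**127):
    """Return evenly spaced initial tokens for `node_count` nodes.

    Assumes RandomPartitioner's token range of 0 .. 2**127; node i gets
    token i * ring_size // node_count.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * ring_size // node_count for i in range(node_count)]


if __name__ == "__main__":
    # Example run for a four-node cluster.
    for index, token in enumerate(generate_tokens(4)):
        print("node %d: %d" % (index, token))
```

Each printed value would go into the matching node's initial token setting before that node first joins the ring.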

Re: RE: No Keyspace found after fresh .7 install

2010-08-22 Thread Aaron Morton
Also, there are some changes between 0.6* and 0.7* (dynamic schema and keys as byte[]) that AFAIK make them incompatible in the same cluster. The 0.7 node is going to try and tell the 0.6 node all about the new schema. A On 23 Aug, 2010, at 04:16 PM, Stu Hood wrote: See http://wiki.apache.org/cassandra/

RE: No Keyspace found after fresh .7 install

2010-08-22 Thread Stu Hood
See http://wiki.apache.org/cassandra/FAQ#no_keyspaces , or the Upgrading section in NEWS.txt. -Original Message- From: "Frank LoVecchio" Sent: Sunday, August 22, 2010 11:10pm To: user@cassandra.apache.org Subject: No Keyspace found after fresh .7 install I now have 2 running nodes : one

RE: Multiple Keyspaces and memtables

2010-08-22 Thread Stu Hood
Hey Aaron, We are thinking a lot about multi-tenancy, but features to support multiple tenants on a cluster are only beginning to make their way into Cassandra. See http://wiki.apache.org/cassandra/MultiTenant for a short listing of features that are being considered (including a mention of mem

No Keyspace found after fresh .7 install

2010-08-22 Thread Frank LoVecchio
I now have 2 running nodes : one running .6.3, and one running .7 beta 1. No Keyspace is recognized (aside from system) on the .7 beta 1 node; I saw this wiki page ->http://wiki.apache.org/cassandra/LiveSchemaUpdates about doing a one-time schema update, but I am still confused (and jsconsole is w

Re: data deleted came back after 9 days.

2010-08-22 Thread Jonathan Ellis
possibilities include 1) you're using something other than rackunwarepartitioner, which is the only one that behaves the way you describe 2) you've moved nodes around w/o running cleanup afterwards On Sun, Aug 22, 2010 at 10:09 PM, Zhong Li wrote: > Today, I checked all nodes data and logs, ther

Re: data deleted came back after 9 days.

2010-08-22 Thread Zhong Li
Today, I checked the data and logs on all nodes; very few nodes reported connections going up/down. I found some data on each node which I don't understand. The ReplicationFactor is 2, write Consistency Level is one. For example, the ring is like Node1(Token1)->Node2(Token2)->Node3(Token3)->...

Multiple Keyspaces and memtables

2010-08-22 Thread Aaron Morton
Am playing with the dynamic schema features of 0.7 and thinking about the impact of more keyspaces. Memtable settings are per CF, and so per Keyspace - so adding more keyspaces increases the amount of memory needed for memtables on the node. There are also some new features in the config for multi

Re: Privileges

2010-08-22 Thread Mark
On 8/21/10 4:36 PM, Benjamin Black wrote: For reference, I learned this from reading the source: thrift/CassandraServer.java On Sat, Aug 21, 2010 at 4:19 PM, Mark wrote: Is there any way to remove drop column family/keyspace privileges? It seems that SimpleAuthenticator out of the box is all or

Re: Cassandra Nodes Freeze/Down for ConcurrentMarkSweep GC?

2010-08-22 Thread Peter Schuller
> [4] Is GC ConcurrentMarkSweep a Stop-The-World situation? Where the > JVM cannot do anything else? Hence the node is technically Down? > Correct? No; the concurrent mark/sweep phase runs concurrently with your application. CMS will cause a stop-the-world full pause if it fails to complete a CMS

Re: Cassandra Nodes Freeze/Down for ConcurrentMarkSweep GC?

2010-08-22 Thread Benjamin Black
http://riptano.blip.tv/file/4012133/ On Sun, Aug 22, 2010 at 12:11 PM, Moleza Moleza wrote: > Hi, > I am setting up a cluster on a linux box. > Everything seems to be working great and I am watching the ring with: > watch -d -n 2 nodetool -h localhost ring > Suddenly, I see that one of the nodes

Re: Node OOM Problems

2010-08-22 Thread Benjamin Black
On Sun, Aug 22, 2010 at 2:03 PM, Wayne wrote: > From a testing whether cassandra can take the load long term I do not see it > as different. Yes bulk loading can be made faster using very different Then you need far more IO, whether it comes from faster drives or more nodes. If you can achieve 1

Re: Node OOM Problems

2010-08-22 Thread Wayne
From a testing whether cassandra can take the load long term I do not see it as different. Yes bulk loading can be made faster using very different methods, but my purpose is to test cassandra with a large volume of writes (and not to bulk load as efficiently as possible). I have scaled back to 5

Re: Node OOM Problems

2010-08-22 Thread Benjamin Black
Wayne, Bulk loading this much data is a very different prospect from needing to sustain that rate of updates indefinitely. As was suggested earlier, you likely need to tune things differently, including disabling minor compactions during the bulk load, to make this work efficiently. b On Sun,

Re: Cassandra Nodes Freeze/Down for ConcurrentMarkSweep GC?

2010-08-22 Thread Jonathan Ellis
GCs never take that long unless you're swapping. On Sun, Aug 22, 2010 at 2:11 PM, Moleza Moleza wrote: > Hi, > I am setting up a cluster on a linux box. > Everything seems to be working great and I am watching the ring with: > watch -d -n 2 nodetool -h localhost ring > Suddenly, I see that one of

Re: Node OOM Problems

2010-08-22 Thread Wayne
Has anyone loaded 2+ terabytes of real data in one stretch into a cluster without bulk loading and without any problems? How long did it take? What kind of nodes were used? How many writes/sec/node can be sustained for 24+ hours? On Sun, Aug 22, 2010 at 8:22 PM, Peter Schuller wrote: > I only

Re: Cassandra Nodes Freeze/Down for ConcurrentMarkSweep GC?

2010-08-22 Thread Moleza Moleza
Hi, I am setting up a cluster on a linux box. Everything seems to be working great and I am watching the ring with: watch -d -n 2 nodetool -h localhost ring Suddenly, I see that one of the nodes just went down (at 14:07): Status changed from Up to Down. 13 minutes later (without any intervention) t

Re: Node OOM Problems

2010-08-22 Thread Peter Schuller
I only sifted recent history of this thread (for time reasons), but: > You have started a major compaction which is now competing with those > near constant minor compactions for far too little I/O (3 SATA drives > in RAID0, perhaps?).  Normally, this would result in a massive > ballooning of your

Re: Node OOM Problems

2010-08-22 Thread Benjamin Black
Is the need for 10k/sec/node just for bulk loading of data or is it how your app will operate normally? Those are very different things. On Sun, Aug 22, 2010 at 4:11 AM, Wayne wrote: > Currently each node has 4x1TB SATA disks. In MySQL we have 15tb currently > with no replication. To move this t

Re: Node OOM Problems

2010-08-22 Thread Edward Capriolo
On Sun, Aug 22, 2010 at 7:11 AM, Wayne wrote: > Currently each node has 4x1TB SATA disks. In MySQL we have 15tb currently > with no replication. To move this to Cassandra replication factor 3 we need > 45TB assuming the space usage is the same, but it is probably more. We had > assumed a 30 node c

Re: Node OOM Problems

2010-08-22 Thread Wayne
Currently each node has 4x1TB SATA disks. In MySQL we have 15TB currently with no replication. To move this to Cassandra replication factor 3 we need 45TB assuming the space usage is the same, but it is probably more. We had assumed a 30 node cluster with 4TB per node would suffice with head room f
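The sizing arithmetic in this message can be reproduced with a few lines of Python. The figures come from the message itself (15TB of data, replication factor 3, 30 nodes with 4x1TB disks each); the function names are invented for this sketch, and it ignores any extra space needed for compaction or a different on-disk footprint, which the message notes would probably push the requirement higher.

```python
def required_capacity_tb(dataset_tb, replication_factor):
    """Space needed to hold every replica, assuming unchanged space usage."""
    return dataset_tb * replication_factor


def raw_capacity_tb(node_count, disks_per_node, disk_tb):
    """Total raw disk space across the cluster."""
    return node_count * disks_per_node * disk_tb


required = required_capacity_tb(15, 3)   # 15TB replicated three times
raw = raw_capacity_tb(30, 4, 1)          # 30 nodes, 4 x 1TB disks each
utilization = required / raw

print("required: %dTB, raw: %dTB, utilization: %.1f%%"
      % (required, raw, utilization * 100))
```

With these inputs the cluster would hold 45TB on 120TB of raw disk, so a little over a third of the space would be used before any overhead is counted.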

Re: Node OOM Problems

2010-08-22 Thread Benjamin Black
I see no reason to make that assumption. Cassandra currently has no mechanism to alternate in that manner. At the update rate you require, you just need more disk I/O (bandwidth and IOPS). Alternatively, you could use a bunch more, smaller nodes with the same SATA RAID setup so they each take many

Re: Node OOM Problems

2010-08-22 Thread Wayne
Since compaction is so expensive in terms of disk resources, does it make more sense to have 2 data volumes instead of one? We have 4 data disks in RAID 0; would it make more sense to have 2 x 2 disks in RAID 0? That way the reader and writer I assume would always be a different set of spindles