Re: Backups, Snapshots, SSTable Data Files, Compaction

2011-06-07 Thread AJ
On 6/6/2011 11:25 PM, Benjamin Coverston wrote: Currently, my data dir has about 16 sets. I thought that compaction (with nodetool) would clean-up these files, but it doesn't. Neither does cleanup or repair. You're not even talking about snapshots using nodetool snapshot yet. Also nodetool

Re: Backups, Snapshots, SSTable Data Files, Compaction

2011-06-07 Thread Maki Watanabe
You can find useful information in: http://www.datastax.com/docs/0.8/operations/scheduled_tasks sstables are immutable. Once an sstable is written to disk, it won't be updated. When you take a snapshot, the tool makes hard links to the sstable files. After a certain time, you will have a number of memtable flushes,
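
The hard-link behaviour described above can be illustrated with a small, self-contained sketch. This is not Cassandra's snapshot code; the function names, the directory layout, and the file name `Users-g-1-Data.db` are hypothetical. It only shows why a hard-linked snapshot costs no extra space at first and why the snapshot copy survives when the original file is later deleted.

```python
import os
import tempfile


def snapshot_by_hard_link(data_file: str, snapshot_dir: str) -> str:
    """Create a hard link to data_file inside snapshot_dir and return its path."""
    os.makedirs(snapshot_dir, exist_ok=True)
    link_path = os.path.join(snapshot_dir, os.path.basename(data_file))
    os.link(data_file, link_path)  # second name for the same on-disk data
    return link_path


def demo() -> tuple:
    """Return (link count after snapshot, link count after delete, surviving bytes)."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical file name standing in for an immutable sstable.
        data_file = os.path.join(root, "Users-g-1-Data.db")
        with open(data_file, "wb") as handle:
            handle.write(b"immutable sstable bytes")

        link = snapshot_by_hard_link(data_file, os.path.join(root, "snapshots", "demo"))
        links_after_snapshot = os.stat(link).st_nlink

        # Simulate the original being removed (for example after compaction).
        os.remove(data_file)
        links_after_delete = os.stat(link).st_nlink
        with open(link, "rb") as handle:
            surviving = handle.read()
        return links_after_snapshot, links_after_delete, surviving
```

Because the file is never modified after it is written, sharing it between the live data directory and the snapshot is safe; disk space is only reclaimed once every link to it has been removed.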

Re: Troubleshooting IO performance ?

2011-06-07 Thread Terje Marthinussen
If you run iostat with output every few seconds, is the I/O stable or do you see very uneven I/O? Regards, Terje On Tue, Jun 7, 2011 at 11:12 AM, aaron morton wrote: > There is a big IO queue and reads are spending a lot of time in the queue. > > Some more questions: > - what version are you

Re: Multiple large disks in server - setup considerations

2011-06-07 Thread Erik Forsberg
On Tue, 31 May 2011 13:23:36 -0500 Jonathan Ellis wrote: > Have you read http://wiki.apache.org/cassandra/CassandraHardware ? I had, but it was a while ago so I guess I kind of deserved an RTFM! :-) After re-reading it, I still want to know: * If we disregard the performance hit caused by havi

Re: Replication-aware compaction

2011-06-07 Thread David Boxenhorn
Thanks! I'm actually on vacation now, so I hope to look into this next week. On Mon, Jun 6, 2011 at 10:25 PM, aaron morton wrote: > You should consider upgrading to 0.7.6 to get a fix to Gossip. Earlier 0.7 > releases were prone to marking nodes up and down when they should not have > been. See

Re: Installing Thrift with Solandra

2011-06-07 Thread Jake Luciani
Good point it doesn't include the Cassandra.thrift file. I suppose I should include it with the code but you can also grab it from Cassandra. Jake On Tuesday, June 7, 2011, Jean-Nicolas Boulay Desjardins wrote: > Thanks again :) > Ok... But in the tutorial it says that I need to build a Thrift

Re: Installing Thrift with Solandra

2011-06-07 Thread Jake Luciani
This seems to be a common cause of confusion. Let me try again. Solandra doesn't integrate your Cassandra data into solr. It simply provides a scalable backend for solr by building on Cassandra. The inverted index lives in its own Cassandra keyspace. What you have in the end is two functionally

upgrading to cassandra 0.8

2011-06-07 Thread Sasha Dolgy
Hi, Good news on the 0.8 release. So ... if I upgrade one node out of four, and let it run for a bit ... I should have no issues, correct? If I make schema changes, specifically, adding a new column family for counters, how will this behave with the other three nodes that aren't upgraded? Or ..

set up a cassandra cluster with ByteOrderedPartitioner using whirr?

2011-06-07 Thread Khanh Nguyen
Hi, I'm struggling to set up a cassandra cluster with ByteOrderedPartitioner using whirr. (I'm not sure if the issue is caused by Cassandra or Whirr so I cc-ed both lists). Here are the steps I took - use whirr to launch a cassandra (version 0.8) cluster - ssh into each instance and do 1) kill c

Re: set up a cassandra cluster with ByteOrderedPartitioner using whirr?

2011-06-07 Thread Edward Capriolo
On Tue, Jun 7, 2011 at 10:57 AM, Khanh Nguyen wrote: > Hi, > > I'm struggling to set up a cassandra cluster with > ByteOrderedPartitioner using whirr. (I'm not sure if the issue is > caused by Cassandra or Whirr so I cc-ed both lists). > > Here are the steps I took > > - use whirr to launch a cassa

Re: upgrading to cassandra 0.8

2011-06-07 Thread Edward Capriolo
On Tue, Jun 7, 2011 at 10:54 AM, Sasha Dolgy wrote: > Hi, > > Good news on the 0.8 release. So ... if I upgrade one node out of four, > and let it run for a bit ... I should have no issues, correct? If I make > schema changes, specifically, adding a new column family for counters, how > will th

Re: Backups, Snapshots, SSTable Data Files, Compaction

2011-06-07 Thread AJ
On 6/7/2011 2:29 AM, Maki Watanabe wrote: You can find useful information in: http://www.datastax.com/docs/0.8/operations/scheduled_tasks sstables are immutable. Once an sstable is written to disk, it won't be updated. When you take a snapshot, the tool makes hard links to the sstable files. After a certain time,

Re: sync commitlog in batch mode lose data

2011-06-07 Thread Peter Schuller
> But I have another question, while I disable the disk cache but leave the > cache write mode write-back, how sync works ? Still write the data into the > cache ? This issue may not belong to the scope of discussion here. I'm not sure, it depends on at what level of abstraction you changed to

RE: Backups, Snapshots, SSTable Data Files, Compaction

2011-06-07 Thread Jeremiah Jordan
Don't manually delete things. Let Cassandra do it. Force a garbage collection or restart your instance and Cassandra will delete the unused files. -Original Message- From: AJ [mailto:a...@dude.podzone.net] Sent: Tuesday, June 07, 2011 10:15 AM To: user@cassandra.apache.org Subject: Re:

Re: upgrading to cassandra 0.8

2011-06-07 Thread Jonathan Ellis
Even schema changes *should* work, although to be safe, the less "unusual" stuff you do with a mixed-version cluster, the better. However, any kind of streaming (bootstrap, node movement, decommission, nodetool repair) will not work. On Tue, Jun 7, 2011 at 10:07 AM, Edward Capriolo wrote: > > >

Re: upgrading to cassandra 0.8

2011-06-07 Thread Sasha Dolgy
Thanks everyone ... upgrade to 0.8 on all nodes is a first priority then ... On Tue, Jun 7, 2011 at 5:28 PM, Jonathan Ellis wrote: > Even schema changes *should* work, although to be safe, the less > "unusual" stuff you do with a mixed-version cluster, the better. > > However, any kind of stream

Re: Backups, Snapshots, SSTable Data Files, Compaction

2011-06-07 Thread Benjamin Coverston
Hi AJ, Unfortunately, for storage capacity planning it's a bit of a guessing game. Until you run your load against it and profile the usage you just are not going to know for sure. I have seen cases where planning to have 50% excess capacity/node was plenty, and I have seen other extreme cases

Re: Installing Thrift with Solandra

2011-06-07 Thread Jean-Nicolas Boulay Desjardins
Ok. So I have to install Thrift and Cassandra, then Solandra. I am asking because I followed the instructions in your Git page but I get this error: # cd solandra-app; ./start-solandra.sh -bash: ./start-solandra.sh: No such file or directory Thanks again :) On Tue, Jun 7, 2011 at 7:55 AM, Jake

Re: Troubleshooting IO performance ?

2011-06-07 Thread Philippe
Very even. Will answer Aaron's email... will upgrade to 0.8 too! On 7 June 2011 13:09, "Terje Marthinussen" wrote: > If you run iostat with output every few seconds, is the I/O stable or do > you see very uneven I/O? > > Regards, > Terje > > On Tue, Jun 7, 2011 at 11:12 AM, aaron morton wrot

Re: Multiple large disks in server - setup considerations

2011-06-07 Thread Ryan King
On Tue, Jun 7, 2011 at 4:34 AM, Erik Forsberg wrote: > On Tue, 31 May 2011 13:23:36 -0500 > Jonathan Ellis wrote: > >> Have you read http://wiki.apache.org/cassandra/CassandraHardware ? > > I had, but it was a while ago so I guess I kind of deserved an RTFM! :-) > > After re-reading it, I still w

Re: [RELEASE] 0.8.0

2011-06-07 Thread Ryan King
On Mon, Jun 6, 2011 at 7:00 PM, Terje Marthinussen wrote: > Yes, I am aware of it but it was not an alternative for this project which > will face production soon. > The patch I have is fairly non-intrusive (especially vs. 674) so I think it > can be interesting depending on how quickly 674 will b

Re: multiple clusters communicating

2011-06-07 Thread Ryan King
On Mon, Jun 6, 2011 at 5:01 PM, Jeffrey Wang wrote: > Hey all, > > > > We’re seeing a strange issue in which two completely separate clusters > (0.7.3) on the same subnet (X.X.X.146 through X.X.X.150) with 3 machines > (146-148) and 2 machines (149-150). Both of them are seeded with the > respecti

getIndexedSlices issue using Pelops

2011-06-07 Thread Tan Huynh
Hi, I am using Pelops client to query Cassandra secondary index and I get the exception listed below. The code is pretty simple too. I can use Cassandra-cli to query the same secondary index, so there must be something wrong in my code. If you've seen this issue, would you please point me

Re: getIndexedSlices issue using Pelops

2011-06-07 Thread Jonathan Ellis
internal error means look at the cassandra server logs for the stacktrace. On Tue, Jun 7, 2011 at 12:20 PM, Tan Huynh wrote: > Hi, > > > > I am using Pelops client to query Cassandra secondary index and I get the > exception listed below. > > The code is pretty simple too. I can use Cassandra-c

Re: getIndexedSlices issue using Pelops

2011-06-07 Thread Jonathan Ellis
... also, are you on 0.7.6? "works on cli but internal error w/ pelops" sounds like pelops is giving an invalid request, 0.7.6 is better at catching those and giving a real error message. On Tue, Jun 7, 2011 at 12:31 PM, Jonathan Ellis wrote: > internal error means look at the cassandra server l

CLI set command returns null

2011-06-07 Thread AJ
Ver 0.8.0. Please help. I don't know what I'm doing wrong. One simple keyspace with one simple CF with one simple column. I've tried two simple tutorials. Is there a common newbie mistake I could be making??? Thanks in advance! [default@Keyspace1] describe keyspace; Keyspace: Keyspace1:

Re: CLI set command returns null

2011-06-07 Thread Dan Kuebrich
Null response may mean an error on the server side. Have you checked your cassandra server's logs? On Tue, Jun 7, 2011 at 2:22 PM, AJ wrote: > Ver 0.8.0. > > Please help. I don't know what I'm doing wrong. One simple keyspace with > one simple CF with one simple column. I've tried two simple

Re: CLI set command returns null

2011-06-07 Thread Jonathan Ellis
try running cli with --debug On Tue, Jun 7, 2011 at 1:22 PM, AJ wrote: > Ver 0.8.0. > > Please help.  I don't know what I'm doing wrong.  One simple keyspace with > one simple CF with one simple column.  I've tried two simple tutorials.  Is > there a common newbie mistake I could be making??? > >

Re: Backups, Snapshots, SSTable Data Files, Compaction

2011-06-07 Thread AJ
Thanks to everyone who responded thus far. On 6/7/2011 10:16 AM, Benjamin Coverston wrote: Not to say that there aren't workloads where having many TB/Node doesn't work, but if you're planning to read from the data you're writing you do want to ensure that your working set is stored in memory

Re: CLI set command returns null

2011-06-07 Thread AJ
The log only shows INFO level messages about flushes, etc.. The debug mode of the CLI shows an exception after the set: [al@mars ~]$ cassandra-cli -h 192.168.1.101 --debug Connected to: "Test Cluster" on 192.168.1.101/9160 Welcome to the Cassandra CLI. Type 'help;' or '?' for help. Type 'quit;'

About Brisk, Hadoop powered by Cassandra

2011-06-07 Thread Marcos Ortiz
Regards to all. I was reading about this DataStax product called Brisk, and I think that's an amazing piece of technology. A few questions: - Is Brisk a proprietary technology? - Can anyone participate in its development? (I'm very interested in this Hadoop-Cassandra Integration) - Which is t

Re: Troubleshooting IO performance ?

2011-06-07 Thread Philippe
Aaron, - what version are you on ? 0.7.6-2 - what is the concurrent_reads config setting ? > concurrent_reads: 64 concurrent_writes: 64 Given that I've got 4 cores and SSD drives, I doubled the concurrent writes recommended. Given that I've RAID-0ed the SSD drive, I figured I could at least do

Re: Backups, Snapshots, SSTable Data Files, Compaction

2011-06-07 Thread aaron morton
I'd also say consider what happens during maintenance and failure scenarios. Moving 10's TB around takes a lot longer than 100's GB. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 8 Jun 2011, at 06:40, AJ wrote: > Thanks to eve

Re: About Brisk, Hadoop powered by Cassandra

2011-06-07 Thread aaron morton
it's here https://github.com/riptano/brisk under the apache v2 licence try the #datastax-brisk irc room on freenode cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 8 Jun 2011, at 07:03, Marcos Ortiz wrote: > Regards to all. > I

Re: Troubleshooting IO performance ?

2011-06-07 Thread aaron morton
> So basically, I'm flooding the system right ? For example 99303 means there > are 99303 key reads pending, possibly from just a couple MultiSlice gets ? Yes and then some. Each row you ask for in a multiget turns into a single row request in the server. You are overloading the server. > - exa
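
Since each key in a multiget becomes its own row read on the server, one way to avoid flooding the read queue is to split a large key list into smaller batches on the client. The following is a minimal sketch under stated assumptions: `fetch_batch` is a hypothetical stand-in for whatever multiget call your client library provides, and the default batch size of 100 is an arbitrary illustration, not a recommendation from the thread.

```python
from typing import Callable, Dict, Iterator, List, Sequence


def chunked(keys: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of keys, each with at most `size` entries."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(keys), size):
        yield list(keys[start:start + size])


def multiget_in_batches(
    keys: Sequence[str],
    fetch_batch: Callable[[List[str]], Dict[str, dict]],
    batch_size: int = 100,
) -> Dict[str, dict]:
    """Fetch rows batch by batch so only `batch_size` row reads are queued at once.

    `fetch_batch` is a hypothetical callable: it takes a list of keys and
    returns a mapping of key to row.
    """
    rows: Dict[str, dict] = {}
    for batch in chunked(keys, batch_size):
        rows.update(fetch_batch(batch))
    return rows
```

The batches run sequentially here, which trades some latency for a bounded number of pending reads per client.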

Re: Backups, Snapshots, SSTable Data Files, Compaction

2011-06-07 Thread Benjamin Coverston
Aaron makes a good point, the happiest customers in my opinion are the ones that choose nodes on the smaller side, and more of them. Regarding the working set, I am referring to the OS cache. On Linux, with JNA, Cassandra utilizes, to great effectiveness, memory mapped files and this is where I

Re: how to know there are some columns in a row

2011-06-07 Thread Patrick de Torcy
But I want values in my columns... Imagine a cf with authors as keys. Each author has written several books. So each row has columns with the title as column names and the text of the book as value (ie a lot of data). If a user wants to know the different books for an author, I'd like to be able to

RE: getIndexedSlices issue using Pelops

2011-06-07 Thread Tan Huynh
Thanks Jonathan for the pointer. It turns out the issue has to do w/ the count number that I specify in the index clause (Integer.MAX_VALUE). The StorageProxy.scan() method allocates a list of this size, causing Cassandra to run out of heap space. Changing the count value to a smaller value fi
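
Rather than asking for every match in one request, a client can page through index results with a small count. The sketch below is an assumption-laden illustration, not the Pelops or Thrift API: `query_page(start_key, count)` is a hypothetical callable that returns up to `count` `(key, row)` pairs in the server's key order, starting at `start_key` inclusive (an empty string meaning "from the beginning"). Because the start key is inclusive, every page after the first repeats the previous page's last row, which is skipped.

```python
from typing import Callable, Iterator, List, Tuple

Row = Tuple[str, object]


def iter_indexed_rows(
    query_page: Callable[[str, int], List[Row]],
    page_size: int = 100,
) -> Iterator[Row]:
    """Yield all matching rows while requesting at most page_size rows per call."""
    if page_size < 2:
        # With an inclusive start key, a page of 1 would never make progress.
        raise ValueError("page_size must be at least 2")

    start_key = ""
    first_page = True
    while True:
        page = query_page(start_key, page_size)
        # Skip the overlapping first row on every page after the first.
        rows = page if first_page else page[1:]
        yield from rows
        if len(page) < page_size:
            return  # a short page means there are no more matches
        start_key = page[-1][0]
        first_page = False
```

With this shape the server never has to size a result list for more than `page_size` entries per request.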

Re: how to know there are some columns in a row

2011-06-07 Thread Dan Kuebrich
There might not be a built-in way to do this, but if you make two rows for each author, eg: nabokov_fulltext [ 'lolita' : 'Lolita, light of my life ...' , ...] nabokov_bookindex [ 'lolita' : None , ... ] you could query the bookindex for each author without cassandra having to load the full texts
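
The two-row layout suggested above can be sketched with plain dictionaries standing in for a column family (row key to columns). This is only a model of the data layout, not client code; the helper names are hypothetical, and the `_fulltext` and `_bookindex` suffixes follow the example in the message.

```python
from typing import Dict, List

# A column family modelled as: row key -> {column name -> column value}.
ColumnFamily = Dict[str, Dict[str, bytes]]


def store_book(cf: ColumnFamily, author: str, title: str, text: bytes) -> None:
    """Write the full text to one row and an empty-valued column to the index row."""
    cf.setdefault(f"{author}_fulltext", {})[title] = text
    cf.setdefault(f"{author}_bookindex", {})[title] = b""  # name only, no payload


def list_titles(cf: ColumnFamily, author: str) -> List[str]:
    """Read only the small index row, never touching the full texts."""
    return sorted(cf.get(f"{author}_bookindex", {}))
```

The cost of this design is a second write per book and the need to keep both rows in step when a title is removed.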

Re: how to know there are some columns in a row

2011-06-07 Thread aaron morton
> If you have a method to retrieve the number of columns of a row (without > their values), I can't see why you couldn't retrieve the column names > (without their values). It's perhaps harder than I think... But it would be > rather useful ! Internally this just gets the full columns and co

Re: Multiple large disks in server - setup considerations

2011-06-07 Thread Edward Capriolo
On Tue, Jun 7, 2011 at 12:43 PM, Ryan King wrote: > On Tue, Jun 7, 2011 at 4:34 AM, Erik Forsberg wrote: > > On Tue, 31 May 2011 13:23:36 -0500 > > Jonathan Ellis wrote: > > > >> Have you read http://wiki.apache.org/cassandra/CassandraHardware ? > > > > I had, but it was a while ago so I guess

Re: CLI set command returns null, ver 0.8.0

2011-06-07 Thread AJ
Can anyone help? The CLI seems to be having issues. The count command isn't working either: [default@Keyspace1] count User[long(1)]; Expected 8 or 0 byte long (13) java.lang.RuntimeException: Expected 8 or 0 byte long (13) at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliC

Re: Multiple large disks in server - setup considerations

2011-06-07 Thread AJ
On 6/7/2011 9:32 PM, Edward Capriolo wrote: I do not like large disk set-ups. I think they end up not being economical. Most low latency use cases want high RAM to DISK ratio. Two machines with 32GB RAM is usually less expensive than one machine with 64GB RAM. For a machine with 1TB drive

Re: Installing Thrift with Solandra

2011-06-07 Thread Jean-Nicolas Boulay Desjardins
I found start-solandra.sh in the resources folder. But when I execute it, I still get an error. http://dl.dropbox.com/u/20599297/Screen%20shot%202011-06-08%20at%201.27.26%20AM.png Thanks again. On Tue, Jun 7, 2011 a

Re: Installing Thrift with Solandra

2011-06-07 Thread Krish Pan
you are trying to run solandra from resources directory, follow these steps 1) don't use root - use a regular user 2) cd /tmp/ 3) git clone git://github.com/tjake/Solandra.git 4) cd Solandra 5) ant once you get BUILD SUCCESSFUL 6) cd solandra-app 7) ./start-solandra.sh On Tue, Jun 7, 2011 at