What is the future of supercolumns ?

2012-01-04 Thread Aklin_81
I have seen supercolumn usage being discouraged most of the time. However, sometimes supercolumns seem to fit the scenario most appropriately, not only in terms of how the data is stored but also in terms of how it is retrieved. Some of the queries supported by SCs are uniquely capable of doing

Re: emptying my cluster

2012-01-04 Thread aaron morton
Some thoughts on the plan: * You are monkeying around with things, do not be surprised when surprising things happen. * Deliberately unbalancing the cluster may lead to Bad Things happening. * In the design discussed it is perfectly reasonable for data not to be on the archive node. * Truncat

is it bad to have lots of column families?

2012-01-04 Thread Michael Cetrulo
in a traditional database it's not a good idea to have hundreds of tables but is it also bad to have hundreds of column families in cassandra? thank you.

Re: Consistency Level

2012-01-04 Thread Kamal Bahadur
Hi Aaron, Thanks for your response! I re-ran the test case # 5. (Node 1 & 2 running, Node 3 & 4 down, Node 1 contains the data, CL ONE and RF 2). I was connected to Node 1 while I ran the test. I still did not get any data. See below logs:

Should I throttle deletes?

2012-01-04 Thread Maxim Potekhin
Now that my cluster appears to run smoothly and after a few successful repairs and compacts, I'm back in the business of deletion of portions of data based on its date of insertion. For reasons too lengthy to be explained here, I don't want to use TTL. I use a batch mutator in Pycassa to delete ~

Re: Cannot start cassandra node anymore

2012-01-04 Thread aaron morton
If you have the time, turn logging up to DEBUG and start again; it will log where it failed. Put the logs aside in case there is a bug there. To get things running again: Move the commit log segment out of the directory and try the restart. Then run a repair from the node. Have you made any re

Re: Migration from 0.7 to 1.0

2012-01-04 Thread aaron morton
Sounds good. You can take some extra steps when doing a rolling restart see http://blog.milford.io/2011/11/rolling-upgrades-for-cassandra/ Also make sure repair *does not* run until all the nodes have been upgraded. > Do i miss something (I will backup everything before the > upgrade)? I'm

Re: Replacing supercolumns with composite columns; Getting the equivalent of retrieving a list of supercolumns by name

2012-01-04 Thread Guy Incognito
i know it's a throwaway example, but i would probably structure your column the other way around in that case. ie steve.4, steve.5, steve.6, greg.4, greg.6, greg.9. and then do two slice queries, steve.4-steve.10, greg.4-greg.10. On 04/01/2012 15:41, Jeremiah Jordan wrote: You can't use a slic
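A rough sketch of why putting the name first helps, using plain Python rather than any Cassandra client. It assumes composite column names are ordered component by component, so all of one name's columns sit next to each other and a single range covers them. The `columns` list and the `slice_range` helper are made up for illustration; they are not part of any library.

```python
from bisect import bisect_left, bisect_right

# Hypothetical row: composite column names (name, number), kept sorted
# component by component, as a composite comparator would order them.
columns = sorted([
    ("steve", 4), ("steve", 5), ("steve", 6),
    ("greg", 4), ("greg", 6), ("greg", 9),
])

def slice_range(cols, start, finish):
    """Return every column name in the inclusive range [start, finish]."""
    lo = bisect_left(cols, start)
    hi = bisect_right(cols, finish)
    return cols[lo:hi]

# Two slice queries, one per name: steve.4-steve.10 and greg.4-greg.10.
steve = slice_range(columns, ("steve", 4), ("steve", 10))
greg = slice_range(columns, ("greg", 4), ("greg", 10))
```

Each range only ever touches one name's columns, which is what makes the two-slice approach work.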

Re: java.lang.AssertionError

2012-01-04 Thread aaron morton
Will be fixed in 1.0.7 https://issues.apache.org/jira/browse/CASSANDRA-3656 Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 4/01/2012, at 11:26 PM, Michael Vaknine wrote: > Hi, > > I have a 4 cluster version 1.0.3 which was upgraded from

Cannot start cassandra node anymore

2012-01-04 Thread Carlo Pires
Hi, I can't start a node of my cluster. Could someone help me to catch the problem? Using: debian with cassandra 1.0.6. root@carlo-laptop:/etc/cassandra# cat /var/log/cassandra/output.log INFO 13:46:00,596 JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_26 INFO 13:46:00,600 Heap si

Re: Replacing supercolumns with composite columns; Getting the equivalent of retrieving a list of supercolumns by name

2012-01-04 Thread Jeremiah Jordan
Unless you are running into an issue with super columns that makes composite columns a better fit for what you are trying to do, I would just stick with super columns. "if it ain't broke don't fix it". -Jeremiah On 01/03/2012 11:21 PM, Asil Klin wrote: @Stephan: in that case, you can easil

Re: Replacing supercolumns with composite columns; Getting the equivalent of retrieving a list of supercolumns by name

2012-01-04 Thread Jeremiah Jordan
You can't use a slice range. But you can query for the specific columns. "4.steve", "5.steve", "6.steve" ... "4.greg", "5.greg", "6.greg". Just have to ask for all of the possible columns you want. On 01/03/2012 04:31 PM, Stephen Pope wrote: The bonus you're talking about here, how do I a
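A small illustration of the point above, in plain Python with no Cassandra client involved. It assumes the number-first composite ordering from the example, where a range over the numbers also sweeps up every other name in between, so the wanted column names have to be listed explicitly. The helper `wanted_columns` and the sample row are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Hypothetical row with number-first composite names (number, name).
row = sorted([
    (4, "steve"), (5, "steve"), (6, "steve"),
    (4, "greg"), (5, "greg"), (6, "greg"),
])

# A range from 4.steve to 6.steve is contiguous in sort order,
# so it also returns greg's columns that fall inside it.
lo = bisect_left(row, (4, "steve"))
hi = bisect_right(row, (6, "steve"))
range_result = row[lo:hi]

def wanted_columns(numbers, names):
    """Build the explicit list of column names to request."""
    return ["%d.%s" % (n, name) for name in names for n in numbers]

by_name = wanted_columns(range(4, 7), ["steve", "greg"])
```

The explicit list grows with the number of possible values, which is the cost of keeping the number as the first component.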

Re: Cassandra OOM

2012-01-04 Thread Vitalii Tymchyshyn
04.01.12 14:25, Radim Kolar wrote: > So, what are cassandra memory requirement? Is it 1% or 2% of disk data? It depends on the number of rows you have. If you have a lot of rows, then the primary memory eaters are index sampling data and bloom filters. I use index sampling 512 and bloom filters set

Migration from 0.7 to 1.0

2012-01-04 Thread cbert...@libero.it
Hi, I'm going to migrate from Cassandra 0.7 to 1.0 in production and I'd like to know the best way to do it ... "Upgrading from version 0.7.1+ or 0.8.2+ can be done with a rolling restart, one node at a time. (0.8.0 or 0.8.1 are NOT network-compatible with 1.0: upgrade to the most recent 0.8 r

RE: Replacing supercolumns with composite columns; Getting the equivalent of retrieving a list of supercolumns by name

2012-01-04 Thread Stephen Pope
I don't think I can tell my exact column names in many cases. For example most of our queries are for specific keys, and an unknown range of numbers (like key1, key where number > 1). How can I set up my slice in this case to retrieve only the columns that match both criteria? Cheers, Steve

Re: emptying my cluster

2012-01-04 Thread Alexandru Sicoe
Hi, On Tue, Jan 3, 2012 at 8:19 PM, aaron morton wrote: > Running a time based rolling window of data can be done using the TTL. > Backing up the nodes for disaster recovery can be done using snapshots. > Restoring any point in time will be tricky because you may restore columns > where the TTL h

Re: Cassandra OOM

2012-01-04 Thread Radim Kolar
> Looking at heap dumps, a lot of memory is taken by memtables, much more than 1/3 of heap. At the same time, logs say that it has nothing to flush since there are not dirty memtables. I have seen this too. > So, what are cassandra memory requirement? Is it 1% or 2% of disk data? It depends on numbe

Re: Strange OOM when doing "list" in CLI

2012-01-04 Thread Maxim Potekhin
Ed, thanks for a dose of common sense, I should have thunk about it. In fact, I only have 2 columns in that one particular CF, but one of these can get really fat (for a good reason). So the CLI just plain runs out of memory when pulling the default 100 rows (with a little help from various o

Re: Cassandra OOM

2012-01-04 Thread Vitalii Tymchyshyn
Hello. BTW: It would be great for cassandra to shut down on errors like OOM, because now I am not sure whether the problem described in the previous email is the root cause or whether one of the OOM errors found in the log made some "writer" stop. I am now looking at different OOMs in my cluster. Currently each node h

java.lang.AssertionError

2012-01-04 Thread Michael Vaknine
Hi, I have a 4-node cluster on version 1.0.3 which was upgraded from 0.7.6 in 2 stages: upgrade to 1.0.0 and run scrub on all nodes, then upgrade to 1.0.3. I keep getting these errors from time to time on all 4 nodes. Is there any maintenance I can do to fix the problem? I tried to run repair on the clu

Re: TimedOutException()

2012-01-04 Thread aaron morton
Look at the nodetool tpstats when you get the TimedOutException, to work out which nodes are backing up with pending messages. Then try to identify why. Check the server logs for GC, and the CPU and IO usage. Somehow the cluster is getting overwhelmed and cannot respond. Either the clients hav

Re: rename column family

2012-01-04 Thread aaron morton
You do not need a restart if you use nodetool refresh. Otherwise a rolling restart will do the trick. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 4/01/2012, at 12:31 PM, Jim Newsham wrote: > > Thanks that's very helpful. I'm assuming

Re: Consistency Level

2012-01-04 Thread aaron morton
I've not spent much time with the secondary indexes, so a couple of questions. What is the output of nodetool ring? Which node were you connected to when you did the get? If you enable DEBUG logging what do the log messages from StorageProxy say that contain the string "scan ranges are" and

Re: Dealing with "Corrupt (negative) value length encountered"

2012-01-04 Thread aaron morton
> I was able to scrub the node the repair that failed was running on. Are you > saying the error could be displayed on that node but the bad data coming from > another node ? Yes. The error occurred while the node was receiving a data stream from another; you will need to clean the source of the data.