Re: Composite Column Grouping

2013-09-10 Thread Ravikumar Govindarajan
Thanks Michael, but I cannot sort the rows in memory, as the number of columns will be quite huge. From the Python script above: select_stmt = "select * from time_series where userid = 'XYZ'" This would return me many hundreds of thousands of columns. I need to go in time-series order using

read consistency and clock drift and ntp

2013-09-10 Thread Jimmy Lin
Hi, I have a few questions about how Cassandra uses a record's timestamp to determine which one to return from its replicated nodes ... - A record's timestamp is determined by the Cassandra server node's system timestamp when the request arrives at the server and NOT by the client timestamp who ma
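To make the question concrete: when replicas hold different versions of a column, Cassandra returns the version with the highest write timestamp (last write wins). The following is a toy model of that rule, not Cassandra code; `Version`, `reconcile`, and the sample timestamps are hypothetical names and values chosen for illustration, and tie-breaking between equal timestamps is ignored.

```python
from typing import Iterable, NamedTuple


class Version(NamedTuple):
    """One replica's copy of a column (hypothetical model, not a driver type)."""
    value: str
    timestamp_us: int  # write timestamp in microseconds


def reconcile(versions: Iterable[Version]) -> Version:
    """Pick the version with the highest write timestamp (last write wins)."""
    return max(versions, key=lambda v: v.timestamp_us)


# Two replicas are stale, one has the newer write: the newer one is returned.
replica_responses = [Version("old", 1000), Version("new", 2000), Version("old", 1000)]
winner = reconcile(replica_responses)

# Clock drift: the write that happened second was stamped by a clock running
# behind, so it carries the smaller timestamp and loses the comparison.
first_write = Version("first", 5000)    # stamped by a clock running ahead
second_write = Version("second", 4000)  # stamped later, by a clock running behind
drifted = reconcile([first_write, second_write])
```

In this model the outcome depends only on the stamped timestamps, which is why synchronizing clocks (for example with NTP) matters for whichever machines assign them.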

Re: Long running nodetool move operation

2013-09-10 Thread Ike Walker
Below is the output of "nodetool netstats". I've never run that before, but from what I can read it shows no incoming streams, and a bunch of outgoing streams to two other nodes, all at 0%. I'll try the restart. Thanks. nodetool netstats Mode: MOVING Streaming to: /10.xxx.xx.xx ... Streaming

Re: Composite Column Grouping

2013-09-10 Thread Laing, Michael
If you have set up the table as described in my previous message, you could run this python snippet to return the desired result: #!/usr/bin/env python # -*- coding: utf-8 -*- import logging logging.basicConfig() from operator import itemgetter import cassandra from cassandra.cluster import Clus

Re: heavy insert load overloads CPUs, with MutationStage pending

2013-09-10 Thread sankalp kohli
What have you set these to? # commitlog_sync may be either "periodic" or "batch." # When in batch mode, Cassandra won't ack writes until the commit log # has been fsynced to disk. It will wait up to # commitlog_sync_batch_window_in_ms milliseconds for other writes, before # performing the sync. #

Re: cluster rename ?

2013-09-10 Thread Robert Coli
On Tue, Sep 10, 2013 at 11:03 AM, Langston, Jim wrote: > http://comments.gmane.org/gmane.comp.db.cassandra.user/29753 > For step 3 in the instructions, I moved LocationInfo located in the system > keyspace to another > directory and when I try to restart the node, the directory is re-created, >

Cassandra input paging for Hadoop

2013-09-10 Thread Renat Gilfanov
Hi, We have Hadoop jobs that read data from our Cassandra column families and write some data back to another column families. The input column families are pretty simple CQL3 tables without wide rows. In Hadoop jobs we set up corresponding WHERE clause in ConfigHelper.setInputWhereClauses(...)

Re: Composite Column Grouping

2013-09-10 Thread Laing, Michael
You could try this. C* doesn't do it all for you, but it will efficiently get you the right data. -ml -- put this in and run using 'cqlsh -f DROP KEYSPACE latest; CREATE KEYSPACE latest WITH replication = { 'class': 'SimpleStrategy', 'replication_factor' : 1 }; USE latest; CREATE TA

Re: FileNotFoundException while inserting (1.2.8)

2013-09-10 Thread sankalp kohli
Have you dropped and recreated a keyspace with the same name recently? On Tue, Sep 10, 2013 at 8:40 AM, Keith Freeman <8fo...@gmail.com> wrote: > While running a heavy insert load, one of my nodes started throwing this > exception when trying a compaction: > > INFO [CompactionExecutor:23] 2013-

Re: Throughput and RAM

2013-09-10 Thread Jan Algermissen
On 10.09.2013, at 19:37, Robert Coli wrote: > "Cassandra does not prevent a given node from writing to RAM faster than it > can flush to disk"? Yes, that is what I meant. What remains unclear to me is what the operational strategy is towards handling an increase in writes or peaks. Seems to

Re: making sure 1 copy per availability zone(rack) using EC2Snitch

2013-09-10 Thread Robert Coli
On Mon, Sep 9, 2013 at 11:21 AM, rash aroskar wrote: > Are you suggesting deploying 1.2.9 only if using Cassandra "DC" outside of > EC2 or if I wish to use rack replication at all? > 1) use 1.2.9 no matter what, instead of 1.2.5 2) if only *ever* will have clusters in EC2, EC2Snitch is fine, but

heavy insert load overloads CPUs, with MutationStage pending

2013-09-10 Thread Keith Freeman
On my 3-node cluster (v1.2.8) with 4-cores each and SSDs for commitlog and data, I get high CPU loads during a heavy-ish wide-row insert load into a single CF (5000 1k inserts/sec), e.g. uptime load avg for last minute 18/11/10. Checking tpstats, I see MutationStage pending on all the nodes, e

Re: heavy insert load overloads CPUs, with MutationStage pending

2013-09-10 Thread Nate McCall
With SSDs, you can turn up memtable_flush_writers - try 3 initially (1 by default) and see what happens. However, given that there are no entries in 'All time blocked' for such, they may be something else. How are you inserting the data? On Tue, Sep 10, 2013 at 12:40 PM, Keith Freeman <8fo...@gm

Re: heavy insert load overloads CPUs, with MutationStage pending

2013-09-10 Thread Keith Freeman
On 09/10/2013 11:17 AM, Robert Coli wrote: On Tue, Sep 10, 2013 at 7:55 AM, Keith Freeman <8fo...@gmail.com > wrote: On my 3-node cluster (v1.2.8) with 4-cores each and SSDs for commitlog and data On SSD, you don't need to separate commitlog and data. You onl

Re: Throughput and RAM

2013-09-10 Thread Robert Coli
On Tue, Sep 10, 2013 at 2:30 AM, Jan Algermissen wrote: > So in a sense, C* is designed to maximize IO write efficiency by > pre-organizing write queries in memory. The more memory, the better the > organization works (caveat GC). > http://en.wikipedia.org/wiki/Log-structured_merge-tree " The LS

Leveled Compaction resetting tool in 2.0

2013-09-10 Thread Nate McCall
LCS fragmentation comes up a lot here and this issue caught a lot of us on IRC by surprise so I'm going to pass it on here: https://issues.apache.org/jira/browse/CASSANDRA-5271 See this thread for additional context: http://www.mail-archive.com/user@cassandra.apache.org/msg31416.html

cluster rename ?

2013-09-10 Thread Langston, Jim
Hi all, Following these instructions: http://comments.gmane.org/gmane.comp.db.cassandra.user/29753 I am trying to change the name of the cluster, but I'm getting an error: ERROR [main] 2013-09-10 17:52:43,250 CassandraDaemon.java (line 247) Fatal exception during initialization org.apache.cas

FileNotFoundException while inserting (1.2.8)

2013-09-10 Thread Keith Freeman
While running a heavy insert load, one of my nodes started throwing this exception when trying a compaction: INFO [CompactionExecutor:23] 2013-09-09 16:08:07,528 CompactionTask.java (line 105) Compacting [SSTableReader(path='/var/lib/cassandra/data/smdb/tracedata/smdb-tracedata-ic-6-Data.db')

Re: heavy insert load overloads CPUs, with MutationStage pending

2013-09-10 Thread Robert Coli
On Tue, Sep 10, 2013 at 7:55 AM, Keith Freeman <8fo...@gmail.com> wrote: > On my 3-node cluster (v1.2.8) with 4-cores each and SSDs for commitlog and > data On SSD, you don't need to separate commitlog and data. You only win from this separation if you have a head to not-move between appends to

Re: heavy insert load overloads CPUs, with MutationStage pending

2013-09-10 Thread Robert Coli
On Tue, Sep 10, 2013 at 10:17 AM, Robert Coli wrote: > On Tue, Sep 10, 2013 at 7:55 AM, Keith Freeman <8fo...@gmail.com> wrote: > >> On my 3-node cluster (v1.2.8) with 4-cores each and SSDs for commitlog >> and data > > BTW, is RF=3? If so, you effectively have a 1 node cluster while writing. =R
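The "1 node cluster while writing" remark can be checked with simple arithmetic: every write is stored on RF replicas, so with RF equal to the node count each node receives every write. A minimal sketch, assuming evenly balanced token ownership; `writes_per_node` is a hypothetical helper, and RF=3 is the case Robert asks about, not a confirmed setting.

```python
def writes_per_node(client_writes_per_sec: float,
                    replication_factor: int,
                    node_count: int) -> float:
    """Average replica writes each node handles per second.

    Assumes token ownership is evenly balanced across nodes.
    """
    if not 1 <= replication_factor <= node_count:
        raise ValueError("replication factor must be between 1 and node count")
    return client_writes_per_sec * replication_factor / node_count


# Figures from the thread: 5000 inserts/sec on a 3-node cluster.
with_rf3 = writes_per_node(5000, replication_factor=3, node_count=3)
with_rf1 = writes_per_node(5000, replication_factor=1, node_count=3)
```

With RF=3 each node absorbs the full 5000 writes/sec, the same as a single-node cluster would; with RF=1 the per-node load would be a third of that.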

Re: cassandra error on restart

2013-09-10 Thread Langston, Jim
Thanks Mina, that was it exactly … Jim From: Mina Naguib <mina.nag...@adgear.com> Reply-To: <user@cassandra.apache.org> Date: Tue, 10 Sep 2013 10:16:17 -0400 To: <user@cassandra.apache.org> Subject: Re: cassandra error on restart There was mention of a similar crash on the

Re: cassandra error on restart

2013-09-10 Thread Mina Naguib
There was mention of a similar crash on the mailing list. Does this apply to your case? http://mail-archives.apache.org/mod_mbox/cassandra-user/201306.mbox/%3ccdecfcfa.11e95%25agundabatt...@threatmetrix.com%3E -- Mina Naguib AdGear Technologies Inc. http://adgear.com/ On 2013-09-10, at 10:0

cassandra error on restart

2013-09-10 Thread Langston, Jim
Hi all, I restarted my cassandra ring this morning, but it is refusing to start. Everything was fine, but now I get this error in the log: …. INFO 14:05:14,420 Compacting [SSTableReader(path='/raid0/cassandra/data/system/local/system-local-ic-20-Data.db'), SSTableReader(path='/raid0/cassandra

Re: Streaming never completes during nodetool rebuild

2013-09-10 Thread Paulo Motta
Thanks for the reply, Robert! Actually increasing the property "streaming_socket_timeout_in_ms" fixed the problem. :) It seems 60 seconds is too low a value for this property for inter-region streaming of very large files. I increased it to 600 seconds, but a lower value should be enough. 2013/

Composite Column Grouping

2013-09-10 Thread Ravikumar Govindarajan
I have been faced with a problem of grouping composites on the second part. Let's say my CF contains this TimeSeriesCF key:UserID composite-col-name:TimeUUID:PKID Some sample data UserID = XYZ
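One way to picture grouping on the second composite part is a single streaming pass that keeps only the newest entry per PKID, so memory grows with the number of distinct PKIDs rather than the number of columns. This is a sketch under assumptions: the truncated question does not state the exact goal, so "latest entry per PKID" is a guess suggested by the keyspace name in the reply; plain integers stand in for TimeUUIDs; and `latest_per_pkid` and the sample data are hypothetical.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def latest_per_pkid(columns: Iterable[Tuple[int, Hashable]]) -> List[Tuple[Hashable, int]]:
    """Return (pkid, newest_time) pairs ordered by time.

    `columns` yields (time_key, pkid) pairs in any order; integers stand in
    for TimeUUIDs here (an assumption made for illustration).
    """
    latest: Dict[Hashable, int] = {}
    for time_key, pkid in columns:
        # Keep only the newest time seen so far for each PKID.
        if pkid not in latest or time_key > latest[pkid]:
            latest[pkid] = time_key
    return sorted(latest.items(), key=lambda item: item[1])


# Hypothetical columns for one user: (time, PKID)
sample_columns = [(1, "A"), (2, "B"), (3, "A"), (4, "C"), (5, "B")]
grouped = latest_per_pkid(sample_columns)
```

Because the function accepts any iterable, it can consume a paged result set lazily instead of loading every column first; only the final sort touches all distinct PKIDs at once.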

cassandra hi bandwith

2013-09-10 Thread Nikolay Mihaylov
Hi, we have Cassandra 1.2.6, single node. We have a website there, running on a different server. Recently we noticed that we have 40 MBit traffic from the Cassandra server to the web server. We use phpcassa. On OpsCenter we have a "KeyCache Hits" value around 2000. I found the most used CFs from n

Throughput and RAM

2013-09-10 Thread Jan Algermissen
Based on my tuning work with C* over the last days, I guess I reached the following insights. Maybe someone can confirm whether they make sense: The more heap I give to Cassandra (up to the GC tipping point of ~8GB) the more writes it can accumulate in memtables before doing IO. The more write

Re: One node out of three not flushing memtables

2013-09-10 Thread Jan Algermissen
On 10.09.2013, at 02:34, "Laing, Michael" wrote: > I have seen something similar. > > Of course correlation is not causation... Thanks for sharing - interesting. However, I still find it confusing that C* does not refuse service before it dies. Maybe that is a by-product of the SEDA architect