Re: Network, Compaction, Garbage collection and Cache monitoring in cassandra

2012-03-21 Thread Thomas van Neerijnen
Collectd with GenericJMX pushing data into Graphite is what we use. You can monitor the Graphite graphs directly instead of having an extra JMX interface on the Cassandra nodes for monitoring. On Wed, Mar 21, 2012 at 8:16 PM, Jeremiah Jordan < jeremiah.jor...@morningstar.com> wrote: > You can al

Re: ReplicateOnWriteStage exception causes a backlog in MutationStage that never clears

2012-03-21 Thread Thomas van Neerijnen
Hi, I'm going with yes to all three of your questions. I found a very heavily hit index which we have since reworked to remove the secondary index entirely. This fixed a large portion of the problem, but during the panic of the overloaded cluster we did the simple scaling out trick of doubling the c

tombstones problem with 1.0.8

2012-03-21 Thread Ross Black
Hi, We recently moved from 0.8.2 to 1.0.8 and the behaviour seems to have changed so that tombstones are now not being deleted. Our application continually adds and removes columns from Cassandra. We have set a short gc_grace time (3600) since our application would automatically delete zombies i

Cassandra Indexing: The Good, the Bad and the Ugly

2012-03-21 Thread Brian O'Neill
Over the past 9 months, we've learned a lot about indexing in Cassandra and we've had a few false starts. I've tried to capture what we learned in the hopes that we can save a few others from false starts. http://brianoneill.blogspot.com/2012/03/cassandra-indexing-good-bad-and-ugly.html We've al

RE: exception when attempting to truncate a table

2012-03-21 Thread Viktor Jevdokimov
This is a known issue to be fixed (I can't find the exact tickets on the tracker). The controller that receives the truncate command checks that all nodes are up and sends a truncate message to all of them (including itself), waiting for an answer for rpc_timeout_in_ms (this will be changed to a separate timeout setting). If any nod

RE: exception when attempting to truncate a table

2012-03-21 Thread Richard Lowe
I'd double-check the firewall on each node to make sure the storage and RPC ports aren't being blocked. We've found that "Up" in nodetool ring output reflects the gossip status, which means only that one of the nodes can contact the node, not necessarily the entire ring. It's possible for the e
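A minimal sketch of the port check suggested above, to be run from each node against every other node. It assumes the default storage port 7000 and Thrift RPC port 9160; the function names and the host list passed in are hypothetical, and a successful TCP connect only shows that the port is reachable, not that Cassandra is healthy.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_nodes(hosts, ports=(7000, 9160)):
    """Map every (host, port) pair to its reachability from this machine.

    The default ports are an assumption: 7000 for storage, 9160 for RPC.
    """
    return {(h, p): port_open(h, p) for h in hosts for p in ports}
```

Calling check_nodes with the addresses from the nodetool ring output on each node in turn would show whether any pair of nodes is blocked in one direction only.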

Re: Failed to delete C:\tmp\lib\cassandra\commitlog\CommitLog-1332175713378.log

2012-03-21 Thread martien huijsmans
Aaron, thanks for the reply. Yes, I am shutting down the server. I also noticed that I put the wrong version number of Cassandra in my email. I have multiple installations. The version that I used for the test was 0.7. I will upgrade and check against a more recent version. /Martien On 21 March 2012 10:04

RE: Network, Compaction, Garbage collection and Cache monitoring in cassandra

2012-03-21 Thread Jeremiah Jordan
You can also use any network/server monitoring tool which can talk to JMX. We are currently using vFabric Hyperic's JMX plugin for this. IIRC there are some cacti and nagios scripts on github for getting the data into those. -Jeremiah From: R. Verlangen [ro..

Re: Exceptions related to thrift transport

2012-03-21 Thread Ben Coverston
With the crazy message size, the version exception, and the frequency, is there a service that is connecting to the thrift port on occasion and sending garbage (port scan or something else of that nature). Can you rule that out? On Wed, Mar 21, 2012 at 11:15 AM, aaron morton wrote: > 1. or

Re: exception when attempting to truncate a table

2012-03-21 Thread Ben Coverston
run 'show keyspaces' in the cassandra-cli and paste the details for ks1 here. On Wed, Mar 21, 2012 at 12:33 PM, Cyril Scetbon wrote: > Hi, > > I'm using version 1.0.7 and when I try to truncate a CF in cqlsh it raises > an error message, but it seems incorrect ... > > cqlsh:ks1> truncate core; >

exception when attempting to truncate a table

2012-03-21 Thread Cyril Scetbon
Hi, I'm using version 1.0.7 and when I try to truncate a CF in cqlsh it raises an error message, but it seems incorrect ... cqlsh:ks1> truncate core; Unable to complete request: *one or more nodes were unavailable*. >nodetool -h localhost ring

Re: ReplicateOnWriteStage exception causes a backlog in MutationStage that never clears

2012-03-21 Thread aaron morton
The node is overloaded with hints. I'll just grab the comments from code… // avoid OOMing due to excess hints. we need to do this check even for "live" nodes, since we can // still generate hints for those if it's overloaded or simply dead but not yet known-to-be-dead

Re: On Bloom filters and Key Cache

2012-03-21 Thread aaron morton
> Regarding bloom filters, have I understood correctly that they are stored on > Heap, yes. > and that the "Bloom Filter Space Used" reported by 'nodetool cfstats' is an > approximation of the heap space used by bloom filters? Yes, it's the serialised on-disk size. This will be smaller than

Re: Order rows numerically

2012-03-21 Thread A J
Yes, that is good enough for now. Thanks. On Fri, Mar 16, 2012 at 6:49 PM, Watanabe Maki wrote: > How about to fill zeros before smaller digits? > Ex. 0001, 0002, etc > > maki > > > On 2012/03/17, at 6:29, A J wrote: > >> If I define my rowkeys to be Integer >> (key_validation_class=Inte
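A minimal sketch of the zero-padding suggestion quoted above, assuming the row keys are stored as strings and compared character by character. The helper name pad_key and the width of 4 are illustrative only.

```python
def pad_key(n, width=4):
    """Left-pad a non-negative integer with zeros to a fixed width."""
    if n < 0:
        raise ValueError("negative keys need a different encoding")
    text = str(n)
    if len(text) > width:
        raise ValueError("key does not fit in the chosen width")
    return text.zfill(width)


keys = [10, 2, 1, 100]

# Plain decimal strings sort lexically, not numerically.
plain_sorted = sorted(str(k) for k in keys)

# Fixed-width, zero-padded strings sort in numeric order.
padded_sorted = sorted(pad_key(k) for k in keys)

print(plain_sorted)
print(padded_sorted)
```

The width has to be chosen up front to cover the largest key that will ever be written, since widening it later would change the ordering of existing keys.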

Re: Cassandra Exception

2012-03-21 Thread Daning Wang
and we are on 0.8.6. On Wed, Mar 21, 2012 at 10:24 AM, Daning Wang wrote: > Hi All, > > > We got lots of Exception in the log, and later the server crashed. any > idea what is happening and how to fix it? > > ERROR [RequestResponseStage:4] 2012-03-21 04:16:30,482 > AbstractCassandraDaemon.java

Cassandra Exception

2012-03-21 Thread Daning Wang
Hi All, We got lots of Exception in the log, and later the server crashed. any idea what is happening and how to fix it? ERROR [RequestResponseStage:4] 2012-03-21 04:16:30,482 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[RequestResponseStage:4,5,main] java.io.IOError:

Re: Exceptions related to thrift transport

2012-03-21 Thread aaron morton
> 1. org.apache.thrift.TException: Message length exceeded: 134218240 thrift_max_message_length_in_mb https://github.com/apache/cassandra/blob/cassandra-1.0/conf/cassandra.yaml#L243 (134218240 is about 128MB, which is a lot of data) > 2. org.apache.thrift.protocol.TProtocolException: Missi
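The size arithmetic above can be checked with a short sketch. The only input is the byte count from the exception message; the variable names are illustrative, and what the few bytes beyond 128 MiB correspond to is not stated in the thread.

```python
MIB = 1024 * 1024  # bytes in one mebibyte

reported = 134218240  # value from "Message length exceeded"

# Split the reported size into whole MiB plus leftover bytes.
whole_mib, remainder = divmod(reported, MIB)

print(whole_mib, remainder)
```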

Re: Max # of CFs

2012-03-21 Thread Vitalii Tymchyshyn
There is a forced flusher that kicks in when your heap becomes full. Look for log lines from GCInspector. There is a bug that prevents flushing memtable when it has only full key delete mutations, see https://issues.apache.org/jira/browse/CASSANDRA-3741 For me it happened when we've started to mo

Re: Failed to delete C:\tmp\lib\cassandra\commitlog\CommitLog-1332175713378.log

2012-03-21 Thread aaron morton
I remember problems in the past with windows and deleting files. Are you shutting down the server in the test tear down ? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 21/03/2012, at 7:02 PM, martien huijsmans wrote: > Hi, > > I have

Re: cassandra-cli and "uncreachable" status confusion

2012-03-21 Thread aaron morton
> How will you recommend doing schema level health checks (consistency) for > Cassandra within the cluster? describe cluster in the cli can be used to see how many schema versions there are. Similar functionality is included in most other clients. Cheers - Aaron Morton F

StackOverflowError during 1.0.8 upgrade

2012-03-21 Thread Wenjun Che
Hello I am trying to upgrade our 1-node setup from 0.8.9 to 1.0.8 and seeing the following exception when starting up 1.0.8. We have been running 0.8.9 without any issues. DEBUG [OptionalTasks:1] 2012-03-21 11:59:14,731 IntervalNode.java (line 45) Creating IntervalNode from [Interval(DecoratedKe

Attaching a virtual DC

2012-03-21 Thread Pierre-Yves Ritschard
Hi, I need to attach a virtual DC to an existing cluster. My existing cluster uses the Ec2Snitch as endpoint_snitch in cassandra.yaml. Since both my clusters will live in the same AZ in AWS, it seems as though my best bet is to switch to a property file snitch to divide. A couple of questions: *

Re: sstable size increase at compaction

2012-03-21 Thread Erik Forsberg
On 2012-03-21 16:36, Erik Forsberg wrote: Hi! We're using the bulkloader to load data to Cassandra. During and after bulkloading, the minor compaction process seems to result in larger sstables being created. An example: This is on Cassandra 1.1, btw. \EF

sstable size increase at compaction

2012-03-21 Thread Erik Forsberg
Hi! We're using the bulkloader to load data to Cassandra. During and after bulkloading, the minor compaction process seems to result in larger sstables being created. An example: INFO [CompactionExecutor:105] 2012-03-21 15:18:46,608 CompactionTask.java (line 115) Compacting [SSTableReader(p

Re: Max # of CFs

2012-03-21 Thread A J
I have increased index_interval. Will let you know if I see a difference. My theory is that memtables are not getting flushed. If I manually flush them, the heap consumption goes down drastically. I think when memtable_total_space_in_mb is exceeded not enough memtables are getting flushed. There

On Bloom filters and Key Cache

2012-03-21 Thread Erik Forsberg
Hi! We're currently testing Cassandra with a large number of row keys per node - nodetool cfstats approximated number of keys to something like 700M per node. This seems to have caused a very large heap consumption. After reading http://wiki.apache.org/cassandra/LargeDataSetConsiderations I

ReplicateOnWriteStage exception causes a backlog in MutationStage that never clears

2012-03-21 Thread Thomas van Neerijnen
Hi all, I'm running into a weird error on Cassandra 1.0.7. As my cluster's load gets heavier many of the nodes seem to hit the same error around the same time, resulting in MutationStage backing up and never clearing down. The only way to recover the cluster is to kill all the nodes and start them u

Re: high level of MemtablePostFlusher pending events

2012-03-21 Thread Vitalii Tymchyshyn
To note: I still have the problem in a custom pre-1.1-beta build that seems to have the fix. I am going to upgrade to 1.1 beta and check whether the problem goes away, and will file a bug if the problem still exists. BTW: It would be great for cassandra to exit on any fatal errors, like assertion problems

Re: Max # of CFs

2012-03-21 Thread Vitalii Tymchyshyn
Hello. There is also a primary row index. Its space can be controlled with the index_interval setting. I don't know if you can look up its memory usage somewhere. If I were you, I'd take the jmap tool and examine the heap histogram first, heap dump second. Best regards, Vitalii Tymchyshyn 20.03.12 18

Exceptions related to thrift transport

2012-03-21 Thread Tiwari, Dushyant
Hi Cassandra Users, A couple of questions on the server side exceptions that I see sometimes - 1. org.apache.thrift.TException: Message length exceeded: 134218240 How to configure message length? 2. org.apache.thrift.protocol.TProtocolException: Missing version in readMessageB