Re: monitor cassandra 1.1.6 with MX4J

2012-11-12 Thread Francisco Trujillo Hacha
Thanks Michael. By adding the config to the file I can configure the port and address to use with mx4j. Wei Zhu, you are right. If you only add the mx4j-tool to the classpath, Cassandra enables mx4j by default on the default port 8081. 2012/11/12 Wei Zhu > In my cassandra-env.sh for 1.1.6, ther

Re: Counter column families (pending replicate on write stage tasks)

2012-11-12 Thread Rob Coli
On Mon, Nov 12, 2012 at 3:35 PM, cem wrote: > We are currently facing a performance issue with counter column families. I > see lots of pending ReplicateOnWriteStage tasks in tpstats. Then I disabled > replicate_on_write. It helped a lot. I want to use it like that but I am not > sure how to use it.

Re: remove DC

2012-11-12 Thread Jeremiah Jordan
If you wrote any data to DC2 since the last time you ran repair, you should probably run repair to make sure that data made it over to DC1. If you never wrote data directly to DC2, then you are correct: you don't need to run repair. You should just need to update the schema, and t

Re: read request distribution

2012-11-12 Thread Kirk True
Somewhat recently the Ownership column was changed to Effective Ownership. Previously the formula was essentially 100/(number of nodes). Now it's 100*(replication factor)/(number of nodes). So in previous releases of Cassandra it would be 100/12 = 8.33%; now it would be closer to 25% (8.33*3, assuming a replication factor of three). Kirk On M
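The arithmetic in this reply can be sketched in a few lines. This is only an illustration, not Cassandra's actual code: it assumes an evenly balanced ring in a single data center, and the function names are hypothetical.

```python
def ownership_percent(node_count: int) -> float:
    """Older 'Ownership' figure: each node's share of the token ring.

    Assumes tokens are evenly balanced across the ring.
    """
    return 100.0 / node_count


def effective_ownership_percent(node_count: int, replication_factor: int) -> float:
    """'Effective Ownership': share of the data a node holds once replicas are counted.

    Capped at 100% because a node cannot hold more than all of the data.
    """
    return min(100.0, 100.0 * replication_factor / node_count)


if __name__ == "__main__":
    # 12-node cluster with RF=3, as described in the thread.
    print(round(ownership_percent(12), 2))
    print(effective_ownership_percent(12, 3))
    # 3-node cluster with RF=3: every node holds all of the data.
    print(effective_ownership_percent(3, 3))
```

With 12 nodes and a replication factor of 3 the two figures differ by a factor of three, which matches the 8.33 versus 25 numbers quoted above.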

Re: read request distribution

2012-11-12 Thread Ananth Gundabattula
Hi all, On an unrelated observation of the below readings, it looks like all the 3 nodes own 100% of the data. This confuses me a bit. We have a 12 node cluster with RF=3 but the effective ownership is shown as 8.33%. So here is my question. How is the ownership calculated: is Replica factor c

Counter column families (pending replicate on write stage tasks)

2012-11-12 Thread cem
Hi All, We are currently facing a performance issue with counter column families. I see lots of pending ReplicateOnWriteStage tasks in tpstats. Then I disabled replicate_on_write. It helped a lot. I want to use it like that but I am not sure how to use it. I have a 3 node setup with replication fact

Re: Single Node Cassandra Installation

2012-11-12 Thread Rob Coli
On Sat, Nov 10, 2012 at 6:16 PM, Drew Kutcharian wrote: > Thanks Rob, this makes sense. We only have one rack at this point, so I think > it'd be better to start with PropertyFileSnitch to make Cassandra think that > these nodes each are in a different rack without having to put them on > diffe

Re: backup/restore from sstable files ?

2012-11-12 Thread Rob Coli
On Sat, Nov 10, 2012 at 3:00 PM, Tyler Hobbs wrote: > For an alternative that doesn't require the same ring topology, you can use > the bulkloader, which will take care of distributing the data to the correct > nodes automatically. For more details on which cases are best for the different bulk l

Re: read request distribution

2012-11-12 Thread Wei Zhu
That is actually my original question. All three nodes have the complete data, all of them have exactly the same hardware/software configuration, and the client uses RR to distribute the read requests among the nodes. Why does one of them consistently report much larger latency than the other tw

Re: Questions around the heap

2012-11-12 Thread aaron morton
For background, this thread discusses the working for cassandra http://www.mail-archive.com/user@cassandra.apache.org/msg25762.html tl;dr you can work it out or guess based on the tenured usage after CMS. > How can we know how the heap is being used, monitor it ? My favourite is to turn on the

Re: CF metadata syntax for an array

2012-11-12 Thread Kevin Burton
While this solves the problem for an array of 'primitive' types, what if I want an array or collection of an arbitrary type like list<foo>, where foo is a user defined type? I am guessing that this cannot be done with 'collections'. What are the options to solve this type of array? On Nov 12, 2012,

Re: read request distribution

2012-11-12 Thread Tyler Hobbs
Whichever node gets the initial Thrift request from the client is always the coordinator; there's no concept of making another node the coordinator. As far as QUORUM goes, only two nodes need to give a response to meet the consistency level, so Cassandra only sends out two read requests: one data
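The read fan-out described here can be sketched as follows. This is a minimal illustration under stated assumptions, not Cassandra's implementation: the quorum size uses the usual floor(RF/2) + 1 formula, the split into one data read plus digest reads for the remaining replicas follows the description in this thread, and the function names are hypothetical.

```python
def quorum_size(replication_factor: int) -> int:
    """Number of replicas that must respond to satisfy QUORUM."""
    return replication_factor // 2 + 1


def quorum_read_plan(replication_factor: int) -> dict:
    """Read requests the coordinator sends for a QUORUM read.

    Assumption taken from the thread: one replica is asked for the full data
    and the other replicas needed for the quorum are asked for a digest only.
    """
    needed = quorum_size(replication_factor)
    return {"data_requests": 1, "digest_requests": needed - 1}


if __name__ == "__main__":
    # RF=3, as in the thread: two replicas are contacted in total.
    print(quorum_size(3))
    print(quorum_read_plan(3))
```

For a replication factor of three this gives two requests in total, which is why the third replica may see no read traffic for a given QUORUM request.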

Re: read request distribution

2012-11-12 Thread Wei Zhu
Thanks Tyler for the information. From the online document: QUORUM Returns the record with the most recent timestamp after a quorum of replicas has responded. It's hard to know that digest query will be sent to *one* other replica. When the node gets the request, does it become the coordinator

Re: CF metadata syntax for an array

2012-11-12 Thread aaron morton
This may help http://www.datastax.com/dev/blog/cql3_collections > I have gotten as far as feeling a need to understand a ‘super-column’ You can happily ignore them. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 12/11/2012, at 8:35 PM,

remove DC

2012-11-12 Thread William Oberman
There is a great guide here on how to add resources: http://www.datastax.com/docs/1.1/operations/cluster_management#adding-capacity What about deleting resources? I'm thinking of removing a data center. Clearly I'd need to change strategy options, which is currently something like this: {DC1:3,DC

Re: monitor cassandra 1.1.6 with MX4J

2012-11-12 Thread Wei Zhu
In my cassandra-env.sh for 1.1.6, there is no setting regarding mx4j at all. I simply dropped the mx4j jar into the lib folder and enabled jmx from cassandra-env.sh, and I can connect to the default mx4j port 8081 with no problem. I guess without the mx4j setting, it uses the default port. If you need to

Re: removing SSTABLEs

2012-11-12 Thread B. Todd Burruss
thx, i was pretty sure it would be ok (from a cassandra point of view) to remove it, but needed to check. voted up. i like having tools, but i think a few more dials to play with to control compaction would be nice too On Mon, Nov 12, 2012 at 9:01 AM, Edward Capriolo wrote: > Because you did a

Fwd: Cassandra BulkOutputFormat with Hadoop MRv1

2012-11-12 Thread Uldis Barbans
Hello, Is BulkOutputFormat intended to be compatible with MRv1 (mapred) at all? I'm trying to write to Cassandra, roughly following the example at http://shareitexploreit.blogspot.se/2012/03/bulkloadto-cassandra-with-hadoop.html but with MRv1 - that is, calling output.collect(rowkey, Collections.

Re: [BETA RELEASE] Apache Cassandra 1.2.0-beta2 released

2012-11-12 Thread Sylvain Lebresne
On Mon, Nov 12, 2012 at 6:10 PM, Tyler Hobbs wrote: > On Sun, Nov 11, 2012 at 4:21 AM, Sylvain Lebresne wrote: > >> Actually, if we're going to be precise, it's -2^63 to 2^63 - 1. >> Long.MIN_VALUE is not a valid token for technical reasons. > > > I think you typo'd, so just to be clear, did you

Re: Strange delay in query

2012-11-12 Thread Binh Nguyen
I don't think that statement is accurate. The minor compaction is still triggered for small sstables but for the big sstables it may or may not. By default Cassandra will wait until it finds 4 sstables of the same size to trigger the compaction so if the sstables are big then it may take a while to

Re: [BETA RELEASE] Apache Cassandra 1.2.0-beta2 released

2012-11-12 Thread Tyler Hobbs
On Sun, Nov 11, 2012 at 4:21 AM, Sylvain Lebresne wrote: > Actually, if we're going to be precise, it's -2^63 to 2^63 - 1. > Long.MIN_VALUE is not a valid token for technical reasons. I think you typo'd, so just to be clear, did you mean to say that the lower end of the range is -2^63 + 1 (leav

Re: removing SSTABLEs

2012-11-12 Thread Edward Capriolo
Because you did a major compaction, that table is larger than all the rest. So it will never go away until you have 3 other tables about that size or you run major compaction again. You should vote on the ticket: https://issues.apache.org/jira/browse/CASSANDRA-4766 On Mon, Nov 12, 2012 at 11:51 A

Re: 3 data centers

2012-11-12 Thread Edward Capriolo
You can not defeat the speed of light. You can read up on LOCAL_QUORUM and EACH_QUORUM consistency levels here: http://www.packtpub.com/article/apache-cassandra-working-multiple-datacenter-environments Look for the recipe: Quorum operations in multi-datacenter environments On Mon, Nov 12, 2012 a

Re: removing SSTABLEs

2012-11-12 Thread Jason Wee
Will the existence of sstable X have an impact on the system or cluster? When the compaction threshold is reached, sstable x and sstable y will be compacted. It's more the system's responsibility than a matter for human intervention. On Mon, Nov 12, 2012 at 12:09 PM, B. Todd Burruss wrote: > if i stop

3rd installment of the Paris Cassandra meetup

2012-11-12 Thread Sylvain Lebresne
Parisian folks (for the others, sorry for the spam), You will want to mark next Monday in your Agenda as we're having the 3rd Paris Cassandra meetup. This time, Jonathan Ellis will be our speaker (somehow we get more speakers from Texas than from Paris in this meetup, go figure) who will present t

Re: Memory Manager

2012-11-12 Thread Brian Tarbox
Can you supply your java parameters? On Mon, Nov 12, 2012 at 7:29 AM, Everton Lima wrote: > Hi people, > > I was using cassandra on distributed project. I am using java 6 and > cassandra 1.1.6. My problem is in Memory manager (I think). My system was > throwing heap limit exception. > The problem

3 data centers

2012-11-12 Thread Baskar Duraikannu
Good morning.  We are thinking of setting up 3 data centers with "NetworkTopologyStrategy" with each DC having 1 copy.  All three data centers are going to be connected by dark fiber with < 10 ms latency.  Due to network latency, all QUORUM reads and writes will be slower. Has anyone used this

Memory Manager

2012-11-12 Thread Everton Lima
Hi people, I was using Cassandra on a distributed project. I am using java 6 and cassandra 1.1.6. My problem is in Memory manager (I think). My system was throwing heap limit exception. The problem is that after some inserts (2Gb) the Old Gen memory of the heap fills up and cannot be cleaned. This problem,

Re: monitor cassandra 1.1.6 with MX4J

2012-11-12 Thread Michal Michalski
Hmm... It looks like it wasn't merged at some time (why?), because I can see that appropriate lines were present in a few branches. I didn't check if it works, but looking at git history tells me that you could try modifying cassandra-env.sh like this: Add this somewhere & configure: # To use

Re: Questions around the heap

2012-11-12 Thread Alain RODRIGUEZ
It's been Does anybody has an answer to any of these questions ? Alain 2012/11/7 Hiller, Dean > +1, I am interested in this answer as well. > > From: Alain RODRIGUEZ mailto:arodr...@gmail.com>> > Reply-To: "user@cassandra.apache.org" < > user@cassandra.apache.

Re: Hinted Handoff runs every ten minutes

2012-11-12 Thread Vegard Berget
Hi, HintsColumnFamily directory on Node 1 (the first to be upgraded):
1.8K Oct 27 11:27 system-HintsColumnFamily-hf-2-Data.db
79 Oct 27 11:27 system-HintsColumnFamily-hf-2-Digest.sha1
496 Oct 27 11:27 system-HintsColumnFamily-hf-2-Filter.db
26 Oct 27 11:27 system-HintsColumnFamily-hf-2-Index.db
4.3K Oc