nodetool repair with vnodes

2013-02-17 Thread Marco Matarazzo
Greetings. I'm trying to run "nodetool repair" on a Cassandra 1.2.1 cluster of 3 nodes with 256 vnodes each. On a pre-1.2 cluster I used to launch a "nodetool repair" on every node every 24 hours. Now I'm getting a different behavior, and I'm sure I'm missing something. What I see on the command

Re: Size Tiered -> Leveled Compaction

2013-02-17 Thread Mike
Hello Wei, First, thanks for this response. Out of curiosity, what SSTable size did you choose for your use case, and what made you decide on that number? Thanks, -Mike On 2/14/2013 3:51 PM, Wei Zhu wrote: I haven't tried to switch compaction strategy. We started with LCS. For us, after mass

Re: virtual nodes + map reduce = too many mappers

2013-02-17 Thread cem
Thanks Eric for the appreciation :) Default split size is 64K rows. ColumnFamilyInputFormat first collects all tokens and creates a split for each. If you have 256 vnodes for each node, then it creates 256 splits even if you have no data at all. The current split size will only work if you have a vnode t
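The split arithmetic described in this message can be sketched roughly as follows. This is a toy model, not the ColumnFamilyInputFormat code: it assumes every token range yields at least one split even when empty, and that a non-empty range yields one split per 64K rows. The function name `estimate_splits` and the row counts are hypothetical.

```python
import math

# "Default split size is 64K rows", per the message above.
DEFAULT_SPLIT_SIZE = 64 * 1024


def estimate_splits(rows_per_range, split_size=DEFAULT_SPLIT_SIZE):
    """Estimate the number of input splits for a list of token ranges.

    Assumption (not taken from the Cassandra source): each token range
    produces at least one split, and a range holding data produces
    ceil(rows / split_size) splits.
    """
    return sum(max(1, math.ceil(rows / split_size)) for rows in rows_per_range)


# Hypothetical node with 256 vnodes and no data at all.
empty_node = [0] * 256
print(estimate_splits(empty_node))  # 256
```

Under this model the number of mappers is bounded below by the number of token ranges, so raising the split size alone does not reduce it.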

Re: NPE in running "ClientOnlyExample"

2013-02-17 Thread Edward Capriolo
This is a bad example to follow. This is the internal client the Cassandra nodes use to talk to each other (fat client). Usually you do not use this unless you want to write some embedded code on the Cassandra server. Typically clients use thrift/native transport. But you are likely getting the err

Re: Deleting old items during compaction (WAS: Deleting old items)

2013-02-17 Thread aaron morton
That's what the TTL does. Manually delete all the older data now, then start using TTL. Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 13/02/2013, at 11:08 PM, Ilya Grebnov wrote: > Hi, > > We looking for solut

Re: Mutation dropped

2013-02-17 Thread aaron morton
You are hitting the maximum throughput on the cluster. The messages are dropped because the node fails to start processing them before rpc_timeout. However the request is still a success because the client requested CL was achieved. Testing with RF 2 and CL 1 really just tests the disks on
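The point that a write can succeed while some replicas drop the mutation can be illustrated with a toy model. Everything here is an assumption for illustration: the function `write_outcome`, the per-replica queue delays, and the timeout value are hypothetical, and the real coordinator logic is more involved.

```python
def write_outcome(queue_delays_ms, rpc_timeout_ms, consistency_level):
    """Return (succeeded, dropped) for one write sent to several replicas.

    queue_delays_ms: hypothetical time each replica waits before it starts
    processing the mutation. A replica drops the mutation when that wait
    exceeds rpc_timeout_ms. The write succeeds when at least
    `consistency_level` replicas apply it.
    """
    applied = sum(1 for delay in queue_delays_ms if delay <= rpc_timeout_ms)
    dropped = len(queue_delays_ms) - applied
    return applied >= consistency_level, dropped


# RF 2, CL 1: one overloaded replica drops the mutation, yet the write succeeds.
print(write_outcome([5, 12000], rpc_timeout_ms=10000, consistency_level=1))  # (True, 1)
```

With the same delays and a consistency level of 2, the model reports a failed write, which is why dropped mutations are easier to miss at low consistency levels.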

Re: [nodetool] repair with vNodes

2013-02-17 Thread aaron morton
I'm a bit late, but for reference. Repair runs in two stages. First, differences are detected. You can monitor the validation compaction with nodetool compactionstats. Then the differences are streamed between the nodes; you can monitor that with nodetool netstats. > Nodetool repair command h
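A small helper for watching the two stages named in this message might look like the sketch below. The stage labels "validation" and "streaming" and the function names are hypothetical; only the `compactionstats` and `netstats` subcommands come from the message. It assumes `nodetool` is on the PATH and that the `-h` host option is accepted.

```python
import subprocess

# Stage label (hypothetical) -> nodetool subcommand named in the message above.
STAGE_COMMANDS = {
    "validation": "compactionstats",  # stage 1: differences are detected
    "streaming": "netstats",          # stage 2: differences are streamed
}


def monitor_command(stage, host="localhost"):
    """Build the nodetool argument list for monitoring a repair stage."""
    return ["nodetool", "-h", host, STAGE_COMMANDS[stage]]


def show_stage(stage, host="localhost"):
    """Run the monitoring command and return its text output.

    Requires nodetool to be installed; raises CalledProcessError on failure.
    """
    result = subprocess.run(
        monitor_command(stage, host), capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling `show_stage("validation")` while a repair is running would return the current compaction statistics for the local node, and `show_stage("streaming")` the streaming status.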

Re: Question on Cassandra Snapshot

2013-02-17 Thread aaron morton
> With incremental_backup turned OFF in Cassandra.yaml - Are all SSTables under /data/TestKeySpace/ColumnFamily at all times? No. They are deleted when they are compacted and no internal operations are referencing them. > With incremental_backup turned ON in cassandra.yaml - Are current

unsubscribe

2013-02-17 Thread puneet loya
unsubscribe me please. Thank you

RE: NPE in running "ClientOnlyExample"

2013-02-17 Thread Jain Rahul
Thanks Edward, my bad. I was confused, as it does seem to create the keyspace also, as I understand it (although I'm not sure): List cfDefList = new ArrayList(); CfDef columnFamily = new CfDef(KEYSPACE, COLUMN_FAMILY); cfDefList.add(columnFamily); try { cli

Re: unsubscribe

2013-02-17 Thread Dave Brosius
On 02/17/2013 01:26 PM, puneet loya wrote: unsubscribe me please. Thank you if only directions were followed: http://hadonejob.com/images/full/102.jpg send to user-unsubscr...@cassandra.apache.org

Re: odd production issue today 1.1.4

2013-02-17 Thread aaron morton
There is always this old chestnut http://wiki.apache.org/cassandra/FAQ#ubuntu_hangs A - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpickle.com On 16/02/2013, at 8:22 AM, Edward Capriolo wrote: > With hyper threading a core can show up

Re: cassandra vs. mongodb quick question

2013-02-17 Thread aaron morton
If you have spinning disk and 1G networking and no virtual nodes, I would still say 300G to 500G is a soft limit. If you are using virtual nodes, SSD, JBOD disk configuration or faster networking you may go higher. The limiting factors are the time it takes to repair, the time it takes to rep

Re: can we pull rows out compressed from cassandra(lots of rows)?

2013-02-17 Thread aaron morton
No. The rows are uncompressed deep down in the IO stack. There is compression in the binary protocol http://www.datastax.com/dev/blog/binary-protocol https://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=doc/native_protocol.spec;hb=refs/heads/cassandra-1.2 Cheers ---

unsubscribe

2013-02-17 Thread James Wong
On Feb 17, 2013 10:27 AM, "puneet loya" wrote: > > unsubscribe me please. > > Thank you

Re: unsubscribe

2013-02-17 Thread Michael Kjellman
Please see the Mailing Lists section of the home page. http://cassandra.apache.org user-unsubscr...@cassandra.apache.org From: James Wong mailto:jwong...@gmail.com>> Reply-To: "user@cassandra.apache.org" mailto:user@cassandra.apache.org>> Date: Sunday, Februa

Re: Deleting old items

2013-02-17 Thread aaron morton
I'll email the docs people. I believe they are saying "use compaction throttling rather than this", not "this does nothing". Although I used this in the last month on a machine with very little RAM to limit compaction memory use. Cheers - Aaron Morton Freelance Cassandra Develo

Re: Is there any consolidated literature about Read/Write and Data Consistency in Cassandra ?

2013-02-17 Thread aaron morton
If you want the underlying ideas, try the Dynamo paper, the Bigtable paper and the original Cassandra paper from Facebook. Start here http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton ht

Re: nodetool repair with vnodes

2013-02-17 Thread aaron morton
> …so it seems to me that it is running on all vnodes ranges. Yes. > Also, whatever the node which I launch the command on is, only one node log > is "moving" and is always the same node. Not sure what you mean here. > So, to me, it's like the "nodetool repair" command is running always on the

Re: Nodetool doesn't shows two nodes

2013-02-17 Thread Boris Solovyov
Hi, I've checked all things Alain suggested and set up a fresh 2-node cluster, and I still get the same result: each node lists itself as only one. This time I made the following changes: - I set listen_address to the public DNS name. Internally, AWS's DNS will map this to the 10.x IP, so

Re: Is C* common nickname for Cassandra?

2013-02-17 Thread Boris Solovyov
Thanks. I don't know if anyone cares my opinion, but as a newcomer to the community, my feedback is that it is not needed. At best it confuses a newbie and makes him feel like an outsider. At worst it just looks totally unprofessional, like here: http://www.planetcassandra.org/blog/post/calling-all

Re: Is C* common nickname for Cassandra?

2013-02-17 Thread Michael Kjellman
Why do you feel that link is unprofessional? Just wondering. I actually quite like the abbreviation personally. On Feb 17, 2013, at 1:37 PM, "Boris Solovyov" mailto:boris.solov...@gmail.com>> wrote: Thanks. I don't know if anyone cares my opinion, but as a newcomer to the community, my feedbac

RE: Deleting old items during compaction (WAS: Deleting old items)

2013-02-17 Thread Ilya Grebnov
According to https://issues.apache.org/jira/browse/CASSANDRA-2103 There is no support for time to live (TTL) on counter columns. Did I miss something? Thanks, Ilya From: aaron morton [mailto:aa...@thelastpickle.com] Sent: Sunday, February 17, 2013 9:16 AM To: user@cassandra.apache.org Subjec

Re: Is C* common nickname for Cassandra?

2013-02-17 Thread Boris Solovyov
It is hard to say, really. I guess it just feels not very serious, overly casual, which means not treating the project with respect? I guess I believe if you want something treated with respect you must demonstrate how seriously you take it yourself. I am sure this is personal opinion only, but perhap

Re: Nodetool doesn't shows two nodes

2013-02-17 Thread Boris Solovyov
No, it doesn't work, same thing: both nodes seem to just exist solo and I have 2 single-node clusters :-( OK, so now I am confused, and hope the list will help me out. To understand what is wrong, I think I need to know what happens in node bootstrap, and in node join ring. Who does node communicate, on

Re: nodetool repair with vnodes

2013-02-17 Thread Marco Matarazzo
>> So, to me, it's like the "nodetool repair" command is running always on the >> same single node and repairing everything. > If you use nodetool repair without the -pr flag in your setup (3 nodes and I > assume RF 3) it will repair all token ranges in the cluster. That's correct, 3 nodes and

Re: Nodetool doesn't shows two nodes

2013-02-17 Thread Boris Solovyov
OK. I got it. I realized that storage_port wasn't actually open between the nodes, because it is using the public IP. (I did find this information in the docs, after looking more... it is in the section on "Types of snitches." It explains everything I found by trial and error.) After opening this port 7

Re: Size Tiered -> Leveled Compaction

2013-02-17 Thread Wei Zhu
We doubled the SSTable size to 10M. It still generates a lot of SSTables and we don't see much difference in the read latency. We are able to finish the compactions after repair within several hours. We will increase the SSTable size again if we feel the number of SSTables hurts the performance.

Re: Nodetool doesn't shows two nodes

2013-02-17 Thread Jared Biel
This is something that I found while using the multi-region snitch - it uses public IPs for communication. See the original ticket here: https://issues.apache.org/jira/browse/CASSANDRA-2452. It'd be nice if it used the private IPs to communicate with nodes that are in the same region as itself, but