Reduce Cassandra GC

2013-04-16 Thread Joel Samuelsson
Hi, We have a small production cluster with two nodes. The load on the nodes is very small, around 20 reads / sec and about the same for writes. There are around 2.5 million keys in the cluster and a RF of 2. About 2.4 million of the rows are skinny (6 columns) and around 3kb in size (each). Curr

C* consumes all RAM

2013-04-16 Thread Mikhail Mazursky
Hello. C* has been running without any problem for some weeks, but now it has started to consume all available RAM. The cluster has very little data in it. There are no errors in the logs, the CPU is not loaded at all, jstack shows no deadlocks, and there are 83 threads. Read/write latency is 1-4 ms. The questi

Re: C* consumes all RAM

2013-04-16 Thread Mikhail Mazursky
More details: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND 219 3801 0.7 92.7 6561116 3567016 ? SLl Mar11 372:44 /usr/java/latest/bin/java Linux XXX@YYY 3.2.30-49.59.amzn1.x86_64 #1 SMP Wed Oct 3 19:54:33 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux total

RE: Reduce Cassandra GC

2013-04-16 Thread Viktor Jevdokimov
For more than 40 GB of data, 1 GB of heap is too low. Best regards / Pagarbiai Viktor Jevdokimov Senior Developer

Configuring 2 node Cassandra cluster

2013-04-16 Thread Sai Kumar Ganji
Hello Guys, I am trying to set up a 2 node Cassandra cluster. My parameters are: *Node1(ip1):* initial_token: 0 rpc_address: ip1 listen_address: ip1 seeds: "ip1" *Node2(ip2):* initial_token: 85070591730234615865843651857942052864 rpc_address: ip1 listen_address: ip1 seeds: "ip1" and in Node2

Re: Configuring 2 node Cassandra cluster

2013-04-16 Thread Alain RODRIGUEZ
Could we have the logs after starting each node? 2013/4/16 Sai Kumar Ganji > Hello Guys, > > I am trying to set up a 2 node Cassandra cluster. > > My parameters are: > > *Node1(ip1):* > > initial_token: 0 > rpc_address: ip1 > listen_address: ip1 > seeds: "ip1" > > *Node2(ip2):* > > initial_tok

Re: Reduce Cassandra GC

2013-04-16 Thread Joel Samuelsson
How do you calculate the heap / data size ratio? Is this a linear ratio? Each node has slightly more than 12 GB right now though. 2013/4/16 Viktor Jevdokimov > For a >40GB of data 1GB of heap is too low. > > Best regards / Pagarbiai > *Viktor Jevdokimov* > Senior Developer > >

MySQL Cluster performing faster than Cassandra cluster on single table

2013-04-16 Thread jrdn hannah
Hi, I was wondering if anybody here had any insight into this. I was running some tests on cassandra and mysql performance, with a two node and three node cassandra cluster, and a five node mysql cluster (mgmt, 2 x api, 2 x data). On the cassandra 2 node cluster vs mysql cluster, I was getting

Re: MySQL Cluster performing faster than Cassandra cluster on single table

2013-04-16 Thread horschi
Hi Hannah, mysql-cluster is an in-memory database. In-memory is fast. But I don't think you will ever be able to store hundreds of gigabytes of data on a node, which is something you can do with Cassandra. If your dataset is small, then maybe NDB is the better choice for you. I myself will not even tou

Re: Lost data after expanding cluster c* 1.2.3-1

2013-04-16 Thread Kais Ahmed
Thanks Aaron, I feel that rebuilding the indexes went well, but the result of my query (SELECT * FROM userdata WHERE login='kais';) is still empty. INFO [Creating index: userdata.userdata_login_idx] 2013-03-30 01:16:33,110 SecondaryIndex.java (line 175) Submitting index build of userdata.userdata_logi

Re: Does Memtable resides in Heap?

2013-04-16 Thread Jay Svc
Thanks, Edward! On Fri, Apr 12, 2013 at 9:46 AM, Edward Capriolo wrote: > This issue describes the design of the arena allocation of memtables. > https://issues.apache.org/jira/browse/CASSANDRA-2252 > > > On Fri, Apr 12, 2013 at 1:35 AM, Viktor Jevdokimov < > viktor.jevdoki...@adform.com> wrote:

Re: Configuring 2 node Cassandra cluster

2013-04-16 Thread Alicia Leong
*Node2(ip2):* initial_token: 85070591730234615865843651857942052864 rpc_address: ip*1* < Should be *ip2* listen_address: ip*1* < Should be *ip2* seeds: "ip1" On Tue, Apr 16, 2013 at 5:43 PM, Alain RODRIGUEZ wrote: > Could we have the logs after starting each node ? > > > 201
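The two initial_token values quoted in this thread (0 and 85070591730234615865843651857942052864) are what even spacing over a ring of 2**127 tokens gives for two nodes. The sketch below reproduces that arithmetic. It assumes the cluster uses RandomPartitioner, whose token range is 0 to 2**127 - 1; the thread does not state the partitioner, and `initial_tokens` is a hypothetical helper name, not part of Cassandra.

```python
# Minimal sketch: evenly spaced initial_token values.
# Assumption: RandomPartitioner, whose ring holds 2**127 tokens.
RING_SIZE = 2 ** 127


def initial_tokens(node_count):
    """Return one evenly spaced token per node (hypothetical helper)."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Integer division keeps the result exact for large integers.
    return [i * RING_SIZE // node_count for i in range(node_count)]


if __name__ == "__main__":
    for node, token in enumerate(initial_tokens(2), start=1):
        print(f"Node{node} initial_token: {token}")
```

The token only decides which part of the ring a node owns. The rpc_address and listen_address corrections above are a separate matter: each node needs its own address there.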

Re: Configuring 2 node Cassandra cluster

2013-04-16 Thread Edward Capriolo
If you are using a two-node Cassandra cluster locally, use ccm; it builds all the configuration files for you. https://github.com/pcmanus/ccm On Tue, Apr 16, 2013 at 11:06 AM, Alicia Leong wrote: > > > *Node2(ip2):* > > initial_token: 85070591730234615865843651857942052864 > rpc_address: ip*1*

Re: MySQL Cluster performing faster than Cassandra cluster on single table

2013-04-16 Thread jrdn hannah
Ah, I see, that makes sense. Have you got a source for the storing of hundreds of gigabytes? And does Cassandra not store anything in memory? Yeah, my dataset is small at the moment - perhaps I should have chosen something larger for the work I'm doing (University dissertation), however, it is

Re: MySQL Cluster performing faster than Cassandra cluster on single table

2013-04-16 Thread jrdn hannah
Yeah, I remember reading about that, but the schema had already been set and submitted. I will have to take that into consideration when discussing the results. Thanks, Hannah On 16 Apr 2013, at 17:42, Robert Coli wrote: > On Tue, Apr 16, 2013 at 3:56 AM, jrdn hannah wrote: > For example, on

Re: MySQL Cluster performing faster than Cassandra cluster on single table

2013-04-16 Thread Robert Coli
On Tue, Apr 16, 2013 at 3:56 AM, jrdn hannah wrote: > For example, on updating a single table in MySQL, with the equivalent > super column in Cassandra, I was getting results of 0.231 ms for MySQL and > 1.248ms for Cassandra to perform the update 1000 times. > You probably do not want to use a S

Re: MySQL Cluster performing faster than Cassandra cluster on single table

2013-04-16 Thread horschi
> Ah, I see, that makes sense. Have you got a source for the storing of > hundreds of gigabytes? And does Cassandra not store anything in memory? > It stores bloom filters and index-samples in memory. But they are much smaller than the actual data and they can be configured. > > Yeah, my dataset is

RE: Reduce Cassandra GC

2013-04-16 Thread Viktor Jevdokimov
How could one provide any help without any knowledge of your cluster, node and environment settings? 40GB was calculated from 2 nodes with RF=2 (each has 100% data range), 2.4-2.5M rows * 6 cols * 3kB as a minimum without compression and any overhead (sstable, bloom filters and indexes). Wi
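The estimate in this reply is plain multiplication, which the following sketch reproduces. It assumes 1 kB = 1000 bytes and takes the figures exactly as stated in the reply (3 kB counted per column); `estimate_node_data_bytes` is a hypothetical helper, and the result ignores compression and on-disk overhead, as the reply itself notes.

```python
# Minimal sketch of the data-size estimate quoted in the reply.
# Assumptions: 1 kB = 1000 bytes; 3 kB is counted per column, as in the reply.
# With 2 nodes and RF=2, every node holds the full data set.


def estimate_node_data_bytes(rows, cols_per_row, bytes_per_col):
    """Raw data per node, without compression or sstable overhead."""
    return rows * cols_per_row * bytes_per_col


low = estimate_node_data_bytes(2_400_000, 6, 3_000)
high = estimate_node_data_bytes(2_500_000, 6, 3_000)

if __name__ == "__main__":
    print(f"low estimate:  {low / 1e9:.1f} GB")
    print(f"high estimate: {high / 1e9:.1f} GB")
```

Both ends of the 2.4-2.5M row range land above 40 GB under these assumptions, which matches the ">40GB" figure given earlier in the thread.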

Re: 1.1.9 to 1.2.3 upgrade issue

2013-04-16 Thread aaron morton
> Is this a known issue? Or rolling upgrade from 1.1.x to 1.2.x not possible? Definitely supported. The error is from the coordinator processing the response from a replica. The size of the digest in the response has been misreported. Was this only reported on the 1.1.9 nodes? Did you compl

Re: re-execution of failed queries with rpc_timeout

2013-04-16 Thread aaron morton
If you are using Counters you need to do everything you can to avoid timeouts. In the worst case we do not know where it has been applied. The increment is applied on a lead and then replicated to the others; if the coordinator is not the lead it may not know if the increment was applied at al

Re: re-execution of failed queries with rpc_timeout

2013-04-16 Thread Edward Capriolo
Q: The newer versions of Cassandra include extra information in the exception, I **think** you can use that information to determine how many machines the operation succeeded on. However I do not think that information means you can make counters that timed out "bulletproof" On Tue, Apr 16, 2013

Re: Rename failed while cassandra is starting up

2013-04-16 Thread aaron morton
I forgot to add I created a ticket for it https://issues.apache.org/jira/browse/CASSANDRA-5469 See that ticket for recent changes to the MeteredFlusher. IMHO this is not related to the metered flusher. Index rebuilds force a flush. Cheers - Aaron Morton Freelance Cassandra Co

Re: Added extra column as composite key while creation counter column family

2013-04-16 Thread aaron morton
What version are you using? With 1.2.4 … cqlsh:dev> CREATE TABLE counters ( ... key text, ... value counter, ... PRIMARY KEY (key) ... ) WITH COMPACT STORAGE; cqlsh:dev> describe table counters; CREATE TABLE counters ( key text PRIMARY KEY, value counter

Re: Any experience of 20 node mini-itx cassandra cluster

2013-04-16 Thread aaron morton
> Can't we use LCS? Do some reading and some tests… http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra http://www.datastax.com/dev/blog/when-to-use-leveled-compaction Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.t

Re: Any experience of 20 node mini-itx cassandra cluster

2013-04-16 Thread Jabbar Azam
I already have, thanks. I'll do the tests when the hardware arrives. Thanks Jabbar Azam On 16 April 2013 22:27, aaron morton wrote: > Can't we use LCS? > > Do some reading and some tests… > > http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra > http://www.datastax.com/dev/

Re: Extracting data from SSTable files with MapReduce

2013-04-16 Thread aaron morton
> I did try to upgrade to 1.2 but it did not work out. Maybe too many versions > in between. Newer versions should be able to read older file formats. What was the error? > Why would later formats make this easier you think? It will be easier to write against the current code base and you find it

Re: StatusLogger format?

2013-04-16 Thread aaron morton
> 99% sure it's in bytes. +1 to your confidence level. Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 16/04/2013, at 6:14 AM, William Oberman wrote: > 99% sure it's in bytes. > > > On Mon, Apr 15, 2013 at 11:25

Re: Thrift message length exceeded

2013-04-16 Thread aaron morton
Can you confirm that you are using the same Thrift version that ships with 1.2.3? Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 16/04/2013, at 10:17 AM, Lanny Ripple wrote: > A bump to say I found this > > > http:/

Re: C* consumes all RAM

2013-04-16 Thread aaron morton
You are probably seeing this http://wiki.apache.org/cassandra/FAQ#mmap Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 16/04/2013, at 8:43 PM, Mikhail Mazursky wrote: > More details: > > USER PID %CPU %MEM

Re: Lost data after expanding cluster c* 1.2.3-1

2013-04-16 Thread aaron morton
Sorry, can you repost the details of that issue, including the CL you are using? Aaron - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 17/04/2013, at 12:57 AM, Kais Ahmed wrote: > Thanks aaron, > > I feel that rebuilding i

Re: Does Memtable resides in Heap?

2013-04-16 Thread aaron morton
Compression metadata is also off heap http://www.datastax.com/dev/blog/performance-improvements-in-cassandra-1-2 Cheers - Aaron Morton Freelance Cassandra Consultant New Zealand @aaronmorton http://www.thelastpickle.com On 17/04/2013, at 3:02 AM, Jay Svc wrote: > Thanks Edwa

Repair Freeze / Gossip Invisibility / EC2 Public IP configuration

2013-04-16 Thread Arya Goudarzi
TL;DR; An EC2 Multi-Region Setup's Repair/Gossip Works with 1.1.10 but with 1.2.4, gossip does not see the nodes after restarting all nodes at once, and repair gets stuck. This is a working configuration: Cassandra 1.1.10 Cluster with 12 nodes in us-east-1 and 12 nodes in us-west-2 Using Ec2MultiR

Re: Repair Freeze / Gossip Invisibility / EC2 Public IP configuration

2013-04-16 Thread Edward Capriolo
So Cassandra does internode compression. I have not checked, but this might be accidentally getting turned on by default. The storage port is typically 7000, so I am not sure why you are allowing 7100. In any case, try allowing 7000, or try with internode compression off. On Tue, Apr 16, 2013 at 6:42 P

Re: unexplained hinted handoff

2013-04-16 Thread Dane Miller
On Sun, Apr 14, 2013 at 11:28 AM, aaron morton wrote: >> If hints are being stored, doesn't that imply DOWN nodes, and why don't I >> see that in the logs? > > Hints are stored for two reasons. First if the node is down when the write > request starts, second if the node does not reply to the coo

How to stop Cassandra and then restart it in windows?

2013-04-16 Thread Raihan Jamal
Hello, I installed a single-node cluster on my local dev box, which is running Windows 7, and it was working fine. For some reason I needed to restart my desktop, and since then, whenever I run the following at the command prompt, it always gives me the exception below: S:\Apache Cassandra\apac

Cassandra Client Recommendation

2013-04-16 Thread Techy Teck
Hello, I have recently started working with Cassandra Database. Now I am in the process of evaluating which Cassandra client I should go forward with. I am mainly interested in these three: 1) Astyanax client 2) New Datastax client that uses Binary protocol. 3) Pelops clien

Re: MySQL Cluster performing faster than Cassandra cluster on single table

2013-04-16 Thread Jabbar Azam
MySQL Cluster also keeps the index in RAM, so with lots of rows the RAM becomes a limiting factor. That's what my colleague found, and hence why we're sticking with Cassandra. On 16 Apr 2013 21:05, "horschi" wrote: > > > Ah, I see, that makes sense. Have you got a source for the storing of >> hundr

Re: C* consumes all RAM

2013-04-16 Thread Mikhail Mazursky
Thank you, Aaron. P.S. We're on 1.1.9; I forgot to mention that. 2013/4/17 aaron morton > You are probably seeing this http://wiki.apache.org/cassandra/FAQ#mmap > > Cheers > >- > Aaron Morton > Freelance Cassandra Consultant > New Zealand > > @aaronmorton > http://www.thel

Commit Log question

2013-04-16 Thread aaron morton
I'm looking into a case where it appears that recycling a commit log segment and flushing the dirty CF's results in 46 CF's being flushed. Out of 47 in the keyspace. All this flush activity blocks writes. Before I dig further I wanted to confirm my understanding. At 10:46 the MeteredFlusher ki

Re: Added extra column as composite key while creation counter column family

2013-04-16 Thread Kuldeep Mishra
Cassandra 1.2.0. Is it a bug in 1.2.0? Thanks KK On Wed, Apr 17, 2013 at 2:56 AM, aaron morton wrote: > What version are you using ? > > With 1.2.4 … > > cqlsh:dev> CREATE TABLE counters ( >... key text, >... value counter, >... PRIMARY KEY (key) >...

Re: Cassandra Client Recommendation

2013-04-16 Thread Everton Lima
Hi Techy, We are using Astyanax with Cassandra 1.2.4. Benefits: * It is very easy to configure and use. * Good wiki * Maintained by Netflix * Solution to manage the storage of big files (more than 15 MB) * Solution to read all rows efficiently Problems: * It consumes more memory 2013/4/16 Tec

Re: Cassandra Client Recommendation

2013-04-16 Thread Techy Teck
Thanks, Everton, for the suggestion. A couple of questions: 1) Does the Astyanax client have any problems with previous versions of Cassandra? 2) You mentioned one problem, that it will consume more memory. Can you elaborate on that slightly? What do you mean by that? 3) Does Astyanax support async capabilities?

Re: Cassandra Client Recommendation

2013-04-16 Thread Everton Lima
1) Does Astyanax client have any problem with previous version of Cassandra? We have used it with 1.1.8, but for that version we did not use the latest version of Astyanax. But I think that with Cassandra 1.2.* the latest version of Astyanax will work. 2) You said one problem, that it will consume more memo

RE: Cassandra Client Recommendation

2013-04-16 Thread Francisco Trujillo
Hi, We are using Cassandra 1.6 at this moment. We started working with Hector, because it is the first recommendation that you find in a simple Google search for Cassandra Java clients. We started using Hector, but when we started to have non-dynamic column families, that can be managed using