OOMs during high (read?) load in Cassandra 1.2.11

2013-12-06 Thread Klaus Brunner
We're getting fairly reproducible OOMs on a 2-node cluster using Cassandra 1.2.11, typically in situations with a heavy read load. A sample of some stack traces is at https://gist.github.com/KlausBrunner/7820902 - they're all failing somewhere down from table.getRow(), though I don't know if that's

Try to configure commitlog_archiving.properties

2013-12-06 Thread Bonnet Jonathan
Hello, I am trying to configure commitlog_archiving.properties to take advantage of backup and restore to a point in time, but there are no resources on the internet for that, so I need some help. If I understand correctly, I have 4 parameters: archive_command= restore_command= restore_directories= restore_

cassandra backup

2013-12-06 Thread Marcelo Elias Del Valle
Hello everyone, I am trying to create backups of my data on AWS. My goal is to store the backups on S3 or glacier, as it's cheap to store this kind of data. So, if I have a cluster with N nodes, I would like to copy data from all N nodes to S3 and be able to restore later. I know Priam does th

Re: cassandra backup

2013-12-06 Thread Michael Theroux
Hi Marcelo, Cassandra provides an eventually consistent model for backups.  You can do staggered backups of data, with the idea that if you restore a node, and then do a repair, your data will be once again consistent.  Cassandra will not automatically copy the data to other nodes (other than

Write performance with 1.2.12

2013-12-06 Thread srmore
We have a 3 node cluster running cassandra 1.2.12, they are pretty big machines 64G ram with 16 cores, cassandra heap is 8G. The interesting observation is that, when I send traffic to one node its performance is 2x more than when I send traffic to all the nodes. We ran 1.0.11 on the same box and

help on backup muiltinode cluster

2013-12-06 Thread Amalrik Maia
Hey guys, I'm trying to take backups of a multi-node cassandra and save them on S3. My idea is simply to ssh to each server and use nodetool to create the snapshots, then push them to S3. So is this approach recommended? My concerns are about inconsistencies that this approach can lead to, sin

[JOB] - Full time opportunity in San Francisco bay area

2013-12-06 Thread Gnani Balaraman
We have a full time perm opportunity with a reputable client in the San Francisco bay area. Looking for good Cassandra and Java/ J2EE skills. Should you be interested, please reply with your contact number. Will call to discuss more. Thanks, * _* Gnani Bala

Re: OOMs during high (read?) load in Cassandra 1.2.11

2013-12-06 Thread Jason Wee
Hi, Just taking a wild shot here, sorry if it does not help. Could it be thrown while reading the sstable? If so, try to find the configuration parameters for read operations and tune those settings down a little. Also check the chunk_length_kb. http://www.datastax.com/documentation/cql

Re: new project - Under Siege

2013-12-06 Thread Vicky Kak
Out of curiosity with quick look I found you have directory name as com.shift.undersiege https://github.com/StartTheShift/UnderSiege/blob/master/src/main/java/com.shift.undersiege/StatsdReporter.java You should have created the directory structure as src/main/java/com/shift/undersiege/StatsReporte

Re: OOMs during high (read?) load in Cassandra 1.2.11

2013-12-06 Thread Vicky Kak
I am not sure if you had got a chance to take a look at this http://www.datastax.com/docs/1.1/troubleshooting/index#oom http://www.datastax.com/docs/1.1/install/recommended_settings Can you attach the cassandra logs and the cassandra.yaml, it should be able to give us more details about the issue?

Re: how to find nodes by row key?

2013-12-06 Thread Daneel Yaitskov
Thanks Rob, There is one thing that bothers me. I have a complex row key. $ create table b (x int, s text, primary key ((x,s))); In cqlsh I cannot fill the row key partially: $ insert into b (x) values(4); Bad Request: Missing mandatory PRIMARY KEY part s But nodetool can find hosts by an incomplete key $ n

Re: Write performance with 1.2.12

2013-12-06 Thread Vicky Kak
Hard to say much without knowing about the cassandra configurations. Yes, compactions/GCs could spike the CPU; I had similar behavior with my setup. -VK On Fri, Dec 6, 2013 at 7:40 PM, srmore wrote: > We have a 3 node cluster running cassandra 1.2.12, they are pretty big > machines 64G ram wit

Re: Try to configure commitlog_archiving.properties

2013-12-06 Thread Vicky Kak
>>Why, can you give me a good example and the good way to configure archive >>commit logs ? Take a look at the cassandra code ;) On Fri, Dec 6, 2013 at 3:34 PM, Bonnet Jonathan < jonathan.bon...@externe.bnpparibas.com> wrote: > Hello, > > I try to configure commitlog_archiving.properties to t

Re: cassandra backup

2013-12-06 Thread Rahul Menon
You should look at this - https://github.com/amorton/cassback I don't believe it's set up to use 1.2.10 and above, but I believe it needs just small tweaks to get it running. Thanks Rahul On Fri, Dec 6, 2013 at 7:09 PM, Michael Theroux wrote: > Hi Marcelo, > > Cassandra provides and eventually consisten

Re: Counters question - is there a better way to count

2013-12-06 Thread Alex Popescu
On Thu, Dec 5, 2013 at 7:44 AM, Christopher Wirt wrote: > I want to build a really simple column family which counts the occurrence > of a single event X. > > The guys from Disqus are big into counters: https://www.youtube.com/watch?v=A2WdS0YQADo http://www.slideshare.net/planetcassandra/cassand

Re: How to monitor the progress of a HintedHandoff task?

2013-12-06 Thread Rahul Menon
Tom, you should look at phi_convict_threshold and try to increase the value if you have too much chatter on your network. Also, rebuilding the entire node because of an OOM does not make sense; could you please post the C* version that you are using & the heap size you have configured? Thanks Ra

Re: Write performance with 1.2.12

2013-12-06 Thread srmore
On Fri, Dec 6, 2013 at 9:32 AM, Vicky Kak wrote: > Hard to say much without knowing about the cassandra configurations. > The cassandra configuration is -Xms8G -Xmx8G -Xmn800m -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=4 -XX:MaxTenuringThreshold=2 -X

Re: Write performance with 1.2.12

2013-12-06 Thread Jason Wee
Hi srmore, Perhaps use jconsole and connect to the JVM using JMX. Then, under the MBeans tab, start inspecting the GC metrics. /Jason On Fri, Dec 6, 2013 at 11:40 PM, srmore wrote: > > > > On Fri, Dec 6, 2013 at 9:32 AM, Vicky Kak wrote: > >> Hard to say much without knowing about the cassa

Re: Write performance with 1.2.12

2013-12-06 Thread Vicky Kak
You have passed the JVM configurations and not the cassandra configurations which is in cassandra.yaml. The spikes are not that significant in our case and we are running the cluster with 1.7 gb heap. Are these spikes causing any issue at your end? On Fri, Dec 6, 2013 at 9:10 PM, srmore wrote

Re: cassandra backup

2013-12-06 Thread Jonathan Haddad
I believe SSTables are written to a temporary file then moved. If I remember correctly, tools like tablesnap listen for the inotify event IN_MOVED_TO. This should handle the "try to back up sstable while in mid-write" issue. On Fri, Dec 6, 2013 at 5:39 AM, Michael Theroux wrote: > Hi Marcelo,

Re: Write performance with 1.2.12

2013-12-06 Thread srmore
On Fri, Dec 6, 2013 at 9:59 AM, Vicky Kak wrote: > You have passed the JVM configurations and not the cassandra > configurations which is in cassandra.yaml. > Apologies, was tuning JVM and that's what was in my mind. Here are the cassandra settings http://pastebin.com/uN42GgYT > The spikes ar

Re: Write performance with 1.2.12

2013-12-06 Thread Vicky Kak
Can you set the memtable_total_space_in_mb value? It is defaulting to 1/3 of the heap, which is 8/3 ~ 2.6 GB in capacity http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management The flushing of 2.6 GB to the disk might slow the performance if frequently called, may
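
As a rough check of the figure quoted above, the arithmetic can be sketched in Python. This assumes, as the message states, that the default is one third of the JVM heap, and takes the 8 GB heap from the -Xmx8G setting reported earlier in the thread. The function name is hypothetical and is not part of Cassandra.

```python
def default_memtable_space_mb(heap_mb: float) -> float:
    """Default memtable space in MB, assuming it is one third of the heap."""
    return heap_mb / 3


# 8 GB heap, taken from the -Xmx8G setting reported in the thread.
heap_mb = 8 * 1024
space_mb = default_memtable_space_mb(heap_mb)
print(f"{space_mb:.0f} MB (about {space_mb / 1024:.2f} GB)")
```

Setting memtable_total_space_in_mb explicitly in cassandra.yaml replaces this derived default.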

Re: Write performance with 1.2.12

2013-12-06 Thread srmore
Looks like I am spending some time in GC. java.lang:type=GarbageCollector,name=ConcurrentMarkSweep CollectionTime = 51707; CollectionCount = 103; java.lang:type=GarbageCollector,name=ParNew CollectionTime = 466835; CollectionCount = 21315; On Fri, Dec 6, 2013 at 9:58 AM, Jason Wee wrote:
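
To put the JMX figures above in perspective, here is a minimal sketch that turns each collector's totals into an average time per collection. It assumes CollectionTime is reported in milliseconds, which is how the JVM's GarbageCollectorMXBean defines it; the helper name is hypothetical.

```python
def avg_collection_ms(collection_time_ms: int, collection_count: int) -> float:
    """Average time per collection in ms, from the cumulative JMX counters."""
    return collection_time_ms / collection_count


# (CollectionTime in ms, CollectionCount) as quoted in the message above.
gc_stats = {
    "ConcurrentMarkSweep": (51707, 103),
    "ParNew": (466835, 21315),
}

for name, (time_ms, count) in gc_stats.items():
    avg = avg_collection_ms(time_ms, count)
    print(f"{name}: {time_ms / 1000:.1f} s total, {avg:.1f} ms per collection")
```

Dividing the combined total by the node's uptime would give the fraction of wall-clock time spent in GC, which is usually the more telling number.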

Re: Write performance with 1.2.12

2013-12-06 Thread Vicky Kak
How long has the server been up: hours, days, months? On Fri, Dec 6, 2013 at 10:41 PM, srmore wrote: > Looks like I am spending some time in GC. > > java.lang:type=GarbageCollector,name=ConcurrentMarkSweep > > CollectionTime = 51707; > CollectionCount = 103; > > java.lang:type=GarbageCo

Re: Write performance with 1.2.12

2013-12-06 Thread srmore
Not long: Uptime (seconds) : 6828 Token: 56713727820156410577229101238628035242 ID : c796609a-a050-48df-bf56-bb09091376d9 Gossip active: true Thrift active: true Native Transport active: false Load : 49.71 GB Generation No: 1386344053 Uptime (secon

AddContractPoint /VIP

2013-12-06 Thread chandra Varahala
Greetings, I have a 4 node cassandra cluster that will grow up to 10 nodes; we are using the CQL Java client to access the data. What is the good practice to put in the code as addContactPoint, i.e., how many servers? 1) I am also thinking to put this way here I am not sure this good or bad if i con

Re: cassandra performance problems

2013-12-06 Thread J. Ryan Earl
On Thu, Dec 5, 2013 at 6:33 AM, Alexander Shutyaev wrote: > We've plugged it into our production environment as a cache in front of > postgres. Everything worked fine, we even stressed it by explicitly > propagating about 30G (10G/node) data from postgres to cassandra. > If you just want a cachin

Re: Write performance with 1.2.12

2013-12-06 Thread srmore
Changed memtable_total_space_in_mb to 1024 still no luck. On Fri, Dec 6, 2013 at 11:05 AM, Vicky Kak wrote: > Can you set the memtable_total_space_in_mb value, it is defaulting to 1/3 > which is 8/3 ~ 2.6 gb in capacity > > http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-me

Re: cassandra backup

2013-12-06 Thread Robert Coli
On Fri, Dec 6, 2013 at 5:13 AM, Marcelo Elias Del Valle < marc...@s1mbi0se.com.br> wrote: > I am trying to create backups of my data on AWS. My goal is to store > the backups on S3 or glacier, as it's cheap to store this kind of data. So, > if I have a cluster with N nodes, I would like to cop

Re: help on backup muiltinode cluster

2013-12-06 Thread Robert Coli
On Fri, Dec 6, 2013 at 6:41 AM, Amalrik Maia wrote: > hey guys, I'm trying to take backups of a multi-node cassandra and save > them on S3. > My idea is simply doing ssh to each server and use nodetool to create the > snapshots then push then to S3. > https://github.com/synack/tablesnap So is th

Re: datastax community ami -- broken? --- datastax-agent conflicts with opscenter-agent

2013-12-06 Thread Joaquin Casares
Hey John, Thanks for letting us know. I'm also seeing that the motd gets stuck, but if I ctrl-c during the message and try a `nodetool status` there doesn't appear to be an issue. I'm currently investigating why it's getting stuck. Are you seeing something similar? What happens if you try to run

calculating sizes on disk

2013-12-06 Thread John Sanda
I am trying to do some disk capacity planning. I have been referring to the DataStax docs[1] and this older blog post[2]. I have a column family with the following, row key - 4 bytes column name - 8 bytes column value - 8 bytes max number of non-deleted columns per row - 20160 Is there an effective
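
One way to get a first-order estimate from the numbers above is sketched below. The overhead constants (15 bytes per column, 23 bytes per row) are assumptions taken from the commonly cited sizing formula in the DataStax 1.x documentation, not from this thread. The result ignores compression, indexes, bloom filters, replication, and compaction headroom, so treat it as a starting point only.

```python
# Assumed overheads (not from the thread); adjust for your Cassandra version.
COLUMN_OVERHEAD_BYTES = 15
ROW_OVERHEAD_BYTES = 23


def row_size_bytes(key_bytes: int, name_bytes: int, value_bytes: int,
                   columns_per_row: int) -> int:
    """Rough uncompressed size of one row under the assumed overheads."""
    column_size = name_bytes + value_bytes + COLUMN_OVERHEAD_BYTES
    return columns_per_row * column_size + key_bytes + ROW_OVERHEAD_BYTES


# Figures from the message above: 4-byte key, 8-byte name, 8-byte value,
# at most 20160 live columns per row.
size = row_size_bytes(key_bytes=4, name_bytes=8, value_bytes=8,
                      columns_per_row=20160)
print(f"{size} bytes per full row (~{size / 1024:.0f} KiB)")
```

Multiplying by the expected number of rows and the replication factor gives a cluster-wide figure to compare against measured on-disk sizes.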

Re: calculating sizes on disk

2013-12-06 Thread John Sanda
I should have also mentioned that I have tried using the calculations from the storage sizing post. My lack of success may be due to the post basing things off of Cassandra 0.8 as well as a lack of understanding in how to do some of the calculations. On Fri, Dec 6, 2013 at 3:08 PM, John Sanda wr

Re: vnodes on aws

2013-12-06 Thread Robert Coli
On Thu, Dec 5, 2013 at 6:58 PM, Andrey Ilinykh wrote: > > > > On Thu, Dec 5, 2013 at 3:31 PM, Jayadev Jayaraman wrote: > >> Availability zones are analogous to racks not data centres . EC2 regions >> are equivalent to data centres. >> > Yes, this is what I meant. I guess my question is - is possi

Re: calculating sizes on disk

2013-12-06 Thread Jacob Rhoden
Not sure what your end setup will be, but I would probably just spin up a cluster and fill it with typical data and measure the size on disk. __ Sent from iPhone > On 7 Dec 2013, at 6:08 am, John Sanda wrote: > > I am trying to do some disk capacity planning. I h

Re: calculating sizes on disk

2013-12-06 Thread John Sanda
I have done that, but it only gets me so far because the cluster and app that manages it is run by 3rd parties. Ideally, I would like to provide my end users with a formula or heuristic for establishing some sort of baselines that at least gives them a general idea for planning. Generating data as

Re: datastax community ami -- broken? --- datastax-agent conflicts with opscenter-agent

2013-12-06 Thread Joaquin Casares
Hello again John, The AMI has been patched and tested for both DSE and C* and works for the standard 3 node test. The new code has been pushed to the 2.4 branch so launching a new set of instances will give you an updated AMI. You should now have the newest version of OpsCenter installed, along w

Re: datastax community ami -- broken? --- datastax-agent conflicts with opscenter-agent

2013-12-06 Thread Franc Carter
Hi Joaquin, A quick word of praise - addressing the issue so quickly presents a really good view of Datastax cheers On Sat, Dec 7, 2013 at 8:14 AM, Joaquin Casares wrote: > Hello again John, > > The AMI has been patched and tested for both DSE and C* and works for the > standard 3 node test.