Re: Increasing size of "Batch of prepared statements"

2014-10-23 Thread shahab
Thanks, Jens, for the comments. Since I am trying the "cassandra stress tool", does that mean the tool is executing batches of "Insert" statements (probably hundreds or thousands) against Cassandra (for the sake of stressing Cassandra)? best, /Shahab On Wed, Oct 22, 2014 at 8:14 PM, Jens Rantil wrote:

Multi Datacenter / MultiRegion on AWS Best practice ?

2014-10-23 Thread Alain RODRIGUEZ
Hi, We are currently wondering about the best way to configure the network architecture for a multi-DC Cassandra cluster. Reading previous messages on this mailing list, I see 2 main ways to do this: 1 - Two private VPCs, joined by a VPN tunnel linking the 2 regions. C* using EC2Snitch (or PropertyFile

Operating on large cluster

2014-10-23 Thread Alain RODRIGUEZ
Hi, I was wondering how you guys handle a large cluster (50+ machines). I mean, sometimes you need to change the configuration (cassandra.yaml) or send a command to one, some or all nodes (cleanup, upgradesstables, setstreamthroughput or whatever). So far we have been using things like
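For the ad hoc case described above (sending one nodetool command to one, some or all nodes), here is a minimal sketch in Python. It assumes passwordless ssh to every node and `nodetool` on each node's PATH. The host names and the helper names `build_ssh_command` and `run_on_nodes` are hypothetical, and this is not a tool any poster in the thread reported using. It defaults to a dry run and to one node at a time, since commands such as cleanup are heavy.

```python
import shlex
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_ssh_command(host, nodetool_args):
    """Return the argv for running `nodetool <args>` on `host` over ssh."""
    remote = "nodetool " + " ".join(shlex.quote(a) for a in nodetool_args)
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    return ["ssh", "-o", "BatchMode=yes", host, remote]


def run_on_nodes(hosts, nodetool_args, max_parallel=1, dry_run=True):
    """Run one nodetool command on every host and return {host: result}.

    With dry_run=True nothing is executed; the planned command line is
    returned instead of an exit code.
    """
    def run(host):
        argv = build_ssh_command(host, nodetool_args)
        if dry_run:
            return " ".join(shlex.quote(a) for a in argv)
        return subprocess.run(argv).returncode

    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return dict(zip(hosts, pool.map(run, hosts)))


# Hypothetical host names; replace with the real inventory.
plan = run_on_nodes(["cass-01", "cass-02"], ["setstreamthroughput", "200"])
for host, command in plan.items():
    print(host, "->", command)
```

Passing `dry_run=False` executes the commands and returns each node's exit code. For anything beyond one-off commands, the configuration management tools mentioned in the replies are the more robust route.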

Re: Operating on large cluster

2014-10-23 Thread Jens Rantil
Hi, While I am nowhere close to 50+ machines, I've been using Saltstack for both configuration management and remote execution. It has worked great for me and supposedly scales to 1000+ machines. Cheers, Jens -- Sent from Mailbox On Thu, Oct 23, 2014 at 11:18 AM, Alain RODRIGUEZ wr

Re: Increasing size of "Batch of prepared statements"

2014-10-23 Thread Jens Rantil
Hi again Shabab, Yes, it seems that way. I have no experience with the “cassandra stress tool”, but wouldn’t be surprised if the batch size could be tweaked. Cheers, Jens ——— Jens Rantil Backend engineer Tink AB Email: jens.ran...@tink.se Phone: +46 708 84 18 32 Web: www.tink.se Faceb

Re: Increasing size of "Batch of prepared statements"

2014-10-23 Thread shahab
OK, Thanks again Jens. best, /Shahab On Thu, Oct 23, 2014 at 1:22 PM, Jens Rantil wrote: > Hi again Shabab, > > Yes, it seems that way. I have no experience with the “cassandra stress > tool”, but wouldn’t be surprised if the batch size could be tweaked. > > Cheers, > Jens > > ——— Jens Rantil B

Cassandra Node Commissioning

2014-10-23 Thread Aravindan T
Hi, We are facing several problems while commissioning new nodes into the existing cluster. The existing cluster (5 nodes) holds 13 TB of data, and 0.1 TB of data is loaded daily. Ten days ago, we started adding 5 nodes. In the middle of the commissioning process, the bootstrap proce

Re: Empty cqlsh cells vs. null

2014-10-23 Thread DuyHai Doan
Hello Jens What do you mean by "cqlsh explicitly writes 'null' in those cells"? Are you seeing the textual value "null" written in the cells? Null in CQL can have 2 meanings: 1. the column did not exist (or more precisely, has never been created) 2. the column did exist sometime in the past (h

Re: Empty cqlsh cells vs. null

2014-10-23 Thread Adam Holmberg
'null' is how cqlsh displays empty cells: https://github.com/apache/cassandra/blob/trunk/pylib/cqlshlib/formatting.py#L47-L58 On Thu, Oct 23, 2014 at 9:36 AM, DuyHai Doan wrote: > Hello Jens > > What do you mean by "cqlsh explicitly writes 'null' in those cells" ? > Are you seeing textual value
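To illustrate the display behaviour described above, here is a minimal sketch in Python of a formatter that renders a missing value as the text "null" while leaving an empty string empty. It is not the cqlshlib code linked above; the function name `format_cell`, its parameter and the sample row are hypothetical.

```python
def format_cell(value, null_placeholder="null"):
    """Render one cell for display.

    A missing value (None) is shown as the placeholder text; every other
    value, including the empty string, is shown as-is.
    """
    if value is None:
        return null_placeholder
    return str(value)


# Hypothetical row: "name" was never written, "note" holds an empty string.
row = {"id": 1, "name": None, "note": ""}
rendered = {column: format_cell(value) for column, value in row.items()}
print(rendered)
```

In this sketch the "null" text exists only in the rendered output; the stored value is still absent, which is the distinction the thread is about.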

Re: frequently update/read table and level compaction

2014-10-23 Thread DuyHai Doan
#1 There are 2 levels of JMX metrics: the one for each table and the one related to the StorageProxy. Depending on which "readcount" you're looking at, the meaning can differ; watch this video for more explanation: https://www.youtube.com/watch?v=w6aD4vAY_a8&index=14&list=PLqcm6qE9lgKJkxYZUOI

Re: Increasing size of "Batch of prepared statements"

2014-10-23 Thread Tyler Hobbs
CASSANDRA-8091 (Stress tool creates too large batches) is relevant: https://issues.apache.org/jira/browse/CASSANDRA-8091 On Thu, Oct 23, 2014 at 6:28 AM, shahab wrote: > OK, Thanks again Jens. > > best, > /Shahab > > On Thu, Oct 23, 2014 at 1:22 PM, Jens Rantil wrote: > >> Hi again Shabab, >> >

Re: Operating on large cluster

2014-10-23 Thread Ranjib Dey
We use chef for configuration management and blender for on demand jobs https://github.com/opscode/chef https://github.com/PagerDuty/blender On Oct 23, 2014 2:18 AM, "Alain RODRIGUEZ" wrote: > Hi, > > I was wondering about how do you guys handle a large cluster (50+ > machines). > > I mean ther

Re: Increasing size of "Batch of prepared statements"

2014-10-23 Thread shahab
Thanks Tyler for sharing this. It is exactly what I was looking for. best, /Shahab On Thu, Oct 23, 2014 at 5:37 PM, Tyler Hobbs wrote: > CASSANDRA-8091 (Stress tool creates too large batches) is relevant: > https://issues.apache.org/jira/browse/CASSANDRA-8091 > > On Thu, Oct 23, 2014 at

Re: are repairs in 2.0 more expensive than in 1.2

2014-10-23 Thread Sean Bridges
We switched to parallel repairs, and now our repairs in 2.0 are behaving like the repairs in 1.2. The change from parallel to sequential is very dramatic. For a small cluster with 3 nodes, using cassandra 2.0.10, a parallel repair takes 2 hours, and io throughput peaks at 6 mb/s. Sequential

Re: are repairs in 2.0 more expensive than in 1.2

2014-10-23 Thread Robert Coli
On Thu, Oct 23, 2014 at 9:33 AM, Sean Bridges wrote: > The change from parallel to sequential is very dramatic. For a small > cluster with 3 nodes, using cassandra 2.0.10, a parallel repair takes 2 > hours, and io throughput peaks at 6 mb/s. Sequential repair takes 40 > hours, with average io

Re: Operating on large cluster

2014-10-23 Thread Michael Shuler
On 10/23/2014 04:18 AM, Alain RODRIGUEZ wrote: I was wondering about how do you guys handle a large cluster (50+ machines). Configuration management tools are awesome, until they aren't. Having used or played with all the popular ones, and having been bitten by failures of those tools on larg

Re: Operating on large cluster

2014-10-23 Thread Eric Plowe
I am a big fan of perl-ssh-tools (https://github.com/tobert/perl-ssh-tools) to let me manage my nodes and SVN to store configs. ~Eric Plowe On Thu, Oct 23, 2014 at 3:07 PM, Michael Shuler wrote: > On 10/23/2014 04:18 AM, Alain RODRIGUEZ wrote: > >> I was wondering about how do you guys handle

Re: Operating on large cluster

2014-10-23 Thread Roni Balthazar
Hi, We use Puppet to manage our Cassandra configuration. (http://puppetlabs.com) You can use Cluster SSH to send commands to the server as well. Another good choice is Saltstack. Regards, Roni On Thu, Oct 23, 2014 at 5:18 AM, Alain RODRIGUEZ wrote: > Hi, > > I was wondering about how do you

Re: are repairs in 2.0 more expensive than in 1.2

2014-10-23 Thread Janne Jalkanen
On 23 Oct 2014, at 21:29 , Robert Coli wrote: > On Thu, Oct 23, 2014 at 9:33 AM, Sean Bridges wrote: > The change from parallel to sequential is very dramatic. For a small cluster > with 3 nodes, using cassandra 2.0.10, a parallel repair takes 2 hours, and > io throughput peaks at 6 mb/s.

Re: are repairs in 2.0 more expensive than in 1.2

2014-10-23 Thread Robert Coli
On Thu, Oct 23, 2014 at 2:04 PM, Janne Jalkanen wrote: > > If I had known that this had so far been a theoretical problem, I would’ve > spoken up earlier. Perhaps serial repair is not the best default. > Unfortunately you must not hang out in #cassandra on freenode, where I've been ranting^Wcomp

Re: Operating on large cluster

2014-10-23 Thread Otis Gospodnetic
Hi Alain, We use Puppet and are introducing Ansible at Sematext. Not for Cassandra, but for other similar tech. Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/ On Thu, Oct 23, 2014 at 5:18 AM, Alain RODRIGUEZ wrote