Re: Migrating from single node to cluster

2016-02-26 Thread Carlos Alonso
Hi Jason. Moving from one node to two follows exactly the same process as moving from 2 to 3 or any other +1 jump. Basically, you configure the node and have it join the ring, and Cassandra will take care of assigning data and load to that new node. Regarding how much data your new node should have or
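To illustrate what "joining the ring" implies for data placement, here is a toy consistent-hashing sketch. It is not Cassandra's implementation: MD5 from Python's standard library stands in for the real partitioner, and the `Ring` class, node names, key names, and the choice of 64 virtual nodes are all hypothetical.

```python
import bisect
import hashlib

def token(value: str) -> int:
    """Map a string to a 64-bit token (MD5 is a stand-in for the real partitioner)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")

class Ring:
    """Toy token ring: each node owns several virtual-node tokens."""

    def __init__(self, nodes, vnodes=64):
        self.vnodes = vnodes
        self.tokens = []  # sorted list of (token, node) pairs
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        """Join a node by inserting its virtual-node tokens into the ring."""
        for i in range(self.vnodes):
            bisect.insort(self.tokens, (token(f"{node}#{i}"), node))

    def owner(self, key: str) -> str:
        """Return the node owning the first token at or after the key's token."""
        i = bisect.bisect_left(self.tokens, (token(key), ""))
        return self.tokens[i % len(self.tokens)][1]  # wrap around the ring

keys = [f"key{i}" for i in range(1000)]
ring = Ring(["node1"])
before = {k: ring.owner(k) for k in keys}

ring.add("node2")  # the new node joins; nothing else is reconfigured
after = {k: ring.owner(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys now belong to node2")
```

In this model, the only keys that change owner are the ones the new node takes over, which is the sense in which the existing node needs no manual rebalancing.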

Cassandra Multi DC (Active-Active) Setup - Measuring latency & throughput performance

2016-02-26 Thread chandrasekar.krc
Hi, Are there any links/resources which describe performance measurement (latency & throughput) for a Cassandra Multi DC Active-Active setup across a WAN network (20Gbps bandwidth) with 5 nodes in each DC. Basically, I would like to know how to measure latency of writes when data is replicat

Re: Unexpected high internode network activity

2016-02-26 Thread Gianluca Borello
I understand your point about the billing, but billing here was merely the triggering factor that had me start analyzing the traffic in the first place. At the moment, I'm not considering the numbers on my bill anymore but simply the numbers that I am measuring with iftop on each node of the clust

Low compactionthroughput blocks reads?

2016-02-26 Thread horschi
Hi, I just saw some weird behaviour on one of our Cassandra nodes, which I would like to share:

Short version: My pending reads went up from ~0 to the hundreds when I reduced the compactionthroughput from 16 to 2.

Long version: One of our more powerful nodes had a few pending reads, while the oth

Re: Unexpected high internode network activity

2016-02-26 Thread Nate McCall
> Unfortunately, these numbers still don't match at all.
>
> And yes, the cluster is in a single DC and since I am using the EC2
> snitch, replicas are AZ aware.

Are repairs running on the cluster?

Other thoughts:
- is internode_compression set to 'all' in cassandra.yaml (should be 'all' b

Re: Unexpected high internode network activity

2016-02-26 Thread Gianluca Borello
Thank you for your reply. - Repairs are not running on the cluster, in fact we've been "slacking" when it comes to repair, mainly because we never manually delete our data as it's always TTLed and we haven't had major failures or outages that required repairing data (I know that's not a good reaso

[ANNOUNCE] YCSB 0.7.0 Release

2016-02-26 Thread Kevin Risden
On behalf of the development community, I am pleased to announce the release of YCSB 0.7.0.

Highlights:
* GemFire binding replaced with Apache Geode (incubating) binding
* Apache Solr binding was added
* OrientDB binding improvements
* HBase Kerberos support and use single connection
* Accumulo i

Nodetool Rebuild sending few big packets of data. Is it normal?

2016-02-26 Thread Felipe Esteves
Hi, I'm running a nodetool rebuild to include a new DC in my cluster. My config is:

DC1, 2 nodes per rack (2 racks), 70 GB each node
DC2, 2 nodes per rack (1 rack), 90 GB each node
DC3, 2 nodes per rack (1 rack) (*THIS IS THE NEW DC*)

What I did was get the 2 nodes in DC3 up and running with bootst

Re: Nodetool Rebuild sending few big packets of data. Is it normal?

2016-02-26 Thread Jeff Jirsa
Cassandra is streaming it at a near constant rate (if you had metrics for the network interface, you’d probably see that), but it doesn’t register in nodetool status until it completes all of the sstables for a column family. At that point, the -tmp-Data.db files get renamed to drop the -tmp, and th

Re: Nodetool Rebuild sending few big packets of data. Is it normal?

2016-02-26 Thread Felipe Esteves
Hi Jeff,

Thanks for the info, you're right!

Felipe Esteves
Tecnologia
felipe.este...@b2wdigital.com
Tel.: (21) 3504-7162 ramal 57162

2016-02-26 17:38 GMT-03:00 Jeff Jirsa :
> Cassandra is streaming it at a near constant rate (if you had metrics for
> network interface, you’d probably see th

Re: Cassandra Multi DC (Active-Active) Setup - Measuring latency & throughput performance

2016-02-26 Thread Bryan Cheng
Hi Chandra, For write latency, etc., the tools are still largely the same ones you'd use for single-DC: tracing, cfhistograms, and cassandra-stress come to mind. The exact results are going to differ based on your consistency tuning (can you get away with LOCAL_QUORUM vs QUORUM?) and
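The LOCAL_QUORUM vs QUORUM question can be made concrete with a little arithmetic. This is a minimal sketch, assuming two data centers with a replication factor of 3 in each; the thread does not state the replication factor, so those numbers, the DC names, and the function names are illustrative only. A quorum is a majority of the replicas considered: floor(n / 2) + 1.

```python
def quorum(replicas: int) -> int:
    """Majority of a replica set: floor(n / 2) + 1."""
    return replicas // 2 + 1

# Hypothetical replication factor per data center (not taken from the thread).
rf_per_dc = {"DC1": 3, "DC2": 3}

def acks_needed(consistency: str, local_dc: str) -> int:
    """Replica acknowledgements a write waits for at the given consistency level."""
    if consistency == "QUORUM":
        return quorum(sum(rf_per_dc.values()))  # majority of all replicas
    if consistency == "LOCAL_QUORUM":
        return quorum(rf_per_dc[local_dc])      # majority in the coordinator's DC
    raise ValueError(f"unsupported consistency level: {consistency}")

def must_cross_wan(consistency: str, local_dc: str) -> bool:
    """True if local replicas alone can never satisfy the consistency level."""
    return acks_needed(consistency, local_dc) > rf_per_dc[local_dc]

for level in ("QUORUM", "LOCAL_QUORUM"):
    print(level, acks_needed(level, "DC1"), must_cross_wan(level, "DC1"))
```

Under these assumptions, QUORUM needs 4 acknowledgements out of 6 replicas, so every write waits on at least one replica in the remote DC, while LOCAL_QUORUM needs 2 of the 3 local replicas and replication to the other DC happens off the write's critical path. Measured write latency will differ accordingly.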

Re: Checking replication status

2016-02-26 Thread Bryan Cheng
Hi Jimmy, If you sustain a long downtime, repair is almost always the way to go. It seems like you're asking to what extent a cluster is able to recover/resync a downed peer. A peer will not attempt to reacquire all the data it has missed while being down. Recovery happens in a few ways: 1) Hin

Problem running select with partial partition keys in version 3.3

2016-02-26 Thread Saurabh Sethi
We just upgraded our cluster to version 3.3 and I am not able to run select or delete statements using a partial partition key in the WHERE clause. It asks me to provide all the partition key columns. But I have only partial partition key info (1 out of 2 columns) when running a select. Is there any wor

Re: Problem running select with partial partition keys in version 3.3

2016-02-26 Thread Jonathan Haddad
You wouldn't be able to do that query with that schema in any version of Cassandra. Here's the output from 2.1:

cqlsh> create KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
cqlsh> use test;
cqlsh:test> create table if not exists persistent_map ( ...
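The reason a partial partition key is not enough on its own: Cassandra derives a row's token, and therefore the replicas that hold it, by hashing all partition key components together. The following is a toy sketch of that idea, not Cassandra's implementation. It uses MD5 from Python's standard library instead of the real partitioner, places tokens with a simple modulo rather than a token ring, and the node names and key values are made up (the schema in the thread is truncated, so none of these names come from it).

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical cluster

def toy_token(*partition_key_parts: str) -> int:
    """Hash all partition key components together into one 64-bit token."""
    joined = "\x00".join(partition_key_parts).encode("utf-8")
    return int.from_bytes(hashlib.md5(joined).digest()[:8], "big")

def owner(*partition_key_parts: str) -> str:
    """Pick the node for a full partition key (modulo placement is a simplification)."""
    return NODES[toy_token(*partition_key_parts) % len(NODES)]

# Two rows that share the first key component but differ in the second
# hash to unrelated tokens, so knowing only "user42" does not identify
# which node to ask.
t1 = toy_token("user42", "settings")
t2 = toy_token("user42", "history")
print(t1 != t2, owner("user42", "settings"), owner("user42", "history"))
```

When only one of the two columns is known at query time, the usual approach is to model for it: make that column the whole partition key, or keep a second table keyed by it.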