Cassandra Dead but pid file exists

2015-03-04 Thread Mohit Garg
I am a novice with Cassandra and tried installing cassandra-2.1.2 on CentOS 7.0. After completing the installation I ran the cqlsh command and created a few keyspaces and a column family, and at first glance everything seemed to work perfectly. Later, however, I noticed the following issues: 1> when I ex

Re: Issue restarting cassandra with a cluster running Cassandra 1.2.x and Cassandra 2.0.x

2015-03-04 Thread Fabrice Facorat
Upgrading a node from 1.2.13 to 2.0.10 works correctly, and we did run upgradesstables on the new 2.0.x node. The issue lies with the other nodes still running Cassandra 1.2.x, which fail to start if you simply restart the node. Here is the describecluster output during the upgrade procedure

Inconsistent count(*) and distinct results from Cassandra

2015-03-04 Thread Rumph, Frens Jan
Hi, Is it to be expected that select count(*) from ... and select distinct partition-key-columns from ... yield inconsistent results between executions even though the table at hand isn't written to? I have a table in a keyspace with replication_factor = 1 which is something like: CREATE TABL

Re: Cassandra Dead but pid file exists

2015-03-04 Thread Sibbald, Charles
Check your cassandra.yaml config file. It seems you have a misconfigured file path in there. From: Mohit Garg <gargmohit3...@gmail.com> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org> Date: Wednesday, 4 March 2015 09:14 To: "user@cass

Cassandra Stress Test Result Evaluation

2015-03-04 Thread Nisha Menon
I have been using the cassandra-stress tool to evaluate my cassandra cluster for quite some time now. My problem is that I am not able to comprehend the results generated for my specific use case. My schema looks something like this: CREATE TABLE Table_test( ID uuid, Time timestamp,

cassandra node jvm stall intermittently

2015-03-04 Thread Jason Wee
Hi, our Cassandra node uses Java 7 update 72. We ran jstat on one of the nodes and noticed some strange behaviour, as indicated by the output below. Any idea why eden space usage stays the same for a few seconds, for example at 100% and then at 18.02%? We suspect such "stalling" causes timeouts in our cluster
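
The behaviour described above (eden usage frozen at one value across several jstat samples) can be searched for mechanically. The following is a minimal sketch, assuming the output comes from `jstat -gcutil` on Java 7, whose header includes an `E` column for eden utilisation. The function name `find_eden_stalls`, the `min_repeats` threshold, and the sample rows are hypothetical and made up for illustration; they are not the poster's actual output.

```python
def find_eden_stalls(jstat_output, min_repeats=3):
    """Return (eden_value, run_length) for each run of identical eden readings.

    A run is reported only if the same eden percentage appears in at least
    `min_repeats` consecutive samples.
    """
    lines = [ln.split() for ln in jstat_output.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    eden_col = header.index("E")  # eden utilisation column in `jstat -gcutil`

    stalls = []
    run_value, run_length = None, 0
    for row in rows:
        value = float(row[eden_col])
        if value == run_value:
            run_length += 1
        else:
            # The previous run ended; keep it if it was long enough.
            if run_value is not None and run_length >= min_repeats:
                stalls.append((run_value, run_length))
            run_value, run_length = value, 1
    # Flush the final run.
    if run_value is not None and run_length >= min_repeats:
        stalls.append((run_value, run_length))
    return stalls


# Illustrative sample only (invented values, one sample per line).
SAMPLE = """
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
  0.00   5.12  62.40  71.03  59.91   1201   48.210    12    3.115   51.325
  0.00   5.12 100.00  71.03  59.91   1201   48.210    12    3.115   51.325
  0.00   5.12 100.00  71.03  59.91   1201   48.210    12    3.115   51.325
  0.00   5.12 100.00  71.03  59.91   1201   48.210    12    3.115   51.325
  4.87   0.00  18.02  71.10  59.91   1202   48.251    12    3.115   51.366
  4.87   0.00  18.02  71.10  59.91   1202   48.251    12    3.115   51.366
  4.87   0.00  18.02  71.10  59.91   1202   48.251    12    3.115   51.366
  4.87   0.00  40.55  71.10  59.91   1202   48.251    12    3.115   51.366
"""

if __name__ == "__main__":
    for value, length in find_eden_stalls(SAMPLE):
        print(f"eden stayed at {value}% for {length} consecutive samples")
```

A repeated reading is only a hint: an idle JVM that allocates nothing also shows a flat eden value, so any run found this way would still need to be correlated with the Cassandra and GC logs for the same timestamps.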

Streaming failures during bulkloading data using CqlBulkOutputFormat

2015-03-04 Thread Aby Kuruvilla
I am trying to use the CqlBulkOutputFormat in a Hadoop job to bulk load data into Cassandra. I was not able to find any documentation of this new output format, but from looking through the code, it uses CQLSSTableWriter to write SSTable files to disk, which are then streamed to Cassandra using S

OOM and high SSTables count

2015-03-04 Thread Roni Balthazar
Hi there, We are running a C* 2.1.3 cluster with 2 datacenters: DC1: 30 servers / DC2: 10 servers. DC1 servers have 32GB of RAM and 10GB of heap. DC2 machines have 16GB of RAM and 5GB of heap. DC1 nodes have about 1.4TB of data and DC2 nodes 2.3TB. DC2 is used only for backup purposes. There are no re

Re: OOM and high SSTables count

2015-03-04 Thread Jan
Hi Roni; You mentioned: DC1 servers have 32GB of RAM and 10GB of HEAP. DC2 machines have 16GB of RAM and 5GB HEAP. Best practice would be to: a) have a consistent type of node across both DCs (CPUs, memory, heap & disk); b) increase the heap on DC2 servers to 8GB for the C* heap. The leveled

Re: Inconsistent count(*) and distinct results from Cassandra

2015-03-04 Thread Mikhail Strebkov
We have observed the same issue in our production Cassandra cluster (5 nodes in one DC). We use Cassandra 2.1.3 (I joined the list too late to realize we shouldn’t use 2.1.x yet) on Amazon machines (created from the community AMI). In addition to count variations of 5 to 10% we observe variati

Re: Streaming failures during bulkloading data using CqlBulkOutputFormat

2015-03-04 Thread Yuki Morishita
Do you have a corresponding error on the other side of the stream (/192.168.56.11)? On Wed, Mar 4, 2015 at 9:11 AM, Aby Kuruvilla wrote: > I am trying to use the CqlBulkOutputFormat in a Hadoop job to bulk load data > into Cassandra. Was not able to find any documentation of this new output > for

Re: OOM and high SSTables count

2015-03-04 Thread Patrick McFadin
What kind of disks are you running here? Are you getting a lot of GC before the OOM? Patrick On Wed, Mar 4, 2015 at 9:26 AM, Jan wrote: > Hi Roni; > > You mentioned: > DC1 servers have 32GB of RAM and 10GB of HEAP. DC2 machines have 16GB of > RAM and 5GB HEAP. > > Best practice would be to:

Re: OOM and high SSTables count

2015-03-04 Thread daemeon reiydelle
Are you finding a correlation between the shards on the OOM DC1 nodes and the OOM DC2 nodes? Does your monitoring tool indicate that the DC1 nodes are using significantly more CPU (and memory) than the nodes that are NOT failing? I am leading you down the path to suspect that your sharding is givin

Re: Inconsistent count(*) and distinct results from Cassandra

2015-03-04 Thread Jens Rantil
Frens, What consistency level are you querying with? It could be that you are simply receiving results from different nodes each time. Jens – Sent from Mailbox On Wed, Mar 4, 2015 at 7:08 PM, Mikhail Strebkov wrote: > We have observed the same issue in our production Cassandra cluster (5 nodes > in o

Write timeout under load but Read is fine

2015-03-04 Thread Jaydeep Chovatia
Hi, When I increase the load in my test program, I keep getting a few "write timeout" errors from Cassandra, say every 10~15 mins. My read:write ratio is 50:50. My reads are fine; only writes time out. Here are my Cassandra details: Version: 2.0.11 Ring of 3 nodes with RF=3 Node configuration: 24 core +

Re: Input/Output Error

2015-03-04 Thread Jens Rantil
Hi, Check your Cassandra and kernel (if on Linux) log files for errors. Cheers, Jens – Sent from Mailbox On Wed, Mar 4, 2015 at 2:18 AM, 曹志富 wrote: > Sometimes compaction or streaming on my C* 2.1.3 cluster hits this error. Is > this because of a disk or filesystem problem? > Thanks all. >

Re: Write timeout under load but Read is fine

2015-03-04 Thread Jan
Hi Jaydeep; - look at the I/O on all three nodes - increase the write_request_timeout_in_ms: 1 - check the timeouts, if any, on the client inserting the writes - check the network for dropped/lost packets. Hope this helps, Jan/ On Wednesday, March 4, 2015

Re: cassandra node jvm stall intermittently

2015-03-04 Thread Jan
Hi Jason; What's in the log files at the moment jstat shows 100%? What is the activity on the cluster & the node at that specific point in time (reads/writes/joins etc.)? Jan/ On Wednesday, March 4, 2015 5:59 AM, Jason Wee wrote: Hi, our cassandra node using java 7 update 72 and we ra

Re: Inconsistent count(*) and distinct results from Cassandra

2015-03-04 Thread daemeon reiydelle
What is the replication? Could you be serving stale data from a node that was not properly replicated (hints timeout exceeded by a node being down?) On Wed, Mar 4, 2015 at 11:03 AM, Jens Rantil wrote: > Frens, > > What consistency are you querying with? Could be you are simply receiving > resu

Re: Inconsistent count(*) and distinct results from Cassandra

2015-03-04 Thread DuyHai Doan
"Is it to be expected that select count(*) from ... and select distinct partition-key-columns from ... to yield inconsistent results between executions even though the table at hand isn't written to?" Actually, depending on the definition of your primary key, select count(*) and select distinct pa

Re: OOM and high SSTables count

2015-03-04 Thread graham sanderson
We can confirm a problem on 2.1.3 (sadly our beta sstable state obviously did not match our production ones in some critical way) We have about 20k sstables on each of 6 nodes right now; actually a quick glance shows 15k of those are from OpsCenter, which may have something to do with beta/prod
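
To see which keyspaces account for an SSTable count like the one reported above, the data component files on disk can be tallied. This is a minimal sketch, assuming the usual layout of `<data_dir>/<keyspace>/<table>/` with one file ending in `-Data.db` per SSTable. The function name `count_sstables` is hypothetical, and the data directory passed to it is whatever `data_file_directories` points to on the node.

```python
import os
from collections import Counter


def count_sstables(data_dir):
    """Count SSTables per keyspace by tallying '*-Data.db' files.

    Snapshot and incremental-backup subdirectories are skipped so that
    hard-linked copies are not counted twice.
    """
    counts = Counter()
    for root, dirs, files in os.walk(data_dir):
        # Prune directories that only hold copies of live SSTables.
        dirs[:] = [d for d in dirs if d not in ("snapshots", "backups")]
        rel = os.path.relpath(root, data_dir)
        if rel == ".":
            continue  # files directly in data_dir belong to no keyspace
        keyspace = rel.split(os.sep)[0]
        counts[keyspace] += sum(1 for name in files if name.endswith("-Data.db"))
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_sstables.py <path from data_file_directories>
    for keyspace, total in count_sstables(sys.argv[1]).most_common():
        print(f"{keyspace}: {total}")
```

Sorting with `most_common()` puts the heaviest keyspace first, which is how a split such as "15k of 20k are from OpsCenter" would show up.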

java consuming lot of cpu with lots of futex calls

2015-03-04 Thread Steffen Winther
Hi, I am trying to make a test lab workable with Cassandra 1.2.15 nodes on CentOS 6.6, kernel 2.6.32-504.8.1.el6.x86_64, on top of KVM nodes. But I am finding Java performance very poor; it seems the JVM is doing a lot of futex sys calls which time out, thus spinning a lot of CPU cycles. Tried with both Oracle java 1

Howto remove currently assigned data directory from 2.0.12 nodes

2015-03-04 Thread Steffen Winther
Hi, I have a Cassandra 2.0.12 cluster with three nodes whose storage capacity I would like to reduce, as I would like to reuse some disks for a PoC Cassandra 1.2.15 cluster on the same nodes. How do I remove already assigned data file directories from running nodes? E.g. I have: data_file_directories :

Does it makes sense to split Gossip from Thrift network

2015-03-04 Thread Steffen Winther
Hi, I am wondering if it makes sense to split the network for client traffic vs Gossip/internode traffic (possibly with a larger MTU for storage traffic). So I tried this: - Gossip storage listener (port 700x) on one network - Thrift/CQL listeners (port 9160/9042) on another. Only I find it a bit confusing

Re: Does it makes sense to split Gossip from Thrift network

2015-03-04 Thread daemeon reiydelle
If your cluster is typical, your most critical resource is your network bandwidth. If this is the case, I would not do the split you are proposing. One issue with large MTUs is that they are often split at the switch fabric. Switches are not generally known for having processors that are idle, so

Re: Howto remove currently assigned data directory from 2.0.12 nodes

2015-03-04 Thread Robert Coli
On Wed, Mar 4, 2015 at 3:28 PM, Steffen Winther wrote: > Howto remove already assigned > data file directories from running nodes? > 1) stop node 2) move sstables from no-longer-data-directories into still-data-directories 3) modify conf file 4) start node I wonder how pending compactions are h
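
Step 2 of the procedure above (moving SSTables out of a directory that is being retired) can be sketched as follows. This is only an illustration under stated assumptions: the node is already stopped (step 1), both directories use the same `<keyspace>/<table>/` layout, and editing the conf file and starting the node (steps 3 and 4) are done by hand. The function name `move_sstables` is hypothetical.

```python
import os
import shutil


def move_sstables(old_dir, new_dir):
    """Move every file under old_dir to the same relative path under new_dir.

    Raises FileExistsError instead of overwriting, because a name clash
    would mean two different SSTable files share a file name.
    Returns the list of destination paths that were written.
    """
    moved = []
    for root, _dirs, files in os.walk(old_dir):
        rel = os.path.relpath(root, old_dir)
        target_root = os.path.normpath(os.path.join(new_dir, rel))
        os.makedirs(target_root, exist_ok=True)  # recreate keyspace/table dirs
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(target_root, name)
            if os.path.exists(dst):
                raise FileExistsError(f"refusing to overwrite {dst}")
            shutil.move(src, dst)
            moved.append(dst)
    return moved
```

The sketch stops at the first clash rather than renaming files, so a failed run leaves some files moved and some not; running it again after resolving the clash continues where it stopped. It moves everything into a single target directory and does not try to balance data across several remaining directories.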

Re: Does it makes sense to split Gossip from Thrift network

2015-03-04 Thread Steffen Winther
daemeon reiydelle writes: > >If your cluster is typical, your most critical resource is your >network bandwidth, >if this is the case, I would not do this split you are proposing. >One issue with large MTU's is that they are often split >at the switch fabric. Got control of my switches,

Re: Howto remove currently assigned data directory from 2.0.12 nodes

2015-03-04 Thread Steffen Winther
Robert Coli writes: > > 1) stop node > 2) move sstables from no-longer-data-directories into still-data-directories Okay, just into any other random data dir? A few files here and there to spread the amount of data between still-data-dirs? > 3) modify conf file > 4) start node > > I

Re: Input/Output Error

2015-03-04 Thread 曹志富
Thanks! -- Ranger Tsao 2015-03-05 3:40 GMT+08:00 Jens Rantil: > Hi, > > Check your Cassandra and kernel (if on Linux) log files for errors. > > Cheers, > Jens > > – > Sent from Mailbox > > > On Wed, Mar 4, 2015 at 2:18 AM,

Re: OOM and high SSTables count

2015-03-04 Thread J. Ryan Earl
We think it is this bug: https://issues.apache.org/jira/browse/CASSANDRA-8860 We're rolling a patch to beta before rolling it into production. On Wed, Mar 4, 2015 at 4:12 PM, graham sanderson wrote: > We can confirm a problem on 2.1.3 (sadly our beta sstable state obviously > did not match our