Re: Cassandra process exiting mysteriously

2014-08-05 Thread Clint Kelly
Hi Kevin, Thanks for your reply. That is what I assumed, but some of the posts I read on Stack Overflow (e.g., the one that I referenced in my mail) suggested otherwise. I was just curious if others had experienced OOM problems that weren't logged or if there were other common culprits. Best re

Re: Cassandra process exiting mysteriously

2014-08-05 Thread Kevin Burton
If there is an oom it will be in the logs. On Aug 5, 2014 8:17 PM, "Clint Kelly" wrote: > Hi everyone, > > For some integration tests, we start up a CassandraDaemon in a > separate process (using the Java 7 ProcessBuilder API). All of my > integration tests run beautifully on my laptop, but one
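A minimal sketch of checking that claim after a failed run: scan the captured Cassandra log text for the JVM's OutOfMemoryError marker. The function name is hypothetical, and where the log text comes from (for example, the daemon's system.log or the output captured from the child process) is an assumption that depends on the test setup.

```python
from typing import List


def find_oom_lines(log_text: str) -> List[str]:
    """Return every log line that mentions a JVM OutOfMemoryError."""
    return [line for line in log_text.splitlines() if "OutOfMemoryError" in line]
```

An empty result only shows that the JVM did not log an OOM; it does not rule out the process being killed from outside the JVM.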

Cassandra process exiting mysteriously

2014-08-05 Thread Clint Kelly
Hi everyone, For some integration tests, we start up a CassandraDaemon in a separate process (using the Java 7 ProcessBuilder API). All of my integration tests run beautifully on my laptop, but one of them fails on our Jenkins cluster. The failing integration test does around 10k writes to diffe

Re: Make an existing cluster multi data-center compatible.

2014-08-05 Thread Rameez Thonnakkal
I think the rack placement of these 12 nodes will become important. As the 12 nodes are placed in SimpleSnitch, which is not rack aware, it would be good to initially retain them in a single rack in the property file snitch as well. Node repair is a safe option. If you need to change the rack placement, my

NPE in UUIDGen.decompose - Cassandra 1.2.9

2014-08-05 Thread Elias Ross
Here's an error I've seen when adding while decommissioning a different node. INFO [GossipStage:1] 2014-08-06 00:37:43,615 StorageService.java (line 1614) Removing tokens [-1092160356339948581, -1135452450397885068, ... ERROR [GossipStage:1] 2014-08-06 00:37:43,618 CassandraDaemon.java (line 194)

Re: Make an existing cluster multi data-center compatible.

2014-08-05 Thread Robert Coli
On Tue, Aug 5, 2014 at 2:27 PM, Rene Kochen wrote: > "As long as you correctly configure the new snitch so that the replica > sets do not change, no, you do not need to repair." > > Is the following correct: > > The replica sets do not change if you modify the snitch from SimpleSnitch > to Networ

Re: Make an existing cluster multi data-center compatible.

2014-08-05 Thread Rene Kochen
"As long as you correctly configure the new snitch so that the replica sets do not change, no, you do not need to repair." Is the following correct: The replica sets do not change if you modify the snitch from SimpleSnitch to NetworkTopologyStrategy and the topology file puts all nodes in the sam

Re: Read timeouts with ALLOW FILTERING turned on

2014-08-05 Thread Robert Coli
On Tue, Aug 5, 2014 at 11:53 AM, Clint Kelly wrote: > Ah FWIW I was able to reproduce the problem by reducing > "range_request_timeout_in_ms." This is great since I want to increase > the timeout for batch jobs where we scan a large set of rows, but > leave the timeout for single-row queries alo

Re: Fail to reconnect to other nodes after intermittent network failure

2014-08-05 Thread Jiri Horky
OK, ticket 7696 [1] created. Jiri Horky [1] https://issues.apache.org/jira/browse/CASSANDRA-7696 On 08/05/2014 07:57 PM, Robert Coli wrote: > > On Tue, Aug 5, 2014 at 5:48 AM, Jiri Horky > wrote: > > What puzzles me is the fact that the authentication apparently started

Re: Node bootstrap

2014-08-05 Thread Ruchir Jha
Also, right now the "top" command shows that we are at 500-700% CPU, and we have 23 total processors, which means we have a lot of idle CPU left over, so throwing more threads at compaction and flush should alleviate the problem? On Tue, Aug 5, 2014 at 2:57 PM, Ruchir Jha wrote: > > Right now,
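The headroom estimate above can be worked out explicitly. This is a small sketch, assuming top is reporting per-core percentages (100% equals one fully busy processor); the function name is made up for illustration.

```python
def cpu_utilization(top_percent: float, processors: int) -> float:
    """Fraction of total CPU capacity in use, given top's per-core %CPU figure."""
    return top_percent / (processors * 100.0)


# Figures quoted in the message: 500-700% CPU on 23 processors.
low = cpu_utilization(500, 23)
high = cpu_utilization(700, 23)
print(f"{low:.0%} to {high:.0%} of total CPU capacity in use")
```

Spare CPU by itself does not show that more flush or compaction threads will help; disk IO can still be the limiting factor.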

Re: Node bootstrap

2014-08-05 Thread Ruchir Jha
Right now, we have 6 flush writers and compaction_throughput_mb_per_sec is set to 0, which I believe disables throttling. Also, Here is the iostat -x 5 5 output: Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util sda 10.00 1450

Re: Read timeouts with ALLOW FILTERING turned on

2014-08-05 Thread Clint Kelly
Ah FWIW I was able to reproduce the problem by reducing "range_request_timeout_in_ms." This is great since I want to increase the timeout for batch jobs where we scan a large set of rows, but leave the timeout for single-row queries alone. Best regards, Clint On Tue, Aug 5, 2014 at 11:42 AM, Cl

Re: Node stuck during nodetool rebuild

2014-08-05 Thread Mark Reddy
Hi Vasilis, To further on what Rob said I believe you might be able to tune the phi detector threshold to help this > operation complete, hopefully someone with direct experience of same will > chime in. I have been through this operation where streams break due to a node falsely being marked d

Re: Read timeouts with ALLOW FILTERING turned on

2014-08-05 Thread Clint Kelly
Hi Rob, Thanks for your feedback. I understand that use of ALLOW FILTERING is not a best practice. In this case, however, I am building a tool on top of Cassandra that allows users to sometimes do things that are less than optimal. When they try to do expensive queries like this, I'd rather pro

Re: moving older tables from SSD to HDD?

2014-08-05 Thread Benedict Elliott Smith
Hi Kevin, This is something we do plan to support, but don't right now. You can see the discussion around this and related issues here (although it may seem unrelated at first glance). On Mon, Aug 4, 2014 at 8:43 PM, Kevin Burton wrote:

Re: Node stuck during nodetool rebuild

2014-08-05 Thread Robert Coli
On Tue, Aug 5, 2014 at 1:28 AM, Vasileios Vlachos < vasileiosvlac...@gmail.com> wrote: > The problem is that the nodetool seems to be stuck, and nodetool netstats > on node1 of DC2 appears to be stuck at 10% streaming a 5G file from node2 > at DC1. This doesn't tally with nodetool netstats when ru

Re: moving older tables from SSD to HDD?

2014-08-05 Thread Sávio S . Teles de Oliveira
Have you looked at nodetool? http://www.datastax.com/documentation/cassandra/2.0/cassandra/tools/toolsNodetool_r.html 2014-08-04 16:43 GMT-03:00 Kevin Burton : > Is it possible to take older tables, which are immutable, and move them > from SSD to HDD? > > We lower the SLA on older data so keeping

Re: Issue with ALLOW FILTERING

2014-08-05 Thread Sávio S . Teles de Oliveira
You need to create an index on attribute *c.* 2014-08-05 9:24 GMT-03:00 Jens Rantil : > Hi, > > I'm having an issue with ALLOW FILTERING with Cassandra 2.0.8. See a > minimal example here: > https://gist.github.com/JensRantil/ec43622c26acb56e5bc9 > > I expect the second last to fail, but the las

RE: vnode and NetworkTopologyStrategy: not playing well together ?

2014-08-05 Thread DuyHai Doan
The discussion about racks & NTS is also mentioned in this recent article: planetcassandra.org/multi-data-center-replication-in-nosql-databases/ The last section may be of interest to you. On 5 Aug 2014 18:14, "DE VITO Dominique" wrote: > > Jonathan wrote: > > > > Yes, if you have only 1 ma

Re: Make an existing cluster multi data-center compatible.

2014-08-05 Thread Robert Coli
On Tue, Aug 5, 2014 at 3:52 AM, Rene Kochen wrote: > Do I have to run full repairs after this change? Because the yaml file > states: IF YOU CHANGE THE SNITCH AFTER DATA IS INSERTED INTO THE CLUSTER, > YOU MUST RUN A FULL REPAIR, SINCE THE SNITCH AFFECTS WHERE REPLICAS ARE > PLACED. > As long as

Re: Read timeouts with ALLOW FILTERING turned on

2014-08-05 Thread Sávio S . Teles de Oliveira
How much did you reduce *read_request_timeout_in_ms* on your local machine? The Cassandra timeout during a read query is higher than on one machine because the Cassandra server must run the read operation on more servers (so you have network traffic). 2014-08-05 14:54 GMT-03:00 Robert Coli : > On Tue, Aug 5,

Re: Fail to reconnect to other nodes after intermittent network failure

2014-08-05 Thread Robert Coli
On Tue, Aug 5, 2014 at 5:48 AM, Jiri Horky wrote: > What puzzles me is the fact that the authentication apparently started > to work after the network recovered but the exchange of data did not. > > I would like to understand what could have caused the problems and how to > avoid them in the future.

Re: Node bootstrap

2014-08-05 Thread Mark Reddy
Hi Ruchir, The large number of blocked flushes and the number of pending compactions would still indicate IO contention. Can you post the output of 'iostat -x 5 5'? If you do in fact have spare IO, there are several configuration options you can tune such as increasing the number of flush wri

Re: Read timeouts with ALLOW FILTERING turned on

2014-08-05 Thread Robert Coli
On Tue, Aug 5, 2014 at 10:01 AM, Clint Kelly wrote: > Allow me to rephrase a question I asked last week. I am performing some > queries with ALLOW FILTERING and getting consistent read timeouts like the > following: > ALLOW FILTERING should be renamed PROBABLY TIMEOUT in order to properly descr

Read timeouts with ALLOW FILTERING turned on

2014-08-05 Thread Clint Kelly
Hi all, Allow me to rephrase a question I asked last week. I am performing some queries with ALLOW FILTERING and getting consistent read timeouts like the following: com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout during read query at consistency ONE (1 responses we

Re: Node bootstrap

2014-08-05 Thread Ruchir Jha
Also Mark to your comment on my tpstats output, below is my iostat output, and the iowait is at 4.59%, which means no IO pressure, but we are still seeing the bad flush performance. Should we try increasing the flush writers? Linux 2.6.32-358.el6.x86_64 (ny4lpcas13.fusionts.corp) 08/05/2014 _x8

RE: vnode and NetworkTopologyStrategy: not playing well together ?

2014-08-05 Thread DE VITO Dominique
> Jonathan wrote: > > Yes, if you have only 1 machine in a rack then your cluster will be > imbalanced. You're going to be able to dream up all sorts of weird failure > cases when you choose a scenario like RF=2 & totally imbalanced network arch. > > Vnodes attempt to solve the problem of imbal

Re: Node bootstrap

2014-08-05 Thread Ruchir Jha
nodetool status: Datacenter: datacenter1 === Status=Up/Down |/ State=Normal/Leaving/Joining/Moving -- Address Load Tokens Owns (effective) Host ID Rack UN 10.10.20.27 1.89 TB 256 25.4% 76023cdd-c42d-4068-8b53-ae94584b8b04 rack1 UN 10.10.

Re: A question about using 'update keyspace with strategyoptions' command

2014-08-05 Thread Mark Reddy
> > So the ‘strategy’ change may not be seen by all nodes when the ‘update > keyspace …’ command returns and I can use ‘describe cluster’ to check if > the change has taken effect on all nodes, right? Correct, the change may take time to propagate to all nodes. As Rahul said you can check describ

Re: vnode and NetworkTopologyStrategy: not playing well together ?

2014-08-05 Thread Jonathan Haddad
Yes, if you have only 1 machine in a rack then your cluster will be imbalanced. You're going to be able to dream up all sorts of weird failure cases when you choose a scenario like RF=2 & totally imbalanced network arch. Vnodes attempt to solve the problem of imbalanced rings by choosing so many

Re: vnode and NetworkTopologyStrategy: not playing well together ?

2014-08-05 Thread Jeremy Jongsma
If your nodes are not actually evenly distributed across physical racks for redundancy, don't use multiple racks. On Tue, Aug 5, 2014 at 10:57 AM, DE VITO Dominique < dominique.dev...@thalesgroup.com> wrote: > First, thanks for your answer. > > > This is incorrect. Network Topology w/ Vnodes wi

RE: vnode and NetworkTopologyStrategy: not playing well together ?

2014-08-05 Thread DE VITO Dominique
First, thanks for your answer. > This is incorrect. Network Topology w/ Vnodes will be fine, assuming you've > got RF= # of racks. IMHO, it's not a good enough condition. Let's use an example with RF=2 N1/rack_1 N2/rack_1 N3/rack_1 N4/rack_2 Here, you have RF= # of racks And due to Ne
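The four-node example can be checked with a small simulation. This is a simplified sketch of rack-aware placement (walk the ring clockwise and prefer nodes in racks not yet used), not Cassandra's actual NetworkTopologyStrategy code; the vnode count of 256, the random seed, and all function names are assumptions made for illustration.

```python
import random

RING = 2**64
RACKS = {"N1": "rack_1", "N2": "rack_1", "N3": "rack_1", "N4": "rack_2"}
VNODES = 256  # assumed num_tokens per node
RF = 2


def build_ring(seed=42):
    """Assign VNODES random tokens to each node; return sorted (token, node) pairs."""
    rng = random.Random(seed)
    ring = {}
    for node in RACKS:
        for _ in range(VNODES):
            ring[rng.randrange(RING)] = node
    return sorted(ring.items())


def replicas_for(ring, index):
    """Pick RF replicas clockwise from ring[index], one per rack while racks remain."""
    chosen, used_racks, skipped = [], set(), []
    for step in range(len(ring)):
        node = ring[(index + step) % len(ring)][1]
        if node in chosen or node in skipped:
            continue
        if RACKS[node] in used_racks:
            skipped.append(node)  # same rack as an existing replica: defer
            continue
        chosen.append(node)
        used_racks.add(RACKS[node])
        if len(chosen) == RF:
            return chosen
    return (chosen + skipped)[:RF]  # fewer racks than RF: fall back to deferred nodes


def ownership(ring):
    """Effective ownership per node as a fraction of the whole ring."""
    owned = {node: 0 for node in RACKS}
    for i, (token, _) in enumerate(ring):
        size = (token - ring[i - 1][0]) % RING  # range (previous token, token]
        for node in replicas_for(ring, i):
            owned[node] += size
    return {node: total / RING for node, total in owned.items()}


shares = ownership(build_ring())
for node, share in sorted(shares.items()):
    print(f"{node}: {share:.1%}")
```

Under these assumptions N4, as the only node in rack_2, ends up holding a replica of every range, while N1 to N3 split the other replica between them. That is the imbalance being described, and it comes from the rack layout rather than from how the vnode tokens were chosen.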

Re: vnode and NetworkTopologyStrategy: not playing well together ?

2014-08-05 Thread Jonathan Haddad
* When I say wild imbalance, I do not mean all tokens on 1 node in the cluster, I really should have said slightly imbalanced On Tue, Aug 5, 2014 at 8:43 AM, Jonathan Haddad wrote: > This is incorrect. Network Topology w/ Vnodes will be fine, assuming > you've got RF= # of racks. For each token

Re: vnode and NetworkTopologyStrategy: not playing well together ?

2014-08-05 Thread Jonathan Haddad
This is incorrect. Network Topology w/ Vnodes will be fine, assuming you've got RF= # of racks. For each token, replicas are chosen based on the strategy. Essentially, you could have a wild imbalance in token ownership, but it wouldn't matter because the replicas would be distributed across the

Re: Node bootstrap

2014-08-05 Thread Mark Reddy
> > Yes num_tokens is set to 256. initial_token is blank on all nodes > including the new one. Ok so you have num_tokens set to 256 for all nodes with initial_token commented out, this means you are using vnodes and the new node will automatically grab a list of tokens to take over responsibility

vnode and NetworkTopologyStrategy: not playing well together ?

2014-08-05 Thread DE VITO Dominique
Hi, My understanding is that NetworkTopologyStrategy does NOT play well with vnodes, due to: * Vnode => tokens are (usually) randomly generated (AFAIK) * NetworkTopologyStrategy => requires carefully chosen tokens for all nodes in order not to get a VERY unbalanced ring lik

Re: Node bootstrap

2014-08-05 Thread Ruchir Jha
Sorry for the multiple updates, but another thing I found was all the other existing nodes have themselves in the seeds list, but the new node does not have itself in the seeds list. Can that cause this issue? On Tue, Aug 5, 2014 at 10:30 AM, Ruchir Jha wrote: > Just ran this on the new node: >

Re: Node bootstrap

2014-08-05 Thread Ruchir Jha
Just ran this on the new node: nodetool netstats | grep "Streaming from" | wc -l 10 Seems like the new node is receiving data from 10 other nodes. Is that expected in a vnodes enabled environment? Ruchir. On Tue, Aug 5, 2014 at 10:21 AM, Ruchir Jha wrote: > Also not sure if this is relevant
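The same count can be taken in a script once the netstats output has been captured as text. This is a sketch equivalent to the grep and wc pipeline above; the function name is hypothetical and the only thing it assumes about the output format is the "Streaming from" marker used in the command.

```python
def count_streaming_sources(netstats_output: str) -> int:
    """Count lines containing "Streaming from", like `grep "Streaming from" | wc -l`."""
    return sum(1 for line in netstats_output.splitlines() if "Streaming from" in line)
```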

Re: Node bootstrap

2014-08-05 Thread Ruchir Jha
Also not sure if this is relevant but just noticed the nodetool tpstats output: Pool Name Active Pending Completed Blocked All time blocked FlushWriter 0 0 1136 0 512 Looks like about 50% of flushes are blocked
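The "about 50%" figure can be checked from the two FlushWriter counters quoted above. This is a small sketch; treating "All time blocked" divided by "Completed" as the blocked ratio follows the reading in the message, and the function name is made up for illustration.

```python
def blocked_ratio(completed: int, all_time_blocked: int) -> float:
    """Ratio of all-time blocked tasks to completed tasks for one thread pool."""
    if completed <= 0:
        raise ValueError("completed must be positive")
    return all_time_blocked / completed


# FlushWriter counters from the tpstats output: Completed=1136, All time blocked=512.
ratio = blocked_ratio(completed=1136, all_time_blocked=512)
print(f"{ratio:.1%}")
```

With these numbers the ratio comes out a little under one half, so "about 50%" is a fair rounding.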

Re: Node bootstrap

2014-08-05 Thread Ruchir Jha
Yes num_tokens is set to 256. initial_token is blank on all nodes including the new one. On Tue, Aug 5, 2014 at 10:03 AM, Mark Reddy wrote: > My understanding was that if initial_token is left empty on the new node, >> it just contacts the heaviest node and bisects its token range. > > > If you

RE: A question about using 'update keyspace with strategyoptions' command

2014-08-05 Thread Lu, Boying
Thanks a lot. So the ‘strategy’ change may not be seen by all nodes when the ‘update keyspace …’ command returns and I can use ‘describe cluster’ to check if the change has taken effect on all nodes, right? From: Rahul Neelakantan [mailto:ra...@rahul.be] Sent: August 5, 2014 18:46 To: user@cassandra.

Re: Node bootstrap

2014-08-05 Thread Ruchir Jha
Thanks Patricia for your response! On the new node, I just see a lot of the following: INFO [FlushWriter:75] 2014-08-05 09:53:04,394 Memtable.java (line 400) Writing Memtable INFO [CompactionExecutor:3] 2014-08-05 09:53:11,132 CompactionTask.java (line 262) Compacted 12 sstables to so basically

Re: Node bootstrap

2014-08-05 Thread Mark Reddy
> > My understanding was that if initial_token is left empty on the new node, > it just contacts the heaviest node and bisects its token range. If you are using vnodes and you have num_tokens set to 256 the new node will take token ranges dynamically. What is the configuration of your other nodes

Re: data type is object when metric instrument using Gauge?

2014-08-05 Thread Ken Hancock
If you look at VisualVM metadata, it'll show that what's returned is java.lang.Object, which is different from Meters or Counters. Looking at the source for metrics-core, it seems that this is a "feature" of Gauges because unlike Meters or Counters, Gauges can be of various types -- long, double, etc

Fail to reconnect to other nodes after intermittent network failure

2014-08-05 Thread Jiri Horky
Hi, we experienced a strange problem after an intermittent network failure: the affected node did not reconnect to the rest of the cluster but did allow users to authenticate (which was not possible during the actual network outage, see below). The cluster consists of 1 node in each of 3 datacente

Re: Reasonable range for the max number of tables?

2014-08-05 Thread Michal Michalski
>> - Use a keyspace per customer > These effectively amount to the same thing and they both fall foul to the > limit in the number of column families so do not scale. But then you can scale by moving some of the customers to a new cluster easily. If you keep everything in a single keyspace or - wo

Re: Reasonable range for the max number of tables?

2014-08-05 Thread Jack Krupansky
Multi-tenancy remains a "challenge" - for most technologies. Yes, you can do what you suggest, but... you need to exercise great care and test and provision your cluster with great care. It's not like a free resource that scales wildly in all directions with no forethought or care. It is somethi

Issue with ALLOW FILTERING

2014-08-05 Thread Jens Rantil
Hi, I'm having an issue with ALLOW FILTERING with Cassandra 2.0.8. See a minimal example here: https://gist.github.com/JensRantil/ec43622c26acb56e5bc9 I expect the second last to fail, but the last query to return a single row. In particular I expect the last SELECT to first select using the clus

Re: Reasonable range for the max number of tables?

2014-08-05 Thread Phil Luckhurst
Hi Mark, Mark Reddy wrote > To segregate customer data, you could: > - Use customer specific column families under a single keyspace > - Use a keyspace per customer These effectively amount to the same thing and they both fall foul to the limit in the number of column families so do not scale.

Re: Make an existing cluster multi data-center compatible.

2014-08-05 Thread Rene Kochen
What I understand is that SimpleStrategy determines the endpoints for replicas by traversing the ring clockwise. NetworkTopologyStrategy determines the replicas by traversing the ring clockwise and taking into account the racks and DC locations. Since the file used by PropertyFileSnitch puts

Re: Make an existing cluster multi data-center compatible.

2014-08-05 Thread Mark Reddy
Yes, you must run a full repair for the reasons stated in the yaml file. Mark On Tue, Aug 5, 2014 at 11:52 AM, Rene Kochen wrote: > Hi all, > > I want to add a data-center to an existing single data-center cluster. > First I have to make the existing cluster multi data-center compatible. > >

Make an existing cluster multi data-center compatible.

2014-08-05 Thread Rene Kochen
Hi all, I want to add a data-center to an existing single data-center cluster. First I have to make the existing cluster multi data-center compatible. The existing cluster is a 12 node cluster with: - Replication factor = 3 - Placement strategy = SimpleStrategy - Endpoint snitch = SimpleSnitch I

Re: A question about using 'update keyspace with strategyoptions' command

2014-08-05 Thread Rahul Neelakantan
Try running "describe cluster" from Cassandra-CLI to see if all nodes have the same schema version. Rahul Neelakantan > On Aug 5, 2014, at 6:13 AM, Sylvain Lebresne wrote: > >> On Tue, Aug 5, 2014 at 11:40 AM, Lu, Boying wrote: >> What I want to know is “are the strategy changed ?’ after the

Re: A question about using 'update keyspace with strategyoptions' command

2014-08-05 Thread Sylvain Lebresne
On Tue, Aug 5, 2014 at 11:40 AM, Lu, Boying wrote: > What I want to know is “is the *strategy* changed?” after the ‘update > keyspace with strategy_options…’ command returns successfully > Like all schema changes, not necessarily on all nodes. You will have to check for schema agreement betwee

RE: A question about using 'update keyspace with strategyoptions' command

2014-08-05 Thread Lu, Boying
Yes. Sorry for not saying it clearly. What I want to know is whether the strategy has changed after the ‘update keyspace with strategy_options…’ command returns successfully, not the data change. E.g., say I run the command ‘update keyspace with strategy_options [dc1: 3, dc2:3]’; when this command ret

Re: Reasonable range for the max number of tables?

2014-08-05 Thread Mark Reddy
Hi Phil, In theory, the max number of column families would be in the low number of hundreds. In practice the limit is related to the amount of heap you have, as each column family will consume 1 MB of heap due to arena allocation. To segregate customer data, you could: - Use customer specific colum
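The heap cost described above is simple to estimate. This is a back-of-the-envelope sketch using the 1 MB per column family figure from the message; the table count and heap size in the example are hypothetical inputs, not recommendations.

```python
MB_PER_TABLE = 1  # arena allocation cost per column family, as quoted in the message


def arena_heap_mb(num_tables: int, mb_per_table: float = MB_PER_TABLE) -> float:
    """Heap consumed by arena allocation alone, in MB."""
    return num_tables * mb_per_table


def heap_fraction(num_tables: int, heap_mb: float) -> float:
    """Share of the JVM heap taken by per-table arena allocation."""
    return arena_heap_mb(num_tables) / heap_mb


# Hypothetical example: 500 column families on a node with an 8192 MB heap.
print(arena_heap_mb(500), "MB,", f"{heap_fraction(500, 8192):.1%} of heap")
```

This counts only the fixed per-table overhead; memtable contents and other per-table structures come on top of it.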

Re: A question about using 'update keyspace with strategyoptions' command

2014-08-05 Thread Sylvain Lebresne
Changing the strategy options, and in particular the replication factor, does not perform any data replication by itself. You need to run a repair to ensure data is replicated following the new replication. On Tue, Aug 5, 2014 at 10:52 AM, Lu, Boying wrote: > Thanks. yes. I can use the ‘show ke

RE: A question about using 'update keyspace with strategyoptions' command

2014-08-05 Thread Lu, Boying
Thanks. Yes, I can use the ‘show keyspace’ command to check and see that the strategy has changed. But what I want to know is if the ‘update keyspace with strategy_options …’ command is a ‘sync’ operation or an ‘async’ operation. From: Rahul Menon [mailto:ra...@apigee.com] Sent: August 5, 2014 16:38 To

Re: A question about using 'update keyspace with strategyoptions' command

2014-08-05 Thread Rahul Menon
Try the show keyspaces command and look for "Options" under each keyspace. Thanks Rahul On Tue, Aug 5, 2014 at 2:01 PM, Lu, Boying wrote: > Hi, All, > > > > I want to run ‘update keyspace with strategy_options={dc1:3, dc2:3}’ from > cassandra-cli to update the strategy options of some keyspace

A question about using 'update keyspace with strategyoptions' command

2014-08-05 Thread Lu, Boying
Hi, All, I want to run 'update keyspace with strategy_options={dc1:3, dc2:3}' from cassandra-cli to update the strategy options of some keyspace in a multi-DC environment. When the command returns successfully, does it mean that the strategy options have been updated successfully or I need to w

Node stuck during nodetool rebuild

2014-08-05 Thread Vasileios Vlachos
Hello All, We are on 1.2.18 (running on Ubuntu 12.04) and we recently tried to add a second DC on our demo environment, just before trying it on live. The existing DC1 has two nodes which approximately hold 10G of data (RF=2). In order to add the second DC, DC2, we followed this procedure: On DC1

Re: Reasonable range for the max number of tables?

2014-08-05 Thread Phil Luckhurst
Is there any mention of this limitation anywhere in the Cassandra documentation? I don't see it mentioned in the 'Anti-patterns in Cassandra' section of the DataStax 2.0 documentation or anywhere else. When starting out with Cassandra as a store for a multi-tenant application it seems very attract