Thanks for the info; this gives me a lot to go through, especially Al Tobey's guide. I'm running java version "1.8.0_121" and using G1GC as the GC type.
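For reference, a sketch of the Java 8 GC-logging flags that could go in jvm.options to confirm whether pauses line up with the DN events (the log path is just an assumption, not something from this thread):

    # GC logging for Java 8; adjust the path to your install
    -Xloggc:/var/log/cassandra/gc.log
    # per-collection detail plus wall-clock timestamps
    -XX:+PrintGCDetails
    -XX:+PrintGCDateStamps
    # record total stop-the-world time, not just GC time
    -XX:+PrintGCApplicationStoppedTime
    # keep the log bounded
    -XX:+UseGCLogFileRotation
    -XX:NumberOfGCLogFiles=10
    -XX:GCLogFileSize=10M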
On Jun 1 2017, at 2:32 pm, Victor Chen wrote:
Regarding mtime, I'm just talking about using something like the following
(assuming you are on Linux): "find <path-to-your-data-dir> -mtime -1 -ls", which
will find all files in your data dir last modified within the past 24h. You
can then compare the increase in your reported nodetool load over the past N days
against what that turns up.
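A rough sketch of that comparison (GNU find on Linux; the data directory path is an assumption, adjust to your install):

    # files in the data dir written in the last 24h
    find /var/lib/cassandra/data -type f -mtime -1 -ls
    # total size of those files, in MB
    find /var/lib/cassandra/data -type f -mtime -1 -printf '%s\n' | awk '{s+=$1} END {printf "%.1f MB\n", s/1048576}'
    # compare against the day-over-day change in the Load column
    nodetool status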
I'll try to capture answers to the questions in the last 2 messages. Network traffic looks pretty steady overall, about 0.5 up to 2 megabytes/s. The cluster handles about 100k to 500k operations per minute; the read/write split is about 50/50 right now, eventually though it will probably b
Hi Daniel,
In my experience, when a node shows DN and then comes back up by itself, that
sounds like some sort of GC pause (especially if nodetool status, when run from
the "DN" node itself, shows it is up -- assuming there isn't a spotty network
issue). Perhaps I missed this info due to the length of the thread, but what Java version and GC settings are you running?
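If it helps, a quick way to check for long pauses in the logs (a rough sketch, assuming a package install with logs under /var/log/cassandra):

    # Cassandra's GCInspector logs collections that exceed its pause threshold
    grep GCInspector /var/log/cassandra/system.log | tail -20
    # with -XX:+PrintGCApplicationStoppedTime enabled, gc.log also records
    # "Total time for which application threads were stopped" lines,
    # which include safepoint pauses beyond GC itself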
Some random thoughts, but first I would like to thank you for giving us an
interesting problem. Cassandra can get boring sometimes; it is too stable.
- Do you have a way to monitor the network traffic to see if it is
increasing between restarts, or does it seem relatively flat?
- What activities are happening
I am just restarting Cassandra. I don't think I'm having any disk space issues, but we're having issues where operations have increased latency, and these are fixed by a restart. It seemed like the load reported by nodetool status might be helpful in understanding what is going wrong, but I'm not sure.
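One sanity check that might be worth running before and after a restart is comparing what nodetool reports with what the filesystem actually shows (a rough sketch, assuming the default data directory):

    # actual on-disk size of the data directory
    du -sh /var/lib/cassandra/data
    # space used/free on the underlying volume
    df -h /var/lib/cassandra
    # the reported Load this thread is about
    nodetool status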
Hi Daniel,
When you say that the nodes have to be restarted, are you just restarting
the Cassandra service or are you restarting the machine?
How are you reclaiming disk space at the moment? Does disk space free up
after the restart?
Regarding storage on nodes, keep in mind the more data stored o
You're the only one I see in the thread that's made any reference to HDFS.
The OP even noted that his question is about C*, not HDFS.
On Tue, May 30, 2017 at 2:59 PM daemeon reiydelle
wrote:
> Did you notice that HDFS is the distributed file system used?
Did you notice that HDFS is the distributed file system used?
Daniel - my comment wasn't to you; it was in response to Daemeon.
> no, 3tb is small. 30-50tb of hdfs space is typical these days per hdfs
node
Jon
On Tue, May 30, 2017 at 2:30 PM Daniel Steuernol
wrote:
My question is about Cassandra; ultimately I'm trying to figure out why our cluster's performance degrades approximately every 6 days. I noticed that the load as reported by nodetool status was very high, but that might be unrelated to the problem. A restart solves the performance problem. I've attac
This isn't an HDFS mailing list.
On Tue, May 30, 2017 at 2:14 PM daemeon reiydelle
wrote:
No, 3 TB is small. 30-50 TB of HDFS space is typical these days per HDFS
node. It depends somewhat on whether there is a mix of more and less
frequently accessed data, but even storing only hot data, I never saw
anything less than 20 TB of HDFS per node.
Am I the only one thinking 3TB is way too much data for a single node on a
VM?
On Tue, May 30, 2017 at 10:36 PM, Daniel Steuernol
wrote:
> I don't believe incremental repair is enabled, I have never enabled it on
> the cluster, and unless it's the default then it is off. Also I don't see a
> set
No degradation.
That does sound like what's happening. Did performance degrade as the reported load increased?
On May 30 2017, at 1:52 pm, daemeon reiydelle wrote:
OK, thanks.
So there was a bug in a prior version of C*; the symptoms were:
Nodetool would show increasing load utilization over time. Stopping and
restarting C* nodes would reset the storage back to what one would expect
on that node, for a while, then it would creep upwards again, until the
node(s)
I don't believe incremental repair is enabled; I have never enabled it on the cluster, and unless it's the default then it is off. Also, I don't see a setting in cassandra.yaml for it.
On May 30 2017, at 1:10 pm, daemeon reiydelle wrote:
Unless there is a bug, snapshots are excluded (they are not HDFS anyway!)
from nodetool status.
Out of curiosity, is incremental repair enabled? This is almost certainly
a rat hole, but there was an issue a few releases back where load would
only increase until the node was restarted. Had been fixed
Incremental backup is set to false in the config file; I have also set snapshot_before_compaction and auto_snapshot to false. I ran nodetool clearsnapshot, but before doing that I ran nodetool listsnapshots and it listed a bunch of snapshots. I would have expected that to be empty because
Can you please check whether you have incremental backup enabled and whether
snapshots are occupying the space?
Run the nodetool clearsnapshot command.
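Roughly, the checks being suggested above (the config path assumes a package install):

    # confirm the backup/snapshot settings in cassandra.yaml
    grep -E 'incremental_backups|auto_snapshot|snapshot_before_compaction' /etc/cassandra/cassandra.yaml
    # list existing snapshots and the space they occupy
    nodetool listsnapshots
    # remove all snapshots (pass a keyspace name to limit the scope)
    nodetool clearsnapshot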
On Tue, May 30, 2017 at 11:12 AM, Daniel Steuernol
wrote:
It's 3-4 TB per node, and by "load rises" I'm talking about the load as reported by nodetool status.
On May 30 2017, at 10:25 am, daemeon reiydelle wrote:
When you say "the load rises ... ", could you clarify what you mean by "l
When you say "the load rises ... ", could you clarify what you mean by
"load"? That has a specific Linux term, and in e.g. Cloudera Manager. But
in neither case would that be relevant to transient or persisted disk. Am I
missing something?
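For what it's worth, the two meanings are easy to tell apart on the node itself (a quick sketch):

    # the "Load" column: size of the data files Cassandra is tracking on disk (snapshots excluded)
    nodetool status
    # the Linux "load average": run-queue length, a different metric entirely
    uptime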
On Tue, May 30, 2017 at 10:18 AM, tommaso barbugli
wrote:
3-4 TB per node or in total?
On Tue, May 30, 2017 at 6:48 PM, Daniel Steuernol
wrote:
I should also mention that I am running Cassandra 3.10 on the cluster.
On May 29 2017, at 9:43 am, Daniel Steuernol wrote:
The cluster is running with RF=3; right now each node is storing about 3-4 TB of data. I'm using r4.2xlarge EC2 instances; these have 8 vCPUs and 61 GB of RAM, and the disks attached for the data drive are gp2 SSD EBS volumes with 10k IOPS. I guess this brings up the question of what's a good marker
Hi Daniel,
This is not normal. Possibly a capacity problem. What's the RF, how much
data do you store per node, and what kind of servers do you use (core count,
RAM, disk, ...)?
Cheers,
Tommaso
On Mon, May 29, 2017 at 6:22 PM, Daniel Steuernol
wrote:
I am running a 6-node cluster, and I have noticed that the reported load on each node rises throughout the week and grows way past the actual disk space used and available on each node. Also, eventually the latency for operations suffers and the nodes have to be restarted. A couple questions on this, is