Re: Convert single node C* to cluster (rebalancing problem)

2017-06-02 Thread Junaid Nasir
I was able to get it working. I added a new node with the following changes:

#rpc_address:0.0.0.0
rpc_address: 10.128.1.11
#rpc_broadcast_address:10.128.1.11

rpc_address was previously set to 0.0.0.0 (I had run into a problem with remote
connections earlier and made these changes:
https://stackoverflow.com/questions/12236898/apache-cassandra-remote-access).

Should this be happening?

On Thu, Jun 1, 2017 at 6:31 PM, Vladimir Yudovin 
wrote:

> Did you run "nodetool cleanup" on the first node after the second was
> bootstrapped? It should remove the rows that no longer belong to that node
> after the token ranges changed.
>
> Best regards, Vladimir Yudovin,
> *Winguzone  - Cloud Cassandra Hosting*
>
>
>  On Wed, 31 May 2017 03:55:54 -0400 *Junaid Nasir  >* wrote 
>
> Cassandra ensures that adding or removing nodes is very easy and that load
> is balanced between nodes when a change is made, but it's not working in my
> case.
> I have a single node C* deployment (with 270 GB of data) and want to
> balance the data across multiple nodes. I followed this guide
> 
>
> `nodetool status` shows 2 nodes but load is not balanced between them
>
> Datacenter: dc1
> ===
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address      Load        Tokens  Owns (effective)  Host ID                               Rack
> UN  10.128.0.7   270.75 GiB  256     48.6%             1a3f6faa-4376-45a8-9c20-11480ae5664c  rack1
> UN  10.128.0.14  414.36 KiB  256     51.4%             66a89fbf-08ba-4b5d-9f10-55d52a199b41  rack1
>
> I also ran 'nodetool repair' on the new node but the result is the same. Any
> pointers would be appreciated :)
>
> conf file of new node
>
> cluster_name: 'cluster1'
>  - seeds: "10.128.0.7"
> num_tokens: 256
> endpoint_snitch: GossipingPropertyFileSnitch
>
> Thanks,
> Junaid
>
>
>


Re: Restarting nodes and reported load

2017-06-02 Thread Daniel Steuernol
Thanks for the info; this provides a lot to go through, especially Al Tobey's guide. I'm running java version "1.8.0_121" and using G1GC as the gc type.
  

On Jun 1 2017, at 2:32 pm, Victor Chen  wrote:


Regarding mtime, I'm just talking about using something like the following (assuming you are on linux): "find pathtoyourdatadir -mtime -1 -ls", which will find all files in your datadir last modified within the past 24h. You can compare the increase in your reported nodetool load within the past N days and then use the same period of time to look for modified files that could match that size. Not really sure what sort of load or how long that would take on 3-4T of data though.

Regarding compactionstats and tpstats, I would just be interested in whether there are increasing "pending" tasks for either. Did you say you observed latency issues or degraded performance or not? What version of java/cassandra did you say you were running, and what type of gc are you using?

Regarding a node not creating a "DOWN" entry in the log: if a node experiences a sufficiently long gc pause (I'm not sure what the threshold is, maybe somebody more knowledgeable can chime in?), then even though the node itself still "thinks" it's up, other nodes will mark it as DN. Thus you wouldn't see an "is now DOWN" entry in the system.log of the gc-ing node, but you would see one in the system.log of the remote nodes (and a corresponding "is now UP" entry when the node comes out of its gc pause). Assuming the logs have not been rotated off, if you just grep system.log for "DOWN" on your nodes, that usually reveals a useful timestamp from which to start looking in the problematic node's system.log or gc.log.

Do you have persistent cpu/memory/disk io/disk space monitoring mechanisms? You should think about putting something in place to gather that info if you don't ... I find myself coming back to Al Tobey's tuning guide frequently, if nothing else for the tools he mentions and the notes on the java gc. I want to say a heap size of 15G sounds a little high, but I am starting to talk a bit out of my depth when it comes to java tuning. See datastax's official cassandra 2.1 jvm tuning doc and also this stackoverflow thread. Good luck!

On Thu, Jun 1, 2017 at 4:06 PM, Daniel Steuernol  wrote:

I'll try to capture answers to the questions in the last 2 messages.

Network traffic looks pretty steady overall, about 0.5 up to 2 megabytes/s. The cluster handles about 100k to 500k operations per minute; the read/write split is about 50/50 right now, though eventually it will probably be 70% writes and 30% reads.

There do seem to be some nodes that are affected more frequently than others. I haven't captured cpu/memory stats vs other nodes at the time the problem is occurring; I will do that next time it happens. I will also look at compactionstats and tpstats. What are some things that I should be looking for in tpstats in particular? I'm not exactly sure how to read the output from that command.

The heap size is set to 15GB on each node, and each node has 60GB of ram available.

In regards to the "... is now DOWN" messages, I'm unable to find one in the system.log for a time when I know a node was having problems. I've built a system that polls nodetool status and parses the output, and if it sees a node reporting as DN it sends a message to a slack channel. Is it possible for a node to report as DN but not have the message show up in the log? The system polling nodetool status is not the node that was reported as DN.

I'm a bit unclear about the last point about the mtime/size of files and how to check them; can you provide more information there?

Thanks for all the help, I really appreciate it.
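A concrete sketch of the mtime and "is now DOWN" checks described in Victor's reply above; the data directory and log paths are assumptions for a typical package install, so adjust them for your layout:

# files in the data directory modified within the last 24h (path is an assumption)
find /var/lib/cassandra/data -mtime -1 -ls

# run on the *other* nodes: timestamps where they marked a peer down
# (log path is an assumption for a package install)
grep 'is now DOWN' /var/log/cassandra/system.log

The grep output gives a timestamp to line up against the suspect node's own system.log and gc.log.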
  

On Jun 1 2017, at 10:33 am, Victor Chen  wrote:


Hi Daniel,

In my experience, when a node shows DN and then comes back up by itself, that sounds like some sort of gc pause (especially if nodetool status, when run from the "DN" node itself, shows it is up -- assuming there isn't a spotty network issue). Perhaps I missed this info due to the length of the thread, but have you shared info about the following?

- cpu/memory usage of affected nodes (are all nodes affected comparably, or some more than others?)
- nodetool compactionstats and tpstats output (especially as the )
- what is your heap size set to?
- system.log and gc.logs: for investigating node "DN" symptoms I will usually start by noting the timestamp of the "123.56.78.901 is now DOWN" entries in the system.log of other nodes to tell me where to look in the system.log of the node in question. Then it's a question of answering "what was this node doing up to that point?"
- mtime/size of files in the data directory -- which files are growing in size?

That will help reduce how much we need to speculate. I don't think you should need to restart cassandra every X days if

Re: How to avoid flush if the data can fit into memtable

2017-06-02 Thread Jeff Jirsa


On 2017-05-24 17:42 (-0700), preetika tyagi  wrote: 
> Hi,
> 
> I'm running Cassandra with a very small dataset so that the data can exist
> on memtable only. Below are my configurations:
> 
> In jvm.options:
> 
> -Xms4G
> -Xmx4G
> 
> In cassandra.yaml,
> 
> memtable_cleanup_threshold: 0.50
> memtable_allocation_type: heap_buffers
> 
> As per the documentation in cassandra.yaml, the *memtable_heap_space_in_mb*
> and *memtable_offheap_space_in_mb* will be set to 1/4 of the heap size, i.e. 1000MB
> 
> According to the documentation here (
> http://docs.datastax.com/en/cassandra/3.0/cassandra/configuration/configCassandra_yaml.html#configCassandra_yaml__memtable_cleanup_threshold),
> the memtable flush will trigger if the total size of memtable(s) goes beyond
> (1000+1000)*0.50=1000MB.

1/4 heap (=1G) * .5 cleanup means cleanup happens at 500MB, or when the commitlog 
hits its max size. If you disable durable writes (which removes the commitlog 
trigger), you're still flushing at 500MB.

Recall that your 300MB of data also has associated data with it (timestamps, 
ttls, etc) that will increase size beyond your nominal calculation from the 
user side.

If you're sure you want to do this, set durable_writes=false and either raise 
the memtable_cleanup_threshold significantly, raise your heap, or raise your 
memtable size. 
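A back-of-the-envelope restatement of the arithmetic above, assuming the 4G heap and the defaults quoted in the original post (the keyspace name in the cqlsh line is a placeholder):

# memtable_heap_space_in_mb    = 4096 / 4 = 1024 MB  (default: 1/4 of heap)
# memtable_offheap_space_in_mb = 4096 / 4 = 1024 MB  (not used for memtable data with heap_buffers)
# flush trigger ~= 1024 MB * memtable_cleanup_threshold (0.50) ~= 512 MB of memtable data,
#                  plus an unconditional flush whenever the commitlog reaches its max size
#
# durable_writes is a per-keyspace setting, e.g. (my_keyspace is a placeholder):
cqlsh -e "ALTER KEYSPACE my_keyspace WITH durable_writes = false;"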


-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: Information on Cassandra

2017-06-02 Thread Jeff Jirsa


On 2017-06-01 09:09 (-0700), "Harper, Paul"  wrote: 
> Hello All,
> 
> I'm about 3 months into supporting several clusters of Cassandra databases. I 
> recently subscribed to this email list and I receive lots of interesting 
> emails, most of which I don't understand. I feel like I have a pretty good 
> grasp on Cassandra, but I would like to know what types of things I should be 
> checking on a daily, weekly or monthly basis. Many of the emails I see in this 
> thread are on subjects I've never had to look at so far. So I'm wondering 
> what it is that I should be monitoring, doing, or should know. I would 
> appreciate any advice or guidance you can provide. Please reply to my email and 
> not the group list unless it's something that may be helpful to others.
> 

The good news is that cassandra can run for years without any intervention, 
especially if you're not pushing the limits.

At a high level, you should be watching (a quick nodetool spot-check for several of these is sketched after the list):
- Read/writes per second. Your application may warn you if these change, but 
catching it before it impacts your application is always nice. 
- Latencies (how long does each read/write take, and is that getting worse over 
time, which may indicate a problem brewing)
- How much data is on each node (hopefully it's pretty even)
- How many sstables are on each node (hopefully it's pretty even)
- GC pause times (you're probably using parnew/cms, most metrics packages will 
know how to graph those as two distinct lines - seeing long pauses is a good 
hint that things are starting to get bad)
- How often are you running repair? Is repair succeeding? Is it failing? If you 
delete data, you need to repair (successfully, all nodes) at least once every 
gc_grace_seconds (by default 10 days). 
- Whether or not schema versions match - if schema diverges, you could have a 
big problem brewing.
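A minimal sketch of how several of the items above can be spot-checked by hand with stock nodetool; the keyspace name is a placeholder, and a proper metrics/dashboard setup is still the better long-term answer:

nodetool status              # data load per node, and whether every node is UN
nodetool tablestats my_ks    # per-table sstable counts, sizes and latencies (my_ks is a placeholder)
nodetool tpstats             # thread pool stats; watch for growing pending/blocked counts
nodetool gcstats             # recent GC pause statistics
nodetool describecluster     # schema versions; all nodes should report the same version
nodetool repair -full my_ks  # if you delete data, this must succeed within every gc_grace_seconds

In practice you would feed the same numbers into whatever metrics system you already run rather than polling by hand.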






-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: How to avoid flush if the data can fit into memtable

2017-06-02 Thread preetika tyagi
Great explanation and the blog post, Akhil.

Sorry for the delayed response (somehow didn't notice the email in my
inbox), but this is what I concluded as well.

In addition to compression, I believe the sstable is serialized as well, and
the combination of both results in a much smaller sstable size in comparison
to the in-memory memtable size, which holds all the data as Java objects.

I also did a small experiment for this. When I allocate 4GB of heap
(resulting in roughly 981MB for the memtable, as per your post) and then write
approx 920MB of data, it ends up writing some sstables. However, if I
increase the heap size to 120GB and write ~920MB of data again, it doesn't
write anything to sstables. Therefore, it clearly indicates that I need a
bigger heap.

One interesting fact though: if I bring the heap size down to 64GB, which means
the memtable will roughly be around 16GB, and again write ~920MB of data, it
still writes some sstables. The ratio of 920MB of serialized + compressed data
to more than 16GB of in-memory memtable space looks a bit weird, but I don't
have a solid explanation for this behavior.

However, I'm not going to look into that so we can conclude this post :)
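For anyone who does want to dig further, a minimal way to compare the in-memory and on-disk numbers for a single table is sketched below; the keyspace/table name is a placeholder and the field labels are as printed by recent 3.x releases:

# per-table view of live memtable size vs on-disk sstable usage
nodetool tablestats my_ks.my_table | grep -E 'SSTable count|Space used \(live\)|Memtable data size'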

Thank you all for your responses!

Preetika


On Fri, Jun 2, 2017 at 10:56 AM, Jeff Jirsa  wrote:

>
>
> On 2017-05-24 17:42 (-0700), preetika tyagi 
> wrote:
> > Hi,
> >
> > I'm running Cassandra with a very small dataset so that the data can
> exist
> > on memtable only. Below are my configurations:
> >
> > In jvm.options:
> >
> > -Xms4G
> > -Xmx4G
> >
> > In cassandra.yaml,
> >
> > memtable_cleanup_threshold: 0.50
> > memtable_allocation_type: heap_buffers
> >
> > As per the documentation in cassandra.yaml, the *memtable_heap_space_in_mb*
> > and *memtable_offheap_space_in_mb* will be set to 1/4 of the heap size, i.e.
> > 1000MB
> >
> > According to the documentation here (
> > http://docs.datastax.com/en/cassandra/3.0/cassandra/configuration/configCassandra_yaml.html#configCassandra_yaml__memtable_cleanup_threshold),
> > the memtable flush will trigger if the total size of memtable(s) goes beyond
> > (1000+1000)*0.50=1000MB.
>
> 1/4 heap (=1G) * .5 cleanup means cleanup happens at 500MB, or when the
> commitlog hits its max size. If you disable durable writes (which removes the
> commitlog trigger), you're still flushing at 500MB.
>
> Recall that your 300MB of data also has associated data with it
> (timestamps, ttls, etc) that will increase size beyond your nominal
> calculation from the user side.
>
> If you're sure you want to do this, set durable_writes=false and either raise
> the memtable_cleanup_threshold significantly, raise your heap, or raise your
> memtable size.
>
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: Convert single node C* to cluster (rebalancing problem)

2017-06-02 Thread Akhil Mehra
So now the data is evenly balanced across both nodes?

Refer to the following documentation to get a better understanding of rpc_address and broadcast_rpc_address:
https://www.instaclustr.com/demystifying-cassandras-broadcast_address/
I am surprised that your node started up with rpc_broadcast_address set, as this is an unsupported property. I am assuming you are using Cassandra version 3.10.
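For reference, a minimal cassandra.yaml sketch of the supported property names, using the addresses from the snippet above; whether broadcast_rpc_address is needed at all depends on the network, so treat this as an illustration rather than a recommended config:

# listen for client connections on all interfaces (or on one specific address)
rpc_address: 0.0.0.0
# the address advertised to clients; required whenever rpc_address is 0.0.0.0
broadcast_rpc_address: 10.128.1.11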


Regards,
Akhil

> On 2/06/2017, at 11:06 PM, Junaid Nasir  wrote:
> 
> I was able to get it working. I added a new node with the following changes:
> #rpc_address:0.0.0.0
> rpc_address: 10.128.1.11
> #rpc_broadcast_address:10.128.1.11
> rpc_address was previously set to 0.0.0.0 (I had run into a problem with remote
> connections earlier and made these changes:
> https://stackoverflow.com/questions/12236898/apache-cassandra-remote-access)
>  
> 
> Should this be happening?
> 
> On Thu, Jun 1, 2017 at 6:31 PM, Vladimir Yudovin  > wrote:
> Did you run "nodetool cleanup" on the first node after the second was bootstrapped? 
> It should remove the rows that no longer belong to that node after the token ranges changed.
> 
> Best regards, Vladimir Yudovin, 
> Winguzone  - Cloud Cassandra Hosting
> 
> 
>  On Wed, 31 May 2017 03:55:54 -0400 Junaid Nasir  > wrote 
> 
> Cassandra ensures that adding or removing nodes is very easy and that load is 
> balanced between nodes when a change is made, but it's not working in my case.
> I have a single node C* deployment (with 270 GB of data) and want to balance 
> the data across multiple nodes. I followed this guide 
> 
>  
> `nodetool status` shows 2 nodes but load is not balanced between them
> Datacenter: dc1
> ===
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address      Load        Tokens  Owns (effective)  Host ID                               Rack
> UN  10.128.0.7   270.75 GiB  256     48.6%             1a3f6faa-4376-45a8-9c20-11480ae5664c  rack1
> UN  10.128.0.14  414.36 KiB  256     51.4%             66a89fbf-08ba-4b5d-9f10-55d52a199b41  rack1
> I also ran 'nodetool repair' on the new node but the result is the same. Any 
> pointers would be appreciated :)
> 
> conf file of new node
> cluster_name: 'cluster1'
>  - seeds: "10.128.0.7"
> num_tokens: 256
> endpoint_snitch: GossipingPropertyFileSnitch
> Thanks,
> Junaid
> 
>