Re: full gc too often

2014-12-04 Thread Jonathan Haddad
I recommend reading through https://issues.apache.org/jira/browse/CASSANDRA-8150 to get an idea of how the JVM GC works and what you can do to tune it. Also good is Blake Eggleston's writeup which can be found here: http://blakeeggleston.com/cassandra-tuning-the-jvm-for-read-heavy-workloads.html

Re: Performance Difference between Batch Insert and Bulk Load

2014-12-04 Thread Dong Dai
> On Dec 4, 2014, at 1:46 PM, Tyler Hobbs wrote:
>
> On Thu, Dec 4, 2014 at 11:50 AM, Dong Dai wrote:
> As we already did what coordinators do in client side, why don’t we do one step more: break the UNLOGGED batch statements into several small batch statem

Re: Replacing a dead node by deleting it and auto_bootstrap'ing a new node (Cassandra 2.0)

2014-12-04 Thread Jaydeep Chovatia
As far as I know, if you have NOT explicitly specified "-Dcassandra.replace_address=old_node_ipaddress", then new (random) tokens get assigned to the bootstrapping node instead of the dead node's tokens. -jaydeep

On Thu, Dec 4, 2014 at 6:50 AM, Omri Bahumi wrote:
> Hi,
>
> I was wondering, ho

Re: full gc too often

2014-12-04 Thread Philo Yang
I have two kinds of machines: 16G RAM with the default heap size setting (about 4G), and 64G RAM with the default heap size setting (about 8G). Both kinds of nodes have the same number of vnodes, and both have the GC issue, although the 16G nodes are more likely to hit it. Thanks, Philo Ya
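
For reference, the 4G and 8G figures above are what the stock heap calculation gives. The sketch below assumes the rule used by the default cassandra-env.sh in the 2.x line, max(min(1/2 RAM, 1024MB), min(1/4 RAM, 8192MB)); the function name is made up for illustration, so check the script shipped with your version before relying on it.

```python
def default_max_heap_mb(system_memory_mb: int) -> int:
    """Approximate the default MAX_HEAP_SIZE rule (assumed, see note above)."""
    # Half of RAM, capped at 1024 MB
    half = min(system_memory_mb // 2, 1024)
    # A quarter of RAM, capped at 8192 MB
    quarter = min(system_memory_mb // 4, 8192)
    # The larger of the two capped values wins
    return max(half, quarter)


for ram_gb in (16, 64):
    heap_mb = default_max_heap_mb(ram_gb * 1024)
    print(f"{ram_gb}G RAM -> default heap of about {heap_mb} MB")
```

Under this rule no node gets more than 8G by default, however much RAM it has.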

Re: full gc too often

2014-12-04 Thread Tim Heckman
On Dec 4, 2014 8:14 PM, "Philo Yang" wrote:
>
> Hi all,
>
> I have a cluster on C* 2.1.1 and jdk 1.7_u51. I am having trouble with full GC: sometimes one or two nodes run a full GC more than once per minute, each taking over 10 seconds, after which the node becomes unreachable and the latency

full gc too often

2014-12-04 Thread Philo Yang
Hi all,

I have a cluster on C* 2.1.1 and jdk 1.7_u51. I am having trouble with full GC: sometimes one or two nodes run a full GC more than once per minute, each taking over 10 seconds, after which the node becomes unreachable and the latency of the cluster increases. I grep the GCInspecto

Re: Performance Difference between Batch Insert and Bulk Load

2014-12-04 Thread Shane Hansen
I'd be really interested to know what sort of performance or load improvements you see by doing client side partitioning. Please post back some results if you've tried that strategy.

On Thu, Dec 4, 2014 at 11:46 AM, Tyler Hobbs wrote:
>
> On Thu, Dec 4, 2014 at 11:50 AM, Dong Dai wrote:
>
>> As

Re: Performance Difference between Batch Insert and Bulk Load

2014-12-04 Thread Tyler Hobbs
On Thu, Dec 4, 2014 at 11:50 AM, Dong Dai wrote:
> As we already did what coordinators do in client side, why don’t we do one step more: break the UNLOGGED batch statements into several small batch statements, each of which contains the statements with the same partition key. And send the
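
The client-side splitting proposed in the quoted text could look roughly like the sketch below. It only shows the grouping step in plain Python; the row layout, the sample data, and the helper name split_by_partition_key are assumptions for illustration, and no driver calls are shown. Each resulting group would then be sent as its own small UNLOGGED batch.

```python
from collections import defaultdict


def split_by_partition_key(rows, key_of):
    """Group rows so that each group shares one partition key."""
    groups = defaultdict(list)
    for row in rows:
        groups[key_of(row)].append(row)
    # Return a plain dict: partition key -> rows for one small batch
    return dict(groups)


# Hypothetical rows: (partition key, clustering column, value)
rows = [
    ("sensor-a", 1, 20.5),
    ("sensor-b", 1, 19.0),
    ("sensor-a", 2, 20.7),
]

batches = split_by_partition_key(rows, key_of=lambda r: r[0])
for key, group in batches.items():
    print(key, len(group))
```

Whether this actually beats one large batch is the open question in this thread, so treat it as something to measure rather than a recommendation.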

Re: Performance Difference between Batch Insert and Bulk Load

2014-12-04 Thread Dong Dai
> On Dec 4, 2014, at 11:37 AM, Tyler Hobbs wrote:
>
> On Wed, Dec 3, 2014 at 11:02 PM, Dong Dai wrote:
>
> 1) except I am using TokenAwarePolicy, the async insert also can not be sent to the right coordinator.
>
> Yes. Of course, TokenAwarePolicy can w

Re: mirror between github and apache

2014-12-04 Thread Tyler Hobbs
The Apache git repo is the main repo. The github repo is periodically synched (I believe every few hours).

On Thu, Dec 4, 2014 at 2:39 AM, Christian Andersson wrote:
> Hi,
>
> Have a question regarding the mirror of git://git.apache.org/cassandra.git and github.
>
> How are the repositories s

Re: Performance Difference between Batch Insert and Bulk Load

2014-12-04 Thread Tyler Hobbs
On Wed, Dec 3, 2014 at 11:02 PM, Dong Dai wrote:
> 1) except I am using TokenAwarePolicy, the async insert also can not be sent to the right coordinator.

Yes. Of course, TokenAwarePolicy can wrap any other policy.

> 2) the TokenAwarePolicy actually is doing the job that coordinators

unsubscribe

2014-12-04 Thread Sheausong Yang

Repair taking many snapshots per minute

2014-12-04 Thread Robert Wille
This is a follow-up to my previous post “Cassandra taking snapshots automatically?”. I’ve renamed the thread to better describe the new information I’ve discovered. I have a four node, RF=3, 2.0.11 cluster that was producing snapshots at a prodigious rate. I let the cluster sit idle overnight t

Replacing a dead node by deleting it and auto_bootstrap'ing a new node (Cassandra 2.0)

2014-12-04 Thread Omri Bahumi
Hi,

I was wondering, how would auto_bootstrap behave in this scenario:
1. I had a cluster with 3 nodes (RF=2)
2. One node died, I deleted it with "nodetool removenode" (+ force)
3. A new node launched with "auto_bootstrap: true"

The question is: will the "right" vnodes go to the new node as if i

Re: [Import csv to Cassandra] Taking too much time

2014-12-04 Thread Yuki Morishita
Here's a blog post about writing SSTables from CSV and using SSTableLoader to load them: http://www.datastax.com/dev/blog/using-the-cassandra-bulk-loader-updated

On Thu, Dec 4, 2014 at 5:57 AM, 严超 wrote:
> Thank you very much for your advice.
> Can you give me more advice for using SSTableLoader t

Re: [Import csv to Cassandra] Taking too much time

2014-12-04 Thread 严超
Thank you very much for your advice. Can you give me more advice on using SSTableLoader to import CSV? What is the best practice for importing CSV into Cassandra with SSTableLoader? *Best Regards!* *Chao Yan--**My twitter:Andy Yan @yanchao727 * *My W

Re: [Import csv to Cassandra] Taking too much time

2014-12-04 Thread Akshay Ballarpure
Hello Chao Yan,

Importing CSV data with the COPY command in Cassandra is always painful for large files (say > 1 GB). The CQL tool was not developed for such heavy operations; try using SSTableLoader for the import instead.

Best Regards
Akshay

From: 严超 To: user@cassandra.apache.org Dat

Re: 2.0.10 upgrade to 2.1.2 gives "Unable to gossip with any seeds"

2014-12-04 Thread Neha
Check whether you have rpc_server_type set to hsha. If so, change it to sync and try again.

On Dec 4, 2014, at 3:55 PM, sinonim wrote:
> Hi all,
>
> We have the case of a cassandra cluster with nodes version 2.0.10, all in a single EC2 region. We want to perform a rolling upgrade to version 2.1.2 but t

2.0.10 upgrade to 2.1.2 gives "Unable to gossip with any seeds"

2014-12-04 Thread sinonim
Hi all,

We have the case of a cassandra cluster with nodes version 2.0.10, all in a single EC2 region. We want to perform a rolling upgrade to version 2.1.2 but the new node has the following exceptions:

java.lang.RuntimeException: Unable to gossip with any seeds
at org.apache.cassandra.gms.Gossi

Re: Wide rows best practices and GC impact

2014-12-04 Thread Jabbar Azam
Hello,

I saw this yesterday but didn't want to reply because I didn't know what the cause was. Basically, I was using wide rows with Cassandra 1.x and was inserting data constantly. After about 18 hours the JVM would crash with a dump file. For some reason I removed the compaction throttling a

[Import csv to Cassandra] Taking too much time

2014-12-04 Thread 严超
Hi, Everyone:

I'm importing a CSV file into Cassandra, and it always gets the error "Request did not complete within rpc_timeout", after which I have to run my cql COPY command again. The CSV file is 2.2 GB, and the import is taking a long time. How can I speed up CSV file importing? Is there

mirror between github and apache

2014-12-04 Thread Christian Andersson
Hi,

I have a question regarding the mirror of git://git.apache.org/cassandra.git and github.

How are the repositories synchronized? Manually, or is some kind of automatic job doing this?

//chibbe