Sorry, Jerry, my mistake - I thought you had emailed me directly! You had
already informed the email list.

-- Jack Krupansky

On Tue, Dec 8, 2015 at 8:58 PM, xutom <xutom2...@126.com> wrote:

> Hi Anuj,
> Thanks! I will retry now!
> By the way, how do I "inform the C* email list as well so that others
> know", as Jack said? I am sorry I have not done that yet.
>
> Thanks
> jerry
>
> At 2015-12-09 01:09:07, "Anuj Wadehra" <anujw_2...@yahoo.co.in> wrote:
>
> Hi Jerry,
>
> It's great that you got a performance improvement. Moreover, I agree with
> what Graham said: you are using an extremely large heap with CMS, and in a
> very odd ratio. Giving 40G to the new generation and leaving only 20G for
> the old generation seems unreasonable, and it's hard to believe that you
> are getting reasonable GC pauses; please recheck. I would suggest testing
> your performance with a much smaller heap, maybe a 16G max heap and a 4G
> new generation. Also, make sure that you apply all the production settings
> recommended by DataStax at
> http://docs.datastax.com/en/cassandra/2.1/cassandra/install/installRecommendSettings.html
>
> Don't worry about wasting memory; it will be used for OS caching, and you
> may get even better performance.
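In cassandra-env.sh terms, Anuj's suggestion would look something like the
fragment below. This is only a sketch: 16G/4G are his trial figures, not
tested values, and note that the stock file names the new-generation setting
HEAP_NEWSIZE.

```shell
# Trial values per the suggestion above - much smaller than 60G/40G.
# The RAM freed up is not wasted; it goes to the OS page cache.
MAX_HEAP_SIZE="16G"
HEAP_NEWSIZE="4G"
```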
>
> Thanks
> Anuj
>
> Sent from Yahoo Mail on Android
> <https://overview.mail.yahoo.com/mobile/?.src=Android>
> ------------------------------
> *From*:"Jack Krupansky" <jack.krupan...@gmail.com>
> *Date*:Tue, 8 Dec, 2015 at 8:07 pm
> *Subject*:Re: Re: Re: Cassandra Tuning Issue
>
> Great! Make sure to inform the C* email list as well so that others know.
>
> -- Jack Krupansky
>
> On Tue, Dec 8, 2015 at 7:44 AM, xutom <xutom2...@126.com> wrote:
>
>>
>>
>> Dear Jack,
>>     Thank you very much! We now get much better performance when we
>> insert rows with the same partition key in the same batch.
>>
>> jerry
>>
>> At 2015-12-07 13:08:31, "Jack Krupansky" <jack.krupan...@gmail.com>
>> wrote:
>>
>> If you combine inserts for multiple partition keys in the same batch you
>> negate most of the effect of token-aware routing. It's best to insert only
>> rows with the same partition key in a single batch. You also need to set
>> the partition key for routing for the batch.
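One way to follow this advice is to bucket rows by partition key before
building any batches, so that every batch stays within a single partition and
token-aware routing can send it straight to a replica. The sketch below is an
illustration of that idea, not Jack's code; TeacherRow and bucketByPartition
are made-up names, modeled on the teacher table used later in this thread
(where id is the partition key, so distinct ids end up in distinct batches).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BatchByPartition {

    // A row destined for the teacher table: (id, lastname, firstname, city).
    // id is the partition key in this schema.
    record TeacherRow(int id, String lastname, String firstname, String city) {}

    // Bucket rows by partition key; each bucket would become one
    // BatchStatement whose statements all share the same partition key.
    static Map<Integer, List<TeacherRow>> bucketByPartition(List<TeacherRow> rows) {
        Map<Integer, List<TeacherRow>> buckets = new LinkedHashMap<>();
        for (TeacherRow r : rows) {
            buckets.computeIfAbsent(r.id(), k -> new ArrayList<>()).add(r);
        }
        return buckets;
    }

    public static void main(String[] args) {
        List<TeacherRow> rows = List.of(
                new TeacherRow(1, "Entre Nous", "a1", "city"),
                new TeacherRow(1, "Entre Nous", "a2", "city"),
                new TeacherRow(2, "Entre Nous", "a3", "city"));
        Map<Integer, List<TeacherRow>> buckets = bucketByPartition(rows);
        // Two partitions -> two batches; partition 1 carries two statements.
        System.out.println(buckets.size() + " batches, first has "
                + buckets.get(1).size() + " rows");
    }
}
```

Note the corollary: if every row has a distinct id (as in the test client in
this thread), each bucket holds one row, which means plain single inserts -
individual statements rather than multi-partition batches - are the better fit.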
>>
>> Also, RF=2 is not recommended since it does not permit quorum operations
>> if a replica node is down. RF=3 is generally more appropriate.
>>
>> -- Jack Krupansky
>>
>> On Sun, Dec 6, 2015 at 10:27 PM, xutom <xutom2...@126.com> wrote:
>>
>>> Dear all,
>>>     Thanks for your reply!
>>>     I'm using Apache Cassandra 2.1.1 and my JDK is 1.7.0_79. My keyspace
>>> replication factor is 2, and I do enable token awareness. The GC
>>> configuration is the default:
>>> # GC tuning options
>>> JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
>>> JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
>>> JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
>>>     I checked the GC log (gc.log.0.current) and found there is only one
>>> full GC. The stop-the-world times are low:
>>> CMS-initial-mark: 0.2747280 secs
>>> CMS-remark: 0.3623090 secs
>>>
>>>     The insert code in my test client is the following:
>>>             String content = RandomStringUtils.randomAlphabetic(120);
>>>             cluster = Cluster
>>>                     .builder()
>>>                     .addContactPoint(this.seedIP)
>>>                     .withCredentials("test", "test")
>>>                     .withRetryPolicy(DefaultRetryPolicy.INSTANCE)
>>>                     .withLoadBalancingPolicy(
>>>                             new TokenAwarePolicy(new DCAwareRoundRobinPolicy()))
>>>                     .build();
>>>             session = cluster.connect("demo");
>>>             ......
>>>             PreparedStatement insertPreparedStatement = session.prepare(
>>>                     "INSERT INTO teacher (id, lastname, firstname, city) "
>>>                             + "VALUES (?, ?, ?, ?);");
>>>
>>>             for (; i < max; i += 5) {
>>>                 // Fresh batch each iteration; reusing one BatchStatement
>>>                 // without clearing it makes every batch larger than the last.
>>>                 BatchStatement batch = new BatchStatement();
>>>                 try {
>>>                     batch.add(insertPreparedStatement.bind(i, "Entre Nous", "adsfasdfa1", content));
>>>                     batch.add(insertPreparedStatement.bind(i + 1, "Entre Nous", "adsfasdfa2", content));
>>>                     batch.add(insertPreparedStatement.bind(i + 2, "Entre Nous", "adsfasdfa3", content));
>>>                     batch.add(insertPreparedStatement.bind(i + 3, "Entre Nous", "adsfasdfa4", content));
>>>                     batch.add(insertPreparedStatement.bind(i + 4, "Entre Nous", "adsfasdfa5", content));
>>>                     session.execute(batch);
>>>                     thisTimeCount += 5;
>>>                 } catch (Exception e) {
>>>                     e.printStackTrace();
>>>                 }
>>>             }
>>>
>>>
>>>
>>> At 2015-12-07 00:40:06, "Graham Sanderson" <gra...@vast.com> wrote:
>>>
>>> What version of C* are you using, and what JVM version? You showed a
>>> partial GC config, but if that is still CMS (not G1) then you are going to
>>> have insane GC pauses...
>>>
>>> Also, depending on the C* version: are you using on-heap or off-heap
>>> memtables, and which type?
>>>
>>> Those are the sorts of issues related to fat nodes I'd be worried about.
>>> We run very nicely at 20G total heap and 8G new; the rest of our 128G of
>>> memory is disk cache/mmap and all of the off-heap stuff, so it doesn't go
>>> to waste.
>>>
>>> That said, I think Jack is probably on the right path with overloaded
>>> coordinators - though you'd still expect to see CPU usage, unless your
>>> timeouts are too low for the load, in which case the coordinator would be
>>> getting no responses in time and quite possibly the other nodes are just
>>> dropping the mutations (since they don't get to them before they know the
>>> coordinator would have timed out). I forget the command to check dropped
>>> mutations off the top of my head, but you can see it in OpsCenter.
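For reference (this is not from Graham's message): the usual command-line way
to check for dropped mutations is nodetool tpstats, which ends with a table of
dropped message counts per message type.

```
# Dropped mutations appear in the "Message type / Dropped" table
# at the end of the output; a nonzero MUTATION count means the node
# is shedding writes.
nodetool tpstats
```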
>>>
>>> If you have GC problems you certainly expect to see GC CPU usage, but
>>> depending on how long you run your tests it might take a little while to
>>> run through 40G.
>>>
>>> I'm personally not a fan of >32G (ish) heaps, as you can't use compressed
>>> oops, and it is also unrealistic for CMS... The word is that G1 now works
>>> OK with C*, especially on newer C* and JDK versions, but that said, it
>>> takes quite a lot of throughput to require insane quantities of young
>>> gen... We are guessing that when we remove all our legacy Thrift batch
>>> inserts we will need less - and as for the 20G total, we actually don't
>>> need that much (we dropped from 24G when we moved memtables off heap, and
>>> believe we can drop further).
>>>
>>> Sent from my iPhone
>>>
>>> On Dec 6, 2015, at 9:07 AM, Jack Krupansky <jack.krupan...@gmail.com>
>>> wrote:
>>>
>>> What replication factor are you using? Even if your writes use CL.ONE,
>>> Cassandra will be attempting writes to the replica nodes in the background.
>>>
>>> Are your writes "token aware"? If not, the receiving node has the
>>> overhead of forwarding the request to the node that owns the token for the
>>> primary key.
>>>
>>> For the record, Cassandra is not designed and optimized for so-called
>>> "fat nodes". The design focus is "commodity hardware" and "distributed
>>> cluster" (typically a dozen or more nodes).
>>>
>>> That said, it would be good if we had a rule of thumb for how many
>>> simultaneous requests a node can handle, both external requests and
>>> inter-node traffic. I think there is an open Jira to enforce a limit on
>>> in-flight requests so that nodes don't get overloaded and start failing in
>>> the middle of writes, as you seem to be seeing.
>>>
>>> -- Jack Krupansky
>>>
>>> On Sun, Dec 6, 2015 at 9:29 AM, jerry <xutom2...@126.com> wrote:
>>>
>>>> Dear All,
>>>>
>>>>     I have a 4-node Cassandra cluster, and I want to find the highest
>>>> performance it can reach. I wrote a Java client to batch-insert data into
>>>> all 4 nodes. When I start fewer than 30 threads in my client application,
>>>> everything is fine, but when I start more than 80 or 100 threads there
>>>> are many timeout exceptions (such as: Cassandra timeout during write
>>>> query at consistency ONE (1 replica were required but only 0 acknowledged
>>>> the write)). And no matter how many threads I use, even when I start
>>>> multiple clients with multiple threads on different computers, the
>>>> highest performance I can get is about 60000 - 80000 TPS. By the way,
>>>> each row I insert into Cassandra is about 130 bytes.
>>>>     Each of my 4 Cassandra nodes has:
>>>>         CPU: 4*15
>>>>         Memory: 512G
>>>>         Disk: flash card (only one disk but better than SSD)
>>>>     My Cassandra configuration is:
>>>>         MAX_HEAP_SIZE: 60G
>>>>         NEW_HEAP_SIZE: 40G
>>>>
>>>>     When I insert data into my Cassandra cluster, no node reaches a
>>>> bottleneck in CPU, memory, or disk; all three are fairly idle. So I think
>>>> there may be something wrong with my cluster configuration. Can somebody
>>>> please help me tune Cassandra? Thanks in advance!
>>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>>
>
>
>
>
>
