CMS is fine at 12G for sure, likely up to 16G

You’ll want to initiate CMS a bit earlier (55-69% occupancy), and you’ll likely 
want the new gen to be larger - perhaps 3-6G
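
For example, in cassandra-env.sh on 2.1 that could look roughly like the 
following - a sketch only, the exact numbers are illustrative rather than a 
recommendation for your hardware:

    # larger heap and new gen while bootstrapping
    MAX_HEAP_SIZE="16G"
    HEAP_NEWSIZE="4G"
    # start CMS earlier than the stock 75% occupancy setting
    JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=60"
    JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"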

You’ll want to set the memtable size manually - by default it scales with the heap size
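
In cassandra.yaml that’s roughly the following (the 2048 is only an example 
value, not a sizing recommendation):

    # pin memtable sizes instead of letting them default to 1/4 of the heap
    memtable_heap_space_in_mb: 2048
    memtable_offheap_space_in_mb: 2048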

After bootstrap you can lower it again


-- 
Jeff Jirsa


> On Aug 29, 2018, at 10:52 PM, Jai Bheemsen Rao Dhanwada 
> <jaibheem...@gmail.com> wrote:
> 
> I have 72 nodes in the cluster, across 8 datacenters. The moment I try to 
> increase the node count above 84 or so, the issue starts.
> 
> I am still using the CMS heap, assuming it will do more harm than good if I 
> increase the heap size beyond the recommended 8G.
> 
>> On Wed, Aug 29, 2018 at 6:53 PM Jeff Jirsa <jji...@gmail.com> wrote:
>> Given the size of your schema, you’re probably getting flooded with a bunch 
>> of huge schema mutations as the new node joins gossip and tries to pull the 
>> schema from every host it sees. You say 8 DCs, but you don’t say how many 
>> nodes - I’m guessing it’s a lot? 
>> 
>> This is something that’s incrementally better in 3.0, but a real, proper fix 
>> has been discussed a few times - see 
>> https://issues.apache.org/jira/browse/CASSANDRA-11748 and 
>> https://issues.apache.org/jira/browse/CASSANDRA-13569, for example. 
>> 
>> In the short term, you may be able to work around this by increasing your 
>> heap size. If that doesn’t work, there’s an ugly hack that will work on 2.1: 
>> limit the number of schema blobs the node can receive at a time. In this 
>> case, that means firewalling off all but a few nodes in your cluster for 
>> 10-30 seconds, making sure the new node gets the schema (watch the logs or 
>> the file system for the tables to be created), and then removing the 
>> firewall so it can start the bootstrap process. (It needs the schema to set 
>> up the streaming plan, and it needs all the hosts up in gossip to stream 
>> successfully, so this is just an ugly hack to give you time to get the 
>> schema and then heal the cluster so it can bootstrap.)
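>> 
>> Roughly, with iptables (a sketch only - the subnet is a placeholder, and use 
>> port 7001 instead of 7000 if you run internode SSL):
>> 
>>     # on the joining node: temporarily block internode traffic to most peers
>>     iptables -A OUTPUT -p tcp --dport 7000 -d 10.0.1.0/24 -j DROP
>>     # once the schema shows up (tables in the logs / data dirs), undo it so
>>     # gossip sees every host and the bootstrap can stream
>>     iptables -D OUTPUT -p tcp --dport 7000 -d 10.0.1.0/24 -j DROP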
>> 
>> Yeah, that’s awful. Hopefully one of the two JIRAs above lands to make this 
>> less awful. 
>> 
>> 
>> 
>> -- 
>> Jeff Jirsa
>> 
>> 
>>> On Aug 29, 2018, at 6:29 PM, Jai Bheemsen Rao Dhanwada 
>>> <jaibheem...@gmail.com> wrote:
>>> 
>>> It fails before bootstrap
>>> 
>>> Streaming throughput on the nodes is set to 400 Mb/s.
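>>> 
>>> (For reference, that cap is the per-node outbound limit - the values below 
>>> are just the same 400 Mb/s expressed both ways:)
>>> 
>>>     # set at runtime, per node
>>>     nodetool setstreamthroughput 400
>>> 
>>>     # cassandra.yaml equivalent
>>>     stream_throughput_outbound_megabits_per_sec: 400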
>>> 
>>>> On Wednesday, August 29, 2018, Jeff Jirsa <jji...@gmail.com> wrote:
>>>> Is the bootstrap plan succeeding (does streaming start or does it crash 
>>>> before it logs messages about streaming starting)?
>>>> 
>>>> Have you capped the stream throughput on the existing hosts? 
>>>> 
>>>> -- 
>>>> Jeff Jirsa
>>>> 
>>>> 
>>>>> On Aug 29, 2018, at 5:02 PM, Jai Bheemsen Rao Dhanwada 
>>>>> <jaibheem...@gmail.com> wrote:
>>>>> 
>>>>> Hello All,
>>>>> 
>>>>> We are seeing an issue when we add more nodes to the cluster: the new 
>>>>> node is not able to stream all of the metadata and fails to bootstrap. 
>>>>> Eventually the process dies with an OOM (java.lang.OutOfMemoryError: 
>>>>> Java heap space).
>>>>> 
>>>>> But if I remove a few nodes from the cluster, we don't see this issue.
>>>>> 
>>>>> Cassandra Version: 2.1.16
>>>>> # of KS and CF: 100, 3000 (approx)
>>>>> # of DCs: 8
>>>>> # of vnodes per node: 256
>>>>> 
>>>>> Not sure what is causing this behavior; has anyone come across this 
>>>>> scenario? 
>>>>> Thanks in advance.
