Given the size of your schema, the joining node is probably getting flooded with a bunch of huge schema mutations as it comes up in gossip and tries to pull the schema from every host it sees. You say 8 DCs, but you don't say how many nodes - I'm guessing it's a lot?
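If you want to put numbers on this, a quick sanity check from an existing node (standard nodetool/cqlsh invocations; assuming default ports, add auth flags if you need them):

    # live nodes in the cluster
    nodetool status | grep -c '^UN'

    # tables in the schema - 2.1 keeps the schema in the system keyspace
    cqlsh -e "SELECT count(*) FROM system.schema_columnfamilies;"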
This is something that's incrementally better in 3.0, but a real, proper fix has been talked about a few times - see https://issues.apache.org/jira/browse/CASSANDRA-11748 and https://issues.apache.org/jira/browse/CASSANDRA-13569, for example.

In the short term, you may be able to work around this by increasing the heap size on the joining node. If that doesn't work, there's an ugly, ugly hack that'll work on 2.1: limit the number of schema blobs the node can pull at a time. In this case, that means firewalling off all but a few nodes in your cluster for 10-30 seconds, making sure the new node gets the schema (watch the logs or the filesystem for the tables to be created), then removing the firewall so it can start the bootstrap process. It needs the schema to set up the streaming plan, and it needs all the hosts up in gossip to stream successfully, so the hack just buys you time to get the schema and then heal the cluster so it can bootstrap.

Yeah, that's awful. Hopefully one of the two JIRAs above lands to make this less awful. Rough sketches of both workarounds are at the bottom of this mail, below the quoted thread.

--
Jeff Jirsa


> On Aug 29, 2018, at 6:29 PM, Jai Bheemsen Rao Dhanwada
> <jaibheem...@gmail.com> wrote:
>
> It fails before bootstrap.
>
> Streaming throughput on the nodes is set to 400 Mb/s.
>
>> On Wednesday, August 29, 2018, Jeff Jirsa <jji...@gmail.com> wrote:
>> Is the bootstrap plan succeeding (does streaming start, or does it crash
>> before it logs messages about streaming starting)?
>>
>> Have you capped the stream throughput on the existing hosts?
>>
>> --
>> Jeff Jirsa
>>
>>
>>> On Aug 29, 2018, at 5:02 PM, Jai Bheemsen Rao Dhanwada
>>> <jaibheem...@gmail.com> wrote:
>>>
>>> Hello All,
>>>
>>> We are seeing an issue when we add more nodes to the cluster: the new
>>> node is not able to stream the entire metadata and fails to bootstrap.
>>> Eventually the process dies with an OOM (java.lang.OutOfMemoryError:
>>> Java heap space).
>>>
>>> But if I remove a few nodes from the cluster, we don't see this issue.
>>>
>>> Cassandra version: 2.1.16
>>> # of KS and CF: 100 and 3000 (approx.)
>>> # of DCs: 8
>>> # of vnodes per node: 256
>>>
>>> Not sure what is causing this behavior - has anyone come across this
>>> scenario? Thanks in advance.
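P.S. Rough sketches of both workarounds, since the firewall one is fiddly. These are illustrations, not exact recipes: the heap sizes are made-up examples, the seed addresses are placeholders, and I'm assuming iptables and the default storage/gossip port of 7000.

Heap bump, in cassandra-env.sh on the joining node:

    # example values only - size these for your hardware
    MAX_HEAP_SIZE="16G"
    HEAP_NEWSIZE="1600M"

The firewall hack, run on the joining node before starting Cassandra:

    # let the new node talk gossip/schema to just two existing nodes
    # (placeholder addresses), drop everything else on the storage port
    iptables -A OUTPUT -p tcp --dport 7000 -d 10.0.0.1 -j ACCEPT
    iptables -A OUTPUT -p tcp --dport 7000 -d 10.0.0.2 -j ACCEPT
    iptables -A OUTPUT -p tcp --dport 7000 -j DROP

    # start cassandra, wait until the schema tables appear on disk / in
    # the logs, then remove the rules so every host is visible again and
    # the bootstrap streaming plan can be built
    iptables -D OUTPUT -p tcp --dport 7000 -j DROP
    iptables -D OUTPUT -p tcp --dport 7000 -d 10.0.0.1 -j ACCEPT
    iptables -D OUTPUT -p tcp --dport 7000 -d 10.0.0.2 -j ACCEPT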