Very sorry, I found the reason for this issue.
Please ignore.

On Wed, Sep 28, 2016 at 10:14 PM, techpyaasa . <> wrote:

> @Paulo
> We have made the changes you suggested:
> net.ipv4.tcp_keepalive_time=60
> net.ipv4.tcp_keepalive_probes=3
> net.ipv4.tcp_keepalive_intvl=10
> We also increased streaming_socket_timeout_in_ms to 48 hours and set
> "phi_convict_threshold: 9".
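
For reference, a rough sketch of what the three kernel values quoted above imply. It assumes standard Linux keepalive semantics: the first probe is sent after tcp_keepalive_time seconds of idleness, further probes follow every tcp_keepalive_intvl seconds, and the connection is dropped after tcp_keepalive_probes unanswered probes. The function name is made up for illustration.

```python
def keepalive_detection_seconds(idle_time: int, probes: int, interval: int) -> int:
    """Approximate seconds until the kernel drops a dead, idle TCP connection."""
    return idle_time + probes * interval


# Values from the sysctl settings quoted above.
print(keepalive_detection_seconds(60, 3, 10))    # 90

# Linux defaults (7200 s idle, 9 probes, 75 s apart) for comparison.
print(keepalive_detection_seconds(7200, 9, 75))  # 7875
```

With these settings an idle streaming connection sees a probe after 60 seconds, which should keep it alive through a firewall whose idle timeout is longer than that.
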
> We then recommissioned the new data center (DC3) once again and ran
> "nodetool rebuild 'DC1'", but this time NO data was streamed and
> 'nodetool rebuild' exited without any exception.
> Please check the logs below:
> *INFO [RMI TCP Connection(10)] 2016-09-28 09:18:44,571
> (line 914) rebuild from dc: IDC*
> * INFO [RMI TCP Connection(10)] 2016-09-28 09:18:47,520
> (line 87) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Executing streaming plan for Rebuild*
> * INFO [RMI TCP Connection(10)] 2016-09-28 09:18:47,521
> (line 91) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with
> /*
> * INFO [RMI TCP Connection(10)] 2016-09-28 09:18:47,522
> (line 91) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with
> /*
> * INFO [StreamConnectionEstablisher:1] 2016-09-28 09:18:47,522
> (line 214) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to
> /*
> * INFO [RMI TCP Connection(10)] 2016-09-28 09:18:47,522
> (line 91) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with
> /*
> * INFO [StreamConnectionEstablisher:2] 2016-09-28 09:18:47,522
> (line 214) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to
> /*
> * INFO [StreamConnectionEstablisher:3] 2016-09-28 09:18:47,523
> (line 214) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to
> /*
> * INFO [RMI TCP Connection(10)] 2016-09-28 09:18:47,523
> (line 91) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with
> /*
> * INFO [RMI TCP Connection(10)] 2016-09-28 09:18:47,524
> (line 91) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with
> /*
> * INFO [StreamConnectionEstablisher:4] 2016-09-28 09:18:47,524
> (line 214) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to
> /*
> * INFO [StreamConnectionEstablisher:5] 2016-09-28 09:18:47,525
> (line 214) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to
> /*
> * INFO [RMI TCP Connection(10)] 2016-09-28 09:18:47,524
> (line 91) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with
> /*
> * INFO [RMI TCP Connection(10)] 2016-09-28 09:18:47,525
> (line 91) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with
> /*
> * INFO [StreamConnectionEstablisher:6] 2016-09-28 09:18:47,526
> (line 214) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to
> /*
> * INFO [StreamConnectionEstablisher:7] 2016-09-28 09:18:47,526
> (line 214) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to
> /*
> * INFO [RMI TCP Connection(10)] 2016-09-28 09:18:47,526
> (line 91) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with
> /*
> * INFO [RMI TCP Connection(10)] 2016-09-28 09:18:47,527
> (line 91) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Beginning stream session with
> /*
> * INFO [StreamConnectionEstablisher:8] 2016-09-28 09:18:47,527
> (line 214) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to
> /*
> * INFO [StreamConnectionEstablisher:9] 2016-09-28 09:18:47,528
> (line 214) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Starting streaming to
> /*
> * INFO [STREAM-IN-/] 2016-09-28 09:18:47,713
> (line 186) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Session with / is
> complete*
> * INFO [STREAM-IN-/] 2016-09-28 09:18:47,715
> (line 186) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Session with / is
> complete*
> * INFO [STREAM-IN-/] 2016-09-28 09:18:47,716
> (line 186) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Session with / is
> complete*
> * INFO [STREAM-IN-/] 2016-09-28 09:18:47,716
> (line 186) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Session with / is
> complete*
> * INFO [STREAM-IN-/] 2016-09-28 09:18:47,715
> (line 186) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Session with / is
> complete*
> * INFO [STREAM-IN-/] 2016-09-28 09:18:47,715
> (line 186) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Session with / is
> complete*
> * INFO [STREAM-IN-/] 2016-09-28 09:18:47,715
> (line 186) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Session with / is
> complete*
> * INFO [STREAM-IN-/] 2016-09-28 09:18:47,715
> (line 186) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Session with / is
> complete*
> * INFO [STREAM-IN-/] 2016-09-28 09:18:47,776
> (line 186) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] Session with / is
> complete*
> * INFO [STREAM-IN-/] 2016-09-28 09:18:47,778
> (line 220) [Stream
> #3a47f8d0-8597-11e6-bd17-3f6744d54a01] All sessions completed*
> As you can see from the logs above, nodetool rebuild finished without any
> data being streamed, and all streaming sessions completed in NO TIME (see
> the timestamps in the logs).
> Also, "nodetool status" looks fine from these new nodes (from
> which I ran 'nodetool rebuild').
> Please let us know what could be the issue here.
> Thanks in advance.
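
To put a number on "no time", here is a small sketch that measures the gap between the first and last timestamps in the log excerpt above. The helper name is illustrative, and the format string assumes the "date time,millis" layout seen in those log lines.

```python
from datetime import datetime

# Assumed timestamp layout, e.g. "2016-09-28 09:18:44,571".
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two log timestamps in the layout above."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return (t1 - t0).total_seconds()


# "rebuild from dc" line versus "All sessions completed" line.
print(elapsed_seconds("2016-09-28 09:18:44,571", "2016-09-28 09:18:47,778"))
```

The whole rebuild took a little over three seconds, which is consistent with no data having been transferred.
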
> On Wed, Sep 28, 2016 at 1:04 AM, Paulo Motta <>
> wrote:
>> Yeah this is likely to be caused by idle connections being shut down, so
>> you may need to update your tcp_keepalive* and/or network/firewall settings.
>> 2016-09-27 15:29 GMT-03:00 laxmikanth sadula <>:
>>> Hi Paulo,
>>> Thanks for the reply...
>>> I'm getting the following streaming exceptions during nodetool rebuild in
>>> c*-2.0.17:
>>> *04:24:49,759 (line 461) [Stream
>>> #5e1b7f40-8496-11e6-8847-1b88665e430d] Streaming error occurred*
>>> * Connection timed out*
>>> *    at Method)*
>>> *    at*
>>> *    at*
>>> *    at*
>>> *    at*
>>> *    at
>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(*
>>> *    at
>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(*
>>> *    at
>>> org.apache.cassandra.streaming.ConnectionHandler$*
>>> *    at*
>>> *DEBUG [STREAM-OUT-/] 2016-09-27 04:24:49,764
>>> (line 104) [Stream
>>> #5e1b7f40-8496-11e6-8847-1b88665e430d] Closing stream connection handler on
>>> /*
>>> * INFO [STREAM-OUT-/] 2016-09-27 04:24:49,764
>>> (line 186) [Stream
>>> #5e1b7f40-8496-11e6-8847-1b88665e430d] Session with / is
>>> complete*
>>> *ERROR [STREAM-OUT-/] 2016-09-27 04:24:49,764
>>> (line 461) [Stream
>>> #5e1b7f40-8496-11e6-8847-1b88665e430d] Streaming error occurred*
>>> * Broken pipe*
>>> *    at Method)*
>>> *    at*
>>> *    at*
>>> *    at*
>>> *    at*
>>> *    at
>>> org.apache.cassandra.streaming.messages.StreamMessage.serialize(*
>>> *    at
>>> org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(*
>>> *    at
>>> org.apache.cassandra.streaming.ConnectionHandler$*
>>> *    at*
>>> *DEBUG [STREAM-IN-/] 2016-09-27 04:24:49,909
>>> (line 244) [Stream
>>> #5e1b7f40-8496-11e6-8847-1b88665e430d] Received File (Header (cfId:
>>> 68af9ee0-96f8-3b1d-a418-e5ae844f2cc2, #3, version: jb, estimated keys:
>>> 4736, transfer size: 2306880, compressed?: true), file:
>>> /home/cassandra/data_directories/data/keyspace_name1/archiving_metadata/keyspace_name1-archiving_metadata-tmp-jb-27-Data.db)*
>>> *ERROR [STREAM-IN-/] 2016-09-27 04:24:49,909
>>> (line 461) [Stream
>>> #5e1b7f40-8496-11e6-8847-1b88665e430d] Streaming error occurred*
>>> *java.lang.RuntimeException: Outgoing stream handler has been closed*
>>> *    at
>>> org.apache.cassandra.streaming.ConnectionHandler.sendMessage(*
>>> *    at
>>> org.apache.cassandra.streaming.StreamSession.receive(*
>>> *    at
>>> org.apache.cassandra.streaming.StreamSession.messageReceived(*
>>> *    at
>>> org.apache.cassandra.streaming.ConnectionHandler$*
>>> *    at*
>>> On Sep 27, 2016 11:48 PM, "Paulo Motta" <>
>>> wrote:
>>>> What type of streaming timeout are you getting? Do you have a stack
>>>> trace? What version are you on?
>>>> See more information about tuning tcp_keepalive* here:
>>>> shooting/trblshootIdleFirewall.html
>>>> 2016-09-27 14:07 GMT-03:00 laxmikanth sadula <>:
>>>>> @Paulo Motta
>>>>> We are also facing streaming timeout exceptions during 'nodetool
>>>>> rebuild'. I set streaming_socket_timeout_in_ms to 86400000 (24 hours) as
>>>>> suggested in the DataStax blog -
>>>>> c/en-us/articles/206502913-FAQ-How-to-reduce-the-impact-of-streaming-errors-or-failures ,
>>>>> but we are still getting streaming exceptions.
>>>>> Also, what are the recommended kernel tcp_keepalive settings/values
>>>>> that would help streaming succeed?
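
As a sanity check on the units, since streaming_socket_timeout_in_ms is given in milliseconds: a tiny sketch converting hours to the value written in the configuration. The helper name is hypothetical.

```python
MS_PER_HOUR = 60 * 60 * 1000


def hours_to_ms(hours: int) -> int:
    """Convert a timeout in hours to milliseconds."""
    return hours * MS_PER_HOUR


print(hours_to_ms(24))  # 86400000
print(hours_to_ms(48))  # 172800000
```
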
>>>>> Thank you
>>>>> On Tue, Aug 16, 2016 at 12:21 AM, Paulo Motta <
>>>>>> wrote:
>>>>>> What version are you on? This seems like a typical case where there
>>>>>> was a problem with streaming (hanging, etc.). Do you have access to the
>>>>>> logs? Maybe look for streaming errors? Typically streaming errors are
>>>>>> related to timeouts, so you should review your cassandra
>>>>>> streaming_socket_timeout_in_ms and kernel tcp_keepalive settings.
>>>>>> If you're on 2.2+ you can resume a failed bootstrap with nodetool
>>>>>> bootstrap resume. There were also some streaming hanging problems fixed
>>>>>> recently, so I'd advise you to upgrade to the latest version of your
>>>>>> particular series for a more robust version.
>>>>>> Is there any reason why you didn't use the replace procedure
>>>>>> (-Dreplace_address) to replace the node with the same tokens? This would
>>>>>> be a bit faster than the remove + bootstrap procedure.
>>>>>> 2016-08-15 15:37 GMT-03:00 Jérôme Mainaud <>:
>>>>>>> Hello,
>>>>>>> A client of mine has problems when adding a node to the cluster.
>>>>>>> After 4 days, the node is still in joining mode, it doesn't have the
>>>>>>> same level of load as the others, and there seems to be no streaming
>>>>>>> from or to the new node.
>>>>>>> This node has a history.
>>>>>>>    1. At the beginning, it was a seed in the cluster.
>>>>>>>    2. Ops detected that the client had problems with it.
>>>>>>>    3. They tried to reset it but failed. In the process they
>>>>>>>    launched several repair and rebuild processes on the node.
>>>>>>>    4. Then they asked me to help them.
>>>>>>>    5. We stopped the node,
>>>>>>>    6. removed it from the list of seeds (more precisely it was
>>>>>>>    replaced by another node),
>>>>>>>    7. removed it from the cluster (I chose not to use decommission
>>>>>>>    since node data was compromised),
>>>>>>>    8. deleted all files from data, commitlog and savedcache
>>>>>>>    directories.
>>>>>>>    9. After the leaving process ended, it was started as a fresh
>>>>>>>    new node and began autobootstrap.
>>>>>>> As I don’t have direct access to the cluster I don't have a lot of
>>>>>>> information, but I will have more tomorrow (logs and results of some
>>>>>>> commands).
>>>>>>> And I can ask people for any required information.
>>>>>>> Does someone have any idea of what could have happened and what I
>>>>>>> should investigate first?
>>>>>>> What would you do to unblock the situation?
>>>>>>> Context: The cluster consists of two DCs, each with 15 nodes. Average
>>>>>>> load is around 3 TB per node. The joining node froze a little after 2
>>>>>>> TB.
>>>>>>> Thank you for your help.
>>>>>>> Cheers,
>>>>>>> --
>>>>>>> Jérôme Mainaud
>>>>> --
>>>>> Regards,
>>>>> Laxmikanth
>>>>> 99621 38051
