The error message indicates that the stream session failed on the source side. If the load on the source side is heavy, consider reducing it, e.g. by stopping any running repair. If every attempt fails at the same file, it is also worth checking that file. Hope it helps.
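As a rough sketch of "reduce source-side load" (example values only, and only echoing the commands rather than running them, since the right throttle depends on your environment), the idea on each source node would be something like:

```shell
# Hedged sketch, not a verified procedure: throttle streaming and stop
# repair validation compactions on the loaded source nodes before
# retrying nodetool rebuild. Values are examples from this thread.
THROTTLE_MBITS=200   # back toward the 200 Mb/s default mentioned below
CMDS="nodetool setstreamthroughput ${THROTTLE_MBITS}
nodetool setinterdcstreamthroughput ${THROTTLE_MBITS}
nodetool stop VALIDATION"
# Print the plan; in a real run you would execute each line instead.
echo "$CMDS"
```

`nodetool stop VALIDATION` halts repair validation compactions; whether that is enough to relieve the source nodes depends on what else is running there.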
On Wed, Oct 6, 2021 at 11:20 PM MyWorld <timeplus.1...@gmail.com> wrote:

> Hi Jim,
> It's 600 Megabits, not MegaBytes, so around 600/8 = 75 MB/s. Also,
> streaming happens from any 3 nodes at a time.
> However, we have also tried the default streaming throughput, which is
> 200 Megabits per sec (25 MB/s), but we still hit the same issue. Heap is
> set to 8 GB on GCP and seems pretty much normal.
>
> On Thu, Oct 7, 2021, 4:05 AM Jim Shaw <jxys...@gmail.com> wrote:
>
>> I met a similar issue before. What I did was reduce the heap size for
>> the rebuild and reduce streamthroughput.
>> But it depends on the version and your environment, so it may not apply
>> to your case; I just hope it is helpful.
>>
>> With ps -ef | grep you will see a new java process for the rebuild;
>> check what memory size it uses. If it uses the default, it may use too
>> much; export MAX_HEAP_SIZE before running nodetool rebuild and it will
>> limit the heap size.
>>
>> With streamthroughput=600MB/s, if you look via nodetool, at the OS-level
>> files, or in the log, you will see it pull files from all nodes --- that
>> is 5 in your case, so it will be 3 GB/s, which the on-premise side may
>> not handle due to firewall settings.
>>
>> Regards,
>> Jim
>>
>> On Tue, Oct 5, 2021 at 8:43 AM MyWorld <timeplus.1...@gmail.com> wrote:
>>
>>> Logged "nodetool failuredetector" every 5 sec. It doesn't seem to be
>>> an issue with the phi_convict_threshold value.
>>>
>>> On Tue, Oct 5, 2021 at 4:35 PM Surbhi Gupta <surbhi.gupt...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Try to adjust phi_convict_threshold and see if that helps.
>>>> When we migrated from on-prem to AWS, this was one of the factors to
>>>> consider.
>>>>
>>>> Thanks
>>>>
>>>> On Tue, Oct 5, 2021 at 4:00 AM MyWorld <timeplus.1...@gmail.com> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> We need urgent help.
>>>>> We have one physical data center of 5 nodes with 1 TB of data on
>>>>> each (location: Dallas). We are currently using Cassandra ver 3.0.9.
>>>>> Now we are adding one more data center of 5 nodes (location: GCP-US)
>>>>> and have joined it to the existing one.
>>>>>
>>>>> While running the nodetool rebuild command, we are getting the
>>>>> following error:
>>>>> On the GCP node (where we ran the rebuild command):
>>>>>
>>>>>> ERROR [STREAM-IN-/192.x.x.x] 2021-10-05 15:56:52,246
>>>>>> StreamSession.java:639 - [Stream #66646d30-25a2-11ec-903b-774f88efe725]
>>>>>> Remote peer 192.x.x.x failed stream session.
>>>>>> INFO [STREAM-IN-/192.x.x.x] 2021-10-05 15:56:52,266
>>>>>> StreamResultFuture.java:183 - [Stream
>>>>>> #66646d30-25a2-11ec-903b-774f88efe725] Session with /192.x.x.x is
>>>>>> complete
>>>>>
>>>>> On the DL (Dallas) source node:
>>>>>
>>>>>> INFO [STREAM-IN-/34.x.x.x] 2021-10-05 15:55:53,785
>>>>>> StreamResultFuture.java:183 - [Stream
>>>>>> #66646d30-25a2-11ec-903b-774f88efe725] Session with /34.x.x.x is complete
>>>>>> ERROR [STREAM-OUT-/34.x.x.x] 2021-10-05 15:55:53,785
>>>>>> StreamSession.java:534 - [Stream #66646d30-25a2-11ec-903b-774f88efe725]
>>>>>> Streaming error occurred
>>>>>> java.lang.RuntimeException: Transfer of file
>>>>>> /var/lib/cassandra/data/clickstream/glusr_usr_paid_url_mv-3c49c392b35511e9bd0a8f42dfb09617/mc-45676-big-Data.db
>>>>>> already completed or aborted (perhaps session failed?).
>>>>>> at org.apache.cassandra.streaming.messages.OutgoingFileMessage.startTransfer(OutgoingFileMessage.java:120)
>>>>>> ~[apache-cassandra-3.0.9.jar:3.0.9]
>>>>>> at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:50)
>>>>>> ~[apache-cassandra-3.0.9.jar:3.0.9]
>>>>>> at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:42)
>>>>>> ~[apache-cassandra-3.0.9.jar:3.0.9]
>>>>>> at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:48)
>>>>>> ~[apache-cassandra-3.0.9.jar:3.0.9]
>>>>>> at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:387)
>>>>>> ~[apache-cassandra-3.0.9.jar:3.0.9]
>>>>>> at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:367)
>>>>>> ~[apache-cassandra-3.0.9.jar:3.0.9]
>>>>>> at java.lang.Thread.run(Thread.java:748) [na:1.8.0_192]
>>>>>> WARN [STREAM-IN-/34.x.x.x] 2021-10-05 15:55:53,786
>>>>>> StreamResultFuture.java:210 - [Stream
>>>>>> #66646d30-25a2-11ec-903b-774f88efe725] Stream failed
>>>>>
>>>>> Before starting this rebuild, we made the following changes:
>>>>> 1. Set setstreamthroughput to 600 Mb/sec
>>>>> 2. Set setinterdcstreamthroughput to 600 Mb/sec
>>>>> 3. Set streaming_socket_timeout_in_ms to 24 hrs
>>>>> 4. Disabled autocompaction on the GCP node, as it was heavily
>>>>> utilising CPU resources
>>>>>
>>>>> FYI, the GCP rebuild process starts with data streaming from 3 nodes,
>>>>> and all of them fail one by one after streaming for a few hours.
>>>>> Please help us work out how to correct this issue.
>>>>> Is there any other way to rebuild such a large data set?
>>>>> We have a few tables with 200-400 GB of data and some smaller tables.
>>>>> Also, we have materialized views (MViews) in our environment.
>>>>>
>>>>> Regards,
>>>>> Ashish Gupta
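For reference, the bandwidth math debated in the thread works out as below (a quick sanity check in shell arithmetic, assuming the 3 concurrent source streams described above; the 3 GB/s figure earlier in the thread assumed 600 MegaBytes/s from all 5 nodes, which the 600 Megabits clarification corrects):

```shell
# Sanity check of the throughput numbers from the thread (integer MB/s).
STREAM_MBITS=600                      # setstreamthroughput value, in megabits/s
PER_NODE_MBPS=$((STREAM_MBITS / 8))   # 600 / 8 = 75 MB/s per source node
SOURCES=3                             # rebuild streams from 3 nodes at a time
AGGREGATE_MBPS=$((PER_NODE_MBPS * SOURCES))   # 225 MB/s into the new node
echo "${PER_NODE_MBPS} MB/s per node, ${AGGREGATE_MBPS} MB/s aggregate"
```

So even at the raised throttle, the new GCP node receives roughly 225 MB/s, not 3 GB/s; whether the inter-DC link and firewall sustain that for hours is the thing to verify.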