Never mind. I figured out that this is happening on all nodes where the tokens got moved, which explains the heavy streaming going on in the cluster.
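To make that conclusion concrete, here is a minimal sketch of why several nodes stream at once after a removal. It is a toy model, not Cassandra's code: it assumes one token per node and SimpleStrategy-like placement (the owner of a token plus the next RF-1 nodes clockwise). The names `replicas`, `streams_after_removal`, and `RING` are hypothetical and exist only for illustration. Note that the node invoking removenode does not appear anywhere in the calculation.

```python
from bisect import bisect_left


def replicas(ring, token, rf):
    """Return the nodes replicating the range that ends at `token`.

    `ring` is a list of (token, node) pairs sorted by token. Assumed
    placement: the first node at or after `token`, then the next
    rf - 1 nodes clockwise (a simplification of SimpleStrategy).
    """
    tokens = [t for t, _ in ring]
    start = bisect_left(tokens, token) % len(ring)  # wrap around the ring
    count = min(rf, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


def streams_after_removal(ring, removed, rf):
    """Map each affected range (keyed by its end token) to the nodes
    that newly become replicas once `removed` leaves the ring."""
    survivors = [(t, n) for t, n in ring if n != removed]
    plan = {}
    for token, _ in ring:
        before = replicas(ring, token, rf)
        if removed not in before:
            continue  # this range never lived on the removed node
        after = replicas(survivors, token, rf)
        gained = [n for n in after if n not in before]
        if gained:
            plan[token] = gained
    return plan


# Hypothetical five-node ring with RF = 3; node C is removed.
RING = [(0, "A"), (25, "B"), (50, "C"), (75, "D"), (100, "E")]
plan = streams_after_removal(RING, "C", rf=3)
print(plan)  # {0: ['D'], 25: ['E'], 50: ['A']}
```

In this toy ring, three different nodes (D, E, and A) each pick up one range to restore the replica count, so the streaming load lands on every node that gained a range rather than on a single node.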
-----Original Message-----
From: Brandon Williams [mailto:dri...@gmail.com]
Sent: Wednesday, February 22, 2017 10:53 AM
To: dev@cassandra.apache.org
Subject: Re: RemoveNode Behavior Question

The node that invoked removenode is entirely irrelevant; any node can invoke it.

On Wed, Feb 22, 2017 at 12:51 PM, Anubhav Kale <anubhav.k...@microsoft.com.invalid> wrote:

> But I don't understand how the replica count is getting restored here.
> The node that invoked removenode only owns partial ranges.
>
> -----Original Message-----
> From: Brandon Williams [mailto:dri...@gmail.com]
> Sent: Wednesday, February 22, 2017 10:49 AM
> To: dev@cassandra.apache.org
> Subject: Re: RemoveNode Behavior Question
>
> Every topology operation tries to respect/restore the RF except for
> assassinate.
>
> On Wed, Feb 22, 2017 at 12:45 PM, Anubhav Kale <anubhav.k...@microsoft.com.invalid> wrote:
>
> > Hello,
> >
> > Recently, I started noticing an interesting pattern. When I execute
> > "removenode", a subset of the nodes that now own the tokens show a
> > CPU spike and disk activity, and sometimes the SSTable count on
> > those nodes shoots up.
> >
> > After looking through the code, it appears to me that the function
> > below forces data to be streamed from some of the new nodes to the
> > node from which "removenode" was invoked. Is my understanding correct?
> >
> > https://github.com/apache/cassandra/blob/d384e781d6f7c028dbe88cfe9dd3e966e72cd046/src/java/org/apache/cassandra/service/StorageService.java#L2548
> >
> > Our nodes don't run very hot, but it appears this streaming causes
> > them to have issues. If I understand the code correctly, the node
> > that initiated removenode may still not get all the data for the
> > ranges that moved over. So, what is the rationale behind trying to
> > build a "partial replica"?
> >
> > Maybe I am not following this correctly, so I am hoping someone can
> > explain.
> >
> > Thanks!