Hello,

Recently I started noticing an interesting pattern. When I execute "removenode", a subset of the nodes that now own the tokens show a spike in CPU and disk activity, and sometimes the number of SSTables on those nodes shoots up.
After looking through the code, it appears to me that the function below forces data to be streamed from some of those new owners to the node on which "removenode" was run. Is my understanding correct?

https://github.com/apache/cassandra/blob/d384e781d6f7c028dbe88cfe9dd3e966e72cd046/src/java/org/apache/cassandra/service/StorageService.java#L2548

Our nodes don't run very hot, but this streaming appears to cause problems for them. If I understand the code correctly, the node that initiated removenode may still not receive all the data for the ranges that moved. So what is the rationale behind trying to build a "partial replica"? I may not be following this correctly, so I am hoping someone can explain.

Thanks!