Hi Jon,

Thanks for taking the time to read and reply to this proposal. Would encourage you to approach it from an attitude of seeking understanding on the part of the first-time CEP author, as this reply casts it off pretty quickly as NIH.

The proposal isn't mine, but I'll offer a few notes on where I see this as valuable:

– It's valuable for Cassandra to have an ecosystem-native mechanism of migrating data between physical/virtual instances outside the standard streaming path. As Hari mentions, the current ecosystem-native approach of executing repairs, decommissions, and bootstraps is time-consuming and cumbersome.

– An ecosystem-native solution is safer than a bunch of bash and rsync. Defining a safe protocol to migrate data between instances via rsync without downtime is surprisingly difficult, and even more so to do safely and repeatedly at scale. Enabling this process to be orchestrated by a control plane mechanizing official endpoints of the database and sidecar, rather than trying to move data around behind its back, is much safer than hoping one's cobbled together the right set of scripts to move data in a way that won't violate strong / transactional consistency guarantees. This complexity is kind of exemplified by the "Migrating One Instance" section of the doc and state machine diagram, which illustrates an approach to solving that problem.

– An ecosystem-native approach poses fewer security concerns than rsync. mTLS-authenticated endpoints in the sidecar for data movement eliminate the requirement for orchestration to occur via (typically) high-privilege SSH, which often either allows code execution of some form or requires complex efforts to scope the SSH privileges of particular users. They also eliminate the need to manage and secure rsyncd processes on each instance when SSH is not used.

– An ecosystem-native approach is more instrumentable and measurable than rsync. Support for data migration endpoints in the sidecar would allow for metrics reporting, stats collection, and alerting via mature and modern mechanisms rather than monitoring the output of a shell script.

I'll yield to Hari to share more, though today is a public holiday in India.

I do see this CEP as solving an important problem.

Thanks,

– Scott

On Apr
8, 2024, at 10:23 AM, Jon Haddad <j...@jonhaddad.com> wrote:

This seems like a lot of work to create an rsync alternative. I can't really say I see the point. I noticed your "rejected alternatives" mentions it with this note:

However, it might not be permitted by the administrator or available in various environments such as Kubernetes or virtual instances like EC2. Enabling data transfer through a sidecar facilitates smooth instance migration.

This feels more like NIH than solving a real problem, as what you've listed is a hypothetical, and one that's easily addressed.

Jon

On Fri, Apr 5, 2024 at 3:47 AM Venkata Hari
Krishna Nukala <n.v.harikrishna.apa...@gmail.com> wrote:

Hi all,

I have filed CEP-40 [1] for live migrating Cassandra instances using the Cassandra Sidecar.

When someone needs to move all or a portion of the Cassandra nodes belonging to a cluster to different hosts, the traditional approach of Cassandra node replacement can be time-consuming due to repairs and the bootstrapping of new nodes. Depending on the volume of the storage service load, replacements (repair + bootstrap) may take anywhere from a few hours to days.

Proposing a Sidecar based solution to address these challenges. This solution proposes transferring data from the old host (source) to the new host (destination) and then bringing up the Cassandra process at the destination, to enable fast instance migration. This approach would help to minimise node downtime, as it is based on a Sidecar solution for data transfer and avoids repairs and bootstrap.

Looking forward to the discussions.

[1] https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-40%3A+Data+Transfer+Using+Cassandra+Sidecar+for+Live+Migrating+Instances

Thanks!
Hari
