These are the salient points here for me, yes:

> My understanding from the proposal is that Sidecar would be able to migrate from a Cassandra instance that is already dead and cannot recover.

> That’s one thing I like about having it an external process - not that it’s bullet proof but it’s one less thing to worry about.

The manual/rsync version of the state machine Hari describes in the CEP is one of the best escape hatches for migrating an instance that’s overstressed, limping on ailing hardware, or that has exhausted disk. If the system is functional but the C* process is in bad shape, it’s great to have a paved-path flow for migrating the instance and data to more capable hardware.

I also agree in principle that “streaming should be just as fast via the C* process itself.” This hits a couple snags today:

- This option isn’t available when the C* instance is struggling.
- In the scenario of replacing an entire cluster’s hardware with new machines, applying this process to an entire cluster via host replacements of all instances (which also requires repairs) or by doubling then halving capacity is incredibly cumbersome and operationally-impacting to the database’s users - especially if the DB is already having a hard time.
- The host replacement process also puts a lot of stress on gossip and is a great way to encounter all sorts of painful races if you perform it hundreds or thousands of times (but shouldn’t be a problem in TCM-world).

So I think I agree with both points:

- Cassandra should be able to do this itself.
- It is also valuable to have a paved path implementation of a safe migration/forklift state machine when you’re in a bind, or need to do this hundreds or thousands of times.

On zero copy: what really makes ZCS fast compared to legacy streaming is that the JVM is able to ship entire files around, rather than deserializing SSTables and reserializing them to stream each individual row. That’s the slow and expensive part.
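To make the whole-file vs. per-row contrast concrete, here is a toy sketch in Python. It is not Cassandra code and says nothing about how ZCS is implemented: the "key=value" file format and the function names (send_whole_file, send_row_by_row, transfer, demo) are made up for illustration. The first sender hands the file to the socket as-is (Python's socket.sendfile uses os.sendfile where the platform supports it), while the second parses and re-encodes every record before sending, standing in for the legacy path.

```python
import os
import socket
import tempfile
import threading


def send_whole_file(sock: socket.socket, path: str) -> int:
    """Ship the file's bytes unchanged, without parsing any records."""
    with open(path, "rb") as f:
        # socket.sendfile uses os.sendfile when available, else falls back to send().
        return sock.sendfile(f)


def send_row_by_row(sock: socket.socket, path: str) -> int:
    """Stand-in for legacy streaming: parse each record, re-encode it, then send."""
    sent = 0
    with open(path, "rb") as f:
        for line in f:
            key, _, value = line.rstrip(b"\n").partition(b"=")  # "deserialize"
            payload = key + b"=" + value + b"\n"                # "reserialize"
            sock.sendall(payload)
            sent += len(payload)
    return sent


def transfer(sender, path: str) -> bytes:
    """Run `sender` over a local socket pair and return what the peer received."""
    a, b = socket.socketpair()
    received = bytearray()

    def drain() -> None:
        # Read until the sending side closes its end of the pair.
        while chunk := b.recv(65536):
            received.extend(chunk)

    reader = threading.Thread(target=drain)
    reader.start()
    try:
        sender(a, path)
    finally:
        a.close()
    reader.join()
    b.close()
    return bytes(received)


def demo() -> tuple[bytes, bytes, bytes]:
    """Write a small hypothetical data file and send it both ways."""
    data = b"".join(b"key%d=value%d\n" % (i, i) for i in range(1000))
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
    try:
        whole = transfer(send_whole_file, tmp.name)
        rows = transfer(send_row_by_row, tmp.name)
        return data, whole, rows
    finally:
        os.unlink(tmp.name)
```

Both senders deliver the same bytes to the peer; the difference is only in how much work the sending process does per record along the way.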
It’s true that TLS means you incur an extra memcpy as that stream is encrypted before it’s chunked into packets, but the cost of that memcpy for encryption pales in comparison to how slow deserializing/reserializing SSTables is/was. ZCS with TLS can push 20Gbps+ today on decent but not extravagant Xeon hardware.

In-kernel TLS would also still encounter a memcpy in the encryption path; the kernel.org doc alludes to this via “the kernel will need to allocate a buffer for the encrypted data.” But it would allow using sendfile and cut a copy in userspace. If someone is interested in testing it out I’d love to learn what they find. It’s always a great surprise to learn there’s more perf left on the table. This comparison looks promising: https://tinselcity.github.io/SSL_Sendfile/

– Scott (mobile)

On Apr 19, 2024, at 11:31 AM, Jordan West <jorda...@gmail.com> wrote: