Hi Mike,

I only skimmed through the article, but I think the basic argument it makes is valid when a high number of VNodes is used in a large cluster. That is exactly why such a configuration is discouraged.
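To illustrate why the VNode count matters, here is a rough sketch that counts how many other nodes each node shares at least one replica set with. It is not how Cassandra actually allocates tokens: it assumes purely random tokens, a replication factor of 3, and a simple "next distinct nodes on the ring" placement without racks or datacenters. The function name `replica_peers` and its parameters are made up for this example.

```python
import random

def replica_peers(num_nodes, vnodes, rf=3, seed=0):
    """Average number of distinct other nodes that each node shares a replica set with.

    Assumptions (illustrative only): random tokens, and replicas placed on the
    next `rf` distinct nodes clockwise on the ring.
    """
    if num_nodes < rf:
        raise ValueError("need at least rf nodes")
    rng = random.Random(seed)
    # Every node gets `vnodes` random tokens; sorting by token gives the ring order.
    ring = sorted((rng.random(), node)
                  for node in range(num_nodes)
                  for _ in range(vnodes))
    owners = [node for _, node in ring]
    peers = [set() for _ in range(num_nodes)]
    for start in range(len(owners)):
        # Walk clockwise and collect the first `rf` distinct nodes.
        replicas = []
        pos = start
        while len(replicas) < rf:
            node = owners[pos % len(owners)]
            if node not in replicas:
                replicas.append(node)
            pos += 1
        # All nodes in one replica set are peers of each other.
        for a in replicas:
            peers[a].update(b for b in replicas if b != a)
    return sum(len(p) for p in peers) / num_nodes

if __name__ == "__main__":
    for v in (1, 4, 16, 256):
        print(f"{v:>3} vnodes: {replica_peers(100, v):.1f} peers per node on average")
```

The more peers a node has, the more second failures elsewhere in the cluster can combine with its own failure to take a replica set below quorum. Under these assumptions the peer count grows roughly with the number of VNodes until it saturates at "every other node in the cluster".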
Please refer to the detailed article at https://jolynch.github.io/pdf/cassandra-availability-virtual.pdf for more information. The general recommendation is to use no more than 4 VNodes, which should keep the chance of a concurrent failure rather low. For very large clusters, not using VNodes at all might also be an option, though it comes with some downsides.

Best regards,
Sebastian

> On 29.10.2024 at 16:04, Mike James <mike.ja...@clutch.com> wrote:
>
> https://martin.kleppmann.com/2017/01/26/data-loss-in-large-clusters.html
>
> Is this article based on any experimental data? What are the real-world stats
> on probability of data loss in large clusters? A discussion of this is taking
> place within the company, but I wanted to get real-world experiences.
>
> Thanks,
> Mike
>
>
> Disclaimer: This e-mail and any attachments may contain confidential
> information. If you are not the intended recipient, any disclosure, copying,
> distribution or use of any information contained herein is strictly
> prohibited. If you have received this transmission in error, please
> immediately notify the sender and destroy the original transmission and any
> attachments without reading or saving.