Thanks a lot, Jordan, Jeff, Abe, guo and Jon! This is really helpful.

On Wed, Oct 9, 2024 at 8:59 AM Jon Haddad <j...@rustyrazorblade.com> wrote:

> I've worked with a few hundred teams now, including the major ones that
> used single token (Apple, Netflix, Spotify), and pretty much all the rest
> used some form of vnodes.
>
> Jeff did a good job of summarizing the tradeoffs and I don't have anything
> to add.
>
> I would never, ever, recommend > 4 tokens.  There's just not a practical
> reason to do it.  I've heard claims of people wanting elastic clusters...
> can't recall too many times where I've needed to shrink Cassandra.  It's a
> really weird thing to keep in your back pocket when it makes daily
> operations so much worse.  Even in that case, you can remove, then add.
> It's not great, but you might do it like once over multiple years.  So I
> always say don't go over 4 because the tradeoff isn't practical.
>
> Jon
>
>
>
>
> On Tue, Oct 8, 2024 at 10:58 PM Jeff Jirsa <jji...@gmail.com> wrote:
>
>> You don’t have to double.
>>
>> You can add 1 node at a time - you just have to move every other token to
>> stay balanced
>>
>> Most people don’t write the tooling to do that, but it’s not that
>> complicated
>>
>> Calculate the token positions with N nodes
>> Calculate the token positions with N+1 nodes
>>
>> Bootstrap the new machine at whichever N+1 token is furthest from an
>> existing token
>> For each existing node:
>>     Run cleanup
>>     Move node to the new token
>>
>> Run cleanup again
>>
>> It’s involved but straightforward, online, and safe.
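
The token arithmetic in the steps above can be sketched as follows. This is a minimal illustration, not tooling from the thread. It assumes Murmur3Partitioner's token range of -2^63 to 2^63 - 1 and evenly spaced tokens. The function names (`balanced_tokens`, `pick_bootstrap_token`, `plan_expansion`) are hypothetical, and pairing existing nodes with their new tokens in ring order is an assumption, since the steps above do not say which node takes which token.

```python
# Sketch only: assumes Murmur3Partitioner's token range [-2**63, 2**63 - 1].
RING_MIN = -2**63
RING_SIZE = 2**64


def balanced_tokens(n):
    """Evenly spaced single-token positions for a ring of n nodes."""
    return [RING_MIN + (i * RING_SIZE) // n for i in range(n)]


def ring_distance(a, b):
    """Shortest distance between two tokens, allowing for wraparound."""
    d = abs(a - b)
    return min(d, RING_SIZE - d)


def pick_bootstrap_token(old_tokens, new_tokens):
    """The N+1 token that is furthest from every existing token."""
    return max(
        new_tokens,
        key=lambda t: min(ring_distance(t, o) for o in old_tokens),
    )


def plan_expansion(n):
    """Return (bootstrap_token, moves) for growing a ring from n to n+1 nodes.

    moves is a list of (current_token, target_token) pairs. Pairing in ring
    order is an assumption made for this sketch.
    """
    old = balanced_tokens(n)
    new = balanced_tokens(n + 1)
    bootstrap = pick_bootstrap_token(old, new)
    remaining = sorted(t for t in new if t != bootstrap)
    moves = [(o, t) for o, t in zip(sorted(old), remaining) if o != t]
    return bootstrap, moves


if __name__ == "__main__":
    bootstrap, moves = plan_expansion(3)
    print("bootstrap new node at token", bootstrap)
    for current, target in moves:
        print("move", current, "->", target)
```

Going from 3 to 4 nodes, this places the new node at token 0 and moves two of the three existing nodes. The node already at the minimum token stays where it is.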
>>
>> Because there’s only one token per node you can bootstrap/move in batches
>> (offset by 2x RF - so if you have 100 machines and RF=3, you can have 16
>> machines bootstrapping or moving at the same time). You can’t do that
>> safely with vnodes.
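
The batch size quoted above comes from integer division of the node count by 2 x RF. A small sketch of that arithmetic follows. The function names are hypothetical, and choosing every (2 x RF)-th ring position is just one way to honor the offset described above.

```python
def max_concurrent_operations(node_count, replication_factor):
    """How many nodes can bootstrap or move at once with a 2 x RF offset."""
    return node_count // (2 * replication_factor)


def batch_positions(node_count, replication_factor):
    """Ring positions (0-based, in token order) spaced 2 x RF apart.

    Only as many positions as fit are returned, so the gap between the last
    and first position across the wraparound is also at least 2 x RF.
    """
    step = 2 * replication_factor
    batch = max_concurrent_operations(node_count, replication_factor)
    return list(range(0, batch * step, step))


if __name__ == "__main__":
    print(max_concurrent_operations(100, 3))
    print(batch_positions(100, 3))
```

With 100 machines and RF=3 the offset is 6 positions, which gives 100 // 6 = 16 concurrent operations, matching the figure in the message.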
>>
>>
>> On Oct 9, 2024, at 12:51 AM, guo Maxwell <cclive1...@gmail.com> wrote:
>>
>> 
>> I think cost is a very important point if you are going to use *single
>> token* and your cluster will be very large, because every time the
>> cluster is expanded, the number of nodes needs to be doubled: 100 -> 200,
>> 200 -> 400 ... This is one of the reasons why we maintain many small
>> clusters.
>>
>> Of course, its availability will be better.
>>
>> On Wed, Oct 9, 2024 at 11:56, Abe Ratnofsky <a...@aber.io> wrote:
>>
>>> Here’s the best post I’m aware of:
>>> https://jolynch.github.io/pdf/cassandra-availability-virtual.pdf
>>>
>>> On Oct 7, 2024, at 17:30, Long Pan <panlong...@gmail.com> wrote:
>>>
>>> 
>>>
>>> Hi Cassandra Community,
>>>
>>> I’m currently exploring the use of *single vnode* (single token) per
>>> node in large-scale Cassandra deployments. I've come across discussions
>>> suggesting that some heavy users like Apple and Netflix have opted for this
>>> configuration to simplify operations and achieve more predictable
>>> performance.
>>>
>>> I’d like to ask if anyone could point me to *resources* (blog posts,
>>> conference talks, case studies or even personal experiences) that dive
>>> deeper into:
>>>
>>>    - The *rationale* behind using a single vnode instead of multiple
>>>    vnodes.
>>>    - The *operational benefits* and any potential trade-offs
>>>    encountered.
>>>
>>> Thank you in advance for your insights and any pointers you can provide!
>>>
>>> Best regards,
>>> Long
>>>
>>>
