Something additional to consider (outside a C* fix) is using a tool like happycache <https://github.com/hashbrowncipher/happycache> to carry a consistent page cache from the old node to its replacement. That might be sufficient if the data is already in memory.
Chris

On Tue, Mar 21, 2023 at 2:48 PM Jeff Jirsa <jji...@gmail.com> wrote:

> We serialize the other caches to disk to avoid cold-start problems; I
> don't see why we couldn't also serialize the chunk cache. Seems worth a
> JIRA to me.
>
> Until then, you can probably use the dynamic snitch (badness + severity)
> to route around newly started hosts.
>
> I'm actually pretty surprised the chunk cache is that effective; sort of
> nice to know.
>
> On Tue, Mar 21, 2023 at 10:17 AM Carlos Diaz <crdiaz...@gmail.com> wrote:
>
>> Hi Team,
>>
>> We are heavy users of Cassandra at a pretty big bank. Security measures
>> require us to refresh our C* nodes every x number of days. We normally
>> do this in a rolling fashion, taking one node down at a time and
>> refreshing it with a new instance. This process has worked well for us
>> for the past few years.
>>
>> However, we recently started having issues when a newly refreshed
>> instance comes back online. Our automation waits a few minutes for the
>> node to become "ready (UN)" and then moves on to the next node. The
>> problem we are facing is that when the node reports ready, the chunk
>> cache is still empty, so once the node starts accepting connections,
>> queries take much longer to respond, and this causes errors for our
>> apps.
>>
>> I was thinking it would be great to have a nodetool command that would
>> let us prefetch a certain table or set of tables to preload the chunk
>> cache. We could then add another check (nodetool info?) to ensure the
>> chunk cache has been preloaded enough to handle queries before routing
>> traffic to this particular node.
>>
>> Would love to hear others' feedback on the feasibility of this idea.
>>
>> Thanks!
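For reference, the "wait for UN" gate the automation above performs can be sketched as a small shell helper that parses `nodetool status` output, where the first column is the node state (UN = Up/Normal) and the second is its address. The function name `node_is_un` and the polling loop are illustrative, not part of nodetool itself:

```shell
# Sketch: succeed only when the given address is listed as UN (Up/Normal)
# in `nodetool status`-style output read from stdin.
node_is_un() {
  local addr="$1"
  awk -v a="$addr" '$1 == "UN" && $2 == a { found = 1 } END { exit !found }'
}

# In real automation you would poll something like (hypothetical loop):
#   until nodetool status | node_is_un "$NODE_IP"; do sleep 10; done
```

As the thread points out, passing this gate only means gossip considers the node up; it says nothing about the chunk cache being warm, which is exactly the gap the proposed prefetch command and `nodetool info` check would close.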