It sounds like a bad policy, and you should push for that to be changed.
Failing that, you have some options:
1. Use faster disks. This improves cold start performance without
relying on the caches.
2. Rely on the row cache instead. It can be saved to disk periodically and
reloaded at startup (see the first sketch after this list).
3. Ensure read CL < RF, and rely on speculative retries (see the second
sketch after this list). Note: you will need to avoid restarting two
servers owning the same token range consecutively for this to work.
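
For option 2, a rough sketch of what that could look like, assuming a
hypothetical table my_ks.my_table and the Python driver; the yaml option
names shift a bit between versions, so treat it as illustrative:

    # cassandra.yaml on each node: give the row cache some memory and save
    # it to disk periodically so it is reloaded at startup, e.g.
    #   row_cache_size_in_mb: 1024     # renamed row_cache_size in 4.1+
    #   row_cache_save_period: 14400   # seconds between saves
    #
    # Then opt the hot table into the row cache (names below are made up):
    from cassandra.cluster import Cluster

    session = Cluster(["10.0.0.10"]).connect()
    session.execute(
        "ALTER TABLE my_ks.my_table "
        "WITH caching = {'keys': 'ALL', 'rows_per_partition': '200'}"
    )

The row cache only helps if the same rows are read repeatedly, so check
the hit rate before leaning on it.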
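For option 3, a minimal client-side sketch, again with made-up names:
with RF=3 and reads below that (e.g. LOCAL_ONE or LOCAL_QUORUM), a
speculative_retry threshold such as '99p' lets the coordinator send a
duplicate request to another replica while the cold node is still slow:

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster
    from cassandra.query import SimpleStatement

    session = Cluster(["10.0.0.10"]).connect()

    # '99p' is the 4.x spelling; 3.x uses '99PERCENTILE'.
    session.execute(
        "ALTER TABLE my_ks.my_table WITH speculative_retry = '99p'"
    )

    # Read below RF so the coordinator has spare replicas to retry against
    # while the restarted node warms up.
    query = SimpleStatement(
        "SELECT * FROM my_ks.my_table WHERE pk = %s",
        consistency_level=ConsistencyLevel.LOCAL_ONE,
    )
    session.execute(query, ["some-key"])
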
These are off the top of my head, but I'm sure there are more ways to do it.
You should decide based on your situation.
BTW, manually loading the chunk cache is never going to work unless you
know what the hot data is. Loading a whole table into the chunk cache makes
no sense unless each server's share of the table can fit in 512 MB of
memory, but then why would you even need Cassandra?
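
That said, if you do know the hot partition keys (from the apps, or from
query logs), a warm-up step in your automation can get close: before
letting the apps back onto the node, read those keys through a session
pinned to just that node. A rough Python sketch, with a hypothetical node
address, table and key list:

    from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT
    from cassandra.policies import WhiteListRoundRobinPolicy

    NODE = "10.0.0.10"                    # the freshly refreshed node
    hot_keys = ["cust:123", "cust:456"]   # known-hot partition keys

    # Pin all requests to the new node so the reads populate its caches.
    # This only warms data the node actually owns, and the dynamic snitch
    # may still send some reads to other replicas.
    profile = ExecutionProfile(
        load_balancing_policy=WhiteListRoundRobinPolicy([NODE]))
    cluster = Cluster([NODE],
                      execution_profiles={EXEC_PROFILE_DEFAULT: profile})
    session = cluster.connect()

    for key in hot_keys:
        session.execute("SELECT * FROM my_ks.my_table WHERE pk = %s", [key])
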
On 21/03/2023 17:15, Carlos Diaz wrote:
Hi Team,
We are heavy users of Cassandra at a pretty big bank. Security
measures require us to constantly refresh our C* nodes every x number
of days. We normally do this in a rolling fashion, taking one node
down at a time and then refreshing it with a new instance. This
process has worked great for us for the past few years.
However, we recently started having issues: when a newly refreshed
instance comes back online, our automation waits a few minutes for the
node to become "ready (UN)" and then moves on to the next node. The
problem we are facing is that when the node is ready, the chunk cache is
still empty, so when the node starts accepting new connections, queries
that go to it take much longer to respond, and this causes errors for our
apps.
I was thinking that it would be great if we had a nodetool command that
would allow us to prefetch a certain table or set of tables into the
chunk cache. Then we could simply add another check (nodetool info?) to
ensure that the chunk cache has been preloaded enough to handle queries
to this particular node.
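
Something like the sketch below is roughly what that check might look
like; nodetool info on recent versions prints a "Chunk Cache" line with
entries, size and capacity, though the exact format varies by version, so
the parsing here is only illustrative:

    import re
    import subprocess

    # nodetool output may say "MiB" or "MB" depending on version.
    UNITS = {"KiB": 2**10, "MiB": 2**20, "GiB": 2**30,
             "KB": 10**3, "MB": 10**6, "GB": 10**9, "bytes": 1}

    def chunk_cache_fill_ratio():
        """Parse `nodetool info` and return chunk cache size / capacity."""
        out = subprocess.run(["nodetool", "info"], capture_output=True,
                             text=True, check=True).stdout
        for line in out.splitlines():
            if line.lstrip().startswith("Chunk Cache"):
                # e.g. "Chunk Cache : entries 4820, size 301 MiB, capacity 480 MiB, ..."
                m = re.search(r"size ([\d.]+) (\w+), capacity ([\d.]+) (\w+)", line)
                if m:
                    size = float(m.group(1)) * UNITS.get(m.group(2), 1)
                    cap = float(m.group(3)) * UNITS.get(m.group(4), 1)
                    return size / cap if cap else None
        return None

    # The automation would wait until e.g. chunk_cache_fill_ratio() >= 0.8
    # before moving on to the next node.
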
Would love to hear others' feedback on the feasibility of this idea.
Thanks!