Thanks, Jeff, for the detailed clarifications.
We tried rebuilding the data on the Spark DC nodes again in May, one node at a
time, but ran into issues. Prod has 3 DCs: DC1 (9 nodes) and DC2 (9 nodes) run
only C*, and DC3 runs Spark with 3 nodes and vnodes enabled with num_tokens=32.
We also dropped a few unused indexes.
Victor,
We have 21 nodes in 3 DCs; the Spark DC has 3 nodes. The primary datacenter
nodes have 300 GB of data.
What num_tokens do you have in your prod cluster? Are you using the default of 256?
On Wed, Dec 9, 2015 at 2:19 PM, Victor Chen wrote:
> I have a 12 node cluster in prod using vnodes and C* version 2.18. I ha
Streaming with vnodes is not always pleasant – rebuild uses streaming (as does
bootstrap, repair, and decommission). The rebuild delay you see may or may not
be related to that. It could also be that the streams timed out, and you don’t
have a stream timeout set. Are you seeing data move? Are th
I have a 12 node cluster in prod using vnodes and C* version 2.18. I have
never used rebuild, and instead prefer bootstrapping new nodes, even if it
means there is additional shuffling of data and cleanup needed on the
initial nodes in each DC, mostly because you can tell when bootstrapping is
finished