I have a 16-node Ignite (v2.10.0) cluster with persistence enabled, and
about 20 caches, all of which are configured as cacheMode = partitioned,
backups = 1, with a rebalanceMode of ASYNC and rebalanceDelay of -1 (such
that rebalancing will only happen manually). The auto baseline adjustment
feature is disabled. The cluster uses TcpDiscoveryVmIpFinder and each of
the 16 nodes has a list of all 16 IP addresses.
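
For reference, a minimal Java sketch of the setup described above. It is
illustrative rather than the exact configuration in use: the class name
NodeConfigSketch and the addresses are placeholders, only one of the ~20
caches is shown, and the persistence setting is assumed to be on the
default data region.

```java
import java.util.Arrays;

import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.cache.CacheRebalanceMode;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi;
import org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder;

// Hypothetical helper class; requires ignite-core 2.10.0 on the classpath.
public class NodeConfigSketch {
    public static IgniteConfiguration build() {
        // One of the ~20 caches; the others are configured the same way.
        CacheConfiguration<Object, Object> cacheCfg =
            new CacheConfiguration<>("MyCache1Name");
        cacheCfg.setCacheMode(CacheMode.PARTITIONED);
        cacheCfg.setBackups(1);
        cacheCfg.setRebalanceMode(CacheRebalanceMode.ASYNC);
        // -1 means rebalancing is not started automatically.
        cacheCfg.setRebalanceDelay(-1);

        // Native persistence (assumed here to be on the default data region).
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        // Static IP finder; placeholder addresses, one entry per node.
        TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
        ipFinder.setAddresses(Arrays.asList(
            "10.0.0.1:47500..47509",
            "10.0.0.2:47500..47509"
            // ... remaining nodes
        ));

        TcpDiscoverySpi discoSpi = new TcpDiscoverySpi();
        discoSpi.setIpFinder(ipFinder);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setCacheConfiguration(cacheCfg);
        cfg.setDataStorageConfiguration(storageCfg);
        cfg.setDiscoverySpi(discoSpi);
        // Baseline auto-adjustment is a cluster-wide runtime setting,
        // so it is not part of this node configuration.
        return cfg;
    }
}
```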

I want to expand the cluster and add a 17th node and rebalance the data
accordingly. In the new node, I update the config to include all 16 nodes
plus itself, then start it up. Using ./control.sh --baseline on one of the
original 16 nodes, I see all 16 nodes in the baseline, plus the new one in
a different section at the bottom (i.e. not yet part of the baseline). I
run ./control.sh --baseline add <newNodeId>, and it seems to work, as I now
have 17 nodes in the baseline topology, and the metrics that are logged out
every minute from each node indicate that there are now 17 servers. I see
these same logs/info on the new node as well as the 16 original ones.

On the newly added node, I see logs like these after updating the baseline
topology:

Local state for group durability has changed [name=MyCache1Name,
enabled=false]
Local state for group durability has been logged to WAL [name=MyCache1Name,
enabled=false]
...
Prepared rebalancing [grp=ignite-sys-cache, mode=SYNC, supplier=...]
...
Starting rebalance routine [grp=ignite-sys-cache, mode=SYNC, supplier=...]
...
Completed rebalancing [rebalanceId=42, grp=ignite-sys-cache, supplier=...]
Local state for group durability has changed [name=ignite-sys-cache,
enabled=true]

I don't know what ignite-sys-cache is, but this all looks fine. However,
my actual caches are not rebalanced, and the new node holds no data for
them. I tried calling ignite.cache(cacheName).rebalance() on each of my
caches, but that also appeared to have no effect, even after waiting
overnight.

Is there something I'm missing with regard to how cluster expansion,
rebalancing, and baseline topology work? I've tried for a couple of weeks to
get this working with no success. The official docs don't say much on the
subject other than 'update the baseline topology and data rebalancing
should occur based on your rebalanceMode and rebalanceDelay settings'.
