Hello,
I'm trying to set up an elastic Gluster cluster and am running into a few odd
edge cases that I'm unsure how to address. I'll try to walk through the setup
as best I can.
Suppose I have a replica 3 distributed-replicated volume that starts out with
2 replica sets:
MyVolume
  Replica 1
    serverA
    serverB
    serverC
  Replica 2
    serverD
    serverE
    serverF
The client mounts the volume with serverA as the primary volfile server, and
serverB and serverC as the backups.
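For reference, the mount described above can be sketched roughly as follows.
This assumes the backups are supplied through the `backup-volfile-servers`
mount option; the mount point `/mnt/myvolume` and the helper name
`build_mount_command` are made up for illustration.

```python
def build_mount_command(volume, primary, backups, mount_point):
    """Return the argv for a GlusterFS FUSE mount with backup volfile servers."""
    argv = ["mount", "-t", "glusterfs"]
    if backups:
        # backup-volfile-servers takes a colon-separated list of hosts.
        argv += ["-o", "backup-volfile-servers=" + ":".join(backups)]
    # The primary volfile server is the host part of the mount source.
    argv += [f"{primary}:/{volume}", mount_point]
    return argv


# Hypothetical mount point; the volume and server names are from the layout above.
command = build_mount_command(
    "MyVolume", "serverA", ["serverB", "serverC"], "/mnt/myvolume"
)
print(" ".join(command))
```

Note that the list of volfile servers is fixed at mount time, which is what
leads to the problem described next.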
If I then perform a scale-down event, the first replica set is selected as the
one to remove, so I end up with a configuration like:
MyVolume
  Replica 2
    serverD
    serverE
    serverF
Everything rebalances and works great. However, at this point the client has
lost its connection to every volfile server. It knows about D, E, and F, so my
data is all fine, but it can no longer retrieve a volfile. In the logs I see:
[2019-10-15 17:21:59.232819] I [glusterfsd-mgmt.c:2463:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
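Since that message is currently the only signal I get, one option is to watch
the client log for it. Here is a minimal sketch, assuming each log entry is a
single line in the format shown above (the mail client may have wrapped it);
the function name `find_exhaustion_events` is just for illustration.

```python
import re

# Matches client log lines of the form:
# [timestamp] LEVEL [file.c:line:function] domain: message
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"\[(?P<source>[^\]]+)\]\s+(?P<domain>\S+):\s+(?P<message>.*)$"
)
MARKER = "Exhausted all volfile servers"


def find_exhaustion_events(lines):
    """Return the timestamps of every 'Exhausted all volfile servers' entry."""
    events = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and MARKER in match.group("message"):
            events.append(match.group("timestamp"))
    return events


# The log entry from this mail, joined back into one line.
sample = (
    "[2019-10-15 17:21:59.232819] I [glusterfsd-mgmt.c:2463:mgmt_rpc_notify] "
    "0-glusterfsd-mgmt: Exhausted all volfile servers"
)
print(find_exhaustion_events([sample]))
```

This only detects the condition after the fact; it does nothing to restore
the volfile connection.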
This becomes a problem when I try to scale back up and add a replica set back
in:
MyVolume
  Replica 2
    serverD
    serverE
    serverF
  Replica 3
    serverG
    serverH
    serverI
I then rebalance the volume. All my data is present, but the client only knows
about D, E, and F, so when I run `ls` on a directory, only about half of the
files are returned; the other half live on G, H, and I, which the client
doesn't know about. The data is still there, but seeing it would require a
remount against one of the new servers.
My question, then: is there a way to have a more dynamic set of volfile
servers? What would be great is a way to tell the mount to fall back on the
servers listed in the volfile itself in case the primary one goes away.
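To illustrate what I mean by falling back on the servers in the volfile, here
is a rough sketch that pulls the host names out of a client volfile. It
assumes the volfile is available as text and that each brick appears as a
`protocol/client` block with an `option remote-host` line; the brick paths in
the sample are made up. This runs outside the client, so at best it could
feed a fresh backup list into a remount.

```python
def remote_hosts_from_volfile(volfile_text):
    """Collect the distinct 'option remote-host' values, in order of appearance."""
    hosts = []
    for raw in volfile_text.splitlines():
        parts = raw.split()
        is_remote_host = (
            len(parts) == 3 and parts[0] == "option" and parts[1] == "remote-host"
        )
        if is_remote_host and parts[2] not in hosts:
            hosts.append(parts[2])
    return hosts


# Abbreviated sample; the brick paths are hypothetical.
SAMPLE_VOLFILE = """
volume MyVolume-client-0
    type protocol/client
    option remote-host serverD
    option remote-subvolume /bricks/brick1
end-volume

volume MyVolume-client-1
    type protocol/client
    option remote-host serverE
    option remote-subvolume /bricks/brick1
end-volume
"""

print(remote_hosts_from_volfile(SAMPLE_VOLFILE))
```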
If there's no easy way to do this, is there a flag on the mount helper that
makes the mount die or error out when it is unable to retrieve volfiles? The
problem now is that it fails silently and returns incomplete file listings,
which for my use cases can lead to that data being processed incorrectly. I'd
much rather have a hard error than silently get bad results.
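As a stopgap, I could imagine guarding my processing jobs with an external
check along these lines. This is only a sketch: it assumes glusterd is
listening on its default management port (24007) on each volfile server, and
the names `tcp_reachable` and `check_volfile_servers` are made up. It is not
a mount helper flag, and a reachable port does not prove the client has a
current volfile.

```python
import socket

GLUSTERD_PORT = 24007  # default glusterd management port (assumed unchanged)


def tcp_reachable(host, port=GLUSTERD_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_volfile_servers(servers, probe=tcp_reachable):
    """Return the reachable servers; raise RuntimeError if there are none."""
    alive = [server for server in servers if probe(server)]
    if not alive:
        raise RuntimeError("no volfile server reachable: " + ", ".join(servers))
    return alive
```

A job would call `check_volfile_servers(["serverA", "serverB", "serverC"])`
with the list used at mount time before reading from the mount, and abort if
it raises.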
I hope that makes sense; if you need further clarification, please let me know.
Thanks,
Tim