I did explore Ceph a bit, and that might be an option as well; I'm still 
exploring Gluster.   Hopefully no one hates you for making the suggestion 🙂

I haven't tried NFS Ganesha yet.  I was under the impression it was still a 
little unstable, and found the docs for it somewhat limited.  If it solves the 
issue, it might also be a good option.  I've also heard others suggest its 
performance is better than the FUSE client's.
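If it does pan out, one appeal is that the client side becomes a plain NFS mount; something like the below, assuming Ganesha is set up in HA mode behind a floating IP (the hostname and export path here are made up):

```shell
# Hypothetical client mount against an HA NFS-Ganesha front end.
# With NFSv4.1 the client can ride out a failover of the floating IP.
mount -t nfs -o vers=4.1,proto=tcp ganesha-vip:/MyVolume /mnt/myvolume
```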

I don't know how other systems deal with it currently, but it seems like even 
just leveraging the volfile itself as a source of backup servers would work 
well.  There are still likely cases where things could lapse, but that seems 
like an improvement at least.  I'll try to dig into what others are using, 
though maybe they don't have this issue at all since they tend to use metadata 
servers?
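As a rough sketch of that idea (not something Gluster does today, as far as I know): a client-side job could periodically pull the brick list from any reachable peer and rebuild the backup server list from it.  Assuming the usual `gluster volume info` output format:

```shell
# Turn `gluster volume info MyVolume` output into a
# backup-volfile-servers string by pulling the host part of each
# "BrickN: host:/path" line.
hosts_from_volinfo() {
    awk -F'[ :]+' '/^Brick[0-9]+:/ {print $2}' | sort -u
}

volinfo_to_backup_servers() {
    hosts_from_volinfo | paste -sd: -
}

# Example against captured output (in real use, pipe in
# `gluster volume info MyVolume` from any live peer):
sample='Brick1: serverD:/data/brick
Brick2: serverE:/data/brick
Brick3: serverF:/data/brick'
printf '%s\n' "$sample" | volinfo_to_backup_servers   # -> serverD:serverE:serverF
```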

Thanks,
Tim
________________________________
From: Strahil Nikolov <[email protected]>
Sent: Wednesday, October 16, 2019 11:49 AM
To: gluster-users <[email protected]>; Timothy Orme <[email protected]>
Subject: [EXTERNAL] Re: Re: [Gluster-users] Client Handling of Elastic Clusters

Most probably the current version has never supported such elasticity (maybe 
there was no such need until now), and the only option is to use 
highly available NFS Ganesha, as the built-in NFS is deprecated.
What about scaling on the same system?  Nowadays servers have a lot of 
hot-plug disk slots, so you can keep the number of servers the same... still, 
the server bandwidth will be a limit at some point.
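For example, growing in place on the same 3 servers would look roughly like this (the brick paths are just illustrative):

```shell
# Add one new brick per existing server (on hot-plugged disks),
# which creates another replica-3 subvolume, then spread data onto it.
gluster volume add-brick MyVolume \
    serverD:/data/disk2/brick serverE:/data/disk2/brick serverF:/data/disk2/brick
gluster volume rebalance MyVolume start
gluster volume rebalance MyVolume status    # wait for "completed"
```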

I'm not sure how other SDS deal with such elasticity.  I guess many users on 
the list will hate me for saying this, but have you checked Ceph for your 
needs?

Best Regards,
Strahil Nikolov

On Wednesday, October 16, 2019 at 21:13:58 GMT+3, Timothy Orme 
<[email protected]> wrote:


Yes, this makes the issue less likely, but doesn't make it impossible for 
something that is fully elastic.

For instance, if I had instead started with just A, B, C and then scaled out 
and in twice, all volfile servers could have been destroyed and replaced.  I 
think the problem is that the set of volfile servers is determined at mount 
time rather than updated as the cluster changes.  There are ways to greatly 
reduce this issue, such as adding more backup servers, but it's still a 
possibility.

I think more important, then, for me at least, is having the option of failing 
when no volfile servers remain, as the current behavior can produce incomplete 
views of the data.

Thanks!
Tim
________________________________
From: Strahil <[email protected]>
Sent: Tuesday, October 15, 2019 8:46 PM
To: Timothy Orme <[email protected]>; gluster-users <[email protected]>
Subject: [EXTERNAL] Re: [Gluster-users] Client Handling of Elastic Clusters


Hi Timothy,

Have you tried to mount the volume on the client via all servers:

mount -t glusterfs -o backup-volfile-servers=B:C:D:E:F A:/volume  /destination
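The same options can also go in /etc/fstab so the backup list survives reboots; note that the list is only consulted at mount time:

```shell
# /etc/fstab equivalent of the mount command above
A:/volume  /destination  glusterfs  defaults,_netdev,backup-volfile-servers=B:C:D:E:F  0 0
```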

Best Regards,
Strahil Nikolov

On Oct 15, 2019 22:05, Timothy Orme <[email protected]> wrote:
Hello,

I'm trying to set up an elastic Gluster cluster and am running into a few odd 
edge cases that I'm unsure how to address.  I'll try to walk through the setup 
as best I can.

If I have a replica 3 distributed-replicated volume, with 2 replica sets to 
start:

MyVolume
   Replica 1
      serverA
      serverB
      serverC
   Replica 2
      serverD
      serverE
      serverF

And the client mounts the volume with serverA as the primary volfile server, 
and B & C as the backups.

Then, if I perform a scale-down event, it selects the first replica set as the 
one to remove.  So I end up with a configuration like:

MyVolume
   Replica 2
      serverD
      serverE
      serverF

Everything rebalances and works great.  However, at this point, the client has 
lost any connection with a volfile server.  It knows about D, E, and F, so my 
data is all fine, but it can no longer retrieve a volfile.  In the logs I see:

[2019-10-15 17:21:59.232819] I [glusterfsd-mgmt.c:2463:mgmt_rpc_notify] 
0-glusterfsd-mgmt: Exhausted all volfile servers

This becomes problematic when I try to scale back up and add a replica set 
back in:

MyVolume
   Replica 2
      serverD
      serverE
      serverF
   Replica 3
      serverG
      serverH
      serverI

And then rebalance the volume.  Now I have all my data present, but the client 
only knows about D, E, F, so when I run an `ls` on a directory, only about half 
of the files are returned, since the other half live on G, H, I, which the 
client doesn't know about.  The data is still there, but reaching it would 
require a re-mount against one of the new servers.
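For reference, the scale-down/scale-up sequence that gets the client into this state is roughly the following (brick paths are illustrative):

```shell
# Scale down: drain the first replica set, then remove it for good.
gluster volume remove-brick MyVolume \
    serverA:/data/brick serverB:/data/brick serverC:/data/brick start
# ...wait for `remove-brick ... status` to show completed, then:
gluster volume remove-brick MyVolume \
    serverA:/data/brick serverB:/data/brick serverC:/data/brick commit

# Scale up: probe the new peers, add a fresh replica set, rebalance.
gluster peer probe serverG    # likewise serverH and serverI
gluster volume add-brick MyVolume \
    serverG:/data/brick serverH:/data/brick serverI:/data/brick
gluster volume rebalance MyVolume start
```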

My question, then: is there a way to have a more dynamic set of volfile 
servers?  What would be great is if the mount could fall back on the servers 
returned in the volfile itself in case the primary ones go away.

If there's not an easy way to do this, is there a flag on the mount helper 
that can cause the mount to die or error out when it is unable to retrieve 
volfiles?  The problem now is that it sort of silently fails and returns 
incomplete file listings, which for my use cases can cause improper processing 
of that data.  I'd obviously rather have it hard-error than silently provide 
bad results.
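Lacking such a flag, the workaround I'm leaning toward is a watchdog that treats that "Exhausted all volfile servers" log line as fatal.  A minimal sketch, with the real log path and the unmount action left out:

```shell
# Report UNHEALTHY if the FUSE client log shows the client has lost
# all volfile servers; a real wrapper would unmount the volume or
# stop scheduling jobs when it sees that.
check_volfile_health() {
    if grep -q 'Exhausted all volfile servers' "$1"; then
        echo "UNHEALTHY"
    else
        echo "OK"
    fi
}

# Example against a captured log line:
tmplog=$(mktemp)
echo '0-glusterfsd-mgmt: Exhausted all volfile servers' > "$tmplog"
check_volfile_health "$tmplog"    # -> UNHEALTHY
rm -f "$tmplog"
```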

Hope that makes sense, if you need further clarity please let me know.

Thanks,
Tim


________

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/118564314

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/118564314

Gluster-users mailing list
[email protected]
https://lists.gluster.org/mailman/listinfo/gluster-users
