That should have been sent to the drbd-users list for reference, so here it 
is...

On 18 Mar 2020, at 13:53, Jérôme Barotin <[email protected]> wrote:
> 
> We have two use cases:
> 
> 1 - Storage of about 5 GB of small files (10 KB on average) that are written 
> and read very often.
> 
> 2 - Archive storage (2 TB) of files sized from 10 KB to 10+ MB; writes and 
> reads are rarer and higher latency is not a problem.

Sounds like a typical use case for a simple active/passive single-resource or 
active/active dual-resource setup, where each resource is active on only one of 
the nodes at a time, each resource contains a normal filesystem, and each 
resource is replicated to the other site. In case of a downtime of one VM, the 
resource will fail over to the other node/site.
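As a minimal sketch, a DRBD resource for such a setup could look like the following; all hostnames, addresses, and backing devices here are assumptions to adapt to your environment (protocol C is the synchronous default; protocol A may suit higher-latency links like the archive use case):

```
# /etc/drbd.d/r0.res -- illustrative sketch, not a tested configuration
resource r0 {
  device    /dev/drbd0;
  disk      /dev/vdb1;        # backing block device (assumption)
  meta-disk internal;
  protocol  C;                # synchronous; consider protocol A across sites
  on node-a {
    address 10.0.1.10:7789;   # replication link address (assumption)
  }
  on node-b {
    address 10.0.2.10:7789;
  }
}
```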

> I'm trying to set up a test environment. I'm using Ubuntu server as the 
> distribution, is that a correct choice? Or would a Red Hat-based distribution 
> be easier to work with?

It works essentially identically with both.

> Also, it's not clear how to make the link between GFS, Corosync/Pacemaker 
> and DRBD. Where could I find some good doc to understand what I'm doing?

GFS is not necessary in the setup described above. There is the DRBD User’s 
Guide, available on the LINBIT homepage, and there are guides for setting up 
Pacemaker, available on the clusterlabs.org homepage.
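To give a rough idea of how the pieces connect in the active/passive case (no GFS needed), the Pacemaker side typically looks something like this crm-shell sketch; all resource names, devices, and mount points are assumptions, not a tested configuration:

```
# crm configure sketch -- names and paths are placeholders
primitive p_drbd_r0 ocf:linbit:drbd \
    params drbd_resource=r0 \
    op monitor interval=29s role=Master \
    op monitor interval=31s role=Slave
ms ms_drbd_r0 p_drbd_r0 \
    meta master-max=1 clone-max=2 notify=true
primitive p_fs_r0 ocf:heartbeat:Filesystem \
    params device=/dev/drbd0 directory=/srv/data fstype=ext4
colocation col_fs_on_drbd inf: p_fs_r0 ms_drbd_r0:Master
order o_drbd_before_fs Mandatory: ms_drbd_r0:promote p_fs_r0:start
```

The division of labor: Corosync provides cluster membership and messaging underneath, Pacemaker decides where to promote DRBD to Primary and mounts the filesystem there, and DRBD replicates the block device between the nodes.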

> Thanks for your reply Robert, exactly what I'm thinking, but my management 
> team's directives are clear: no bare metal, I have to use only cloud solutions.

It depends on the exact requirements whether or not a cluster can reasonably 
run across different sites. E.g., if you have simple client software that needs 
to find a service under a single IP address, then your service needs a service 
IP address that can be placed on each node in the cluster, which is not 
possible if the cluster nodes are in different IP subnets. And things become 
really complicated when people try to hand-craft workarounds for those 
limitations, such as VPNs, proxies, etc.
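For reference, in a single-subnet cluster that service IP is usually just one more Pacemaker resource colocated with the service; a sketch in crm-shell syntax, assuming a hypothetical filesystem resource named p_fs_r0 and a placeholder address:

```
primitive p_ip ocf:heartbeat:IPaddr2 \
    params ip=192.168.1.100 cidr_netmask=24   # floating address (assumption)
colocation col_ip_with_fs inf: p_ip p_fs_r0   # keep the IP with the service
```

This only works because IPaddr2 moves the address between nodes on the same subnet; across different subnets there is no equivalent single address to fail over, which is exactly the limitation described above.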

The reality is that most operators struggle even with running simple 
active/passive clusters, at least as soon as some minor unexpected problem 
occurs in the cluster, and a lot of downtime is not due to failed hardware or 
power outages, but rather due to the operators’ inability to figure out why 
their software setup is not failing over properly, or is not starting/not 
working. Every piece of complexity on top of that just makes the situation 
worse.

br,
Robert
_______________________________________________
Star us on GITHUB: https://github.com/LINBIT
drbd-user mailing list
[email protected]
https://lists.linbit.com/mailman/listinfo/drbd-user
