Hi Paul,

I believe I had it on Lustre at least for a while, and that worked.

Other setups I've used:

- NFS with the data on an external array, connected via SAS to multiple 
servers with multipathing configured; I believe we had it set up so that all 
servers could see the data, but only one had it mounted. We did failover 
manually, but there'd be no reason not to use heartbeat/pacemaker or similar.

- data on local server disks, replicated with DRBD and exported over NFS, 
with heartbeat/pacemaker handling failover

- NetApp appliances (they cost money, but there is virtually no management 
headache, and they work very well)
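
As a rough illustration of the "all servers can see the data, but only one 
has it mounted" rule from the first setup, here is a minimal Python sketch of 
the check one might run before a manual failover. It assumes a non-clustered 
filesystem on the shared array, which must never be mounted on two servers at 
once. The host names, the device name and the mount point /export/sge are 
hypothetical, and the mount tables are passed in as /proc/mounts-style text 
rather than collected from real hosts. It is not a substitute for the fencing 
that heartbeat/pacemaker would provide.

```python
def is_mounted(mounts_text, mount_point):
    """Return True if mount_point appears in /proc/mounts-style text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fstype, options, dump, pass
        if len(fields) >= 2 and fields[1] == mount_point:
            return True
    return False


def safe_to_mount(mounts_by_host, mount_point, candidate):
    """Allow a mount on `candidate` only if no other host holds the volume."""
    holders = [
        host
        for host, text in mounts_by_host.items()
        if host != candidate and is_mounted(text, mount_point)
    ]
    return not holders


# Hypothetical example: nfs1 currently holds the shared volume.
mounts = {
    "nfs1": "/dev/mapper/mpatha /export/sge xfs rw 0 0\n",
    "nfs2": "proc /proc proc rw 0 0\n",
}
print(safe_to_mount(mounts, "/export/sge", "nfs2"))
```

With nfs1 still holding the mount, the check refuses the takeover on nfs2; 
once nfs1 has unmounted (or been powered off), the same call returns True.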

Tina

On Monday, 9 July 2018 15:48:20 BST Paul Paul wrote:
> Hello,
> 
> In order to use the shadow master functionality, the SGE local configuration
> files have to be stored on NFS. This is usually done with a single
> server, which implies a single point of failure.
> 
> If you're using SGE with a distributed file system, can you please indicate
> which one? We tried GlusterFS (version 3.12) with SGE 8.1.9, but it appeared
> that jobs were randomly killed (after a few days in which everything ran
> smoothly); going back to NFS (4.0) on a single server fixed this behavior.
> 
> Thanks for sharing,
> 
> Paul.
> _______________________________________________
> users mailing list
> users@gridengine.org
> https://gridengine.org/mailman/listinfo/users


-- 
Tina Friedrich, Snr HPC Systems Administrator, Advanced Research Computing
Research Computing and Support Services, Academic IT 
IT Services, University of Oxford 
http://www.arc.ox.ac.uk
