Re: [gridengine users] SGE and NFS

Skylar Thompson Wed, 12 Nov 2014 08:45:07 -0800

Hi Eric,

We build our own RPMs with FPM precisely so that the executables don't
have to live on NFS. If the executables and/or the job spool are on NFS, a
busy NFS server can make GE unusable, and sometimes unstable (if you hit
protocol timeouts).
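
As a rough way to check which SGE paths on a node actually sit on NFS, here
is a minimal Python sketch. It assumes a Linux host with a /proc/mounts-style
mount table. The helper names (parse_mounts, fs_type_for, on_nfs) are made up
for illustration and are not part of Grid Engine. It does not resolve
symlinks or decode escaped spaces in mount points.

```python
import os

# Filesystem types treated as NFS (assumption: only these two matter here).
NFS_TYPES = {"nfs", "nfs4"}


def parse_mounts(mounts_text):
    """Return a list of (mount_point, fs_type) from /proc/mounts-style text."""
    entries = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected layout: device mount_point fs_type options dump pass
        if len(fields) >= 3:
            entries.append((fields[1], fields[2]))
    return entries


def fs_type_for(path, entries):
    """Return the fs type of the longest mount point containing path."""
    path = os.path.normpath(path)
    best_mp, best_type = "", None
    for mount_point, fs_type in entries:
        mp = mount_point.rstrip("/") or "/"
        # Match the mount point itself or anything below it.
        if path == mp or path.startswith(mp.rstrip("/") + "/"):
            # ">=" lets a later mount on the same point shadow an earlier one.
            if len(mp) >= len(best_mp):
                best_mp, best_type = mp, fs_type
    return best_type


def on_nfs(path, entries):
    """True if path lives on a mount whose type is in NFS_TYPES."""
    return fs_type_for(path, entries) in NFS_TYPES
```

On an execution host, one could read /proc/mounts, pass its text to
parse_mounts, and then call on_nfs for $SGE_ROOT and for the spool directory
to see which of the two is NFS-backed.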


On Wed, Nov 12, 2014 at 04:26:51PM +0000, Peskin, Eric wrote:
> All,
> 
> Does SGE have to use NFS or can it work locally on each node?
> If parts of it have to be on NFS, what is the minimal subset?
> How much of this changes if you want redundant masters?
> 
> We have a cluster running CentOS 6.3, Bright Cluster Manager 6.0, and SGE 
> 2011.11.  Specifically, SGE is provided by a Bright package: 
> sge-2011.11-360_cm6.0.x86_64
> 
> Twice, we have lost all the running SGE jobs when the cluster failed over 
> from one head node to the other.  =( Not supposed to happen.
> Since then, we have also had many individual jobs get lost.  The latter 
> situation correlates with messages in the system logs saying
> 
> > abrt[9007]: File '/cm/shared/apps/sge/2011.11/bin/linux-x64/sge_execd' 
> > seems to be deleted
> 
> That file lives on an NFS mount on our Isilon storage.
> Surely, the executables don't have to be on NFS?
> Interestingly, we are using local spooling: the spool directory on each node is 
>  /cm/local/apps/sge/var/spool , which is, indeed, local.  
> But the $SGE_ROOT ,  /cm/shared/apps/sge/2011.11 lives on NFS.  
> Does any of it need to?
> Maybe just the var part would need to:  /cm/shared/apps/sge/var ?
> 
> Thanks,
> Eric
> 
> 
> 
> _______________________________________________
> users mailing list
> users@gridengine.org
> https://gridengine.org/mailman/listinfo/users

-- 
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine
