I found my entry about this:

https://arc.liv.ac.uk/trac/SGE/ticket/570

-- Reuti


> On 06.12.2018 at 19:03, Reuti <re...@staff.uni-marburg.de> wrote:
> 
> Hi,
> 
>> On 06.12.2018 at 18:36, Dan Whitehouse <d.whiteho...@qmul.ac.uk> wrote:
>> 
>> Hi,
>> I've been running some MPI jobs and I expected that when the job started
>> a $TMPDIR would be created on all of the nodes, however with our (UGE)
>> configuration that does not appear to be the case.
>> 
>> It appears that while on the "master" node a $TMPDIR is created and
>> persists for the duration of the job, for "slave" execution hosts, the
>> directory is only created when MPI processes run and is immediately
>> reaped when they exit. Is there a way to change this behaviour such that
>> the directory persists for the entire duration of the job?
> 
> Your observations are correct. I saw a need for it some time ago: 
> https://arc.liv.ac.uk/trac/SGE/ticket/1290
> 
> One can create persistent scratch directories, e.g. in a job prolog: make 
> the list of nodes unique and issue `qrsh -inherit ...` for each node to run 
> `mkdir $TMPDIR-persistent`. Curly braces are optional here, as the dash can't 
> be a character in the name of an environment variable.
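
A minimal sketch of the prolog step described above, written in Python only to show how the commands are put together. It assumes the node list is read from the file named by `$PE_HOSTFILE` (host name in the first column of each line); the function names `unique_hosts` and `mkdir_commands` and the `-p` flag for `mkdir` are illustrative choices, not part of the original description.

```python
import os
import subprocess


def unique_hosts(hostfile_text):
    """Return the host names from $PE_HOSTFILE content, deduplicated, order kept."""
    hosts = []
    for line in hostfile_text.splitlines():
        fields = line.split()
        # The first column is the host name; a host can appear on several lines.
        if fields and fields[0] not in hosts:
            hosts.append(fields[0])
    return hosts


def mkdir_commands(hosts, tmpdir):
    """Build one `qrsh -inherit <host> mkdir ...` command per host."""
    target = tmpdir + "-persistent"
    # `-p` is an assumption added here so that a rerun does not fail.
    return [["qrsh", "-inherit", host, "mkdir", "-p", target] for host in hosts]


# Only do real work when started inside a parallel job environment.
if __name__ == "__main__" and "PE_HOSTFILE" in os.environ:
    with open(os.environ["PE_HOSTFILE"]) as handle:
        job_hosts = unique_hosts(handle.read())
    for command in mkdir_commands(job_hosts, os.environ["TMPDIR"]):
        subprocess.run(command, check=True)
```

Building the command lists separately from running them makes it possible to print and check them before anything is executed on the nodes.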
> 
> There is one pitfall: in case of a job abort one can no longer issue `qrsh 
> -inherit ...` in the epilog to remove all the directories on the nodes in 
> turn, because the job was already canceled. My solution was to submit 
> "cleaner.sh" jobs in the prolog too, one for each node (hence they run as 
> serial jobs); each gets the name of the directory it should remove as an 
> argument after the script name (this is known in the prolog). These jobs were 
> supposed to run in a dedicated cleaner.q only, with no limits regarding slots 
> (hence they started as soon as they were eligible to start), but got a job 
> hold on the actual job which submitted them, so they waited until it finished.
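
The cleaner submission could look roughly like the following sketch. It is one possible reading of the description above, not the original prolog: pinning each job to a node with `-q cleaner.q@<host>` and expressing the job hold with `-hold_jid` are assumptions, as are the function name `cleaner_commands` and the example values. It also presumes the host running the prolog is allowed to submit jobs.

```python
def cleaner_commands(hosts, job_id, directory, queue="cleaner.q", script="cleaner.sh"):
    """Build one qsub command per host for the deferred cleanup jobs.

    Each cleaner job is held until job `job_id` has finished and receives
    the directory to remove as its only argument after the script name.
    """
    return [
        ["qsub", "-q", f"{queue}@{host}", "-hold_jid", str(job_id), script, directory]
        for host in hosts
    ]


if __name__ == "__main__":
    # Dry run with made-up values: print the commands instead of submitting them.
    for command in cleaner_commands(["node1", "node2"], 4711, "/scratch/4711.1.all.q-persistent"):
        print(" ".join(command))
```

Because the hold is released whenever the parent job ends, the directories are removed even after an abort, which is the case the epilog cannot handle.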
> 
> -- Reuti
> 
> 
>> 
>> --
>> Dan Whitehouse
>> Research Systems Administrator, IT Services
>> Queen Mary University of London
>> Mile End
>> E1 4NS
>> 
>> _______________________________________________
>> users mailing list
>> users@gridengine.org
>> https://gridengine.org/mailman/listinfo/users
> 
