I found my entry about this: https://arc.liv.ac.uk/trac/SGE/ticket/570
-- Reuti

> On 06.12.2018 at 19:03, Reuti <re...@staff.uni-marburg.de> wrote:
>
> Hi,
>
>> On 06.12.2018 at 18:36, Dan Whitehouse <d.whiteho...@qmul.ac.uk> wrote:
>>
>> Hi,
>> I've been running some MPI jobs and I expected that when the job started
>> a $TMPDIR would be created on all of the nodes, however with our (UGE)
>> configuration that does not appear to be the case.
>>
>> It appears that while on the "master" node a $TMPDIR is created and
>> persists for the duration of the job, for "slave" execution hosts, the
>> directory is only created when MPI processes run and is immediately
>> reaped when they exit. Is there a way to change this behaviour such that
>> the directory persists for the entire duration of the job?
>
> Your observations are correct. I saw a need for it some time ago:
> https://arc.liv.ac.uk/trac/SGE/ticket/1290
>
> One can create persistent scratch directories e.g. in a job prolog: just make
> the list of nodes unique and issue `qrsh -inherit ...` for each node to run
> `mkdir $TMPDIR-persistent`. (Curly braces are optional here, as the dash can't
> be a character in the name of an environment variable.)
>
> There is one pitfall: in case of a job abort one can't issue `qrsh -inherit
> ...` in the epilog any longer to remove all the directories on the nodes in
> turn, as the job was already canceled. My solution was to submit a
> "cleaner.sh" in the prolog too, one for each node (hence they run serially),
> which gets the name of the directory it should remove as an argument after
> the script name (this is known in the prolog). These jobs were supposed to
> run in a dedicated cleaner.q only, with no limits regarding slots (hence they
> started as soon as they were eligible to start), but got a job hold on the
> actual job which submitted them, to wait until it finished.
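
For illustration, the prolog described above could look roughly like the
sketch below. It is untested against a real cluster and rests on several
assumptions that are not in the original mail: that $PE_HOSTFILE, $TMPDIR and
$JOB_ID are set in the prolog's environment, that the execution host is
allowed to submit jobs, and that a script named cleaner.sh (hypothetical here,
it would just remove the directory given as its first argument) is reachable
from where the prolog runs. The helper names and the DRY_RUN switch are made
up for this example.

```shell
#!/bin/sh
# Prolog sketch: create a persistent scratch directory on every node of a
# parallel job and queue one cleaner job per node.

# Print each host of a PE hostfile once (first column is the host name).
unique_hosts() {
    awk '{ print $1 }' "$1" | sort -u
}

# With DRY_RUN=1 only print the command instead of executing it, so the
# script can be tried outside of a running job.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

setup_scratch() {
    scratch="$TMPDIR-persistent"
    for host in $(unique_hosts "$PE_HOSTFILE"); do
        # Create the directory on the node.
        run qrsh -inherit "$host" mkdir -p "$scratch"
        # Queue the cleanup for this node; the hold keeps it waiting until
        # job $JOB_ID has left the system, whether it finished or aborted.
        run qsub -q cleaner.q -l h="$host" -hold_jid "$JOB_ID" \
            cleaner.sh "$scratch"
    done
}

# Only act when called inside a parallel job.
if [ -n "${PE_HOSTFILE:-}" ]; then
    setup_scratch
fi
```

Running it with DRY_RUN=1 and a hand-written hostfile prints the qrsh and qsub
calls without contacting the scheduler, which is a cheap way to check the host
list before putting it into a real prolog.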
>
> -- Reuti
>
>>
>> --
>> Dan Whitehouse
>> Research Systems Administrator, IT Services
>> Queen Mary University of London
>> Mile End
>> E1 4NS
>>
>> _______________________________________________
>> users mailing list
>> users@gridengine.org
>> https://gridengine.org/mailman/listinfo/users