On Wed, 22 Feb 2017 at 07:57 -0000, Reuti wrote:

> > - Get the user to put the log files on /tmp or /dev/null
> > (suggested but not yet tested)
>
> Do you write the logfiles (i.e. stdout/stderr) in the home directory
> or the job directory? What about putting it local on the nodes or is
> the GPFS the only file system you got on the nodes? In the epilog
> you could copy it to the real location (this is what Torque does
> AFAIK). But unless you `ssh` to a node you won't have a live output
> of these files any longer.

I don't know the specifics of the user's needs.  Currently the log
files are all being written into a single GPFS directory, which I
think is the issue: 2000+ files are created at the same time, and
eventually there are 100,000 files in total, which I expect any
filesystem would have trouble with.
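
One possible mitigation, as an untested sketch: spread the per-task
logs over a fixed number of subdirectories so that no single directory
receives every file.  The function name, the `job<id>.<task>.log`
naming and the default of 100 buckets are all made up for
illustration; whether this actually relieves GPFS here is an
assumption, not something that has been measured.

```python
from pathlib import PurePosixPath


def sharded_log_path(base_dir, job_id, task_id, num_dirs=100):
    """Return a log path under one of num_dirs subdirectories of base_dir.

    Using task_id modulo num_dirs means consecutive array tasks, which
    tend to start together, land in different directories.
    """
    bucket = task_id % num_dirs
    # Zero-padded bucket name, e.g. "007" (hypothetical naming scheme).
    return PurePosixPath(base_dir) / f"{bucket:03d}" / f"job{job_id}.{task_id}.log"
```

With 100 buckets, 100,000 task logs come to about 1,000 files per
directory, and a burst of 2,000 consecutive tasks puts about 20 new
files in each.  The subdirectories would have to exist before the
jobs start.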

I've suggested the user try putting the log files into /tmp, but I
don't know whether they have tried that yet.
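
For illustration, the "write locally, copy at the end" idea could look
roughly like the sketch below if the job itself did the staging rather
than an epilog.  `run_with_local_log` and its arguments are
hypothetical names; the only assumption about the environment is that
a node-local directory is available via `TMPDIR`, with /tmp as the
fallback.  This has not been tried on the cluster in question.

```python
import os
import shutil
import tempfile
from pathlib import Path


def run_with_local_log(work, final_log, local_dir=None):
    """Call work(log) with a node-local log file, then copy it to final_log.

    The shared filesystem sees a single file creation per task, at the
    end of the run, instead of live writes throughout the job.
    """
    local_dir = local_dir or os.environ.get("TMPDIR", "/tmp")
    fd, local_path = tempfile.mkstemp(suffix=".log", dir=local_dir)
    try:
        with os.fdopen(fd, "w") as log:
            work(log)
    finally:
        # Copy even if work() raised, so the partial log is not lost.
        final_log = Path(final_log)
        final_log.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(local_path, final_log)
        os.remove(local_path)
```

As noted above, the cost is that there is no live output on the shared
filesystem while the job runs.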

> There is the parameter:
>
> max_pending_tasks_per_job
>
> in the scheduler configuration. Unfortunately it seems to have no
> effect anywhere or I got its meaning in the wrong way.

We have this set to the default of 50 and this sounds like the
parameter I'm looking for, but it doesn't seem functional.  I'll look
through the sources and see if I can understand how this is used.  We
currently use SoGE 8.1.8.

Thanks,
Stuart
-- 
I've never been lost; I was once bewildered for three days, but never lost!
                                        --  Daniel Boone
_______________________________________________
SGE-discuss mailing list
SGE-discuss@liv.ac.uk
https://arc.liv.ac.uk/mailman/listinfo/sge-discuss
