Hi,

On 20.09.2013 at 16:12, Noam Bernstein wrote:
> On Sep 20, 2013, at 10:04 AM, Noam Bernstein <noam.bernst...@nrl.navy.mil> wrote:
>
>> Never mind - I was sure that my earlier tests showed that the $PBS_NODEFILE
>> was there, but now it seems like every time the job fails it's because this
>> file really is missing. Time to check why torque isn't always creating
>> the nodefile.
>
> Even weirder now - most of the time jobs fail it's because the PBS_NODEFILE
> is really missing. But a small fraction of the time (< 1%) the PBS_NODEFILE
> is there, but mpirun still fails in the way my original message specified.
>
> Has anyone ever seen anything like this before?

Is the location for the spool directory local or shared by NFS? Disk full?

-- Reuti