On Jul 12, 2018, at 11:45 AM, Noam Bernstein <noam.bernst...@nrl.navy.mil> wrote:
>
>> E.g., if you "ulimit -c" in your interactive shell and see "unlimited", but
>> if you "ulimit -c" in a launched job and see "0", then the job scheduler is
>> doing that to your environment somewhere.
>
> I am using a scheduler (Torque), but as I also told Åke off list in our
> side discussion about VASP, I’m already doing that. I mpirun a script which
> does a few things like ulimit -c and ulimit -s, and then runs the actual
> executable with $* arguments.
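For context, a minimal sketch of the kind of wrapper script described above (a sketch only: the executable path is a placeholder, and "$@" is used instead of $* so arguments containing spaces survive):

    #!/bin/sh
    # Raise the corefile and stack limits in this shell, then exec the real
    # binary with whatever arguments mpirun passed along.
    ulimit -c unlimited
    ulimit -s unlimited
    exec /path/to/actual_executable "$@"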
That may not be sufficient.

Remember that your job script only runs on the Mother Superior node (i.e., the first node in the job). Hence, while your job script may affect the corefile size settings in that shell (and its children), the remote MPI processes are (effectively) launched via tm_spawn() -- not ssh.

I.e., Open MPI will end up calling tm_spawn() to launch orted processes on all the nodes in your job. The TM daemons on the nodes in your job will then fork/exec the orteds, meaning that the orteds inherit the environment (including corefile size restrictions) of the TM daemons. The orteds eventually fork/exec your MPI processes.

This is a long way of saying: your shell startup files may not be executed, and the "ulimit -c" you did in your job script may not be propagated out to the other nodes. Instead, your MPI processes may be inheriting the corefile size limitations from the Torque daemons.

On my SLURM cluster here at Cisco (which is running a pretty ancient version at this point; I have no idea if things have changed), I had to put a "ulimit -c unlimited" in the relevant /etc/sysconfig/slurmd file so that it is executed before the slurmd (SLURM daemon) starts. Then my MPI processes start with an unlimited corefile size.

(You may have already done this; I just want to make sure we're on the same sheet of music here...)

-- 
Jeff Squyres
jsquy...@cisco.com

_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
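One way to sanity-check what the remote ranks actually inherit, assuming Open MPI's mpirun and a POSIX shell on every node (the process count and mapping option here are illustrative):

    # Launch one shell per node and print the corefile limit each remote
    # process inherits from the TM daemons on that node.
    mpirun --map-by node -np 4 sh -c 'echo "$(hostname): core limit = $(ulimit -c)"'

If the nodes report "0" even though the job script raised the limit, the limit is being inherited from the daemons, as described above, rather than from the job script's shell.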