Dear forum,
I should probably apologize in advance for such a basic question, but I
wasn't able to find an answer elsewhere:
how do I find the maximum number of processes that mpirun can start
concurrently on a single host of a cluster?
If I launch (on a CentOS 6.3 cluster with dual quad-core Xeon nodes,
equipped with OpenMPI 1.5.4 and IB HCAs, although I think the latter is of
no consequence):
[cut]
mpirun -np 250 -host q012 hostname
[/cut]
I get, as expected, 250 lines of:
[cut]
q012.qng
[/cut]
The same happens for 251, 252, 253 and 254, BUT not for 255, where mpirun returns:
[cut]
--------------------------------------------------------------------------
mpirun was unable to start the specified application as it encountered an error
on node q012. More information may be available above.
--------------------------------------------------------------------------
[/cut]
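For anyone who wants to reproduce the sweep, here is a minimal sketch of the loop I have in mind. The function name first_failing_np and the LAUNCHER and HOST variables are made up for illustration; by default it runs the same mpirun command as above and prints the first process count that fails to start.

```shell
# Sketch: find the first process count at which the launch fails.
# LAUNCHER and HOST are hypothetical overrides; the defaults mirror
# the "mpirun -np N -host q012 hostname" command used in this post.
first_failing_np() {
    local lo=$1 hi=$2 np
    for np in $(seq "$lo" "$hi"); do
        # Discard the output; only the exit status matters here.
        if ! ${LAUNCHER:-mpirun} -np "$np" -host "${HOST:-q012}" hostname >/dev/null 2>&1; then
            echo "$np"
            return 0
        fi
    done
    return 1   # every count in the range started successfully
}

# Example: first_failing_np 250 260
```

The function returns non-zero and prints nothing if every count in the range starts cleanly.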
I know that 250 processes is quite an oversubscription for a single
node that has no more than 8 real cores, but I wanted to see the actual
performance degradation rather than a crash.
Which hard limit (in OpenMPI or in the system) am I hitting that
prevents me from running 255 MPI processes on a single host?
The output of ulimit -a for the user is:
[cut]
ulimit -a
core file size (blocks, -c) 1000000
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 95054
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 100000
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
[/cut]
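The listing above shows the soft limits only (the shell's default for ulimit -a). In case the hard limits matter, a small sketch like the following prints both values for open files (-n) and max user processes (-u). Picking those two is only my guess at what might be relevant, and show_limits is just an illustrative name; the -u flag assumes bash.

```shell
# Sketch: print soft and hard values for the given ulimit flags.
# With no arguments it checks open files (n) and user processes (u).
show_limits() {
    local flag
    for flag in ${*:-n u}; do
        printf '%s soft=%s hard=%s\n' "$flag" \
            "$(ulimit -S -"$flag")" "$(ulimit -H -"$flag")"
    done
}

# Example: show_limits        (both limits)
#          show_limits n      (open files only)
```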
Many thanks,
Francesco