Hi:
I managed to run a 256-process job on a single node. I ran a simple test
in which every process sends a message to all of the others.
This was with Sun's Binary Distribution of Open MPI on Solaris, which is
based on r16572 of the 1.2 branch. The machine had 8 cores.
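In case it is useful, here is a minimal sketch of what such an all-to-all
connectivity test does. This is only an illustration, not the actual
connectivity_c source from the Open MPI test suite: every rank exchanges a
small message with every other rank, walking the peers in the same order
on all ranks.

/* Illustrative all-to-all connectivity check (not the real connectivity_c). */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, peer, token;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Exchange one int with every other rank.  Using MPI_Sendrecv and the
     * same increasing peer order on every rank keeps the exchange from
     * deadlocking. */
    for (peer = 0; peer < size; peer++) {
        if (peer == rank) {
            continue;
        }
        MPI_Sendrecv(&rank, 1, MPI_INT, peer, 0,
                     &token, 1, MPI_INT, peer, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    /* Only report success after every rank has finished its exchanges. */
    MPI_Barrier(MPI_COMM_WORLD);
    if (rank == 0) {
        printf("Connectivity test on %d processes PASSED.\n", size);
    }

    MPI_Finalize();
    return 0;
}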
burl-ct-v40z-0 49 =>/opt/SUNWhpc/HPC7.1/bin/mpirun --mca
mpool_sm_max_size 2147483647 -np 256 connectivity_c
Connectivity test on 256 processes PASSED.
burl-ct-v40z-0 50 =>
burl-ct-v40z-0 50 =>/opt/SUNWhpc/HPC7.1/bin/mpirun --mca
mpool_sm_max_size 2147483647 -np 300 connectivity_c -v
Connectivity test on 300 processes PASSED.
burl-ct-v40z-0 54 =>limit
cputime unlimited
filesize unlimited
datasize unlimited
stacksize 10240 kbytes
coredumpsize 0 kbytes
vmemoryuse unlimited
descriptors 65536
burl-ct-v40z-0 55 =>
hbtcx...@yahoo.co.jp wrote:
I installed Open MPI 1.2.4 on Red Hat Enterprise Linux 3. It worked
fine in normal usage. I then tested runs with a very large number of
processes.
$ mpiexec -n 128 --host node0 --mca btl_tcp_if_include eth0 --mca
mpool_sm_max_size 2147483647 ./cpi
$ mpiexec -n 256 --host node0,node1 --mca btl_tcp_if_include eth0 --mca
mpool_sm_max_size 2147483647 ./cpi
With the mpiexec options specified as above, the execution succeeded
in both cases.
$ mpiexec -n 256 --host node0 --mca btl_tcp_if_include eth0 --mca
mpool_sm_max_size 2147483647 ./cpi
mpiexec noticed that job rank 0 with PID 0 on node node0 exited on
signal 15 (Terminated).
252 additional processes aborted (not shown)
$ mpiexec -n 512 --host node0,node1 --mca btl_tcp_if_include eth0 --mca
mpool_sm_max_size 2147483647 ./cpi
mpiexec noticed that job rank 0 with PID 0 on node node0 exited on
signal 15 (Terminated).
505 additional processes aborted (not shown)
When I increased the number of processes as shown above, the executions aborted.
Is it possible to execute 256 processes per node with Open MPI?
We would like to run as many processes per node as possible, even if the
performance becomes worse.
With MPICH we can run 256 processes per node, but the performance of
Open MPI may be better.
We understand that we will also need to increase the open-file limit, because
the current implementation uses many sockets.
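For reference, below is a minimal sketch of how a process can check and raise
its own open-file limit with the POSIX getrlimit/setrlimit interface. This is
only an illustration; in practice the limit is usually raised in the shell
that launches mpiexec ("ulimit -n" in sh/bash, "limit descriptors" in csh) or,
on Linux, in /etc/security/limits.conf.

/* Illustrative sketch: print and raise this process's open-file limit
 * up to the permitted hard limit.  Raising the hard limit itself
 * normally requires root. */
#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }
    printf("open files: soft=%lu hard=%lu\n",
           (unsigned long)rl.rlim_cur, (unsigned long)rl.rlim_max);

    /* Raise the soft limit to whatever the hard limit allows. */
    rl.rlim_cur = rl.rlim_max;
    if (setrlimit(RLIMIT_NOFILE, &rl) != 0) {
        perror("setrlimit");
        return 1;
    }
    return 0;
}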
SUSUKITA, Ryutaro
Peta-scale System Interconnect Project
Fukuoka Industry, Science & Technology Foundation