Steffen Brinkmann wrote:
Hi!

I have installed OpenMPI on a cluster consisting of ~30 nodes with 16 Xeon 
cores each. NFS is set up and working. For testing I have installed locally with

./configure --prefix=/home_dir/openmpi-1.4.3_installation/; make all install

Everything went smoothly so far. When I run a parallel program with
/home_dir/openmpi-1.4.3_installation/bin/mpirun -n 2 ./my_parprog

everything scales perfectly up to -n 16. When I go to -n 32, the execution time is the same as with -n 16.
/home_dir/openmpi-1.4.3_installation/bin/mpirun -n 32 hostname

returns the same node name 32 times.

The program is fine (it has been running for years on several machines) and
another MPI installation scales well, so the cluster should be OK as well.

What did I do wrong???

Thanks for any hint!

Steffen


--
Dr. Steffen Brinkmann
High Performance Computing Center Stuttgart (HLRS)
Nobelstraße 19
D - 70569 Stuttgart
Germany

Phone:  ++49(0)711 / 685-64548
Fax:    ++49(0)711 / 685-65832


Hi Steffen

See this FAQ:

http://www.open-mpi.org/faq/?category=running#mpirun-host
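
As a rough sketch of what that FAQ entry describes: without a resource
manager, mpirun only knows about the local node unless it is given a list
of hosts. The node names node01 and node02 and the file name my_hostfile
below are made up for illustration; the mpirun path is the one from the
original message. The script only launches if that mpirun binary exists.

```shell
#!/bin/sh
# Hypothetical hostfile name and node names; replace them with the
# real hostnames of the cluster.
HOSTFILE="${HOSTFILE:-./my_hostfile}"
MPIRUN="${MPIRUN:-/home_dir/openmpi-1.4.3_installation/bin/mpirun}"

# One line per node; "slots" is the number of processes to place there.
cat > "$HOSTFILE" <<'EOF'
node01 slots=16
node02 slots=16
EOF

# Add up the slots of all listed nodes (16 + 16 here).
TOTAL_SLOTS=$(awk -F'slots=' '{ sum += $2 } END { print sum }' "$HOSTFILE")
echo "hostfile $HOSTFILE provides $TOTAL_SLOTS slots"

# Launch only if the mpirun binary is actually present on this machine.
if [ -x "$MPIRUN" ]; then
    "$MPIRUN" --hostfile "$HOSTFILE" -n "$TOTAL_SLOTS" hostname
else
    echo "mpirun not found at $MPIRUN, skipping launch"
fi
```

With a hostfile like this, the hostname test should print more than one
node name once -n exceeds the slots of the first node.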

If you have a resource manager, such as Torque or SGE,
you can build OpenMPI with support for it.
This will obviate the need to specify the nodes,
as the resource manager will take care of that for you:

http://www.open-mpi.org/faq/?category=building#build-rte-tm
http://www.open-mpi.org/faq/?category=building#build-rte-sge
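
A minimal sketch of how one might check an existing build for resource
manager support, assuming the installation prefix from the original
message. The configure options appear only as comments, and the Torque
directory in them is a placeholder. The exact layout of the ompi_info
output may differ between versions, so the grep pattern is an assumption.

```shell
#!/bin/sh
# Rebuilding with resource manager support would look roughly like one of
# these (the Torque directory is a placeholder, not a real path):
#   ./configure --prefix=/home_dir/openmpi-1.4.3_installation/ --with-tm=<torque_dir>
#   ./configure --prefix=/home_dir/openmpi-1.4.3_installation/ --with-sge

# Path taken from the original message; override OMPI_INFO if it differs.
OMPI_INFO="${OMPI_INFO:-/home_dir/openmpi-1.4.3_installation/bin/ompi_info}"

# Collect the Torque (tm) and SGE (gridengine) components, if any.
RM_COMPONENTS=""
if [ -x "$OMPI_INFO" ]; then
    RM_COMPONENTS=$("$OMPI_INFO" | grep -E 'MCA (ras|plm): (tm|gridengine)' || true)
else
    echo "ompi_info not found at $OMPI_INFO"
fi

if [ -n "$RM_COMPONENTS" ]; then
    echo "$RM_COMPONENTS"
else
    echo "No Torque (tm) or SGE (gridengine) components reported"
fi
```

If no such components are listed, mpirun will not pick up the node list
from the resource manager, and the hosts have to be given explicitly.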

BTW, the OpenMPI FAQ is the de facto (and good)
OpenMPI documentation:

http://www.open-mpi.org/faq/

Other sources are the README file and the mpiexec man page.


I hope this helps,
Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------
