Re: [OMPI users] Problems on large clusters

2011-06-27 Thread Thorsten Schuett
As I said, there seems to be a problem starting the app on all nodes. I am planning to do some tests with orte-ps. I hope that I can get some information about why the app didn't start. Thorsten On Saturday, June 25, 2011, Jeff Squyres wrote: > Did this issue get resolved? You might also want to look

Re: [OMPI users] Problems on large clusters

2011-06-25 Thread Jeff Squyres
Did this issue get resolved? You might also want to look at our FAQ category for large clusters: http://www.open-mpi.org/faq/?category=large-clusters On Jun 22, 2011, at 9:43 AM, Thorsten Schuett wrote: > Thanks for the tip. I can't tell yet whether it helped or not. However, with > you

Re: [OMPI users] Problems on large clusters

2011-06-22 Thread Thorsten Schuett
Thanks for the tip. I can't tell yet whether it helped or not. However, with your settings I get the following warning: WARNING: Open MPI will create a shared memory backing file in a directory that appears to be mounted on a network filesystem. I repeated the run with my settings and I noticed t

Re: [OMPI users] Problems on large clusters

2011-06-22 Thread Gilbert Grosdidier
Hello Thorsten, I'm not surprised about the cluster type, indeed, but I do not remember getting the specific hang you mention. Anyway, I suspect SGI Altix is a little bit special for OpenMPI, and I usually run with the following setup: - there is a need to create for each job a specific tm

Re: [OMPI users] Problems on large clusters

2011-06-22 Thread Thorsten Schuett
Sure. It's an SGI ICE cluster with dual-rail IB. The HCAs are Mellanox ConnectX IB DDR. This is a 2040-core job. I use 255 nodes with one MPI task on each node and use 8-way OpenMP. I don't need -np and -machinefile, because mpiexec picks up this information from PBS. Thorsten On Tuesday, J

Re: [OMPI users] Problems on large clusters

2011-06-21 Thread Addepalli, Srirangam V
Hello Thorsten, What type of IB interface do you have (QLogic?). I often run into a similar issue when running 256-core jobs. It mostly happens for me when I hit a node with IB issues. Nothing related to openmpi. If you are using QLogic PSM, try using ping-pong ex to check availability of all node

Re: [OMPI users] Problems on large clusters

2011-06-21 Thread Gilbert Grosdidier
Hello Thorsten, Could you please be a little bit more specific about the cluster itself? G. On 21 June 11 at 17:46, Thorsten Schuett wrote: Hi, I am running openmpi 1.5.3 on an IB cluster and I have problems starting jobs on larger node counts. With small numbers of tasks, it usua