As I said, there seems to be a problem starting the app on all nodes. I am
planning to run some tests with orte-ps; I hope that will give me some
information about why the app didn't start.
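In case it helps, the kind of check I have in mind is roughly this (a sketch; I'm assuming the job is still registered with the mpirun on the head node):

```shell
# On the node where mpirun/mpiexec was launched, ask the ORTE runtime
# which jobs and processes it knows about; processes that are stuck
# before MPI_Init should show up in a non-running state.
orte-ps
```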
Thorsten
On Saturday, June 25, 2011, Jeff Squyres wrote:
Did this issue get resolved? You might also want to look at our FAQ category
for large clusters:
http://www.open-mpi.org/faq/?category=large-clusters
On Jun 22, 2011, at 9:43 AM, Thorsten Schuett wrote:
Thanks for the tip. I can't tell yet whether it helped or not. However, with
your settings I get the following warning:
WARNING: Open MPI will create a shared memory backing file in a
directory that appears to be mounted on a network filesystem.
I repeated the run with my settings and I noticed t
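For the backing-file warning quoted above, one possible workaround is to point Open MPI's session directory at node-local storage via the `orte_tmpdir_base` MCA parameter (a sketch; `/tmp` and `./app` are assumptions, and the directory must be on a node-local filesystem on your cluster):

```shell
# Place the session directory (and hence the shared-memory backing
# file) on a local filesystem instead of the network mount.
mpiexec --mca orte_tmpdir_base /tmp ./app
```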
Bonjour Thorsten,
Indeed, I'm not surprised about the cluster type,
but I don't remember hitting the specific hang-up you mention.
Anyway, I suspect the SGI Altix is a little bit special for Open MPI,
and I usually run with the following setup:
- one needs to create, for each job, a specific tm
Sure. It's an SGI ICE cluster with dual-rail IB. The HCAs are Mellanox
ConnectX IB DDR.
This is a 2040-core job. I use 255 nodes with one MPI task on each node and
8-way OpenMP.
I don't need -np and -machinefile, because mpiexec picks up this information
from PBS.
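A batch script for such a hybrid run might look roughly like this (a sketch; the resource-request line uses PBS Pro select syntax and `./app` is a placeholder, both of which will vary with the site setup):

```shell
#!/bin/sh
# Request 255 nodes, 8 cores each, with one MPI rank per node.
#PBS -l select=255:ncpus=8:mpiprocs=1

# 8 OpenMP threads per MPI task -> 255 * 8 = 2040 cores.
export OMP_NUM_THREADS=8

cd "$PBS_O_WORKDIR"
# mpiexec picks up the node list and task count from the PBS
# environment, so no -np or -machinefile is needed.
mpiexec ./app
```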
Thorsten
On Tuesday, J
Hello Thorsten
What type of IB interface do you have (QLogic?). I often run into a similar
issue when running 256-core jobs. It mostly happens for me when I hit a node
with IB issues,
nothing related to Open MPI. If you are using QLogic PSM, try using a
ping-pong example to check the availability of all nodes.
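A node-by-node check along those lines could be scripted roughly as follows (a sketch; `osu_latency` stands in for whatever ping-pong benchmark you have at hand, and using `$PBS_NODEFILE` with the first node as reference is an assumption):

```shell
# Run a two-process ping-pong between a reference node and every other
# node in the allocation; a node with bad IB typically hangs here or
# shows very high latency.
ref=$(sort -u "$PBS_NODEFILE" | head -n 1)
for node in $(sort -u "$PBS_NODEFILE"); do
    [ "$node" = "$ref" ] && continue
    echo "=== $ref <-> $node ==="
    mpiexec -H "$ref,$node" -np 2 ./osu_latency
done
```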
Bonjour Thorsten,
Could you please be a little bit more specific about the cluster
itself ?
G.
Le 21 juin 11 à 17:46, Thorsten Schuett a écrit :
Hi,
I am running Open MPI 1.5.3 on an IB cluster, and I have problems
starting jobs
on larger node counts. With small numbers of tasks, it usua