Basically, without --hetero-nodes, Open MPI assumes all nodes have the same
topology (fast startup).
With --hetero-nodes, Open MPI assumes nothing and requests each node's
topology (slower startup).

I am not sure whether this is still 100% true for all versions.
IIRC, at least on master, a hwloc signature is checked and Open MPI
transparently falls back to --hetero-nodes behavior if needed.

Bottom line: on a heterogeneous cluster, it is either required or at least
safer to use the --hetero-nodes option.
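
For example (just a sketch; the application name is an illustration):

  mpirun --hetero-nodes -np 16 ./a.out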


Cheers,

Gilles

On Wednesday, August 12, 2015, Dave Love <d.l...@liverpool.ac.uk> wrote:

> "Lane, William" <william.l...@cshs.org <javascript:;>> writes:
>
> > I can successfully run my OpenMPI 1.8.7 jobs outside of
> > Son-of-Gridengine but not via qrsh. We're
> > using CentOS 6.3 and a heterogeneous cluster of hyperthreaded and
> > non-hyperthreaded blades
> > and x3550 chassis. OpenMPI 1.8.7 has been built w/the debug switch as
> > well.
>
> I think you want to explain exactly why you need this world of pain.  It
> seems unlikely that MPI programs will run efficiently in it.  Our Intel
> nodes mostly have hyperthreading on in BIOS -- or what passes for BIOS
> on them -- but disabled at startup, and we only run MPI across identical
> nodes in the heterogeneous system.
>
> > Here's my latest errors:
> > qrsh -V -now yes -pe mpi 209 mpirun -np 209 -display-devel-map --prefix
> > /hpc/apps/mpi/openmpi/1.8.7/ --mca btl ^sm --hetero-nodes --bind-to core
> > /hpc/home/lanew/mpi/openmpi/ProcessColors3
>
> [What does --hetero-nodes do?  It's undocumented as far as I can tell.]
>
> > error: executing task of job 211298 failed: execution daemon on host
> > "csclprd3-0-4" didn't accept task
> > error: executing task of job 211298 failed: execution daemon on host
> > "csclprd3-4-1" didn't accept task
>
> So you need to find out why that was (probably lack of slots on the exec
> host, which might be explained in the execd messages).
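> For instance, on one of the failing hosts (the spool path below assumes a
> default cell and local spooling; adjust to your install):
>
>   tail /opt/sge/default/spool/csclprd3-0-4/messages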
>
> > [...]
>
> > NOTE: the hosts that "didn't accept task" were different in two
> > different runs, but the errors were the same.
> >
> > Here's the definition of the mpi Parallel Environment on our
> > Son-of-Gridengine cluster:
> >
> > pe_name            mpi
> > slots              9999
> > user_lists         NONE
> > xuser_lists        NONE
> > start_proc_args    /opt/sge/mpi/startmpi.sh $pe_hostfile
> > stop_proc_args     /opt/sge/mpi/stopmpi.sh
>
> Why are those two not NONE?
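> You can inspect and edit the PE with standard qconf commands:
>
>   qconf -sp mpi   # show the current PE definition
>   qconf -mp mpi   # edit it, e.g. set both of those to NONE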
>
> > allocation_rule    $fill_up
>
> As I said, that doesn't seem wise (unless you use -l exclusive).
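> For example, assuming an "exclusive" boolean complex has been set up on
> the hosts (it is not there by default):
>
>   qrsh -l exclusive=true -pe mpi 209 ...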
>
> > control_slaves     FALSE
> > job_is_first_task  TRUE
> > urgency_slots      min
> > accounting_summary TRUE
> > qsort_args         NONE
> >
> > Qsort_args is set to NONE, but it's supposed to be set to TRUE, right?
>
> No; see sge_pe(5).  (I think the text I supplied for the FAQ is accurate,
> but Reuti might confirm if he's reading this.)
>
> > -Bill L.
> >
> > If I can run my OpenMPI 1.8.7 jobs outside of Son-of-Gridengine w/no
> > issues, it has to be Son-of-Gridengine that's
> > the issue, right?
>
> I don't see any evidence of an SGE bug, if that's what you mean, but
> clearly you have a problem if execds won't accept the jobs, and this
> isn't the place to discuss it.  I asked about SGE core binding, and it's
> presumably also relevant how slots are defined on the compute nodes, but
> I'd just say "Don't do that" without a pressing reason.
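> If you do want to check, something like this shows the slot definitions
> (the queue name is illustrative):
>
>   qconf -sq all.q | grep slots
>   qconf -se csclprd3-0-4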
