You might also add the --display-allocation flag to mpirun so we can see what it thinks the allocation looks like. If there are only 16 slots on the node, it seems odd that OMPI would assign 32 procs to it unless it thinks there is only one node in the job and oversubscription is allowed (which it won't be by default if it read the GE allocation).
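
In case it helps when comparing the two views, here is a minimal Python sketch that checks the slot counts from the gridengine hostfile against the per-node process counts in the --display-map output. It assumes the line formats match the excerpts quoted below; the function names and the sample strings are only illustrative and are not part of Open MPI or gridengine.

```python
import re

# Sample text copied from the quoted output; real files may use tabs or
# different spacing, which the parsers below tolerate.
PE_HOSTFILE_SAMPLE = """\
cn6083 16 par6.q@cn6083 <NULL>
cn6085 16 par6.q@cn6085 <NULL>
"""

JOB_MAP_SAMPLE = """\
Data for node: cn6083 Num slots: 16 Max slots: 0 Num procs: 32
"""

# Assumed layout of a node header line in the --display-map output.
MAP_LINE = re.compile(
    r"Data for node:\s*(\S+)\s+Num slots:\s*(\d+)\s+"
    r"Max slots:\s*(\d+)\s+Num procs:\s*(\d+)"
)


def parse_pe_hostfile(text):
    """Return {host: slots} from hostfile lines of the form 'host slots queue ...'."""
    slots = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and anything whose second field is not a slot count.
        if len(fields) < 2 or not fields[1].isdigit():
            continue
        slots[fields[0]] = slots.get(fields[0], 0) + int(fields[1])
    return slots


def parse_job_map(text):
    """Return {host: num_procs} from the node header lines of a job map."""
    return {m.group(1): int(m.group(4)) for m in MAP_LINE.finditer(text)}


def find_oversubscribed(slots, procs):
    """Return {host: (procs, slots)} for hosts given more procs than slots."""
    return {
        host: (count, slots.get(host, 0))
        for host, count in procs.items()
        if count > slots.get(host, 0)
    }


if __name__ == "__main__":
    slots = parse_pe_hostfile(PE_HOSTFILE_SAMPLE)
    procs = parse_job_map(JOB_MAP_SAMPLE)
    print(find_oversubscribed(slots, procs))
```

With the numbers from the report, this flags cn6083 as carrying 32 procs on 16 slots, and cn6085 never shows up in the map at all, which is what you would expect if mpirun did not pick up the second node from the allocation.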

> On Nov 9, 2014, at 9:56 AM, Reuti <re...@staff.uni-marburg.de> wrote:
>
> Hi,
>
>> On 09.11.2014 at 18:20, SLIM H.A. <h.a.s...@durham.ac.uk> wrote:
>>
>> We switched on hyper threading on our cluster with two eight core sockets
>> per node (32 threads per node).
>>
>> We configured gridengine with 16 slots per node to allow the 16 extra
>> threads for kernel process use but this apparently does not work. Printout
>> of the gridengine hostfile shows that for a 32 slots job, 16 slots are
>> placed on each of two nodes as expected. Including the openmpi --display-map
>> option shows that all 32 processes are incorrectly placed on the head node.
>
> You mean the master node of the parallel job I assume.
>
>> Here is part of the output
>>
>> master=cn6083
>> PE=orte
>
> What allocation rule was defined for this PE - "control_slave yes" is set?
>
>> JOB_ID=2481793
>> Got 32 slots.
>> slots:
>> cn6083 16 par6.q@cn6083 <NULL>
>> cn6085 16 par6.q@cn6085 <NULL>
>> Sun Nov 9 16:50:59 GMT 2014
>> Data for JOB [44767,1] offset 0
>>
>> ======================== JOB MAP ========================
>>
>> Data for node: cn6083 Num slots: 16 Max slots: 0 Num procs: 32
>> Process OMPI jobid: [44767,1] App: 0 Process rank: 0
>> Process OMPI jobid: [44767,1] App: 0 Process rank: 1
>> ...
>> Process OMPI jobid: [44767,1] App: 0 Process rank: 31
>>
>> =============================================================
>>
>> I found some related mailings about a new warning in 1.8.2 about
>> oversubscription and I tried a few options to avoid the use of the extra
>> threads for MPI tasks by openmpi without success, e.g. variants of
>>
>> --cpus-per-proc 1
>> --bind-to-core
>>
>> and some others. Gridengine treats hw threads as cores==slots (?) but the
>> content of $PE_HOSTFILE suggests it distributes the slots sensibly so it
>> seems there is an option for openmpi required to get 16 cores per node?
>
> Was Open MPI configured with --with-sge?
>
> -- Reuti
>
>> I tried both 1.8.2, 1.8.3 and also 1.6.5.
>>
>> Thanks for some clarification that anyone can give.
>>
>> Henk
>>
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2014/11/25718.php
>
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/11/25719.php