On 11.11.2014 at 17:52, Ralph Castain wrote:
> 
>> On Nov 11, 2014, at 7:57 AM, Reuti <re...@staff.uni-marburg.de> wrote:
>> 
>> On 11.11.2014 at 16:13, Ralph Castain wrote:
>> 
>>> This clearly displays the problem - if you look at the reported "allocated nodes", you see that we only got one node (cn6050). This is why we mapped all your procs onto that node.
>>> 
>>> So the real question is - why? Can you show us the content of PE_HOSTFILE?
>>> 
>>>> On Nov 11, 2014, at 4:51 AM, SLIM H.A. <h.a.s...@durham.ac.uk> wrote:
>>>> 
>>>> Dear Reuti and Ralph,
>>>> 
>>>> Below is the output of the run for Open MPI 1.8.3 with this line:
>>>> 
>>>> mpirun -np $NSLOTS --display-map --display-allocation --cpus-per-proc 1 $exe
>>>> 
>>>> master=cn6050
>>>> PE=orte
>>>> JOB_ID=2482923
>>>> Got 32 slots.
>>>> slots:
>>>> cn6050 16 par6.q@cn6050 <NULL>
>>>> cn6045 16 par6.q@cn6045 <NULL>
>> 
>> The above looks like the PE_HOSTFILE. So it should be 16 slots per node.
> 
> Hey Reuti,
> 
> Is that the standard PE_HOSTFILE format? I'm looking at the ras/gridengine module, and it looks like it is expecting a different format. I suspect that is the problem.

Well, the fourth column can be a processor range in older versions of SGE and the binding in newer ones, but the first three columns were always this way.

-- Reuti
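(As a quick illustration, not from the original exchange: one way to see exactly what Open MPI has to work with is to dump the SGE variables Reuti mentions further down, plus the hostfile itself, from inside the job script just before mpirun. Plain sh syntax is assumed; the job script quoted below uses csh's setenv, so adjust as needed.)

    # Variables Open MPI checks to detect that it is running under SGE
    echo "SGE_ROOT=$SGE_ROOT"
    echo "JOB_ID=$JOB_ID"
    echo "ARC=$ARC"
    echo "PE_HOSTFILE=$PE_HOSTFILE"

    # The hostfile itself: one line per node; columns are
    # hostname, slot count, queue, processor range/binding
    cat "$PE_HOSTFILE"

If this shows both nodes with 16 slots each while mpirun's "ALLOCATED NODES" output lists only one node, the allocation is being lost somewhere between the hostfile and the gridengine module.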
> Ralph
> 
>> I wonder whether any environment variable was reset, which normally allows Open MPI to discover that it's running inside SGE.
>> 
>> I.e. SGE_ROOT, JOB_ID, ARC and PE_HOSTFILE are untouched before the job starts?
>> 
>> Supplying "-np $NSLOTS" shouldn't be necessary though.
>> 
>> -- Reuti
>> 
>>>> Tue Nov 11 12:37:37 GMT 2014
>>>> 
>>>> ====================== ALLOCATED NODES ======================
>>>> cn6050: slots=16 max_slots=0 slots_inuse=0 state=UP
>>>> =================================================================
>>>> Data for JOB [57374,1] offset 0
>>>> 
>>>> ======================== JOB MAP ========================
>>>> 
>>>> Data for node: cn6050 Num slots: 16 Max slots: 0 Num procs: 32
>>>> Process OMPI jobid: [57374,1] App: 0 Process rank: 0
>>>> Process OMPI jobid: [57374,1] App: 0 Process rank: 1
>>>> …
>>>> Process OMPI jobid: [57374,1] App: 0 Process rank: 31
>>>> 
>>>> Also
>>>> ompi_info | grep grid
>>>> gives MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.8.3)
>>>> and
>>>> ompi_info | grep psm
>>>> gives MCA mtl: psm (MCA v2.0, API v2.0, Component v1.8.3)
>>>> because the interconnect is TrueScale/QLogic
>>>> 
>>>> and
>>>> 
>>>> setenv OMPI_MCA_mtl "psm"
>>>> 
>>>> is set in the script. This is the PE:
>>>> 
>>>> pe_name            orte
>>>> slots              4000
>>>> user_lists         NONE
>>>> xuser_lists        NONE
>>>> start_proc_args    /bin/true
>>>> stop_proc_args     /bin/true
>>>> allocation_rule    $fill_up
>>>> control_slaves     TRUE
>>>> job_is_first_task  FALSE
>>>> urgency_slots      min
>>>> 
>>>> Many thanks
>>>> 
>>>> Henk
>>>> 
>>>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
>>>> Sent: 10 November 2014 05:07
>>>> To: Open MPI Users
>>>> Subject: Re: [OMPI users] oversubscription of slots with GridEngine
>>>> 
>>>> You might also add the --display-allocation flag to mpirun so we can see what it thinks the allocation looks like. If there are only 16 slots on the node, it seems odd that OMPI would assign 32 procs to it unless it thinks there is only 1 node in the job and oversubscription is allowed (which it won't be by default if it read the GE allocation).
>>>> 
>>>> On Nov 9, 2014, at 9:56 AM, Reuti <re...@staff.uni-marburg.de> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> On 09.11.2014 at 18:20, SLIM H.A. <h.a.s...@durham.ac.uk> wrote:
>>>> 
>>>> We switched on hyper-threading on our cluster with two eight-core sockets per node (32 threads per node).
>>>> 
>>>> We configured GridEngine with 16 slots per node to allow the 16 extra threads for kernel process use, but this apparently does not work. A printout of the GridEngine hostfile shows that for a 32-slot job, 16 slots are placed on each of two nodes as expected. Including the Open MPI --display-map option shows that all 32 processes are incorrectly placed on the head node.
>>>> 
>>>> You mean the master node of the parallel job, I assume.
>>>> 
>>>> Here is part of the output:
>>>> 
>>>> master=cn6083
>>>> PE=orte
>>>> 
>>>> What allocation rule was defined for this PE - is "control_slaves TRUE" set?
>>>> 
>>>> JOB_ID=2481793
>>>> Got 32 slots.
>>>> slots:
>>>> cn6083 16 par6.q@cn6083 <NULL>
>>>> cn6085 16 par6.q@cn6085 <NULL>
>>>> Sun Nov 9 16:50:59 GMT 2014
>>>> Data for JOB [44767,1] offset 0
>>>> 
>>>> ======================== JOB MAP ========================
>>>> 
>>>> Data for node: cn6083 Num slots: 16 Max slots: 0 Num procs: 32
>>>> Process OMPI jobid: [44767,1] App: 0 Process rank: 0
>>>> Process OMPI jobid: [44767,1] App: 0 Process rank: 1
>>>> ...
>>>> Process OMPI jobid: [44767,1] App: 0 Process rank: 31
>>>> 
>>>> =============================================================
>>>> 
>>>> I found some related mailings about a new warning in 1.8.2 about oversubscription, and I tried a few options to keep Open MPI from using the extra threads for MPI tasks, without success, e.g. variants of
>>>> 
>>>> --cpus-per-proc 1
>>>> --bind-to-core
>>>> 
>>>> and some others. GridEngine treats hardware threads as cores==slots (?), but the content of $PE_HOSTFILE suggests it distributes the slots sensibly, so it seems an Open MPI option is required to get 16 cores per node?
>>>> 
>>>> Was Open MPI configured with --with-sge?
>>>> 
>>>> -- Reuti
>>>> 
>>>> I tried 1.8.2 and 1.8.3, and also 1.6.5.
>>>> 
>>>> Thanks for any clarification anyone can give.
>>>> 
>>>> Henk
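(Again as an illustration rather than part of the thread: the points raised above reduce to a short checklist - Open MPI built with --with-sge, the SGE environment variables left untouched inside the job, and an mpirun line that lets the allocation come from GridEngine instead of overriding it. A rough sketch of such a job script follows; the PE name and slot count are taken from the thread, mpi_program is a placeholder for the real binary, and ras_base_verbose is the generic framework-verbosity knob assumed to be available in the 1.8 series - drop it once the allocation looks right.)

    #!/bin/sh
    #$ -pe orte 32
    #$ -cwd

    # No explicit -np: with SGE support compiled in, mpirun takes the
    # node and slot list from the GridEngine allocation itself.
    # --display-allocation / --display-map show what it decided;
    # ras_base_verbose logs how the allocation was read.
    mpirun --mca ras_base_verbose 10 \
           --display-allocation --display-map \
           ./mpi_program

If the "ALLOCATED NODES" section then still shows a single node while $PE_HOSTFILE lists two, the problem lies in how the hostfile is read or in the environment seen by mpirun, not in the mapping or binding options.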