Sorry to pester with questions, but I'm trying to narrow down the issue.

* What kind of chips are on these machines?
* If they have h/w threads, are they enabled?
* You might have lstopo on one of those machines - could you pass along its output? Otherwise, you can run a simple "mpirun -n 1 -mca ess_base_verbose 20 hostname" and it will print it out. You only need one node in your allocation, as we don't need a fountain of output.

I'll look into the segfault - it's hard to understand offhand, but it could be an uninitialized variable. If you have a chance, could you rerun that test with "-mca plm_base_verbose 10" on the cmd line?

Thanks again
Ralph

On Jun 6, 2014, at 10:31 AM, Dan Dietz <ddi...@purdue.edu> wrote:

> Thanks for the reply. I tried out the --display-allocation option with
> several different combinations and have attached the output. I see
> this behavior on RHEL6.4, RHEL6.5, and RHEL5.10 clusters.
>
>
> Here's debugging info on the segfault. Does that help? FWIW, this does
> not seem to crash on the RHEL5 or RHEL6.5 clusters. It just crashes
> on RHEL6.4.
>
> ddietz@conte-a009:/scratch/conte/d/ddietz/hello$ gdb -c core.22623 `which mpirun`
> No symbol table is loaded. Use the "file" command.
> GNU gdb (GDB) 7.5-1.3.187
> Copyright (C) 2012 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-unknown-linux-gnu".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from
> /scratch/conte/d/ddietz/openmpi-1.8.1-debug/intel-14.0.2.144/bin/mpirun...done.
> [New LWP 22623]
> [New LWP 22624]
>
> warning: Can't read pathname for load map: Input/output error.
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> Core was generated by `mpirun -np 2 -machinefile ./nodes ./hello'.
> Program terminated with signal 11, Segmentation fault.
> #0  0x00002acc602920e1 in orte_plm_base_complete_setup (fd=-1,
>     args=-1, cbdata=0x20c0840) at base/plm_base_launch_support.c:422
> 422         node->hostid = node->daemon->name.vpid;
> (gdb) bt
> #0  0x00002acc602920e1 in orte_plm_base_complete_setup (fd=-1,
>     args=-1, cbdata=0x20c0840) at base/plm_base_launch_support.c:422
> #1  0x00002acc60eec145 in opal_libevent2021_event_base_loop () from
>     /scratch/conte/d/ddietz/openmpi-1.8.1-debug/intel-14.0.2.144/lib/libopen-pal.so.6
> #2  0x00000000004073b5 in orterun (argc=6, argv=0x7fff5bb2a3a8) at orterun.c:1077
> #3  0x00000000004048f4 in main (argc=6, argv=0x7fff5bb2a3a8) at main.c:13
>
> ddietz@conte-a009:/scratch/conte/d/ddietz/hello$ cat nodes
> conte-a009
> conte-a009
> conte-a055
> conte-a055
> ddietz@conte-a009:/scratch/conte/d/ddietz/hello$ uname -r
> 2.6.32-358.14.1.el6.x86_64
>
> On Thu, Jun 5, 2014 at 7:54 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>
>> On Jun 5, 2014, at 2:13 PM, Dan Dietz <ddi...@purdue.edu> wrote:
>>
>>> Hello all,
>>>
>>> I'd like to bind 8 cores to a single MPI rank for hybrid MPI/OpenMP
>>> codes. In OMPI 1.6.3, I can do:
>>>
>>> $ mpirun -np 2 -cpus-per-rank 8 -machinefile ./nodes ./hello
>>>
>>> I get one rank bound to procs 0-7 and the other bound to 8-15. Great!
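
(The source of the ./hello test program isn't shown in the thread. A minimal hybrid MPI/OpenMP sketch along the following lines - file name, contents, and build line are all assumptions - is enough to confirm that kind of binding, since each OpenMP thread can report which CPU it is running on:)

    /* hello_affinity.c - hypothetical stand-in for the ./hello used above.
     * Build (assuming MPI compiler wrappers): mpicc -fopenmp hello_affinity.c -o hello
     * Each rank's threads should report CPUs inside that rank's bound set,
     * e.g. 0-7 for rank 0 and 8-15 for rank 1 with 8 cpus per rank. */
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <sched.h>
    #include <mpi.h>
    #include <omp.h>

    int main(int argc, char **argv)
    {
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        #pragma omp parallel
        {
            /* sched_getcpu() returns the CPU this thread is currently on (Linux/glibc) */
            printf("rank %d thread %d running on cpu %d\n",
                   rank, omp_get_thread_num(), sched_getcpu());
        }

        MPI_Finalize();
        return 0;
    }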
>>>
>>> But I'm having some difficulties doing this with openmpi 1.8.1:
>>>
>>> $ mpirun -np 2 -cpus-per-rank 8 -machinefile ./nodes ./hello
>>> --------------------------------------------------------------------------
>>> The following command line options and corresponding MCA parameter have
>>> been deprecated and replaced as follows:
>>>
>>>   Command line options:
>>>     Deprecated:  --cpus-per-proc, -cpus-per-proc, --cpus-per-rank, -cpus-per-rank
>>>     Replacement: --map-by <obj>:PE=N
>>>
>>>   Equivalent MCA parameter:
>>>     Deprecated:  rmaps_base_cpus_per_proc
>>>     Replacement: rmaps_base_mapping_policy=<obj>:PE=N
>>>
>>> The deprecated forms *will* disappear in a future version of Open MPI.
>>> Please update to the new syntax.
>>> --------------------------------------------------------------------------
>>> --------------------------------------------------------------------------
>>> There are not enough slots available in the system to satisfy the 2 slots
>>> that were requested by the application:
>>>   ./hello
>>>
>>> Either request fewer slots for your application, or make more slots
>>> available for use.
>>> --------------------------------------------------------------------------
>>>
>>> OK, let me try the new syntax...
>>>
>>> $ mpirun -np 2 --map-by core:pe=8 -machinefile ./nodes ./hello
>>> --------------------------------------------------------------------------
>>> There are not enough slots available in the system to satisfy the 2 slots
>>> that were requested by the application:
>>>   ./hello
>>>
>>> Either request fewer slots for your application, or make more slots
>>> available for use.
>>> --------------------------------------------------------------------------
>>>
>>> What am I doing wrong? The documentation on these new options is
>>> somewhat sparse and confusing, so I'm probably missing something. If
>>> anyone could provide some pointers here, it'd be much appreciated! If
>>> it's not something simple and you need config logs and such, please let
>>> me know.
>>
>> Looks like we think there are fewer than 16 slots allocated on that node.
>> What is in this "nodes" file? Without it, OMPI should read the Torque
>> allocation directly. You might check what we think the allocation is by
>> adding --display-allocation to the cmd line.
>>
>>>
>>> As a side note -
>>>
>>> If I try this using the PBS nodefile with the above, I get a confusing
>>> message:
>>>
>>> --------------------------------------------------------------------------
>>> A request for multiple cpus-per-proc was given, but a directive
>>> was also give to map to an object level that has less cpus than
>>> requested ones:
>>>
>>>   #cpus-per-proc:  8
>>>   number of cpus:  1
>>>   map-by:          BYCORE:NOOVERSUBSCRIBE
>>>
>>> Please specify a mapping level that has more cpus, or else let us
>>> define a default mapping that will allow multiple cpus-per-proc.
>>> --------------------------------------------------------------------------
>>>
>>> From what I've gathered, this is because I have a node listed 16 times
>>> in my PBS nodefile, so it's then assuming I have 1 core per node?
>>
>>
>> No - if it's listed 16 times, it should compute 16 slots. Try adding
>> --display-allocation to your cmd line and it should tell you how many slots
>> are present.
>>
>> However, it doesn't assume there is a core for each slot. Instead, it
>> detects the cores directly on the node. It sounds like it isn't seeing them
>> for some reason. What OS are you running on that node?
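
(For reference: combining Ralph's --display-allocation suggestion with the replacement syntax from the deprecation notice gives a test along these lines. The choice of mapping object (slot here) and the --report-bindings flag are not settled by the thread, so treat this as a sketch rather than the confirmed fix:

    $ mpirun -np 2 --map-by slot:pe=8 --display-allocation --report-bindings ./hello
)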
>>
>> FWIW: the 1.6 series has a different detection system for cores. Could be
>> something is causing problems for the new one.
>>
>>> Some better documentation here would be helpful. I haven't been able to
>>> figure out how to use the "oversubscribe" option listed in the docs.
>>> Not that I really want to oversubscribe, of course; I need to modify
>>> the nodefile. But this just stumped me for a while, as 1.6.3 didn't
>>> have this behavior.
>>>
>>>
>>> As an extra bonus, I get a segfault in this situation:
>>>
>>> $ mpirun -np 2 -machinefile ./nodes ./hello
>>> [conte-a497:13255] *** Process received signal ***
>>> [conte-a497:13255] Signal: Segmentation fault (11)
>>> [conte-a497:13255] Signal code: Address not mapped (1)
>>> [conte-a497:13255] Failing at address: 0x2c
>>> [conte-a497:13255] [ 0] /lib64/libpthread.so.0[0x3c9460f500]
>>> [conte-a497:13255] [ 1] /apps/rhel6/openmpi/1.8.1/intel-14.0.2.144/lib/libopen-rte.so.7(orte_plm_base_complete_setup+0x615)[0x2ba960a59015]
>>> [conte-a497:13255] [ 2] /apps/rhel6/openmpi/1.8.1/intel-14.0.2.144/lib/libopen-pal.so.6(opal_libevent2021_event_base_loop+0xa05)[0x2ba961666715]
>>> [conte-a497:13255] [ 3] mpirun(orterun+0x1b45)[0x40684f]
>>> [conte-a497:13255] [ 4] mpirun(main+0x20)[0x4047f4]
>>> [conte-a497:13255] [ 5] /lib64/libc.so.6(__libc_start_main+0xfd)[0x3a1bc1ecdd]
>>> [conte-a497:13255] [ 6] mpirun[0x404719]
>>> [conte-a497:13255] *** End of error message ***
>>> Segmentation fault (core dumped)
>>>
>>
>> Huh - that's odd. Could you perhaps configure OMPI with --enable-debug and
>> gdb the core file to tell us the line numbers involved?
>>
>> Thanks
>> Ralph
>>
>>>
>>> My "nodes" file simply contains the first two lines of my original
>>> $PBS_NODEFILE provided by Torque. See above for why I modified it. It
>>> works fine if I use the full file.
>>>
>>>
>>>
>>> Thanks in advance for any pointers you all may have!
>>>
>>> Dan
>>>
>>>
>>> --
>>> Dan Dietz
>>> Scientific Applications Analyst
>>> ITaP Research Computing, Purdue University
>
> --
> Dan Dietz
> Scientific Applications Analyst
> ITaP Research Computing, Purdue University
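
(On the segfault itself: "Failing at address: 0x2c" is a small offset rather than 0x0, which is consistent with node->daemon being NULL when line 422 of base/plm_base_launch_support.c reads node->daemon->name.vpid - i.e., the kind of uninitialized field Ralph suspects. Below is a small, self-contained C illustration of why a NULL-pointer member access faults at the member's offset; the struct layout is a stand-in chosen so the arithmetic lands on 0x2c, not the verified layout of the real ORTE types:)

    /* null_offset.c - illustrates why dereferencing a member through a NULL
     * struct pointer reports a small faulting address (the member's offset)
     * instead of 0x0.  The types below are stand-ins, not the real orte_proc_t. */
    #include <stdio.h>
    #include <stddef.h>
    #include <stdint.h>

    typedef struct { uint32_t jobid; uint32_t vpid; } name_t;

    typedef struct {
        char other_fields[40];  /* stand-in for whatever precedes "name" */
        name_t name;
    } proc_t;

    int main(void)
    {
        proc_t *daemon = NULL;  /* an unset daemon pointer, as suspected in the thread */
        size_t off = offsetof(proc_t, name) + offsetof(name_t, vpid);

        /* With this stand-in layout, off == 44 == 0x2c, matching the trace above;
         * reading daemon->name.vpid would therefore fault at address 0x2c. */
        printf("a NULL->name.vpid access would fault at address 0x%zx\n", off);

        (void)daemon;  /* not dereferenced here, so the example runs cleanly */
        return 0;
    }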