On Nov 13, 2013, at 4:43 PM, tmish...@jcity.maeda.co.jp wrote:

> Yes, node08 has 8 slots, but the number of processes I run is also 8.
>
> #PBS -l nodes=node08:ppn=8
>
> Therefore, I think it should allow this allocation. Is that right?

Correct

> My question is why script1 works and script2 does not. They are
> almost the same.
>
> #PBS -l nodes=node08:ppn=8
> export OMP_NUM_THREADS=1
> cd $PBS_O_WORKDIR
> cp $PBS_NODEFILE pbs_hosts
> NPROCS=`wc -l < pbs_hosts`
>
> #SCRIPT1
> mpirun -report-bindings -bind-to core Myprog
>
> #SCRIPT2
> mpirun -machinefile pbs_hosts -np ${NPROCS} -report-bindings -bind-to core
> Myprog

This version is not only reading the PBS allocation, but also invoking the
hostfile filter on top of it. Different code path. I'll take a look - it
should still match up assuming NPROCS=8.

Any possibility that it is a different number? I don't recall, but aren't
there some extra lines in the nodefile - e.g., comments?
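A quick way to check is to dump the unique hosts and the comment-stripped
line count from the pbs_hosts copy your script already makes - just a
sketch using standard tools:

    # each unique host and the number of times it appears
    grep -v -e '^#' -e '^$' pbs_hosts | sort | uniq -c

    # what NPROCS becomes once blank and comment lines are excluded
    echo "NPROCS=$(grep -vc -e '^#' -e '^$' pbs_hosts)"

If that second number isn't 8, it would explain the difference between
the two scripts.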
> tmishima
>
>> I guess here's my confusion. If you are using only one node, and that
>> node has 8 allocated slots, then we will not allow you to run more than
>> 8 processes on that node unless you specifically provide the
>> --oversubscribe flag. This is because you are operating in a managed
>> environment (in this case, under Torque), and so we treat the
>> allocation as "mandatory" by default.
>>
>> I suspect that is the issue here, in which case the system is behaving
>> as it should.
>>
>> Is the above accurate?
>>
>> On Nov 13, 2013, at 4:11 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>
>>> It has nothing to do with LAMA, as you aren't using that mapper.
>>>
>>> How many nodes are in this allocation?
>>>
>>> On Nov 13, 2013, at 4:06 PM, tmish...@jcity.maeda.co.jp wrote:
>>>
>>>> Hi Ralph, here is some additional information.
>>>>
>>>> This is the main part of the output after adding "-mca
>>>> rmaps_base_verbose 50":
>>>>
>>>> [node08.cluster:26952] [[56581,0],0] plm:base:setup_vm
>>>> [node08.cluster:26952] [[56581,0],0] plm:base:setup_vm creating map
>>>> [node08.cluster:26952] [[56581,0],0] plm:base:setup_vm only HNP in
>>>> allocation
>>>> [node08.cluster:26952] mca:rmaps: mapping job [56581,1]
>>>> [node08.cluster:26952] mca:rmaps: creating new map for job [56581,1]
>>>> [node08.cluster:26952] mca:rmaps:ppr: job [56581,1] not using ppr
>>>> mapper
>>>> [node08.cluster:26952] [[56581,0],0] rmaps:seq mapping job [56581,1]
>>>> [node08.cluster:26952] mca:rmaps:seq: job [56581,1] not using seq
>>>> mapper
>>>> [node08.cluster:26952] mca:rmaps:resilient: cannot perform initial
>>>> map of job [56581,1] - no fault groups
>>>> [node08.cluster:26952] mca:rmaps:mindist: job [56581,1] not using
>>>> mindist mapper
>>>> [node08.cluster:26952] mca:rmaps:rr: mapping job [56581,1]
>>>> [node08.cluster:26952] [[56581,0],0] Starting with 1 nodes in list
>>>> [node08.cluster:26952] [[56581,0],0] Filtering thru apps
>>>> [node08.cluster:26952] [[56581,0],0] Retained 1 nodes in list
>>>> [node08.cluster:26952] [[56581,0],0] Removing node node08 slots 0
>>>> inuse 0
>>>>
>>>> From this result, I guess it's related to oversubscription.
>>>> So I added "-oversubscribe" and reran; then it worked well, as shown
>>>> below:
>>>>
>>>> [node08.cluster:27019] [[56774,0],0] Starting with 1 nodes in list
>>>> [node08.cluster:27019] [[56774,0],0] Filtering thru apps
>>>> [node08.cluster:27019] [[56774,0],0] Retained 1 nodes in list
>>>> [node08.cluster:27019] AVAILABLE NODES FOR MAPPING:
>>>> [node08.cluster:27019]     node: node08 daemon: 0
>>>> [node08.cluster:27019] [[56774,0],0] Starting bookmark at node node08
>>>> [node08.cluster:27019] [[56774,0],0] Starting at node node08
>>>> [node08.cluster:27019] mca:rmaps:rr: mapping by slot for job
>>>> [56774,1] slots 1 num_procs 8
>>>> [node08.cluster:27019] mca:rmaps:rr:slot working node node08
>>>> [node08.cluster:27019] mca:rmaps:rr:slot node node08 is full -
>>>> skipping
>>>> [node08.cluster:27019] mca:rmaps:rr:slot job [56774,1] is
>>>> oversubscribed - performing second pass
>>>> [node08.cluster:27019] mca:rmaps:rr:slot working node node08
>>>> [node08.cluster:27019] mca:rmaps:rr:slot adding up to 8 procs to node
>>>> node08
>>>> [node08.cluster:27019] mca:rmaps:base: computing vpids by slot for
>>>> job [56774,1]
>>>> [node08.cluster:27019] mca:rmaps:base: assigning rank 0 to node node08
>>>> [node08.cluster:27019] mca:rmaps:base: assigning rank 1 to node node08
>>>> [node08.cluster:27019] mca:rmaps:base: assigning rank 2 to node node08
>>>> [node08.cluster:27019] mca:rmaps:base: assigning rank 3 to node node08
>>>> [node08.cluster:27019] mca:rmaps:base: assigning rank 4 to node node08
>>>> [node08.cluster:27019] mca:rmaps:base: assigning rank 5 to node node08
>>>> [node08.cluster:27019] mca:rmaps:base: assigning rank 6 to node node08
>>>> [node08.cluster:27019] mca:rmaps:base: assigning rank 7 to node node08
>>>>
>>>> I think something is wrong with the treatment of oversubscription,
>>>> which might be related to "#3893: LAMA mapper has problems".
>>>>
>>>> tmishima
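In the meantime, the workaround above amounts to a one-flag change to
script2 - a sketch, assuming the same single-node, 8-slot allocation;
per the log above, -oversubscribe simply lets the rr mapper take its
second pass instead of aborting:

    #!/bin/sh
    #PBS -l nodes=node08:ppn=8
    export OMP_NUM_THREADS=1
    cd $PBS_O_WORKDIR
    cp $PBS_NODEFILE pbs_hosts
    NPROCS=`wc -l < pbs_hosts`
    # -oversubscribe works around the "already filled" error that the
    # -machinefile/-np code path hits in 1.7.4a1
    mpirun -oversubscribe -machinefile pbs_hosts -np ${NPROCS} \
           -report-bindings -bind-to core Myprog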
>>>>> Hmmm... looks like we aren't getting your allocation. Can you rerun
>>>>> and add -mca ras_base_verbose 50?
>>>>>
>>>>> On Nov 12, 2013, at 11:30 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>>
>>>>>> Hi Ralph,
>>>>>>
>>>>>> Here is the output of "-mca plm_base_verbose 5":
>>>>>>
>>>>>> [node08.cluster:23573] mca:base:select:( plm) Querying component
>>>>>> [rsh]
>>>>>> [node08.cluster:23573] [[INVALID],INVALID] plm:rsh_lookup on
>>>>>> agent /usr/bin/rsh path NULL
>>>>>> [node08.cluster:23573] mca:base:select:( plm) Query of component
>>>>>> [rsh] set priority to 10
>>>>>> [node08.cluster:23573] mca:base:select:( plm) Querying component
>>>>>> [slurm]
>>>>>> [node08.cluster:23573] mca:base:select:( plm) Skipping component
>>>>>> [slurm]. Query failed to return a module
>>>>>> [node08.cluster:23573] mca:base:select:( plm) Querying component
>>>>>> [tm]
>>>>>> [node08.cluster:23573] mca:base:select:( plm) Query of component
>>>>>> [tm] set priority to 75
>>>>>> [node08.cluster:23573] mca:base:select:( plm) Selected component
>>>>>> [tm]
>>>>>> [node08.cluster:23573] plm:base:set_hnp_name: initial bias 23573
>>>>>> nodename hash 85176670
>>>>>> [node08.cluster:23573] plm:base:set_hnp_name: final jobfam 59480
>>>>>> [node08.cluster:23573] [[59480,0],0] plm:base:receive start comm
>>>>>> [node08.cluster:23573] [[59480,0],0] plm:base:setup_job
>>>>>> [node08.cluster:23573] [[59480,0],0] plm:base:setup_vm
>>>>>> [node08.cluster:23573] [[59480,0],0] plm:base:setup_vm creating map
>>>>>> [node08.cluster:23573] [[59480,0],0] plm:base:setup_vm only HNP in
>>>>>> allocation
>>>>>> --------------------------------------------------------------------------
>>>>>> All nodes which are allocated for this job are already filled.
>>>>>> --------------------------------------------------------------------------
>>>>>>
>>>>>> Here, the Open MPI configuration is as follows:
>>>>>>
>>>>>> ./configure \
>>>>>> --prefix=/home/mishima/opt/mpi/openmpi-1.7.4a1-pgi13.10 \
>>>>>> --with-tm \
>>>>>> --with-verbs \
>>>>>> --disable-ipv6 \
>>>>>> --disable-vt \
>>>>>> --enable-debug \
>>>>>> CC=pgcc CFLAGS="-tp k8-64e" \
>>>>>> CXX=pgCC CXXFLAGS="-tp k8-64e" \
>>>>>> F77=pgfortran FFLAGS="-tp k8-64e" \
>>>>>> FC=pgfortran FCFLAGS="-tp k8-64e"
>>>>>>
>>>>>>> Hi Ralph,
>>>>>>>
>>>>>>> Okay, I can help you. Please give me some time to report the
>>>>>>> output.
>>>>>>>
>>>>>>> Tetsuya Mishima
>>>>>>>
>>>>>>>> I can try, but I have no way of testing Torque any more - so all
>>>>>>>> I can do is a code review. If you can build --enable-debug and
>>>>>>>> add -mca plm_base_verbose 5 to your cmd line, I'd appreciate
>>>>>>>> seeing the output.
>>>>>>>>
>>>>>>>> On Nov 12, 2013, at 9:58 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>>>>>
>>>>>>>>> Hi Ralph,
>>>>>>>>>
>>>>>>>>> Thank you for your quick response.
>>>>>>>>>
>>>>>>>>> I'd like to report one more regression in the Torque support of
>>>>>>>>> openmpi-1.7.4a1r29646, which might be related to "#3893: LAMA
>>>>>>>>> mapper has problems", which I reported a few days ago.
>>>>>>>>>
>>>>>>>>> The script below does not work with openmpi-1.7.4a1r29646,
>>>>>>>>> although it worked with openmpi-1.7.3, as I told you before.
>>>>>>>>>
>>>>>>>>> #!/bin/sh
>>>>>>>>> #PBS -l nodes=node08:ppn=8
>>>>>>>>> export OMP_NUM_THREADS=1
>>>>>>>>> cd $PBS_O_WORKDIR
>>>>>>>>> cp $PBS_NODEFILE pbs_hosts
>>>>>>>>> NPROCS=`wc -l < pbs_hosts`
>>>>>>>>> mpirun -machinefile pbs_hosts -np ${NPROCS} -report-bindings
>>>>>>>>> -bind-to core Myprog
>>>>>>>>>
>>>>>>>>> If I drop "-machinefile pbs_hosts -np ${NPROCS}", then it works
>>>>>>>>> fine. Since this happens without a lama request, I guess it's
>>>>>>>>> not a problem in lama itself. Anyway, please look into this
>>>>>>>>> issue as well.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Tetsuya Mishima
>>>>>>>>>
>>>>>>>>>> Done - thanks!
>>>>>>>>>>
>>>>>>>>>> On Nov 12, 2013, at 7:35 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>>>>>>>
>>>>>>>>>>> Dear Open MPI developers,
>>>>>>>>>>>
>>>>>>>>>>> I got a segmentation fault in a trial use of
>>>>>>>>>>> openmpi-1.7.4a1r29646 built by PGI 13.10, as shown below:
>>>>>>>>>>>
>>>>>>>>>>> [mishima@manage testbed-openmpi-1.7.3]$ mpirun -np 4
>>>>>>>>>>> -cpus-per-proc 2 -report-bindings mPre
>>>>>>>>>>> [manage.cluster:23082] MCW rank 2 bound to socket 0[core 4[hwt
>>>>>>>>>>> 0]], socket 0[core 5[hwt 0]]: [././././B/B][./././././.]
>>>>>>>>>>> [manage.cluster:23082] MCW rank 3 bound to socket 1[core 6[hwt
>>>>>>>>>>> 0]], socket 1[core 7[hwt 0]]: [./././././.][B/B/./././.]
>>>>>>>>>>> [manage.cluster:23082] MCW rank 0 bound to socket 0[core 0[hwt
>>>>>>>>>>> 0]], socket 0[core 1[hwt 0]]: [B/B/./././.][./././././.]
>>>>>>>>>>> [manage.cluster:23082] MCW rank 1 bound to socket 0[core 2[hwt
>>>>>>>>>>> 0]], socket 0[core 3[hwt 0]]: [././B/B/./.][./././././.]
>>>>>>>>>>> [manage:23082] *** Process received signal ***
>>>>>>>>>>> [manage:23082] Signal: Segmentation fault (11)
>>>>>>>>>>> [manage:23082] Signal code: Address not mapped (1)
>>>>>>>>>>> [manage:23082] Failing at address: 0x34
>>>>>>>>>>> [manage:23082] *** End of error message ***
>>>>>>>>>>> Segmentation fault (core dumped)
>>>>>>>>>>>
>>>>>>>>>>> [mishima@manage testbed-openmpi-1.7.3]$ gdb mpirun core.23082
>>>>>>>>>>> GNU gdb (GDB) CentOS (7.0.1-42.el5.centos.1)
>>>>>>>>>>> Copyright (C) 2009 Free Software Foundation, Inc.
>>>>>>>>>>> ...
>>>>>>>>>>> Core was generated by `mpirun -np 4 -cpus-per-proc 2
>>>>>>>>>>> -report-bindings mPre'.
>>>>>>>>>>> Program terminated with signal 11, Segmentation fault.
>>>>>>>>>>> #0  0x00002b5f861c9c4f in recv_connect (mod=0x5f861ca20b00007f,
>>>>>>>>>>>     sd=32767, hdr=0x1ca20b00007fff25) at ./oob_tcp.c:631
>>>>>>>>>>> 631         peer = OBJ_NEW(mca_oob_tcp_peer_t);
>>>>>>>>>>> (gdb) where
>>>>>>>>>>> #0  0x00002b5f861c9c4f in recv_connect (mod=0x5f861ca20b00007f,
>>>>>>>>>>>     sd=32767, hdr=0x1ca20b00007fff25) at ./oob_tcp.c:631
>>>>>>>>>>> #1  0x00002b5f861ca20b in recv_handler (sd=1778385023,
>>>>>>>>>>>     flags=32767, cbdata=0x8eb06a00007fff25) at ./oob_tcp.c:760
>>>>>>>>>>> #2  0x00002b5f848eb06a in event_process_active_single_queue
>>>>>>>>>>>     (base=0x5f848eb27000007f, activeq=0x848eb27000007fff)
>>>>>>>>>>>     at ./event.c:1366
>>>>>>>>>>> #3  0x00002b5f848eb270 in event_process_active
>>>>>>>>>>>     (base=0x5f848eb84900007f) at ./event.c:1435
>>>>>>>>>>> #4  0x00002b5f848eb849 in opal_libevent2021_event_base_loop
>>>>>>>>>>>     (base=0x4077a000007f, flags=32767) at ./event.c:1645
>>>>>>>>>>> #5  0x00000000004077a0 in orterun (argc=7, argv=0x7fff25bbd4a8)
>>>>>>>>>>>     at ./orterun.c:1030
>>>>>>>>>>> #6  0x00000000004067fb in main (argc=7, argv=0x7fff25bbd4a8)
>>>>>>>>>>>     at ./main.c:13
>>>>>>>>>>> (gdb) quit
>>>>>>>>>>>
>>>>>>>>>>> Line 627 in orte/mca/oob/tcp/oob_tcp.c is apparently
>>>>>>>>>>> unnecessary, and it causes the segfault:
>>>>>>>>>>>
>>>>>>>>>>> 624        /* lookup the corresponding process */
>>>>>>>>>>> 625        peer = mca_oob_tcp_peer_lookup(mod, &hdr->origin);
>>>>>>>>>>> 626        if (NULL == peer) {
>>>>>>>>>>> 627            ui64 = (uint64_t*)(&peer->name);
>>>>>>>>>>> 628            opal_output_verbose(OOB_TCP_DEBUG_CONNECT,
>>>>>>>>>>>                        orte_oob_base_framework.framework_output,
>>>>>>>>>>> 629                    "%s mca_oob_tcp_recv_connect: connection from new peer",
>>>>>>>>>>> 630                    ORTE_NAME_PRINT(ORTE_PROC_MY_NAME));
>>>>>>>>>>> 631            peer = OBJ_NEW(mca_oob_tcp_peer_t);
>>>>>>>>>>> 632            peer->mod = mod;
>>>>>>>>>>> 633            peer->name = hdr->origin;
>>>>>>>>>>> 634            peer->state = MCA_OOB_TCP_ACCEPTING;
>>>>>>>>>>> 635            ui64 = (uint64_t*)(&peer->name);
>>>>>>>>>>> 636            if (OPAL_SUCCESS != opal_hash_table_set_value_uint64(&mod->peers, (*ui64), peer)) {
>>>>>>>>>>> 637                OBJ_RELEASE(peer);
>>>>>>>>>>> 638                return;
>>>>>>>>>>> 639            }
>>>>>>>>>>>
>>>>>>>>>>> Please fix this mistake in the next release.
>>>>>>>>>>>
>>>>>>>>>>> Regards,
>>>>>>>>>>> Tetsuya Mishima
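For clarity, the fix implied above is simply to delete the premature
assignment at line 627: inside that branch peer is known to be NULL, so
taking &peer->name goes through a NULL pointer, and line 635 already
makes the same assignment once the peer object exists. A minimal sketch
of the corrected block (an editor's reading of the snippet above, not a
verified upstream patch):

    /* lookup the corresponding process */
    peer = mca_oob_tcp_peer_lookup(mod, &hdr->origin);
    if (NULL == peer) {
        /* former line 627 removed here - peer is still NULL at this
           point, so computing &peer->name is a NULL-pointer access */
        opal_output_verbose(OOB_TCP_DEBUG_CONNECT,
                            orte_oob_base_framework.framework_output,
                            "%s mca_oob_tcp_recv_connect: connection from new peer",
                            ORTE_NAME_PRINT(ORTE_PROC_MY_NAME));
        peer = OBJ_NEW(mca_oob_tcp_peer_t);
        peer->mod = mod;
        peer->name = hdr->origin;
        peer->state = MCA_OOB_TCP_ACCEPTING;
        ui64 = (uint64_t*)(&peer->name);   /* valid now: peer exists */
        if (OPAL_SUCCESS !=
            opal_hash_table_set_value_uint64(&mod->peers, (*ui64), peer)) {
            OBJ_RELEASE(peer);
            return;
        }
    }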