Since what I really want is to run script2 correctly, please let us concentrate on script2.
I'm not an expert on the internals of Open MPI; all I can do is observe it from the outside. These lines look strange to me, especially the last one:

[node08.cluster:26952] mca:rmaps:rr: mapping job [56581,1]
[node08.cluster:26952] [[56581,0],0] Starting with 1 nodes in list
[node08.cluster:26952] [[56581,0],0] Filtering thru apps
[node08.cluster:26952] [[56581,0],0] Retained 1 nodes in list
[node08.cluster:26952] [[56581,0],0] Removing node node08 slots 0 inuse 0

These lines come from this part of orte_rmaps_base_get_target_nodes in
rmaps_base_support_fns.c:

    } else if (node->slots <= node->slots_inuse &&
               (ORTE_MAPPING_NO_OVERSUBSCRIBE & ORTE_GET_MAPPING_DIRECTIVE(policy))) {
        /* remove the node as fully used */
        OPAL_OUTPUT_VERBOSE((5, orte_rmaps_base_framework.framework_output,
                             "%s Removing node %s slots %d inuse %d",
                             ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
                             node->name, node->slots, node->slots_inuse));
        opal_list_remove_item(allocated_nodes, item);
        OBJ_RELEASE(item);  /* "un-retain" it */

I wonder why node->slots and node->slots_inuse are both 0, which I can read from the
line "Removing node node08 slots 0 inuse 0" above. Or, I'm not sure, but should
"else if (node->slots <= node->slots_inuse &&" perhaps be
"else if (node->slots < node->slots_inuse &&" instead? (A small sketch of how that
comparison behaves when both values are 0 is appended at the end of this message.)

tmishima

> On Nov 13, 2013, at 4:43 PM, tmish...@jcity.maeda.co.jp wrote:
>
> > Yes, the node08 has 8 slots but the process I run is also 8.
> >
> > #PBS -l nodes=node08:ppn=8
> >
> > Therefore, I think it should allow this allocation. Is that right?
>
> Correct
>
> > My question is why script1 works and script2 does not. They are
> > almost same.
> >
> > #PBS -l nodes=node08:ppn=8
> > export OMP_NUM_THREADS=1
> > cd $PBS_O_WORKDIR
> > cp $PBS_NODEFILE pbs_hosts
> > NPROCS=`wc -l < pbs_hosts`
> >
> > #SCRIPT1
> > mpirun -report-bindings -bind-to core Myprog
> >
> > #SCRIPT2
> > mpirun -machinefile pbs_hosts -np ${NPROCS} -report-bindings -bind-to core
> > Myprog
>
> This version is not only reading the PBS allocation, but also invoking the
> hostfile filter on top of it. Different code path. I'll take a look - it
> should still match up assuming NPROCS=8. Any possibility that it is a
> different number? I don't recall, but isn't there some extra lines in the
> nodefile - e.g., comments?
>
> > tmishima
> >
> >> I guess here's my confusion. If you are using only one node, and that
> >> node has 8 allocated slots, then we will not allow you to run more than 8
> >> processes on that node unless you specifically provide
> >> the --oversubscribe flag. This is because you are operating in a managed
> >> environment (in this case, under Torque), and so we treat the allocation as
> >> "mandatory" by default.
> >>
> >> I suspect that is the issue here, in which case the system is behaving as
> >> it should.
> >>
> >> Is the above accurate?
> >>
> >>
> >> On Nov 13, 2013, at 4:11 PM, Ralph Castain <r...@open-mpi.org> wrote:
> >>
> >>> It has nothing to do with LAMA as you aren't using that mapper.
> >>>
> >>> How many nodes are in this allocation?
> >>>
> >>> On Nov 13, 2013, at 4:06 PM, tmish...@jcity.maeda.co.jp wrote:
> >>>
> >>>> Hi Ralph, this is an additional information.
> >>>>
> >>>> Here is the main part of output by adding "-mca rmaps_base_verbose 50".
> >>>>
> >>>> [node08.cluster:26952] [[56581,0],0] plm:base:setup_vm
> >>>> [node08.cluster:26952] [[56581,0],0] plm:base:setup_vm creating map
> >>>> [node08.cluster:26952] [[56581,0],0] plm:base:setup_vm only HNP in allocation
> >>>> [node08.cluster:26952] mca:rmaps: mapping job [56581,1]
> >>>> [node08.cluster:26952] mca:rmaps: creating new map for job [56581,1]
> >>>> [node08.cluster:26952] mca:rmaps:ppr: job [56581,1] not using ppr mapper
> >>>> [node08.cluster:26952] [[56581,0],0] rmaps:seq mapping job [56581,1]
> >>>> [node08.cluster:26952] mca:rmaps:seq: job [56581,1] not using seq mapper
> >>>> [node08.cluster:26952] mca:rmaps:resilient: cannot perform initial map of
> >>>> job [56581,1] - no fault groups
> >>>> [node08.cluster:26952] mca:rmaps:mindist: job [56581,1] not using mindist mapper
> >>>> [node08.cluster:26952] mca:rmaps:rr: mapping job [56581,1]
> >>>> [node08.cluster:26952] [[56581,0],0] Starting with 1 nodes in list
> >>>> [node08.cluster:26952] [[56581,0],0] Filtering thru apps
> >>>> [node08.cluster:26952] [[56581,0],0] Retained 1 nodes in list
> >>>> [node08.cluster:26952] [[56581,0],0] Removing node node08 slots 0 inuse 0
> >>>>
> >>>> From this result, I guess it's related to oversubscribe.
> >>>> So I added "-oversubscribe" and reran, then it worked well as shown below:
> >>>>
> >>>> [node08.cluster:27019] [[56774,0],0] Starting with 1 nodes in list
> >>>> [node08.cluster:27019] [[56774,0],0] Filtering thru apps
> >>>> [node08.cluster:27019] [[56774,0],0] Retained 1 nodes in list
> >>>> [node08.cluster:27019] AVAILABLE NODES FOR MAPPING:
> >>>> [node08.cluster:27019] node: node08 daemon: 0
> >>>> [node08.cluster:27019] [[56774,0],0] Starting bookmark at node node08
> >>>> [node08.cluster:27019] [[56774,0],0] Starting at node node08
> >>>> [node08.cluster:27019] mca:rmaps:rr: mapping by slot for job [56774,1]
> >>>> slots 1 num_procs 8
> >>>> [node08.cluster:27019] mca:rmaps:rr:slot working node node08
> >>>> [node08.cluster:27019] mca:rmaps:rr:slot node node08 is full - skipping
> >>>> [node08.cluster:27019] mca:rmaps:rr:slot job [56774,1] is oversubscribed -
> >>>> performing second pass
> >>>> [node08.cluster:27019] mca:rmaps:rr:slot working node node08
> >>>> [node08.cluster:27019] mca:rmaps:rr:slot adding up to 8 procs to node node08
> >>>> [node08.cluster:27019] mca:rmaps:base: computing vpids by slot for job [56774,1]
> >>>> [node08.cluster:27019] mca:rmaps:base: assigning rank 0 to node node08
> >>>> [node08.cluster:27019] mca:rmaps:base: assigning rank 1 to node node08
> >>>> [node08.cluster:27019] mca:rmaps:base: assigning rank 2 to node node08
> >>>> [node08.cluster:27019] mca:rmaps:base: assigning rank 3 to node node08
> >>>> [node08.cluster:27019] mca:rmaps:base: assigning rank 4 to node node08
> >>>> [node08.cluster:27019] mca:rmaps:base: assigning rank 5 to node node08
> >>>> [node08.cluster:27019] mca:rmaps:base: assigning rank 6 to node node08
> >>>> [node08.cluster:27019] mca:rmaps:base: assigning rank 7 to node node08
> >>>>
> >>>> I think something is wrong with the treatment of oversubscription, which
> >>>> might be related to "#3893: LAMA mapper has problems".
> >>>>
> >>>> tmishima
> >>>>
> >>>>> Hmmm...looks like we aren't getting your allocation. Can you rerun and
> >>>>> add -mca ras_base_verbose 50?
> >>>>>
> >>>>> On Nov 12, 2013, at 11:30 PM, tmish...@jcity.maeda.co.jp wrote:
> >>>>>
> >>>>>> Hi Ralph,
> >>>>>>
> >>>>>> Here is the output of "-mca plm_base_verbose 5".
> >>>>>>
> >>>>>> [node08.cluster:23573] mca:base:select:( plm) Querying component [rsh]
> >>>>>> [node08.cluster:23573] [[INVALID],INVALID] plm:rsh_lookup on
> >>>>>> agent /usr/bin/rsh path NULL
> >>>>>> [node08.cluster:23573] mca:base:select:( plm) Query of component [rsh] set
> >>>>>> priority to 10
> >>>>>> [node08.cluster:23573] mca:base:select:( plm) Querying component [slurm]
> >>>>>> [node08.cluster:23573] mca:base:select:( plm) Skipping component [slurm].
> >>>>>> Query failed to return a module
> >>>>>> [node08.cluster:23573] mca:base:select:( plm) Querying component [tm]
> >>>>>> [node08.cluster:23573] mca:base:select:( plm) Query of component [tm] set
> >>>>>> priority to 75
> >>>>>> [node08.cluster:23573] mca:base:select:( plm) Selected component [tm]
> >>>>>> [node08.cluster:23573] plm:base:set_hnp_name: initial bias 23573 nodename
> >>>>>> hash 85176670
> >>>>>> [node08.cluster:23573] plm:base:set_hnp_name: final jobfam 59480
> >>>>>> [node08.cluster:23573] [[59480,0],0] plm:base:receive start comm
> >>>>>> [node08.cluster:23573] [[59480,0],0] plm:base:setup_job
> >>>>>> [node08.cluster:23573] [[59480,0],0] plm:base:setup_vm
> >>>>>> [node08.cluster:23573] [[59480,0],0] plm:base:setup_vm creating map
> >>>>>> [node08.cluster:23573] [[59480,0],0] plm:base:setup_vm only HNP in
> >>>>>> allocation
> >>>>>> --------------------------------------------------------------------------
> >>>>>> All nodes which are allocated for this job are already filled.
> >>>>>> --------------------------------------------------------------------------
> >>>>>>
> >>>>>> Here, openmpi's configuration is as follows:
> >>>>>>
> >>>>>> ./configure \
> >>>>>> --prefix=/home/mishima/opt/mpi/openmpi-1.7.4a1-pgi13.10 \
> >>>>>> --with-tm \
> >>>>>> --with-verbs \
> >>>>>> --disable-ipv6 \
> >>>>>> --disable-vt \
> >>>>>> --enable-debug \
> >>>>>> CC=pgcc CFLAGS="-tp k8-64e" \
> >>>>>> CXX=pgCC CXXFLAGS="-tp k8-64e" \
> >>>>>> F77=pgfortran FFLAGS="-tp k8-64e" \
> >>>>>> FC=pgfortran FCFLAGS="-tp k8-64e"
> >>>>>>
> >>>>>>> Hi Ralph,
> >>>>>>>
> >>>>>>> Okay, I can help you. Please give me some time to report the output.
> >>>>>>>
> >>>>>>> Tetsuya Mishima
> >>>>>>>
> >>>>>>>> I can try, but I have no way of testing Torque any more - so all I can do
> >>>>>>>> is a code review. If you can build --enable-debug and add -mca
> >>>>>>>> plm_base_verbose 5 to your cmd line, I'd appreciate seeing the output.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Nov 12, 2013, at 9:58 PM, tmish...@jcity.maeda.co.jp wrote:
> >>>>>>>>
> >>>>>>>>> Hi Ralph,
> >>>>>>>>>
> >>>>>>>>> Thank you for your quick response.
> >>>>>>>>>
> >>>>>>>>> I'd like to report one more regressive issue about Torque support of
> >>>>>>>>> openmpi-1.7.4a1r29646, which might be related to "#3893: LAMA mapper
> >>>>>>>>> has problems", which I reported a few days ago.
> >>>>>>>>>
> >>>>>>>>> The script below does not work with openmpi-1.7.4a1r29646,
> >>>>>>>>> although it worked with openmpi-1.7.3 as I told you before.
> >>>>>>>>>
> >>>>>>>>> #!/bin/sh
> >>>>>>>>> #PBS -l nodes=node08:ppn=8
> >>>>>>>>> export OMP_NUM_THREADS=1
> >>>>>>>>> cd $PBS_O_WORKDIR
> >>>>>>>>> cp $PBS_NODEFILE pbs_hosts
> >>>>>>>>> NPROCS=`wc -l < pbs_hosts`
> >>>>>>>>> mpirun -machinefile pbs_hosts -np ${NPROCS} -report-bindings -bind-to core
> >>>>>>>>> Myprog
> >>>>>>>>>
> >>>>>>>>> If I drop "-machinefile pbs_hosts -np ${NPROCS}", then it works fine.
> >>>>>>>>> Since this happens without a lama request, I guess it's not a problem
> >>>>>>>>> in lama itself. Anyway, please look into this issue as well.
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>> Tetsuya Mishima
> >>>>>>>>>
> >>>>>>>>>> Done - thanks!
> >>>>>>>>>>
> >>>>>>>>>> On Nov 12, 2013, at 7:35 PM, tmish...@jcity.maeda.co.jp wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Dear openmpi developers,
> >>>>>>>>>>>
> >>>>>>>>>>> I got a segmentation fault in a trial use of openmpi-1.7.4a1r29646 built by
> >>>>>>>>>>> PGI 13.10 as shown below:
> >>>>>>>>>>>
> >>>>>>>>>>> [mishima@manage testbed-openmpi-1.7.3]$ mpirun -np 4 -cpus-per-proc 2
> >>>>>>>>>>> -report-bindings mPre
> >>>>>>>>>>> [manage.cluster:23082] MCW rank 2 bound to socket 0[core 4[hwt 0]], socket
> >>>>>>>>>>> 0[core 5[hwt 0]]: [././././B/B][./././././.]
> >>>>>>>>>>> [manage.cluster:23082] MCW rank 3 bound to socket 1[core 6[hwt 0]], socket
> >>>>>>>>>>> 1[core 7[hwt 0]]: [./././././.][B/B/./././.]
> >>>>>>>>>>> [manage.cluster:23082] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket
> >>>>>>>>>>> 0[core 1[hwt 0]]: [B/B/./././.][./././././.]
> >>>>>>>>>>> [manage.cluster:23082] MCW rank 1 bound to socket 0[core 2[hwt 0]], socket
> >>>>>>>>>>> 0[core 3[hwt 0]]: [././B/B/./.][./././././.]
> >>>>>>>>>>> [manage:23082] *** Process received signal ***
> >>>>>>>>>>> [manage:23082] Signal: Segmentation fault (11)
> >>>>>>>>>>> [manage:23082] Signal code: Address not mapped (1)
> >>>>>>>>>>> [manage:23082] Failing at address: 0x34
> >>>>>>>>>>> [manage:23082] *** End of error message ***
> >>>>>>>>>>> Segmentation fault (core dumped)
> >>>>>>>>>>>
> >>>>>>>>>>> [mishima@manage testbed-openmpi-1.7.3]$ gdb mpirun core.23082
> >>>>>>>>>>> GNU gdb (GDB) CentOS (7.0.1-42.el5.centos.1)
> >>>>>>>>>>> Copyright (C) 2009 Free Software Foundation, Inc.
> >>>>>>>>>>> ...
> >>>>>>>>>>> Core was generated by `mpirun -np 4 -cpus-per-proc 2 -report-bindings
> >>>>>>>>>>> mPre'.
> >>>>>>>>>>> Program terminated with signal 11, Segmentation fault.
> >>>>>>>>>>> #0  0x00002b5f861c9c4f in recv_connect (mod=0x5f861ca20b00007f, sd=32767,
> >>>>>>>>>>>     hdr=0x1ca20b00007fff25) at ./oob_tcp.c:631
> >>>>>>>>>>> 631         peer = OBJ_NEW(mca_oob_tcp_peer_t);
> >>>>>>>>>>> (gdb) where
> >>>>>>>>>>> #0  0x00002b5f861c9c4f in recv_connect (mod=0x5f861ca20b00007f, sd=32767,
> >>>>>>>>>>>     hdr=0x1ca20b00007fff25) at ./oob_tcp.c:631
> >>>>>>>>>>> #1  0x00002b5f861ca20b in recv_handler (sd=1778385023, flags=32767,
> >>>>>>>>>>>     cbdata=0x8eb06a00007fff25) at ./oob_tcp.c:760
> >>>>>>>>>>> #2  0x00002b5f848eb06a in event_process_active_single_queue
> >>>>>>>>>>>     (base=0x5f848eb27000007f, activeq=0x848eb27000007fff)
> >>>>>>>>>>>     at ./event.c:1366
> >>>>>>>>>>> #3  0x00002b5f848eb270 in event_process_active (base=0x5f848eb84900007f)
> >>>>>>>>>>>     at ./event.c:1435
> >>>>>>>>>>> #4  0x00002b5f848eb849 in opal_libevent2021_event_base_loop
> >>>>>>>>>>>     (base=0x4077a000007f, flags=32767) at ./event.c:1645
> >>>>>>>>>>> #5  0x00000000004077a0 in orterun (argc=7, argv=0x7fff25bbd4a8)
> >>>>>>>>>>>     at ./orterun.c:1030
> >>>>>>>>>>> #6  0x00000000004067fb in main (argc=7, argv=0x7fff25bbd4a8)
> >>>>>>>>>>>     at ./main.c:13
> >>>>>>>>>>> (gdb) quit
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> The line 627 in orte/mca/oob/tcp/oob_tcp.c is apparently unnecessary, which
> >>>>>>>>>>> causes the segfault.
> >>>>>>>>>>>
> >>>>>>>>>>> 624         /* lookup the corresponding process */
> >>>>>>>>>>> 625         peer = mca_oob_tcp_peer_lookup(mod, &hdr->origin);
> >>>>>>>>>>> 626         if (NULL == peer) {
> >>>>>>>>>>> 627             ui64 = (uint64_t*)(&peer->name);
> >>>>>>>>>>> 628             opal_output_verbose(OOB_TCP_DEBUG_CONNECT,
> >>>>>>>>>>>                                     orte_oob_base_framework.framework_output,
> >>>>>>>>>>> 629                                 "%s mca_oob_tcp_recv_connect:
> >>>>>>>>>>>                                     connection from new peer",
> >>>>>>>>>>> 630                                 ORTE_NAME_PRINT(ORTE_PROC_MY_NAME));
> >>>>>>>>>>> 631             peer = OBJ_NEW(mca_oob_tcp_peer_t);
> >>>>>>>>>>> 632             peer->mod = mod;
> >>>>>>>>>>> 633             peer->name = hdr->origin;
> >>>>>>>>>>> 634             peer->state = MCA_OOB_TCP_ACCEPTING;
> >>>>>>>>>>> 635             ui64 = (uint64_t*)(&peer->name);
> >>>>>>>>>>> 636             if (OPAL_SUCCESS != opal_hash_table_set_value_uint64(&mod->peers, (*ui64), peer)) {
> >>>>>>>>>>> 637                 OBJ_RELEASE(peer);
> >>>>>>>>>>> 638                 return;
> >>>>>>>>>>> 639             }
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Please fix this mistake in the next release.
> >>>>>>>>>>>
> >>>>>>>>>>> Regards,
> >>>>>>>>>>> Tetsuya Mishima
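
P.S. Regarding the "<=" versus "<" question above, here is a minimal, self-contained
sketch of how that comparison behaves when a node reports slots 0 and inuse 0, as in
my verbose output. This is NOT the actual Open MPI code path; the struct and function
names are made up purely for illustration. It only shows that, with "<=", an empty
node whose slot count is 0 gets dropped, whereas "<" would keep it - though the real
question is probably why slots is 0 rather than 8 in the first place.

    /* slots_check_sketch.c - illustration only, not Open MPI source */
    #include <stdbool.h>
    #include <stdio.h>

    struct fake_node {
        const char *name;
        int slots;        /* slots allocated on the node */
        int slots_inuse;  /* slots already occupied */
    };

    /* Mirrors the shape of the test quoted from orte_rmaps_base_get_target_nodes:
     * with no-oversubscribe in effect, drop the node when slots <= slots_inuse. */
    static bool dropped_with_le(const struct fake_node *n, bool no_oversubscribe)
    {
        return no_oversubscribe && (n->slots <= n->slots_inuse);
    }

    /* The alternative comparison asked about in this message. */
    static bool dropped_with_lt(const struct fake_node *n, bool no_oversubscribe)
    {
        return no_oversubscribe && (n->slots < n->slots_inuse);
    }

    int main(void)
    {
        /* Values taken from the log line "Removing node node08 slots 0 inuse 0". */
        struct fake_node node08 = { "node08", 0, 0 };

        printf("with <= : removed = %d\n", dropped_with_le(&node08, true)); /* prints 1 */
        printf("with <  : removed = %d\n", dropped_with_lt(&node08, true)); /* prints 0 */
        return 0;
    }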