I guess here's my confusion. If you are using only one node, and that node has 
8 allocated slots, then we will not allow you to run more than 8 processes on 
that node unless you specifically provide the --oversubscribe flag. This is 
because you are operating in a managed environment (in this case, under 
Torque), and so we treat the allocation as "mandatory" by default.
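
For example, with an 8-slot allocation I would expect behavior like this
(hypothetical commands; ./a.out is just a placeholder for your program):

  mpirun -np 8 ./a.out                   # fills the allocation - allowed
  mpirun -np 9 ./a.out                   # exceeds the allocation - we error out
  mpirun -np 9 --oversubscribe ./a.out   # explicitly permits oversubscription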

I suspect that is the issue here, in which case the system is behaving as it 
should.

Is the above accurate?


On Nov 13, 2013, at 4:11 PM, Ralph Castain <r...@open-mpi.org> wrote:

> It has nothing to do with LAMA as you aren't using that mapper.
> 
> How many nodes are in this allocation?
> 
> On Nov 13, 2013, at 4:06 PM, tmish...@jcity.maeda.co.jp wrote:
> 
>> 
>> 
>> Hi Ralph, here is some additional information.
>> 
>> Here is the main part of the output after adding "-mca rmaps_base_verbose 50":
>> 
>> [node08.cluster:26952] [[56581,0],0] plm:base:setup_vm
>> [node08.cluster:26952] [[56581,0],0] plm:base:setup_vm creating map
>> [node08.cluster:26952] [[56581,0],0] plm:base:setup_vm only HNP in allocation
>> [node08.cluster:26952] mca:rmaps: mapping job [56581,1]
>> [node08.cluster:26952] mca:rmaps: creating new map for job [56581,1]
>> [node08.cluster:26952] mca:rmaps:ppr: job [56581,1] not using ppr mapper
>> [node08.cluster:26952] [[56581,0],0] rmaps:seq mapping job [56581,1]
>> [node08.cluster:26952] mca:rmaps:seq: job [56581,1] not using seq mapper
>> [node08.cluster:26952] mca:rmaps:resilient: cannot perform initial map of job [56581,1] - no fault groups
>> [node08.cluster:26952] mca:rmaps:mindist: job [56581,1] not using mindist mapper
>> [node08.cluster:26952] mca:rmaps:rr: mapping job [56581,1]
>> [node08.cluster:26952] [[56581,0],0] Starting with 1 nodes in list
>> [node08.cluster:26952] [[56581,0],0] Filtering thru apps
>> [node08.cluster:26952] [[56581,0],0] Retained 1 nodes in list
>> [node08.cluster:26952] [[56581,0],0] Removing node node08 slots 0 inuse 0
>> 
>> From this result ("Removing node node08 slots 0 inuse 0"), I guess it's
>> related to oversubscription. So I added "-oversubscribe" and reran, and
>> then it worked well, as shown below:
>> 
>> [node08.cluster:27019] [[56774,0],0] Starting with 1 nodes in list
>> [node08.cluster:27019] [[56774,0],0] Filtering thru apps
>> [node08.cluster:27019] [[56774,0],0] Retained 1 nodes in list
>> [node08.cluster:27019] AVAILABLE NODES FOR MAPPING:
>> [node08.cluster:27019]     node: node08 daemon: 0
>> [node08.cluster:27019] [[56774,0],0] Starting bookmark at node node08
>> [node08.cluster:27019] [[56774,0],0] Starting at node node08
>> [node08.cluster:27019] mca:rmaps:rr: mapping by slot for job [56774,1] slots 1 num_procs 8
>> [node08.cluster:27019] mca:rmaps:rr:slot working node node08
>> [node08.cluster:27019] mca:rmaps:rr:slot node node08 is full - skipping
>> [node08.cluster:27019] mca:rmaps:rr:slot job [56774,1] is oversubscribed - performing second pass
>> [node08.cluster:27019] mca:rmaps:rr:slot working node node08
>> [node08.cluster:27019] mca:rmaps:rr:slot adding up to 8 procs to node node08
>> [node08.cluster:27019] mca:rmaps:base: computing vpids by slot for job [56774,1]
>> [node08.cluster:27019] mca:rmaps:base: assigning rank 0 to node node08
>> [node08.cluster:27019] mca:rmaps:base: assigning rank 1 to node node08
>> [node08.cluster:27019] mca:rmaps:base: assigning rank 2 to node node08
>> [node08.cluster:27019] mca:rmaps:base: assigning rank 3 to node node08
>> [node08.cluster:27019] mca:rmaps:base: assigning rank 4 to node node08
>> [node08.cluster:27019] mca:rmaps:base: assigning rank 5 to node node08
>> [node08.cluster:27019] mca:rmaps:base: assigning rank 6 to node node08
>> [node08.cluster:27019] mca:rmaps:base: assigning rank 7 to node node08
>> 
>> I think something is wrong with the treatment of oversubscription (note
>> that the mapper reports "slots 1 num_procs 8" even though 8 slots were
>> allocated), which might be related to "#3893: LAMA mapper has problems".
>> 
>> tmishima
>> 
>>> Hmmm... looks like we aren't getting your allocation. Can you rerun and
>>> add -mca ras_base_verbose 50?
>>> 
>>> On Nov 12, 2013, at 11:30 PM, tmish...@jcity.maeda.co.jp wrote:
>>> 
>>>> 
>>>> 
>>>> Hi Ralph,
>>>> 
>>>> Here is the output of "-mca plm_base_verbose 5".
>>>> 
>>>> [node08.cluster:23573] mca:base:select:(  plm) Querying component [rsh]
>>>> [node08.cluster:23573] [[INVALID],INVALID] plm:rsh_lookup on agent /usr/bin/rsh path NULL
>>>> [node08.cluster:23573] mca:base:select:(  plm) Query of component [rsh] set priority to 10
>>>> [node08.cluster:23573] mca:base:select:(  plm) Querying component [slurm]
>>>> [node08.cluster:23573] mca:base:select:(  plm) Skipping component [slurm]. Query failed to return a module
>>>> [node08.cluster:23573] mca:base:select:(  plm) Querying component [tm]
>>>> [node08.cluster:23573] mca:base:select:(  plm) Query of component [tm] set priority to 75
>>>> [node08.cluster:23573] mca:base:select:(  plm) Selected component [tm]
>>>> [node08.cluster:23573] plm:base:set_hnp_name: initial bias 23573 nodename hash 85176670
>>>> [node08.cluster:23573] plm:base:set_hnp_name: final jobfam 59480
>>>> [node08.cluster:23573] [[59480,0],0] plm:base:receive start comm
>>>> [node08.cluster:23573] [[59480,0],0] plm:base:setup_job
>>>> [node08.cluster:23573] [[59480,0],0] plm:base:setup_vm
>>>> [node08.cluster:23573] [[59480,0],0] plm:base:setup_vm creating map
>>>> [node08.cluster:23573] [[59480,0],0] plm:base:setup_vm only HNP in allocation
>>>> 
>>>> --------------------------------------------------------------------------
>>>> All nodes which are allocated for this job are already filled.
>>>> --------------------------------------------------------------------------
>>>> 
>>>> Here is how Open MPI was configured:
>>>> 
>>>> ./configure \
>>>> --prefix=/home/mishima/opt/mpi/openmpi-1.7.4a1-pgi13.10 \
>>>> --with-tm \
>>>> --with-verbs \
>>>> --disable-ipv6 \
>>>> --disable-vt \
>>>> --enable-debug \
>>>> CC=pgcc CFLAGS="-tp k8-64e" \
>>>> CXX=pgCC CXXFLAGS="-tp k8-64e" \
>>>> F77=pgfortran FFLAGS="-tp k8-64e" \
>>>> FC=pgfortran FCFLAGS="-tp k8-64e"
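>>>> 
>>>> (As a quick sanity check that debug support really was compiled in,
>>>> something like "ompi_info | grep -i debug" should report
>>>> "Internal debug support: yes" for an --enable-debug build.)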
>>>> 
>>>>> Hi Ralph,
>>>>> 
>>>>> Okay, I can help you. Please give me some time to report the output.
>>>>> 
>>>>> Tetsuya Mishima
>>>>> 
>>>>>> I can try, but I have no way of testing Torque any more - so all I can
>>>>>> do is a code review. If you can build --enable-debug and add -mca
>>>>>> plm_base_verbose 5 to your cmd line, I'd appreciate seeing the output.
>>>>>> 
>>>>>> 
>>>>>> On Nov 12, 2013, at 9:58 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> Hi Ralph,
>>>>>>> 
>>>>>>> Thank you for your quick response.
>>>>>>> 
>>>>>>> I'd like to report one more regression in the Torque support of
>>>>>>> openmpi-1.7.4a1r29646; it might be related to "#3893: LAMA mapper has
>>>>>>> problems", which I reported a few days ago.
>>>>>>> 
>>>>>>> The script below does not work with openmpi-1.7.4a1r29646,
>>>>>>> although it worked with openmpi-1.7.3 as I told you before.
>>>>>>> 
>>>>>>> #!/bin/sh
>>>>>>> #PBS -l nodes=node08:ppn=8
>>>>>>> export OMP_NUM_THREADS=1
>>>>>>> cd $PBS_O_WORKDIR
>>>>>>> cp $PBS_NODEFILE pbs_hosts
>>>>>>> NPROCS=`wc -l < pbs_hosts`
>>>>>>> mpirun -machinefile pbs_hosts -np ${NPROCS} -report-bindings -bind-to core Myprog
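>>>>>>> 
>>>>>>> (For reference: with "#PBS -l nodes=node08:ppn=8", $PBS_NODEFILE should
>>>>>>> contain "node08" once per slot, i.e. eight times, so NPROCS evaluates to
>>>>>>> 8 and mpirun requests exactly as many processes as allocated slots.)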
>>>>>>> 
>>>>>>> If I drop "-machinefile pbs_hosts -np ${NPROCS}", then it works fine.
>>>>>>> Since this happens without a lama request, I guess the problem is not
>>>>>>> in lama itself. Anyway, please look into this issue as well.
>>>>>>> 
>>>>>>> Regards,
>>>>>>> Tetsuya Mishima
>>>>>>> 
>>>>>>>> Done - thanks!
>>>>>>>> 
>>>>>>>> On Nov 12, 2013, at 7:35 PM, tmish...@jcity.maeda.co.jp wrote:
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Dear openmpi developers,
>>>>>>>>> 
>>>>>>>>> I got a segmentation fault in a trial run of openmpi-1.7.4a1r29646
>>>>>>>>> built with PGI 13.10, as shown below:
>>>>>>>>> 
>>>>>>>>> [mishima@manage testbed-openmpi-1.7.3]$ mpirun -np 4 -cpus-per-proc 2 -report-bindings mPre
>>>>>>>>> [manage.cluster:23082] MCW rank 2 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [././././B/B][./././././.]
>>>>>>>>> [manage.cluster:23082] MCW rank 3 bound to socket 1[core 6[hwt 0]], socket 1[core 7[hwt 0]]: [./././././.][B/B/./././.]
>>>>>>>>> [manage.cluster:23082] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./././.][./././././.]
>>>>>>>>> [manage.cluster:23082] MCW rank 1 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B/./.][./././././.]
>>>>>>>>> [manage:23082] *** Process received signal ***
>>>>>>>>> [manage:23082] Signal: Segmentation fault (11)
>>>>>>>>> [manage:23082] Signal code: Address not mapped (1)
>>>>>>>>> [manage:23082] Failing at address: 0x34
>>>>>>>>> [manage:23082] *** End of error message ***
>>>>>>>>> Segmentation fault (core dumped)
>>>>>>>>> 
>>>>>>>>> [mishima@manage testbed-openmpi-1.7.3]$ gdb mpirun core.23082
>>>>>>>>> GNU gdb (GDB) CentOS (7.0.1-42.el5.centos.1)
>>>>>>>>> Copyright (C) 2009 Free Software Foundation, Inc.
>>>>>>>>> ...
>>>>>>>>> Core was generated by `mpirun -np 4 -cpus-per-proc 2 -report-bindings mPre'.
>>>>>>>>> Program terminated with signal 11, Segmentation fault.
>>>>>>>>> #0  0x00002b5f861c9c4f in recv_connect (mod=0x5f861ca20b00007f, sd=32767, hdr=0x1ca20b00007fff25) at ./oob_tcp.c:631
>>>>>>>>> 631             peer = OBJ_NEW(mca_oob_tcp_peer_t);
>>>>>>>>> (gdb) where
>>>>>>>>> #0  0x00002b5f861c9c4f in recv_connect (mod=0x5f861ca20b00007f, sd=32767, hdr=0x1ca20b00007fff25) at ./oob_tcp.c:631
>>>>>>>>> #1  0x00002b5f861ca20b in recv_handler (sd=1778385023, flags=32767, cbdata=0x8eb06a00007fff25) at ./oob_tcp.c:760
>>>>>>>>> #2  0x00002b5f848eb06a in event_process_active_single_queue (base=0x5f848eb27000007f, activeq=0x848eb27000007fff) at ./event.c:1366
>>>>>>>>> #3  0x00002b5f848eb270 in event_process_active (base=0x5f848eb84900007f) at ./event.c:1435
>>>>>>>>> #4  0x00002b5f848eb849 in opal_libevent2021_event_base_loop (base=0x4077a000007f, flags=32767) at ./event.c:1645
>>>>>>>>> #5  0x00000000004077a0 in orterun (argc=7, argv=0x7fff25bbd4a8) at ./orterun.c:1030
>>>>>>>>> #6  0x00000000004067fb in main (argc=7, argv=0x7fff25bbd4a8) at ./main.c:13
>>>>>>>>> (gdb) quit
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Line 627 in orte/mca/oob/tcp/oob_tcp.c is apparently unnecessary; it
>>>>>>>>> takes &peer->name while peer is still NULL (note the failing address
>>>>>>>>> 0x34, a small offset from NULL), which causes the segfault.
>>>>>>>>> 
>>>>>>>>> 624      /* lookup the corresponding process */
>>>>>>>>> 625      peer = mca_oob_tcp_peer_lookup(mod, &hdr->origin);
>>>>>>>>> 626      if (NULL == peer) {
>>>>>>>>> 627          ui64 = (uint64_t*)(&peer->name);
>>>>>>>>> 628          opal_output_verbose(OOB_TCP_DEBUG_CONNECT, orte_oob_base_framework.framework_output,
>>>>>>>>> 629                              "%s mca_oob_tcp_recv_connect: connection from new peer",
>>>>>>>>> 630                              ORTE_NAME_PRINT(ORTE_PROC_MY_NAME));
>>>>>>>>> 631          peer = OBJ_NEW(mca_oob_tcp_peer_t);
>>>>>>>>> 632          peer->mod = mod;
>>>>>>>>> 633          peer->name = hdr->origin;
>>>>>>>>> 634          peer->state = MCA_OOB_TCP_ACCEPTING;
>>>>>>>>> 635          ui64 = (uint64_t*)(&peer->name);
>>>>>>>>> 636          if (OPAL_SUCCESS != opal_hash_table_set_value_uint64(&mod->peers, (*ui64), peer)) {
>>>>>>>>> 637              OBJ_RELEASE(peer);
>>>>>>>>> 638              return;
>>>>>>>>> 639          }
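>>>>>>>>> 
>>>>>>>>> As a minimal sketch (my reading of the code, not a tested patch), the
>>>>>>>>> block should work with line 627 simply removed, since ui64 is only
>>>>>>>>> needed after peer has been allocated and named:
>>>>>>>>> 
>>>>>>>>>      /* lookup the corresponding process */
>>>>>>>>>      peer = mca_oob_tcp_peer_lookup(mod, &hdr->origin);
>>>>>>>>>      if (NULL == peer) {
>>>>>>>>>          /* peer is NULL here, so we must not touch peer->name yet */
>>>>>>>>>          opal_output_verbose(OOB_TCP_DEBUG_CONNECT,
>>>>>>>>>                              orte_oob_base_framework.framework_output,
>>>>>>>>>                              "%s mca_oob_tcp_recv_connect: connection from new peer",
>>>>>>>>>                              ORTE_NAME_PRINT(ORTE_PROC_MY_NAME));
>>>>>>>>>          peer = OBJ_NEW(mca_oob_tcp_peer_t);
>>>>>>>>>          peer->mod = mod;
>>>>>>>>>          peer->name = hdr->origin;
>>>>>>>>>          peer->state = MCA_OOB_TCP_ACCEPTING;
>>>>>>>>>          /* safe now: peer points at a freshly allocated object */
>>>>>>>>>          ui64 = (uint64_t*)(&peer->name);
>>>>>>>>>          if (OPAL_SUCCESS != opal_hash_table_set_value_uint64(&mod->peers, (*ui64), peer)) {
>>>>>>>>>              OBJ_RELEASE(peer);
>>>>>>>>>              return;
>>>>>>>>>          }
>>>>>>>>>      }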
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Please fix this mistake in the next release.
>>>>>>>>> 
>>>>>>>>> Regards,
>>>>>>>>> Tetsuya Mishima
>>>>>>>>> 