Okay, I found it - fix coming in a bit.

Thanks!
Ralph

On Mar 21, 2013, at 4:02 PM, tmish...@jcity.maeda.co.jp wrote:

> 
> 
> Hi Ralph,
> 
> Sorry for the late reply. Here is my result.
> 
> mpirun -v -np 8 -hostfile pbs_hosts -x OMP_NUM_THREADS --display-allocation
> -mca ras_base_verbose 5 -mca rmaps_base_verbose 5
> /home/mishima/Ducom/testbed/mPre m02-ld
> [node04.cluster:28175] mca:base:select:(  ras) Querying component
> [loadleveler]
> [node04.cluster:28175] [[29518,0],0] ras:loadleveler: NOT available for
> selection
> [node04.cluster:28175] mca:base:select:(  ras) Skipping component
> [loadleveler]. Query failed to return a module
> [node04.cluster:28175] mca:base:select:(  ras) Querying component
> [simulator]
> [node04.cluster:28175] mca:base:select:(  ras) Skipping component
> [simulator]. Query failed to return a module
> [node04.cluster:28175] mca:base:select:(  ras) Querying component [slurm]
> [node04.cluster:28175] [[29518,0],0] ras:slurm: NOT available for selection
> [node04.cluster:28175] mca:base:select:(  ras) Skipping component [slurm].
> Query failed to return a module
> [node04.cluster:28175] mca:base:select:(  ras) Querying component [tm]
> [node04.cluster:28175] mca:base:select:(  ras) Query of component [tm] set
> priority to 100
> [node04.cluster:28175] mca:base:select:(  ras) Selected component [tm]
> [node04.cluster:28175] mca:rmaps:select: checking available component ppr
> [node04.cluster:28175] mca:rmaps:select: Querying component [ppr]
> [node04.cluster:28175] mca:rmaps:select: checking available component
> rank_file
> [node04.cluster:28175] mca:rmaps:select: Querying component [rank_file]
> [node04.cluster:28175] mca:rmaps:select: checking available component
> resilient
> [node04.cluster:28175] mca:rmaps:select: Querying component [resilient]
> [node04.cluster:28175] mca:rmaps:select: checking available component
> round_robin
> [node04.cluster:28175] mca:rmaps:select: Querying component [round_robin]
> [node04.cluster:28175] mca:rmaps:select: checking available component seq
> [node04.cluster:28175] mca:rmaps:select: Querying component [seq]
> [node04.cluster:28175] [[29518,0],0]: Final mapper priorities
> [node04.cluster:28175]  Mapper: ppr Priority: 90
> [node04.cluster:28175]  Mapper: seq Priority: 60
> [node04.cluster:28175]  Mapper: resilient Priority: 40
> [node04.cluster:28175]  Mapper: round_robin Priority: 10
> [node04.cluster:28175]  Mapper: rank_file Priority: 0
> [node04.cluster:28175] [[29518,0],0] ras:base:allocate
> [node04.cluster:28175] [[29518,0],0] ras:tm:allocate:discover: got hostname
> node04
> [node04.cluster:28175] [[29518,0],0] ras:tm:allocate:discover: not found --
> added to list
> [node04.cluster:28175] [[29518,0],0] ras:tm:allocate:discover: got hostname
> node04
> [node04.cluster:28175] [[29518,0],0] ras:tm:allocate:discover: found --
> bumped slots to 2
> [node04.cluster:28175] [[29518,0],0] ras:tm:allocate:discover: got hostname
> node04
> [node04.cluster:28175] [[29518,0],0] ras:tm:allocate:discover: found --
> bumped slots to 3
> [node04.cluster:28175] [[29518,0],0] ras:tm:allocate:discover: got hostname
> node04
> [node04.cluster:28175] [[29518,0],0] ras:tm:allocate:discover: found --
> bumped slots to 4
> [node04.cluster:28175] [[29518,0],0] ras:tm:allocate:discover: got hostname
> node03
> [node04.cluster:28175] [[29518,0],0] ras:tm:allocate:discover: not found --
> added to list
> [node04.cluster:28175] [[29518,0],0] ras:tm:allocate:discover: got hostname
> node03
> [node04.cluster:28175] [[29518,0],0] ras:tm:allocate:discover: found --
> bumped slots to 2
> [node04.cluster:28175] [[29518,0],0] ras:tm:allocate:discover: got hostname
> node03
> [node04.cluster:28175] [[29518,0],0] ras:tm:allocate:discover: found --
> bumped slots to 3
> [node04.cluster:28175] [[29518,0],0] ras:tm:allocate:discover: got hostname
> node03
> [node04.cluster:28175] [[29518,0],0] ras:tm:allocate:discover: found --
> bumped slots to 4
> [node04.cluster:28175] [[29518,0],0] ras:base:node_insert inserting 2 nodes
> [node04.cluster:28175] [[29518,0],0] ras:base:node_insert updating HNP info
> to 4 slots
> [node04.cluster:28175] [[29518,0],0] ras:base:node_insert node node03
> 
> ======================   ALLOCATED NODES   ======================
> 
> Data for node: node04  Num slots: 4    Max slots: 0
> Data for node: node03  Num slots: 4    Max slots: 0
> 
> =================================================================
> [node04.cluster:28175] HOSTFILE: CHECKING FILE NODE node04 VS LIST NODE
> node03
> --------------------------------------------------------------------------
> A hostfile was provided that contains at least one node not
> present in the allocation:
> 
>  hostfile:  pbs_hosts
>  node:      node04
> 
> If you are operating in a resource-managed environment, then only
> nodes that are in the allocation can be used in the hostfile. You
> may find relative node syntax to be a useful alternative to
> specifying absolute node names see the orte_hosts man page for
> further information.
> --------------------------------------------------------------------------
> 
> Regards,
> Tetsuya Mishima
> 
>> Hmmm...okay, let's try one more thing. Can you please add the following
>> to your command line:
>> 
>> -mca ras_base_verbose 5 -mca rmaps_base_verbose 5
>> 
>> Appreciate your patience. For some reason, we are losing your head node
>> from the allocation when we start trying to map processes. I'm trying to
>> track down where this is happening so we can figure out why.
>> 
>> 
>> On Mar 20, 2013, at 10:32 PM, tmish...@jcity.maeda.co.jp wrote:
>> 
>>> 
>>> 
>>> Hi Ralph,
>>> 
>>> Here is the result on patched openmpi-1.7rc8.
>>> 
>>> mpirun -v -np 8 -hostfile pbs_hosts -x OMP_NUM_THREADS
>>> --display-allocation /home/mishima/Ducom/testbed/mPre m02-ld
>>> 
>>> ======================   ALLOCATED NODES   ======================
>>> 
>>> Data for node: node06  Num slots: 4    Max slots: 0
>>> Data for node: node05  Num slots: 4    Max slots: 0
>>> 
>>> =================================================================
>>> [node06.cluster:21149] HOSTFILE: CHECKING FILE NODE node06 VS LIST NODE
>>> node05
>>> --------------------------------------------------------------------------
>>> A hostfile was provided that contains at least one node not
>>> present in the allocation:
>>> 
>>> hostfile:  pbs_hosts
>>> node:      node06
>>> 
>>> If you are operating in a resource-managed environment, then only
>>> nodes that are in the allocation can be used in the hostfile. You
>>> may find relative node syntax to be a useful alternative to
>>> specifying absolute node names see the orte_hosts man page for
>>> further information.
>>> --------------------------------------------------------------------------
>>> 
>>> Regards,
>>> Tetsuya
>>> 
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users

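For readers following the verbose output quoted above, here is a minimal Python sketch of the two behaviours the log suggests: one slot is tallied per reported hostname ("not found -- added to list", then "found -- bumped slots to N"), and hostfile entries are then checked against the node list. This is an illustration only, not Open MPI's actual code. The function names are hypothetical, and the contents of pbs_hosts are an assumption, since the thread never shows them.

```python
def tally_slots(hostnames):
    """Count one slot per occurrence of a hostname, keeping first-seen order."""
    slots = {}
    for name in hostnames:
        # First sighting adds the node; later sightings bump its slot count.
        slots[name] = slots.get(name, 0) + 1
    return slots


def hosts_missing_from_allocation(hostfile_nodes, allocation):
    """Return hostfile nodes that do not appear in the allocation."""
    return [node for node in hostfile_nodes if node not in allocation]


# Hostnames in the order the log reports them: node04 four times, then node03.
reported = ["node04"] * 4 + ["node03"] * 4
allocation = tally_slots(reported)
print(allocation)

# Assumed hostfile contents (not shown in the thread).
hostfile_nodes = ["node04", "node03"]

# With the full allocation, nothing is missing.
print(hosts_missing_from_allocation(hostfile_nodes, allocation))

# If the head node were dropped from the list before the check, which is what
# the "CHECKING FILE NODE node04 VS LIST NODE node03" line hints at, node04
# would be reported as missing.
without_head = {k: v for k, v in allocation.items() if k != "node04"}
print(hosts_missing_from_allocation(hostfile_nodes, without_head))
```

The last case matches the error in the quoted output: the node named in the message is the head node, even though the ALLOCATED NODES table lists it with 4 slots.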