Thanks - yes, the problem was in the launch_support.c code. I'll mark this as
checked and apply it to the v1.7.0 release.
Thanks for the help!
Ralph
On Mar 21, 2013, at 9:06 PM, tmish...@jcity.maeda.co.jp wrote:
>
>
> Hi Ralph,
>
> I tried to patch trunk/orte/mca/plm/base/plm_base_launch_sup
Hi Ralph,
I tried to patch trunk/orte/mca/plm/base/plm_base_launch_support.c.
I didn't touch the debugging part of plm_base_launch_support.c, nor any of
trunk/orte/mca/rmaps/base/rmaps_base_support_fns.c, because
rmaps_base_support_fns.c seems to contain only debugging-related updates.
Then, it works.
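For illustration, applying only one file's portion of a multi-file diff could be done with the patchutils filterdiff tool. This is only a sketch: it assumes GNU patch and patchutils are installed, that the diff was made at the top of the source tree (hence -p0), and the directory and file names (openmpi-trunk, launch_only.diff) are placeholders.
cd openmpi-trunk        # or wherever the paths inside the diff are relative to
filterdiff -i '*plm_base_launch_support.c' user2.diff > launch_only.diff
patch -p0 --dry-run < launch_only.diff   # confirm the hunks would apply cleanly
patch -p0 < launch_only.diff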
Okay, I found it - fix coming in a bit.
Thanks!
Ralph
On Mar 21, 2013, at 4:02 PM, tmish...@jcity.maeda.co.jp wrote:
>
>
> Hi Ralph,
>
> Sorry for late reply. Here is my result.
>
> mpirun -v -np 8 -hostfile pbs_hosts -x OMP_NUM_THREADS --display-allocation
> -mca ras_base_verbose 5 -mca rma
Hi Ralph,
Sorry for the late reply. Here is my result.
mpirun -v -np 8 -hostfile pbs_hosts -x OMP_NUM_THREADS --display-allocation \
  -mca ras_base_verbose 5 -mca rmaps_base_verbose 5 \
  /home/mishima/Ducom/testbed/mPre m02-ld
[node04.cluster:28175] mca:base:select:( ras) Querying component
[loadlevele
Hmmm...okay, let's try one more thing. Can you please add the following to your
command line:
-mca ras_base_verbose 5 -mca rmaps_base_verbose 5
Appreciate your patience. For some reason, we are losing your head node from
the allocation when we start trying to map processes. I'm trying to track
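Since the ras/rmaps verbose output can get long, capturing it to a file makes it easier to post; a sketch reusing the command line shown above in this thread (the log file name is arbitrary):
mpirun -v -np 8 -hostfile pbs_hosts -x OMP_NUM_THREADS --display-allocation \
  -mca ras_base_verbose 5 -mca rmaps_base_verbose 5 \
  /home/mishima/Ducom/testbed/mPre m02-ld 2>&1 | tee ras_rmaps.log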
Hi Ralph,
Here is the result on patched openmpi-1.7rc8.
mpirun -v -np 8 -hostfile pbs_hosts -x OMP_NUM_THREADS
--display-allocation /home/mishima/Ducom/testbed/mPre m02-ld
== ALLOCATED NODES ==
Data for node: node06    Num slots: 4    Max slots: 0
D
Please try it again with the attached patch. The --disable-vt is fine.
Thanks
Ralph
user2.diff
Description: Binary data
On Mar 20, 2013, at 7:47 PM, tmish...@jcity.maeda.co.jp wrote:
>
>
> Hi Ralph,
>
> I have completed the rebuild of openmpi-1.7rc8.
> To save time, I added --disable-vt. ( Is i
Hi Ralph,
Here is the output from openmpi-1.6.4, just for your information.
A small difference is observed. I hope this helps you.
Regards,
Tetsuya Mishima
openmpi-1.6.4:
== ALLOCATED NODES ==
Data for node: node06.cluster    Num slots: 4    Max slots: 0
Hi Ralph,
I have completed the rebuild of openmpi-1.7rc8.
To save time, I added --disable-vt. (Is it OK?)
Well, what shall I do?
./configure \
--prefix=/home/mishima/opt/mpi/openmpi-1.7rc8-pgi12.9 \
--with-tm \
--with-verbs \
--disable-ipv6 \
--disable-vt \
--enable-debug \
CC=pgcc CFLAGS="-fast
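When configure options change, a clean rebuild is the safest route. The following is only a sketch: it repeats the options visible above, leaves out the CFLAGS setting because it is cut off in the quote, and assumes GNU make with a parallel build level chosen to taste.
cd openmpi-1.7rc8
make distclean
./configure --prefix=/home/mishima/opt/mpi/openmpi-1.7rc8-pgi12.9 \
            --with-tm --with-verbs --disable-ipv6 --disable-vt \
            --enable-debug CC=pgcc
make -j 4 && make install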
Hi Ralph,
I have the following line in ~/.openmpi/mca-params.conf to use rsh:
orte_rsh_agent = /usr/bin/rsh
I changed this line to:
plm_rsh_agent = /usr/bin/rsh # for openmpi-1.7
Then, the error message disappeared. Thanks.
Returning to the subject, I can rebuild with --enable-debug.
Just wait unt
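One way to double-check which spelling of the parameter a given installation recognizes is to grep the full ompi_info listing; a sketch, assuming the ompi_info from that installation is first in the PATH:
ompi_info --all | grep rsh_agent
# on 1.7 this should list plm_rsh_agent, with orte_rsh_agent reported as a
# deprecated synonym -- matching the warning quoted below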
Could you please apply the attached patch and try it again? If you haven't had
time to configure with --enable-debug, that is fine - this will output
regardless.
Thanks
Ralph
user.diff
Description: Binary data
On Mar 20, 2013, at 4:59 PM, Ralph Castain wrote:
> You obviously have some MCA
You obviously have some MCA params set somewhere:
> --
> A deprecated MCA parameter value was specified in an MCA parameter
> file. Deprecated MCA parameters should be avoided; they may disappear
> in future releases.
>
> D
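The usual sources of such a setting are the per-user parameter file, the system-wide file under the installation prefix, and OMPI_MCA_* environment variables; a quick way to locate it (OMPI_PREFIX here is only a placeholder for the installation prefix):
grep -H rsh_agent ~/.openmpi/mca-params.conf \
     $OMPI_PREFIX/etc/openmpi-mca-params.conf 2>/dev/null
env | grep '^OMPI_MCA_'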
Hi Ralph,
Here is the result of a rerun with --display-allocation.
I set OMP_NUM_THREADS=1 to make the problem clear.
Regards,
Tetsuya Mishima
P.S. As far as I checked, these 2 cases are OK (no problem).
(1) mpirun -v -np $NPROCS -x OMP_NUM_THREADS --display-allocation
~/Ducom/testbed/mPre m02-ld
(2)
I've submitted a patch to fix the Torque launch issue - just some leftover
garbage that existed at the time of the 1.7.0 branch and didn't get removed.
For the hostfile issue, I'm stumped as I can't see how the problem would come
about. Could you please rerun your original test and add "--displa
Hi Gus,
Thank you for your comments. I understand your advice.
Our script used to be of the --npernode type as well.
As I told you before, our cluster consists of nodes having 4, 8,
and 32 cores, although it used to be homogeneous at the
beginning. Furthermore, since the performance of each core
is almost
Hi Tetsuya
Your script that edits $PBS_NODEFILE into a separate hostfile
is very similar to some that I used here for
hybrid OpenMP+MPI programs on older versions of OMPI.
I haven't tried this in 1.6.X,
but it looks like you did and it works also.
I haven't tried 1.7 either.
Since we run producti
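A typical script of that kind collapses $PBS_NODEFILE (one line per slot) into one line per node with a slot count. The sketch below is only an illustration of the idea, not the actual script discussed here; it writes to pbs_hosts to match the mpirun commands earlier on this page.
#!/bin/sh
# build a hostfile with one "slots=N" entry per node from Torque's
# $PBS_NODEFILE, dividing by OMP_NUM_THREADS for hybrid OpenMP+MPI runs
: ${OMP_NUM_THREADS:=1}
sort "$PBS_NODEFILE" | uniq -c | \
  awk -v t="$OMP_NUM_THREADS" '{ printf "%s slots=%d\n", $2, $1 / t }' > pbs_hosts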
Hi Reuti and Gus,
Thank you for your comments.
Our cluster is a little bit heterogeneous; it has nodes with 4, 8, and 32
cores.
I used 8-core nodes for "-l nodes=4:ppn=8" and 4-core nodes for "-l
nodes=2:ppn=4".
(strictly speaking, Torque picked up the proper nodes.)
As I mentioned before, I usuall
Hi Tetsuya Mishima
Mpiexec offers you a number of possibilities that you could try:
--bynode,
--pernode,
--npernode,
--bysocket,
--bycore,
--cpus-per-proc,
--cpus-per-rank,
--rankfile
and more.
Most likely one or more of them will fit your needs.
There are also associated flags to bind processe
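For a hybrid OpenMP+MPI job, a couple of the options above are often combined; the lines below are only illustrative, and the exact flag spellings and binding behaviour should be checked against the installed version's mpiexec man page.
export OMP_NUM_THREADS=4
# one MPI process per node, OpenMP threads fill the remaining cores:
mpiexec -npernode 1 -x OMP_NUM_THREADS ./my_program
# or reserve 4 cores for every rank explicitly:
mpiexec -np 8 -cpus-per-proc 4 -x OMP_NUM_THREADS ./my_program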
Hi,
On 19.03.2013 at 08:00, tmish...@jcity.maeda.co.jp wrote:
> I didn't have much time to test this morning. So, I checked it again
> now. Then, the trouble seems to depend on the number of nodes to use.
>
> This works(nodes < 4):
> mpiexec -bynode -np 4 ./my_program && #PBS -l nodes=2:ppn
Hi Jeff,
I didn't have much time to test this morning. So, I checked it again
now. Then, the trouble seems to depend on the number of nodes to use.
This works (nodes < 4):
mpiexec -bynode -np 4 ./my_program && #PBS -l nodes=2:ppn=8
(OMP_NUM_THREADS=4)
This causes an error (nodes >= 4):
mpiexec
Oy; that's weird.
I'm afraid we're going to have to wait for Ralph to answer why that is
happening -- sorry!
On Mar 18, 2013, at 4:45 PM, tmish...@jcity.maeda.co.jp wrote:
>
>
> Hi Correa and Jeff,
>
> Thank you for your comments. I quickly checked your suggestion.
>
> As a result, my simple example case worked wel
Hi Correa and Jeff,
Thank you for your comments. I quickly checked your suggestion.
As a result, my simple example case worked well.
export OMP_NUM_THREADS=4
mpiexec -bynode -np 2 ./my_program && #PBS -l nodes=2:ppn=4
But, the practical case where more than 1 process was allocated to a node lik
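The "more than 1 process per node" case is presumably where -npernode comes in; a hedged sketch under the same #PBS -l nodes=2:ppn=4 request, with counts chosen only for illustration:
export OMP_NUM_THREADS=2
# 2 ranks per node x 2 nodes = 4 ranks, each with 2 OpenMP threads (ppn=4)
mpiexec -npernode 2 -np 4 -x OMP_NUM_THREADS ./my_program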
On Mar 17, 2013, at 10:55 PM, Gustavo Correa wrote:
> In your example, have you tried not to modify the node file,
> launch two mpi processes with mpiexec, and request a "-bynode" distribution
> of processes:
>
> mpiexec -bynode -np 2 ./my_program
This should work in 1.7, too (I use these kin
Hi Tetsuya Mishima
In your example, have you tried not to modify the node file,
launch two mpi processes with mpiexec, and request a "-bynode" distribution of
processes:
mpiexec -bynode -np 2 ./my_program
This should launch one MPI process in each of two nodes.
See 'man mpiexec' for details.
W
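Wrapped in a Torque job script, that suggestion looks roughly as follows; a sketch only, with the resource request and thread count taken from the examples in this thread:
#!/bin/sh
#PBS -l nodes=2:ppn=4
cd $PBS_O_WORKDIR
export OMP_NUM_THREADS=4
# one MPI process on each of the two nodes, 4 OpenMP threads per process
mpiexec -bynode -np 2 -x OMP_NUM_THREADS ./my_program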