Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8

2013-03-22 Thread Ralph Castain
Thanks - yes, the problem was in the launch_support.c code. I'll mark this as checked and apply it to the v1.7.0 release. Thanks for the help! Ralph On Mar 21, 2013, at 9:06 PM, tmish...@jcity.maeda.co.jp wrote: > > > Hi Ralph, > > I tried to patch trunk/orte/mca/plm/base/plm_base_launch_sup

Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8

2013-03-22 Thread tmishima
Hi Ralph, I tried to patch trunk/orte/mca/plm/base/plm_base_launch_support.c. I didn't touch the debugging part of plm_base_launch_support.c or any of trunk/orte/mca/rmaps/base/rmaps_base_support_fns.c, because rmaps_base_support_fns.c seems to include only updates for debugging. Then, it works

Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8

2013-03-21 Thread Ralph Castain
Okay, I found it - fix coming in a bit. Thanks! Ralph On Mar 21, 2013, at 4:02 PM, tmish...@jcity.maeda.co.jp wrote: > > > Hi Ralph, > > Sorry for late reply. Here is my result. > > mpirun -v -np 8 -hostfile pbs_hosts -x OMP_NUM_THREADS --display-allocation > -mca ras_base_verbose 5 -mca rma

Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8

2013-03-21 Thread tmishima
Hi Ralph, Sorry for the late reply. Here is my result. mpirun -v -np 8 -hostfile pbs_hosts -x OMP_NUM_THREADS --display-allocation -mca ras_base_verbose 5 -mca rmaps_base_verbose 5 /home/mishima/Ducom/testbed/mPre m02-ld [node04.cluster:28175] mca:base:select:( ras) Querying component [loadlevele

Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8

2013-03-21 Thread Ralph Castain
Hmmm...okay, let's try one more thing. Can you please add the following to your command line: -mca ras_base_verbose 5 -mca rmaps_base_verbose 5 Appreciate your patience. For some reason, we are losing your head node from the allocation when we start trying to map processes. I'm trying to track

Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8

2013-03-21 Thread tmishima
Hi Ralph, Here is the result on patched openmpi-1.7rc8. mpirun -v -np 8 -hostfile pbs_hosts -x OMP_NUM_THREADS --display-allocation /home/mishima/Ducom/testbed/mPre m02-ld == ALLOCATED NODES == Data for node: node06 Num slots: 4 Max slots: 0 D

Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8

2013-03-21 Thread Ralph Castain
Please try it again with the attached patch. The --disable-vt is fine. Thanks Ralph user2.diff Description: Binary data On Mar 20, 2013, at 7:47 PM, tmish...@jcity.maeda.co.jp wrote: > > > Hi Ralph, > > I have completed rebuild of openmpi1.7rc8. > To save time, I added --disable-vt. ( Is i

Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8

2013-03-20 Thread tmishima
Hi Ralph, Here is the output on openmpi-1.6.4, just for your information. A small difference is observed. I hope this helps you. Regards, Tetsuya Mishima openmpi-1.6.4: == ALLOCATED NODES == Data for node: node06.cluster Num slots: 4 Max slots: 0

Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8

2013-03-20 Thread tmishima
Hi Ralph, I have completed the rebuild of openmpi1.7rc8. To save time, I added --disable-vt. (Is it OK?) Well, what shall I do? ./configure \ --prefix=/home/mishima/opt/mpi/openmpi-1.7rc8-pgi12.9 \ --with-tm \ --with-verbs \ --disable-ipv6 \ --disable-vt \ --enable-debug \ CC=pgcc CFLAGS="-fast

Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8

2013-03-20 Thread tmishima
Hi Ralph, I have the line below in ~/.openmpi/mca-params.conf to use rsh. orte_rsh_agent = /usr/bin/rsh I changed this line to: plm_rsh_agent = /usr/bin/rsh # for openmpi-1.7 Then, the error message disappeared. Thanks. Returning to the subject, I can rebuild with --enable-debug. Just wait unt
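The parameter rename described in the message above (orte_rsh_agent to plm_rsh_agent) could be previewed with a small script like the following. This is only a sketch: the function name rename_rsh_agent and the use of sed are assumptions for illustration and do not appear in the thread. It prints the rewritten file to standard output and leaves the original untouched.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print an MCA parameter file with
# the deprecated orte_rsh_agent key renamed to plm_rsh_agent.
rename_rsh_agent() {
    # $1: path to an mca-params.conf style file; the file itself is not modified
    sed 's/^\([[:space:]]*\)orte_rsh_agent\([[:space:]]*=\)/\1plm_rsh_agent\2/' "$1"
}

# Preview the result for the per-user parameter file, if it exists.
conf="${HOME}/.openmpi/mca-params.conf"
if [ -f "$conf" ]; then
    rename_rsh_agent "$conf"
fi
```

After checking the printed result, the output can be redirected to a new file and moved into place by hand.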

Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8

2013-03-20 Thread Ralph Castain
Could you please apply the attached patch and try it again? If you haven't had time to configure with --enable-debug, that is fine - this will output regardless. Thanks Ralph user.diff Description: Binary data On Mar 20, 2013, at 4:59 PM, Ralph Castain wrote: > You obviously have some MCA

Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8

2013-03-20 Thread Ralph Castain
You obviously have some MCA params set somewhere: > -- > A deprecated MCA parameter value was specified in an MCA parameter > file. Deprecated MCA parameters should be avoided; they may disappear > in future releases. > > D

Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8

2013-03-20 Thread tmishima
Hi Ralph, Here is the result of a rerun with --display-allocation. I set OMP_NUM_THREADS=1 to make the problem clear. Regards, Tetsuya Mishima P.S. As far as I checked, these 2 cases are OK (no problem). (1) mpirun -v -np $NPROCS -x OMP_NUM_THREADS --display-allocation ~/Ducom/testbed/mPre m02-ld (2)

Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8

2013-03-20 Thread Ralph Castain
I've submitted a patch to fix the Torque launch issue - just some leftover garbage that existed at the time of the 1.7.0 branch and didn't get removed. For the hostfile issue, I'm stumped as I can't see how the problem would come about. Could you please rerun your original test and add "--displa

Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8

2013-03-19 Thread tmishima
Hi Gus, Thank you for your comments. I understand your advice. Our script used to be of the --npernode type as well. As I said before, our cluster consists of nodes having 4, 8, and 32 cores, although it was homogeneous at the start. Furthermore, since the performance of each core is almost

Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8

2013-03-19 Thread Gus Correa
Hi Tetsuya Your script that edits $PBS_NODEFILE into a separate hostfile is very similar to some that I used here for hybrid OpenMP+MPI programs on older versions of OMPI. I haven't tried this in 1.6.X, but it looks like you did and it works also. I haven't tried 1.7 either. Since we run producti
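For readers who have not seen this kind of script, here is a minimal sketch of how $PBS_NODEFILE might be thinned into a separate hostfile for a hybrid OpenMP+MPI run. It is not the script from the thread. It assumes that the node file lists each node once per allocated core, with a node's lines adjacent, and that one MPI process is wanted per OMP_NUM_THREADS cores; the function name thin_nodefile is hypothetical, while the output file name pbs_hosts is taken from the mpirun commands quoted elsewhere in the thread.

```shell
#!/bin/sh
# Hypothetical helper (not the script from the thread): keep one line out of
# every n lines of a node file, so each kept line stands for one MPI process
# that will run n OpenMP threads.
thin_nodefile() {
    # $1: node file with one line per allocated core, $2: threads per process
    awk -v n="$2" '(NR - 1) % n == 0' "$1"
}

# Inside a Torque job, write the thinned list to pbs_hosts.
if [ -n "${PBS_NODEFILE:-}" ]; then
    thin_nodefile "$PBS_NODEFILE" "${OMP_NUM_THREADS:-1}" > pbs_hosts
fi
```

The resulting file would then be passed to mpirun with -hostfile pbs_hosts, as in the commands shown in the thread. With OMP_NUM_THREADS=1 the node file is copied unchanged.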

Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8

2013-03-19 Thread tmishima
Hi Reuti and Gus, Thank you for your comments. Our cluster is a little bit heterogeneous, with nodes of 4, 8, and 32 cores. I used 8-core nodes for "-l nodes=4:ppn=8" and 4-core nodes for "-l nodes=2:ppn=4". (Strictly speaking, Torque picked the proper nodes.) As I mentioned before, I usuall

Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8

2013-03-19 Thread Gus Correa
Hi Tetsuya Mishima Mpiexec offers you a number of possibilities that you could try: --bynode, --pernode, --npernode, --bysocket, --bycore, --cpus-per-proc, --cpus-per-rank, --rankfile and more. Most likely one or more of them will fit your needs. There are also associated flags to bind processe

Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8

2013-03-19 Thread Reuti
Hi, Am 19.03.2013 um 08:00 schrieb tmish...@jcity.maeda.co.jp: > I didn't have much time to test this morning. So, I checked it again > now. Then, the trouble seems to depend on the number of nodes to use. > > This works(nodes < 4): > mpiexec -bynode -np 4 ./my_program && #PBS -l nodes=2:ppn

Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8

2013-03-19 Thread tmishima
Hi Jeff, I didn't have much time to test this morning, so I checked it again now. The trouble seems to depend on the number of nodes used. This works (nodes < 4): mpiexec -bynode -np 4 ./my_program && #PBS -l nodes=2:ppn=8 (OMP_NUM_THREADS=4) This causes an error (nodes >= 4): mpiexec

Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8

2013-03-18 Thread Jeff Squyres (jsquyres)
Oy; that's weird. I'm afraid we're going to have to wait for Ralph to answer why that is happening -- sorry! On Mar 18, 2013, at 4:45 PM, wrote: > > > Hi Correa and Jeff, > > Thank you for your comments. I quickly checked your suggestion. > > As a result, my simple example case worked wel

Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8

2013-03-18 Thread tmishima
Hi Correa and Jeff, Thank you for your comments. I quickly checked your suggestion. As a result, my simple example case worked well. export OMP_NUM_THREADS=4 mpiexec -bynode -np 2 ./my_program && #PBS -l nodes=2:ppn=4 But a practical case where more than 1 process is allocated to a node lik

Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8

2013-03-18 Thread Jeff Squyres (jsquyres)
On Mar 17, 2013, at 10:55 PM, Gustavo Correa wrote: > In your example, have you tried not to modify the node file, > launch two mpi processes with mpiexec, and request a "-bynode" distribution > of processes: > > mpiexec -bynode -np 2 ./my_program This should work in 1.7, too (I use these kin

Re: [OMPI users] modified hostfile does not work with openmpi1.7rc8

2013-03-17 Thread Gustavo Correa
Hi Tetsuya Mishima In your example, have you tried not to modify the node file, launch two mpi processes with mpiexec, and request a "-bynode" distribution of processes: mpiexec -bynode -np 2 ./my_program This should launch one MPI process in each of two nodes. See 'man mpiexec' for details. W