I've submitted a patch to fix the Torque launch issue - just some leftover garbage that existed at the time of the 1.7.0 branch and didn't get removed.

For the hostfile issue, I'm stumped, as I can't see how the problem would come about. Could you please rerun your original test and add "--display-allocation" to your command line? Let's see if it is correctly finding the original allocation.

Thanks
Ralph
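For reference, a minimal sketch of the rerun being requested, reusing the job settings from the earlier report (nodes=4:ppn=8, OMP_NUM_THREADS=4, and ./my_program are all taken from that message):

#PBS -l nodes=4:ppn=8
export OMP_NUM_THREADS=4
mpirun -np 8 --display-allocation ./my_program

--display-allocation only prints the allocation that mpirun detected; it does not change how the job is launched.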
On Mar 19, 2013, at 5:08 PM, tmish...@jcity.maeda.co.jp wrote:

> Hi Gus,
>
> Thank you for your comments. I understand your advice.
> Our script used to be of the --npernode type as well.
>
> As I said before, our cluster consists of nodes with 4, 8, and 32 cores,
> although it was homogeneous when it was first set up. Furthermore, since
> the performance of each core is almost the same, a mixed use of nodes with
> different numbers of cores is possible, such as #PBS -l nodes=1:ppn=32+4:ppn=8.
>
> The --npernode approach is not applicable to such a mixed use.
> That's why I'd like to continue using a modified hostfile.
>
> By the way, the problem I reported to Jeff yesterday was that openmpi-1.7
> with Torque misbehaves: it produced an error even for a case as simple as
> the one shown below, which surprised me. So the problem is not limited to
> modified hostfiles, I guess.
>
> #PBS -l nodes=4:ppn=8
> mpirun -np 8 ./my_program     (OMP_NUM_THREADS=4)
>
> Regards,
> Tetsuya Mishima
>
>> Hi Tetsuya
>>
>> Your script that edits $PBS_NODEFILE into a separate hostfile is very
>> similar to some that I used here for hybrid OpenMP+MPI programs on older
>> versions of OMPI. I haven't tried this in 1.6.X, but it looks like you
>> did and it works there too. I haven't tried 1.7 either.
>> Since we run production machines, I try to stick to the stable versions
>> of OMPI (even numbered: 1.6.X, 1.4.X, 1.2.X).
>>
>> I believe you can get the same effect even if you don't edit your
>> $PBS_NODEFILE and let OMPI use it as is, if you carefully choose the
>> values in your
>> #PBS -l nodes=?:ppn=?
>> and your
>> $OMP_NUM_THREADS
>> and use mpiexec with --npernode or --cpus-per-proc.
>>
>> For instance, for twelve MPI processes, with two threads each,
>> on nodes with eight cores each, I would try (but I haven't tried!):
>>
>> #PBS -l nodes=3:ppn=8
>>
>> export OMP_NUM_THREADS=2
>>
>> mpiexec -np 12 -npernode 4 ./my_program
>>
>> or perhaps, more tightly:
>>
>> mpiexec -np 12 --report-bindings --bind-to-core --cpus-per-proc 2 ./my_program
>>
>> I hope this helps,
>> Gus Correa
>>
>> On 03/19/2013 03:12 PM, tmish...@jcity.maeda.co.jp wrote:
>>>
>>> Hi Reuti and Gus,
>>>
>>> Thank you for your comments.
>>>
>>> Our cluster is a little bit heterogeneous: it has nodes with 4, 8,
>>> and 32 cores.
>>> I used 8-core nodes for "-l nodes=4:ppn=8" and 4-core nodes for
>>> "-l nodes=2:ppn=4".
>>> (Strictly speaking, Torque picked the proper nodes.)
>>>
>>> As I mentioned before, I usually use openmpi-1.6.x, which has no trouble
>>> with that kind of use. I encountered the issue when I was evaluating
>>> openmpi-1.7 to check when we could move to it, although we have no
>>> pressing reason to do so at this moment.
>>>
>>> As Gus pointed out, I use a script like the one below for practical
>>> use of openmpi-1.6.x:
>>>
>>> #PBS -l nodes=2:ppn=32       # even "-l nodes=1:ppn=32+4:ppn=8" works fine
>>> export OMP_NUM_THREADS=4
>>> modify $PBS_NODEFILE pbs_hosts   # 64 lines are condensed to 16 lines here
>>> mpirun -hostfile pbs_hosts -np 16 -cpus-per-proc 4 -report-bindings \
>>>     -x OMP_NUM_THREADS ./my_program
>>> # a 32-core node has 8 numanodes, an 8-core node has 2 numanodes
>>>
>>> It works well under the combination of openmpi-1.6.x and Torque.
>>> The problem is just openmpi-1.7's behavior.
>>>
>>> Regards,
>>> Tetsuya Mishima
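The "modify $PBS_NODEFILE pbs_hosts" step in the script above is the poster's own helper and its contents are not shown. As a rough sketch only, assuming Torque's $PBS_NODEFILE lists each node name once per allocated slot and grouped by node, the 64-to-16-line condensation that matches -cpus-per-proc 4 could be done with something like:

# keep every fourth line, i.e. one hostfile entry per 4 cores of each node
awk 'NR % 4 == 1' $PBS_NODEFILE > pbs_hosts

The actual helper script may well do this differently; this is only meant to illustrate the idea of condensing the node file to one slot per group of cores.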
>>>> Hi Tetsuya Mishima
>>>>
>>>> Mpiexec offers you a number of possibilities that you could try:
>>>> --bynode,
>>>> --pernode,
>>>> --npernode,
>>>> --bysocket,
>>>> --bycore,
>>>> --cpus-per-proc,
>>>> --cpus-per-rank,
>>>> --rankfile
>>>> and more.
>>>>
>>>> Most likely one or more of them will fit your needs.
>>>>
>>>> There are also associated flags to bind processes to cores,
>>>> to sockets, etc., to report the bindings, and so on.
>>>>
>>>> Check the mpiexec man page for details.
>>>>
>>>> Nevertheless, I am surprised that modifying the $PBS_NODEFILE
>>>> doesn't work for you in OMPI 1.7.
>>>> I have done this many times in older versions of OMPI.
>>>>
>>>> Would it work for you to go back to the stable OMPI 1.6.X,
>>>> or does it lack any special feature that you need?
>>>>
>>>> I hope this helps,
>>>> Gus Correa
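One option from Gus's list above that is not illustrated elsewhere in the thread is --rankfile. A hedged sketch of how it might be used for the hybrid case discussed here, with 4 ranks of 4 cores each on two 8-core nodes (the host names node01/node02 and the file name my_rankfile are hypothetical):

cat > my_rankfile << EOF
rank 0=node01 slot=0-3
rank 1=node01 slot=4-7
rank 2=node02 slot=0-3
rank 3=node02 slot=4-7
EOF
export OMP_NUM_THREADS=4
mpiexec -np 4 --rankfile my_rankfile -x OMP_NUM_THREADS ./my_program

Whether this sidesteps the 1.7 problem being reported is a separate question; it is only meant to show the rankfile format.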
>>>>
>>>> On 03/19/2013 03:00 AM, tmish...@jcity.maeda.co.jp wrote:
>>>>>
>>>>> Hi Jeff,
>>>>>
>>>>> I didn't have much time to test this morning, so I checked it again
>>>>> now. The trouble seems to depend on the number of nodes used.
>>>>>
>>>>> This works (nodes < 4):
>>>>> mpiexec -bynode -np 4 ./my_program     (with #PBS -l nodes=2:ppn=8, OMP_NUM_THREADS=4)
>>>>>
>>>>> This causes an error (nodes >= 4):
>>>>> mpiexec -bynode -np 8 ./my_program     (with #PBS -l nodes=4:ppn=8, OMP_NUM_THREADS=4)
>>>>>
>>>>> Regards,
>>>>> Tetsuya Mishima
>>>>>
>>>>>> Oy; that's weird.
>>>>>>
>>>>>> I'm afraid we're going to have to wait for Ralph to answer why that
>>>>>> is happening -- sorry!
>>>>>>
>>>>>> On Mar 18, 2013, at 4:45 PM, <tmish...@jcity.maeda.co.jp> wrote:
>>>>>>>
>>>>>>> Hi Correa and Jeff,
>>>>>>>
>>>>>>> Thank you for your comments. I quickly checked your suggestion.
>>>>>>>
>>>>>>> As a result, my simple example case worked well:
>>>>>>> export OMP_NUM_THREADS=4
>>>>>>> mpiexec -bynode -np 2 ./my_program     (with #PBS -l nodes=2:ppn=4)
>>>>>>>
>>>>>>> But the practical case, where more than one process is allocated
>>>>>>> to a node as below, did not work:
>>>>>>> export OMP_NUM_THREADS=4
>>>>>>> mpiexec -bynode -np 4 ./my_program     (with #PBS -l nodes=2:ppn=8)
>>>>>>>
>>>>>>> The error message is as follows:
>>>>>>> [node08.cluster:11946] [[30666,0],3] ORTE_ERROR_LOG: A message is
>>>>>>> attempting to be sent to a process whose contact information is
>>>>>>> unknown in file rml_oob_send.c at line 316
>>>>>>> [node08.cluster:11946] [[30666,0],3] unable to find address for
>>>>>>> [[30666,0],1]
>>>>>>> [node08.cluster:11946] [[30666,0],3] ORTE_ERROR_LOG: A message is
>>>>>>> attempting to be sent to a process whose contact information is
>>>>>>> unknown in file base/grpcomm_base_rollup.c at line 123
>>>>>>>
>>>>>>> Here is our openmpi configuration:
>>>>>>> ./configure \
>>>>>>>   --prefix=/home/mishima/opt/mpi/openmpi-1.7rc8-pgi12.9 \
>>>>>>>   --with-tm \
>>>>>>>   --with-verbs \
>>>>>>>   --disable-ipv6 \
>>>>>>>   CC=pgcc CFLAGS="-fast -tp k8-64e" \
>>>>>>>   CXX=pgCC CXXFLAGS="-fast -tp k8-64e" \
>>>>>>>   F77=pgfortran FFLAGS="-fast -tp k8-64e" \
>>>>>>>   FC=pgfortran FCFLAGS="-fast -tp k8-64e"
>>>>>>>
>>>>>>> Regards,
>>>>>>> Tetsuya Mishima
>>>>>>>
>>>>>>>> On Mar 17, 2013, at 10:55 PM, Gustavo Correa <g...@ldeo.columbia.edu> wrote:
>>>>>>>>
>>>>>>>>> In your example, have you tried not modifying the node file,
>>>>>>>>> launching two MPI processes with mpiexec, and requesting a
>>>>>>>>> "-bynode" distribution of processes:
>>>>>>>>>
>>>>>>>>> mpiexec -bynode -np 2 ./my_program
>>>>>>>>
>>>>>>>> This should work in 1.7, too (I use these kinds of options with
>>>>>>>> SLURM all the time).
>>>>>>>>
>>>>>>>> However, we should probably also verify that the hostfile
>>>>>>>> functionality in batch jobs hasn't been broken in 1.7, because I'm
>>>>>>>> pretty sure that what you described should work. Ralph, our
>>>>>>>> run-time guy, is on vacation this week, though, so there might be
>>>>>>>> a delay in checking into this.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Jeff Squyres
>>>>>>>> jsquy...@cisco.com
>>>>>>
>>>>>> --
>>>>>> Jeff Squyres
>>>>>> jsquy...@cisco.com