Re: [OMPI users] specifying hosts in mpi_spawn()

2008-06-02 Thread Ralph H Castain
Appreciate the clarification. Unfortunately, the answer is "no" for any of
our current releases. We only use the "host" info argument to tell us which
nodes to use; the info has no bearing on the eventual mapping of ranks to
nodes. Repeated entries are simply ignored.
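
For reference, here is a minimal C sketch of the usage under discussion,
i.e. setting the "host" info key and passing it to MPI_Comm_spawn(). It is
not from the original thread; the worker executable name and process count
are placeholders.

    /* Minimal sketch: pass a "host" list to MPI_Comm_spawn via an info key.
     * "./worker" and the count of 6 are illustrative placeholders. */
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Comm intercomm;
        MPI_Info info;
        int errcodes[6];

        MPI_Init(&argc, &argv);

        MPI_Info_create(&info);
        /* As noted above, current Open MPI releases use this list only to
         * select which nodes to run on; repeated entries do not control
         * which rank lands on which host. */
        MPI_Info_set(info, "host", "host1,host2,host3,host2,host2,host1");

        MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 6, info, 0,
                       MPI_COMM_SELF, &intercomm, errcodes);

        MPI_Info_free(&info);
        MPI_Finalize();
        return 0;
    }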

I was mainly asking for the version to check if you were working with our
svn trunk. The upcoming 1.3 release does support mapping such as you
describe. However, it currently only supports it for entries in a hostfile,
not as specified via -host or in the host info argument.

Historically, we have maintained a direct correspondence between hostfile
and -host operations; i.e., whatever you can do with a hostfile could also
be done via -host. I'll have to discuss with the developers whether or not
to extend this to sequential mapping of ranks.

The short answer, therefore, is that we don't support what you are
requesting at this time, and may not support it in 1.3 (though you could
perhaps get around that by putting the ordering in a hostfile).

Ralph
 


On 5/30/08 11:32 AM, "Bruno Coutinho"  wrote:

> I'm using open mpi 1.2.6 from the open mpi site, but I can switch to another
> version if necessary.
> 
> 
> 2008/5/30 Ralph H Castain :
>> I'm afraid I cannot answer that question without first knowing what version
>> of Open MPI you are using. Could you provide that info?
>> 
>> Thanks
>> Ralph
>> 
>> 
>> 
>> On 5/29/08 6:41 PM, "Bruno Coutinho"  wrote:
>> 
>>> > How does MPI handle the host string passed in the info argument to
>>> > MPI_Comm_spawn()?
>>> >
>>> > If I set host to:
>>> > "host1,host2,host3,host2,host2,host1"
>>> >
>>> > will ranks 0 and 5 run on host1, ranks 1, 3, and 4 on host2, and rank 2
>>> > on host3?
>> 
> 




[OMPI users] HPMPI versus OpenMPI performance

2008-06-02 Thread Ayer, Timothy C.


> We are performing a comparison of HPMPI versus OpenMPI using InfiniBand and
> seeing a performance hit in the vicinity of 60% (OpenMPI is slower) on
> controlled benchmarks.  Since everything else is similar, we suspect a
> problem with the way we are using or have installed OpenMPI. 
> 
> Please find attached the following info as requested from
> http://www.open-mpi.org/community/help/
>  
> 
> Application:  an in-house CFD solver using both point-to-point and collective
> operations. Also, for historical reasons it makes extensive use of BSEND.
> We recognize that BSENDs can be inefficient, but it is not practical to
> change them at this time.  We are trying to understand why the performance
> is so significantly different from HPMPI.  The application is mixed
> Fortran 90 and C built with Portland Group compilers.
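
As an aside for readers, the buffered-send pattern referred to above looks
roughly like the following minimal C sketch (names and sizes are
illustrative, not taken from the solver). MPI_Bsend copies every message
into a user-attached buffer before it is sent, which is the extra copy that
can make it slower than MPI_Send or MPI_Isend.

    /* Minimal sketch of the MPI_Bsend pattern: attach a buffer, then do
     * buffered sends.  Sizes and tags here are placeholders. */
    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int rank, n = 1000;
        double payload[1000];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        for (int i = 0; i < n; i++) payload[i] = 0.0;

        /* Attach a buffer large enough for the outstanding bsends. */
        int bufsize = n * sizeof(double) + MPI_BSEND_OVERHEAD;
        void *buf = malloc(bufsize);
        MPI_Buffer_attach(buf, bufsize);

        if (rank == 0)
            MPI_Bsend(payload, n, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
        else if (rank == 1)
            MPI_Recv(payload, n, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);

        /* Detach blocks until buffered messages have been delivered. */
        MPI_Buffer_detach(&buf, &bufsize);
        free(buf);
        MPI_Finalize();
        return 0;
    }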
> 
> HPMPI Version info:
> 
> mpirun: HP MPI 02.02.05.00 Linux x86-64
> major version 202 minor version 5
> 
> OpenMPI Version info:
> 
> mpirun (Open MPI) 1.2.4
> Report bugs to http://www.open-mpi.org/community/help/
>  
> 
> 
> 
> Configuration info :
> 
> The benchmark was a 4-process job run on a single dual-socket, dual-core
> HP DL140G3 (3.0 GHz Woodcrest) with 4 GB of memory.  Each rank requires
> approximately 250 MB of memory.
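
(For context, a launch line of this general form would be typical for such a
run with Open MPI 1.2 over InfiniBand; the executable name is a placeholder,
since the actual command line was not included in the report:

    mpirun -np 4 --mca btl openib,sm,self ./cfd_solver

Explicitly listing the BTL components rules out a silent fall-back to TCP,
which by itself could explain a large slowdown.)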
> 
> 1) Output from ompi_info --all 
> 
> See attached file ompi_info_output.txt
>  << File: ompi_info_output.txt >> 
> 
> Below is the output requested in the FAQ section:
> 
> In order for us to help you, it is most helpful if you can run a few steps
> before sending an e-mail to both perform some basic troubleshooting and
> provide us with enough information about your environment to help you.
> Please include answers to the following questions in your e-mail: 
> 
> 
> 1.Which OpenFabrics version are you running? Please specify where you
> got the software from (e.g., from the OpenFabrics community web site, from
> a vendor, or it was already included in your Linux distribution).
> 
> We obtained the software from  www.openfabrics.org   
> 
> Output from ofed_info command:
> 
> OFED-1.1
> 
> openib-1.1 (REV=9905)
> # User space
> https://openib.org/svn/gen2/branches/1.1/src/userspace
>  
> Git:
> ref: refs/heads/ofed_1_1
> commit a083ec1174cb4b5a5052ef5de9a8175df82e864a
> 
> # MPI
> mpi_osu-0.9.7-mlx2.2.0.tgz
> openmpi-1.1.1-1.src.rpm
> mpitests-2.0-0.src.rpm
> 
> 
> 
> 2.What distro and version of Linux are you running? What is your
> kernel version?
> 
> Linux  2.6.9-64.EL.IT133935.jbtest.1smp #1 SMP Fri Oct 19 11:28:12
> EDT 2007 x86_64 x86_64 x86_64 GNU/Linux
> 
> 
> 3.Which subnet manager are you running? (e.g., OpenSM, a
> vendor-specific subnet manager, etc.)
> 
> We believe this to be HP or Voltaire but we are not certain how to
> determine this.
> 
> 
> 4.What is the output of the ibv_devinfo command on a known "good" node
> and a known "bad" node? (NOTE: there must be at least one port listed as
> "PORT_ACTIVE" for Open MPI to work. If there is not at least one
> PORT_ACTIVE port, something is wrong with your OpenFabrics environment and
> Open MPI will not be able to run).
> 
> hca_id: mthca0
> fw_ver: 1.2.0
> node_guid:  001a:4bff:ff0b:5f9c
> sys_image_guid: 001a:4bff:ff0b:5f9f
> vendor_id:  0x08f1
> vendor_part_id: 25204
> hw_ver: 0xA0
> board_id:   VLT0030010001
> phys_port_cnt:  1
> port:   1
> state:  PORT_ACTIVE (4)
> max_mtu:2048 (4)
> active_mtu: 2048 (4)
> sm_lid: 1
> port_lid:   161
> port_lmc:   0x00
> 
> 
> 5.What is the output of the ifconfig command on a known "good" node
> and a known "bad" node? (mainly relevant for IPoIB installations) Note
> that some Linux distributions do not put ifconfig in the default path for
> normal users; look for it in /sbin/ifconfig or /usr/sbin/ifconfig.
> 
> eth0  Link encap:Ethernet  HWaddr 00:XX:XX:XX:XX:XX
>   inet addr:X.Y.Z.Q  Bcast:X.Y.Z.255  Mask:255.255.255.0
>   inet6 addr: X::X:X:X:X/64 Scope:Link
>   UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>   RX packets:1021733054 errors:0 dropped:10717 overruns:0 frame:0
>   TX packets:1047320834 errors:0 dropped:0 overruns:0 carrier:0
>   collisions:0 txqueuelen:1000
>   RX bytes:1035986839096 (964.8 GiB)  TX bytes:1068055599116
> (994.7 GiB)
>   Interrupt:169
> 
> ib0   Link encap:UNSPEC  HWaddr
> 80-00-04-04-FE-80-00-00-00-00-00-00-00-00-00-00
>   inet