FWIW, if this configuration is for all of your users, you might want to set these MCA params in the default MCA param file, in the environment, etc., so that you don't have to specify them on every mpirun command line.

See http://www.open-mpi.org/faq/?category=tuning#setting-mca-params.
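As a rough sketch of the two options mentioned above: the per-user file `$HOME/.openmpi/mca-params.conf` and the `OMPI_MCA_` environment prefix are the standard Open MPI mechanisms, while the parameter names and values below are simply copied from the command line quoted later in this thread and are an assumption for any given build (the names differ between versions, so check with `ompi_info --param btl openib`).

```shell
# Option 1: per-user MCA parameter file, one "name = value" pair per line.
# Parameter names are taken from the command line quoted in this thread;
# they are an assumption for your build, so verify them with ompi_info.
conf_dir="$HOME/.openmpi"
mkdir -p "$conf_dir"
cat >> "$conf_dir/mca-params.conf" <<'EOF'
btl = openib,self
btl_openib_of_pkey_val = 0x8109
btl_openib_of_pkey_ix = 1
EOF

# Option 2: environment variables, using the OMPI_MCA_ prefix.
export OMPI_MCA_btl=openib,self
export OMPI_MCA_btl_openib_of_pkey_val=0x8109
export OMPI_MCA_btl_openib_of_pkey_ix=1
```

For a setting that should apply to every user, the same "name = value" lines would go in the system-wide file under the Open MPI installation prefix (`etc/openmpi-mca-params.conf`) instead of the per-user one.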


On Oct 7, 2008, at 5:43 AM, Lenny Verkhovsky wrote:

Sorry, misunderstood the question,

Thanks to Pasha, the right command line will be

-mca btl openib,self -mca btl_openib_of_pkey_val 0x8109 -mca btl_openib_of_pkey_ix 1

ex.

#mpirun -np 2 -H witch2,witch3 -mca btl openib,self -mca btl_openib_of_pkey_val 0x8001 -mca btl_openib_of_pkey_ix 1 ./mpi_p1_4_TRUNK -t lt
LT (2) (size min max avg) 1 3.443480 3.443480 3.443480


Best regards

Lenny.


On 10/6/08, Jeff Squyres <jsquy...@cisco.com> wrote:

On Oct 5, 2008, at 1:22 PM, Lenny Verkhovsky wrote:

you should probably use -mca btl tcp,self -mca btl_openib_if_include ib0.8109


Really? I thought we only took OpenFabrics device names in the openib_if_include MCA param...? It looks like ib0.8109 is an IPoIB device name.



Lenny.


On 10/3/08, Matt Burgess <burgess.m...@gmail.com> wrote:
Hi,


I'm trying to get openmpi working over openib partitions. On this cluster, the partition number is 0x109. The ib interfaces are pingable over the appropriate ib0.8109 interface:

d2:/opt/openmpi-ib # ifconfig ib0.8109
ib0.8109 Link encap:UNSPEC HWaddr 80-00-00-4A-FE-80-00-00-00-00-00-00-00-00-00-00
         inet addr:10.21.48.2  Bcast:10.21.255.255  Mask:255.255.0.0
         inet6 addr: fe80::202:c902:26:ca01/64 Scope:Link
         UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
         RX packets:16811 errors:0 dropped:0 overruns:0 frame:0
         TX packets:15848 errors:0 dropped:1 overruns:0 carrier:0
         collisions:0 txqueuelen:256
         RX bytes:102229428 (97.4 Mb)  TX bytes:102324172 (97.5 Mb)
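
For readers wondering why partition 0x109 shows up as 0x8109 in the interface name and in the pkey parameters: an InfiniBand P_Key is 16 bits wide, and the top bit (0x8000) marks full membership in the partition. A minimal sketch of that arithmetic, with variable names chosen purely for illustration:

```shell
# The 15-bit partition number from the post, and the full-membership bit.
partition=0x109
full_member_bit=0x8000

# OR the two together to get the 16-bit P_Key value.
pkey=$(printf '0x%x' $(( $partition | $full_member_bit )))
echo "$pkey"   # prints 0x8109
```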


I have tried the following:

/opt/openmpi-ib/1.2.6/bin/mpirun -np 2 -machinefile machinefile -mca btl openib,self -mca btl_openib_max_btls 1 -mca btl_openib_ib_pkey_val 0x8109 -mca btl_openib_ib_pkey_ix 1 /cluster/pallas/x86_64-ib/IMB-MPI1

but I just get a RETRY EXCEEDED ERROR. Is there an MCA parameter I am missing?
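
One thing that may be worth checking with a command line like the one above is whether the pkey index passed to mpirun really holds the expected P_Key on each node. On Linux the per-port P_Key table is exposed under `/sys/class/infiniband/<device>/ports/<port>/pkeys/<index>`. The helper below is a hypothetical sketch, not part of Open MPI: `find_pkey_index` and its optional second argument (an alternate base directory) are names made up for this example.

```shell
# Print every sysfs P_Key table entry whose value equals the requested pkey.
# The last path component of each printed line is the table index.
# find_pkey_index is a hypothetical helper, not an Open MPI or OFED tool.
find_pkey_index() {
    want=$1
    base=${2:-/sys/class/infiniband}
    for f in "$base"/*/ports/*/pkeys/*; do
        # Skip the unexpanded glob when no devices are present.
        [ -r "$f" ] || continue
        val=$(cat "$f")
        # Compare numerically so 0x8109 and 0X8109 are treated alike.
        if [ "$(( $val ))" -eq "$(( $want ))" ]; then
            echo "$f"
        fi
    done
}

# Look up the pkey from the post; prints nothing if no entry matches.
find_pkey_index 0x8109
```

If the value turns up at a different index than the one given on the command line, or not at all on one of the nodes, that mismatch would be a reasonable first thing to rule out.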

I was successful using tcp only:

/opt/openmpi-ib/1.2.6/bin/mpirun -np 2 -machinefile machinefile -mca btl tcp,self -mca btl_openib_max_btls 1 -mca btl_openib_ib_pkey_val 0x8109 /cluster/pallas/x86_64-ib/IMB-MPI1



Thanks,
Matt Burgess

_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


--
Jeff Squyres
Cisco Systems



--
Jeff Squyres
Cisco Systems
