Re: [OMPI users] myrinet mx and openmpi using solaris, sun compilers

2006-11-21 Thread Lydia Heck

Thank you very much.

I tried

mpirun -np 6 -machinefile ./myh -mca pml cm ./b_eff

and to amuse you

 mpirun -np 6 -machinefile ./myh -mca btl mx,sm,self ./b_eff

with myh containing two host names

and both commands went swimmingly.

To make absolutely sure, I checked the usage of the Myrinet ports, and on
each system three Myrinet ports were open.
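
For reference, the per-node check can be done with Myricom's endpoint tool, as
shown earlier in this thread (the path assumes the default MX install location):

/opt/mx/bin/mx_endpoint_info

which lists the open endpoints per board.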

Lydia

On Mon, 20 Nov 2006 users-requ...@open-mpi.org wrote:
>
> --
>
> Message: 2
> Date: Mon, 20 Nov 2006 20:05:22 + (GMT)
> From: Lydia Heck 
> Subject: [OMPI users] myrinet mx and openmpi using solaris, sun
>   compilers
> To: us...@open-mpi.org
> Message-ID:
>   
> Content-Type: TEXT/PLAIN; charset=US-ASCII
>
>
> I have built the Myrinet drivers with gcc and with the Studio 11 compilers from Sun.
> The following problem appears for both installations.
>
> I have tested the Myrinet installations using Myricom's own test programs.
>
> Then I built Open MPI using the Studio 11 compilers, enabling Myrinet.
>
> All the library paths are correctly set, and I can successfully run my test
> program (written in C) if I choose the number of CPUs to equal the number of
> nodes, which means one process instance per node!
>
> Each node has 4 CPUs.
>
> If I now request more CPUs for the run than there are nodes, I get an error
> message which clearly indicates that Open MPI cannot communicate over more
> than one channel on the Myrinet card. However, I should be able to communicate
> over at least 4 channels - colleagues of mine are doing that using mpich and
> the same type of Myrinet card.
>
> Any idea why this should happen?
>
> The hostfile looks like:
>
> m2009 slots=4
> m2010 slots=4
>
>
> but it produces the same error if the hosts file is
>
> m2009
> m2010
>
> 2001(128) > ompi_info | grep mx
>  MCA btl: mx (MCA v1.0, API v1.0.1, Component v1.2)
>  MCA mtl: mx (MCA v1.0, API v1.0, Component v1.2)
> m2009(160) > /opt/mx/bin/mx_endpoint_info
> 1 Myrinet board installed.
> The MX driver is configured to support up to 4 endpoints on 4 boards.
> ===
> Board #0:
> Endpoint PID Command Info
>15039
> 0   15544
> There are currently 1 regular endpoint open
>
>
>
>
> m2001(120) > mpirun -np 6 -hostfile hostsfile -mca btl mx,self  b_eff
> --
> Process 0.1.0 is unable to reach 0.1.0 for MPI communication.
> If you specified the use of a BTL component, you may have
> forgotten a component (such as "self") in the list of
> usable components.
> --
> --
> Process 0.1.2 is unable to reach 0.1.0 for MPI communication.
> If you specified the use of a BTL component, you may have
> forgotten a component (such as "self") in the list of
> usable components.
> --
> --
> Process 0.1.4 is unable to reach 0.1.4 for MPI communication.
> If you specified the use of a BTL component, you may have
> forgotten a component (such as "self") in the list of
> usable components.
> --
> --
> Process 0.1.1 is unable to reach 0.1.0 for MPI communication.
> If you specified the use of a BTL component, you may have
> forgotten a component (such as "self") in the list of
> usable components.
> --
> --
> Process 0.1.5 is unable to reach 0.1.4 for MPI communication.
> If you specified the use of a BTL component, you may have
> forgotten a component (such as "self") in the list of
> usable components.
> --
> --
> Process 0.1.3 is unable to reach 0.1.0 for MPI communication.
> If you specified the use of a BTL component, you may have
> forgotten a component (such as "self") in the list of
> usable components.
> --
> --
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems.  This failure appears to be an internal failure; here's some
> additional information (whi

Re: [OMPI users] myrinet mx and openmpi using solaris, sun compilers

2006-11-21 Thread Galen M. Shipman

Lydia Heck wrote:


Thank you very much.

I tried

mpirun -np 6 -machinefile ./myh -mca pml cm ./b_eff

 


What was the performance (latency and bandwidth)?


and to amuse you

mpirun -np 6 -machinefile ./myh -mca btl mx,sm,self ./b_eff
 


Same question here as well.

Thanks,
Galen


with myh containing two host names

and both commands went swimmingly.

To make absolutely sure, I checked the usage of the Myrinet ports, and on
each system three Myrinet ports were open.

Lydia

 







Re: [OMPI users] removing hard-coded paths from OpenMPI shared libraries

2006-11-21 Thread Adam Young
Not knowing the Open MPI build system, I am a little reluctant to say.  But in
most projects there are usually multiple paths that can be set at configure
time.  In most Autoconf-based projects the main one is called the prefix.  There
are other ones that can be set for headers, etc.
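
As an illustrative sketch only (the install path here is made up), the prefix is
normally chosen when configuring the source tree:

./configure --prefix=/opt/openmpi-1.1.2
make
make install

Whatever prefix is chosen there is what ends up referenced by the installed
libraries, which seems to be the behavior being discussed in this thread.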

 -Original Message-
From:   Patrick Jessee [mailto:p...@fluent.com]
Sent:   Mon Nov 20 11:56:48 2006
To: us...@open-mpi.org
Subject: [OMPI users] removing hard-coded paths from OpenMPI shared libraries


Hello.  I'm wondering if anyone knows of a way to get Open MPI to compile
shared libraries without hard-coding the installation directory in
them.  After compiling and installing Open MPI, the shared libraries have
the installation paths hard-coded in them.  For instance:

$ ldd libmpi.so
liborte.so.0 => /usr/local/fluent/develop/multiport4.4/packages/lnamd64/openmpi/openmpi-1.1.2/lib/liborte.so.0 (0x002a956ea000)
libnsl.so.1 => /lib64/libnsl.so.1 (0x002a95852000)
libutil.so.1 => /lib64/libutil.so.1 (0x002a95968000)
libm.so.6 => /lib64/tls/libm.so.6 (0x002a95a6c000)
libpthread.so.0 => /lib64/tls/libpthread.so.0 (0x002a95bc4000)
libc.so.6 => /lib64/tls/libc.so.6 (0x002a95cd8000)
libopal.so.0 => /usr/local/fluent/develop/multiport4.4/packages/lnamd64/openmpi/openmpi-1.1.2/lib/libopal.so.0 (0x002a95f0)
/lib64/ld-linux-x86-64.so.2 (0x00552000)
libdl.so.2 => /lib64/libdl.so.2 (0x002a9605a000)


In the above, "/usr/local/fluent/develop/multiport4.4/packages/lnamd64/openmpi/openmpi-1.1.2/lib"
is hardcoded into libmpi.so using --rpath when libmpi.so is compiled.

This is problematic because the installation cannot be moved after it is
installed.  It is often useful to compile/install libraries on one
machine and then move the libraries to a different location on other
machines (of course, LD_LIBRARY_PATH or some other means then needs to be used
to pick up the libs at runtime).  This relocation is also useful when
redistributing the MPI installation with an application.  The hard-coded
paths prohibit this.

I've tried to modify the "--rpath" argument in libtool and 
opal/libltdl/libtool, but have not gotten this to work.
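
For reference, whether an rpath actually ended up embedded in a given library
can be checked with standard binutils (this is only a sanity check, not a fix):

readelf -d libmpi.so | grep -i rpath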

Has anyone else had experience with this?  (I'm building OpenMPI 1.1.2 
on linux x86_64.)  Thanks in advance for any potential help.

Regards,

-Patrick




[OMPI users] myrinet problems on OSX

2006-11-21 Thread Brock Palen
I had sent a message two weeks ago about this problem and talked with
Jeff at SC06 about how it might not be an OMPI problem.  But working with
Myricom, it now appears that it is a problem in both
lam-7.1.2 and openmpi-1.1.2/1.1.1.  Basically, the results from an HPL
run are wrong, and it also causes a large number of packets to be dropped
by the fabric.


This problem does not happen when using mpichgm.  The number of
dropped packets does not go up.  There is a ticket open with Myricom
on this.  They are a member of the group working on OMPI, but I sent
this out just to bring the list up to date.


If you have any questions feel free to ask me.  The details are in  
the archive.


Brock Palen
Center for Advanced Computing
bro...@umich.edu
(734)936-1985




Re: [OMPI users] myrinet problems on OSX

2006-11-21 Thread Scott Atchley

On Nov 21, 2006, at 1:27 PM, Brock Palen wrote:


I had sent a message two weeks ago about this problem and talked with
Jeff at SC06 about how it might not be an OMPI problem.  But working with
Myricom, it now appears that it is a problem in both
lam-7.1.2 and openmpi-1.1.2/1.1.1.  Basically, the results from an HPL
run are wrong, and it also causes a large number of packets to be dropped
by the fabric.

This problem does not happen when using mpichgm.  The number of
dropped packets does not go up.  There is a ticket open with Myricom
on this.  They are a member of the group working on OMPI, but I sent
this out just to bring the list up to date.

If you have any questions feel free to ask me.  The details are in
the archive.

Brock Palen


Hi all,

I am looking into this at Myricom.

So far, I have compiled OMPI version 1.2b1 using the --with-gm=/path/to/gm
flag. I have compiled HPCC (contains HPL) using OMPI's mpicc.
Trying to run hpcc fails with "Myrinet/GM on host fog33 was unable to
find any NICs". See mpirun output below.
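
For reference, the configure invocation was of roughly this form (the prefix
shown here is illustrative, not the actual path):

./configure --prefix=/opt/openmpi-1.2b1 --with-gm=/path/to/gm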


I run gm_board_info and it finds two NICs.

I run ompi_info and it has the gm btl (see ompi_info below).

I have tried using the --prefix flag to mpirun as well as setting  
PATH and LD_LIBRARY_PATH.


What am I missing?

Scott


% ompi_info -param btl gm
   MCA btl: parameter "btl_base_debug" (current value: "0")
            If btl_base_debug is 1 standard debug is output, if > 1 verbose debug is output
   MCA btl: parameter "btl" (current value: )
            Default selection set of components for the btl framework ( means "use all components that can be found")
   MCA btl: parameter "btl_base_verbose" (current value: "0")
            Verbosity level for the btl framework (0 = no verbosity)
   MCA btl: parameter "btl_gm_free_list_num" (current value: "8")
   MCA btl: parameter "btl_gm_free_list_max" (current value: "-1")
   MCA btl: parameter "btl_gm_free_list_inc" (current value: "8")
   MCA btl: parameter "btl_gm_debug" (current value: "0")
   MCA btl: parameter "btl_gm_mpool" (current value: "gm")
   MCA btl: parameter "btl_gm_max_ports" (current value: "16")
   MCA btl: parameter "btl_gm_max_boards" (current value: "4")
   MCA btl: parameter "btl_gm_max_modules" (current value: "4")
   MCA btl: parameter "btl_gm_num_high_priority" (current value: "8")
   MCA btl: parameter "btl_gm_num_repost" (current value: "4")
   MCA btl: parameter "btl_gm_port_name" (current value: "OMPI")
   MCA btl: parameter "btl_gm_exclusivity" (current value: "1024")
   MCA btl: parameter "btl_gm_eager_limit" (current value: "32768")
   MCA btl: parameter "btl_gm_min_send_size" (current value: "32768")
   MCA btl: parameter "btl_gm_max_send_size" (current value: "65536")
   MCA btl: parameter "btl_gm_min_rdma_size" (current value: "524288")
   MCA btl: parameter "btl_gm_max_rdma_size" (current value: "131072")
   MCA btl: parameter "btl_gm_flags" (current value: "50")
   MCA btl: parameter "btl_gm_bandwidth" (current value: "250")
   MCA btl: parameter "btl_gm_priority" (current value: "0")
   MCA btl: parameter "btl_base_warn_component_unused" (current value: "1")
            This parameter is used to turn on warning messages when certain NICs are not used





% mpirun --prefix $OMPI -np 4 --host fog33,fog33,fog34,fog34 -mca btl self,sm,gm ./hpcc
--
[0,1,1]: Myrinet/GM on host fog33 was unable to find any NICs.
Another transport will be used instead, although this may result in
lower performance.
--
--
[0,1,0]: Myrinet/GM on host fog33 was unable to find any NICs.
Another transport will be used instead, although this may result in
lower performance.
--
--
Process 0.1.3 is unable to reach 0.1.0 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--
--
Process 0.1.1 is unable to reach 0.1.2 for MPI communication.
If you specified the use of a BTL component, you may have
forg

[OMPI users] Hostfile parse error

2006-11-21 Thread Greg Wolffe

Hello,

   Experienced LAM/MPI user trying Open MPI for the first time.  For
some reason, my hostfile is not being recognized.  It's such a simple
problem (see below) that I couldn't find anything in the FAQ or archives.


   Thanks in advance for any help.

gw


// simple program compiles just fine
[eos02:~/openmpi]$ mpicc -Wall hello.c -o hello

// program runs just fine using command line to specify hosts
[eos02:~/openmpi]$ mpirun -np 3 --hosts eos01,eos02,eos03 hello
Nodes: 3
Hello from Master (process 0)!
Hello from process 1!
Hello from process 2!

// same hosts listed in hostfile
[eos02:~/openmpi]$ cat my-hostfile
# Hostfile for OpenMPI environment
eos01
eos02
eos03

// hostfile cannot be read?
[eos02:~/openmpi]$ mpirun -np 3 --hostfile my-hostfile hello
--
Open RTE detected a parse error in the hostfile:
   my-hostfile
It occured on line number 2 on token 1.
--
[eos02.cis.gvsu.edu:02480] [0,0,0] ORTE_ERROR_LOG: Error in file 
rmgr_urm.c at line 407

[eos02.cis.gvsu.edu:02480] mpirun: spawn failed with errno=-1
[eos02:~/openmpi]$


Re: [OMPI users] myrinet problems on OSX

2006-11-21 Thread Brock Palen

On Nov 21, 2006, at 2:28 PM, Scott Atchley wrote:


On Nov 21, 2006, at 1:27 PM, Brock Palen wrote:


I had sent a message two weeks ago about this problem and talked with
Jeff at SC06 about how it might not be an OMPI problem.  But working with
Myricom, it now appears that it is a problem in both
lam-7.1.2 and openmpi-1.1.2/1.1.1.  Basically, the results from an HPL
run are wrong, and it also causes a large number of packets to be dropped
by the fabric.

This problem does not happen when using mpichgm.  The number of
dropped packets does not go up.  There is a ticket open with Myricom
on this.  They are a member of the group working on OMPI, but I sent
this out just to bring the list up to date.

If you have any questions feel free to ask me.  The details are in
the archive.

Brock Palen


Hi all,

I am looking into this at Myricom.

So far, I have compiled OMPI version 1.2b1 using the --with-gm=/path/to/gm
flag. I have compiled HPCC (contains HPL) using OMPI's mpicc.
Trying to run hpcc fails with "Myrinet/GM on host fog33 was unable to
find any NICs". See mpirun output below.


I do not have the same problem with 1.2b1.  But I still get errors
running HPL.




I run gm_board_info and it finds two NICs.


Yes, same here.



I run ompi_info and it has the gm btl (see ompi_info below).

I have tried using the --prefix flag to mpirun as well as setting
PATH and LD_LIBRARY_PATH.


That's all I did also.



What am I missing?

Scott


% ompi_info -param btl gm
   MCA btl: parameter "btl_base_debug" (current value: "0")
            If btl_base_debug is 1 standard debug is output, if > 1 verbose debug is output
   MCA btl: parameter "btl" (current value: )
            Default selection set of components for the btl framework ( means "use all components that can be found")
   MCA btl: parameter "btl_base_verbose" (current value: "0")
            Verbosity level for the btl framework (0 = no verbosity)
   MCA btl: parameter "btl_gm_free_list_num" (current value: "8")
   MCA btl: parameter "btl_gm_free_list_max" (current value: "-1")
   MCA btl: parameter "btl_gm_free_list_inc" (current value: "8")
   MCA btl: parameter "btl_gm_debug" (current value: "0")
   MCA btl: parameter "btl_gm_mpool" (current value: "gm")
   MCA btl: parameter "btl_gm_max_ports" (current value: "16")
   MCA btl: parameter "btl_gm_max_boards" (current value: "4")
   MCA btl: parameter "btl_gm_max_modules" (current value: "4")
   MCA btl: parameter "btl_gm_num_high_priority" (current value: "8")
   MCA btl: parameter "btl_gm_num_repost" (current value: "4")
   MCA btl: parameter "btl_gm_port_name" (current value: "OMPI")
   MCA btl: parameter "btl_gm_exclusivity" (current value: "1024")
   MCA btl: parameter "btl_gm_eager_limit" (current value: "32768")
   MCA btl: parameter "btl_gm_min_send_size" (current value: "32768")
   MCA btl: parameter "btl_gm_max_send_size" (current value: "65536")
   MCA btl: parameter "btl_gm_min_rdma_size" (current value: "524288")
   MCA btl: parameter "btl_gm_max_rdma_size" (current value: "131072")
   MCA btl: parameter "btl_gm_flags" (current value: "50")
   MCA btl: parameter "btl_gm_bandwidth" (current value: "250")
   MCA btl: parameter "btl_gm_priority" (current value: "0")
   MCA btl: parameter "btl_base_warn_component_unused" (current value: "1")
            This parameter is used to turn on warning messages when certain NICs are not used





% mpirun --prefix $OMPI -np 4 --host fog33,fog33,fog34,fog34 -mca btl self,sm,gm ./hpcc
--
[0,1,1]: Myrinet/GM on host fog33 was unable to find any NICs.
Another transport will be used instead, although this may result in
lower performance.
--
--
[0,1,0]: Myrinet/GM on host fog33 was unable to find any NICs.
Another transport will be used instead, although this may result in
lower performance.
--
--
Process 0.1.3 is unable to reach 0.1.0 for MPI communication.
If you specified the use of a BTL component, you may have
forgotten a component (such as "self") in the list of
usable components.
--

Re: [OMPI users] removing hard-coded paths from OpenMPI shared libraries

2006-11-21 Thread Patrick Jessee

Jeff,

Thanks for the response:


mpirun --prefix /otherdir ...



This might be good enough to do what you need.


I don't think this will work (or is all that is needed to work).  We are 
actually already using the --prefix option to mpirun and still run into 
the problem.


When Fluent is distributed, we typically package/distribute all the
runtime MPI files that it needs so that the application is
self-contained.  The distribution directory for the MPIs is different
from where the MPIs were originally built/installed (and will definitely
be different still when the user installs the application).  We use the
common approach of setting LD_LIBRARY_PATH (or --prefix or other) in a
runtime wrapper script to reflect the final installation location and to
pick up shared libraries at runtime.  Thus, the application (and the MPI)
can be infinitely redistributed/installed and still function.  Some MPIs
have a runtime-settable environment variable to define the final MPI
installation location.  For instance, HP-MPI uses MPI_ROOT to define the
installation location.
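
A minimal sketch of the kind of wrapper script meant here (the paths and names
are purely illustrative, not our actual scripts):

#!/bin/sh
# Resolve the directory the application was installed into at run time.
APPDIR=$(cd "$(dirname "$0")" && pwd)
# Point the dynamic linker at the bundled Open MPI libraries.
LD_LIBRARY_PATH="$APPDIR/openmpi/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export LD_LIBRARY_PATH
# Launch through the bundled mpirun, telling remote nodes where the install lives.
exec "$APPDIR/openmpi/bin/mpirun" --prefix "$APPDIR/openmpi" "$@"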


OpenMPI seems to be a little different because the final installation
location seems to be fixed at compile time.  When the libraries are
compiled, the installation location is encoded into the OpenMPI shared
libraries by the use of --rpath during linking (it's encoded into
libmpi.so and many shared libs under lib/openmpi).  Thus, the
installation doesn't seem to be movable after it is originally
installed.


For many users this works out well, but an option to build OpenMPI so
that it has the flexibility to be moved would be very nice :-).  I was
able to play with the libtool file to get most of OpenMPI to build
without --rpath (I think ompi_info didn't build), so there may not be
too much involved.  Whoever set up the shared library part of the build
process may know exactly what is needed.  I can share what I've done if
it helps.


-Patrick 





Jeff Squyres wrote:

This is certainly the default for GNU Libtool-built shared libraries
(such as Open MPI).  Ralf W -- is there a way to disable this?


As a sidenote, we added the "--prefix" option to mpirun to be able to  
override the LD_LIBRARY_PATH on remote nodes for circumstances like  
this.  For example, say you build/install to /somedir, but then  
distribute Open MPI and the user installs it to /otherdir.  I know  
almost nothing about Fluent :-(, but do you wrap the call to mpirun  
in a script/executable somewhere?  Such that you could hide:


mpirun --prefix /otherdir ...

This might be good enough to do what you need.

Would that work?


On Nov 20, 2006, at 2:54 PM, Patrick Jessee wrote:

 

Hello.  I'm wondering if anyone knows of a way to get Open MPI to
compile shared libraries without hard-coding the installation
directory in them.  After compiling and installing Open MPI, the
shared libraries have the installation paths hard-coded in
them.  For instance:


$ ldd libmpi.so
  liborte.so.0 => /usr/local/fluent/develop/multiport4.4/packages/lnamd64/openmpi/openmpi-1.1.2/lib/liborte.so.0 (0x002a956ea000)
  libnsl.so.1 => /lib64/libnsl.so.1 (0x002a95852000)
  libutil.so.1 => /lib64/libutil.so.1 (0x002a95968000)
  libm.so.6 => /lib64/tls/libm.so.6 (0x002a95a6c000)
  libpthread.so.0 => /lib64/tls/libpthread.so.0 (0x002a95bc4000)
  libc.so.6 => /lib64/tls/libc.so.6 (0x002a95cd8000)
  libopal.so.0 => /usr/local/fluent/develop/multiport4.4/packages/lnamd64/openmpi/openmpi-1.1.2/lib/libopal.so.0 (0x002a95f0)
  /lib64/ld-linux-x86-64.so.2 (0x00552000)
  libdl.so.2 => /lib64/libdl.so.2 (0x002a9605a000)


In the above, "/usr/local/fluent/develop/multiport4.4/packages/lnamd64/openmpi/openmpi-1.1.2/lib"
is hardcoded into libmpi.so using --rpath when libmpi.so is compiled.


This is problematic because the installation cannot be moved after  
it is installed.  It is often useful to compile/install libraries  
on one machine and then move the libraries to a different location  
on other machines (of course, LD_LIBRARY_PATH or some means then  
needs to be used to pick up libs are runtime).  This relocation is  
also useful when redistributing the MPI installation with an  
application.  The hard-coded paths prohibit this.


I've tried to modify the "--rpath" argument in libtool and opal/ 
libltdl/libtool, but have not gotten this to work.


Has anyone else had experience with this?  (I'm building OpenMPI  
1.1.2 on linux x86_64.)  Thanks in advance for any potential help.


Regards,

-Patrick




[OMPI users] Build OpenMPI for SHM only

2006-11-21 Thread Adam Moody

Hello,
We have some clusters which consist of a large pool of 8-way nodes 
connected via ethernet.  On these particular machines, we'd like our 
users to be able to run 8-way MPI jobs on node, but we *don't* want them 
to run MPI jobs across nodes via the ethernet.  Thus, I'd like to 
configure and build OpenMPI to provide shared memory support (or TCP 
loopback) but disable general TCP support.


I realize that you can run without tcp via something like "mpirun --mca 
btl ^tcp", but this is up to the user's discretion.  I need a way to 
disable it systematically.  Is there a way to configure it out at build 
time or is there some runtime configuration file I can modify to turn it 
off?  Also, when we configure "--without-tcp", the configure script 
doesn't complain, but TCP support is added anyway.


Thanks,
-Adam Moody
MPI Support @ LLNL


Re: [OMPI users] Build OpenMPI for SHM only

2006-11-21 Thread Brian W. Barrett

On Nov 21, 2006, at 5:49 PM, Adam Moody wrote:


Hello,
We have some clusters which consist of a large pool of 8-way nodes
connected via ethernet.  On these particular machines, we'd like our
users to be able to run 8-way MPI jobs on node, but we *don't* want them
to run MPI jobs across nodes via the ethernet.  Thus, I'd like to
configure and build OpenMPI to provide shared memory support (or TCP
loopback) but disable general TCP support.

I realize that you can run without tcp via something like "mpirun --mca
btl ^tcp", but this is up to the user's discretion.  I need a way to
disable it systematically.  Is there a way to configure it out at build
time or is there some runtime configuration file I can modify to turn it
off?  Also, when we configure "--without-tcp", the configure script
doesn't complain, but TCP support is added anyway.


Try adding --enable-mca-no-build=btl-tcp to the configure line.   
Autoconf is a bit odd -- it doesn't complain about --with or --enable  
arguments that it doesn't know about.  The reason for this is to make  
it easier to support packages that run sub-configure scripts, but it  
does make argument checking significantly more difficult...
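
A minimal sketch, assuming a build from the source tarball (the prefix shown is
illustrative):

./configure --prefix=/opt/openmpi --enable-mca-no-build=btl-tcp
make all install

With the TCP BTL excluded from the build, it simply is not there for users to
select at run time.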


Brian

--
  Brian Barrett
  Open MPI Team, CCS-1
  Los Alamos National Laboratory




Re: [OMPI users] Build OpenMPI for SHM only

2006-11-21 Thread Tim Prins
Hi,

I don't know if there is a way to do it in configure, but after installing you 
can go into the $prefix/lib/openmpi directory and delete mca_btl_tcp.*

This will remove the tcp component and thus users will not be able to use it. 
Note that you must NOT delete the mca_oob_tcp.* files, as these are used for 
our internal administrative messaging and we currently require it to be 
there.
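
Concretely, something like the following, where $prefix stands for the actual
installation prefix:

cd $prefix/lib/openmpi
rm -f mca_btl_tcp.*      # removes the TCP BTL component
# do NOT remove mca_oob_tcp.* - it is needed for the internal administrative messaging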

Thanks,

Tim Prins


On Tuesday 21 November 2006 07:49 pm, Adam Moody wrote:
> Hello,
> We have some clusters which consist of a large pool of 8-way nodes
> connected via ethernet.  On these particular machines, we'd like our
> users to be able to run 8-way MPI jobs on node, but we *don't* want them
> to run MPI jobs across nodes via the ethernet.  Thus, I'd like to
> configure and build OpenMPI to provide shared memory support (or TCP
> loopback) but disable general TCP support.
>
> I realize that you can run without tcp via something like "mpirun --mca
> btl ^tcp", but this is up to the user's discretion.  I need a way to
> disable it systematically.  Is there a way to configure it out at build
> time or is there some runtime configuration file I can modify to turn it
> off?  Also, when we configure "--without-tcp", the configure script
> doesn't complain, but TCP support is added anyway.
>
> Thanks,
> -Adam Moody
> MPI Support @ LLNL


[OMPI users] MX performance problem on two processor nodes

2006-11-21 Thread Iannetti, Anthony C. (GRC-RTB0)
Dear OpenMPI List:

 

I am running the Myrinet MX btl with OpenMPI on MacOSX 10.4.
I am running into a problem.  When I run on one processor per node,
OpenMPI runs just fine.   When I run on two processors per node
(slots=2), it seems to take forever (something is hanging).

 

Here is the command:

mpirun -mca btl mx,self -np 2 pi3f90.x

 

However, if I give the command:

mpirun -np 2 pi3f90.x

 

The process runs normally, but I do not know if it is using the Myrinet
network.  Is there a way to diagnose this problem?  mpirun -v and -d do
not seem to indicate which MCA components are actually being used.
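
One hedged way to check, based on the btl_base_verbose parameter that shows up
in the ompi_info listings elsewhere in this thread (the verbosity value below is
only illustrative):

mpirun -mca btl mx,self -mca btl_base_verbose 1 -np 2 pi3f90.x

Raising that verbosity should make the BTL framework report more about which
components it opens and selects.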

 

Thanks,

Tony

 

Anthony C. Iannetti, P.E.

NASA Glenn Research Center

Propulsion Systems Division, Combustion Branch

21000 Brookpark Road, MS 5-10

Cleveland, OH 44135

phone: (216)433-5586

email: anthony.c.ianne...@nasa.gov

 

Please note:  All opinions expressed in this message are my own and NOT
of NASA.  Only the NASA Administrator can speak on behalf of NASA.

 



Re: [OMPI users] MX performance problem on two processor nodes

2006-11-21 Thread Iannetti, Anthony C. (GRC-RTB0)
Dear OpenMPI List:

 

From looking at a recent thread, I see an mpirun command with shared
memory and mx:

 

mpirun -mca btl mx,sm,self -np 2 pi3f90.x

 

This works.  I may have forgotten to mention it, but I am using 1.1.2.  I
see there is an -mca mtl in version 1.2b1.  I do not think this exists
in 1.1.2.

Still, I would like to know which -mca settings are used automatically.

 

Thanks,

Tony

 

 

 

Anthony C. Iannetti, P.E.

NASA Glenn Research Center

Propulsion Systems Division, Combustion Branch

21000 Brookpark Road, MS 5-10

Cleveland, OH 44135

phone: (216)433-5586

email: anthony.c.ianne...@nasa.gov

 

Please note:  All opinions expressed in this message are my own and NOT
of NASA.  Only the NASA Administrator can speak on behalf of NASA.

 



From: Iannetti, Anthony C. (GRC-RTB0) 
Sent: Tuesday, November 21, 2006 8:39 PM
To: 'us...@open-mpi.org'
Subject: MX performance problem on two processor nodes

 

Dear OpenMPI List:

 

I am running the Myrinet MX btl with OpenMPI on MacOSX 10.4.
I am running into a problem.  When I run on one processor per node,
OpenMPI runs just fine.   When I run on two processors per node
(slots=2), it seems to take forever (something is hanging).

 

Here is the command:

mpirun -mca btl mx,self -np 2 pi3f90.x

 

However, if I give the command:

mpirun -np 2 pi3f90.x

 

The process runs normally, but I do not know if it is using the Myrinet
network.  Is there a way to diagnose this problem?  mpirun -v and -d do
not seem to indicate which MCA components are actually being used.

 

Thanks,

Tony

 

Anthony C. Iannetti, P.E.

NASA Glenn Research Center

Propulsion Systems Division, Combustion Branch

21000 Brookpark Road, MS 5-10

Cleveland, OH 44135

phone: (216)433-5586

email: anthony.c.ianne...@nasa.gov

 

Please note:  All opinions expressed in this message are my own and NOT
of NASA.  Only the NASA Administrator can speak on behalf of NASA.