Re: [OMPI users] running with the dr pml.

2006-12-05 Thread Galen M. Shipman
Brock Palen wrote: I was asked by Myricom to run a test using the data reliability pml. (dr) I ran it like so: $ mpirun --mca pml dr -np 4 ./xhpl Is this the right format for running the dr pml? This should be fine, yes. I can run HPL on our test cluster to see if something is wr
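A minimal sketch of the run being discussed, plus one way to check that the dr component is present on the install (the ompi_info form is an assumption about this build):

  # run HPL over the data-reliability PML
  mpirun --mca pml dr -np 4 ./xhpl
  # list the dr PML's parameters to confirm the component was built (assumed syntax)
  ompi_info --param pml dr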

Re: [OMPI users] myrinet mx and openmpi using solaris, sun compilers

2006-11-21 Thread Galen M. Shipman
Lydia Heck wrote: Thank you very much. I tried mpirun -np 6 -machinefile ./myh -mca pml cm ./b_eff What was the performance (latency and bandwidth)? and to amuse you mpirun -np 6 -machinefile ./myh -mca btl mx,sm,self ./b_eff Same question here as well.. Thanks, Galen with myh c

Re: [OMPI users] myrinet mx and openmpi using solaris, sun compilers

2006-11-20 Thread Galen M. Shipman
m2001(120) > mpirun -np 6 -hostfile hostsfile -mca btl mx,self b_eff This does appear to be a bug, although you are using the MX BTL. Our higher performance path is the MX MTL. To use this try: mpirun -np 6 -hostfile hostsfile -mca pml cm b_eff Also, just for grins, could you try: mpi
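The two MX paths being compared, side by side (hostfile and benchmark names taken from the message):

  # MX through the BTL (the path that appears to trigger the bug)
  mpirun -np 6 -hostfile hostsfile -mca btl mx,self b_eff
  # MX through the MTL, selected via the cm PML (the higher-performance path)
  mpirun -np 6 -hostfile hostsfile -mca pml cm b_eff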

Re: [OMPI users] Fault Tolerance & Behavior

2006-10-31 Thread Galen M. Shipman
Galen M. Shipman wrote: Gleb Natapov wrote: On Mon, Oct 30, 2006 at 11:45:53AM -0700, Troy Telford wrote: On Sun, 29 Oct 2006 01:34:06 -0700, Gleb Natapov wrote: If you use OB1 PML (default one) it will never recover from link down error no matter how many other

Re: [OMPI users] Fault Tolerance & Behavior

2006-10-31 Thread Galen M. Shipman
Gleb Natapov wrote: On Mon, Oct 30, 2006 at 11:45:53AM -0700, Troy Telford wrote: On Sun, 29 Oct 2006 01:34:06 -0700, Gleb Natapov wrote: If you use OB1 PML (default one) it will never recover from link down error no matter how many other transports you have. The reason is that OB1

Re: [OMPI users] dual Gigabit ethernet support

2006-08-22 Thread Galen M. Shipman
Jayanta, What is your bus on this machine? If it is PCI-X 133 you are going to be limited; memory bandwidth could also be the bottleneck. Thanks, Galen Jayanta Roy wrote: Hi, In between two nodes I have dual Gigabit ethernet full duplex links. I was doing benchmarking using non-blo
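If the aim is to stripe traffic over both GigE links, a hedged sketch of pointing the TCP BTL at the two interfaces (the eth0/eth1 names and the benchmark binary are assumptions):

  mpirun -np 2 -mca btl tcp,sm,self -mca btl_tcp_if_include eth0,eth1 ./my_benchmark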

Re: [OMPI users] network preference

2006-08-10 Thread Galen M. Shipman
I think the message was missing from one of the releases but is now in the trunk - Galen On Aug 10, 2006, at 3:44 PM, Andrew Friedley wrote: Donald Kerr wrote: Hey Andrew I have one for you... I get the following error message on a node that does not have any IB cards -

Re: [OMPI users] Proprietary transport layer for openMPI...

2006-08-07 Thread Galen M. Shipman
Durga, Currently there are two options for porting an interconnect to Open MPI, one would be to use the BTL interface (Byte Transfer Layer). Another would be to use the MTL (Matching Transport Layer). The difference is that the MTL is useful for those APIs which expose matching and other
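Either way, a finished component would be selected at run time like the existing ones; a sketch using a hypothetical component name "mynet" and a placeholder application:

  # if the port is written as a BTL
  mpirun -np 4 -mca btl mynet,sm,self ./app
  # if the port is written as an MTL, driven through the cm PML
  mpirun -np 4 -mca pml cm -mca mtl mynet ./app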

Re: [OMPI users] Problem with Openmpi 1.1

2006-07-11 Thread Galen M. Shipman
m an OS X run. I'm about to test against 1.0.3a1r10670. Justin. On 7/6/06, *Galen M. Shipman* <gship...@lanl.gov> wrote: Justin, Is the OS X run showing the same residual failure?

Re: [OMPI users] Problem with Openmpi 1.1

2006-07-06 Thread Galen M. Shipman
peedy response, Justin. On 7/6/06, Galen M. Shipman < gship...@lanl.gov> wrote: Hey Justin, Please provide us your mca parameters (if any), these could be in a config file, environment variables or on the command line. Thanks, Galen On Jul 6, 2006, at 9:22 AM, Justin Bronder wrote: As

Re: [OMPI users] Problem with Openmpi 1.1

2006-07-06 Thread Galen M. Shipman
Hey Justin, Please provide us your mca parameters (if any), these could be in a config file, environment variables or on the command line. Thanks, Galen On Jul 6, 2006, at 9:22 AM, Justin Bronder wrote: As far as the nightly builds go, I'm still seeing what I believe to be this problem in
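A sketch of the three places an MCA parameter can be set, using BTL selection as the example (the values and application name are illustrative; the file path and variable prefix follow the usual Open MPI conventions):

  # 1. on the command line
  mpirun -np 2 -mca btl gm,sm,self ./app
  # 2. as an environment variable
  export OMPI_MCA_btl=gm,sm,self
  # 3. in a config file: $HOME/.openmpi/mca-params.conf
  btl = gm,sm,self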

Re: [OMPI users] Error Polling HP CQ Status on PPC64 LInux with IB

2006-06-29 Thread Galen M. Shipman
I'm currently working with Owen on this issue.. will continue my findings on list.. - Galen On Jun 29, 2006, at 7:56 AM, Jeff Squyres (jsquyres) wrote: Owen -- Sorry, we all fell [way] behind on e-mail because many of us were at an OMPI developer's meeting last week. :-( In the interi

Re: [OMPI users] gm bandwidth results disappointing

2006-06-13 Thread Galen M. Shipman
Hi Brock, You may wish to try running with the runtime option: -mca mpi_leave_pinned 1 This turns on registration caching and such. - Galen On Jun 13, 2006, at 8:01 AM, Brock Palen wrote: I ran a test using openmpi-1.0.2 on OSX vs mpich-1.2.6 from Myricom and I get lacking results from
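A sketch of the full GM invocation with registration caching turned on (the benchmark name is a placeholder):

  mpirun -np 2 -mca btl gm,sm,self -mca mpi_leave_pinned 1 ./my_benchmark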

Re: [OMPI users] Open MPI 1.1a2 w/VAPI RHEL3 4-node Pallas Alltoall fails

2006-05-10 Thread Galen M. Shipman
Hi Scott, I believe this has been corrected on the trunk. This should hit the 1.1 release branch tonight. Thanks, Galen On May 9, 2006, at 10:27 AM, Scott Weitzenkamp (sweitzen) wrote: Pallas runs OK up to Alltoall test, then we get: /data/software/qa/MPI/openmpi-1.1a2-rhel4-`uname -m`-

Re: [OMPI users] how can I tell for sure that I'm using mvapi

2006-04-13 Thread Galen M. Shipman
Hi Bernie, You may specify which BTLs to use at runtime using an mca parameter: mpirun -np 2 -mca btl self,mvapi ./my_app This specifies to only use self (loopback) and mvapi. You may want to also use sm (shared memory) if you have multi-core or multi-proc nodes, such as: mpirun -np 2 -mca btl s
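Completing the thought with shared memory included (the application name comes from the message):

  mpirun -np 2 -mca btl self,sm,mvapi ./my_app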

Re: [OMPI users] Performance of ping-pong using OpenMPI over Infiniband

2006-03-16 Thread Galen M. Shipman
Hi Jean, Take a look here: http://www.open-mpi.org/faq/?category=infiniband#ib-leave-pinned This should improve performance for micro-benchmarks and some applications. Please let me know if this doesn't solve the issue. Thanks, Galen On Mar 16, 2006, at 10:34 AM, Jean Latour wrote: Hel

Re: [OMPI users] Memory allocation issue with OpenIB

2006-03-16 Thread Galen M. Shipman
Emanuel, Thanks for the tip on this issue, we will be adding it to the FAQ shortly. - Galen On Mar 15, 2006, at 4:29 PM, Emanuel Ziegler wrote: Hi Davide! You are using the -prefix option. I guess this is due to the fact that You cannot set the paths appropriately. Most likely You are

Re: [OMPI users] Memory allocation issue with OpenIB

2006-03-15 Thread Galen M. Shipman
report? Galen mpirun -v -d -hostfile hostfile -np 2 -prefix /usr/local mpi_test Thank you so much for your help! Davide Galen M. Shipman wrote: Hi Davide, Are you able to run this as root? This would tell me if this is a permissions issue. Also what options are you specifying to mpirun? Tha
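A hedged sketch of the usual alternative to -prefix, setting the paths on every node instead (the /usr/local prefix is taken from the command above):

  export PATH=/usr/local/bin:$PATH
  export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
  mpirun -v -d -hostfile hostfile -np 2 mpi_test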

Re: [OMPI users] Memory allocation issue with OpenIB

2006-03-15 Thread Galen M. Shipman
Hi Davide, Are you able to run this as root? This would tell me if this is a permissions issue. Also what options are you specifying to mpirun? Thanks, Galen On Mar 15, 2006, at 1:41 PM, Davide Bergamasco wrote: Hello, I'm trying to run OpenMPI on top of OpenIB but I'm running into a wei

Re: [O-MPI users] direct openib btl and latency

2006-02-10 Thread Galen M. Shipman
I've been working for the MVAPICH project for around three years. Since this thread is discussing MVAPICH, I thought I should post to this thread. Galen's description of MVAPICH is not accurate. MVAPICH uses RDMA for short messages to deliver performance benefits to the applications. However, i

Re: [O-MPI users] does btl_openib work with multiple ports ?

2006-02-07 Thread Galen M. Shipman
This is very helpful, I will try to obtain a system wired for dual port in order to correct this. Thanks, Galen On Tue, 7 Feb 2006, Jean-Christophe Hugly wrote: On Thu, 2006-02-02 at 21:49 -0700, Galen M. Shipman wrote: I suspect the problem may be in the bcast

Re: [O-MPI users] Open-MPI all-to-all performance

2006-02-05 Thread Galen M. Shipman
Hi Konstantin, MPI_Alltoall_Isend_Irecv This is a very unscalable algorithm in skampi as it simply posts N MPI_Irecv's and MPI_Isend's and then does a Waitall. We shouldn't have an issue though on 8 procs but in general I would expect the performance of this algorithm to degrade quite q
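A minimal C sketch of the pattern skampi's MPI_Alltoall_Isend_Irecv measures (buffer layout and message size are placeholders, not skampi's actual code):

  /* Naive all-to-all: post N irecvs and N isends, then wait on everything.
   * Every rank talks to every other rank at once, which is why this
   * degrades as the process count grows. */
  #include <mpi.h>
  #include <stdlib.h>

  void naive_alltoall(char *sendbuf, char *recvbuf, int bytes_per_peer, MPI_Comm comm)
  {
      int np, i;
      MPI_Request *reqs;

      MPI_Comm_size(comm, &np);
      reqs = malloc(2 * np * sizeof(MPI_Request));

      for (i = 0; i < np; i++)                  /* post all receives first */
          MPI_Irecv(recvbuf + i * bytes_per_peer, bytes_per_peer, MPI_BYTE,
                    i, 0, comm, &reqs[i]);
      for (i = 0; i < np; i++)                  /* then all sends */
          MPI_Isend(sendbuf + i * bytes_per_peer, bytes_per_peer, MPI_BYTE,
                    i, 0, comm, &reqs[np + i]);

      MPI_Waitall(2 * np, reqs, MPI_STATUSES_IGNORE);
      free(reqs);
  }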

Re: [O-MPI users] Open-MPI all-to-all performance

2006-02-03 Thread Galen M. Shipman
Hello Konstantin, By using coll_basic_crossover 8 you are forcing all of your benchmarks to use the basic collectives, which offer poor performance. Running the skampi Alltoall benchmark with the tuned collectives, I get the following results, which seem to scale quite well, when I have a bit
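A hedged sketch of letting the tuned component handle the collectives instead of forcing basic (whether tuned is built into a given install is an assumption; the binary name is a placeholder):

  mpirun -np 8 -mca coll self,basic,tuned ./skampi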

Re: [O-MPI users] does btl_openib work ?

2006-02-02 Thread Galen M. Shipman
isolate the problem. Thanks, Galen On Feb 2, 2006, at 7:04 PM, Jean-Christophe Hugly wrote: On Thu, 2006-02-02 at 15:19 -0700, Galen M. Shipman wrote: Is it possible for you to get a stack trace where this is hanging? You might try: mpirun -prefix /opt/ompi -wdir `pwd` -machinefile /root
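A hedged sketch of pulling a stack trace out of a hung rank with gdb (looking the process up by name is an assumption about the setup):

  # on the node where the benchmark is stuck
  gdb -p $(pgrep -n PMB-MPI1)
  (gdb) thread apply all bt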

Re: [O-MPI users] does btl_openib work ?

2006-02-02 Thread Galen M. Shipman
Hi Jean, I just noticed that you are running Quad proc nodes and are using: bench1 slots=4 max-slots=4 in your machine file and you are running the benchmark using only 2 processes via: mpirun -prefix /opt/ompi -wdir `pwd` -machinefile /root/machines -np 2 PMB-MPI1 By using slots=4 y
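A hedged sketch of forcing the two ranks onto different nodes so the interconnect, rather than shared memory, is exercised (-bynode is assumed to be available in this release; the paths and benchmark name come from the command above):

  mpirun -prefix /opt/ompi -wdir `pwd` -machinefile /root/machines -bynode -np 2 PMB-MPI1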

Re: [O-MPI users] does btl_openib work ?

2006-01-20 Thread Galen M. Shipman
Jean, I am not able to reproduce this problem on a non-threaded build; can you try taking a fresh src package and configuring without thread support? I am wondering if this is simply a threading issue. I did note that you said you configured both with and without threads but try the conf
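A hedged sketch of a fresh, thread-free build (the configure flag names are assumptions about the 1.x options; the prefix matches the one used elsewhere in the thread):

  ./configure --prefix=/opt/ompi --disable-mpi-threads --disable-progress-threads
  make all install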

Re: [O-MPI users] error creating high priority cq for mthca0

2005-12-06 Thread Galen M. Shipman
Hi Daryl, Sounds like this might be a ulimit issue, what do you get when you run ulimit -l? Also, check out: http://www.open-mpi.org/faq/?category=infiniband Thanks, Galen On Dec 6, 2005, at 10:46 AM, Daryl W. Grunau wrote: Hi, I'm running OMPI 1.1a1r8378 on 2.6.14 + recent OpenIB stack a
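The usual check and fix for locked-memory limits on OpenIB systems (the limits.conf entries assume pam_limits is in use; exact values are site policy):

  # a small number such as 32 here usually explains the error
  ulimit -l

  # /etc/security/limits.conf
  *  soft  memlock  unlimited
  *  hard  memlock  unlimited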

Re: [O-MPI users] problem with overflow 1.8ab code using GM

2005-11-21 Thread Galen M. Shipman
Bernard, This code is using MPI_Alloc_mem, which is good. Do you have an idea of approx. how much memory has been allocated via MPI_Alloc_mem at the time of failure? Thanks, Galen On Mon, 21 Nov 2005, Borenstein, Bernard S wrote: Things have improved a lot since I ran the code using the e
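A minimal C sketch of allocating a buffer through MPI_Alloc_mem so GM can register it (the size is a placeholder):

  #include <mpi.h>

  int main(int argc, char **argv)
  {
      void *buf;

      MPI_Init(&argc, &argv);
      /* ask MPI for memory it can register with the NIC (matters for GM) */
      MPI_Alloc_mem((MPI_Aint)(1 << 20), MPI_INFO_NULL, &buf);
      /* ... use buf as a send/receive buffer ... */
      MPI_Free_mem(buf);
      MPI_Finalize();
      return 0;
  }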

Re: [O-MPI users] 1.0rc5 is up

2005-11-11 Thread Galen M. Shipman
The bad: OpenIB frequently crashes with the error: *** [0,1,2][btl_openib_endpoint.c: 135:mca_btl_openib_endpoint_post_send] error posting send request errno says Operation now in progress[0,1,2d [0,1,3][btl_openib_endpoint.c: 135:mca_btl_openib_endpoint_post_send] error posting s

Re: [O-MPI users] OpenIB module problem/questions:

2005-11-09 Thread Galen M. Shipman
On Nov 8, 2005, at 6:10 PM, Troy Telford wrote: I decided to try OpenMPI using the 'openib' module, rather than 'mvapi'; however I'm having a bit of difficulty: The test hardware is the same as in my earlier posts, the only software difference is: Linux 2.6.14 (OpenIB 2nd gen IB drivers)

Re: [O-MPI users] Infiniband performance problems (mvapi)

2005-10-31 Thread Galen M. Shipman
On Oct 31, 2005, at 8:50 AM, Mike Houston wrote: When only sending a few messages, we get reasonably good IB performance, ~500MB/s (MVAPICH is 850MB/s). What is your message size? Are you using the leave pinned option? If not, specify -mca mpi_leave_pinned 1 option to mpirun. This tells O

Re: [O-MPI users] HPL & HPCC: Wedged

2005-10-25 Thread Galen M. Shipman
Correction: HPL_NO_DATATYPE should be: HPL_NO_MPI_DATATYPE. - Galen On Oct 25, 2005, at 10:13 AM, Galen M. Shipman wrote: Hi Troy, Sorry for the delay, I am now able to reproduce this behavior when I do not specify HPL_NO_DATATYPE. If I do specify HPL_NO_DATATYPE the run completes. We will
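A hedged sketch of where the flag goes in HPL's build configuration (the Make.<arch> layout follows HPL's stock template):

  # Make.<arch>
  HPL_OPTS = -DHPL_NO_MPI_DATATYPE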

Re: [O-MPI users] HPL & HPCC: Wedged

2005-10-25 Thread Galen M. Shipman
Hi Troy, Sorry for the delay, I am now able to reproduce this behavior when I do not specify HPL_NO_DATATYPE. If I do specify HPL_NO_DATATYPE the run completes. We will be looking into this now. Thanks, Galen On Oct 21, 2005, at 5:03 PM, Troy Telford wrote: I've been trying out the RC4

Re: [O-MPI users] how do you select which network/trasport to use at run-time?

2005-08-23 Thread Galen M. Shipman
Hi Peter, I'm a little surprised that tcp was used -- OMPI should "prefer" the low latency interconnects (such as mvapi) to tcp and automatically use them. This is a small issue and should be fixed in the next day or two. In the meantime, to run using mvapi only, use: mpirun -np 2 -mca
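Until that fix lands, another hedged option is to exclude tcp outright rather than listing every transport (the caret-exclusion form and the application name are assumptions):

  mpirun -np 2 -mca btl ^tcp ./app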