Re: [OMPI users] Memory manager

2008-05-20 Thread Gleb Natapov
On Tue, May 20, 2008 at 12:17:02PM +1000, Terry Frankcombe wrote: > To tell you all what no one wanted to tell me, yes, it does seem to be > the memory manager. Compiling everything with > --with-memory-manager=none returns the vmem use to the more reasonable > ~100MB per process (down from >8GB).
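A minimal sketch of the rebuild described above, assuming an Open MPI 1.2.x source tree; the install prefix is a placeholder:

    # Rebuild without the ptmalloc2 memory manager (prefix is hypothetical)
    ./configure --prefix=/opt/openmpi-1.2.5 --with-memory-manager=none
    make all install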

Re: [OMPI users] build OpenMPI with OpenIB

2008-03-07 Thread Gleb Natapov
On Fri, Mar 07, 2008 at 10:36:42AM +, Yuan Wan wrote: > > Hi all, > > I want to build OpenMPI-1.2.5 on my Infiniband cluster which has OFED-2.1 > installed. > > I configured OpenMPI as: > > ./configure --prefix=/ex

Re: [OMPI users] OpenMPI 1.2.5 race condition / core dump with MPI_Reduce and MPI_Gather

2008-02-29 Thread Gleb Natapov
unfortunate to say it, only few days after we had the discussion > about the flow control, but the only correct solution here is to add PML > level flow control ... > > george. > > On Feb 28, 2008, at 2:55 PM, Christian Bell wrote: > >> On Thu, 28 Feb 2008, Gleb Na

Re: [OMPI users] OpenMPI 1.2.5 race condition / core dump with MPI_Reduce and MPI_Gather

2008-02-28 Thread Gleb Natapov
On Wed, Feb 27, 2008 at 10:01:06AM -0600, Brian W. Barrett wrote: > The only solution to this problem is to suck it up and audit all the code > to eliminate calls to opal_progress() in situations where infinite > recursion can result. It's going to be long and painful, but there's no > quick

Re: [OMPI users] openmpi credits for eager messages

2008-02-05 Thread Gleb Natapov
On Tue, Feb 05, 2008 at 08:07:59AM -0500, Richard Treumann wrote: > There is no misunderstanding of the MPI standard or the definition of > blocking in the bug3 example. Both bug 3 and the example I provided are > valid MPI. > > As you say, blocking means the send buffer can be reused when the MP

Re: [OMPI users] openmpi credits for eager messages

2008-02-05 Thread Gleb Natapov
On Mon, Feb 04, 2008 at 04:23:13PM -0500, Sacerdoti, Federico wrote: > Bug3 is a test-case derived from a real, scalable application (desmond > for molecular dynamics) that several experienced MPI developers have > worked on. Note the MPI_Send calls of processes N>0 are *blocking*; the > openmpi si

Re: [OMPI users] openmpi credits for eager messages

2008-02-04 Thread Gleb Natapov
On Mon, Feb 04, 2008 at 02:54:46PM -0500, Richard Treumann wrote: > In my example, each sender task 1 to n-1 will have one rendezvous message > to task 0 at a time. The MPI standard suggests descriptors be small enough > and there be enough descriptor space for reasonable programs. The > standar

Re: [OMPI users] openmpi credits for eager messages

2008-02-04 Thread Gleb Natapov
On Mon, Feb 04, 2008 at 09:08:45AM -0500, Richard Treumann wrote: > To me, the MPI standard is clear that a program like this: > > task 0: > MPI_Init > sleep(3000); > start receiving messages > > each of tasks 1 to n-1: > MPI_Init > loop 5000 times >MPI_Send(small message to 0) > end loop >

Re: [OMPI users] mixed myrinet/non-myrinet nodes

2008-01-15 Thread Gleb Natapov
On Tue, Jan 15, 2008 at 09:49:40AM -0500, M Jones wrote: > Hi, > >We have a mixed environment in which roughly 2/3 of the nodes > in our cluster have myrinet (mx 1.2.1), while the full cluster has > gigE. Running open-mpi exclusively on myrinet nodes or exclusively > on non-myrinet nodes is f
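A hedged example of one way to launch across such a mixed cluster, assuming both the mx and tcp BTLs are built; the hostfile, process count, and program name are placeholders:

    # Offer both Myrinet (mx) and gigE (tcp) transports; 'self' handles
    # loopback. Pairs of ranks that do not share MX fall back to TCP.
    mpirun -np 16 --hostfile hosts --mca btl mx,tcp,self ./app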

Re: [OMPI users] Ideal MTU in Infiniband

2008-01-10 Thread Gleb Natapov
On Thu, Jan 10, 2008 at 06:23:50PM +0530, Parag Kalra wrote: > Hello all, > > Any ideas? Yes. The idea is that Open MPI knows best. Run it with the default value. Usually a bigger MTU is better, but some HW has bugs. Open MPI knows this and chooses the best value for your HW. > > -- > Parag Kalr
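As the reply suggests, the safest course is to leave the MTU at Open MPI's default; the chosen value can be inspected rather than guessed (parameter names may differ between releases):

    # Show the openib BTL parameters, including the MTU setting Open MPI uses
    ompi_info --param btl openib | grep -i mtu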

Re: [OMPI users] Gigabit ethernet (PCI Express) and openmpi v1.2.4

2007-12-17 Thread Gleb Natapov
On Sun, Dec 16, 2007 at 06:49:30PM -0500, Allan Menezes wrote: > Hi, > How many PCI-Express Gigabit ethernet cards does OpenMPI version 1.2.4 > support with a corresponding linear increase in bandwidth measured with > netpipe NPmpi and openmpi mpirun? > With two PCI express cards I get a B/W of 1
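A hypothetical invocation for the multi-NIC measurement above, assuming two PCI-Express interfaces named eth1 and eth2; the interface names, hostfile, and benchmark path are placeholders:

    # Restrict the TCP BTL to the two gigE NICs so Open MPI stripes large
    # messages across both links during the NetPIPE run
    mpirun -np 2 --hostfile hosts --mca btl tcp,self \
           --mca btl_tcp_if_include eth1,eth2 ./NPmpi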

Re: [OMPI users] Does MPI_Bsend always use the buffer?

2007-12-11 Thread Gleb Natapov
On Tue, Dec 11, 2007 at 10:27:32AM -0500, Bradley, Peter C. (MIS/CFD) wrote: > In OpenMPI, does MPI_Bsend always copy the message to the user-specified > buffer, or will it avoid the copy in situations where it knows the send can > complete? If the message size is smaller than the eager limit, Open MPI
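The reply points to the eager limit as the deciding factor; it can be checked for the BTL in use (the openib BTL is shown here as an assumption, and other BTLs expose their own limit):

    # Show the eager limit below which small sends complete without waiting
    # for the receiver
    ompi_info --param btl openib | grep -i eager_limit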

Re: [OMPI users] machinefile and rank

2007-11-07 Thread Gleb Natapov
On Tue, Nov 06, 2007 at 09:22:50PM -0500, Jeff Squyres wrote: > Unfortunately, not yet. I believe that this kind of functionality is > slated for the v1.3 series -- is that right Ralph/Voltaire? > Yes, the file format will be different, but arbitrary mapping will be possible. > > On Nov 5, 20

Re: [OMPI users] IB latency on Mellanox ConnectX hardware

2007-10-18 Thread Gleb Natapov
On Wed, Oct 17, 2007 at 05:43:14PM -0400, Jeff Squyres wrote: > Several users have noticed poor latency with Open MPI when using the > new Mellanox ConnectX HCA hardware. Open MPI was getting about 1.9us > latency with 0 byte ping-pong benchmarks (e.g., NetPIPE or > osu_latency). This has b

Re: [OMPI users] Multiple threads

2007-10-01 Thread Gleb Natapov
On Mon, Oct 01, 2007 at 10:39:12AM +0200, Olivier DUBUISSON wrote: > Hello, > > I compile openmpi 1.2.3 with options ./configure --with-threads=posix > --enable-mpi-thread --enable-progress-threads --enable-smp-locks. > > My program has 2 threads (main thread and an other). When i run it, i > ca
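Before debugging the program itself, it may help to confirm that the installed build really carries the requested thread support; a quick check (output wording varies by release):

    # Report whether MPI thread support and progress threads were compiled in
    ompi_info | grep -i thread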

Re: [OMPI users] SKaMPI hangs on collectives and onesided

2007-09-20 Thread Gleb Natapov
usually applications call MPI_Finalize at some point in time and we have a barrier there so all outstanding requests are progressed. > > Thanks, > Jelena > > Gleb Natapov wrote: >> On Wed, Sep 19, 2007 at 01:58:35PM -0600, Edmund Sumbar wrote: >> >>> I'm t

Re: [OMPI users] SKaMPI hangs on collectives and onesided

2007-09-19 Thread Gleb Natapov
On Wed, Sep 19, 2007 at 01:58:35PM -0600, Edmund Sumbar wrote: > I'm trying to run skampi-5.0.1-r0191 under PBS > over IB with the command line > >mpirun -np 2 ./skampi -i coll.ski -o coll_ib.sko Can you add choose_barrier_synchronization() to coll.ski and try again? It looks like this one: h
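A sketch of the change the reply asks for, assuming GNU sed and that SKaMPI accepts the synchronization directive at the top of the input file (the file and output names are taken from the command line above):

    # Insert the barrier-synchronization directive at the top of coll.ski,
    # then rerun the benchmark
    sed -i '1i choose_barrier_synchronization()' coll.ski
    mpirun -np 2 ./skampi -i coll.ski -o coll_ib.sko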

Re: [OMPI users] OpenMPI and Port Range

2007-08-31 Thread Gleb Natapov
On Fri, Aug 31, 2007 at 10:49:10AM +0200, Sven Stork wrote: > On Friday 31 August 2007 09:07, Gleb Natapov wrote: > > On Fri, Aug 31, 2007 at 08:04:00AM +0100, Simon Hammond wrote: > > > On 31/08/2007, Lev Givon wrote: > > > > Received from George Bosilca on Thu,

Re: [OMPI users] OpenMPI and Port Range

2007-08-31 Thread Gleb Natapov
On Fri, Aug 31, 2007 at 08:17:36AM +0100, Simon Hammond wrote: > On 31/08/2007, Gleb Natapov wrote: > > On Fri, Aug 31, 2007 at 08:04:00AM +0100, Simon Hammond wrote: > > > On 31/08/2007, Lev Givon wrote: > > > > Received from George Bosilca on Thu, Aug 30, 2007

Re: [OMPI users] OpenMPI and Port Range

2007-08-31 Thread Gleb Natapov
On Fri, Aug 31, 2007 at 08:04:00AM +0100, Simon Hammond wrote: > On 31/08/2007, Lev Givon wrote: > > Received from George Bosilca on Thu, Aug 30, 2007 at 07:42:52PM EDT: > > > I have a patch for this, but I never felt a real need for it, so I > > > never push it in the trunk. I'm not completely co

Re: [OMPI users] Basic problems with OpenMPI

2007-08-29 Thread Gleb Natapov
On Wed, Aug 29, 2007 at 03:22:54PM +0530, Amit Kumar Saha wrote: > Hi Gleb, > > I am sending a sample trace of my program: > > amit@ubuntu-desktop-1:~/mpi-exec$ mpirun --np 3 --hostfile > mpi-host-file HellMPI > > amit@debian-desktop-1's password: [ubuntu-desktop-1:28575] [0,0,0] > ORTE_ERROR_LO
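A hedged sketch of the usual remedy for this kind of remote-launch failure, assuming Open MPI 1.2.3 is installed under the same (hypothetical) prefix on every host and that passwordless ssh is set up to avoid the password prompt in the trace:

    # Make the install visible to the remote daemons, or let mpirun do it
    # via --prefix (install path is a placeholder)
    export PATH=/opt/openmpi-1.2.3/bin:$PATH
    export LD_LIBRARY_PATH=/opt/openmpi-1.2.3/lib:$LD_LIBRARY_PATH
    mpirun --prefix /opt/openmpi-1.2.3 --np 3 --hostfile mpi-host-file ./HellMPI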

Re: [OMPI users] Basic problems with OpenMPI

2007-08-29 Thread Gleb Natapov
On Wed, Aug 29, 2007 at 02:49:35PM +0530, Amit Kumar Saha wrote: > Hi Gleb, > > > > Have you installed Open MPI at the same place on all nodes? What command > > line are you using to run the app on more than one host? > > this is a sample run > > amit@ubuntu-desktop-1:~/mpi-exec$ mpirun --np 2 --ho

Re: [OMPI users] Basic problems with OpenMPI

2007-08-29 Thread Gleb Natapov
On Wed, Aug 29, 2007 at 02:32:58PM +0530, Amit Kumar Saha wrote: > Hi all, > > I have installed OpenMPI 1.2.3 on all my hosts (3). > > Now when I try to start a simple demo program ("hello world") using > ./a.out I get the error. When I run my program using "mpirun" on more > than one host it giv

Re: [OMPI users] Basic problems with OpenMPI

2007-08-29 Thread Gleb Natapov
On Wed, Aug 29, 2007 at 01:03:30PM +0530, Amit Kumar Saha wrote: > Also, is Open MPI 1.1 compatible with Open MPI 1.2.3? I mean to ask > whether an MPI executable generated using 1.1 can be run by 1.2.3? No. They are not compatible. -- Gleb.

Re: [OMPI users] Basic problems with OpenMPI

2007-08-29 Thread Gleb Natapov
On Wed, Aug 29, 2007 at 12:26:54PM +0530, Amit Kumar Saha wrote: > Hello all, > > I have installed Open MPI 1.2.3 from source on Debian 4.0. I did the > "make all install" using root privileges. > > Now when I try to execute a simple program , I get the following: > > debian-desktop-1:/home/amit

Re: [OMPI users] Basic problems with OpenMPI

2007-08-29 Thread Gleb Natapov
On Wed, Aug 29, 2007 at 11:42:29AM +0530, Amit Kumar Saha wrote: > hello all, > > I am just trying to get started with OpenMPI (version 1.1) on Linux. Version 1.1 is old and no longer supported. > > When I try to run a simple MPI - "Hello World" program, here is what I get: > > amit@ubuntu-deskt

Re: [OMPI users] OpenMPI fails to initalize the openib btl when run from SGE

2007-08-22 Thread Gleb Natapov
On Wed, Aug 22, 2007 at 03:31:20PM +0300, Noam Meltzer wrote: > Hi, > > I am running openmpi-1.2.3 compiled for 64bit on RHEL4u4. > I also have a Voltaire InfiniBand interconnect. > When I manually run jobs using the following command: > > /opt/local/openmpi-1.2.3-gcc4/bin/orterun -np 8 -hostfile
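A sketch of the usual first check when the openib BTL initializes interactively but not under a batch system, assuming the locked-memory limit is the culprit (a common cause, though not confirmed by this truncated excerpt; the hostfile name is a placeholder):

    # From inside an SGE job script: check the locked-memory limit that the
    # remotely launched processes inherit (ibverbs needs it to be large)
    /opt/local/openmpi-1.2.3-gcc4/bin/orterun -np 8 -hostfile hosts bash -c 'ulimit -l'
    # If the value is small, raise memlock in the SGE execd startup
    # environment (or /etc/security/limits.conf) and resubmit the job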

Re: [OMPI users] opal_init_Segmentation Fault

2007-07-17 Thread Gleb Natapov
On Tue, Jul 17, 2007 at 07:17:58AM -0400, Jeff Squyres wrote: > Unfortunately, this looks like a problem with your gcc installation > -- a compiler should never seg fault when it's trying to compile C > source code. > > FWIW: the file in question that it's trying to compile is actually > fro

Re: [OMPI users] Processes stuck in MPI_BARRIER

2007-06-20 Thread Gleb Natapov
On Tue, Jun 19, 2007 at 11:24:24AM -0700, George Bosilca wrote: > 1. I don't believe the OS releases the binding when we close the > socket. As an example, on Linux the kernel sockets are released at a > later moment. That means the socket might still be in use for the > next run. > This is n

Re: [OMPI users] OpenMPI/OpenIB/IMB hangs[Scanned]

2007-01-19 Thread Gleb Natapov
On Fri, Jan 19, 2007 at 05:51:49PM +, Arif Ali wrote: > >>I tried the nightly snapshot of OpenMPI-1.2b4r13137, which failed > >>miserably. > >> > > > >Can you describe what happened there? Is it failing in a different way? > > > Here's the output > > #-

Re: [OMPI users] IB bandwidth vs. kernels

2007-01-18 Thread Gleb Natapov
On Thu, Jan 18, 2007 at 07:17:13AM -0500, Robin Humble wrote: > On Thu, Jan 18, 2007 at 11:08:04AM +0200, Gleb Natapov wrote: > >On Thu, Jan 18, 2007 at 03:52:19AM -0500, Robin Humble wrote: > >> On Wed, Jan 17, 2007 at 08:55:31AM -0700, Brian W. Barrett wrote: > >> &

Re: [OMPI users] IB bandwidth vs. kernels

2007-01-18 Thread Gleb Natapov
On Thu, Jan 18, 2007 at 03:52:19AM -0500, Robin Humble wrote: > On Wed, Jan 17, 2007 at 08:55:31AM -0700, Brian W. Barrett wrote: > >On Jan 17, 2007, at 2:39 AM, Gleb Natapov wrote: > >> On Wed, Jan 17, 2007 at 04:12:10AM -0500, Robin Humble wrote: > >>> basical

Re: [OMPI users] IB bandwidth vs. kernels

2007-01-17 Thread Gleb Natapov
Hi Robin, On Wed, Jan 17, 2007 at 04:12:10AM -0500, Robin Humble wrote: > > so this isn't really an OpenMPI question (I don't think), but you guys > will have hit the problem if anyone has... > > basically I'm seeing wildly different bandwidths over InfiniBand 4x DDR > when I use different kern

Re: [OMPI users] mpool_gm_module error

2006-12-12 Thread Gleb Natapov
On Tue, Dec 12, 2006 at 12:58:00PM -0800, Reese Faucette wrote: > > Well I have no luck in finding a way to up the amount the system will > > allow GM to use. What is a recommended solution? Is this even a > > problem in most cases? Like am I encountering a corner case? > > upping the limit was

Re: [OMPI users] mpool_gm_module error

2006-12-11 Thread Gleb Natapov
On Mon, Dec 11, 2006 at 02:52:40PM -0500, Brock Palen wrote: > On Dec 11, 2006, at 2:45 PM, Reese Faucette wrote: > > >> Also I have no idea what the memory window question is, i will > >> look it up on google. > >> > >> aon075:~ root# dmesg | grep GM > >> GM: gm_register_memory will be able to lo

Re: [OMPI users] multiple LIDs

2006-12-06 Thread Gleb Natapov
s, say ethernet and > > > infiniband and a million byte of data is to be sent, then in this > > > case the data will be sent through infiniband (since its a fast path > > > .. please correct me here if i m wrong). > > > > > > If there are multiple such sen

Re: [OMPI users] multiple LIDs

2006-12-04 Thread Gleb Natapov
in a RR manner if they are connected > to the same port? One message can be split between multiple BTLs. > > -chev > > > On 12/4/06, Gleb Natapov wrote: > > On Mon, Dec 04, 2006 at 10:53:26PM +0530, Chevchenkovic Chevchenkovic wrote: > > > Hi, > > > It is no

Re: [OMPI users] multiple LIDs

2006-12-04 Thread Gleb Natapov
he same port of the same HCA. Can you explain what are you trying to achieve? > -chev > > > On 12/4/06, Gleb Natapov wrote: > > On Mon, Dec 04, 2006 at 01:07:08AM +0530, Chevchenkovic Chevchenkovic wrote: > > > Also could you please tell me which part of the openMP

Re: [OMPI users] multiple LIDs

2006-12-04 Thread Gleb Natapov
.c. > On 12/4/06, Chevchenkovic Chevchenkovic wrote: > > Is it possible to control the LID where the send and recvs are > > posted.. on either ends? > > > > On 12/3/06, Gleb Natapov wrote: > > > On Sun, Dec 03, 2006 at 07:03:33PM +0530, Chevchenkovic Chevchenkovi

Re: [OMPI users] multiple LIDs

2006-12-04 Thread Gleb Natapov
nib_max_lmc" parameter. > > On 12/3/06, Gleb Natapov wrote: > > On Sun, Dec 03, 2006 at 07:03:33PM +0530, Chevchenkovic Chevchenkovic wrote: > > > Hi, > > > I had this query. I hope some expert replies to it. > > > I have 2 nodes connected point-to-point using i

Re: [OMPI users] multiple LIDs

2006-12-03 Thread Gleb Natapov
On Sun, Dec 03, 2006 at 07:03:33PM +0530, Chevchenkovic Chevchenkovic wrote: > Hi, > I had this query. I hope some expert replies to it. > I have 2 nodes connected point-to-point using infiniband cable. There > are multiple LIDs for each of the end node ports. >When I give an MPI_Send, Are the

Re: [OMPI users] How to set paffinity on a multi-cpu node?

2006-12-01 Thread Gleb Natapov
On Fri, Dec 01, 2006 at 09:35:09AM -0500, Brock Palen wrote: > On Dec 1, 2006, at 9:23 AM, Gleb Natapov wrote: > > > On Fri, Dec 01, 2006 at 04:14:31PM +0200, Gleb Natapov wrote: > >> On Fri, Dec 01, 2006 at 11:51:24AM +0100, Peter Kjellstrom wrote: > >>> On Satu

Re: [OMPI users] How to set paffinity on a multi-cpu node?

2006-12-01 Thread Gleb Natapov
On Fri, Dec 01, 2006 at 04:14:31PM +0200, Gleb Natapov wrote: > On Fri, Dec 01, 2006 at 11:51:24AM +0100, Peter Kjellstrom wrote: > > On Saturday 25 November 2006 15:31, shap...@isp.nsc.ru wrote: > > > Hello, > > > i cant figure out, is there a way with open-mpi to

Re: [OMPI users] How to set paffinity on a multi-cpu node?

2006-12-01 Thread Gleb Natapov
On Fri, Dec 01, 2006 at 11:51:24AM +0100, Peter Kjellstrom wrote: > On Saturday 25 November 2006 15:31, shap...@isp.nsc.ru wrote: > > Hello, > > I can't figure out: is there a way with open-mpi to bind all > > threads on a given node to a specified subset of CPUs? > > For example, on a multi-socket
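A hedged note on what 1.2-era releases offer: processor affinity can be switched on globally, but pinning ranks to an arbitrary CPU subset needs the finer-grained controls of later releases. The example below only shows the global switch; the process count and program name are placeholders:

    # Bind each local rank to a processor (one rank per CPU, in order)
    mpirun -np 4 --mca mpi_paffinity_alone 1 ./app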

Re: [OMPI users] How to set paffinity on a multi-cpu node?

2006-11-29 Thread Gleb Natapov
On Wed, Nov 29, 2006 at 08:48:48AM -0500, Jeff Squyres wrote: > - There's also the issue that the BIOS determines core/socket order > mapping to Linux virtual processor IDs. Linux virtual processor 0 is > always socket 0, core 0. But what is linux virtual processor 1? Is > it socket 0, cor
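A quick way to see how a particular node's BIOS mapped sockets and cores onto Linux virtual processor IDs (the field names assume an x86 /proc/cpuinfo):

    # List each logical processor with its socket ("physical id") and core id
    grep -E 'processor|physical id|core id' /proc/cpuinfo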

Re: [OMPI users] dma using infiniband protocol

2006-11-02 Thread Gleb Natapov
On Thu, Nov 02, 2006 at 11:57:16AM -0800, Brian Budge wrote: > Thanks for the help guys. > > In my case the memory will be allocated and pinned by my other device > driver. Is it safe to simply use that memory? My pages won't be unpinned > as a result? > If your driver plays nicely with openib

Re: [OMPI users] dma using infiniband protocol

2006-11-02 Thread Gleb Natapov
On Thu, Nov 02, 2006 at 10:37:24AM -0800, Brian Budge wrote: > Hi all - > > I'm wondering how DMA is handled in OpenMPI when using the infiniband > protocol. In particular, will I get a speed gain if my read/write buffers > are already pinned via mlock? > No you will not. mlock has nothing to do

Re: [OMPI users] Fault Tolerance & Behavior

2006-10-31 Thread Gleb Natapov
On Mon, Oct 30, 2006 at 11:45:53AM -0700, Troy Telford wrote: > On Sun, 29 Oct 2006 01:34:06 -0700, Gleb Natapov > wrote: > > > If you use OB1 PML (default one) it will never recover from link down > > error no matter how many other transports you have. The reason is that

Re: [OMPI users] Fault Tolerance & Behavior

2006-10-29 Thread Gleb Natapov
On Thu, Oct 26, 2006 at 05:39:13PM -0600, Troy Telford wrote: > I'm also confident that both TCP & Myrinet would throw an error when they > time out; it's just that I haven't felt the need to verify it. (And with > some-odd 20 minutes for Myrinet, it takes a bit of attention span. The > las

Re: [OMPI users] Error Polling HP CQ Status on PPC64 LInux with IB

2006-06-19 Thread Gleb Natapov
What version of OpenMPI are you using? On Mon, Jun 19, 2006 at 07:06:54AM -0700, Owen Stampflee wrote: > I'm currently working on getting OpenMPI + OpenIB 1.0 (might be an RC) > working on our 8 node Xserve G5 cluster running Linux kernel version > 2.6.16 and get the following errors: > > Process

Re: [O-MPI users] Questions on status

2005-06-15 Thread Gleb Natapov
On Tue, Jun 14, 2005 at 06:54:42PM -0700, Scott Feldman wrote: > > On Jun 14, 2005, at 5:45 PM, Jeff Squyres wrote: > > > We're a quiet bunch. :-) > > Which is a bad thing for Open Source development. It seems Open MPI is > a closed-source development project with an open-source release model.