[OMPI users] Re: Avoiding localhost as rank 0 with openmpi-default-hostfile

2025-02-27 Thread Saurabh T
I asked this before but did not receive a reply. Now with openmpi 5, I tried doing this with prte-default-hostfile and rmaps_default_mapping_policy = node:OVERSUBSCRIBE but I still get the same behavior: openmpi always wants rank 0 to be localhost. Is there a way to override this and set ranks b

Re: [OMPI users] Re-locate OpenMPI installation on OS X

2013-08-16 Thread Nathan Hjelm
You may also need to update where the binaries and libraries look. See the man pages for otool and install_name_tool for more information. Here is a basic example: bash-3.2# otool -L libmpi.dylib libmpi.dylib: /opt/local/lib/libmpi.1.dylib (compatibility version 3.0.0, current version 3.

Re: [OMPI users] Re-locate OpenMPI installation on OS X

2013-08-16 Thread Reuti
Hi, On 16.08.2013 at 01:33, Eric Heien wrote: > I'm compiling OpenMPI 1.6.5 on a set of different machines with different > operating systems. I install OpenMPI in directory A, then later move it to > directory B and compile my own code with mpicc or mpic++. Of course I need > to set the OP

[OMPI users] Re-locate OpenMPI installation on OS X

2013-08-15 Thread Eric Heien
Hello, I'm compiling OpenMPI 1.6.5 on a set of different machines with different operating systems. I install OpenMPI in directory A, then later move it to directory B and compile my own code with mpicc or mpic++. Of course I need to set the OPAL_PREFIX environment variable to point to direct

[OMPI users] -Re Post- MPI_SUM is not defined on the MPI_INTEGER datatype

2013-06-14 Thread Hayato KUNIIE
Hello. The following problem was solved by recompiling and reinstalling Open MPI on each node. Thank you for your cooperation. - I built a Beowulf-type PC cluster (CentOS release 6.4) and am studying MPI (Open MPI ver. 1.6.4). I tried the following sample which uses MPI_REDUCE

Re: [OMPI users] -Re Post- MPI_SUM is not defined on the MPI_INTEGER datatype

2013-05-29 Thread Jeff Squyres (jsquyres)
George -- I've confirmed that it works with 1.6.4 and am awaiting additional information from this user. On May 29, 2013, at 8:08 AM, George Bosilca wrote: > I can't check on the 1.6.4 posted on the web but I can confirm this test > works as expected on the current 1.6 branch (the next to b

Re: [OMPI users] -Re Post- MPI_SUM is not defined on the MPI_INTEGER datatype

2013-05-29 Thread George Bosilca
I can't check on the 1.6.4 posted on the web but I can confirm this test works as expected on the current 1.6 branch (the next to be 1.6.5). So this might have been fixed along the way. George. On May 27, 2013, at 07:05 , Hayato KUNIIE wrote: > Hello > > I posted this topic in last week.

Re: [OMPI users] -Re Post- MPI_SUM is not defined on the MPI_INTEGER datatype

2013-05-28 Thread mohamed khuili
My operating system is openSUSE 12.3 x86_64. I use openmpi 1.6-3.1.2. The message is: Open RTE was unable to open the hostfile: /usr/lib64/mpi/gcc/openmpi/etc/openmpi-default-hostfile Check to make sure the path and filename are correct.

Re: [OMPI users] -Re Post- MPI_SUM is not defined on the MPI_INTEGER datatype

2013-05-28 Thread Jeff Squyres (jsquyres)
Per the email that you forwarded below, I replied to you off list saying that we could figure it out without bothering people, and then post the final resolution back to the list (I do this sometimes when figuring out a problem is going to take a bunch of back-and-forth). On May 25th, I replied

[OMPI users] -Re Post- MPI_SUM is not defined on the MPI_INTEGER datatype

2013-05-27 Thread Hayato KUNIIE
Hello, I posted this topic last week, but the information about the problem was sparse, so I am posting again with more detail. I built a Beowulf-type PC cluster (CentOS release 6.4) and am studying MPI (Open MPI ver. 1.6.4). I tried the following sample which uses MPI_REDUCE (Fortran). Then, followi
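The Fortran sample itself is not reproduced in the snippet above. For readers skimming the archive, a minimal C analogue of the kind of call the thread exercises (an MPI_SUM reduction over integers, with illustrative values, not the poster's code) might look like:

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, value, sum = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    value = rank + 1;                       /* each rank contributes its rank + 1 */

    /* Sum the contributions of all ranks onto rank 0. */
    MPI_Reduce(&value, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum = %d\n", sum);

    MPI_Finalize();
    return 0;
}
```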

Re: [OMPI users] Re :Re: OpenMP and OpenMPI Issue

2012-07-23 Thread Paul Kapinos
Jack, note that support for THREAD_MULTIPLE is available in [newer] versions of Open MPI, but disabled by default. You have to enable it at configure time; in 1.6: --enable-mpi-thread-multiple Enable MPI_THREAD_MULTIPLE support (default: disabl

Re: [OMPI users] Re :Re: OpenMP and OpenMPI Issue

2012-07-20 Thread Jack Galloway
Jeff Squyres cisco.com> writes: > > On Oct 31, 2007, at 9:52 PM, Neeraj Chourasia wrote: > > > but the program is running on TCP interconnect with same > > datasize and also on IB with small datasize say 1MB. So i dont > > think problem is in OpenMPI, it has to do something with IB logi

[OMPI users] RE : fortran program with integer kind=8 using openmpi

2012-06-30 Thread Secretan Yves
Well, with Open MPI compiled with Fortran default integer*8, MPI_TYPE_2INTEGER seems to have an incorrect size. The attached Fortran program shows it. When run on openmpi with integer*8: Size of MPI_INTEGER is 8 Size of MPI_INTEGER4 is 4 Size of MPI_INTE

[OMPI users] Re: [OMPI users] Re: [OMPI users] Fault Tolerant Features in OpenMPI

2012-06-25 Thread Josh Hursey
> Could you give me some kind of official guide to enable the C/R feature? I > googled some articles but there seem to be problems with those methods. > > Best wishes. > > - Original message - > From: "Open MPI Users" > To: "Open MPI Users" > Subject: [OMPI

[OMPI users] Re: [OMPI users] Re: Re: [OMPI users] 2012/06/18 14:35:07 Auto-saved draft

2012-06-20 Thread Josh Hursey
You are correct that the Open MPI project combined the efforts of a few preexisting MPI implementations towards building a single, extensible MPI implementation with the best features of the prior MPI implementations. From the beginning of the project the Open MPI developer community has desired to

[OMPI users] RE : RE : RE : RE : RE : Bug when mixing sent types in version 1.6

2012-06-12 Thread BOUVIER Benjamin
Hi, I've found, in ifconfig, that each node has 2 interfaces, eth0 and eth1. I've run mpiexec with parameter --mca btl_tcp_if_include eth0 (or eth1) to see if there was some issues between nodes. Here are the results : - node1,node2 works with eth1, not with eth0. - node1,node3 works with eth1,

Re: [OMPI users] RE : RE : RE : RE : Bug when mixing sent types in version 1.6

2012-06-11 Thread Jeff Squyres
On Jun 11, 2012, at 12:11 PM, BOUVIER Benjamin wrote: > Wow. I thought in the first place that all combinations would be equivalent, > but in fact, this is not the case... > I've kept the firewalls down during all the tests. > >> - on node1, "mpirun --host node1,node2 ring_c" > Works. > >> - on

[OMPI users] RE : RE : RE : RE : Bug when mixing sent types in version 1.6

2012-06-11 Thread BOUVIER Benjamin
Wow. I thought in the first place that all combinations would be equivalent, but in fact, this is not the case... I've kept the firewalls down during all the tests. > - on node1, "mpirun --host node1,node2 ring_c" Works. > - on node1, "mpirun --host node1,node3 ring_c" > - on node1, "mpirun --ho

Re: [OMPI users] RE : RE : RE : Bug when mixing sent types in version 1.6

2012-06-11 Thread Jeff Squyres
On Jun 11, 2012, at 11:15 AM, BOUVIER Benjamin wrote: > Thanks for your hints Jeff. > I've just tried without any firewalls on involved machines, but the issue > remains. > > # /etc/init.d/ip6tables status > ip6tables: Firewall is not running. > # /etc/init.d/iptables status > iptables: Firewall

[OMPI users] RE : RE : RE : Bug when mixing sent types in version 1.6

2012-06-11 Thread BOUVIER Benjamin
Hi, Thanks for your hints Jeff. I've just tried without any firewalls on involved machines, but the issue remains. # /etc/init.d/ip6tables status ip6tables: Firewall is not running. # /etc/init.d/iptables status iptables: Firewall is not running. The machines have the host names "node1", "node2

Re: [OMPI users] RE : RE : Bug when mixing sent types in version 1.6

2012-06-11 Thread Jeff Squyres
To start, I would ensure that all firewalling (e.g., iptables) is disabled on all machines involved. On Jun 11, 2012, at 10:16 AM, BOUVIER Benjamin wrote: > Hi, > >> I'd guess that running net pipe with 3 procs may be undefined. > > It is indeed undefined. Running the net pipe program locally

[OMPI users] RE : RE : Bug when mixing sent types in version 1.6

2012-06-11 Thread BOUVIER Benjamin
Hi, > I'd guess that running net pipe with 3 procs may be undefined. It is indeed undefined. Running the NetPIPE program locally with 3 processes blocks on my computer. This issue is especially weird as there is no problem running the example program over the network with the MPICH2 implementatio

Re: [OMPI users] RE : Bug when mixing sent types in version 1.6

2012-06-08 Thread Jeff Squyres
On Jun 8, 2012, at 8:51 AM, BOUVIER Benjamin wrote: > I have downloaded the Netpipe benchmarks suite, launched `make mpi` and > launched with mpirun the resulting executable. > > Here is an interesting fact : by launching this executable on 2 nodes, it > works ; on 3 nodes, it blocks, I guess o

[OMPI users] RE : Bug when mixing sent types in version 1.6

2012-06-08 Thread BOUVIER Benjamin
Hi Jeff, Thanks for your answer. I have downloaded the Netpipe benchmarks suite, launched `make mpi` and launched with mpirun the resulting executable. Here is an interesting fact : by launching this executable on 2 nodes, it works ; on 3 nodes, it blocks, I guess on connect. Each process is

Re: [OMPI users] RE : RE : Latency of 250 microseconds with Open-MPI 1.4.3, Mellanox Infiniband and 256 MPI ranks

2011-11-09 Thread Sébastien Boisvert
Hello, We did more tests concerning the latency using 512 MPI ranks on our super-computer. (64 machines * 8 cores per machine) By default in Ray, any rank can communicate directly with any other. Thus we have a complete graph with 512 vertices and 130816 edges (512*511/2) where vertices are ran

Re: [OMPI users] RE : RE : Latency of 250 microseconds with Open-MPI 1.4.3, Mellanox Infiniband and 256 MPI ranks

2011-09-26 Thread Yevgeny Kliteynik
On 26-Sep-11 11:27 AM, Yevgeny Kliteynik wrote: > On 22-Sep-11 12:09 AM, Jeff Squyres wrote: >> On Sep 21, 2011, at 4:24 PM, Sébastien Boisvert wrote: >> What happens if you run 2 ibv_rc_pingpong's on each node? Or N ibv_rc_pingpongs? >>> >>> With 11 ibv_rc_pingpong's >>> >>> http://pas

Re: [OMPI users] RE : RE : Latency of 250 microseconds with Open-MPI 1.4.3, Mellanox Infiniband and 256 MPI ranks

2011-09-26 Thread Yevgeny Kliteynik
On 22-Sep-11 12:09 AM, Jeff Squyres wrote: > On Sep 21, 2011, at 4:24 PM, Sébastien Boisvert wrote: > >>> What happens if you run 2 ibv_rc_pingpong's on each node? Or N >>> ibv_rc_pingpongs? >> >> With 11 ibv_rc_pingpong's >> >> http://pastebin.com/85sPcA47 >> >> Code to do that => https://gist

[OMPI users] RE : RE : RE : Latency of 250 microseconds with Open-MPI 1.4.3, Mellanox Infiniband and 256 MPI ranks

2011-09-21 Thread Sébastien Boisvert
sco.com] > Sent: 21 September 2011 17:09 > To: Open MPI Users > Subject: Re: [OMPI users] RE : RE : Latency of 250 microseconds with > Open-MPI 1.4.3, Mellanox Infiniband and 256 MPI ranks > > On Sep 21, 2011, at 4:24 PM, Sébastien Boisvert wrote: > >&g

Re: [OMPI users] RE : RE : Latency of 250 microseconds with Open-MPI 1.4.3, Mellanox Infiniband and 256 MPI ranks

2011-09-21 Thread Jeff Squyres
On Sep 21, 2011, at 4:24 PM, Sébastien Boisvert wrote: >> What happens if you run 2 ibv_rc_pingpong's on each node? Or N >> ibv_rc_pingpongs? > > With 11 ibv_rc_pingpong's > > http://pastebin.com/85sPcA47 > > Code to do that => https://gist.github.com/1233173 > > Latencies are around 20 micr

[OMPI users] RE : RE : Latency of 250 microseconds with Open-MPI 1.4.3, Mellanox Infiniband and 256 MPI ranks

2011-09-21 Thread Sébastien Boisvert
To: Open MPI Users > Subject: Re: [OMPI users] RE : Latency of 250 microseconds with Open-MPI > 1.4.3, Mellanox Infiniband and 256 MPI ranks > > On Sep 21, 2011, at 3:17 PM, Sébastien Boisvert wrote: > >> Meanwhile, I contacted some people at SciNet, which is also part of Compu

Re: [OMPI users] RE : Latency of 250 microseconds with Open-MPI 1.4.3, Mellanox Infiniband and 256 MPI ranks

2011-09-21 Thread Jeff Squyres
On Sep 21, 2011, at 3:17 PM, Sébastien Boisvert wrote: > Meanwhile, I contacted some people at SciNet, which is also part of Compute > Canada. > > They told me to try Open-MPI 1.4.3 with the Intel compiler with --mca btl > self,ofud to use the ofud BTL instead of openib for OpenFabrics transpo

[OMPI users] RE : Latency of 250 microseconds with Open-MPI 1.4.3, Mellanox Infiniband and 256 MPI ranks

2011-09-21 Thread Sébastien Boisvert
Hi Yevgeny, You are right on comparing apples with apples. But MVAPICH2 is not installed on colosse, which is in the CLUMEQ consortium, a part of Compute Canada. Meanwhile, I contacted some people at SciNet, which is also part of Compute Canada. They told me to try Open-MPI 1.4.3 with the

Re: [OMPI users] RE : MPI hangs on multiple nodes

2011-09-19 Thread Gus Correa
Hi Ole, Eugene For what it is worth, I tried Ole's program here, as Devendra Rai had done before. I ran it across two nodes, with a total of 16 processes. I tried mca parameters for openib Infiniband, then for tcp on Gigabit Ethernet. Both work. I am using OpenMPI 1.4.3 compiled with GCC 4.1.2 on

Re: [OMPI users] RE : MPI hangs on multiple nodes

2011-09-19 Thread Gus Correa
Hi Eugene You're right, it is blocking send, buffers can be reused after MPI_Send returns. My bad, I only read your answer to Sebastien and Ole after I posted mine. Could MPI run out of [internal] buffers to hold the messages, perhaps? The messages aren't that big anyway [5000 doubles]. Could

Re: [OMPI users] RE : MPI hangs on multiple nodes

2011-09-19 Thread Gus Correa
Hi Ole You could try the examples/connectivity.c program in the OpenMPI source tree, to test if everything is alright. It also hints how to solve the buffer re-use issue that Sebastien [rightfully] pointed out [i.e., declare separate buffers for MPI_Send and MPI_Recv]. Gus Correa Sébastien Bois

Re: [OMPI users] RE : MPI hangs on multiple nodes

2011-09-19 Thread Eugene Loh
Should be fine. Once MPI_Send returns, it should be safe to reuse the buffer. In fact, the return of the call is the only way you have of checking that the message has left the user's send buffer. The case you're worried about is probably MPI_Isend, where you have to check completion with an
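A minimal C sketch of the distinction Eugene describes, with illustrative counts and tags (this is not code from the thread):

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank;
    double buf[5000];                 /* illustrative size, as in the thread */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* Blocking send: once MPI_Send returns, buf may be reused safely. */
        MPI_Send(buf, 5000, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
        buf[0] = 42.0;                /* safe: the message has left this buffer */

        /* Non-blocking send: buf must NOT be touched until completion. */
        MPI_Request req;
        MPI_Isend(buf, 5000, MPI_DOUBLE, 1, 1, MPI_COMM_WORLD, &req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);   /* only now is reuse safe */
        buf[0] = 43.0;
    } else if (rank == 1) {
        MPI_Recv(buf, 5000, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Recv(buf, 5000, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}
```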

[OMPI users] RE : MPI hangs on multiple nodes

2011-09-19 Thread Sébastien Boisvert
Hello, Is it safe to re-use the same buffer (variable A) for MPI_Send and MPI_Recv, given that MPI_Send may be eager depending on the MCA parameters? > > > Sébastien > > From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] on behalf of > Ole

Re: [OMPI users] RE : Problems with MPI_Init_Thread(...)

2011-09-19 Thread Jeff Squyres
On Sep 19, 2011, at 8:37 AM, Sébastien Boisvert wrote: > You need to call MPI_Init before calling MPI_Init_thread. This is incorrect -- MPI_INIT_THREAD does the same job as MPI_INIT, but it allows you to request a specific thread level. > According to http://cw.squyres.com/columns/2004-02-CW-MP
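A minimal sketch of the usage Jeff describes, assuming a build where MPI_THREAD_MULTIPLE is available (the requested level is illustrative):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided;

    /* MPI_Init_thread replaces MPI_Init; do not call both. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

    if (provided < MPI_THREAD_MULTIPLE) {
        /* The library granted a lower level (e.g. when Open MPI was built
           without --enable-mpi-thread-multiple); adapt or abort. */
        printf("requested MPI_THREAD_MULTIPLE, got level %d\n", provided);
    }

    MPI_Finalize();
    return 0;
}
```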

[OMPI users] RE : Problems with MPI_Init_Thread(...)

2011-09-19 Thread Sébastien Boisvert
Hello, You need to call MPI_Init before calling MPI_Init_thread. According to http://cw.squyres.com/columns/2004-02-CW-MPI-Mechanic.pdf (past MPI Mechanic columns written by Jeff Squyres), only 3 functions can be called before calling MPI_Init, and they are: - MPI_Initialized - MPI_Finalized

[OMPI users] "Re: RoCE (IBoE) & OpenMPI"

2011-03-22 Thread Eli Cohen
Hi, this discussion has been brought to my attention so I joined this mailing list to try to help. As you already stated that the SL maps correctly to PCP when using ibv_rc_pingpong, I assume OpenMPI works over rdma_cm. In that case please note the following: 1. If you're using OFED-1.5.2, then if

[OMPI users] RE : Unable to connect to a server using MX MTL with TCP

2010-06-04 Thread Audet, Martin
Sorry, I forgot the attachments... Martin From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] on behalf of Audet, Martin [martin.au...@imi.cnrc-nrc.gc.ca] Sent: 4 June 2010 19:18 To: us...@open-mpi.org Subject: [OMPI users] Unable to c

[OMPI users] Re : Re : Yet another stdin problem

2009-10-08 Thread Kilou Zelabia
Thanks a lot! Will try this solution. Best regards. Zellabia. S From: Roman Cheplyaka To: Open MPI Users Sent: Wed 7 October 2009, 17:42:58 Subject: Re: [OMPI users] Re : Yet another stdin problem As a slight modification, you can write a

Re: [OMPI users] Re : Yet another stdin problem

2009-10-07 Thread Ralph Castain
FWIW: an upcoming version will have the ability for you to specify all ranks to receive stdin...but that's a little ways off. For now, only rank=0 does. On Oct 7, 2009, at 9:42 AM, Roman Cheplyaka wrote: As a slight modification, you can write a wrapper script #!/bin/sh my_exe < inputs.txt
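A common workaround while only rank 0 receives stdin, sketched in C with an illustrative buffer size (not code from the thread): rank 0 reads and then broadcasts to the other ranks.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    char line[256] = "";
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Only rank 0 can rely on receiving mpirun's stdin, so it reads
       the data and broadcasts it to every other rank. */
    if (rank == 0 && fgets(line, sizeof(line), stdin) == NULL)
        line[0] = '\0';
    MPI_Bcast(line, sizeof(line), MPI_CHAR, 0, MPI_COMM_WORLD);

    printf("rank %d got: %s", rank, line);
    MPI_Finalize();
    return 0;
}
```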

Re: [OMPI users] Re : Yet another stdin problem

2009-10-07 Thread Ashley Pittman
Or better still if you want to be able to pass the filename and args on the mpirun command line use the following and then run it as mpirun -np 64 ./input_wrapper inputs.txt my_exe #!/bin/bash FILE=$1 shift "$@" < $FILE In general though using stdin on parallel applications is rarely a good

Re: [OMPI users] Re : Yet another stdin problem

2009-10-07 Thread Roman Cheplyaka
As a slight modification, you can write a wrapper script #!/bin/sh my_exe < inputs.txt and pass it to mpirun. 2009/10/7 Kilou Zelabia : > Ok thanks! > That's a solution but i was wondering if there could exist a more elegant > one ? means without any modification at the source level > >

[OMPI users] Re : Yet another stdin problem

2009-10-07 Thread Kilou Zelabia
Ok thanks! That's a solution, but I was wondering if a more elegant one could exist, meaning without any modification at the source level. From: Roman Cheplyaka To: Open MPI Users Sent: Wed 7 October 2009, 17:06:55 Subject: Re: [OMPI users] Yet

Re: [OMPI users] "Re: Best way to overlap computation and transfer using MPI over TCP/Ethernet?"

2009-06-08 Thread shan axida
Hi, Would you please tell me how you did the experiment of calling MPI_Test, in a little more detail? Thanks! From: Lars Andersson To: us...@open-mpi.org Sent: Tuesday, June 9, 2009 6:11:11 AM Subject: Re: [OMPI users] "Re: Best way to overlap comput

Re: [OMPI users] "Re: Best way to overlap computation and transfer using MPI over TCP/Ethernet?"

2009-06-08 Thread Lars Andersson
On Mon, Jun 8, 2009 at 11:07 PM, Lars Andersson wrote: > I'd say that your own workaround here is to intersperse MPI_TEST's > periodically. This will trigger OMPI's pipelined protocol for large > messages, and should allow partial bursts of progress while you're > assumedly off doing useful work. I
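A minimal sketch of the workaround described (illustrative message size, and a hypothetical do_some_computation placeholder): post the non-blocking transfer, then interleave work with MPI_Test so the library gets a chance to progress the pipelined large message.

```c
#include <mpi.h>
#include <stdlib.h>

/* Hypothetical work unit, used only for illustration. */
static void do_some_computation(void) { /* ... */ }

int main(int argc, char **argv)
{
    const int N = 8 * 1024 * 1024;     /* a "large" message, illustrative */
    char *buf = malloc(N);
    int rank, done = 0;
    MPI_Request req;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0)
        MPI_Isend(buf, N, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &req);
    else if (rank == 1)
        MPI_Irecv(buf, N, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &req);
    else
        done = 1;                      /* other ranks have nothing to progress */

    /* Interleave computation with periodic MPI_Test calls so the library
       can push the transfer forward while useful work gets done. */
    while (!done) {
        do_some_computation();
        MPI_Test(&req, &done, MPI_STATUS_IGNORE);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}
```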

Re: [OMPI users] "Re: Best way to overlap computation and transfer using MPI over TCP/Ethernet?"

2009-06-05 Thread jody
I am no expert here, and I don't know the specific requirements for your problem, but wouldn't it make sense to have 2 "master" processes? One which deals out the jobs, and one which collects the results? Jody On Fri, Jun 5, 2009 at 1:58 AM, Lars Andersson wrote: >>> I've been trying to get over

[OMPI users] "Re: Best way to overlap computation and transfer using MPI over TCP/Ethernet?"

2009-06-04 Thread Lars Andersson
>> I've been trying to get overlapping computation and data transfer to >> work, without much success so far. > > If this is so important to you, why do you insist in using Ethernet > and not a more HPC-oriented interconnect which can make progress in > the background ? We have a medium sized clus

[OMPI users] Re :Re: Linpack Benchmark and File Descriptor Limits

2008-09-19 Thread Neeraj Chourasia
Hello, With openmpi-1.3, a new MCA feature is introduced, namely --mca routed binomial. This makes out-of-band communication happen in a binomial fashion, reduces the total number of sockets opened, and hence solves the file-descriptor issue. -Neeraj On Thu, 18 Sep 2008 16:46:23 -0700 Open MPI Users wrote I'm

Re: [OMPI users] RE : RE : MPI_Comm_connect() fails

2008-03-17 Thread Edgar Gabriel
2 with George's patch and my small examples now work. Martin From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] on behalf of Edgar Gabriel [gabr...@cs.uh.edu] Sent: 17 March 2008 15:59 To: Open MPI Users Subject: Re: [OMPI use

[OMPI users] RE : RE : MPI_Comm_connect() fails

2008-03-17 Thread Audet, Martin
[gabr...@cs.uh.edu] Sent: 17 March 2008 15:59 To: Open MPI Users Subject: Re: [OMPI users] RE : MPI_Comm_connect() fails already working on it, together with a move_request Thanks Edgar Jeff Squyres wrote: > Edgar -- > > Can you make a patch for the 1.2 series? > > On

Re: [OMPI users] RE : MPI_Comm_connect() fails

2008-03-17 Thread Edgar Gabriel
Feel free to try yourself the two small client and server programs I posted in my first message. Thanks, Martin Subject: [OMPI users] RE : users Digest, Vol 841, Issue 3 From: Audet, Martin (Martin.Audet_at_[hidden]) Date: 2008-03-13 17:04:25 Hi Georges, Thanks for your patch, but I'

Re: [OMPI users] RE : MPI_Comm_connect() fails

2008-03-17 Thread Jeff Squyres
first message. Thanks, Martin Subject: [OMPI users] RE : users Digest, Vol 841, Issue 3 From: Audet, Martin (Martin.Audet_at_[hidden]) Date: 2008-03-13 17:04:25 Hi Georges, Thanks for your patch, but I'm not sure I got it correctly. The patch I got modify a few arguments passed to isend()/ir

Re: [OMPI users] RE : MPI_Comm_connect() fails

2008-03-17 Thread Edgar Gabriel
e the server freeze when for example the server is started on 3 process and the client on 2 process. Feel free to try yourself the two small client and server programs I posted in my first message. Thanks, Martin Subject: [OMPI users] RE : users Digest, Vol 841, Issue 3 From: Audet, M

Re: [OMPI users] RE : MPI_Comm_connect() fails

2008-03-17 Thread Audet, Martin
n for example the server is started on 3 process and the client on 2 process. Feel free to try yourself the two small client and server programs I posted in my first message. Thanks, Martin Subject: [OMPI users] RE : users Digest, Vol 841, Issue 3 From: Audet, Martin (Martin.Audet_at_[hidden]) List

Re: [OMPI users] RE : MPI_Comm_connect() fails

2008-03-14 Thread Jeff Squyres
008 08:21:51 +0100 From: jody Subject: Re: [OMPI users] RE : MPI_Comm_connect() fails To: "Open MPI Users" Message-ID: <9b0da5ce0803130021l4ead0f91qaf43e4ac7d332...@mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 HI I think the recvcount argument you pass to MPI

[OMPI users] RE : users Digest, Vol 841, Issue 3

2008-03-13 Thread Audet, Martin
Hi Georges, Thanks for your patch, but I'm not sure I got it correctly. The patch I got modify a few arguments passed to isend()/irecv()/recv() in coll_basic_allgather.c. Here is the patch I applied: Index: ompi/mca/coll/basic/coll_basic_allgather.c

Re: [OMPI users] RE : MPI_Comm_connect() fails

2008-03-13 Thread George Bosilca
small examples work perfectly with mpich2 ch3:sock. Regards, Martin Audet -- Message: 4 Date: Thu, 13 Mar 2008 08:21:51 +0100 From: jody Subject: Re: [OMPI users] RE : MPI_Comm_connect() fails To: "Open MPI Users" Message-ID: <9b0da5ce0

[OMPI users] RE : MPI_Comm_connect() fails

2008-03-13 Thread Audet, Martin
work perfectly with mpich2 ch3:sock. Regards, Martin Audet -- Message: 4 List-Post: users@lists.open-mpi.org Date: Thu, 13 Mar 2008 08:21:51 +0100 From: jody Subject: Re: [OMPI users] RE : MPI_Comm_connect() fails To: "Ope

Re: [OMPI users] RE : MPI_Comm_connect() fails

2008-03-13 Thread George Bosilca
I am not aware of any problems with the allreduce/allgather. But, we are aware of the problem with valgrind that report non initialized values when used with TCP. It's a long story, but I can guarantee that this should not affect a correct MPI application. george. PS: For those who want

Re: [OMPI users] RE : MPI_Comm_connect() fails

2008-03-13 Thread jody
Sorry! That reply was intended to another post! Jody On Thu, Mar 13, 2008 at 8:21 AM, jody wrote: > HI > I think the recvcount argument you pass to MPI_Allgather should not be > 1 but instead > the number of MPI_INTs your buffer rem_rank_tbl can contain. > As it stands now, you tell MPI_Allg

Re: [OMPI users] RE : MPI_Comm_connect() fails

2008-03-13 Thread jody
Hi, I think the recvcount argument you pass to MPI_Allgather should not be 1 but instead the number of MPI_INTs your buffer rem_rank_tbl can contain. As it stands now, you tell MPI_Allgather that it may only receive 1 MPI_INT. Furthermore, I'm not sure, but I think your receive buffer should be lar
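For reference, the MPI standard defines recvcount as the number of elements received from each process, while the receive buffer must hold comm-size × recvcount elements. A minimal sketch of those semantics (the rem_rank_tbl name is borrowed from the thread; the contents are illustrative):

```c
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int my_entry = rank;                            /* one int contributed per process */
    int *rem_rank_tbl = malloc(size * sizeof(int)); /* room for one entry per process */

    /* recvcount is the number of elements received FROM EACH process,
       not the total capacity of the receive buffer. */
    MPI_Allgather(&my_entry, 1, MPI_INT,
                  rem_rank_tbl, 1, MPI_INT, MPI_COMM_WORLD);

    free(rem_rank_tbl);
    MPI_Finalize();
    return 0;
}
```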

[OMPI users] RE : MPI_Comm_connect() fails

2008-03-12 Thread Audet, Martin
Hi again, Thanks Pak for the link and for suggesting to start an "orted" daemon; by doing so my client and server jobs were able to establish an intercommunicator between them. However I modified my programs to perform an MPI_Allgather() of a single "int" over the new intercommunicator to test

Re: [OMPI users] Re :Re: what is MPI_IN_PLACE

2007-12-11 Thread George Bosilca
Neeraj, The rationale is clearly explained in the MPI standard. Here is the relevant paragraph from section 7.3.2: The ``in place'' operations are provided to reduce unnecessary memory motion by both the MPI implementation and by the user. Note that while the simple check of testing wheth
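A minimal illustration of the construct George is describing (the values are placeholders): passing MPI_IN_PLACE as the send buffer tells the library the input already sits in the receive buffer, avoiding the extra copy.

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    int values[4] = {1, 2, 3, 4};

    MPI_Init(&argc, &argv);

    /* MPI_IN_PLACE as sendbuf: the reduction reads from and writes back
       into the same array on every process. */
    MPI_Allreduce(MPI_IN_PLACE, values, 4, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}
```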

[OMPI users] Re :Re: what is MPI_IN_PLACE

2007-12-11 Thread Neeraj Chourasia
Thanks George. But what is the need for the user to specify it? The API could check the addresses of the input and output buffers. Is there some extra advantage of MPI_IN_PLACE over automatically detecting it using pointers? -Neeraj On Tue, 11 Dec 2007 06:10:06 -0500 Open MPI Users wrote Neer

[OMPI users] Re: [OMPI users] MPI_Probe succeeds, but subsequent MPI_Recv gets stuck

2007-11-06 Thread hpe...@infonie.fr
Just a thought: behaviour can be unpredictable if you use MPI_Bsend or MPI_Ibsend on your sender side, because nothing is checked with regard to the attached buffer. MPI_Send or MPI_Isend should be used instead. Regards Herve
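For context, MPI_Bsend only behaves predictably when the user has attached sufficient buffer space beforehand. A minimal sketch of that requirement (run with 2 ranks; the message content is illustrative):

```c
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, msg = 7;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* MPI_Bsend requires user-attached buffer space; without it the
           call is erroneous. */
        int bufsize = MPI_BSEND_OVERHEAD + sizeof(int);
        void *attach_buf = malloc(bufsize);
        MPI_Buffer_attach(attach_buf, bufsize);

        MPI_Bsend(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);

        MPI_Buffer_detach(&attach_buf, &bufsize);   /* waits for the send to drain */
        free(attach_buf);
    } else if (rank == 1) {
        MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}
```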

Re: [OMPI users] Re :Re: OpenMP and OpenMPI Issue

2007-11-01 Thread Jeff Squyres
On Oct 31, 2007, at 9:52 PM, Neeraj Chourasia wrote: but the program is running on TCP interconnect with the same data size and also on IB with a small data size, say 1MB. So I don't think the problem is in Open MPI; it has to do something with the IB logic, which probably doesn't work well with threads.

[OMPI users] Re :Re: OpenMP and OpenMPI Issue

2007-11-01 Thread Neeraj Chourasia
Thanks for your reply, but the program is running on TCP interconnect with the same data size and also on IB with a small data size, say 1MB. So I don't think the problem is in Open MPI; it has to do something with the IB logic, which probably doesn't work well with threads. I also tried the program with MPI_THR

[OMPI users] Re :Re: Process 0 with different time executing the same code

2007-10-26 Thread Neeraj Chourasia
Hi, Please ensure the following things are correct: 1) The array bounds are equal, meaning "my_x" and "size_y" have the same value on all nodes. 2) Nodes are homogeneous. To check that, you could choose the root to be some different node and run the program. -Neeraj On Fri, 26 Oct 2007 10:13:15 +0500 (

Re: [OMPI users] Re :Re: Re :Re: Tuning Openmpi with IB Interconnect

2007-10-12 Thread Torsten Hoefler
Hi, >Yes, the buffer was being re-used. No we didnt try to benchmark it with >netpipe and other stuffs. But the program was pretty simple. Do you think, >I need to test it with bigger chunks (>8MB) for communication.? >We also tried manipulating eager_limit and min_rdma_sze, but no

Re: [OMPI users] Re :Re: Re :Re: Tuning Openmpi with IB Interconnect

2007-10-12 Thread Jeff Squyres
The mailing list snipped off the end of my mail -- here's the rest of what I said: The meanings of the 3 phases are explained in this paper: http://www.open-mpi.org/papers/euro-pvmmpi-2006-hpc-protocols. If you use the mpi_leave_pinned parameter and Open MPI is able to leave your entire buffe

Re: [OMPI users] Re :Re: Re :Re: Tuning Openmpi with IB Interconnect

2007-10-12 Thread Jeff Squyres
On Oct 12, 2007, at 8:38 AM, Neeraj Chourasia wrote: Yes, the buffer was being re-used. No we didnt try to benchmark it with netpipe and other stuffs. But the program was pretty simple. Do you think, I need to test it with bigger chunks (>8MB) for communication.? We also tried manipulating

[OMPI users] Re :Re: Re :Re: Tuning Openmpi with IB Interconnect

2007-10-12 Thread Neeraj Chourasia
Yes, the buffer was being re-used. No, we didn't try to benchmark it with NetPIPE and other tools. But the program was pretty simple. Do you think I need to test it with bigger chunks (>8MB) for communication? We also tried manipulating eager_limit and min_rdma_sze, but no success. Neeraj On Fri,

Re: [OMPI users] Re :Re: Tuning Openmpi with IB Interconnect

2007-10-12 Thread Torsten Hoefler
Hello, >The code was pretty simple. I was trying to send 8MB data from one >rank to other in a loop(say 1000 iterations). And then i was taking the >average of time taken and was calculating the bandwidth. > >The above logic i tried with both mpirun-with-mca-parameters and with

[OMPI users] Re :Re: Tuning Openmpi with IB Interconnect

2007-10-11 Thread Neeraj Chourasia
Hi, The code was pretty simple. I was trying to send 8MB of data from one rank to another in a loop (say 1000 iterations). Then I was taking the average of the time taken and calculating the bandwidth. The above logic I tried with both mpirun with MCA parameters and without any parameters. And t
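A minimal sketch of the kind of timing loop described in the message (the size and iteration count come from the text; this is not the poster's code):

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int SIZE = 8 * 1024 * 1024;   /* 8 MB, as in the thread */
    const int ITER = 1000;
    char *buf = malloc(SIZE);
    int rank, i;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (i = 0; i < ITER; i++) {
        if (rank == 0)
            MPI_Send(buf, SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
        else if (rank == 1)
            MPI_Recv(buf, SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
    double t1 = MPI_Wtime();

    /* Average one-way bandwidth over all iterations. */
    if (rank == 0)
        printf("avg bandwidth: %.1f MB/s\n",
               (double)SIZE * ITER / (1024.0 * 1024.0) / (t1 - t0));

    free(buf);
    MPI_Finalize();
    return 0;
}
```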

Re: [OMPI users] [Re: Memory leak in openmpi-1.2?]

2007-04-04 Thread Bas van der Vlies
Jeff Squyres wrote: On Apr 4, 2007, at 7:59 AM, Bas van der Vlies wrote: http://www.open-mpi.org/svn/building.php Yes, I get this error message: Note the following on the building.php web page: "Autoconf/Automake Note: Autoconf 2.59 / Automake 1.9.6 will currently work with all bran

Re: [OMPI users] [Re: Memory leak in openmpi-1.2?]

2007-04-04 Thread Jeff Squyres
On Apr 4, 2007, at 7:59 AM, Bas van der Vlies wrote: http://www.open-mpi.org/svn/building.php Yes, I get this error message: Note the following on the building.php web page: "Autoconf/Automake Note: Autoconf 2.59 / Automake 1.9.6 will currently work with all branches available in the

Re: [OMPI users] [Re: Memory leak in openmpi-1.2?]

2007-04-04 Thread Bas van der Vlies
Jeff Squyres wrote: On Apr 4, 2007, at 4:28 AM, Bas van der Vlies wrote: Is the fix in trunk or also in the nighly build release. When i download the trunk version ./autogen.sh fails. You only need to use autogen.sh when building from an SVN checkout. Did you follow the instructions for

Re: [OMPI users] [Re: Memory leak in openmpi-1.2?]

2007-04-04 Thread Jeff Squyres
Oops! Someone else just mailed me off-list and told me the same thing; I mis-read the version number in Bas' first mail. Tim Mattox is exactly right; the fix is on the OMPI trunk but not yet in the 1.2 branch (and therefore not in the 1.2 nightly tarballs). On Apr 4, 2007, at 7:56 AM, Ti

Re: [OMPI users] [Re: Memory leak in openmpi-1.2?]

2007-04-04 Thread Tim Mattox
Hello Bas van der Vlies, The memory leak you found in Open MPI 1.2 has not yet been fixed in the 1.2 branch. You can follow the status of that particular fix for the 1.2 branch here: https://svn.open-mpi.org/trac/ompi/ticket/970 The fix should go in soon, but I had a problem yesterday applying th

Re: [OMPI users] [Re: Memory leak in openmpi-1.2?]

2007-04-04 Thread Jeff Squyres
On Apr 4, 2007, at 4:28 AM, Bas van der Vlies wrote: Is the fix in trunk or also in the nighly build release. When i download the trunk version ./autogen.sh fails. You only need to use autogen.sh when building from an SVN checkout. Did you follow the instructions for SVN builds listed her

Re: [OMPI users] [Re: Memory leak in openmpi-1.2?]

2007-04-04 Thread Bas van der Vlies
Bas van der Vlies wrote: Mohamad Chaarawi wrote: Yes we saw the memory leak, and a fix is already in the trunk right now.. Sorry i didn't reply back earlier... The fix will be merged in V1.2, as soon as the release managers approve it.. Thank you, Thanks we will test it and do some more scal

Re: [OMPI users] [Re: Memory leak in openmpi-1.2?]

2007-04-04 Thread Bas van der Vlies
Mohamad Chaarawi wrote: Yes we saw the memory leak, and a fix is already in the trunk right now.. Sorry i didn't reply back earlier... The fix will be merged in V1.2, as soon as the release managers approve it.. Thank you, Thanks we will test it and do some more scalapack testing. On Tue

Re: [OMPI users] [Re: Memory leak in openmpi-1.2?]

2007-04-03 Thread Mohamad Chaarawi
Yes we saw the memory leak, and a fix is already in the trunk right now.. Sorry i didn't reply back earlier... The fix will be merged in V1.2, as soon as the release managers approve it.. Thank you, On Tue, April 3, 2007 5:14 am, Bas van der Vlies wrote: > Mohamad Chaarawi wrote: >> Hello Mr. V

Re: [OMPI users] [Re: Memory leak in openmpi-1.2?]

2007-04-03 Thread Bas van der Vlies
Mohamad Chaarawi wrote: Hello Mr. Van der Vlies, We are currently looking into this problem and will send out an email as soon as we recognize something and fix it. Thank you, Mohamed, Just curious. Did you test this program and see the same behavior as at our site? Regards Subject:

Re: [OMPI users] [Re: Memory leak in openmpi-1.2?]

2007-03-27 Thread Mohamad Chaarawi
Hello Mr. Van der Vlies, We are currently looking into this problem and will send out an email as soon as we recognize something and fix it. Thank you, > Subject: Re: [OMPI users] Memory leak in openmpi-1.2? > Date: Tue, 27 Mar 2007 13:58:15 +0200 > From: Bas van der Vlies > Reply-To: Open MPI

[OMPI users] Re: MPI_Comm_spawn multiple bproc support

2006-11-07 Thread hpe...@infonie.fr
Hi Ralf, sorry for the delay in the answer, but I have encountered some difficulties accessing the internet since yesterday. I have tried all your suggestions but I continue to experience problems. Actually, I have a problem with bjs on the one hand, which I may submit to a bproc forum, and I still spawn

[OMPI users] Re: Re: Re: Re: Re:MPI_Comm_spawn multiple bproc support

2006-11-02 Thread hpe...@infonie.fr
Hi again Ralf, >I gather you have access to bjs? Could you use bjs to get a node allocation, >and then send me a printout of the environment? I have slightly changed my cluster configuration to something like: master is running on a machine called machine10, node 0 is running on a machine called ma

[OMPI users] Re: Re: MPI_Comm_spawn multiple bproc support

2006-10-31 Thread hpe...@infonie.fr
Thank you for your quick reply Ralf. As far as I know, the NODES environment variable is created when a job is submitted to the bjs scheduler. The only way I know (but I am a bproc newbie) is to use the bjssub command. Then, I retried my test with the following running command: "bjssub -i mp

Re: [OMPI users] Re : OpenMPI 1.1: Signal:10, info.si_errno:0(Unknown, error: 0), si_code:1(BUS_ADRALN)

2006-06-28 Thread Eric Thibodeau
I am actually running the released 1.1. I can send you my code, if you want, and you could try running it off a single node with -np 4 or 5 (oversubscribing) and see if you get a BUS_ADRALN error off one node. The only restriction to compiling the code is that X libs be available (display is not

Re: [OMPI users] Re : OpenMPI 1.1: Signal:10, info.si_errno:0(Unknown, error: 0), si_code:1(BUS_ADRALN)

2006-06-28 Thread Terry D. Dontje
Well, I've been using the trunk and not 1.1. I also just built 1.1.1a1r10538 and ran it with no bus error. Though you are running 1.1b5r10421 so we're not running the same thing, as of yet. I have a cluster of two v440 that have 4 cpus each running Solaris 10. The tests I am running are np

Re: [OMPI users] Re : OpenMPI 1.1: Signal:10, info.si_errno:0(Unknown, error: 0 ), si_code:1(BUS_ADRALN)

2006-06-28 Thread Eric Thibodeau
Terry, I was about to comment on this. Could you tell me the specs of your machine? As you will notice in "my thread", I am running into problems on Sparc SMP systems where the CPU board's RTC is in a doubtful state. Are you running 1.1 on SMP machines? If so, on how many procs and wh

[OMPI users] Re : OpenMPI 1.1: Signal:10, info.si_errno:0(Unknown, error: 0), si_code:1(BUS_ADRALN)

2006-06-28 Thread Terry D. Dontje
Frank, Can you set your limit coredumpsize to non-zero, rerun the program, and then get the stack via dbx? So, I have a similar case of BUS_ADRALN on SPARC systems with an older version (June 21st) of the trunk. I've since run using the latest trunk and the bus error went away. I am now going to try