[OMPI users] Re: Avoiding localhost as rank 0 with openmpi-default-hostfile

2025-02-27 Thread Saurabh T
I asked this before but did not receive a reply. Now with openmpi 5, I tried doing this with prte-default-hostfile and rmaps_default_mapping_policy = node:OVERSUBSCRIBE but I still get the same behavior: openmpi always wants rank 0 to be localhost. Is there a way to override this and set ranks b

Re: [OMPI users] Re-locate OpenMPI installation on OS X

2013-08-16 Thread Nathan Hjelm
You may also need to update where the binaries and libraries look. See the man pages for otool and install_name_tool for more information. Here is a basic example: bash-3.2# otool -L libmpi.dylib libmpi.dylib: /opt/local/lib/libmpi.1.dylib (compatibility version 3.0.0, current version 3.

Re: [OMPI users] Re-locate OpenMPI installation on OS X

2013-08-16 Thread Reuti
Hi, On 16.08.2013 at 01:33, Eric Heien wrote: > I'm compiling OpenMPI 1.6.5 on a set of different machines with different > operating systems. I install OpenMPI in directory A, then later move it to > directory B and compile my own code with mpicc or mpic++. Of course I need > to set the OP

[OMPI users] Re-locate OpenMPI installation on OS X

2013-08-15 Thread Eric Heien
Hello, I'm compiling OpenMPI 1.6.5 on a set of different machines with different operating systems. I install OpenMPI in directory A, then later move it to directory B and compile my own code with mpicc or mpic++. Of course I need to set the OPAL_PREFIX environment variable to point to direct

[OMPI users] -Re Post- MPI_SUM is not defined on the MPI_INTEGER datatype

2013-06-14 Thread Hayato KUNIIE
Hello. The following problem was solved by recompiling and reinstalling Open MPI on each node. Thank you for your cooperation. - I built a Beowulf-type PC cluster (CentOS release 6.4) and am studying MPI (Open MPI ver. 1.6.4). I tried the following sample which uses MPI_REDUCE

Re: [OMPI users] -Re Post- MPI_SUM is not defined on the MPI_INTEGER datatype

2013-05-29 Thread Jeff Squyres (jsquyres)
George -- I've confirmed that it works with 1.6.4 and am awaiting additional information from this user. On May 29, 2013, at 8:08 AM, George Bosilca wrote: > I can't check on the 1.6.4 posted on the web but I can confirm this test > works as expected on the current 1.6 branch (the next to b

Re: [OMPI users] -Re Post- MPI_SUM is not defined on the MPI_INTEGER datatype

2013-05-29 Thread George Bosilca
I can't check on the 1.6.4 posted on the web but I can confirm this test works as expected on the current 1.6 branch (the next to be 1.6.5). So this might have been fixed along the way. George. On May 27, 2013, at 07:05 , Hayato KUNIIE wrote: > Hello > > I posted this topic in last week.

Re: [OMPI users] -Re Post- MPI_SUM is not defined on the MPI_INTEGER datatype

2013-05-28 Thread mohamed khuili
My operating system is openSUSE 12.3 x86_64. I use openmpi 1.6-3.1.2. The message is: Open RTE was unable to open the hostfile: /usr/lib64/mpi/gcc/openmpi/etc/openmpi-default-hostfile Check to make sure the path and filename are correct.

Re: [OMPI users] -Re Post- MPI_SUM is not defined on the MPI_INTEGER datatype

2013-05-28 Thread Jeff Squyres (jsquyres)
Per the email that you forwarded below, I replied to you off list saying that we could figure it out without bothering people, and then post the final resolution back to the list (I do this sometimes when figuring out a problem is going to take a bunch of back-and-forth). On May 25th, I replied

[OMPI users] -Re Post- MPI_SUM is not defined on the MPI_INTEGER datatype

2013-05-27 Thread Hayato KUNIIE
Hello, I posted this topic last week, but the information about the problem was sparse, so I am posting again with more detail. I built a Beowulf-type PC cluster (CentOS release 6.4) and am studying MPI (Open MPI ver. 1.6.4). I tried the following sample which uses MPI_REDUCE (Fortran). Then, followi
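The Fortran sample itself is not reproduced in the snippet above. For readers skimming the archive, a minimal C analogue of the kind of call the thread exercises (an MPI_SUM reduction over integers, with illustrative values, not the poster's code) might look like:

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, value, sum = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    value = rank + 1;                       /* each rank contributes its rank + 1 */

    /* Sum the contributions of all ranks onto rank 0. */
    MPI_Reduce(&value, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum = %d\n", sum);

    MPI_Finalize();
    return 0;
}
```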

Re: [OMPI users] Re :Re: OpenMP and OpenMPI Issue

2012-07-23 Thread Paul Kapinos
Jack, note that support for THREAD_MULTIPLE is available in [newer] versions of Open MPI, but disabled by default. You have to enable it at configure time; in 1.6: --enable-mpi-thread-multiple Enable MPI_THREAD_MULTIPLE support (default: disabl

Re: [OMPI users] Re :Re: OpenMP and OpenMPI Issue

2012-07-20 Thread Jack Galloway
Jeff Squyres cisco.com> writes: > > On Oct 31, 2007, at 9:52 PM, Neeraj Chourasia wrote: > > > but the program is running on TCP interconnect with same > > datasize and also on IB with small datasize say 1MB. So i dont > > think problem is in OpenMPI, it has to do something with IB logi

[OMPI users] RE : fortran program with integer kind=8 using openmpi

2012-06-30 Thread Secretan Yves
Well, with Open MPI compiled with Fortran default integer*8, MPI_TYPE_2INTEGER seems to have an incorrect size. The attached Fortran program shows it. When run on openmpi with integer*8: Size of MPI_INTEGER is 8 Size of MPI_INTEGER4 is 4 Size of MPI_INTE

[OMPI users] Re: [OMPI users] Re: [OMPI users] Fault Tolerant Features in OpenMPI

2012-06-25 Thread Josh Hursey
> Could you give me some kind of official guide to enable the C/R feature? I > googled some articles but there seem to be problems with those methods. > > Best wishes. > > - Original message - > From: "Open MPI Users" > To: "Open MPI Users" > Subject: [OMPI

[OMPI users] Re: [OMPI users] Re: Re: [OMPI users] 2012/06/18 14:35:07 Auto-saved draft

2012-06-20 Thread Josh Hursey
You are correct that the Open MPI project combined the efforts of a few preexisting MPI implementations towards building a single, extensible MPI implementation with the best features of the prior MPI implementations. From the beginning of the project the Open MPI developer community has desired to

[OMPI users] RE : RE : RE : RE : RE : Bug when mixing sent types in version 1.6

2012-06-12 Thread BOUVIER Benjamin
Hi, I've found, in ifconfig, that each node has 2 interfaces, eth0 and eth1. I've run mpiexec with parameter --mca btl_tcp_if_include eth0 (or eth1) to see if there was some issues between nodes. Here are the results : - node1,node2 works with eth1, not with eth0. - node1,node3 works with eth1,

Re: [OMPI users] RE : RE : RE : RE : Bug when mixing sent types in version 1.6

2012-06-11 Thread Jeff Squyres
On Jun 11, 2012, at 12:11 PM, BOUVIER Benjamin wrote: > Wow. I thought in the first place that all combinations would be equivalent, > but in fact, this is not the case... > I've kept the firewalls down during all the tests. > >> - on node1, "mpirun --host node1,node2 ring_c" > Works. > >> - on

[OMPI users] RE : RE : RE : RE : Bug when mixing sent types in version 1.6

2012-06-11 Thread BOUVIER Benjamin
Wow. I thought in the first place that all combinations would be equivalent, but in fact, this is not the case... I've kept the firewalls down during all the tests. > - on node1, "mpirun --host node1,node2 ring_c" Works. > - on node1, "mpirun --host node1,node3 ring_c" > - on node1, "mpirun --ho

Re: [OMPI users] RE : RE : RE : Bug when mixing sent types in version 1.6

2012-06-11 Thread Jeff Squyres
On Jun 11, 2012, at 11:15 AM, BOUVIER Benjamin wrote: > Thanks for your hints Jeff. > I've just tried without any firewalls on involved machines, but the issue > remains. > > # /etc/init.d/ip6tables status > ip6tables: Firewall is not running. > # /etc/init.d/iptables status > iptables: Firewall

[OMPI users] RE : RE : RE : Bug when mixing sent types in version 1.6

2012-06-11 Thread BOUVIER Benjamin
Hi, Thanks for your hints Jeff. I've just tried without any firewalls on involved machines, but the issue remains. # /etc/init.d/ip6tables status ip6tables: Firewall is not running. # /etc/init.d/iptables status iptables: Firewall is not running. The machines have the host names "node1", "node2

Re: [OMPI users] RE : RE : Bug when mixing sent types in version 1.6

2012-06-11 Thread Jeff Squyres
To start, I would ensure that all firewalling (e.g., iptables) is disabled on all machines involved. On Jun 11, 2012, at 10:16 AM, BOUVIER Benjamin wrote: > Hi, > >> I'd guess that running net pipe with 3 procs may be undefined. > > It is indeed undefined. Running the net pipe program locally

[OMPI users] RE : RE : Bug when mixing sent types in version 1.6

2012-06-11 Thread BOUVIER Benjamin
Hi, > I'd guess that running net pipe with 3 procs may be undefined. It is indeed undefined. Running the NetPIPE program locally with 3 processes blocks on my computer. This issue is especially weird as there is no problem running the example program over the network with the MPICH2 implementatio

Re: [OMPI users] RE : Bug when mixing sent types in version 1.6

2012-06-08 Thread Jeff Squyres
On Jun 8, 2012, at 8:51 AM, BOUVIER Benjamin wrote: > I have downloaded the Netpipe benchmarks suite, launched `make mpi` and > launched with mpirun the resulting executable. > > Here is an interesting fact : by launching this executable on 2 nodes, it > works ; on 3 nodes, it blocks, I guess o

[OMPI users] RE : Bug when mixing sent types in version 1.6

2012-06-08 Thread BOUVIER Benjamin
Hi Jeff, Thanks for your answer. I have downloaded the Netpipe benchmarks suite, launched `make mpi` and launched with mpirun the resulting executable. Here is an interesting fact : by launching this executable on 2 nodes, it works ; on 3 nodes, it blocks, I guess on connect. Each process is

Re: [OMPI users] RE : RE : Latency of 250 microseconds with Open-MPI 1.4.3, Mellanox Infiniband and 256 MPI ranks

2011-11-09 Thread Sébastien Boisvert
Hello, We did more tests concerning the latency using 512 MPI ranks on our super-computer. (64 machines * 8 cores per machine) By default in Ray, any rank can communicate directly with any other. Thus we have a complete graph with 512 vertices and 130816 edges (512*511/2) where vertices are ran

Re: [OMPI users] RE : RE : Latency of 250 microseconds with Open-MPI 1.4.3, Mellanox Infiniband and 256 MPI ranks

2011-09-26 Thread Yevgeny Kliteynik
On 26-Sep-11 11:27 AM, Yevgeny Kliteynik wrote: > On 22-Sep-11 12:09 AM, Jeff Squyres wrote: >> On Sep 21, 2011, at 4:24 PM, Sébastien Boisvert wrote: >> What happens if you run 2 ibv_rc_pingpong's on each node? Or N ibv_rc_pingpongs? >>> >>> With 11 ibv_rc_pingpong's >>> >>> http://pas

Re: [OMPI users] RE : RE : Latency of 250 microseconds with Open-MPI 1.4.3, Mellanox Infiniband and 256 MPI ranks

2011-09-26 Thread Yevgeny Kliteynik
On 22-Sep-11 12:09 AM, Jeff Squyres wrote: > On Sep 21, 2011, at 4:24 PM, Sébastien Boisvert wrote: > >>> What happens if you run 2 ibv_rc_pingpong's on each node? Or N >>> ibv_rc_pingpongs? >> >> With 11 ibv_rc_pingpong's >> >> http://pastebin.com/85sPcA47 >> >> Code to do that => https://gist

[OMPI users] RE : RE : RE : Latency of 250 microseconds with Open-MPI 1.4.3, Mellanox Infiniband and 256 MPI ranks

2011-09-21 Thread Sébastien Boisvert
sco.com] > Sent: 21 September 2011 17:09 > To: Open MPI Users > Subject: Re: [OMPI users] RE : RE : Latency of 250 microseconds with > Open-MPI 1.4.3, Mellanox Infiniband and 256 MPI ranks > > On Sep 21, 2011, at 4:24 PM, Sébastien Boisvert wrote: > >&g

Re: [OMPI users] RE : RE : Latency of 250 microseconds with Open-MPI 1.4.3, Mellanox Infiniband and 256 MPI ranks

2011-09-21 Thread Jeff Squyres
On Sep 21, 2011, at 4:24 PM, Sébastien Boisvert wrote: >> What happens if you run 2 ibv_rc_pingpong's on each node? Or N >> ibv_rc_pingpongs? > > With 11 ibv_rc_pingpong's > > http://pastebin.com/85sPcA47 > > Code to do that => https://gist.github.com/1233173 > > Latencies are around 20 micr

[OMPI users] RE : RE : Latency of 250 microseconds with Open-MPI 1.4.3, Mellanox Infiniband and 256 MPI ranks

2011-09-21 Thread Sébastien Boisvert
To: Open MPI Users > Subject: Re: [OMPI users] RE : Latency of 250 microseconds with Open-MPI > 1.4.3, Mellanox Infiniband and 256 MPI ranks > > On Sep 21, 2011, at 3:17 PM, Sébastien Boisvert wrote: > >> Meanwhile, I contacted some people at SciNet, which is also part of Compu

Re: [OMPI users] RE : Latency of 250 microseconds with Open-MPI 1.4.3, Mellanox Infiniband and 256 MPI ranks

2011-09-21 Thread Jeff Squyres
On Sep 21, 2011, at 3:17 PM, Sébastien Boisvert wrote: > Meanwhile, I contacted some people at SciNet, which is also part of Compute > Canada. > > They told me to try Open-MPI 1.4.3 with the Intel compiler with --mca btl > self,ofud to use the ofud BTL instead of openib for OpenFabrics transpo

[OMPI users] RE : Latency of 250 microseconds with Open-MPI 1.4.3, Mellanox Infiniband and 256 MPI ranks

2011-09-21 Thread Sébastien Boisvert
Hi Yevgeny, You are right on comparing apples with apples. But MVAPICH2 is not installed on colosse, which is in the CLUMEQ consortium, a part of Compute Canada. Meanwhile, I contacted some people at SciNet, which is also part of Compute Canada. They told me to try Open-MPI 1.4.3 with the

Re: [OMPI users] RE : MPI hangs on multiple nodes

2011-09-19 Thread Gus Correa
Hi Ole, Eugene For what it is worth, I tried Ole's program here, as Devendra Rai had done before. I ran it across two nodes, with a total of 16 processes. I tried mca parameters for openib Infiniband, then for tcp on Gigabit Ethernet. Both work. I am using OpenMPI 1.4.3 compiled with GCC 4.1.2 on

Re: [OMPI users] RE : MPI hangs on multiple nodes

2011-09-19 Thread Gus Correa
Hi Eugene You're right, it is blocking send, buffers can be reused after MPI_Send returns. My bad, I only read your answer to Sebastien and Ole after I posted mine. Could MPI run out of [internal] buffers to hold the messages, perhaps? The messages aren't that big anyway [5000 doubles]. Could

Re: [OMPI users] RE : MPI hangs on multiple nodes

2011-09-19 Thread Gus Correa
Hi Ole You could try the examples/connectivity.c program in the OpenMPI source tree, to test if everything is alright. It also hints how to solve the buffer re-use issue that Sebastien [rightfully] pointed out [i.e., declare separate buffers for MPI_Send and MPI_Recv]. Gus Correa Sébastien Bois

Re: [OMPI users] RE : MPI hangs on multiple nodes

2011-09-19 Thread Eugene Loh
Should be fine. Once MPI_Send returns, it should be safe to reuse the buffer. In fact, the return of the call is the only way you have of checking that the message has left the user's send buffer. The case you're worried about is probably MPI_Isend, where you have to check completion with an
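A minimal C sketch of the distinction Eugene describes, with illustrative counts and tags (this is not code from the thread):

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank;
    double buf[5000];                 /* illustrative size, as in the thread */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* Blocking send: once MPI_Send returns, buf may be reused safely. */
        MPI_Send(buf, 5000, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
        buf[0] = 42.0;                /* safe: the message has left this buffer */

        /* Non-blocking send: buf must NOT be touched until completion. */
        MPI_Request req;
        MPI_Isend(buf, 5000, MPI_DOUBLE, 1, 1, MPI_COMM_WORLD, &req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);   /* only now is reuse safe */
        buf[0] = 43.0;
    } else if (rank == 1) {
        MPI_Recv(buf, 5000, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Recv(buf, 5000, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}
```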

[OMPI users] RE : MPI hangs on multiple nodes

2011-09-19 Thread Sébastien Boisvert
Hello, Is it safe to re-use the same buffer (variable A) for MPI_Send and MPI_Recv, given that MPI_Send may be eager depending on the MCA parameters? > > > Sébastien > > From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] on behalf of > Ole

Re: [OMPI users] RE : Problems with MPI_Init_Thread(...)

2011-09-19 Thread Jeff Squyres
On Sep 19, 2011, at 8:37 AM, Sébastien Boisvert wrote: > You need to call MPI_Init before calling MPI_Init_thread. This is incorrect -- MPI_INIT_THREAD does the same job as MPI_INIT, but it allows you to request a specific thread level. > According to http://cw.squyres.com/columns/2004-02-CW-MP
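A minimal sketch of the usage Jeff describes, assuming a build where MPI_THREAD_MULTIPLE is available (the requested level is illustrative):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided;

    /* MPI_Init_thread replaces MPI_Init; do not call both. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

    if (provided < MPI_THREAD_MULTIPLE) {
        /* The library granted a lower level (e.g. when Open MPI was built
           without --enable-mpi-thread-multiple); adapt or abort. */
        printf("requested MPI_THREAD_MULTIPLE, got level %d\n", provided);
    }

    MPI_Finalize();
    return 0;
}
```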

[OMPI users] RE : Problems with MPI_Init_Thread(...)

2011-09-19 Thread Sébastien Boisvert
Hello, You need to call MPI_Init before calling MPI_Init_thread. According to http://cw.squyres.com/columns/2004-02-CW-MPI-Mechanic.pdf (past MPI Mechanic columns written by Jeff Squyres), only 3 functions can be called before calling MPI_Init, and they are: - MPI_Initialized - MPI_Finalized

[OMPI users] "Re: RoCE (IBoE) & OpenMPI"

2011-03-22 Thread Eli Cohen
Hi, this discussion has been brought to my attention so I joined this mailing list to try to help. As you already stated that the SL maps correctly to PCP when using ibv_rc_pingpong, I assume OpenMPI works over rdma_cm. In that case please note the following: 1. If you're using OFED-1.5.2, then if

[OMPI users] RE : Unable to connect to a server using MX MTL with TCP

2010-06-04 Thread Audet, Martin
Sorry, I forgot the attachments... Martin From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] on behalf of Audet, Martin [martin.au...@imi.cnrc-nrc.gc.ca] Sent: 4 June 2010 19:18 To: us...@open-mpi.org Subject: [OMPI users] Unable to c

[OMPI users] Re : Re : Yet another stdin problem

2009-10-08 Thread Kilou Zelabia
Thanks a lot! Will try this solution. Best regards. Zellabia. S From: Roman Cheplyaka To: Open MPI Users Sent: Wed 7 October 2009, 17:42:58 Subject: Re: [OMPI users] Re : Yet another stdin problem As a slight modification, you can write a

Re: [OMPI users] Re : Yet another stdin problem

2009-10-07 Thread Ralph Castain
FWIW: an upcoming version will have the ability for you to specify all ranks to receive stdin...but that's a little ways off. For now, only rank=0 does. On Oct 7, 2009, at 9:42 AM, Roman Cheplyaka wrote: As a slight modification, you can write a wrapper script #!/bin/sh my_exe < inputs.txt
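A common workaround while only rank 0 receives stdin, sketched in C with an illustrative buffer size (not code from the thread): rank 0 reads and then broadcasts to the other ranks.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    char line[256] = "";
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Only rank 0 can rely on receiving mpirun's stdin, so it reads
       the data and broadcasts it to every other rank. */
    if (rank == 0 && fgets(line, sizeof(line), stdin) == NULL)
        line[0] = '\0';
    MPI_Bcast(line, sizeof(line), MPI_CHAR, 0, MPI_COMM_WORLD);

    printf("rank %d got: %s", rank, line);
    MPI_Finalize();
    return 0;
}
```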

Re: [OMPI users] Re : Yet another stdin problem

2009-10-07 Thread Ashley Pittman
Or better still if you want to be able to pass the filename and args on the mpirun command line use the following and then run it as mpirun -np 64 ./input_wrapper inputs.txt my_exe #!/bin/bash FILE=$1 shift "$@" < $FILE In general though using stdin on parallel applications is rarely a good

Re: [OMPI users] Re : Yet another stdin problem

2009-10-07 Thread Roman Cheplyaka
As a slight modification, you can write a wrapper script #!/bin/sh my_exe < inputs.txt and pass it to mpirun. 2009/10/7 Kilou Zelabia : > Ok thanks! > That's a solution but i was wondering if there could exist a more elegant > one ? means without any modification at the source level > >

[OMPI users] Re : Yet another stdin problem

2009-10-07 Thread Kilou Zelabia
Ok thanks! That's a solution, but I was wondering if a more elegant one could exist, meaning without any modification at the source level. From: Roman Cheplyaka To: Open MPI Users Sent: Wed 7 October 2009, 17:06:55 Subject: Re: [OMPI users] Yet

Re: [OMPI users] "Re: Best way to overlap computation and transfer using MPI over TCP/Ethernet?"

2009-06-08 Thread shan axida
Hi, Would you please tell me how you did the experiment of calling MPI_Test, in a little more detail? Thanks! From: Lars Andersson To: us...@open-mpi.org Sent: Tuesday, June 9, 2009 6:11:11 AM Subject: Re: [OMPI users] "Re: Best way to overlap comput

Re: [OMPI users] "Re: Best way to overlap computation and transfer using MPI over TCP/Ethernet?"

2009-06-08 Thread Lars Andersson
On Mon, Jun 8, 2009 at 11:07 PM, Lars Andersson wrote: > I'd say that your own workaround here is to intersperse MPI_TEST's > periodically. This will trigger OMPI's pipelined protocol for large > messages, and should allow partial bursts of progress while you're > assumedly off doing useful work. I
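A minimal sketch of the workaround described (illustrative message size, and a hypothetical do_some_computation placeholder): post the non-blocking transfer, then interleave work with MPI_Test so the library gets a chance to progress the pipelined large message.

```c
#include <mpi.h>
#include <stdlib.h>

/* Hypothetical work unit, used only for illustration. */
static void do_some_computation(void) { /* ... */ }

int main(int argc, char **argv)
{
    const int N = 8 * 1024 * 1024;     /* a "large" message, illustrative */
    char *buf = malloc(N);
    int rank, done = 0;
    MPI_Request req;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0)
        MPI_Isend(buf, N, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &req);
    else if (rank == 1)
        MPI_Irecv(buf, N, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &req);
    else
        done = 1;                      /* other ranks have nothing to progress */

    /* Interleave computation with periodic MPI_Test calls so the library
       can push the transfer forward while useful work gets done. */
    while (!done) {
        do_some_computation();
        MPI_Test(&req, &done, MPI_STATUS_IGNORE);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}
```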

Re: [OMPI users] "Re: Best way to overlap computation and transfer using MPI over TCP/Ethernet?"

2009-06-05 Thread jody
I am no expert here, and I don't know the specific requirements for your problem, but wouldn't it make sense to have 2 "master" processes? One which deals out the jobs, and one which collects the results? Jody On Fri, Jun 5, 2009 at 1:58 AM, Lars Andersson wrote: >>> I've been trying to get over

[OMPI users] "Re: Best way to overlap computation and transfer using MPI over TCP/Ethernet?"

2009-06-04 Thread Lars Andersson
>> I've been trying to get overlapping computation and data transfer to >> work, without much success so far. > > If this is so important to you, why do you insist in using Ethernet > and not a more HPC-oriented interconnect which can make progress in > the background ? We have a medium sized clus

[OMPI users] Re :Re: Linpack Benchmark and File Descriptor Limits

2008-09-19 Thread Neeraj Chourasia
Hello, With openmpi-1.3, a new MCA feature is introduced, namely --mca routed binomial. This makes out-of-band communication happen in a binomial fashion, reduces the total number of sockets opened, and hence solves the file-descriptor issue. -Neeraj On Thu, 18 Sep 2008 16:46:23 -0700 Open MPI Users wrote I'm

Re: [OMPI users] RE : RE : MPI_Comm_connect() fails

2008-03-17 Thread Edgar Gabriel
2 with George's patch and my small examples now work. Martin From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] on behalf of Edgar Gabriel [gabr...@cs.uh.edu] Sent: 17 March 2008 15:59 To: Open MPI Users Subject: Re: [OMPI use

[OMPI users] RE : RE : MPI_Comm_connect() fails

2008-03-17 Thread Audet, Martin
[gabr...@cs.uh.edu] Sent: 17 March 2008 15:59 To: Open MPI Users Subject: Re: [OMPI users] RE : MPI_Comm_connect() fails already working on it, together with a move_request Thanks Edgar Jeff Squyres wrote: > Edgar -- > > Can you make a patch for the 1.2 series? > > On

Re: [OMPI users] RE : MPI_Comm_connect() fails

2008-03-17 Thread Edgar Gabriel
Feel free to try yourself the two small client and server programs I posted in my first message. Thanks, Martin Subject: [OMPI users] RE : users Digest, Vol 841, Issue 3 From: Audet, Martin (Martin.Audet_at_[hidden]) Date: 2008-03-13 17:04:25 Hi Georges, Thanks for your patch, but I'

Re: [OMPI users] RE : MPI_Comm_connect() fails

2008-03-17 Thread Jeff Squyres
first message. Thanks, Martin Subject: [OMPI users] RE : users Digest, Vol 841, Issue 3 From: Audet, Martin (Martin.Audet_at_[hidden]) Date: 2008-03-13 17:04:25 Hi Georges, Thanks for your patch, but I'm not sure I got it correctly. The patch I got modify a few arguments passed to isend()/ir

Re: [OMPI users] RE : MPI_Comm_connect() fails

2008-03-17 Thread Edgar Gabriel
e the server freeze when for example the server is started on 3 process and the client on 2 process. Feel free to try yourself the two small client and server programs I posted in my first message. Thanks, Martin Subject: [OMPI users] RE : users Digest, Vol 841, Issue 3 From: Audet, M

Re: [OMPI users] RE : MPI_Comm_connect() fails

2008-03-17 Thread Audet, Martin
n for example the server is started on 3 process and the client on 2 process. Feel free to try yourself the two small client and server programs I posted in my first message. Thanks, Martin Subject: [OMPI users] RE : users Digest, Vol 841, Issue 3 From: Audet, Martin (Martin.Audet_at_[hidden]) List

Re: [OMPI users] RE : MPI_Comm_connect() fails

2008-03-14 Thread Jeff Squyres
008 08:21:51 +0100 From: jody Subject: Re: [OMPI users] RE : MPI_Comm_connect() fails To: "Open MPI Users" Message-ID: <9b0da5ce0803130021l4ead0f91qaf43e4ac7d332...@mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 HI I think the recvcount argument you pass to MPI

[OMPI users] RE : users Digest, Vol 841, Issue 3

2008-03-13 Thread Audet, Martin
Hi Georges, Thanks for your patch, but I'm not sure I got it correctly. The patch I got modify a few arguments passed to isend()/irecv()/recv() in coll_basic_allgather.c. Here is the patch I applied: Index: ompi/mca/coll/basic/coll_basic_allgather.c

Re: [OMPI users] RE : MPI_Comm_connect() fails

2008-03-13 Thread George Bosilca
small examples work perfectly with mpich2 ch3:sock. Regards, Martin Audet -- Message: 4 Date: Thu, 13 Mar 2008 08:21:51 +0100 From: jody Subject: Re: [OMPI users] RE : MPI_Comm_connect() fails To: "Open MPI Users" Message-ID: <9b0da5ce0

[OMPI users] RE : MPI_Comm_connect() fails

2008-03-13 Thread Audet, Martin
work perfectly with mpich2 ch3:sock. Regards, Martin Audet -- Message: 4 List-Post: users@lists.open-mpi.org Date: Thu, 13 Mar 2008 08:21:51 +0100 From: jody Subject: Re: [OMPI users] RE : MPI_Comm_connect() fails To: "Ope

Re: [OMPI users] RE : MPI_Comm_connect() fails

2008-03-13 Thread George Bosilca
I am not aware of any problems with the allreduce/allgather. But, we are aware of the problem with valgrind that report non initialized values when used with TCP. It's a long story, but I can guarantee that this should not affect a correct MPI application. george. PS: For those who want

Re: [OMPI users] RE : MPI_Comm_connect() fails

2008-03-13 Thread jody
Sorry! That reply was intended to another post! Jody On Thu, Mar 13, 2008 at 8:21 AM, jody wrote: > HI > I think the recvcount argument you pass to MPI_Allgather should not be > 1 but instead > the number of MPI_INTs your buffer rem_rank_tbl can contain. > As it stands now, you tell MPI_Allg

Re: [OMPI users] RE : MPI_Comm_connect() fails

2008-03-13 Thread jody
Hi, I think the recvcount argument you pass to MPI_Allgather should not be 1 but instead the number of MPI_INTs your buffer rem_rank_tbl can contain. As it stands now, you tell MPI_Allgather that it may only receive 1 MPI_INT. Furthermore, I'm not sure, but I think your receive buffer should be lar
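For reference, the MPI standard defines recvcount as the number of elements received from each process, while the receive buffer must hold comm-size × recvcount elements. A minimal sketch of those semantics (the rem_rank_tbl name is borrowed from the thread; the contents are illustrative):

```c
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int my_entry = rank;                            /* one int contributed per process */
    int *rem_rank_tbl = malloc(size * sizeof(int)); /* room for one entry per process */

    /* recvcount is the number of elements received FROM EACH process,
       not the total capacity of the receive buffer. */
    MPI_Allgather(&my_entry, 1, MPI_INT,
                  rem_rank_tbl, 1, MPI_INT, MPI_COMM_WORLD);

    free(rem_rank_tbl);
    MPI_Finalize();
    return 0;
}
```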

[OMPI users] RE : MPI_Comm_connect() fails

2008-03-12 Thread Audet, Martin
Hi again, Thanks Pak for the link and for suggesting to start an "orted" daemon; by doing so my client and server jobs were able to establish an intercommunicator between them. However I modified my programs to perform an MPI_Allgather() of a single "int" over the new intercommunicator to test

Re: [OMPI users] Re :Re: what is MPI_IN_PLACE

2007-12-11 Thread George Bosilca
Neeraj, The rationale is clearly explained in the MPI standard. Here is the relevant paragraph from section 7.3.2: The ``in place'' operations are provided to reduce unnecessary memory motion by both the MPI implementation and by the user. Note that while the simple check of testing wheth
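A minimal illustration of the construct George is describing (the values are placeholders): passing MPI_IN_PLACE as the send buffer tells the library the input already sits in the receive buffer, avoiding the extra copy.

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    int values[4] = {1, 2, 3, 4};

    MPI_Init(&argc, &argv);

    /* MPI_IN_PLACE as sendbuf: the reduction reads from and writes back
       into the same array on every process. */
    MPI_Allreduce(MPI_IN_PLACE, values, 4, MPI_INT, MPI_SUM, MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}
```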

[OMPI users] Re :Re: what is MPI_IN_PLACE

2007-12-11 Thread Neeraj Chourasia
Thanks George. But what is the need for the user to specify it? The API could check the addresses of the input and output buffers. Is there some extra advantage of MPI_IN_PLACE over automatically detecting it using pointers? -Neeraj On Tue, 11 Dec 2007 06:10:06 -0500 Open MPI Users wrote Neer

[OMPI users] Re: [OMPI users] MPI_Probe succeeds, but subsequent MPI_Recv gets stuck

2007-11-06 Thread hpe...@infonie.fr
Just a thought: behaviour can be unpredictable if you use MPI_Bsend or MPI_Ibsend on your sender side, because nothing is checked with regard to the attached buffer. MPI_Send or MPI_Isend should be used instead. Regards Herve
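For context, MPI_Bsend only behaves predictably when the user has attached sufficient buffer space beforehand. A minimal sketch of that requirement (run with 2 ranks; the message content is illustrative):

```c
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, msg = 7;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* MPI_Bsend requires user-attached buffer space; without it the
           call is erroneous. */
        int bufsize = MPI_BSEND_OVERHEAD + sizeof(int);
        void *attach_buf = malloc(bufsize);
        MPI_Buffer_attach(attach_buf, bufsize);

        MPI_Bsend(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);

        MPI_Buffer_detach(&attach_buf, &bufsize);   /* waits for the send to drain */
        free(attach_buf);
    } else if (rank == 1) {
        MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}
```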

Re: [OMPI users] Re :Re: OpenMP and OpenMPI Issue

2007-11-01 Thread Jeff Squyres
On Oct 31, 2007, at 9:52 PM, Neeraj Chourasia wrote: but the program is running on TCP interconnect with the same data size and also on IB with a small data size, say 1MB. So I don't think the problem is in Open MPI; it has to do something with the IB logic, which probably doesn't work well with threads.

[OMPI users] Re :Re: OpenMP and OpenMPI Issue

2007-11-01 Thread Neeraj Chourasia
Thanks for your reply, but the program is running on TCP interconnect with the same data size and also on IB with a small data size, say 1MB. So I don't think the problem is in Open MPI; it has to do something with the IB logic, which probably doesn't work well with threads. I also tried the program with MPI_THR

[OMPI users] Re :Re: Process 0 with different time executing the same code

2007-10-26 Thread Neeraj Chourasia
Hi, Please ensure the following things are correct: 1) The array bounds are equal, meaning "my_x" and "size_y" have the same value on all nodes. 2) Nodes are homogeneous. To check that, you could choose the root to be some different node and run the program. -Neeraj On Fri, 26 Oct 2007 10:13:15 +0500 (

Re: [OMPI users] Re :Re: Re :Re: Tuning Openmpi with IB Interconnect

2007-10-12 Thread Torsten Hoefler
Hi, >Yes, the buffer was being re-used. No we didnt try to benchmark it with >netpipe and other stuffs. But the program was pretty simple. Do you think, >I need to test it with bigger chunks (>8MB) for communication.? >We also tried manipulating eager_limit and min_rdma_sze, but no

Re: [OMPI users] Re :Re: Re :Re: Tuning Openmpi with IB Interconnect

2007-10-12 Thread Jeff Squyres
The mailing list snipped off the end of my mail -- here's the rest of what I said: The meanings of the 3 phases are explained in this paper: http://www.open-mpi.org/papers/euro-pvmmpi-2006-hpc-protocols. If you use the mpi_leave_pinned parameter and Open MPI is able to leave your entire buffe

Re: [OMPI users] Re :Re: Re :Re: Tuning Openmpi with IB Interconnect

2007-10-12 Thread Jeff Squyres
On Oct 12, 2007, at 8:38 AM, Neeraj Chourasia wrote: Yes, the buffer was being re-used. No we didnt try to benchmark it with netpipe and other stuffs. But the program was pretty simple. Do you think, I need to test it with bigger chunks (>8MB) for communication.? We also tried manipulating

[OMPI users] Re :Re: Re :Re: Tuning Openmpi with IB Interconnect

2007-10-12 Thread Neeraj Chourasia
Yes, the buffer was being re-used. No, we didn't try to benchmark it with NetPIPE and other tools. But the program was pretty simple. Do you think I need to test it with bigger chunks (>8MB) for communication? We also tried manipulating eager_limit and min_rdma_sze, but no success. Neeraj On Fri,

Re: [OMPI users] Re :Re: Tuning Openmpi with IB Interconnect

2007-10-12 Thread Torsten Hoefler
Hello, >The code was pretty simple. I was trying to send 8MB data from one >rank to other in a loop(say 1000 iterations). And then i was taking the >average of time taken and was calculating the bandwidth. > >The above logic i tried with both mpirun-with-mca-parameters and with

[OMPI users] Re :Re: Tuning Openmpi with IB Interconnect

2007-10-11 Thread Neeraj Chourasia
Hi, The code was pretty simple. I was trying to send 8MB of data from one rank to another in a loop (say 1000 iterations). Then I was taking the average of the time taken and calculating the bandwidth. The above logic I tried with both mpirun with MCA parameters and without any parameters. And t
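A minimal sketch of the kind of timing loop described in the message (the size and iteration count come from the text; this is not the poster's code):

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int SIZE = 8 * 1024 * 1024;   /* 8 MB, as in the thread */
    const int ITER = 1000;
    char *buf = malloc(SIZE);
    int rank, i;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (i = 0; i < ITER; i++) {
        if (rank == 0)
            MPI_Send(buf, SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
        else if (rank == 1)
            MPI_Recv(buf, SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }
    double t1 = MPI_Wtime();

    /* Average one-way bandwidth over all iterations. */
    if (rank == 0)
        printf("avg bandwidth: %.1f MB/s\n",
               (double)SIZE * ITER / (1024.0 * 1024.0) / (t1 - t0));

    free(buf);
    MPI_Finalize();
    return 0;
}
```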

Re: [OMPI users] [Re: Memory leak in openmpi-1.2?]

2007-04-04 Thread Bas van der Vlies
Jeff Squyres wrote: On Apr 4, 2007, at 7:59 AM, Bas van der Vlies wrote: http://www.open-mpi.org/svn/building.php Yes, I get this error message: Note the following on the building.php web page: "Autoconf/Automake Note: Autoconf 2.59 / Automake 1.9.6 will currently work with all bran

Re: [OMPI users] [Re: Memory leak in openmpi-1.2?]

2007-04-04 Thread Jeff Squyres
On Apr 4, 2007, at 7:59 AM, Bas van der Vlies wrote: http://www.open-mpi.org/svn/building.php Yes, I get this error message: Note the following on the building.php web page: "Autoconf/Automake Note: Autoconf 2.59 / Automake 1.9.6 will currently work with all branches available in the

Re: [OMPI users] [Re: Memory leak in openmpi-1.2?]

2007-04-04 Thread Bas van der Vlies
Jeff Squyres wrote: On Apr 4, 2007, at 4:28 AM, Bas van der Vlies wrote: Is the fix in trunk or also in the nighly build release. When i download the trunk version ./autogen.sh fails. You only need to use autogen.sh when building from an SVN checkout. Did you follow the instructions for

Re: [OMPI users] [Re: Memory leak in openmpi-1.2?]

2007-04-04 Thread Jeff Squyres
Oops! Someone else just mailed me off-list and told me the same thing; I mis-read the version number in Bas' first mail. Tim Mattox is exactly right; the fix is on the OMPI trunk but not yet in the 1.2 branch (and therefore not in the 1.2 nightly tarballs). On Apr 4, 2007, at 7:56 AM, Ti

Re: [OMPI users] [Re: Memory leak in openmpi-1.2?]

2007-04-04 Thread Tim Mattox
Hello Bas van der Vlies, The memory leak you found in Open MPI 1.2 has not yet been fixed in the 1.2 branch. You can follow the status of that particular fix for the 1.2 branch here: https://svn.open-mpi.org/trac/ompi/ticket/970 The fix should go in soon, but I had a problem yesterday applying th

Re: [OMPI users] [Re: Memory leak in openmpi-1.2?]

2007-04-04 Thread Jeff Squyres
On Apr 4, 2007, at 4:28 AM, Bas van der Vlies wrote: Is the fix in trunk or also in the nighly build release. When i download the trunk version ./autogen.sh fails. You only need to use autogen.sh when building from an SVN checkout. Did you follow the instructions for SVN builds listed her

Re: [OMPI users] [Re: Memory leak in openmpi-1.2?]

2007-04-04 Thread Bas van der Vlies
Bas van der Vlies wrote: Mohamad Chaarawi wrote: Yes we saw the memory leak, and a fix is already in the trunk right now.. Sorry i didn't reply back earlier... The fix will be merged in V1.2, as soon as the release managers approve it.. Thank you, Thanks we will test it and do some more scal

Re: [OMPI users] [Re: Memory leak in openmpi-1.2?]

2007-04-04 Thread Bas van der Vlies
Mohamad Chaarawi wrote: Yes we saw the memory leak, and a fix is already in the trunk right now.. Sorry i didn't reply back earlier... The fix will be merged in V1.2, as soon as the release managers approve it.. Thank you, Thanks we will test it and do some more scalapack testing. On Tue

Re: [OMPI users] [Re: Memory leak in openmpi-1.2?]

2007-04-03 Thread Mohamad Chaarawi
Yes we saw the memory leak, and a fix is already in the trunk right now.. Sorry i didn't reply back earlier... The fix will be merged in V1.2, as soon as the release managers approve it.. Thank you, On Tue, April 3, 2007 5:14 am, Bas van der Vlies wrote: > Mohamad Chaarawi wrote: >> Hello Mr. V

Re: [OMPI users] [Re: Memory leak in openmpi-1.2?]

2007-04-03 Thread Bas van der Vlies
Mohamad Chaarawi wrote: Hello Mr. Van der Vlies, We are currently looking into this problem and will send out an email as soon as we recognize something and fix it. Thank you, Mohamed, Just curious. Did you test this program and see the same behavior as at our site? Regards Subject:

Re: [OMPI users] [Re: Memory leak in openmpi-1.2?]

2007-03-27 Thread Mohamad Chaarawi
Hello Mr. Van der Vlies, We are currently looking into this problem and will send out an email as soon as we recognize something and fix it. Thank you, > Subject: Re: [OMPI users] Memory leak in openmpi-1.2? > Date: Tue, 27 Mar 2007 13:58:15 +0200 > From: Bas van der Vlies > Reply-To: Open MPI

[OMPI users] Re: MPI_Comm_spawn multiple bproc support

2006-11-07 Thread hpe...@infonie.fr
Hi Ralf, sorry for the delay in the answer, but I have encountered some difficulties accessing the internet since yesterday. I have tried all your suggestions but I continue to experience problems. Actually, I have a problem with bjs on the one hand, which I may submit to a bproc forum, and I still spawn

[OMPI users] Re: Re: Re: Re: Re:MPI_Comm_spawn multiple bproc support

2006-11-02 Thread hpe...@infonie.fr
Hi again Ralf, >I gather you have access to bjs? Could you use bjs to get a node allocation, >and then send me a printout of the environment? I have slightly changed my cluster configuration to something like: master is running on a machine called machine10, node 0 is running on a machine called ma

[OMPI users] Re: Re: MPI_Comm_spawn multiple bproc support

2006-10-31 Thread hpe...@infonie.fr
Thank you for your quick reply Ralf. As far as I know, the NODES environment variable is created when a job is submitted to the bjs scheduler. The only way I know (but I am a bproc newbie) is to use the bjssub command. Then, I retried my test with the following running command: "bjssub -i mp

Re: [OMPI users] Re : OpenMPI 1.1: Signal:10, info.si_errno:0(Unknown, error: 0), si_code:1(BUS_ADRALN)

2006-06-28 Thread Eric Thibodeau
I am actually running the released 1.1. I can send you my code, if you want, and you could try running it off a single node with -np 4 or 5 (oversubscribing) and see if you get a BUS_ADRALN error off one node. The only restriction to compiling the code is that X libs be available (display is not

Re: [OMPI users] Re : OpenMPI 1.1: Signal:10, info.si_errno:0(Unknown, error: 0), si_code:1(BUS_ADRALN)

2006-06-28 Thread Terry D. Dontje
Well, I've been using the trunk and not 1.1. I also just built 1.1.1a1r10538 and ran it with no bus error. Though you are running 1.1b5r10421 so we're not running the same thing, as of yet. I have a cluster of two v440 that have 4 cpus each running Solaris 10. The tests I am running are np

Re: [OMPI users] Re : OpenMPI 1.1: Signal:10, info.si_errno:0(Unknown, error: 0 ), si_code:1(BUS_ADRALN)

2006-06-28 Thread Eric Thibodeau
Terry, I was about to comment on this. Could you tell me the specs of your machine? As you will notice in "my thread", I am running into problems on Sparc SMP systems where the CPU board's RTC is in a doubtful state. Are you running 1.1 on SMP machines? If so, on how many procs and wh

[OMPI users] Re : OpenMPI 1.1: Signal:10, info.si_errno:0(Unknown, error: 0), si_code:1(BUS_ADRALN)

2006-06-28 Thread Terry D. Dontje
Frank, Can you set your limit coredumpsize to non-zero, rerun the program, and then get the stack via dbx? So, I have a similar case of BUS_ADRALN on SPARC systems with an older version (June 21st) of the trunk. I've since run using the latest trunk and the bus error went away. I am now going to try