Open MPI 1.10.2
cuda.h and cuda_runtime_api.h exist in /usr/local/cuda-6.5/include.
Using the configure option ./configure --with-cuda does not find cuda.h or
cuda_runtime_api.h.
Using the configure option ./configure --with-cuda=/usr/local/cuda-6.5 does
not find cuda.h or cuda_runtime_api.h e
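A minimal way to narrow this down, sketched on the assumption that the headers really live under /usr/local/cuda-6.5/include and that configure leaves a config.log in the build directory:

  # confirm the headers are where configure is being pointed
  ls /usr/local/cuda-6.5/include/cuda.h /usr/local/cuda-6.5/include/cuda_runtime_api.h
  # re-run configure, then check which compile test failed and why
  ./configure --with-cuda=/usr/local/cuda-6.5 2>&1 | tee configure.out
  grep -n cuda.h config.log configure.out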
Hi Matt,
On 01/15/2016 03:53 PM, Matt Thompson wrote:
There is a chance in the future I might want/need to query an environment
variable in a Fortran program, namely to figure out what switch a currently
running process is on (via SLURM_TOPOLOGY_ADDR in my case) and perhaps make a
"per-switch" c
On Thu, Jan 21, 2016 at 4:07 AM, Dave Love wrote:
>
> Jeff Hammond writes:
>
> > Just using Intel compilers, OpenMP and MPI. Problem solved :-)
> >
> > (I work for Intel and the previous statement should be interpreted as a
> > joke,
>
> Good!
>
> > although Intel OpenMP and MPI interoperate as
On Jan 21, 2016, at 7:40 AM, Eva wrote:
>
> Thanks Jeff.
>
> >>1. Can you create a small example to reproduce the problem?
>
> >>2. The TCP and verbs-based transports use different thresholds and
> >>protocols, and can sometimes bring to light errors in the application
> >>(e.g., the applica
Thanks Jeff.
>>1. Can you create a small example to reproduce the problem?
>>2. The TCP and verbs-based transports use different thresholds and
protocols, and can sometimes bring to light errors in the application
(e.g., the application is making assumptions that just happen to be true
for TCP, b
Matt Thompson writes:
> All,
>
> I'm not too sure if this is an MPI issue, a Fortran issue, or something
> else but I thought I'd ask the MPI gurus here first since my web search
> failed me.
>
> There is a chance in the future I might want/need to query an environment
> variable in a Fortran pro
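A quick way to see what that variable actually holds on each task, sketched on the assumption that the job runs under Slurm with the topology/tree plugin configured (otherwise SLURM_TOPOLOGY_ADDR may simply be unset):

  # print the switch path Slurm reports for every task in a 2-node, 4-task job
  srun -N 2 --ntasks-per-node=2 sh -c 'echo "$(hostname): $SLURM_TOPOLOGY_ADDR"'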
twu...@goodyear.com writes:
> In the past (v 1.6.4-) we used mpirun args of
>
> --mca mpi_paffinity_alone 1 --mca btl openib,tcp,sm,self
>
> with lsf 7.0.6, and this was enough to make cores not be oversubscribed when
> submitting 2 or more jobs to the same node.
[I'm puzzled by that. It should
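For reference, a sketch of the 1.8/1.10-era equivalent, where mpi_paffinity_alone has been superseded by explicit binding options (defaults differ between versions, so the actual placement is worth checking with --report-bindings; ./a.out is a placeholder for the real executable):

  mpirun --bind-to core --report-bindings \
         --mca btl openib,tcp,sm,self ./a.out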
Jeff Hammond writes:
> Just using Intel compilers, OpenMP and MPI. Problem solved :-)
>
> (I work for Intel and the previous statement should be interpreted as a
> joke,
Good!
> although Intel OpenMP and MPI interoperate as well as any
> implementations of which I am aware.)
Better than MPC (
[Catching up...]
Rob Latham writes:
> Do you use any of the other ROMIO file system drivers? If you don't
> know if you do, or don't know what a ROMIO file system driver is, then
> it's unlikely you are using one.
>
> What if you use a driver and it's not on the list? First off, let me
> know
Can you create a small example to reproduce the problem?
The TCP and verbs-based transports use different thresholds and protocols, and
can sometimes bring to light errors in the application (e.g., the application
is making assumptions that just happen to be true for TCP, but not necessarily
fo
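A simple way to exercise both transports with the same binary (./a.out is a placeholder for the real application):

  # TCP only
  mpirun -np 4 --mca btl tcp,self ./a.out
  # verbs/openib
  mpirun -np 4 --mca btl openib,sm,self ./a.out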
That could be a bug in openib, Open MPI and/or your application.
For example, a memory corruption could go unnoticed with tcp, but might
cause openib to hang.
You can start by running your program under a memory debugger
(valgrind, ddt or other) and confirming your application works fine.
You can also up
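A sketch of that memory-debugger run, assuming valgrind is installed on the compute nodes and using ./a.out as a placeholder for the real application:

  # one valgrind log per process (%p expands to the pid)
  mpirun -np 4 valgrind --leak-check=full --track-origins=yes \
         --log-file=valgrind.%p.log ./a.out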
Gilles,
Actually, there are some more strange things.
With the same environment and MPI version, I wrote a simple program
using the same communication logic as my hanging program.
The simple program works without hanging.
So what are the possible reasons? I can try them one by one.
Or can I debug
You can try a more recent version of Open MPI;
1.10.2 was released recently, or try with a nightly snapshot of master.
If all of these still fail, can you post a trimmed version of your program so
we can investigate?
Cheers,
Gilles
Eva wrote:
>Gilles,
>
>>>Can you try to
>>>mpirun --mca btl t
Gilles,
>>Can you try to
>>mpirun --mca btl tcp,self --mca btl_tcp_eager_limit 56 ...
>>and confirm it works fine with TCP *and* without eager ?
I have tried this and it works.
So what should I do next?
2016-01-21 16:25 GMT+08:00 Eva :
> Thanks Gilles.
> it works fine on tcp
> So I use this to
Can you try to
mpirun --mca btl tcp,self --mca btl_tcp_eager_limit 56 ...
and confirm it works fine with TCP *and* without eager ?
Cheers,
Gilles
On 1/21/2016 5:25 PM, Eva wrote:
Thanks Gilles.
it works fine on tcp
So I use this to disable eager:
-mca btl_openib_use_eager_rdma 0 -mca btl_open
Thanks Gilles.
it works fine on tcp
So I use this to disable eager:
-mca btl_openib_use_eager_rdma 0 -mca btl_openib_max_eager_rdma 0
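For reference, the complete command line those flags correspond to, with ./a.out standing in for the real application:

  mpirun -np 4 --mca btl_openib_use_eager_rdma 0 \
         --mca btl_openib_max_eager_rdma 0 ./a.out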
2016-01-21 13:10 GMT+08:00 Eva :
> I run with two machines, 2 process per node: process0, process1, process2,
> process3.
> After some random rounds of communicat
and by the way, you did
mpirun --mca btl_tcp_eager_limit 56
in order to disable eager mode, right ?
--mca btl_tcp_rndv_eager_limit 0
does something different
Cheers,
Gilles
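One way to double-check which eager-related parameters exist and what they are currently set to, assuming an ompi_info recent enough to accept --level (1.7 and later):

  ompi_info --param btl tcp --level 9 | grep -i eager
  ompi_info --param btl openib --level 9 | grep -i eager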
On 1/21/2016 2:10 PM, Eva wrote:
I run with two machines, 2 process per node: process0, process1,
process2, process3.
Af
Hi,
can you post a trimmed version of your program so we can reproduce and
analyze the hang ?
Cheers,
Gilles
On 1/21/2016 2:10 PM, Eva wrote:
I run with two machines, 2 process per node: process0, process1,
process2, process3.
After some random rounds of communications, the communication ha
I run with two machines, 2 process per node: process0, process1, process2,
process3.
After some random rounds of communications, the communication hangs. When I
debugged the program, I found:
process1 sent a message to process2;
process2 received the message from process1 and then start to receiv
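A common way to see exactly where each rank is blocked is to attach a debugger to the stuck processes; a sketch, with <pid> as a placeholder for the process id of one hung rank on a node:

  # dump all thread backtraces without an interactive session
  gdb -batch -ex 'thread apply all bt' -p <pid>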