Re: [OMPI users] Openmpi-3.1.0 + slurm (fixed)

2018-05-08 Thread Bill Broadley
Sorry all, Chris S over on the slurm list spotted it right away: I didn't have MpiDefault set to pmix_v2. I can confirm that Ubuntu 18.04, gcc-7.3, openmpi-3.1.0, pmix-2.1.1, and slurm-17.11.5 seem to work well together. Sorry for the bother.
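
For anyone hitting the same thing, a minimal sketch of the relevant slurm.conf setting (assuming slurm was built against the PMIx v2 plugin):

  # slurm.conf: make pmix_v2 the default MPI plugin type for srun
  MpiDefault=pmix_v2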

[OMPI users] Openmpi-3.1.0 + slurm?

2018-05-08 Thread Bill Broadley
I have openmpi-3.0.1, pmix-1.2.4, and slurm-17.11.5 working well on a few clusters. For things like:

  bill@headnode:~/src/relay$ srun -N 2 -n 2 -t 1 ./relay 1
  c7-18 c7-19
  size= 1, 16384 hops, 2 nodes in 0.03 sec ( 2.00 us/hop) 1953 KB/sec

I've been having a tougher time trying to get
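
A quick sanity check on the slurm side when chasing this kind of problem (a diagnostic sketch; the plugin list depends on how slurm was built):

  $ srun --mpi=list

If pmix_v2 doesn't appear in the output, slurm wasn't built against PMIx v2 and srun won't be able to launch jobs that way.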

Re: [OMPI users] New ib locked pages behavior?

2014-10-22 Thread Bill Broadley
On 10/22/2014 12:37 AM, r...@q-leap.de wrote:
>>>>>> "Bill" == Bill Broadley writes:
>
> It seems the half-life period of knowledge on the list has decayed to
> two weeks on the list :)
>
> I've commented in detail on this (non-)issue on 2014-08-

Re: [OMPI users] New ib locked pages behavior?

2014-10-22 Thread Bill Broadley
On 10/21/2014 05:38 PM, Gus Correa wrote:
> Hi Bill
>
> I have 2.6.X CentOS stock kernel.

Heh, wow, quite a blast from the past.

> I set both parameters.
> It works.

Yes, for kernels that old I had it working fine.

> Maybe the parameter names have changed in 3.X kernels?
> (Which is really bad

Re: [OMPI users] New ib locked pages behavior?

2014-10-21 Thread Bill Broadley
On 10/21/2014 04:18 PM, Gus Correa wrote:
> Hi Bill
>
> Maybe you're missing these settings in /etc/modprobe.d/mlx4_core.conf ?
>
> http://www.open-mpi.org/faq/?category=openfabrics#ib-low-reg-mem

Ah, that helped. Although:

/lib/modules/3.13.0-36-generic/kernel/drivers/net/ethernet/mellanox/mlx
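
For reference, the FAQ entry linked above covers the mlx4_core module parameters that bound how much memory can be registered. A sketch of the kind of settings involved (values here are examples only; size them per the FAQ's formula, aiming for at least twice physical RAM):

  # /etc/modprobe.d/mlx4_core.conf
  # max registrable memory = 2^log_num_mtt * 2^log_mtts_per_seg * page_size
  # with 4 KB pages, 24 and 3 give 2^39 bytes = 512 GB
  options mlx4_core log_num_mtt=24 log_mtts_per_seg=3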

[OMPI users] New ib locked pages behavior?

2014-10-21 Thread Bill Broadley
I've set up several clusters over the years with OpenMPI. I often get the below error:

  WARNING: It appears that your OpenFabrics subsystem is configured to only
  allow registering part of your physical memory. This can cause MPI jobs to
  run with erratic performance, hang, and/or crash.
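
Besides the mlx4_core parameters discussed elsewhere in this thread, the other usual suspect for this warning is the locked-memory ulimit. A diagnostic sketch (the limits.conf lines assume pam_limits is active on the compute nodes):

  $ ulimit -l    # should report "unlimited" on compute nodes

  # /etc/security/limits.conf
  * soft memlock unlimited
  * hard memlock unlimited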

Re: [OMPI users] MPI processes hang when using OpenMPI 1.3.2 and Gcc-4.4.0

2009-11-18 Thread Bill Broadley
A rather stable production code that has worked with various versions of MPI on various architectures started hanging with gcc-4.4.2 and openmpi-1.3.3, which led me to this thread. I made some very small changes to Eugene's code, here's the diff:

  $ diff testorig.c billtest.c
  3,5c3,4
  <
  < #define
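
The rest of the diff is cut off in the archive. For context, a hypothetical minimal reproducer in the spirit of the test being discussed (my own sketch, not Eugene's actual code; buffer size and loop count are arbitrary):

  #include <mpi.h>
  #include <stdio.h>

  /* Hypothetical ping-pong between ranks 0 and 1; the real
   * testorig.c/billtest.c diff is truncated above. */
  int main(int argc, char **argv)
  {
      int rank, i;
      char buf[16384];
      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      for (i = 0; i < 1000; i++) {
          if (rank == 0) {
              MPI_Send(buf, sizeof buf, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
              MPI_Recv(buf, sizeof buf, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                       MPI_STATUS_IGNORE);
          } else if (rank == 1) {
              MPI_Recv(buf, sizeof buf, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                       MPI_STATUS_IGNORE);
              MPI_Send(buf, sizeof buf, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
          }
      }
      if (rank == 0) printf("done\n");
      MPI_Finalize();
      return 0;
  }

A harness like this that hangs under one compiler/MPI pair but not another is the usual starting point for bisecting this sort of report.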

Re: [OMPI users] Can't use tcp instead of openib/infinipath

2008-07-23 Thread Bill Broadley
Jeff Squyres wrote:
> Sorry for the delay in replying.
>
> What exactly is the relay program timing? Can you run a standard
> benchmark like NetPIPE, perchance? (http://www.scl.ameslab.gov/netpipe/)

It gives very similar numbers to osu_latency. Turns out the mca btl seems to be completely ignored
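
A likely explanation for ignored btl settings on InfiniPath hardware (not confirmed in this truncated excerpt): the fast path there is the psm MTL under the cm pml, not a btl, so excluding openib at the btl level doesn't force TCP. A sketch of the selection that would (standard mpirun flags; component names assume a stock build):

  # force the btl-based ob1 pml, then restrict btls to tcp + self
  $ mpirun --mca pml ob1 --mca btl tcp,self -np 2 ./relay 1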

[OMPI users] Can't use tcp instead of openib/infinipath

2008-07-19 Thread Bill Broadley
I built openmpi-1.2.6 on centos-5.2 with gcc-4.3.1. I did a tar xvzf, cd openmpi-1.2.6, mkdir obj, cd obj (I put gcc-4.3.1/bin first in my path):

  ../configure --prefix=/opt/pkg/openmpi-1.2.6 --enable-shared --enable-debug

If I look in config.log I see:

  MCA_btl_ALL_COMPONENTS=' self sm gm mvapi mx
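
Once a build like that is installed, a quick way to confirm which btl components actually built and are visible at runtime (a sketch; the grep pattern is mine):

  $ /opt/pkg/openmpi-1.2.6/bin/ompi_info | grep btl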