Re: [OMPI users] oversubscription of slots with GridEngine

2014-11-11 Thread Reuti
On 11.11.2014 at 19:29, Ralph Castain wrote: > >> On Nov 11, 2014, at 10:06 AM, Reuti wrote: >> >> On 11.11.2014 at 17:52, Ralph Castain wrote: >> >>> On Nov 11, 2014, at 7:57 AM, Reuti wrote: On 11.11.2014 at 16:13, Ralph Castain wrote: > This clearly displays the problem …

Re: [OMPI users] File-backed mmaped I/O and openib btl.

2014-11-11 Thread Emmanuel Thomé
Thanks a lot for your analysis. This seems consistent with what I can obtain by playing around with my different test cases. It seems that munmap() does *not* unregister the memory chunk from the cache. I suppose this is the reason for the bug. In fact using mmap(..., MAP_ANONYMOUS | MAP_PRIVATE) …
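
A minimal sketch of the pattern under discussion (a reconstruction, not Emmanuel's actual test case; the file path, sizes, and tags are invented). If munmap() leaves the registration-cache entry behind, the second transfer below can be matched against a stale registration:

    #include <mpi.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        int rank;
        size_t len = 1 << 20;            /* large enough to avoid the eager path */
        char path[64];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        snprintf(path, sizeof path, "/tmp/regcache-%d", rank);

        int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
        ftruncate(fd, (off_t)len);

        /* file-backed mapping; openib registers this region on first use */
        char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        memset(buf, rank == 0 ? 1 : 0, len);
        if (rank == 0)
            MPI_Send(buf, (int)len, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
        else if (rank == 1)
            MPI_Recv(buf, (int)len, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        /* remap the same file at the same address; if the old registration
           was not evicted on munmap(), it can be reused incorrectly */
        munmap(buf, len);
        buf = mmap(buf, len, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED, fd, 0);
        memset(buf, rank == 0 ? 2 : 0, len);
        if (rank == 0)
            MPI_Send(buf, (int)len, MPI_CHAR, 1, 1, MPI_COMM_WORLD);
        else if (rank == 1) {
            MPI_Recv(buf, (int)len, MPI_CHAR, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            /* expect 2; stale sender-side registration would deliver 1,
               a lost delivery would leave 0 */
            printf("after 2nd recv: buf[0]=%d (expect 2)\n", buf[0]);
        }

        munmap(buf, len);
        close(fd);
        unlink(path);
        MPI_Finalize();
        return 0;
    }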

Re: [OMPI users] oversubscription of slots with GridEngine

2014-11-11 Thread Ralph Castain
> On Nov 11, 2014, at 10:06 AM, Reuti wrote: > > On 11.11.2014 at 17:52, Ralph Castain wrote: > >> >>> On Nov 11, 2014, at 7:57 AM, Reuti wrote: >>> >>> On 11.11.2014 at 16:13, Ralph Castain wrote: >>> This clearly displays the problem - if you look at the reported “allocated nodes” …

Re: [OMPI users] oversubscription of slots with GridEngine

2014-11-11 Thread Reuti
On 11.11.2014 at 17:52, Ralph Castain wrote: > >> On Nov 11, 2014, at 7:57 AM, Reuti wrote: >> >> On 11.11.2014 at 16:13, Ralph Castain wrote: >> >>> This clearly displays the problem - if you look at the reported “allocated >>> nodes”, you see that we only got one node (cn6050). This is why …

Re: [OMPI users] what order do I get messages coming to MPI Recv from MPI_ANY_SOURCE?

2014-11-11 Thread George Bosilca
Using MPI_ANY_SOURCE will extract one message from the queue of unexpected messages. Fairness is not guaranteed by the MPI standard, so it is impossible to predict the order between servers. If you need fairness, your second choice is the way to go. George. > On Nov 10, 2014, at 20:14 …
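
The poster's "second choice" is cut off in this preview, so the following is only a hedged sketch of one common controlled pattern (rank layout assumed: a consumer on rank 0, "servers" on ranks 1..size-1): instead of wildcard receives, pre-post one receive per server and complete them explicitly.

    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        int nservers = size - 1;

        if (rank > 0) {                  /* each "server" sends one message */
            MPI_Send(&rank, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
        } else {
            /* one pre-posted receive per server, completed explicitly;
               contrast with nservers MPI_Recv(MPI_ANY_SOURCE) calls, whose
               matching order is implementation-defined */
            int *val = malloc(nservers * sizeof *val);
            MPI_Request *req = malloc(nservers * sizeof *req);
            for (int s = 0; s < nservers; s++)
                MPI_Irecv(&val[s], 1, MPI_INT, s + 1, 0, MPI_COMM_WORLD, &req[s]);
            for (int done = 0; done < nservers; done++) {
                int idx;
                MPI_Waitany(nservers, req, &idx, MPI_STATUS_IGNORE);
                printf("serviced server rank %d\n", idx + 1);
            }
            free(val);
            free(req);
        }
        MPI_Finalize();
        return 0;
    }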

Re: [OMPI users] EXTERNAL: Re: Question on mapping processes to hosts file

2014-11-11 Thread Ralph Castain
I checked that bug using the current 1.8.4 branch and I can’t replicate it - looks like it might have already been fixed. If I give a hostfile like the one you described: node1 node1 node2 node3 and then ask to launch four processes: mpirun -n 4 --display-allocation --display-map --do-not-launch …
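
Reflowed from the flattened preview, the hostfile (one entry per line) and the invocation look like:

    node1
    node1
    node2
    node3

    mpirun -n 4 --display-allocation --display-map --do-not-launch …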

Re: [OMPI users] oversubscription of slots with GridEngine

2014-11-11 Thread Ralph Castain
> On Nov 11, 2014, at 7:57 AM, Reuti wrote: > > On 11.11.2014 at 16:13, Ralph Castain wrote: > >> This clearly displays the problem - if you look at the reported “allocated >> nodes”, you see that we only got one node (cn6050). This is why we mapped >> all your procs onto that node. >> >> So …

Re: [OMPI users] EXTERNAL: Re: Question on mapping processes to hosts file

2014-11-11 Thread Blosch, Edwin L
Thanks Ralph. I’ll experiment with these options. Much appreciated. From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain Sent: Tuesday, November 11, 2014 10:00 AM To: Open MPI Users Subject: Re: [OMPI users] EXTERNAL: Re: Question on mapping processes to hosts file On Nov …

Re: [OMPI users] EXTERNAL: Re: Question on mapping processes to hosts file

2014-11-11 Thread Ralph Castain
> On Nov 11, 2014, at 6:11 AM, Blosch, Edwin L wrote: > > OK, that’s what I was suspecting. It’s a bug, right? I asked for 4 > processes and I supplied a host file with 4 lines in it, and mpirun didn’t > launch the processes where I told it to launch them. Actually, no - it’s an intended “…”

Re: [OMPI users] oversubscription of slots with GridEngine

2014-11-11 Thread Reuti
On 11.11.2014 at 16:13, Ralph Castain wrote: > This clearly displays the problem - if you look at the reported “allocated > nodes”, you see that we only got one node (cn6050). This is why we mapped all > your procs onto that node. > > So the real question is - why? Can you show us the content of PE_HOSTFILE? …

Re: [OMPI users] OPENMPI-1.8.3: missing fortran bindings for MPI_SIZEOF

2014-11-11 Thread Jeff Squyres (jsquyres)
On Nov 11, 2014, at 9:43 AM, Dave Love wrote: > I haven't checked the source, but the commit message above says > > If the Fortran compiler supports both INTERFACE and ISO_FORTRAN_ENV, > then we'll build the MPI_SIZEOF interfaces. If not, we'll skip > MPI_SIZEOF in mpif.h and the mpi module.

Re: [OMPI users] OPENMPI-1.8.3: missing fortran bindings for MPI_SIZEOF

2014-11-11 Thread Jeff Squyres (jsquyres)
On Nov 11, 2014, at 9:38 AM, Dave Love wrote: >> 1. All modern compilers have ignore-TKR syntax, > > Hang on! (An equivalent of) ignore_tkr only appeared in gfortran 4.9 > (the latest release) as far as I know. The system compiler of the bulk > of GNU/Linux HPC systems currently is distinctly …

Re: [OMPI users] oversubscription of slots with GridEngine

2014-11-11 Thread Ralph Castain
This clearly displays the problem - if you look at the reported “allocated nodes”, you see that we only got one node (cn6050). This is why we mapped all your procs onto that node. So the real question is - why? Can you show us the content of PE_HOSTFILE? > On Nov 11, 2014, at 4:51 AM, SLIM H.A. wrote: …
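
For readers who don't run GridEngine: PE_HOSTFILE points at a plain text file with one line per granted host — hostname, slot count, queue instance, and processor range (see sge_pe(5)). For the job shown later in this thread it would look roughly like:

    cn6050 16 par6.q@cn6050 UNDEFINED
    cn6045 16 par6.q@cn6045 UNDEFINED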

Re: [OMPI users] oversubscription of slots with GridEngine

2014-11-11 Thread Dave Love
"SLIM H.A." writes: > We switched on hyper threading on our cluster with two eight core > sockets per node (32 threads per node). Assuming that's Xeon-ish hyperthreading, the best advice is not to. It will typically hurt performance of HPC applications, not least if it defeats core binding, and

Re: [OMPI users] OPENMPI-1.8.3: missing fortran bindings for MPI_SIZEOF

2014-11-11 Thread Dave Love
"Jeff Squyres (jsquyres)" writes: > On Nov 10, 2014, at 8:27 AM, Dave Love wrote: > >>> https://github.com/open-mpi/ompi/commit/d7eaca83fac0d9783d40cac17e71c2b090437a8c >> >> I don't have time to follow this properly, but am I reading right that >> that says mpi_sizeof will now _not_ work with

Re: [OMPI users] OPENMPI-1.8.3: missing fortran bindings for MPI_SIZEOF

2014-11-11 Thread Dave Love
"Jeff Squyres (jsquyres)" writes: > There are several reasons why MPI implementations have not added explicit > interfaces to their mpif.h files, mostly boiling down to: they may/will break > real world MPI programs. > > 1. All modern compilers have ignore-TKR syntax, Hang on! (An equivalent

Re: [OMPI users] File-backed mmaped I/O and openib btl.

2014-11-11 Thread Joshua Ladd
I was able to reproduce your issue and I think I understand the problem a bit better at least. This demonstrates exactly what I was pointing to: It looks like when the test switches over from eager RDMA (I'll explain in a second), to doing a rendezvous protocol working entirely in user buffer space …
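
The switch Joshua describes is driven by message size: small sends are copied through the btl's pre-registered eager (or eager-RDMA) buffers, while large sends use a rendezvous that registers and transfers the user buffer itself. A hedged sketch that exercises both paths (the sizes below are assumptions; the real cutoffs are openib MCA parameters):

    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        size_t small = 1024;      /* assumed below the eager limit             */
        size_t large = 4 << 20;   /* assumed above it: rendezvous path, where
                                     the user buffer itself gets registered    */
        char *buf = calloc(large, 1);   /* contents irrelevant for the demo */

        if (rank == 0) {
            MPI_Send(buf, (int)small, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Send(buf, (int)large, MPI_CHAR, 1, 1, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(buf, (int)small, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Recv(buf, (int)large, MPI_CHAR, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }
        free(buf);
        MPI_Finalize();
        return 0;
    }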

Re: [OMPI users] EXTERNAL: Re: Question on mapping processes to hosts file

2014-11-11 Thread Blosch, Edwin L
OK, that’s what I was suspecting. It’s a bug, right? I asked for 4 processes and I supplied a host file with 4 lines in it, and mpirun didn’t launch the processes where I told it to launch them. Do you know when or if this changed? I can’t recall seeing this behavior in 1.6.5 or 1.4 or …

Re: [OMPI users] oversubscription of slots with GridEngine

2014-11-11 Thread SLIM H.A.
Dear Reuti and Ralph, Below is the output of the run for openmpi 1.8.3 with this line: mpirun -np $NSLOTS --display-map --display-allocation --cpus-per-proc 1 $exe master=cn6050 PE=orte JOB_ID=2482923 Got 32 slots. slots: cn6050 16 par6.q@cn6050 cn6045 16 par6.q@cn6045 Tue Nov 11 12:37:37 GMT 2014 …
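
The same output reflowed (line breaks are a best guess from the flattened preview):

    master=cn6050
    PE=orte
    JOB_ID=2482923
    Got 32 slots.
    slots:
    cn6050 16 par6.q@cn6050
    cn6045 16 par6.q@cn6045
    Tue Nov 11 12:37:37 GMT 2014 …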

Re: [OMPI users] File-backed mmaped I/O and openib btl.

2014-11-11 Thread Emmanuel Thomé
Hi again, I've been able to simplify my test case significantly. It now runs with 2 nodes, and only a single MPI_Send / MPI_Recv pair is used. The pattern is as follows. * - ranks 0 and 1 both own a local buffer. * - each fills it with (deterministically known) data. * - rank 0 collects the …