Re: [OMPI users] send_request error with allocate

2015-09-30 Thread Diego Avesani
Dear Jeff, Dear Gilles, Dear All, now it is all much clearer. I use CALL MPI_ISEND and CALL MPI_IRECV. Each CPU sends once and receives once, which implies that I have REQUEST(2) for WAITALL. However, sometimes some CPU does not send or receive anything, so I have to set REQUEST = MPI_REQUEST_NULL in order
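A minimal sketch of that pattern, assuming a simple left/right neighbour exchange with hypothetical buffer and size names (sendL, recvR, msg_len) rather than Diego's actual code: requests that are not used stay MPI_REQUEST_NULL, so MPI_WAITALL over the full REQUEST(2) array is always legal.

    PROGRAM neighbour_exchange
      USE mpi
      IMPLICIT NONE
      INTEGER, PARAMETER :: msg_len = 4                 ! hypothetical message length
      COMPLEX(KIND=KIND(0d0)) :: sendL(msg_len), recvR(msg_len)
      INTEGER :: request(2), ierr, rank, nproc
      INTEGER :: statuses(MPI_STATUS_SIZE, 2)

      CALL MPI_INIT(ierr)
      CALL MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
      CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nproc, ierr)

      ! Unused requests must be MPI_REQUEST_NULL so MPI_WAITALL can
      ! still be called on the whole array.
      request = MPI_REQUEST_NULL
      sendL   = CMPLX(rank, 0, KIND=KIND(0d0))

      IF (rank > 0) THEN            ! all but rank 0 send to the left neighbour
         CALL MPI_ISEND(sendL, msg_len, MPI_DOUBLE_COMPLEX, rank-1, &
                        rank, MPI_COMM_WORLD, request(1), ierr)
      END IF
      IF (rank < nproc-1) THEN      ! all but the last rank receive from the right
         CALL MPI_IRECV(recvR, msg_len, MPI_DOUBLE_COMPLEX, rank+1, &
                        rank+1, MPI_COMM_WORLD, request(2), ierr)
      END IF

      CALL MPI_WAITALL(2, request, statuses, ierr)
      CALL MPI_FINALIZE(ierr)
    END PROGRAM neighbour_exchange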

Re: [OMPI users] libfabric/usnic does not compile in 2.x

2015-09-30 Thread Jeff Squyres (jsquyres)
On Sep 30, 2015, at 3:13 PM, marcin.krotkiewski wrote: > > Thank you for this clear explanation. I do not have True Scale on 'my' > machine, so unless Mellanox gets involved - no juice for me. > > Makes me wonder. libfabric is marketed as a next-generation solution. Clearly > it has some repo

Re: [OMPI users] libfabric/usnic does not compile in 2.x

2015-09-30 Thread Jeff Squyres (jsquyres)
On Sep 30, 2015, at 11:19 AM, marcin.krotkiewski wrote: > > Thank you, and Jeff, for clarification. > > Before I bother you all more without the need, I should probably say I was > hoping to use libfabric/OpenMPI on an InfiniBand cluster. Somehow now I feel > I have confused this altogether,

Re: [OMPI users] send_request error with allocate

2015-09-30 Thread Jeff Squyres (jsquyres)
On Sep 30, 2015, at 4:41 PM, Diego Avesani wrote: > > Dear Gilles, > sorry to ask you again and to be frustrating, > basically is this what I shall do for each CPU: > > CALL MPI_ISEND(send_messageL, MsgLength, MPI_DOUBLE_COMPLEX, MPIdata%rank-1, > MPIdata%rank, MPI_COMM_WORLD, REQUEST(1), MPIda

Re: [OMPI users] send_request error with allocate

2015-09-30 Thread Diego Avesani
Dear Gilles, sorry to ask you again and to be frustrating, basically is this what I shall do for each CPU: CALL MPI_ISEND(send_messageL, MsgLength, MPI_DOUBLE_COMPLEX, MPIdata%rank-1, MPIdata%rank, MPI_COMM_WORLD, REQUEST(1), MPIdata%iErr) CALL MPI_IRECV(recv_messageR, MsgLength, MPI_DOUBLE_COMP

Re: [OMPI users] libfabric/usnic does not compile in 2.x

2015-09-30 Thread marcin.krotkiewski
Thank you for this clear explanation. I do not have True Scale on 'my' machine, so unless Mellanox gets involved - no juice for me. Makes me wonder. libfabric is marketed as a next-generation solution. Clearly it has some reported advantage for Cisco usnic, but since you claim no improvement

Re: [OMPI users] libfabric/usnic does not compile in 2.x

2015-09-30 Thread Howard Pritchard
Hi Marcin, 2015-09-30 9:19 GMT-06:00 marcin.krotkiewski : > Thank you, and Jeff, for clarification. > > Before I bother you all more without the need, I should probably say I was > hoping to use libfabric/OpenMPI on an InfiniBand cluster. Somehow now I > feel I have confused this altogether, so

Re: [OMPI users] about MPI communication complexity

2015-09-30 Thread George Bosilca
Xing Feng, A more focused (and certainly more detailed) analysis of the cost of different algorithms for collective communications can be found in [1], and more recently in [2]. George. [1] http://icl.cs.utk.edu/projectsfiles/rib/pubs/Pjesivac-Grbovic_PMEO-PDS05.pdf [2] https://www.cs.utexas.e

Re: [OMPI users] Using POSIX shared memory as send buffer

2015-09-30 Thread marcin.krotkiewski
Hi, Nathan, I have compiled 2.x with your patch. I must say it works _much_ better with your changes. I have no idea how you figured that out! A short table with my bandwidth calculations (MB/s): columns PROT_READ, PROT_READ | PROT_WRITE; 1.10.0: 2500

Re: [OMPI users] Using OpenMPI (1.8, 1.10) with Mellanox MXM, ulimits ?

2015-09-30 Thread Nathan Hjelm
Mike, I see a typo in the mxm warning: mxm.c:185 MXM WARN The 'ulimit -s' on the system is set to 'unlimited'. This may have negative performance implications. Please set the heap size to the default value (10240) Should say stack not heap. -Nathan On Wed, Sep 30, 2015 at 06:52:46PM +0300,

Re: [OMPI users] Using OpenMPI (1.8, 1.10) with Mellanox MXM, ulimits ?

2015-09-30 Thread Mike Dubman
mxm comes with mxm_dump_config utility which provides and explains all tunables. Please check HPCX/README file for details. On Wed, Sep 30, 2015 at 1:21 PM, Dave Love wrote: > Mike Dubman writes: > > > unfortunately, there is no one size fits all here. > > > > mxm provides best performance for

Re: [OMPI users] Using OpenMPI (1.8, 1.10) with Mellanox MXM, ulimits ?

2015-09-30 Thread Mike Dubman
We did not get to the bottom of the "why". We tried different MPI packages (mvapich, intel mpi) and the observation held true. It could be many factors affected by the huge heap size (cpu cache misses? swappiness?). On Wed, Sep 30, 2015 at 1:12 PM, Dave Love wrote: > Mike Dubman writes: > > > Hello Grigory

Re: [OMPI users] Using OpenMPI (1.8, 1.10) with Mellanox MXM, ulimits ?

2015-09-30 Thread Grigory Shamov
Hi Thomas, Thank you for the suggestion! Will try it. -- Grigory Shamov On 15-09-30 6:57 AM, "users on behalf of Thomas Jahns" wrote: >Hello, > >On 09/28/15 18:36, Grigory Shamov wrote: >> The question is if we should do as MXM wants, or ignore it? Has anyone >>an >> experience running rec

Re: [OMPI users] libfabric/usnic does not compile in 2.x

2015-09-30 Thread marcin.krotkiewski
Thank you, and Jeff, for clarification. Before I bother you all more without the need, I should probably say I was hoping to use libfabric/OpenMPI on an InfiniBand cluster. Somehow now I feel I have confused this altogether, so maybe I should go one step back: 1. libfabric is hardware indep

Re: [OMPI users] send_request error with allocate

2015-09-30 Thread Gilles Gouaillardet
Diego, there is some confusion here... MPI_Waitall is not a collective operation, and a given task can only wait on the requests it initiated. Bottom line: each task does exactly one send and one recv, right? In this case, you want to have an array of two requests, isend with the first element an

Re: [OMPI users] send_request error with allocate

2015-09-30 Thread Diego Avesani
Do you have some suggestions? Is there any possibility of not using a vector as send_request and at the same time having a WAIT? Regarding the code, you are perfectly right; I hope to improve it in the future. Thanks again Diego On 30 September 2015 at 16:50, Jeff Squyres (jsquyres) wrote: > I don'

Re: [OMPI users] understanding mpi_gather-mpi_gatherv

2015-09-30 Thread Nick Papior
Gather receives messages of _one_ length. Hence all arrays have to be of the same length (not exactly; see below). Hence 625 should be 175. See the example on the documentation site: https://www.open-mpi.org/doc/v1.8/man3/MPI_Gather.3.php You should use gatherv for varying message lengths, or use gat

Re: [OMPI users] understanding mpi_gather-mpi_gatherv

2015-09-30 Thread Jeff Squyres (jsquyres)
Gather requires that all processes contribute the same size message. Gatherv allows the root to specify a different size that will be supplied by each peer process. Note, too, that X1(iStart:iEnd) may well invoke a copy to copy just that portion of the array; that might hurt your performance (
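A minimal MPI_GATHERV sketch along those lines, assuming the 4-rank 150/150/150/175 split mentioned in the thread and hypothetical variable names; each rank contributes only its own slice of X1, and the root supplies the per-rank counts and displacements:

    PROGRAM gatherv_example
      USE mpi
      IMPLICIT NONE
      INTEGER, PARAMETER :: ntot = 625
      COMPLEX(KIND=KIND(0d0)) :: x1(ntot)
      COMPLEX(KIND=KIND(0d0)), ALLOCATABLE :: xfull(:)
      INTEGER :: counts(4), displs(4)           ! sketch assumes exactly 4 ranks
      INTEGER :: rank, nproc, ierr, istart, iend, i

      CALL MPI_INIT(ierr)
      CALL MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
      CALL MPI_COMM_SIZE(MPI_COMM_WORLD, nproc, ierr)

      counts = (/ 150, 150, 150, 175 /)          ! per-rank chunk sizes
      displs(1) = 0
      DO i = 2, 4
         displs(i) = displs(i-1) + counts(i-1)   ! offsets into the root buffer
      END DO

      istart = displs(rank+1) + 1
      iend   = displs(rank+1) + counts(rank+1)
      x1(istart:iend) = CMPLX(rank, 0, KIND=KIND(0d0))   ! each rank fills only its slice

      ALLOCATE(xfull(ntot))
      ! Unlike MPI_GATHER, each rank may contribute a different count.
      ! Note: passing the section x1(istart:iend) may create a temporary copy,
      ! as pointed out above.
      CALL MPI_GATHERV(x1(istart:iend), counts(rank+1), MPI_DOUBLE_COMPLEX, &
                       xfull, counts, displs, MPI_DOUBLE_COMPLEX, &
                       0, MPI_COMM_WORLD, ierr)

      CALL MPI_FINALIZE(ierr)
    END PROGRAM gatherv_example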

Re: [OMPI users] send_request error with allocate

2015-09-30 Thread Jeff Squyres (jsquyres)
I don't think that this pattern was obvious from the code snippet you sent, which is why I asked for a small, self-contained reproducer. :-) I don't know offhand how send_request(:) will be passed to C. > On Sep 30, 2015, at 10:16 AM, Diego Avesani wrote: > > Dear all, > thank for the explan

[OMPI users] understanding mpi_gather-mpi_gatherv

2015-09-30 Thread Diego Avesani
Dear all, I am not sure if I have correctly understood mpi_gather and mpi_gatherv. This is my problem: I have a complex vector, let's say X1, where X1 is (1:625). Each CPU works only with some elements of X1, let's say: CPU 0 --> X1(iEnd-iStart) 150 elements CPU 1 --> X1(iEnd-iStart) 150 element

Re: [OMPI users] libfabric/usnic does not compile in 2.x

2015-09-30 Thread Jeff Squyres (jsquyres)
On Sep 30, 2015, at 7:35 AM, Marcin Krotkiewski wrote: > > I am trying to compile the 2.x branch with libfabric support, but get this > error during configure: > > configure:100708: checking rdma/fi_ext_usnic.h presence > configure:100708: gcc -E -I/cluster/software/VERSIONS/openmpi.gnu.2.x/in

Re: [OMPI users] Using OpenMPI (1.8, 1.10) with Mellanox MXM, ulimits ?

2015-09-30 Thread Grigory Shamov
Absolutely. Quite a lot of quantum chemistry codes here are Fortran, and most would use Intel Fortran for performance. While some (VASP) might depend on the -heap-arrays Intel switch being used with a small value, the default setting for Intel Fortran is -no-heap-arrays "temporary arrays are a
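As a hedged illustration of why the stack limit matters for such Fortran codes: automatic arrays (and, depending on flags such as Intel's -heap-arrays, compiler-generated temporaries) typically land on the stack, while ALLOCATABLE arrays live on the heap. A minimal sketch with hypothetical names:

    PROGRAM stack_vs_heap
      IMPLICIT NONE
      CALL work(1000000)          ! ~8 MB automatic array: fits the default 10240 KB
                                  ! stack limit, but a much larger n could overflow it
    CONTAINS
      SUBROUTINE work(n)
        INTEGER, INTENT(IN) :: n
        REAL(KIND=KIND(0d0)) :: tmp_auto(n)              ! automatic array: usually stack
        REAL(KIND=KIND(0d0)), ALLOCATABLE :: tmp_heap(:)
        ALLOCATE(tmp_heap(n))                            ! ALLOCATABLE: heap
        tmp_auto = 1.0d0
        tmp_heap = 2.0d0
        DEALLOCATE(tmp_heap)
      END SUBROUTINE work
    END PROGRAM stack_vs_heap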

Re: [OMPI users] Using OpenMPI (1.8, 1.10) with Mellanox MXM, ulimits ?

2015-09-30 Thread Grigory Shamov
Hi Mike, It sure is possible to tune for a particular code, especially if one aims at getting the best performance numbers. That's one thing; however, when a communication library (MXM) imposes limits that might conflict with the limits of some applications, it's another. Maintaining stacks of MPI

Re: [OMPI users] libfabric/usnic does not compile in 2.x

2015-09-30 Thread Howard Pritchard
Hello Marcin, What configure options are you using besides with-libfabric? Could you post your config.log file to the list? It looks like fi_ext_usnic.h is only installed if the usnic libfabric provider could be built. When you configured libfabric, what providers were listed at the end of the configure ru

Re: [OMPI users] send_request error with allocate

2015-09-30 Thread Diego Avesani
Dear all, thanks for the explanation, but something is not clear to me. I have 4 CPUs. I use only three of them to send, let's say: CPU 0 sends to CPU 1, CPU 1 sends to CPU 2, CPU 2 sends to CPU 3; only three receive, let's say: CPU 1 from CPU 0, CPU 2 from CPU 1, CPU 3 from CPU 2. So I use ALLOCATE(send_requ

Re: [OMPI users] worse latency in 1.8 c.f. 1.6

2015-09-30 Thread Dave Love
I wrote: > I'll try some variations like that when I can get complete nodes on the > chassis. It turns out that adding --mca mtl ^mxm to the 1.8 case gives results in line with 1.6 as best as I can estimate the variation (error bars -- we've heard of them). It makes no difference to 1.6 whether

Re: [OMPI users] Problem using Open MPI 1.10.0 built with Intel compilers 16.0.0

2015-09-30 Thread Fabrice Roy
Hi, I have built Open MPI from the nightly snapshot v1.10.0-73-ge27ab85 and everything seems to work fine. Thanks a lot! Fabrice On 25/09/2015 07:38, Jeff Squyres (jsquyres) wrote: Fabrice -- I have committed a fix to our development master; it is pending moving over to the v1.10 and v2

Re: [OMPI users] Using OpenMPI (1.8, 1.10) with Mellanox MXM, ulimits ?

2015-09-30 Thread Thomas Jahns
Hello, On 09/28/15 18:36, Grigory Shamov wrote: The question is if we should do as MXM wants, or ignore it? Has anyone an experience running recent OpenMPI with MXM enabled, and what kind of ulimits do you have? Any suggestions/comments appreciated, thanks! It should be sufficient to set the l

[OMPI users] libfabric/usnic does not compile in 2.x

2015-09-30 Thread Marcin Krotkiewski
Hi, I am trying to compile the 2.x branch with libfabric support, but get this error during configure: configure:100708: checking rdma/fi_ext_usnic.h presence configure:100708: gcc -E -I/cluster/software/VERSIONS/openmpi.gnu.2.x/include -I/usit/abel/u1/marcink/software/ompi-release-2.x/opal/

Re: [OMPI users] send_request error with allocate

2015-09-30 Thread Jeff Squyres (jsquyres)
Put differently: - You have an array of N requests - If you're only filling up M of them (where N On Sep 30, 2015, at 3:43 AM, Diego Avesani wrote: > > Dear Gilles, Dear All, > > What do you mean that the array of requests has to be initialize via > MPI_Isend or MPI_Irecv? > > In my code I us

[OMPI users] send_request error with allocate

2015-09-30 Thread Gilles Gouaillardet
Diego, if you invoke Isend 3 times with three different send_request() and do the same thing for Irecv, then you do not have to worry about MPI_REQUEST_NULL. Based on your snippet, there could be an issue on ranks 0 and n-1; also, the index of send_request is MPIdata%rank+1, if MPIdata%rank is MPi_Comm_ra

Re: [OMPI users] Using OpenMPI (1.8, 1.10) with Mellanox MXM, ulimits ?

2015-09-30 Thread Dave Love
Mike Dubman writes: > unfortunately, there is no one size fits all here. > > mxm provides best performance for IB. > > different application may require different OMPI, mxm, OS tunables and > requires some performance engineering. Fair enough, but is there any guidance on the MXM stuff, in parti

Re: [OMPI users] Using OpenMPI (1.8, 1.10) with Mellanox MXM, ulimits ?

2015-09-30 Thread Dave Love
Nathan Hjelm writes: > I would like to add that you may want to play with the value and see > what works for your applications. Most applications should be using > malloc or similar functions to allocate large memory regions in the heap > and not on the stack. It's long been a Fortran optimizati

Re: [OMPI users] Using OpenMPI (1.8, 1.10) with Mellanox MXM, ulimits ?

2015-09-30 Thread Dave Love
Mike Dubman writes: > Hello Grigory, > > We observed ~10% performance degradation with heap size set to unlimited > for CFD applications. OK, but why? It would help to understand what the mechanism is, and why MXM specifically tells you to set the stack to the default, which may well be wrong f

Re: [OMPI users] worse latency in 1.8 c.f. 1.6

2015-09-30 Thread Dave Love
Mike Dubman writes: > what is your command line and setup? (ofed version, distro) It's on up-to-date SL6 (so using whatever RHEL6 ships) running the commands below for the 1.6 and 1.8 cases respectively. The HCA is reported as mlx4_0. Core binding is configured for 1.6. I think they both had

Re: [OMPI users] send_request error with allocate

2015-09-30 Thread Diego Avesani
Dear Gilles, Dear All, What do you mean that the array of requests has to be initialized via MPI_Isend or MPI_Irecv? In my code I use MPI_Isend and MPI_Irecv three times, so I have a send_request(3). According to this, do I have to use MPI_REQUEST_NULL? In the meantime I will check my code. Thanks Di

Re: [OMPI users] about MPI communication complexity

2015-09-30 Thread Marc-Andre Hermanns
Dear Xing Feng, there are different algorithms to implement collective communication patterns. Beyond the general Big-O complexity, the concrete cost also depends on the network topology, message length, etc. Therefore many MPI implementations switch between different algorithms depending on t
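One concrete, textbook-style illustration (a hedged latency-bandwidth estimate, not a statement of what Open MPI actually implements): with per-message latency $\alpha$ and per-byte cost $\beta$, a binomial-tree broadcast of an $n$-byte message among $p$ processes costs roughly

    T_{\text{binomial}} \approx \lceil \log_2 p \rceil \,(\alpha + n\beta)

while a scatter-plus-ring-allgather broadcast costs roughly

    T_{\text{scatter+allgather}} \approx (\lceil \log_2 p \rceil + p - 1)\,\alpha + 2\,\tfrac{p-1}{p}\, n\beta

which is why implementations tend to prefer the tree for short messages (latency-bound) and the scatter/allgather scheme for long ones (bandwidth-bound).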

[OMPI users] about MPI communication complexity

2015-09-30 Thread XingFENG
Hi, everyone, I am working with Open MPI. When I tried to analyse the performance of my programs, I found it hard to understand the communication complexity of MPI routines. I have found some pages on the Internet, such as http://stackoverflow.com/questions/10625643/mpi-communication-complexity This i