Dear Jeff, Dear Gilles, Dear All,
now it is all clearer.
I use CALL MPI_ISEND and CALL MPI_IRECV. Each CPU sends once and receives
once, which implies that I have REQUEST(2) for WAITALL. However, sometimes
some CPU does not send or receive anything, so I have to set REQUEST
= MPI_REQUEST_NULL in order
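Something like this is what I mean (a sketch only; the flags and the
dest/src/tag variables are placeholders, not my real code). Every entry of
the request array starts as MPI_REQUEST_NULL, so MPI_WAITALL simply skips
the slots for which no communication was posted:

```fortran
! Sketch: requests that were never used must be MPI_REQUEST_NULL
! so that MPI_WAITALL ignores them.
INTEGER :: REQUEST(2), iErr
LOGICAL :: i_send, i_recv   ! whether this rank sends / receives

REQUEST(:) = MPI_REQUEST_NULL
IF (i_send) THEN
   CALL MPI_ISEND(send_messageL, MsgLength, MPI_DOUBLE_COMPLEX, dest, tag, &
                  MPI_COMM_WORLD, REQUEST(1), iErr)
END IF
IF (i_recv) THEN
   CALL MPI_IRECV(recv_messageR, MsgLength, MPI_DOUBLE_COMPLEX, src, tag, &
                  MPI_COMM_WORLD, REQUEST(2), iErr)
END IF
! MPI_WAITALL completes immediately for any MPI_REQUEST_NULL entry
CALL MPI_WAITALL(2, REQUEST, MPI_STATUSES_IGNORE, iErr)
```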
Dear Gilles,
sorry to ask you again and to be a bother,
but basically, is this what I should do on each CPU:
CALL MPI_ISEND(send_messageL, MsgLength, MPI_DOUBLE_COMPLEX,
MPIdata%rank-1, MPIdata%rank, MPI_COMM_WORLD, REQUEST(1), MPIdata%iErr)
CALL MPI_IRECV(recv_messageR, MsgLength, MPI_DOUBLE_COMP
Thank you for this clear explanation. I do not have True Scale on 'my'
machine, so unless Mellanox gets involved - no juice for me.
Makes me wonder. libfabric is marketed as a next-generation solution.
Clearly it has some reported advantage for Cisco usnic, but since you
claim no improvement
Xing Feng,
A more focused (and certainly more detailed) analysis of the cost of
different algorithms for collective communications can be found in [1], and
more recently in [2].
George.
[1]
http://icl.cs.utk.edu/projectsfiles/rib/pubs/Pjesivac-Grbovic_PMEO-PDS05.pdf
[2] https://www.cs.utexas.e
Hi, Nathan
I have compiled 2.x with your patch. I must say it works _much_ better
with your changes. I have no idea how you figured that out! A short
table with my bandwidth calculations (MB/s):

            PROT_READ   PROT_READ | PROT_WRITE
1.10.0      2500
Mike, I see a typo in the mxm warning:
mxm.c:185 MXM WARN The
'ulimit -s' on the system is set to 'unlimited'. This may have negative
performance implications. Please set the heap size to the default value
(10240)
It should say stack, not heap.
-Nathan
mxm comes with the mxm_dump_config utility, which provides and explains all
tunables.
Please check the HPCX/README file for details.
We did not get to the bottom of "why".
We tried different MPI packages (MVAPICH, Intel MPI) and the observation held
true.
It could be many factors affected by a huge heap size (CPU cache misses?
swappiness?).
Hi Thomas,
Thank you for the suggestion! Will try it.
--
Grigory Shamov
Thank you, and Jeff, for the clarification.
Before I bother you all further without need, I should probably say I
was hoping to use libfabric/OpenMPI on an InfiniBand cluster. Somehow
now I feel I have confused this altogether, so maybe I should go one
step back:
1. libfabric is hardware indep
Diego,
there is some confusion here...
MPI_Waitall is not a collective operation, and a given task can only wait
on the requests it initiated.
Bottom line: each task does exactly one send and one recv, right?
in this case, you want to have an array of two requests, isend with the
first element an
Do you have any suggestions? Is there any possibility of not using a
vector as send_request and at the same time having a WAIT?
Regarding the code, you are perfectly right; I hope to improve it in the future.
Thanks again
Diego
Gather receives messages of _one_ length. Hence all arrays have to be of the
same length (not exactly; see below). Hence 625 should be 175.
See the example on the documentation site:
https://www.open-mpi.org/doc/v1.8/man3/MPI_Gather.3.php
You should use gatherv for varying length of messages, or use gat
Gather requires that all processes contribute the same size message. Gatherv
allows the root to specify a different size that will be supplied by each peer
process.
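A hedged sketch of the usual Gatherv recipe (variable names here are
illustrative, not from the original code): the root first learns each rank's
count with a plain Gather, builds displacements as a running sum, and then
gathers the variable-length pieces:

```fortran
! Sketch: gather variable-length pieces of X1 onto rank 0.
INTEGER :: nLoc, nProcs, myRank, i, iErr
INTEGER, ALLOCATABLE :: recvcounts(:), displs(:)
COMPLEX*16, ALLOCATABLE :: Xall(:)

nLoc = iEnd - iStart + 1          ! this rank's contribution
ALLOCATE(recvcounts(nProcs), displs(nProcs))

! the root learns how much each rank will send
CALL MPI_GATHER(nLoc, 1, MPI_INTEGER, recvcounts, 1, MPI_INTEGER, &
                0, MPI_COMM_WORLD, iErr)

! displacements are a running sum of the counts (significant on root only)
displs(1) = 0
DO i = 2, nProcs
   displs(i) = displs(i-1) + recvcounts(i-1)
END DO
IF (myRank == 0) ALLOCATE(Xall(SUM(recvcounts)))

! note: passing the section X1(iStart:iEnd) may create a temporary copy
CALL MPI_GATHERV(X1(iStart:iEnd), nLoc, MPI_DOUBLE_COMPLEX, &
                 Xall, recvcounts, displs, MPI_DOUBLE_COMPLEX, &
                 0, MPI_COMM_WORLD, iErr)
```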
Note, too, that X1(iStart:iEnd) may well invoke a copy of just that
portion of the array; that might hurt your performance (
I don't think that this pattern was obvious from the code snippet you sent,
which is why I asked for a small, self-contained reproducer. :-)
I don't know offhand how send_request(:) will be passed to C.
Dear all,
I am not sure I have correctly understood mpi_gather and mpi_gatherv.
This is my problem:
I have a complex vector, let's say X1, where X1 is (1:625).
Each CPU works only with some elements of X1, let's say:
CPU 0 --> X1(iEnd-iStart) 150 elements
CPU 1 --> X1(iEnd-iStart) 150 elements
Absolutely.
Quite a lot of the quantum chemistry codes here are Fortran, and most would
use Intel Fortran for performance. While some (VASP) might depend
on the Intel -heap-arrays switch being used with a small value, the default
setting for Intel Fortran is -no-heap-arrays "temporary arrays are
a
Hi Mike,
It sure is possible to tune for a particular code, especially if one aims at
getting the best performance numbers.
That's one thing; however, when a communication library (MXM) imposes limits
that might conflict with the limits of some applications, it's another.
Maintaining stacks of MPI
Hello Marcin,
What configure options are you using besides --with-libfabric?
Could you post your config.log file to the list?
It looks like fi_ext_usnic.h is only installed if the usnic libfabric
provider could be built. When you configured libfabric, what providers were
listed at the end of the configure run?
Dear all,
thanks for the explanation, but something is still not clear to me.
I have 4 CPUs. I use only three of them to send, let's say:
CPU 0 sends to CPU 1
CPU 1 sends to CPU 2
CPU 2 sends to CPU 3
and only three receive, let's say:
CPU 1 from CPU 0
CPU 2 from CPU 1
CPU 3 from CPU 2
so I use ALLOCATE(send_requ
I wrote:
> I'll try some variations like that when I can get complete nodes on the
> chassis.
It turns out that adding --mca mtl ^mxm to the 1.8 case gives results in
line with 1.6 as best as I can estimate the variation (error bars --
we've heard of them). It makes no difference to 1.6 whether
Hi,
I have built openmpi from the nightly snapshot v1.10.0-73-ge27ab85 and
everything seems to work fine.
Thanks a lot!
Fabrice
On 25/09/2015 07:38, Jeff Squyres (jsquyres) wrote:
Fabrice --
I have committed a fix to our development master; it is pending moving over to
the v1.10 and v2
Hello,
On 09/28/15 18:36, Grigory Shamov wrote:
The question is if we should do as MXM wants, or ignore it? Has anyone an
experience running recent OpenMPI with MXM enabled, and what kind of
ulimits do you have? Any suggestions/comments appreciated, thanks!
It should be sufficient to set the l
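For example (assuming a bash job script; 10240 KB is the default value the
MXM warning quoted earlier in the thread recommends), the limit can be
lowered only in the shell that launches the MPI job, so the rest of the
system is unaffected:

```shell
# Lower the soft stack limit to the recommended default (10240 KB)
# in the job script, before launching the MPI application.
ulimit -S -s 10240
# verify the new limit
ulimit -S -s
```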
Hi,
I am trying to compile the 2.x branch with libfabric support, but get
this error during configure:
configure:100708: checking rdma/fi_ext_usnic.h presence
configure:100708: gcc -E
-I/cluster/software/VERSIONS/openmpi.gnu.2.x/include
-I/usit/abel/u1/marcink/software/ompi-release-2.x/opal/
Put differently:
- You have an array of N requests
- If you're only filling up M of them (where N > M), set the remaining
entries to MPI_REQUEST_NULL before calling MPI_WAITALL
Diego,
if you invoke Isend three times with three different send_request() elements,
and the same thing for Irecv, then you do not have to worry about
MPI_REQUEST_NULL.
Based on your snippet, there could be an issue on ranks 0 and n-1;
also, the index of send_request is MPIdata%rank+1
if MPIdata%rank is MPI_Comm_ra
Mike Dubman writes:
> unfortunately, there is no one size fits all here.
>
> mxm provides best performance for IB.
>
> different application may require different OMPI, mxm, OS tunables and
> requires some performance engineering.
Fair enough, but is there any guidance on the MXM stuff, in parti
Nathan Hjelm writes:
> I would like to add that you may want to play with the value and see
> what works for your applications. Most applications should be using
> malloc or similar functions to allocate large memory regions in the heap
> and not on the stack.
It's long been a Fortran optimizati
Mike Dubman writes:
> Hello Grigory,
>
> We observed ~10% performance degradation with heap size set to unlimited
> for CFD applications.
OK, but why? It would help to understand what the mechanism is, and why
MXM specifically tells you to set the stack to the default, which may
well be wrong f
Mike Dubman writes:
> what is your command line and setup? (ofed version, distro)
It's on up-to-date SL6 (so using whatever RHEL6 ships) running the
commands below for the 1.6 and 1.8 cases respectively. The HCA is
reported as mlx4_0. Core binding is configured for 1.6. I think they
both had
Dear Gilles, Dear All,
What do you mean that the array of requests has to be initialized via
MPI_Isend or MPI_Irecv?
In my code I use MPI_Isend and MPI_Irecv three times, so I have
a send_request(3). According to this, do I have to use MPI_REQUEST_NULL?
In the meantime I will check my code.
Thanks
Diego
Dear Xing Feng,
there are different algorithms for implementing collective communication
patterns. Besides the general big-O complexity, the concrete complexity
also depends on the network topology, message length, etc.
Therefore many MPI implementations switch between different algorithms
depending on t
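To give a concrete (and simplified) example, in the usual latency-bandwidth
(Hockney) model with P processes, per-message latency a and per-byte cost b,
a linear broadcast of an m-byte message costs roughly

    (P - 1) * (a + m*b)

while a binomial-tree broadcast costs about

    ceil(log2 P) * (a + m*b)

so which algorithm wins depends on P and m, which is exactly why
implementations switch between them.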
Hi everyone,
I am working with Open MPI. When I tried to analyse the performance of my
programs, I found it hard to understand the communication complexity of
MPI routines.
I have found some page on Internet such as
http://stackoverflow.com/questions/10625643/mpi-communication-complexity
This i