Re: [OMPI users] OpenMPI 1.8.4 - Java Library - allToAllv()

2015-04-07 Thread Hamidreza Anvari
Hello,

If I set the size2 values according to your suggestion, which is the same
values as on sending nodes, it works fine.
But by definition it does not need to be exactly the same as the length of the
sent data; it is just the maximum length of data expected to be received. If
not, it is inevitable to run an allToAll() first to communicate the data
sizes and then do the main allToAllV(), which is expensive and unnecessary
communication overhead.

I just created a reproducer in C++ which gives the error under OpenMPI
1.8.4, but runs correctly under OpenMPI 1.5.4.
(I've not included the Java version of this reproducer, which I think is
not important, as the current version is enough to reproduce the error. In
any case, it is straightforward to convert this code to Java.)

Thanks,
-- HR

On Mon, Apr 6, 2015 at 3:03 PM, Ralph Castain  wrote:

> That would imply that the issue is in the underlying C implementation in
> OMPI, not the Java bindings. The reproducer would definitely help pin it
> down.
>
> If you change the size2 values to the ones we sent you, does the program
> by chance work?
>
>
> On Apr 6, 2015, at 1:44 PM, Hamidreza Anvari  wrote:
>
> I'll try that as well.
> Meanwhile, I found that my c++ code is running fine on a machine running
> OpenMPI 1.5.4, but I receive the same error under OpenMPI 1.8.4 for both
> Java and C++.
>
> On Mon, Apr 6, 2015 at 2:21 PM, Howard Pritchard 
> wrote:
>
>> Hello HR,
>>
>> Thanks!  If you have Java 1.7 installed on your system would you mind
>> trying to test against that version too?
>>
>> Thanks,
>>
>> Howard
>>
>>
>> 2015-04-06 13:09 GMT-06:00 Hamidreza Anvari :
>>
>>> Hello,
>>>
>>> 1. I'm using Java/Javac version 1.8.0_20 under OS X 10.10.2.
>>>
>>> 2. I have used the following configuration for making OpenMPI:
>>> ./configure --enable-mpi-java
>>> --with-jdk-bindir="/System/Library/Frameworks/JavaVM.framework/Versions/Current/Commands"
>>> --with-jdk-headers="/System/Library/Frameworks/JavaVM.framework/Versions/Current/Headers"
>>> --prefix="/users/hamidreza/openmpi-1.8.4"
>>>
>>> make all install
>>>
>>> 3. From a logical point of view, size2 is the maximum amount of data expected
>>> to be received; the actual amount received might be less than this maximum.
>>>
>>> 4. I will try to prepare a working reproducer of my error and send it to
>>> you.
>>>
>>> Thanks,
>>> -- HR
>>>
>>> On Mon, Apr 6, 2015 at 10:46 AM, Ralph Castain  wrote:
>>>
 I've talked to the folks who wrote the Java bindings. One possibility
 we identified is that there may be an error in your code when you did the
 translation.

 My immediate thought is that each process cannot receive more elements
 than were sent to it. That is the reason for the truncation error.

 These are the correct values:

 rank 0 - size2: 2,2,1,1
 rank 1 - size2: 1,1,1,1
 rank 2 - size2: 0,1,1,2
 rank 3 - size2: 2,1,2,1


 Can you check your code to see if perhaps the values you are passing
 didn't get translated correctly from your C++ version to the Java version?



 On Apr 6, 2015, at 5:03 AM, Howard Pritchard 
 wrote:

 Hello HR,

 It would also be useful to know which java version you are using, as
 well
 as the configure options used when building open mpi.

 Thanks,

 Howard



 2015-04-05 19:10 GMT-06:00 Ralph Castain :

> If not too much trouble, can you extract just the alltoallv portion
> and provide us with a small reproducer?
>
>
> On Apr 5, 2015, at 12:11 PM, Hamidreza Anvari 
> wrote:
>
> Hello,
>
> I am converting an existing MPI program in C++ to Java using OpenMPI
> 1.8.4.
> At some point I have an allToAllv() call which works fine in C++ but
> produces an error in the Java version:
>
> MPI.COMM_WORLD.allToAllv(data, subpartition_size, subpartition_offset,
> MPI.INT ,
> data2,subpartition_size2,subpartition_offset2,MPI.INT
> );
>
> Error:
> *** An error occurred in MPI_Alltoallv
> *** reported by process [3621322753,9223372036854775811]
> *** on communicator MPI_COMM_WORLD
> *** MPI_ERR_TRUNCATE: message truncated
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now
> abort,
> ***and potentially your MPI job)
> 3 more processes have sent help message help-mpi-errors.txt /
> mpi_errors_are_fatal
> Set MCA parameter "orte_base_help_aggregate" to 0 to see all help /
> error messages
>
> Here are the values for parameters:
>
> data.length = 5
> data2.length = 20
>
> -- Rank 0 of 4 --
> subpartition_offset:0,2,3,3,
> subpartition_size:2,1,0,2,
> subpartition_offset2:0,5,10,15,
> subpartition_size2:5,5,5,5,
> --
> -- Rank 1 of 4 --
> subpartition_offset:0,2,3,4,
> subpartition_size:2,1,1,1,
> subpartition_offset2:0,5,10,1

Re: [OMPI users] Different HCA from different OpenMP threads (same rank using MPI_THREAD_MULTIPLE)

2015-04-07 Thread Filippo Spiga
Thanks Rolf and Ralph for the replies!

On Apr 6, 2015, at 10:37 PM, Ralph Castain  wrote:
> That said, you can certainly have your application specify a thread-level 
> binding. You’d have to do the heavy lifting yourself in the app, I’m afraid, 
> instead of relying on us to do it for you

OK, my application must do it and I am fine with that. But how? I mean, does Open
MPI expose some API that allows such fine-grained control?

F

--
Mr. Filippo SPIGA, M.Sc.
http://filippospiga.info ~ skype: filippo.spiga

«Nobody will drive us out of Cantor's paradise.» ~ David Hilbert

*
Disclaimer: "Please note this message and any attachments are CONFIDENTIAL and 
may be privileged or otherwise protected from disclosure. The contents are not 
to be disclosed to anyone other than the addressee. Unauthorized recipients are 
requested to preserve this confidentiality and to advise the sender immediately 
of any error in transmission."




[OMPI users] http://www.open-mpi.org/doc/current/man3/MPI_Win_lock_all.3.php

2015-04-07 Thread Thomas Jahns

Hello,

I think the above web site lists the Fortran syntax section incorrectly as

INCLUDE ’mpif.h’
MPI_WIN_LOCK(ASSERT, WIN, IERROR)
INTEGER ASSERT, WIN, IERROR

when it should be

MPI_WIN_LOCK_ALL(ASSERT, WIN, IERROR)

instead.

Regards, Thomas





Re: [OMPI users] OpenMPI 1.8.4 - Java Library - allToAllv()

2015-04-07 Thread Howard Pritchard
Hi HR,

Sorry for not noticing the receive side earlier, but as Ralph implied earlier
in this thread, the MPI standard has stricter type matching for collectives
than for point-to-point.  Namely, the number of bytes the receiver expects
to receive from a given sender in the alltoallv must match the number of bytes
sent by the sender.

You were just getting lucky with the older Open MPI.  The error message
isn't so great, though.  It's likely that in the newer Open MPI you are using a
collective algorithm for alltoallv that assumes your app is obeying the
standard.

You are correct that if the ranks don't know how much data will be sent
to them from each rank prior to the alltoallv op, you will need to have some
mechanism for exchanging this info prior to the alltoallv op.

Howard


2015-04-06 23:23 GMT-06:00 Hamidreza Anvari :

> Hello,
>
> If I set the size2 values according to your suggestion, which is the same
> values as on sending nodes, it works fine.
> But by definition it does not need to be exactly the same as the length of the
> sent data; it is just the maximum length of data expected to be received. If
> not, it is inevitable to run an allToAll() first to communicate the data
> sizes and then do the main allToAllV(), which is expensive and unnecessary
> communication overhead.
>
> I just created a reproducer in C++ which gives the error under OpenMPI
> 1.8.4, but runs correctly under OpenMPI 1.5.4.
> (I've not included the Java version of this reproducer, which I think is
> not important, as the current version is enough to reproduce the error. In
> any case, it is straightforward to convert this code to Java.)
>
> Thanks,
> -- HR
>
> On Mon, Apr 6, 2015 at 3:03 PM, Ralph Castain  wrote:
>
>> That would imply that the issue is in the underlying C implementation in
>> OMPI, not the Java bindings. The reproducer would definitely help pin it
>> down.
>>
>> If you change the size2 values to the ones we sent you, does the program
>> by chance work?
>>
>>
>> On Apr 6, 2015, at 1:44 PM, Hamidreza Anvari  wrote:
>>
>> I'll try that as well.
>> Meanwhile, I found that my c++ code is running fine on a machine running
>> OpenMPI 1.5.4, but I receive the same error under OpenMPI 1.8.4 for both
>> Java and C++.
>>
>> On Mon, Apr 6, 2015 at 2:21 PM, Howard Pritchard 
>> wrote:
>>
>>> Hello HR,
>>>
>>> Thanks!  If you have Java 1.7 installed on your system would you mind
>>> trying to test against that version too?
>>>
>>> Thanks,
>>>
>>> Howard
>>>
>>>
>>> 2015-04-06 13:09 GMT-06:00 Hamidreza Anvari :
>>>
 Hello,

 1. I'm using Java/Javac version 1.8.0_20 under OS X 10.10.2.

 2. I have used the following configuration for making OpenMPI:
 ./configure --enable-mpi-java
 --with-jdk-bindir="/System/Library/Frameworks/JavaVM.framework/Versions/Current/Commands"
 --with-jdk-headers="/System/Library/Frameworks/JavaVM.framework/Versions/Current/Headers"
 --prefix="/users/hamidreza/openmpi-1.8.4"

 make all install

 3. From a logical point of view, size2 is the maximum amount of data expected to
 be received; the actual amount received might be less than this maximum.

 4. I will try to prepare a working reproducer of my error and send it
 to you.

 Thanks,
 -- HR

 On Mon, Apr 6, 2015 at 10:46 AM, Ralph Castain 
 wrote:

> I’ve talked to the folks who wrote the Java bindings. One possibility
> we identified is that there may be an error in your code when you did the
> translation.
>
> My immediate thought is that each process cannot receive more
> elements than were sent to it. That is the reason for the truncation error.
>
> These are the correct values:
>
> rank 0 - size2: 2,2,1,1
> rank 1 - size2: 1,1,1,1
> rank 2 - size2: 0,1,1,2
> rank 3 - size2: 2,1,2,1
>
>
> Can you check your code to see if perhaps the values you are passing
> didn’t get translated correctly from your C++ version to the Java version?
>
>
>
> On Apr 6, 2015, at 5:03 AM, Howard Pritchard 
> wrote:
>
> Hello HR,
>
> It would also be useful to know which java version you are using, as
> well
> as the configure options used when building open mpi.
>
> Thanks,
>
> Howard
>
>
>
> 2015-04-05 19:10 GMT-06:00 Ralph Castain :
>
>> If not too much trouble, can you extract just the alltoallv portion
>> and provide us with a small reproducer?
>>
>>
>> On Apr 5, 2015, at 12:11 PM, Hamidreza Anvari 
>> wrote:
>>
>> Hello,
>>
>> I am converting an existing MPI program in C++ to Java using OpenMPI
>> 1.8.4.
>> At some point I have an allToAllv() call which works fine in C++ but
>> produces an error in the Java version:
>>
>> MPI.COMM_WORLD.allToAllv(data, subpartition_size,
>> subpartition_offset, MPI.INT ,
>> data2,subpartition_size2,subpartition_offset2,MPI.INT
>> 

Re: [OMPI users] http://www.open-mpi.org/doc/current/man3/MPI_Win_lock_all.3.php

2015-04-07 Thread Howard Pritchard
Hi Thomas,

Thanks very much for pointing this out. Will fix shortly.

Howard
 On Apr 7, 2015 5:35 AM, "Thomas Jahns"  wrote:

> Hello,
>
> I think the above web site lists the Fortran syntax section incorrectly as
>
> INCLUDE ’mpif.h’
> MPI_WIN_LOCK(ASSERT, WIN, IERROR)
> INTEGER ASSERT, WIN, IERROR
>
> when it should be
>
> MPI_WIN_LOCK_ALL(ASSERT, WIN, IERROR)
>
> instead.
>
> Regards, Thomas
>
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/04/26633.php
>


Re: [OMPI users] Different HCA from different OpenMP threads (same rank using MPI_THREAD_MULTIPLE)

2015-04-07 Thread Rolf vandeVaart
I still do not believe there is a way for you to steer your traffic based on 
the thread that is calling into Open MPI. While you can spawn your own threads, 
Open MPI is going to figure out what interfaces to use based on the 
characteristics of the process during MPI_Init.  Even if Open MPI decides to 
use two interfaces, the use of these will be done based on the process.  It 
will alternate between them independently of which thread happens to be doing the 
sends or receives.  There is no way of doing this with something like 
MPI_T_cvar_write, which I think is what you were looking for.

Rolf  

>-Original Message-
>From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Filippo Spiga
>Sent: Tuesday, April 07, 2015 5:46 AM
>To: Open MPI Users
>Subject: Re: [OMPI users] Different HCA from different OpenMP threads
>(same rank using MPI_THREAD_MULTIPLE)
>
>Thanks Rolf and Ralph for the replies!
>
>On Apr 6, 2015, at 10:37 PM, Ralph Castain  wrote:
>> That said, you can certainly have your application specify a thread-level
>binding. You’d have to do the heavy lifting yourself in the app, I’m afraid,
>instead of relying on us to do it for you
>
>OK, my application must do it and I am fine with that. But how? I mean, does
>Open MPI expose some API that allows such fine-grained control?
>
>F
>
>--
>Mr. Filippo SPIGA, M.Sc.
>http://filippospiga.info ~ skype: filippo.spiga


---
This email message is for the sole use of the intended recipient(s) and may 
contain
confidential information.  Any unauthorized review, use, disclosure or 
distribution
is prohibited.  If you are not the intended recipient, please contact the 
sender by
reply email and destroy all copies of the original message.
---


Re: [OMPI users] OpenMPI 1.8.4 - Java Library - allToAllv()

2015-04-07 Thread Hamidreza Anvari
Hello,

Thanks for your description.
I'm currently doing an allToAll() prior to the allToAllV() to communicate the
lengths of the expected messages.
. 
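
A minimal sketch of this count-exchange pattern (not part of the original
message), using the Java binding calls shown earlier in the thread; variable
names are illustrative, and the six-argument allToAll() signature is assumed
to mirror the allToAllv() call quoted above:

import mpi.MPI;
import mpi.MPIException;

public class AllToAllvCounts {
    public static void main(String[] args) throws MPIException {
        MPI.Init(args);
        int nprocs = MPI.COMM_WORLD.getSize();
        int rank   = MPI.COMM_WORLD.getRank();

        // Arbitrary uneven send counts, just to make the example concrete.
        int[] sendCounts = new int[nprocs];
        int[] sendDispls = new int[nprocs];
        int totalSend = 0;
        for (int d = 0; d < nprocs; d++) {
            sendCounts[d] = (rank + d) % 3;
            sendDispls[d] = totalSend;
            totalSend += sendCounts[d];
        }
        int[] sendBuf = new int[totalSend];
        for (int i = 0; i < totalSend; i++) sendBuf[i] = rank;

        // Step 1: every rank tells every other rank how many ints to expect.
        int[] recvCounts = new int[nprocs];
        MPI.COMM_WORLD.allToAll(sendCounts, 1, MPI.INT, recvCounts, 1, MPI.INT);

        // Step 2: size the receive side exactly from the exchanged counts.
        int[] recvDispls = new int[nprocs];
        int totalRecv = 0;
        for (int s = 0; s < nprocs; s++) {
            recvDispls[s] = totalRecv;
            totalRecv += recvCounts[s];
        }
        int[] recvBuf = new int[totalRecv];

        // Step 3: the variable all-to-all with pairwise-matching counts.
        MPI.COMM_WORLD.allToAllv(sendBuf, sendCounts, sendDispls, MPI.INT,
                                 recvBuf, recvCounts, recvDispls, MPI.INT);

        MPI.Finalize();
    }
}
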
BUT, I still strongly believe that the right implementation of this method
is something that I expected earlier!
If you check the MPI specification here:

http://www.mpi-forum.org/docs/mpi-3.0/mpi30-report.pdf
Page 170
Line 14

It is mentioned that "... the number of elements that CAN be received ...",
which implies that the actual received message may be shorter.

While in cases where it is mandatory to have the same value, the modal "MUST"
is used; for example, at page 171, line 1, it is mentioned that "... sendtype
at process i MUST be equal to the type signature ...".

SO, I would expect any conforming implementation of the MPI specification to
handle this message-length matching by itself, as I asked originally.

Thanks,
-- HR

On Tue, Apr 7, 2015 at 6:03 AM, Howard Pritchard 
wrote:

> Hi HR,
>
> Sorry for not noticing the receive side earlier, but as Ralph implied earlier
> in this thread, the MPI standard has stricter type matching for collectives
> than for point-to-point.  Namely, the number of bytes the receiver expects
> to receive from a given sender in the alltoallv must match the number of bytes
> sent by the sender.
>
> You were just getting lucky with the older Open MPI.  The error message
> isn't so great, though.  It's likely that in the newer Open MPI you are using a
> collective algorithm for alltoallv that assumes your app is obeying the
> standard.
>
> You are correct that if the ranks don't know how much data will be sent
> to them from each rank prior to the alltoallv op, you will need to have
> some
> mechanism for exchanging this info prior to the alltoallv op.
>
> Howard
>
>
> 2015-04-06 23:23 GMT-06:00 Hamidreza Anvari :
>
>> Hello,
>>
>> If I set the size2 values according to your suggestion, which is the same
>> values as on sending nodes, it works fine.
>> But by definition it does not need to be exactly the same as the length
>> of the sent data; it is just the maximum length of data expected to be received.
>> If not, it is inevitable to run an allToAll() first to communicate the data
>> sizes and then do the main allToAllV(), which is expensive and unnecessary
>> communication overhead.
>>
>> I just created a reproducer in C++ which gives the error under OpenMPI
>> 1.8.4, but runs correctly under OpenMPI 1.5.4.
>> (I've not included the Java version of this reproducer, which I think is
>> not important, as the current version is enough to reproduce the error. In
>> any case, it is straightforward to convert this code to Java.)
>>
>> Thanks,
>> -- HR
>>
>> On Mon, Apr 6, 2015 at 3:03 PM, Ralph Castain  wrote:
>>
>>> That would imply that the issue is in the underlying C implementation in
>>> OMPI, not the Java bindings. The reproducer would definitely help pin it
>>> down.
>>>
>>> If you change the size2 values to the ones we sent you, does the program
>>> by chance work?
>>>
>>>
>>> On Apr 6, 2015, at 1:44 PM, Hamidreza Anvari 
>>> wrote:
>>>
>>> I'll try that as well.
>>> Meanwhile, I found that my c++ code is running fine on a machine running
>>> OpenMPI 1.5.4, but I receive the same error under OpenMPI 1.8.4 for both
>>> Java and C++.
>>>
>>> On Mon, Apr 6, 2015 at 2:21 PM, Howard Pritchard 
>>> wrote:
>>>
 Hello HR,

 Thanks!  If you have Java 1.7 installed on your system would you mind
 trying to test against that version too?

 Thanks,

 Howard


 2015-04-06 13:09 GMT-06:00 Hamidreza Anvari :

> Hello,
>
> 1. I'm using Java/Javac version 1.8.0_20 under OS X 10.10.2.
>
> 2. I have used the following configuration for making OpenMPI:
> ./configure --enable-mpi-java
> --with-jdk-bindir="/System/Library/Frameworks/JavaVM.framework/Versions/Current/Commands"
> --with-jdk-headers="/System/Library/Frameworks/JavaVM.framework/Versions/Current/Headers"
> --prefix="/users/hamidreza/openmpi-1.8.4"
>
> make all install
>
> 3. From a logical point of view, size2 is the maximum amount of data expected to
> be received; the actual amount received might be less than this maximum.
>
> 4. I will try to prepare a working reproducer of my error and send it
> to you.
>
> Thanks,
> -- HR
>
> On Mon, Apr 6, 2015 at 10:46 AM, Ralph Castain 
> wrote:
>
>> I've talked to the folks who wrote the Java bindings. One possibility
>> we identified is that there may be an error in your code when you did the
>> translation.
>>
>> My immediate thought is that each process cannot receive more
>> elements than were sent to it. That is the reason for the truncation error.
>>
>> These are the correct values:
>>
>> rank 0 - size2: 2,2,1,1
>> rank 1 - size2: 1,1,1,1
>> rank 2 - size2: 0,1,1,2
>> rank 3 - size2: 2,1,2,1
>>
>>
>> Can you check your code to see if perhaps the values you are passing
>

Re: [OMPI users] OpenMPI 1.8.4 - Java Library - allToAllv()

2015-04-07 Thread Ralph Castain
I’m afraid we’ll have to get someone from the Forum to interpret (Howard is a 
member as well), but here is what I see just below that, in the description 
section:

The type signature associated with sendcounts[j], sendtype at process i must be 
equal to the type signature associated with recvcounts[i], recvtype at process 
j. This implies that the amount of data sent must be equal to the amount of 
data received, pairwise between every pair of processes.
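
A small worked check of this rule (not part of the original message), using the
counts posted earlier in the thread; the send-side rows for ranks 2 and 3 are
the ones implied by the size2 values above, not taken from the original program:

public class CountMatching {
    public static void main(String[] args) {
        int[][] sendCounts = {   // sendCounts[rank][dest]
            {2, 1, 0, 2},        // rank 0 (from the original post)
            {2, 1, 1, 1},        // rank 1 (from the original post)
            {1, 1, 1, 2},        // rank 2 (implied by the size2 values)
            {1, 1, 2, 1},        // rank 3 (implied by the size2 values)
        };
        int[][] recvCounts = {   // recvCounts[rank][src] == the size2 values listed above
            {2, 2, 1, 1},
            {1, 1, 1, 1},
            {0, 1, 1, 2},
            {2, 1, 2, 1},
        };
        // The rule quoted from the standard: what rank i receives from rank j
        // must equal what rank j sends to i, i.e. recvCounts is the transpose
        // of sendCounts.
        for (int i = 0; i < 4; i++) {
            for (int j = 0; j < 4; j++) {
                if (recvCounts[i][j] != sendCounts[j][i]) {
                    throw new IllegalStateException("mismatch at i=" + i + ", j=" + j);
                }
            }
        }
        System.out.println("recvcounts is the transpose of sendcounts: counts match pairwise");
    }
}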


> On Apr 7, 2015, at 9:56 AM, Hamidreza Anvari  wrote:
> 
> Hello,
> 
> Thanks for your description.
> I'm currently doing an allToAll() prior to the allToAllV() to communicate the 
> lengths of the expected messages.
> .
> BUT, I still strongly believe that the right implementation of this method is 
> something that I expected earlier!
> If you check the MPI specification here:
> 
> http://www.mpi-forum.org/docs/mpi-3.0/mpi30-report.pdf 
> 
> Page 170
> Line 14
> 
> It is mentioned that "... the number of elements that CAN be received ...", 
> which implies that the actual received message may be shorter.
> 
> While in cases where it is mandatory to have the same value, the modal "MUST" is 
> used; for example, at page 171, line 1, it is mentioned that "... sendtype at 
> process i MUST be equal to the type signature ...".
> 
> SO, I would expect any conforming implementation of the MPI specification to 
> handle this message-length matching by itself, as I asked originally.
> 
> Thanks,
> -- HR
> 
> On Tue, Apr 7, 2015 at 6:03 AM, Howard Pritchard  > wrote:
> Hi HR,
> 
> Sorry for not noticing the receive side earlier, but as Ralph implied earlier
> in this thread, the MPI standard has stricter type matching for collectives
> than for point-to-point.  Namely, the number of bytes the receiver expects
> to receive from a given sender in the alltoallv must match the number of bytes
> sent by the sender.
> 
> You were just getting lucky with the older Open MPI.  The error message
> isn't so great, though.  It's likely that in the newer Open MPI you are using a
> collective algorithm for alltoallv that assumes your app is obeying the
> standard.
> 
> You are correct that if the ranks don't know how much data will be sent
> to them from each rank prior to the alltoallv op, you will need to have some
> mechanism for exchanging this info prior to the alltoallv op.
> 
> Howard
> 
> 
> 2015-04-06 23:23 GMT-06:00 Hamidreza Anvari  >:
> Hello,
> 
> If I set the size2 values according to your suggestion, which is the same 
> values as on sending nodes, it works fine.
> But by definition it does not need to be exactly the same as the length of the 
> sent data; it is just the maximum length of data expected to be received. If 
> not, it is inevitable to run an allToAll() first to communicate the data 
> sizes and then do the main allToAllV(), which is expensive and unnecessary 
> communication overhead.
> 
> I just created a reproducer in C++ which gives the error under OpenMPI 1.8.4, 
> but runs correctly under OpenMPI 1.5.4.
> (I've not included the Java version of this reproducer, which I think is not 
> important, as the current version is enough to reproduce the error. In any 
> case, it is straightforward to convert this code to Java.)
> 
> Thanks,
> -- HR
> 
> On Mon, Apr 6, 2015 at 3:03 PM, Ralph Castain  > wrote:
> That would imply that the issue is in the underlying C implementation in 
> OMPI, not the Java bindings. The reproducer would definitely help pin it down.
> 
> If you change the size2 values to the ones we sent you, does the program by 
> chance work?
> 
> 
>> On Apr 6, 2015, at 1:44 PM, Hamidreza Anvari > > wrote:
>> 
>> I'll try that as well.
>> Meanwhile, I found that my c++ code is running fine on a machine running 
>> OpenMPI 1.5.4, but I receive the same error under OpenMPI 1.8.4 for both 
>> Java and C++.
>> 
>> On Mon, Apr 6, 2015 at 2:21 PM, Howard Pritchard > > wrote:
>> Hello HR,
>> 
>> Thanks!  If you have Java 1.7 installed on your system would you mind trying 
>> to test against that version too?
>> 
>> Thanks,
>> 
>> Howard
>> 
>> 
>> 2015-04-06 13:09 GMT-06:00 Hamidreza Anvari > >:
>> Hello,
>> 
>> 1. I'm using Java/Javac version 1.8.0_20 under OS X 10.10.2.
>> 
>> 2. I have used the following configuration for making OpenMPI:
>> ./configure --enable-mpi-java 
>> --with-jdk-bindir="/System/Library/Frameworks/JavaVM.framework/Versions/Current/Commands"
>>  
>> --with-jdk-headers="/System/Library/Frameworks/JavaVM.framework/Versions/Current/Headers"
>>  --prefix="/users/hamidreza/openmpi-1.8.4"
>> 
>> make all install
>> 
>> 3. From a logical point of view, size2 is the maximum amount of data expected to 
>> be received; the actual amount received might be less than this maximum.
>> 
>> 4. I will try to prepare a working reproducer of my error and send it to you

Re: [OMPI users] Different HCA from different OpenMP threads (same rank using MPI_THREAD_MULTIPLE)

2015-04-07 Thread Abdul Rahman Riza
how to unsubscribe?

On Mon, Apr 6, 2015 at 4:45 PM, Filippo Spiga 
wrote:

> Dear Open MPI developers,
>
> I wonder if there is a way to address this particular scenario using MPI_T
> or other strategies in Open MPI. I saw a similar discussion few days ago, I
> assume the same challenges are applied in this case but I just want to
> check. Here is the scenario:
>
> We have a system composed of dual-rail Mellanox IB: two distinct
> Connect-IB cards per node, each one sitting on a different PCI-E lane out of
> two distinct sockets. We are seeking a way to control MPI traffic through
> each one of them directly from within the application. Specifically, we have a
> single MPI rank per node that goes multi-threaded using OpenMP.
> MPI_THREAD_MULTIPLE is used, and each OpenMP thread may initiate MPI
> communication. We would like to assign IB-0 to thread 0 and IB-1 to thread
> 1.
>
> Via mpirun or env variables we can control which IB interface to use by
> binding it to a specific MPI rank (or by applying a policy that relates IB cards
> to MPI ranks). But if there is only one MPI rank active, how can we
> differentiate the traffic across multiple IB cards?
>
> Thanks in advance for any suggestion about this matter.
>
> Regards,
> Filippo
>
> --
> Mr. Filippo SPIGA, M.Sc.
> http://filippospiga.info ~ skype: filippo.spiga
>
> «Nobody will drive us out of Cantor's paradise.» ~ David Hilbert
>
> *
> Disclaimer: "Please note this message and any attachments are CONFIDENTIAL
> and may be privileged or otherwise protected from disclosure. The contents
> are not to be disclosed to anyone other than the addressee. Unauthorized
> recipients are requested to preserve this confidentiality and to advise the
> sender immediately of any error in transmission."
>
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/04/26614.php


Re: [OMPI users] Different HCA from different OpenMP threads (same rank using MPI_THREAD_MULTIPLE)

2015-04-07 Thread Abdul Rahman Riza
how to unsubscribe?

On Tue, Apr 7, 2015 at 4:45 PM, Filippo Spiga 
wrote:

> Thanks Rolf and Ralph for the replies!
>
> On Apr 6, 2015, at 10:37 PM, Ralph Castain  wrote:
> > That said, you can certainly have your application specify a
> thread-level binding. You’d have to do the heavy lifting yourself in the
> app, I’m afraid, instead of relying on us to do it for you
>
> OK, my application must do it and I am fine with that. But how? I mean, does
> Open MPI expose some API that allows such fine-grained control?
>
> F
>
> --
> Mr. Filippo SPIGA, M.Sc.
> http://filippospiga.info ~ skype: filippo.spiga
>
> «Nobody will drive us out of Cantor's paradise.» ~ David Hilbert
>
> *
> Disclaimer: "Please note this message and any attachments are CONFIDENTIAL
> and may be privileged or otherwise protected from disclosure. The contents
> are not to be disclosed to anyone other than the addressee. Unauthorized
> recipients are requested to preserve this confidentiality and to advise the
> sender immediately of any error in transmission."
>
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/04/26632.php


Re: [OMPI users] Different HCA from different OpenMP threads (same rank using MPI_THREAD_MULTIPLE)

2015-04-07 Thread Ralph Castain
Easiest way is to follow the link at the bottom of the message:

http://www.open-mpi.org/mailman/listinfo.cgi/users 



> On Apr 7, 2015, at 10:39 AM, Abdul Rahman Riza  wrote:
> 
> how to unsubscribe?
> 
> On Tue, Apr 7, 2015 at 4:45 PM, Filippo Spiga  > wrote:
> Thanks Rolf and Ralph for the replies!
> 
> On Apr 6, 2015, at 10:37 PM, Ralph Castain  > wrote:
> > That said, you can certainly have your application specify a thread-level 
> > binding. You’d have to do the heavy lifting yourself in the app, I’m 
> > afraid, instead of relying on us to do it for you
> 
> OK, my application must do it and I am fine with that. But how? I mean, does 
> Open MPI expose some API that allows such fine-grained control?
> 
> F
> 
> --
> Mr. Filippo SPIGA, M.Sc.
> http://filippospiga.info  ~ skype: filippo.spiga
> 
> «Nobody will drive us out of Cantor's paradise.» ~ David Hilbert
> 
> *
> Disclaimer: "Please note this message and any attachments are CONFIDENTIAL 
> and may be privileged or otherwise protected from disclosure. The contents 
> are not to be disclosed to anyone other than the addressee. Unauthorized 
> recipients are requested to preserve this confidentiality and to advise the 
> sender immediately of any error in transmission."
> 
> 
> ___
> users mailing list
> us...@open-mpi.org 
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users 
> 
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2015/04/26632.php 
> 
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2015/04/26640.php



Re: [OMPI users] 1.8.4 behaves completely different from 1.6.5

2015-04-07 Thread Thomas Klimpel
Here is a stackdump from inside the debugger (because it gives filenames
and line numbers):

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f1eb6bfd700 (LWP 24847)]
0x00366aa79252 in _int_malloc () from /lib64/libc.so.6
(gdb) bt
#0  0x00366aa79252 in _int_malloc () from /lib64/libc.so.6
#1  0x00366aa7b7da in _int_realloc () from /lib64/libc.so.6
#2  0x00366aa7baf5 in realloc () from /lib64/libc.so.6
#3  0x7f1ee005d0a8 in epoll_dispatch (base=,
arg=0x13d1310, tv=)
at ../../../../../package/openmpi-1.6.5/opal/event/epoll.c:271
#4  0x7f1ee005f1cf in opal_event_base_loop (base=0x13d1e50,
flags=)
at ../../../../../package/openmpi-1.6.5/opal/event/event.c:838
#5  0x7f1ee00842f9 in opal_progress () at
../../../../package/openmpi-1.6.5/opal/runtime/opal_progress.c:189
#6  0x7f1ecd43cd7f in mca_pml_ob1_iprobe (src=,
tag=-1, comm=0x164dd40, matched=0x7f1eb6bfb8ac, status=0x7f1eb6bfb8b0)
at
../../../../../../../package/openmpi-1.6.5/ompi/mca/pml/ob1/pml_ob1_iprobe.c:48
#7  0x7f1edffe3427 in PMPI_Iprobe (source=227, tag=-1, comm=0x164dd40,
flag=, status=)
at piprobe.c:79
#8  0x7f1eebb518e7 in OMPIConnection::Receive (this=0x13c7950,
rMessage_p=std::vector of length 0, capacity 0,
rMessageId_p=@0x7f1eb6bfc26c, NodeId_p=227)


Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3

2015-04-07 Thread Lane, William
Ralph,

I've finally had some luck using the following:
$MPI_DIR/bin/mpirun -np $NSLOTS --report-bindings --hostfile hostfile-single 
--mca btl_tcp_if_include eth0 --hetero-nodes --use-hwthread-cpus --prefix 
$MPI_DIR $BENCH_DIR/$APP_DIR/$APP_BIN

Where $NSLOTS was 56 and my hostfile hostfile-single is:

csclprd3-0-0 slots=12 max-slots=24
csclprd3-0-1 slots=6 max-slots=12
csclprd3-0-2 slots=6 max-slots=12
csclprd3-0-3 slots=6 max-slots=12
csclprd3-0-4 slots=6 max-slots=12
csclprd3-0-5 slots=6 max-slots=12
csclprd3-0-6 slots=6 max-slots=12
csclprd3-6-1 slots=4 max-slots=4
csclprd3-6-5 slots=4 max-slots=4

The max-slots differs from slots on some nodes
because I include the hyperthreaded cores in
the max-slots; the last two nodes have CPUs that
don't support hyperthreading at all.

Does --use-hwthread-cpus prevent slots from
being assigned to hyperthreading cores?

For some reason the manpage for OpenMPI 1.8.2
isn't installed on our CentOS 6.3 systems. Is there a
URL where I can find a copy of the manpages for OpenMPI 1.8.2?

Thanks for your help,

-Bill Lane


From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain 
[r...@open-mpi.org]
Sent: Monday, April 06, 2015 1:39 PM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3

Hmmm…well, that shouldn’t be the issue. To check, try running it with “bind-to 
none”. If you can get a backtrace telling us where it is crashing, that would 
also help.


On Apr 6, 2015, at 12:24 PM, Lane, William  wrote:

Ralph,

For the following two different commandline invocations of the LAPACK benchmark

$MPI_DIR/bin/mpirun -np $NSLOTS --report-bindings --hostfile hostfile-no_slots 
--mca btl_tcp_if_include eth0 --hetero-nodes --use-hwthread-cpus --bind-to 
hwthread --prefix $MPI_DIR $BENCH_DIR/$APP_DIR/$APP_BIN

$MPI_DIR/bin/mpirun -np $NSLOTS --report-bindings --hostfile hostfile-no_slots 
--mca btl_tcp_if_include eth0 --hetero-nodes --bind-to-core --prefix $MPI_DIR 
$BENCH_DIR/$APP_DIR/$APP_BIN

I'm receiving the same kinds of OpenMPI error messages (but for different nodes 
in the ring):

[csclprd3-0-16:25940] *** Process received signal ***
[csclprd3-0-16:25940] Signal: Bus error (7)
[csclprd3-0-16:25940] Signal code: Non-existant physical address (2)
[csclprd3-0-16:25940] Failing at address: 0x7f8b1b5a2600


--
mpirun noticed that process rank 82 with PID 25936 on node 
csclprd3-0-16 exited on signal 7 (Bus error).

--
16 total processes killed (some possibly by mpirun during cleanup)

It seems to occur on systems that have more than one physical CPU installed. Could
this be due to a lack of the correct NUMA libraries being installed?

-Bill L.


From: users [users-boun...@open-mpi.org] on 
behalf of Ralph Castain [r...@open-mpi.org]
Sent: Sunday, April 05, 2015 6:09 PM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3


On Apr 5, 2015, at 5:58 PM, Lane, William  wrote:

I think some of the Intel Blade systems in the cluster are
dual core, but don't support hyperthreading. Maybe it
would be better to exclude hyperthreading altogether
from submitted OpenMPI jobs?

Yes - or you can add "--hetero-nodes -use-hwthread-cpus --bind-to hwthread" to 
the cmd line. This tells mpirun that the nodes aren't all the same, and so it 
has to look at each node's topology instead of taking the first node as the 
template for everything. The second tells it to use the HTs as independent cpus 
where they are supported.

I'm not entirely sure the suggestion will work - if we hit a place where HT 
isn't supported, we may balk at being asked to bind to HTs. I can probably make 
a change that supports this kind of hetero arrangement (perhaps something like 
bind-to pu) - might make it into 1.8.5 (we are just starting the release 
process on it now).


OpenMPI doesn't crash, but it doesn't run the LAPACK
benchmark either.

Thanks again Ralph.

Bill L.


From: users [users-boun...@open-mpi.org] on 
behalf of Ralph Castain [r...@open-mpi.org]
Sent: Wednesday, April 01, 2015 8:40 AM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3

Bingo - you said the magic word. This is a terminology issue. When we say 
"core", we mean the old definition of "core", not "hyperthreads". If you want 
to use HTs as your base processing unit and bind to them, then you need to 
specify --bind-to hwthread. That warning should then go away.

We don't require a swap region be mounted - I didn't see anything in your 
original message indicating 

Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3

2015-04-07 Thread Ralph Castain
I’m not sure our man pages are good enough to answer your question, but here is 
the URL

http://www.open-mpi.org/doc/v1.8/ 

I’m a tad tied up right now, but I’ll try to address this prior to 1.8.5 
release. Thanks for all that debug effort! Helps a bunch.

> On Apr 7, 2015, at 1:17 PM, Lane, William  wrote:
> 
> Ralph,
> 
> I've finally had some luck using the following:
> $MPI_DIR/bin/mpirun -np $NSLOTS --report-bindings --hostfile hostfile-single 
> --mca btl_tcp_if_include eth0 --hetero-nodes --use-hwthread-cpus --prefix 
> $MPI_DIR $BENCH_DIR/$APP_DIR/$APP_BIN
> 
> Where $NSLOTS was 56 and my hostfile hostfile-single is:
> 
> csclprd3-0-0 slots=12 max-slots=24
> csclprd3-0-1 slots=6 max-slots=12
> csclprd3-0-2 slots=6 max-slots=12
> csclprd3-0-3 slots=6 max-slots=12
> csclprd3-0-4 slots=6 max-slots=12
> csclprd3-0-5 slots=6 max-slots=12
> csclprd3-0-6 slots=6 max-slots=12
> csclprd3-6-1 slots=4 max-slots=4
> csclprd3-6-5 slots=4 max-slots=4
> 
> The max-slots differs from slots on some nodes
> because I include the hyperthreaded cores in
> the max-slots; the last two nodes have CPUs that
> don't support hyperthreading at all.
> 
> Does --use-hwthread-cpus prevent slots from
> being assigned to hyperthreading cores?
> 
> For some reason the manpage for OpenMPI 1.8.2
> isn't installed on our CentOS 6.3 systems. Is there a
> URL where I can find a copy of the manpages for OpenMPI 1.8.2?
> 
> Thanks for your help,
> 
> -Bill Lane
> 
> From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain 
> [r...@open-mpi.org]
> Sent: Monday, April 06, 2015 1:39 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3
> 
> Hmmm…well, that shouldn’t be the issue. To check, try running it with 
> “bind-to none”. If you can get a backtrace telling us where it is crashing, 
> that would also help.
> 
> 
>> On Apr 6, 2015, at 12:24 PM, Lane, William > > wrote:
>> 
>> Ralph,
>> 
>> For the following two different commandline invocations of the LAPACK 
>> benchmark
>> 
>> $MPI_DIR/bin/mpirun -np $NSLOTS --report-bindings --hostfile 
>> hostfile-no_slots --mca btl_tcp_if_include eth0 --hetero-nodes 
>> --use-hwthread-cpus --bind-to hwthread --prefix $MPI_DIR 
>> $BENCH_DIR/$APP_DIR/$APP_BIN
>> 
>> $MPI_DIR/bin/mpirun -np $NSLOTS --report-bindings --hostfile 
>> hostfile-no_slots --mca btl_tcp_if_include eth0 --hetero-nodes 
>> --bind-to-core --prefix $MPI_DIR $BENCH_DIR/$APP_DIR/$APP_BIN
>> 
>> I'm receiving the same kinds of OpenMPI error messages (but for different 
>> nodes in the ring):
>> 
>> [csclprd3-0-16:25940] *** Process received signal ***
>> [csclprd3-0-16:25940] Signal: Bus error (7)
>> [csclprd3-0-16:25940] Signal code: Non-existant physical address (2)
>> [csclprd3-0-16:25940] Failing at address: 0x7f8b1b5a2600
>> 
>> 
>> --
>> mpirun noticed that process rank 82 with PID 25936 on node 
>> csclprd3-0-16 exited on signal 7 (Bus error).
>> 
>> --
>> 16 total processes killed (some possibly by mpirun during cleanup)
>> 
>> It seems to occur on systems that have more than one physical CPU 
>> installed. Could
>> this be due to a lack of the correct NUMA libraries being installed?
>> 
>> -Bill L.
>> 
>> From: users [users-boun...@open-mpi.org ] 
>> on behalf of Ralph Castain [r...@open-mpi.org ]
>> Sent: Sunday, April 05, 2015 6:09 PM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3
>> 
>> 
>>> On Apr 5, 2015, at 5:58 PM, Lane, William >> > wrote:
>>> 
>>> I think some of the Intel Blade systems in the cluster are
>>> dual core, but don't support hyperthreading. Maybe it
>>> would be better to exclude hyperthreading altogether
>>> from submitted OpenMPI jobs?
>> 
>> Yes - or you can add "--hetero-nodes -use-hwthread-cpus --bind-to hwthread" 
>> to the cmd line. This tells mpirun that the nodes aren't all the same, and 
>> so it has to look at each node's topology instead of taking the first node 
>> as the template for everything. The second tells it to use the HTs as 
>> independent cpus where they are supported.
>> 
>> I'm not entirely sure the suggestion will work - if we hit a place where HT 
>> isn't supported, we may balk at being asked to bind to HTs. I can probably 
>> make a change that supports this kind of hetero arrangement (perhaps 
>> something like bind-to pu) - might make it into 1.8.5 (we are just starting 
>> the release process on it now).
>> 
>>> 
>>> OpenMPI doesn't crash, but it doesn't run the LAPACK
>>> benchmark either.
>>> 
>>> Thanks again Ralph.
>>> 
>>> Bill L.
>>> 
>>> From: users [users-boun...@open-mpi.org 
>>> ] on

[OMPI users] 1.8.3 executable with 1.8.4 mpirun/orted?

2015-04-07 Thread Alan Wild
I know this isn't "recommended", but a vendor recently gave me an executable
compiled against openmpi-1.8.3, and I happened to have recently completed a build of
1.8.4 (but didn't have 1.8.3 sitting around, and the vendor refuses to
provide his build).

Since these releases are so close they should be ABI compatible so I
thought I would see what happens...

[arwild1@hplcslsp2 ~]$ mpirun -n 2 -H localhost vendor_app_mpi
[hplcslsp2:11394] [[56032,0],0] tcp_peer_recv_connect_ack: received
different version from [[56032,1],0]: 1.8.3 instead of 1.8.4
[hplcslsp2:11394] [[56032,0],0] tcp_peer_recv_connect_ack: received
different version from [[56032,1],1]: 1.8.3 instead of 1.8.4


and then everything hangs.  I can clearly see the output coming from

./orte/mca/oob/tcp/oob_tcp_connection.c


and where it returns

return ORTE_ERR_CONNECTION_REFUSED;



So it looks like I'm going to have to at least build 1.8.3, but is there
any way to work around this given we are dealing with builds that are that
close?  I'm really not interested in "rolling back" to 1.8.3 or providing
both releases on my system.

(yes, the "right answer" is to get the vendor to provide his build... long story)

-Alan



-- 
a...@madllama.net http://humbleville.blogspot.com