[OMPI users] Different HCA from different OpenMP threads (same rank using MPI_THREAD_MULTIPLE)

2015-04-06 Thread Filippo Spiga
Dear Open MPI developers,

I wonder if there is a way to address this particular scenario using MPI_T or 
other strategies in Open MPI. I saw a similar discussion a few days ago, and I assume 
the same challenges apply in this case, but I just want to check. Here is 
the scenario:

We have a system composed of dual-rail Mellanox IB: two distinct Connect-IB 
cards per node, each sitting on a different PCI-E lane off one of two distinct 
sockets. We are seeking a way to control MPI traffic through each one of them 
directly from the application. Specifically, we have a single MPI rank per node 
that goes multi-threaded using OpenMP. MPI_THREAD_MULTIPLE is used, and each 
OpenMP thread may initiate MPI communication. We would like to assign IB-0 to 
thread 0 and IB-1 to thread 1.

Via mpirun or environment variables we can control which IB interface to use by 
binding it to a specific MPI rank (or by applying a policy that relates IB cards 
to MPI ranks). But if there is only one MPI rank active, how can we differentiate 
the traffic across multiple IB cards?

Thanks in advance for any suggestion about this matter.

Regards,
Filippo

--
Mr. Filippo SPIGA, M.Sc.
http://filippospiga.info ~ skype: filippo.spiga

«Nobody will drive us out of Cantor's paradise.» ~ David Hilbert

*
Disclaimer: "Please note this message and any attachments are CONFIDENTIAL and 
may be privileged or otherwise protected from disclosure. The contents are not 
to be disclosed to anyone other than the addressee. Unauthorized recipients are 
requested to preserve this confidentiality and to advise the sender immediately 
of any error in transmission."




Re: [OMPI users] OpenMPI 1.8.4 - Java Library - allToAllv()

2015-04-06 Thread Howard Pritchard
Hello HR,

It would also be useful to know which java version you are using, as well
as the configure options used when building open mpi.

Thanks,

Howard



2015-04-05 19:10 GMT-06:00 Ralph Castain :

> If not too much trouble, can you extract just the alltoallv portion and
> provide us with a small reproducer?
>
>
> On Apr 5, 2015, at 12:11 PM, Hamidreza Anvari  wrote:
>
> Hello,
>
> I am converting an existing MPI program in C++ to Java using OpenMPI 1.8.4,
> At some point I have a allToAllv() code which works fine in C++ but
> receives error in Java version:
>
> MPI.COMM_WORLD.allToAllv(data, subpartition_size, subpartition_offset,
> MPI.INT ,
> data2,subpartition_size2,subpartition_offset2,MPI.INT );
>
> Error:
> *** An error occurred in MPI_Alltoallv
> *** reported by process [3621322753,9223372036854775811]
> *** on communicator MPI_COMM_WORLD
> *** MPI_ERR_TRUNCATE: message truncated
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> ***and potentially your MPI job)
> 3 more processes have sent help message help-mpi-errors.txt /
> mpi_errors_are_fatal
> Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error
> messages
>
> Here are the values for parameters:
>
> data.length = 5
> data2.length = 20
>
> -- Rank 0 of 4 --
> subpartition_offset:0,2,3,3,
> subpartition_size:2,1,0,2,
> subpartition_offset2:0,5,10,15,
> subpartition_size2:5,5,5,5,
> --
> -- Rank 1 of 4 --
> subpartition_offset:0,2,3,4,
> subpartition_size:2,1,1,1,
> subpartition_offset2:0,5,10,15,
> subpartition_size2:5,5,5,5,
> --
> -- Rank 2 of 4 --
> subpartition_offset:0,1,2,3,
> subpartition_size:1,1,1,2,
> subpartition_offset2:0,5,10,15,
> subpartition_size2:5,5,5,5,
> --
> -- Rank 3 of 4 --
> subpartition_offset:0,1,2,4,
> subpartition_size:1,1,2,1,
> subpartition_offset2:0,5,10,15,
> subpartition_size2:5,5,5,5,
> --
>
> Again, this is a code which works in C++ version.
>
> Any help or advice is greatly appreciated.
>
> Thanks,
> -- HR
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/04/26610.php
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/04/26613.php
>


Re: [OMPI users] OpenMPI 1.8.4 - Java Library - allToAllv()

2015-04-06 Thread Ralph Castain
I’ve talked to the folks who wrote the Java bindings. One possibility we 
identified is that there may be an error in your code when you did the 
translation:

> My immediate thought is that each process cannot receive more elements than 
> were sent to it. That's the reason for the truncation error.
> 
> These are the correct values:
> 
> rank 0 - size2: 2,2,1,1
> rank 1 - size2: 1,1,1,1
> rank 2 - size2: 0,1,1,2
> rank 3 - size2: 2,1,2,1

Can you check your code to see if perhaps the values you are passing didn’t get 
translated correctly from your C++ version to the Java version?
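
In other words, each rank's receive counts have to match what the other ranks 
actually send to it: rank 0 should receive 2,2,1,1 (the amounts ranks 0-3 send 
to rank 0), not 5,5,5,5. If you don't want to compute those by hand, one common 
pattern is to exchange the send counts first and build the receive counts and 
displacements from them. A rough, untested sketch against the C API (the 
variable names are borrowed from your mail; the rest is purely illustrative):

#include <mpi.h>
#include <vector>

// Illustrative only: derive the receive counts/displacements for MPI_Alltoallv
// by first telling every rank how much it will get from every other rank.
void alltoallv_with_exact_counts(std::vector<int>& data,
                                 std::vector<int>& subpartition_size,
                                 std::vector<int>& subpartition_offset,
                                 MPI_Comm comm)
{
    int nprocs;
    MPI_Comm_size(comm, &nprocs);

    // subpartition_size2[i] = number of ints rank i will send to me.
    std::vector<int> subpartition_size2(nprocs);
    MPI_Alltoall(subpartition_size.data(), 1, MPI_INT,
                 subpartition_size2.data(), 1, MPI_INT, comm);

    // Receive displacements are the prefix sums of the receive counts.
    std::vector<int> subpartition_offset2(nprocs, 0);
    for (int i = 1; i < nprocs; ++i)
        subpartition_offset2[i] = subpartition_offset2[i - 1] + subpartition_size2[i - 1];

    std::vector<int> data2(subpartition_offset2[nprocs - 1] + subpartition_size2[nprocs - 1]);

    MPI_Alltoallv(data.data(), subpartition_size.data(), subpartition_offset.data(), MPI_INT,
                  data2.data(), subpartition_size2.data(), subpartition_offset2.data(), MPI_INT,
                  comm);
}

The Java bindings should let you do the same count exchange (allToAll on the 
counts) before the allToAllv call.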



> On Apr 6, 2015, at 5:03 AM, Howard Pritchard  wrote:
> 
> Hello HR,
> 
> It would also be useful to know which java version you are using, as well
> as the configure options used when building open mpi.
> 
> Thanks,
> 
> Howard
> 
> 
> 
> 2015-04-05 19:10 GMT-06:00 Ralph Castain  >:
> If not too much trouble, can you extract just the alltoallv portion and 
> provide us with a small reproducer?
> 
> 
>> On Apr 5, 2015, at 12:11 PM, Hamidreza Anvari > > wrote:
>> 
>> Hello,
>> 
>> I am converting an existing MPI program in C++ to Java using OpenMPI 1.8.4,
>> At some point I have a allToAllv() code which works fine in C++ but receives 
>> error in Java version:
>> 
>> MPI.COMM_WORLD.allToAllv(data, subpartition_size, subpartition_offset, 
>> MPI.INT ,
>> data2,subpartition_size2,subpartition_offset2,MPI.INT );
>> 
>> Error:
>> *** An error occurred in MPI_Alltoallv
>> *** reported by process [3621322753,9223372036854775811]
>> *** on communicator MPI_COMM_WORLD
>> *** MPI_ERR_TRUNCATE: message truncated
>> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>> ***and potentially your MPI job)
>> 3 more processes have sent help message help-mpi-errors.txt / 
>> mpi_errors_are_fatal
>> Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error 
>> messages
>> 
>> Here are the values for parameters:
>> 
>> data.length = 5
>> data2.length = 20
>> 
>> -- Rank 0 of 4 --
>> subpartition_offset:0,2,3,3,
>> subpartition_size:2,1,0,2,
>> subpartition_offset2:0,5,10,15,
>> subpartition_size2:5,5,5,5,
>> --
>> -- Rank 1 of 4 --
>> subpartition_offset:0,2,3,4,
>> subpartition_size:2,1,1,1,
>> subpartition_offset2:0,5,10,15,
>> subpartition_size2:5,5,5,5,
>> --
>> -- Rank 2 of 4 --
>> subpartition_offset:0,1,2,3,
>> subpartition_size:1,1,1,2,
>> subpartition_offset2:0,5,10,15,
>> subpartition_size2:5,5,5,5,
>> --
>> -- Rank 3 of 4 --
>> subpartition_offset:0,1,2,4,
>> subpartition_size:1,1,2,1,
>> subpartition_offset2:0,5,10,15,
>> subpartition_size2:5,5,5,5,
>> --
>> 
>> Again, this is a code which works in C++ version.
>> 
>> Any help or advice is greatly appreciated.
>> 
>> Thanks,
>> -- HR
>> ___
>> users mailing list
>> us...@open-mpi.org 
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users 
>> 
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/users/2015/04/26610.php 
>> 
> 
> ___
> users mailing list
> us...@open-mpi.org 
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users 
> 
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2015/04/26613.php 
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2015/04/26615.php



Re: [OMPI users] OpenMPI 1.8.4 - Java Library - allToAllv()

2015-04-06 Thread Hamidreza Anvari
Hello,

1. I'm using Java/Javac version 1.8.0_20 under OS X 10.10.2.

2. I have used the following configuration for making OpenMPI:
./configure --enable-mpi-java
--with-jdk-bindir="/System/Library/Frameworks/JavaVM.framework/Versions/Current/Commands"
--with-jdk-headers="/System/Library/Frameworks/JavaVM.framework/Versions/Current/Headers"
--prefix="/users/hamidreza/openmpi-1.8.4"

make all install

3. From a logical point of view, size2 is the maximum amount of data expected to
be received; the amount actually received might be less than this maximum.

4. I will try to prepare a working reproducer of my error and send it to
you.

Thanks,
-- HR

On Mon, Apr 6, 2015 at 10:46 AM, Ralph Castain  wrote:

> I've talked to the folks who wrote the Java bindings. One possibility we
> identified is that there may be an error in your code when you did the
> translation
>
> My immediate thought is that each process can not receive more elements
> than it was sent to them. That's the reason of truncation error.
>
> These are the correct values:
>
> rank 0 - size2: 2,2,1,1
> rank 1 - size2: 1,1,1,1
> rank 2 - size2: 0,1,1,2
> rank 3 - size2: 2,1,2,1
>
>
> Can you check your code to see if perhaps the values you are passing
> didn't get translated correctly from your C++ version to the Java version?
>
>
>
> On Apr 6, 2015, at 5:03 AM, Howard Pritchard  wrote:
>
> Hello HR,
>
> It would also be useful to know which java version you are using, as well
> as the configure options used when building open mpi.
>
> Thanks,
>
> Howard
>
>
>
> 2015-04-05 19:10 GMT-06:00 Ralph Castain :
>
>> If not too much trouble, can you extract just the alltoallv portion and
>> provide us with a small reproducer?
>>
>>
>> On Apr 5, 2015, at 12:11 PM, Hamidreza Anvari 
>> wrote:
>>
>> Hello,
>>
>> I am converting an existing MPI program in C++ to Java using OpenMPI
>> 1.8.4,
>> At some point I have a allToAllv() code which works fine in C++ but
>> receives error in Java version:
>>
>> MPI.COMM_WORLD.allToAllv(data, subpartition_size, subpartition_offset,
>> MPI.INT ,
>> data2,subpartition_size2,subpartition_offset2,MPI.INT );
>>
>> Error:
>> *** An error occurred in MPI_Alltoallv
>> *** reported by process [3621322753,9223372036854775811]
>> *** on communicator MPI_COMM_WORLD
>> *** MPI_ERR_TRUNCATE: message truncated
>> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>> ***and potentially your MPI job)
>> 3 more processes have sent help message help-mpi-errors.txt /
>> mpi_errors_are_fatal
>> Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error
>> messages
>>
>> Here are the values for parameters:
>>
>> data.length = 5
>> data2.length = 20
>>
>> -- Rank 0 of 4 --
>> subpartition_offset:0,2,3,3,
>> subpartition_size:2,1,0,2,
>> subpartition_offset2:0,5,10,15,
>> subpartition_size2:5,5,5,5,
>> --
>> -- Rank 1 of 4 --
>> subpartition_offset:0,2,3,4,
>> subpartition_size:2,1,1,1,
>> subpartition_offset2:0,5,10,15,
>> subpartition_size2:5,5,5,5,
>> --
>> -- Rank 2 of 4 --
>> subpartition_offset:0,1,2,3,
>> subpartition_size:1,1,1,2,
>> subpartition_offset2:0,5,10,15,
>> subpartition_size2:5,5,5,5,
>> --
>> -- Rank 3 of 4 --
>> subpartition_offset:0,1,2,4,
>> subpartition_size:1,1,2,1,
>> subpartition_offset2:0,5,10,15,
>> subpartition_size2:5,5,5,5,
>> --
>>
>> Again, this is a code which works in C++ version.
>>
>> Any help or advice is greatly appreciated.
>>
>> Thanks,
>> -- HR
>> ___
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2015/04/26610.php
>>
>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2015/04/26613.php
>>
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/04/26615.php
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/04/26616.php
>


Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3

2015-04-06 Thread Lane, William
Ralph,

For the following two different commandline invocations of the LAPACK benchmark

$MPI_DIR/bin/mpirun -np $NSLOTS --report-bindings --hostfile hostfile-no_slots 
--mca btl_tcp_if_include eth0 --hetero-nodes --use-hwthread-cpus --bind-to 
hwthread --prefix $MPI_DIR $BENCH_DIR/$APP_DIR/$APP_BIN

$MPI_DIR/bin/mpirun -np $NSLOTS --report-bindings --hostfile hostfile-no_slots 
--mca btl_tcp_if_include eth0 --hetero-nodes --bind-to-core --prefix $MPI_DIR 
$BENCH_DIR/$APP_DIR/$APP_BIN

I'm receiving the same kinds of OpenMPI error messages (but for different nodes 
in the ring):

[csclprd3-0-16:25940] *** Process received signal ***
[csclprd3-0-16:25940] Signal: Bus error (7)
[csclprd3-0-16:25940] Signal code: Non-existant physical address (2)
[csclprd3-0-16:25940] Failing at address: 0x7f8b1b5a2600


--
mpirun noticed that process rank 82 with PID 25936 on node 
csclprd3-0-16 exited on signal 7 (Bus error).

--
16 total processes killed (some possibly by mpirun during cleanup)

It seems to occur on systems that have more than one physical CPU installed. Could
this be due to a lack of the correct NUMA libraries being installed?

-Bill L.


From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain 
[r...@open-mpi.org]
Sent: Sunday, April 05, 2015 6:09 PM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3


On Apr 5, 2015, at 5:58 PM, Lane, William 
mailto:william.l...@cshs.org>> wrote:

I think some of the Intel Blade systems in the cluster are
dual core, but don't support hyperthreading. Maybe it
would be better to exclude hyperthreading altogether
from submitted OpenMPI jobs?

Yes - or you can add "--hetero-nodes --use-hwthread-cpus --bind-to hwthread" to 
the cmd line. This tells mpirun that the nodes aren't all the same, and so it 
has to look at each node's topology instead of taking the first node as the 
template for everything. The second tells it to use the HTs as independent cpus 
where they are supported.

I'm not entirely sure the suggestion will work - if we hit a place where HT 
isn't supported, we may balk at being asked to bind to HTs. I can probably make 
a change that supports this kind of hetero arrangement (perhaps something like 
bind-to pu) - might make it into 1.8.5 (we are just starting the release 
process on it now).


OpenMPI doesn't crash, but it doesn't run the LAPACK
benchmark either.

Thanks again Ralph.

Bill L.


From: users [users-boun...@open-mpi.org] on 
behalf of Ralph Castain [r...@open-mpi.org]
Sent: Wednesday, April 01, 2015 8:40 AM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3

Bingo - you said the magic word. This is a terminology issue. When we say 
"core", we mean the old definition of "core", not "hyperthreads". If you want 
to use HTs as your base processing unit and bind to them, then you need to 
specify --bind-to hwthread. That warning should then go away.

We don't require a swap region be mounted - I didn't see anything in your 
original message indicating that OMPI had actually crashed, but just wasn't 
launching due to the above issue. Were you actually seeing crashes as well?


On Wed, Apr 1, 2015 at 8:31 AM, Lane, William 
mailto:william.l...@cshs.org>> wrote:
Ralph,

Here's the associated hostfile:

#openMPI hostfile for csclprd3
#max slots prevents oversubscribing csclprd3-0-9
csclprd3-0-0 slots=12 max-slots=12
csclprd3-0-1 slots=6 max-slots=6
csclprd3-0-2 slots=6 max-slots=6
csclprd3-0-3 slots=6 max-slots=6
csclprd3-0-4 slots=6 max-slots=6
csclprd3-0-5 slots=6 max-slots=6
csclprd3-0-6 slots=6 max-slots=6
csclprd3-0-7 slots=32 max-slots=32
csclprd3-0-8 slots=32 max-slots=32
csclprd3-0-9 slots=32 max-slots=32
csclprd3-0-10 slots=32 max-slots=32
csclprd3-0-11 slots=32 max-slots=32
csclprd3-0-12 slots=12 max-slots=12
csclprd3-0-13 slots=24 max-slots=24
csclprd3-0-14 slots=16 max-slots=16
csclprd3-0-15 slots=16 max-slots=16
csclprd3-0-16 slots=24 max-slots=24
csclprd3-0-17 slots=24 max-slots=24
csclprd3-6-1 slots=4 max-slots=4
csclprd3-6-5 slots=4 max-slots=4

The number of slots also includes hyperthreading
cores.

One more question, would not having defined swap
partitions on all the nodes in the ring cause OpenMPI
to crash? Because no swap partitions are defined
for any of the above systems.

-Bill L.



From: users [users-boun...@open-mpi.org] on 
behalf of Ralph Castain [r...@open-mpi.org]
Sent: Wednesday, April 01, 2015 5:04 AM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3

The warning about binding to memory is due to 

Re: [OMPI users] [mpich-discuss] Buffered sends are evil?

2015-04-06 Thread Jeff Hammond
While we are tilting at windmills, can we also discuss the evils of
MPI_Cancel for MPI_Send, everything about MPI_Alltoallw, how
MPI_Reduce_scatter is named wrong, and any number of other pet peeves
that people have about MPI-3? :-D

The MPI standard contains many useful functions and at least a handful
of stupid ones.  This is remarkably similar to other outputs of the
design-by-committee process and can be observed in OpenMP 4.0, C++14,
Fortran 2008, and probably every other standardized parallel
programming interface in use today.

Fortunately, judicious parallel programmers know that less is more and
generally focus on using the useful functions effectively, while
ignoring the less useful ones, and it's usually not hard to tell the
difference.

Jeff

PS I used MPI_Bsend once and found it superior to the alternative of
MPI_Isend+MPI_Request_free for sending fire-and-forget acks, because
it forced the implementation to do what I wanted and the performance
improved as a result.
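
For anyone curious, that pattern looks roughly like the sketch below (not my 
actual code, just an illustration): attach a buffer once, then fire acks with 
MPI_Bsend and never track a request.

#include <mpi.h>
#include <cstdlib>

// Fire-and-forget ack via MPI_Bsend: completes locally as long as the
// attached buffer has room, and there is no request to remember or free.
void send_ack(int dest, int tag, MPI_Comm comm)
{
    static char* bsend_buf = nullptr;

    if (bsend_buf == nullptr) {
        // Size the buffer for a generous number of outstanding acks,
        // each needing its payload plus MPI_BSEND_OVERHEAD.
        int one_ack;
        MPI_Pack_size(1, MPI_INT, comm, &one_ack);
        int bsend_size = 1024 * (one_ack + MPI_BSEND_OVERHEAD);
        bsend_buf = static_cast<char*>(std::malloc(bsend_size));
        MPI_Buffer_attach(bsend_buf, bsend_size);
        // (Real code should MPI_Buffer_detach and free this before MPI_Finalize.)
    }

    int ack = 1;
    MPI_Bsend(&ack, 1, MPI_INT, dest, tag, comm);
}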

On Fri, Apr 3, 2015 at 8:34 AM, Jeff Squyres (jsquyres)
 wrote:
> Fair enough.
>
> My main point should probably be summarized as: MPI_BSEND isn't worth it; 
> there are other, less-confusing, generally-more-optimized alternatives.
>
>
>
>
>> On Apr 3, 2015, at 11:20 AM, Balaji, Pavan  wrote:
>>
>> Jeff,
>>
>> Your blog post seems to confuse what implementations currently do with what 
>> Bsend is capable of.  If users really wanted it (e.g., a big customer asked 
>> for it), every implementation will optimize the crap out of it.  The problem 
>> is that very few users really care for it, so there's not been a good 
>> incentive for implementations to optimize it.
>>
>> Coming to the technical aspects, bsend doesn't require copying into the user 
>> buffer, if you have internal buffer resources.  It only guarantees that 
>> Bsend will not block if enough user buffer space is available.  If you are 
>> blocking for progress anyway, I'm not sure the extra copy would matter too 
>> much -- it matters some, of course, but likely not to the extent of a full 
>> copy cost.  Also, the semantics it provides are different -- guaranteed 
>> nonblocking nature when there's buffer space available.  It's like saying 
>> Ssend is not as efficient as send.  That's true, but those are different 
>> semantics.
>>
>> Having said that, I do agree with some of the shortcomings you pointed out 
>> -- specifically, you can only attach one buffer.  I'd add to the list with 
>> one more shortcoming: It's not communicator safe.  That is, if I attach a 
>> buffer, some other library I linked with might chew up my buffer space.  So 
>> the nonblocking guarantee is kind of bogus at that point.
>>
>>  -- Pavan
>>
>>> On Apr 3, 2015, at 5:30 AM, Jeff Squyres (jsquyres)  
>>> wrote:
>>>
>>> Yes.  I think the blog post gives 10 excellent reasons why.  :-)
>>>
>>>
 On Apr 3, 2015, at 2:40 AM, Lei Shi  wrote:

 Hello,

 I want to use buffered sends. Read a blog said it is evil, 
 http://blogs.cisco.com/performance/top-10-reasons-why-buffered-sends-are-evil

 Is it true or not? Thanks!

 Sincerely Yours,

 Lei Shi
 -

 ___
 users mailing list
 us...@open-mpi.org
 Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
 Link to this post: 
 http://www.open-mpi.org/community/lists/users/2015/04/26597.php
>>>
>>>
>>> --
>>> Jeff Squyres
>>> jsquy...@cisco.com
>>> For corporate legal information go to: 
>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>
>>> ___
>>> discuss mailing list disc...@mpich.org
>>> To manage subscription options or unsubscribe:
>>> https://lists.mpich.org/mailman/listinfo/discuss
>>
>> --
>> Pavan Balaji
>> http://www.mcs.anl.gov/~balaji
>>
>> ___
>> discuss mailing list disc...@mpich.org
>> To manage subscription options or unsubscribe:
>> https://lists.mpich.org/mailman/listinfo/discuss
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
> ___
> discuss mailing list disc...@mpich.org
> To manage subscription options or unsubscribe:
> https://lists.mpich.org/mailman/listinfo/discuss



-- 
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/


Re: [OMPI users] OpenMPI 1.8.4 - Java Library - allToAllv()

2015-04-06 Thread Howard Pritchard
Hello HR,

Thanks!  If you have Java 1.7 installed on your system would you mind
trying to test against that version too?

Thanks,

Howard


2015-04-06 13:09 GMT-06:00 Hamidreza Anvari :

> Hello,
>
> 1. I'm using Java/Javac version 1.8.0_20 under OS X 10.10.2.
>
> 2. I have used the following configuration for making OpenMPI:
> ./configure --enable-mpi-java
> --with-jdk-bindir="/System/Library/Frameworks/JavaVM.framework/Versions/Current/Commands"
> --with-jdk-headers="/System/Library/Frameworks/JavaVM.framework/Versions/Current/Headers"
> --prefix="/users/hamidreza/openmpi-1.8.4"
>
> make all install
>
> 3. As a logical point of view, size2 is the maximum expected data to
> receive, which in turn might be less that this maximum.
>
> 4. I will try to prepare a working reproducer of my error and send it to
> you.
>
> Thanks,
> -- HR
>
> On Mon, Apr 6, 2015 at 10:46 AM, Ralph Castain  wrote:
>
>> I’ve talked to the folks who wrote the Java bindings. One possibility we
>> identified is that there may be an error in your code when you did the
>> translation
>>
>> My immediate thought is that each process can not receive more elements
>> than it was sent to them. That's the reason of truncation error.
>>
>> These are the correct values:
>>
>> rank 0 - size2: 2,2,1,1
>> rank 1 - size2: 1,1,1,1
>> rank 2 - size2: 0,1,1,2
>> rank 3 - size2: 2,1,2,1
>>
>>
>> Can you check your code to see if perhaps the values you are passing
>> didn’t get translated correctly from your C++ version to the Java version?
>>
>>
>>
>> On Apr 6, 2015, at 5:03 AM, Howard Pritchard  wrote:
>>
>> Hello HR,
>>
>> It would also be useful to know which java version you are using, as well
>> as the configure options used when building open mpi.
>>
>> Thanks,
>>
>> Howard
>>
>>
>>
>> 2015-04-05 19:10 GMT-06:00 Ralph Castain :
>>
>>> If not too much trouble, can you extract just the alltoallv portion and
>>> provide us with a small reproducer?
>>>
>>>
>>> On Apr 5, 2015, at 12:11 PM, Hamidreza Anvari 
>>> wrote:
>>>
>>> Hello,
>>>
>>> I am converting an existing MPI program in C++ to Java using OpenMPI
>>> 1.8.4,
>>> At some point I have a allToAllv() code which works fine in C++ but
>>> receives error in Java version:
>>>
>>> MPI.COMM_WORLD.allToAllv(data, subpartition_size, subpartition_offset,
>>> MPI.INT ,
>>> data2,subpartition_size2,subpartition_offset2,MPI.INT 
>>> );
>>>
>>> Error:
>>> *** An error occurred in MPI_Alltoallv
>>> *** reported by process [3621322753,9223372036854775811]
>>> *** on communicator MPI_COMM_WORLD
>>> *** MPI_ERR_TRUNCATE: message truncated
>>> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>>> ***and potentially your MPI job)
>>> 3 more processes have sent help message help-mpi-errors.txt /
>>> mpi_errors_are_fatal
>>> Set MCA parameter "orte_base_help_aggregate" to 0 to see all help /
>>> error messages
>>>
>>> Here are the values for parameters:
>>>
>>> data.length = 5
>>> data2.length = 20
>>>
>>> -- Rank 0 of 4 --
>>> subpartition_offset:0,2,3,3,
>>> subpartition_size:2,1,0,2,
>>> subpartition_offset2:0,5,10,15,
>>> subpartition_size2:5,5,5,5,
>>> --
>>> -- Rank 1 of 4 --
>>> subpartition_offset:0,2,3,4,
>>> subpartition_size:2,1,1,1,
>>> subpartition_offset2:0,5,10,15,
>>> subpartition_size2:5,5,5,5,
>>> --
>>> -- Rank 2 of 4 --
>>> subpartition_offset:0,1,2,3,
>>> subpartition_size:1,1,1,2,
>>> subpartition_offset2:0,5,10,15,
>>> subpartition_size2:5,5,5,5,
>>> --
>>> -- Rank 3 of 4 --
>>> subpartition_offset:0,1,2,4,
>>> subpartition_size:1,1,2,1,
>>> subpartition_offset2:0,5,10,15,
>>> subpartition_size2:5,5,5,5,
>>> --
>>>
>>> Again, this is a code which works in C++ version.
>>>
>>> Any help or advice is greatly appreciated.
>>>
>>> Thanks,
>>> -- HR
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/users/2015/04/26610.php
>>>
>>>
>>>
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/users/2015/04/26613.php
>>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2015/04/26615.php
>>
>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2015/04/26616.php
>>
>
>
> ___
> users ma

Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3

2015-04-06 Thread Ralph Castain
Hmmm…well, that shouldn’t be the issue. To check, try running it with “bind-to 
none”. If you can get a backtrace telling us where it is crashing, that would 
also help.


> On Apr 6, 2015, at 12:24 PM, Lane, William  wrote:
> 
> Ralph,
> 
> For the following two different commandline invocations of the LAPACK 
> benchmark
> 
> $MPI_DIR/bin/mpirun -np $NSLOTS --report-bindings --hostfile 
> hostfile-no_slots --mca btl_tcp_if_include eth0 --hetero-nodes 
> --use-hwthread-cpus --bind-to hwthread --prefix $MPI_DIR 
> $BENCH_DIR/$APP_DIR/$APP_BIN
> 
> $MPI_DIR/bin/mpirun -np $NSLOTS --report-bindings --hostfile 
> hostfile-no_slots --mca btl_tcp_if_include eth0 --hetero-nodes --bind-to-core 
> --prefix $MPI_DIR $BENCH_DIR/$APP_DIR/$APP_BIN
> 
> I'm receiving the same kinds of OpenMPI error messages (but for different 
> nodes in the ring):
> 
> [csclprd3-0-16:25940] *** Process received signal ***
> [csclprd3-0-16:25940] Signal: Bus error (7)
> [csclprd3-0-16:25940] Signal code: Non-existant physical address (2)
> [csclprd3-0-16:25940] Failing at address: 0x7f8b1b5a2600
> 
> 
> --
> mpirun noticed that process rank 82 with PID 25936 on node 
> csclprd3-0-16 exited on signal 7 (Bus error).
> 
> --
> 16 total processes killed (some possibly by mpirun during cleanup)
> 
> It seems to occur on systems that have more than one, physical CPU installed. 
> Could
> this be due to a lack of the correct NUMA libraries being installed?
> 
> -Bill L.
> 
> From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain 
> [r...@open-mpi.org]
> Sent: Sunday, April 05, 2015 6:09 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3
> 
> 
>> On Apr 5, 2015, at 5:58 PM, Lane, William > > wrote:
>> 
>> I think some of the Intel Blade systems in the cluster are
>> dual core, but don't support hyperthreading. Maybe it
>> would be better to exclude hyperthreading altogether
>> from submitted OpenMPI jobs?
> 
> Yes - or you can add "--hetero-nodes -use-hwthread-cpus --bind-to hwthread" 
> to the cmd line. This tells mpirun that the nodes aren't all the same, and so 
> it has to look at each node's topology instead of taking the first node as 
> the template for everything. The second tells it to use the HTs as 
> independent cpus where they are supported.
> 
> I'm not entirely sure the suggestion will work - if we hit a place where HT 
> isn't supported, we may balk at being asked to bind to HTs. I can probably 
> make a change that supports this kind of hetero arrangement (perhaps 
> something like bind-to pu) - might make it into 1.8.5 (we are just starting 
> the release process on it now).
> 
>> 
>> OpenMPI doesn't crash, but it doesn't run the LAPACK
>> benchmark either.
>> 
>> Thanks again Ralph.
>> 
>> Bill L.
>> 
>> From: users [users-boun...@open-mpi.org ] 
>> on behalf of Ralph Castain [r...@open-mpi.org ]
>> Sent: Wednesday, April 01, 2015 8:40 AM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3
>> 
>> Bingo - you said the magic word. This is a terminology issue. When we say 
>> "core", we mean the old definition of "core", not "hyperthreads". If you 
>> want to use HTs as your base processing unit and bind to them, then you need 
>> to specify --bind-to hwthread. That warning should then go away.
>> 
>> We don't require a swap region be mounted - I didn't see anything in your 
>> original message indicating that OMPI had actually crashed, but just wasn't 
>> launching due to the above issue. Were you actually seeing crashes as well?
>> 
>> 
>> On Wed, Apr 1, 2015 at 8:31 AM, Lane, William > > wrote:
>> Ralph,
>> 
>> Here's the associated hostfile:
>> 
>> #openMPI hostfile for csclprd3
>> #max slots prevents oversubscribing csclprd3-0-9
>> csclprd3-0-0 slots=12 max-slots=12
>> csclprd3-0-1 slots=6 max-slots=6
>> csclprd3-0-2 slots=6 max-slots=6
>> csclprd3-0-3 slots=6 max-slots=6
>> csclprd3-0-4 slots=6 max-slots=6
>> csclprd3-0-5 slots=6 max-slots=6
>> csclprd3-0-6 slots=6 max-slots=6
>> csclprd3-0-7 slots=32 max-slots=32
>> csclprd3-0-8 slots=32 max-slots=32
>> csclprd3-0-9 slots=32 max-slots=32
>> csclprd3-0-10 slots=32 max-slots=32
>> csclprd3-0-11 slots=32 max-slots=32
>> csclprd3-0-12 slots=12 max-slots=12
>> csclprd3-0-13 slots=24 max-slots=24
>> csclprd3-0-14 slots=16 max-slots=16
>> csclprd3-0-15 slots=16 max-slots=16
>> csclprd3-0-16 slots=24 max-slots=24
>> csclprd3-0-17 slots=24 max-slots=24
>> csclprd3-6-1 slots=4 max-slots=4
>> csclprd3-6-5 slots=4 max-slots=4
>> 
>> The number of slots also includes hyperthreading
>> cores.
>> 
>> One more question, would not having defined swap
>> partitions on all the node

Re: [OMPI users] OpenMPI 1.8.4 - Java Library - allToAllv()

2015-04-06 Thread Hamidreza Anvari
I'll try that as well.
Meanwhile, I found that my C++ code is running fine on a machine running
OpenMPI 1.5.4, but I receive the same error under OpenMPI 1.8.4 for both
Java and C++.

On Mon, Apr 6, 2015 at 2:21 PM, Howard Pritchard 
wrote:

> Hello HR,
>
> Thanks!  If you have Java 1.7 installed on your system would you mind
> trying to test against that version too?
>
> Thanks,
>
> Howard
>
>
> 2015-04-06 13:09 GMT-06:00 Hamidreza Anvari :
>
>> Hello,
>>
>> 1. I'm using Java/Javac version 1.8.0_20 under OS X 10.10.2.
>>
>> 2. I have used the following configuration for making OpenMPI:
>> ./configure --enable-mpi-java
>> --with-jdk-bindir="/System/Library/Frameworks/JavaVM.framework/Versions/Current/Commands"
>> --with-jdk-headers="/System/Library/Frameworks/JavaVM.framework/Versions/Current/Headers"
>> --prefix="/users/hamidreza/openmpi-1.8.4"
>>
>> make all install
>>
>> 3. As a logical point of view, size2 is the maximum expected data to
>> receive, which in turn might be less that this maximum.
>>
>> 4. I will try to prepare a working reproducer of my error and send it to
>> you.
>>
>> Thanks,
>> -- HR
>>
>> On Mon, Apr 6, 2015 at 10:46 AM, Ralph Castain  wrote:
>>
>>> I've talked to the folks who wrote the Java bindings. One possibility we
>>> identified is that there may be an error in your code when you did the
>>> translation
>>>
>>> My immediate thought is that each process can not receive more elements
>>> than it was sent to them. That's the reason of truncation error.
>>>
>>> These are the correct values:
>>>
>>> rank 0 - size2: 2,2,1,1
>>> rank 1 - size2: 1,1,1,1
>>> rank 2 - size2: 0,1,1,2
>>> rank 3 - size2: 2,1,2,1
>>>
>>>
>>> Can you check your code to see if perhaps the values you are passing
>>> didn't get translated correctly from your C++ version to the Java version?
>>>
>>>
>>>
>>> On Apr 6, 2015, at 5:03 AM, Howard Pritchard 
>>> wrote:
>>>
>>> Hello HR,
>>>
>>> It would also be useful to know which java version you are using, as well
>>> as the configure options used when building open mpi.
>>>
>>> Thanks,
>>>
>>> Howard
>>>
>>>
>>>
>>> 2015-04-05 19:10 GMT-06:00 Ralph Castain :
>>>
 If not too much trouble, can you extract just the alltoallv portion and
 provide us with a small reproducer?


 On Apr 5, 2015, at 12:11 PM, Hamidreza Anvari 
 wrote:

 Hello,

 I am converting an existing MPI program in C++ to Java using OpenMPI
 1.8.4,
 At some point I have a allToAllv() code which works fine in C++ but
 receives error in Java version:

 MPI.COMM_WORLD.allToAllv(data, subpartition_size, subpartition_offset,
 MPI.INT ,
 data2,subpartition_size2,subpartition_offset2,MPI.INT 
 );

 Error:
 *** An error occurred in MPI_Alltoallv
 *** reported by process [3621322753,9223372036854775811]
 *** on communicator MPI_COMM_WORLD
 *** MPI_ERR_TRUNCATE: message truncated
 *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
 ***and potentially your MPI job)
 3 more processes have sent help message help-mpi-errors.txt /
 mpi_errors_are_fatal
 Set MCA parameter "orte_base_help_aggregate" to 0 to see all help /
 error messages

 Here are the values for parameters:

 data.length = 5
 data2.length = 20

 -- Rank 0 of 4 --
 subpartition_offset:0,2,3,3,
 subpartition_size:2,1,0,2,
 subpartition_offset2:0,5,10,15,
 subpartition_size2:5,5,5,5,
 --
 -- Rank 1 of 4 --
 subpartition_offset:0,2,3,4,
 subpartition_size:2,1,1,1,
 subpartition_offset2:0,5,10,15,
 subpartition_size2:5,5,5,5,
 --
 -- Rank 2 of 4 --
 subpartition_offset:0,1,2,3,
 subpartition_size:1,1,1,2,
 subpartition_offset2:0,5,10,15,
 subpartition_size2:5,5,5,5,
 --
 -- Rank 3 of 4 --
 subpartition_offset:0,1,2,4,
 subpartition_size:1,1,2,1,
 subpartition_offset2:0,5,10,15,
 subpartition_size2:5,5,5,5,
 --

 Again, this is a code which works in C++ version.

 Any help or advice is greatly appreciated.

 Thanks,
 -- HR
 ___
 users mailing list
 us...@open-mpi.org
 Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
 Link to this post:
 http://www.open-mpi.org/community/lists/users/2015/04/26610.php



 ___
 users mailing list
 us...@open-mpi.org
 Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
 Link to this post:
 http://www.open-mpi.org/community/lists/users/2015/04/26613.php

>>>
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> Lin

Re: [OMPI users] OpenMPI 1.8.4 - Java Library - allToAllv()

2015-04-06 Thread Ralph Castain
That would imply that the issue is in the underlying C implementation in OMPI, 
not the Java bindings. The reproducer would definitely help pin it down.

If you change the size2 values to the ones we sent you, does the program by 
chance work?


> On Apr 6, 2015, at 1:44 PM, Hamidreza Anvari  wrote:
> 
> I'll try that as well.
> Meanwhile, I found that my c++ code is running fine on a machine running 
> OpenMPI 1.5.4, but I receive the same error under OpenMPI 1.8.4 for both Java 
> and C++.
> 
> On Mon, Apr 6, 2015 at 2:21 PM, Howard Pritchard  > wrote:
> Hello HR,
> 
> Thanks!  If you have Java 1.7 installed on your system would you mind trying 
> to test against that version too?
> 
> Thanks,
> 
> Howard
> 
> 
> 2015-04-06 13:09 GMT-06:00 Hamidreza Anvari  >:
> Hello,
> 
> 1. I'm using Java/Javac version 1.8.0_20 under OS X 10.10.2.
> 
> 2. I have used the following configuration for making OpenMPI:
> ./configure --enable-mpi-java 
> --with-jdk-bindir="/System/Library/Frameworks/JavaVM.framework/Versions/Current/Commands"
>  
> --with-jdk-headers="/System/Library/Frameworks/JavaVM.framework/Versions/Current/Headers"
>  --prefix="/users/hamidreza/openmpi-1.8.4"
> 
> make all install
> 
> 3. As a logical point of view, size2 is the maximum expected data to receive, 
> which in turn might be less that this maximum. 
> 
> 4. I will try to prepare a working reproducer of my error and send it to you.
> 
> Thanks,
> -- HR
> 
> On Mon, Apr 6, 2015 at 10:46 AM, Ralph Castain  > wrote:
> I’ve talked to the folks who wrote the Java bindings. One possibility we 
> identified is that there may be an error in your code when you did the 
> translation
> 
>> My immediate thought is that each process can not receive more elements than 
>> it was sent to them. That's the reason of truncation error.
>> 
>> These are the correct values:
>> 
>> rank 0 - size2: 2,2,1,1
>> rank 1 - size2: 1,1,1,1
>> rank 2 - size2: 0,1,1,2
>> rank 3 - size2: 2,1,2,1
> 
> Can you check your code to see if perhaps the values you are passing didn’t 
> get translated correctly from your C++ version to the Java version?
> 
> 
> 
>> On Apr 6, 2015, at 5:03 AM, Howard Pritchard > > wrote:
>> 
>> Hello HR,
>> 
>> It would also be useful to know which java version you are using, as well
>> as the configure options used when building open mpi.
>> 
>> Thanks,
>> 
>> Howard
>> 
>> 
>> 
>> 2015-04-05 19:10 GMT-06:00 Ralph Castain > >:
>> If not too much trouble, can you extract just the alltoallv portion and 
>> provide us with a small reproducer?
>> 
>> 
>>> On Apr 5, 2015, at 12:11 PM, Hamidreza Anvari >> > wrote:
>>> 
>>> Hello,
>>> 
>>> I am converting an existing MPI program in C++ to Java using OpenMPI 1.8.4,
>>> At some point I have a allToAllv() code which works fine in C++ but 
>>> receives error in Java version:
>>> 
>>> MPI.COMM_WORLD.allToAllv(data, subpartition_size, subpartition_offset, 
>>> MPI.INT ,
>>> data2,subpartition_size2,subpartition_offset2,MPI.INT );
>>> 
>>> Error:
>>> *** An error occurred in MPI_Alltoallv
>>> *** reported by process [3621322753,9223372036854775811]
>>> *** on communicator MPI_COMM_WORLD
>>> *** MPI_ERR_TRUNCATE: message truncated
>>> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>>> ***and potentially your MPI job)
>>> 3 more processes have sent help message help-mpi-errors.txt / 
>>> mpi_errors_are_fatal
>>> Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error 
>>> messages
>>> 
>>> Here are the values for parameters:
>>> 
>>> data.length = 5
>>> data2.length = 20
>>> 
>>> -- Rank 0 of 4 --
>>> subpartition_offset:0,2,3,3,
>>> subpartition_size:2,1,0,2,
>>> subpartition_offset2:0,5,10,15,
>>> subpartition_size2:5,5,5,5,
>>> --
>>> -- Rank 1 of 4 --
>>> subpartition_offset:0,2,3,4,
>>> subpartition_size:2,1,1,1,
>>> subpartition_offset2:0,5,10,15,
>>> subpartition_size2:5,5,5,5,
>>> --
>>> -- Rank 2 of 4 --
>>> subpartition_offset:0,1,2,3,
>>> subpartition_size:1,1,1,2,
>>> subpartition_offset2:0,5,10,15,
>>> subpartition_size2:5,5,5,5,
>>> --
>>> -- Rank 3 of 4 --
>>> subpartition_offset:0,1,2,4,
>>> subpartition_size:1,1,2,1,
>>> subpartition_offset2:0,5,10,15,
>>> subpartition_size2:5,5,5,5,
>>> --
>>> 
>>> Again, this is a code which works in C++ version.
>>> 
>>> Any help or advice is greatly appreciated.
>>> 
>>> Thanks,
>>> -- HR
>>> ___
>>> users mailing list
>>> us...@open-mpi.org 
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users 
>>> 
>>> Link to this post: 
>>> http://www.open-mpi.org/community/lists/users/2015/04/26610.

Re: [OMPI users] Different HCA from different OpenMP threads (same rank using MPI_THREAD_MULTIPLE)

2015-04-06 Thread Rolf vandeVaart
It is my belief that you cannot do this, at least with the openib BTL.  The IB 
card to be used for communication is selected during the MPI_Init() phase 
based on where the process is bound.  You can see some of this selection 
by using the --mca btl_base_verbose 1 flag.  There is a bunch of output (which 
I have deleted), but you will see a few lines like this:

[ivy5] [rank=1] openib: using port mlx5_0:1
[ivy5] [rank=1] openib: using port mlx5_0:2
[ivy4] [rank=0] openib: using port mlx5_0:1
[ivy4] [rank=0] openib: using port mlx5_0:2

And if you have multiple NICs, you may also see some messages like this:
 "[rank=%d] openib: skipping device %s; it is too far away"
(This was lifted from the code. I do not have a configuration right now where 
I can generate the second message.)

I cannot see how we can make this specific to a thread.  Maybe others have a 
different opinion.
Rolf

>-Original Message-
>From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Filippo Spiga
>Sent: Monday, April 06, 2015 5:46 AM
>To: Open MPI Users
>Cc: Mohammed Sourouri
>Subject: [OMPI users] Different HCA from different OpenMP threads (same
>rank using MPI_THREAD_MULTIPLE)
>
>Dear Open MPI developers,
>
>I wonder if there is a way to address this particular scenario using MPI_T or
>other strategies in Open MPI. I saw a similar discussion few days ago, I assume
>the same challenges are applied in this case but I just want to check. Here is
>the scenario:
>
>We have a system composed by dual rail Mellanox IB, two distinct Connect-IB
>cards per node each one sitting on a different PCI-E lane out of two distinct
>sockets. We are seeking a way to control MPI traffic thought each one of
>them directly into the application. In specific we have a single MPI rank per
>node that goes multi-threading using OpenMP. MPI_THREAD_MULTIPLE is
>used, each OpenMP thread may initiate MPI communication. We would like to
>assign IB-0 to thread 0 and IB-1 to thread 1.
>
>Via mpirun or env variables we can control which IB interface to use by binding
>it to a specific MPI rank (or by apply a policy that relate IB to MPi ranks). 
>But if
>there is only one MPI rank active, how we can differentiate the traffic across
>multiple IB cards?
>
>Thanks in advance for any suggestion about this matter.
>
>Regards,
>Filippo
>
>--
>Mr. Filippo SPIGA, M.Sc.
>http://filippospiga.info ~ skype: filippo.spiga
>
>«Nobody will drive us out of Cantor's paradise.» ~ David Hilbert
>
>*
>Disclaimer: "Please note this message and any attachments are
>CONFIDENTIAL and may be privileged or otherwise protected from disclosure.
>The contents are not to be disclosed to anyone other than the addressee.
>Unauthorized recipients are requested to preserve this confidentiality and to
>advise the sender immediately of any error in transmission."
>
>
>___
>users mailing list
>us...@open-mpi.org
>Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>Link to this post: http://www.open-
>mpi.org/community/lists/users/2015/04/26614.php

---
This email message is for the sole use of the intended recipient(s) and may 
contain
confidential information.  Any unauthorized review, use, disclosure or 
distribution
is prohibited.  If you are not the intended recipient, please contact the 
sender by
reply email and destroy all copies of the original message.
---


Re: [OMPI users] Different HCA from different OpenMP threads (same rank using MPI_THREAD_MULTIPLE)

2015-04-06 Thread Ralph Castain
I’m afraid Rolf is correct. We can only define the binding pattern at the time of 
initial process execution, which is well before you start spinning up 
individual threads. At that point, we no longer have the ability to do binding.

That said, you can certainly have your application specify a thread-level 
binding. You’d have to do the heavy lifting yourself in the app, I’m afraid, 
instead of relying on us to do it for you.
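
If the goal is just to park thread 0 on the socket next to IB-0 and thread 1 on 
the socket next to IB-1, the placement part can look roughly like the untested 
sketch below, which uses hwloc directly in the application (this is not an Open 
MPI feature). Note that Rolf's point still applies: the openib BTL picks its 
ports per process at MPI_Init, so this only moves the threads; it does not by 
itself split the traffic between the two HCAs.

#include <hwloc.h>
#include <omp.h>

// Illustrative sketch: bind each OpenMP thread to the socket whose index
// matches the thread number (thread 0 -> socket 0, thread 1 -> socket 1).
void bind_threads_to_sockets(void)
{
    hwloc_topology_t topo;
    hwloc_topology_init(&topo);
    hwloc_topology_load(topo);

    int nsockets = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_SOCKET);
    if (nsockets > 0) {
        #pragma omp parallel
        {
            int tid = omp_get_thread_num();
            hwloc_obj_t sock = hwloc_get_obj_by_type(topo, HWLOC_OBJ_SOCKET,
                                                     tid % nsockets);
            // Bind only the calling thread to this socket's cpuset.
            hwloc_set_cpubind(topo, sock->cpuset, HWLOC_CPUBIND_THREAD);
        }
    }

    hwloc_topology_destroy(topo);
}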


> On Apr 6, 2015, at 2:24 PM, Rolf vandeVaart  wrote:
> 
> It is my belief that you cannot do this at least with the openib BTL.  The IB 
> card to be used for communication is selected during the MPI _Init() phase 
> based on where the CPU process is bound to.  You can see some of this 
> selection by using the --mca btl_base_verbose 1 flag.  There is a bunch of 
> output (which I have deleted), but you will see a few lines like this.
> 
> [ivy5] [rank=1] openib: using port mlx5_0:1
> [ivy5] [rank=1] openib: using port mlx5_0:2
> [ivy4] [rank=0] openib: using port mlx5_0:1
> [ivy4] [rank=0] openib: using port mlx5_0:2
> 
> And if you have multiple NICs, you may also see some messages like this:
> "[rank=%d] openib: skipping device %s; it is too far away"
> (This was lifted from the  code. I do not have a configuration right now 
> where I can generate the second message.)
> 
> I cannot see how we can make this specific to a thread.  Maybe others have a 
> different opinion.
> Rolf
> 
>> -Original Message-
>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Filippo Spiga
>> Sent: Monday, April 06, 2015 5:46 AM
>> To: Open MPI Users
>> Cc: Mohammed Sourouri
>> Subject: [OMPI users] Different HCA from different OpenMP threads (same
>> rank using MPI_THREAD_MULTIPLE)
>> 
>> Dear Open MPI developers,
>> 
>> I wonder if there is a way to address this particular scenario using MPI_T or
>> other strategies in Open MPI. I saw a similar discussion few days ago, I 
>> assume
>> the same challenges are applied in this case but I just want to check. Here 
>> is
>> the scenario:
>> 
>> We have a system composed by dual rail Mellanox IB, two distinct Connect-IB
>> cards per node each one sitting on a different PCI-E lane out of two distinct
>> sockets. We are seeking a way to control MPI traffic thought each one of
>> them directly into the application. In specific we have a single MPI rank per
>> node that goes multi-threading using OpenMP. MPI_THREAD_MULTIPLE is
>> used, each OpenMP thread may initiate MPI communication. We would like to
>> assign IB-0 to thread 0 and IB-1 to thread 1.
>> 
>> Via mpirun or env variables we can control which IB interface to use by 
>> binding
>> it to a specific MPI rank (or by apply a policy that relate IB to MPi 
>> ranks). But if
>> there is only one MPI rank active, how we can differentiate the traffic 
>> across
>> multiple IB cards?
>> 
>> Thanks in advance for any suggestion about this matter.
>> 
>> Regards,
>> Filippo
>> 
>> --
>> Mr. Filippo SPIGA, M.Sc.
>> http://filippospiga.info ~ skype: filippo.spiga
>> 
>> «Nobody will drive us out of Cantor's paradise.» ~ David Hilbert
>> 
>> *
>> Disclaimer: "Please note this message and any attachments are
>> CONFIDENTIAL and may be privileged or otherwise protected from disclosure.
>> The contents are not to be disclosed to anyone other than the addressee.
>> Unauthorized recipients are requested to preserve this confidentiality and to
>> advise the sender immediately of any error in transmission."
>> 
>> 
>> ___
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post: http://www.open-
>> mpi.org/community/lists/users/2015/04/26614.php
> 
> ---
> This email message is for the sole use of the intended recipient(s) and may 
> contain
> confidential information.  Any unauthorized review, use, disclosure or 
> distribution
> is prohibited.  If you are not the intended recipient, please contact the 
> sender by
> reply email and destroy all copies of the original message.
> ---
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2015/04/26624.php



[OMPI users] Simple openmpi-mca-params.conf question

2015-04-06 Thread Ray Sheppard

Hello list,
  I have been given permission to impose my usual defaults on the 
system.  I have been reading documentation for the 
openmpi-mca-params.conf file. "ompi_info --param all all" did not help.  
All the FAQs seemed to do was confuse me. I cannot seem to understand 
how to instantiate a simple switch like:


 -mca btl_tcp_if_exclude eth2

I have tried various ways but always seem to get:
 keyval parser: error 2 reading file 
/N/u/rsheppar/Karst/.openmpi/mca-params.conf at line 1:


I would really appreciate a simple example of a proper entry. Thanks.
  Ray



Re: [OMPI users] Simple openmpi-mca-params.conf question

2015-04-06 Thread Ralph Castain
btl_tcp_if_exclude=eth2

should work
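
The file just takes plain name=value lines, one parameter per line, with no 
leading "-mca"; lines starting with '#' are comments. For example, system-wide 
in <prefix>/etc/openmpi-mca-params.conf (or per user in 
~/.openmpi/mca-params.conf):

# site-wide MCA defaults
btl_tcp_if_exclude=eth2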

> On Apr 6, 2015, at 5:09 PM, Ray Sheppard  wrote:
> 
> Hello list,
>  I have been given permission to impose my usual defaults on the system.  I 
> have been reading documentation for the openmpi-mca-params.conf file. 
> "ompi_info --param all all" did not help.  All the FAQ's seem to do was 
> confuse me. I can not seem to understand how to instantiate a simple switch 
> like:
> 
> -mca btl_tcp_if_exclude eth2
> 
> I have tried various ways but always seem to get:
> keyval parser: error 2 reading file 
> /N/u/rsheppar/Karst/.openmpi/mca-params.conf at line 1:
> 
> I would really appreciate a simple example of a proper entry. Thanks.
>  Ray
> 
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2015/04/26626.php



Re: [OMPI users] Simple openmpi-mca-params.conf question

2015-04-06 Thread Ray Sheppard

Thanks Ralph,
  The FAQ had me putting in prefixes to that line and I just never 
figured it out.  I have just dumbly added these things to my mpirun 
line.  I have one other question: when I write into the system conf 
file, will mpirun know to look there (which seems to be what the file 
says), or should I explicitly add the .../etc directory to a variable 
like CPATH?  Thanks again,

Ray

On 4/6/2015 8:14 PM, Ralph Castain wrote:

btl_tcp_if_exclude=eth2

should work


On Apr 6, 2015, at 5:09 PM, Ray Sheppard  wrote:

Hello list,
  I have been given permission to impose my usual defaults on the system.  I have been 
reading documentation for the openmpi-mca-params.conf file. "ompi_info --param all 
all" did not help.  All the FAQ's seem to do was confuse me. I can not seem to 
understand how to instantiate a simple switch like:

-mca btl_tcp_if_exclude eth2

I have tried various ways but always seem to get:
keyval parser: error 2 reading file 
/N/u/rsheppar/Karst/.openmpi/mca-params.conf at line 1:

I would really appreciate a simple example of a proper entry. Thanks.
  Ray

___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/04/26626.php

___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/04/26627.php




Re: [OMPI users] Simple openmpi-mca-params.conf question

2015-04-06 Thread Ralph Castain
Yep - it will automatically pick it up. The file should be in the /etc 
directory.

> On Apr 6, 2015, at 5:49 PM, Ray Sheppard  wrote:
> 
> Thanks Ralph,
>  The FAQ had me putting in prefixes to that line and I just never figured it 
> out.  I have just dumbly added these things to my mpirun line.  I have one 
> other question. When I write into the system conf file, will the mpirun know 
> to look there (which seems what the file says) or should I explicitly add the 
> .../etc directory to a variable like CPATH?  Thanks again,
> Ray
> 
> On 4/6/2015 8:14 PM, Ralph Castain wrote:
>> btl_tcp_if_exclude=eth2
>> 
>> should work
>> 
>>> On Apr 6, 2015, at 5:09 PM, Ray Sheppard  wrote:
>>> 
>>> Hello list,
>>>  I have been given permission to impose my usual defaults on the system.  I 
>>> have been reading documentation for the openmpi-mca-params.conf file. 
>>> "ompi_info --param all all" did not help.  All the FAQ's seem to do was 
>>> confuse me. I can not seem to understand how to instantiate a simple switch 
>>> like:
>>> 
>>> -mca btl_tcp_if_exclude eth2
>>> 
>>> I have tried various ways but always seem to get:
>>> keyval parser: error 2 reading file 
>>> /N/u/rsheppar/Karst/.openmpi/mca-params.conf at line 1:
>>> 
>>> I would really appreciate a simple example of a proper entry. Thanks.
>>>  Ray
>>> 
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> Link to this post: 
>>> http://www.open-mpi.org/community/lists/users/2015/04/26626.php
>> ___
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/users/2015/04/26627.php
> 
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2015/04/26628.php



Re: [OMPI users] Simple openmpi-mca-params.conf question

2015-04-06 Thread Ray Sheppard

Thanks again!
Ray

On 4/6/2015 8:58 PM, Ralph Castain wrote:

Yep - it will automatically pick it up. The file should be in the /etc 
directory.


On Apr 6, 2015, at 5:49 PM, Ray Sheppard  wrote:

Thanks Ralph,
  The FAQ had me putting in prefixes to that line and I just never figured it 
out.  I have just dumbly added these things to my mpirun line.  I have one 
other question. When I write into the system conf file, will the mpirun know to 
look there (which seems what the file says) or should I explicitly add the 
.../etc directory to a variable like CPATH?  Thanks again,
Ray

On 4/6/2015 8:14 PM, Ralph Castain wrote:

btl_tcp_if_exclude=eth2

should work


On Apr 6, 2015, at 5:09 PM, Ray Sheppard  wrote:

Hello list,
  I have been given permission to impose my usual defaults on the system.  I have been 
reading documentation for the openmpi-mca-params.conf file. "ompi_info --param all 
all" did not help.  All the FAQ's seem to do was confuse me. I can not seem to 
understand how to instantiate a simple switch like:

-mca btl_tcp_if_exclude eth2

I have tried various ways but always seem to get:
keyval parser: error 2 reading file 
/N/u/rsheppar/Karst/.openmpi/mca-params.conf at line 1:

I would really appreciate a simple example of a proper entry. Thanks.
  Ray

___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/04/26626.php

___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/04/26627.php

___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/04/26628.php

___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/04/26629.php