[OMPI users] Open MPI collectives algorithm selection

2015-03-10 Thread Khalid Hasanov
Hello,

I would like to know if Open MPI provides some kind of mechanism to select
collective algorithms, such as the broadcast algorithm, at run time
depending on some logic. For example, I would like to use something like
this:

if (some_condition) ompi_binomial_broadcast(...);
else                ompi_pipeline_broadcast(...);

I know it is possible to force a fixed algorithm via
coll_tuned_use_dynamic_rules, or to define a custom selection rule using
coll_tuned_dynamic_rules_filename. But I think that is not suitable in this
situation, as the dynamic rules are mainly based on the message size,
segment size, and communicator size.
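
For reference, this is the kind of usage I mean (the value 6 for the
binomial tree is just an example; the number-to-algorithm mapping can be
listed with ompi_info and may differ between versions):

mpirun --mca coll_tuned_use_dynamic_rules 1 \
       --mca coll_tuned_bcast_algorithm 6 \
       -np 16 ./my_app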

Another option could be to use Open MPI internal APIs such as

ompi_coll_tuned_bcast_intra_binomial(buf, count, dtype, root, comm,
module, segsize);

But that depends heavily on Open MPI internals, as it takes an
mca_coll_base_module_t.

Is there any better option (other than using my own implementation of the
collectives)?

Any suggestion is highly appreciated.

Thanks

Regards,
Khalid


Re: [OMPI users] Open MPI collectives algorithm selection

2015-03-10 Thread Gilles Gouaillardet
Khalid,

I am not aware of such a mechanism.

/* there might be a way to use the MPI_T_* mechanisms to force the
algorithm, and I will let other folks comment on that */
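
Something along these lines, completely untested, and assuming Open MPI
exposes coll_tuned_bcast_algorithm as a writable control variable:

#include <mpi.h>
#include <string.h>

/* untested sketch: locate an MPI_T control variable by name and write an
   int to it; MPI_T_init_thread() must have been called beforehand */
static int set_cvar_int(const char *name, int value)
{
    int i, num;
    MPI_T_cvar_get_num(&num);
    for (i = 0; i < num; i++) {
        char cname[256];
        int nlen = sizeof(cname), verbosity, bind, scope, count;
        MPI_Datatype dt;
        MPI_T_enum en;
        MPI_T_cvar_get_info(i, cname, &nlen, &verbosity, &dt, &en,
                            NULL, NULL, &bind, &scope);
        if (strcmp(cname, name) == 0) {
            MPI_T_cvar_handle h;
            MPI_T_cvar_handle_alloc(i, NULL, &h, &count);
            MPI_T_cvar_write(h, &value);
            MPI_T_cvar_handle_free(&h);
            return MPI_SUCCESS;
        }
    }
    return MPI_ERR_OTHER; /* no such control variable */
}

/* e.g. set_cvar_int("coll_tuned_bcast_algorithm", 6); */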

You definitely cannot directly invoke ompi_coll_tuned_bcast_intra_binomial
(abstraction violation, non-portable, and you would be missing some
parameters).

Out of curiosity, what do you have in mind for (some_condition)?
/* since it seems implicit that (some_condition) is independent of
communicator size and message size */

Cheers,

Gilles



[OMPI users] disappearance of the memory registration error in 1.8.x?

2015-03-10 Thread Fischer, Greg A.
Hello,

I'm trying to run the "connectivity_c" test on a variety of systems using
Open MPI 1.8.4. The test segfaults when running across nodes on one
particular type of system, and only when using the openib BTL. (The test
runs without error if I specify "--mca btl tcp,self".) Here's the output:

1033 fischega@bl1415[~/tmp/openmpi/1.8.4_test_examples_SLES11_SP2/error]> 
mpirun -np 16 connectivity_c
[bl1415:29526] *** Process received signal ***
[bl1415:29526] Signal: Segmentation fault (11)
[bl1415:29526] Signal code:  (128)
[bl1415:29526] Failing at address: (nil)
[bl1415:29526] [ 0] /lib64/libpthread.so.0(+0xf5d0)[0x2ab1e72915d0]
[bl1415:29526] [ 1] 
/data/pgrlf/openmpi-1.8.4/SLES10_SP2_lib/lib/libopen-pal.so.6(opal_memory_ptmalloc2_int_malloc+0x29e)[0x2ab1e7c550be]
[bl1415:29526] [ 2] 
/data/pgrlf/openmpi-1.8.4/SLES10_SP2_lib/lib/libopen-pal.so.6(opal_memory_ptmalloc2_int_memalign+0x69)[0x2ab1e7c58829]
[bl1415:29526] [ 3] 
/data/pgrlf/openmpi-1.8.4/SLES10_SP2_lib/lib/libopen-pal.so.6(opal_memory_ptmalloc2_memalign+0x6f)[0x2ab1e7c583ff]
[bl1415:29526] [ 4] 
/data/pgrlf/openmpi-1.8.4/SLES10_SP2_lib/lib/openmpi/mca_btl_openib.so(+0x2867b)[0x2ab1eac8a67b]
[bl1415:29526] [ 5] 
/data/pgrlf/openmpi-1.8.4/SLES10_SP2_lib/lib/openmpi/mca_btl_openib.so(+0x1f712)[0x2ab1eac81712]
[bl1415:29526] [ 6] /lib64/libpthread.so.0(+0x75f0)[0x2ab1e72895f0]
[bl1415:29526] [ 7] /lib64/libc.so.6(clone+0x6d)[0x2ab1e757484d]
[bl1415:29526] *** End of error message ***

When I run the same test using a previous build of Open MPI 1.6.5 on this
system, it prints a memory registration warning but otherwise executes
normally:

--
WARNING: It appears that your OpenFabrics subsystem is configured to only
allow registering part of your physical memory.  This can cause MPI jobs to
run with erratic performance, hang, and/or crash.

Open MPI 1.8.4 does not seem to report a memory registration warning in
situations where previous versions would. Is this because Open MPI 1.8.4 is
no longer vulnerable to this type of condition?
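
For what it's worth, the registered-memory limits can be checked like this
(this assumes mlx4 hardware; the module parameters differ for other HCAs):

ulimit -l    # locked-memory limit; should be unlimited for openib
cat /sys/module/mlx4_core/parameters/log_num_mtt
cat /sys/module/mlx4_core/parameters/log_mtts_per_seg
# registerable memory is roughly
#   2^log_num_mtt * 2^log_mtts_per_seg * page_size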

Thanks,
Greg




Re: [OMPI users] Open MPI collectives algorithm selection

2015-03-10 Thread George Bosilca
Khalid,

The decision is rechecked every time we create a new communicator. So you
might create a solution that forces the algorithm to whatever you think is
best (using the environment variables you mentioned), then create a
communicator, and free it once you’re done.
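
A rough sketch of that pattern (untested; it assumes the tuned component
re-reads the OMPI_MCA_* environment variables when the new communicator is
created):

#include <mpi.h>
#include <stdlib.h>

static void bcast_with_algorithm(void *buf, int count, MPI_Datatype dtype,
                                 int root, MPI_Comm comm, const char *alg)
{
    MPI_Comm tmp;

    /* force the tuned bcast algorithm for the next selection */
    setenv("OMPI_MCA_coll_tuned_use_dynamic_rules", "1", 1);
    setenv("OMPI_MCA_coll_tuned_bcast_algorithm", alg, 1); /* e.g. "6" */

    MPI_Comm_dup(comm, &tmp); /* the decision is rechecked here */
    MPI_Bcast(buf, count, dtype, root, tmp);
    MPI_Comm_free(&tmp);
}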

I have no idea what you’re trying to achieve, but be aware there is a
burden associated with the creation of a communicator, so the cost of this
approach might outweigh the benefits.

  George.




Re: [OMPI users] Open MPI collectives algorithm selection

2015-03-10 Thread Khalid Hasanov
George and Gilles, thank you for your answers.


@George, honestly I didn't know that the decision is rechecked on every
communicator creation. I will try it. In fact, we used sub-communicators in
some other research work previously, and indeed the creation overhead
outweighed the benefits for small message sizes in collectives.

In my opinion, it would be beneficial for researchers if MPI
implementations exposed similar collective operation APIs with an
additional algorithm parameter where applicable; something like the sketch
below.
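
Purely hypothetical, just to illustrate the idea:

/* an extended broadcast with an explicit algorithm selector;
   0 could mean "let the implementation decide" */
int MPIX_Bcast_alg(void *buf, int count, MPI_Datatype dtype,
                   int root, MPI_Comm comm, int algorithm);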


Best regards,
Khalid