Hi,

George Bosilca <bosi...@icl.utk.edu> writes:

> If I'm not mistaken, hcoll is playing with the opal_progress in a way
> that conflicts with the blessed usage of progress in OMPI and prevents
> other components from advancing and timely completing requests. The
> impact is minimal for sequential applications using only blocking
> calls, but is jeopardizing performance when multiple types of
> communications are simultaneously executing or when multiple threads
> are active.
>
> The solution might be very simple: hcoll is a module providing support
> for collective communications so as long as you don't use collectives,
> or the tuned module provides collective performance similar to hcoll
> on your cluster, just go ahead and disable hcoll. You can also reach
> out to Mellanox folks asking them to fix the hcoll usage of
> opal_progress.

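For reference, disabling hcoll as you suggest needs no code changes; with
Open MPI's standard MCA syntax either of these should do it
(coll_hcoll_enable is the parameter name I would expect the component to
register, but it is worth confirming with "ompi_info --param coll hcoll"):

  mpirun --mca coll ^hcoll ./my_app          # exclude the component
  mpirun --mca coll_hcoll_enable 0 ./my_app  # or switch it off explicitly
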
Still, until we find a more robust solution I was thinking of simply
querying the MPI implementation at run time, using the threaded version of
our code if hcoll is not present and falling back to the unthreaded version
if it is. Looking at the coll.h file I see some functions that might be
useful, for example mca_coll_base_component_comm_query_2_0_0_fn_t, but I
have never delved into this part of the code. Would this be an appropriate
approach? Any examples of how to query in code for a particular component?

Thanks,
-- 
Ángel de Vicente

Tel.: +34 922 605 747
Web.: http://research.iac.es/proyecto/polmag/
