Hi,

Thanks for your reply, Ralph.

The only option I'm using when configuring OpenMPI is '--prefix'.
When checking the config.log file, I see

configure:208504: checking whether the openib BTL will use malloc hooks
configure:208510: result: yes

so I guess it is properly enabled (the full config.log is attached to this
email).



However, I think I have found the cause of the bug (line numbers refer to
the source code of OpenMPI 1.8.5):

The default value of memalign_threshold is taken from eager_limit in
function btl_openib_register_mca_params() in btl_openib_mca.c, line 717.
But the default value of eager_limit is set in btl_openib_component.c at
line 193, right after the call to btl_openib_register_mca_params().

To summarize, memalign_threshold copies its value from eager_limit before
eager_limit itself has been assigned its default value.
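
To make the ordering problem concrete, here is a minimal sketch. It is
not the OpenMPI source: the struct and function names are hypothetical
stand-ins, and I am assuming that eager_limit still holds zero when the
registration function runs, which matches the default of 0 reported by
ompi_info.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the openib component's parameters. */
struct params {
    size_t eager_limit;
    size_t memalign_threshold;
};

/* Mimics the registration step: the threshold default is copied from
 * whatever eager_limit holds at the moment of the call. */
void register_params(struct params *p)
{
    assert(p != NULL);
    p->memalign_threshold = p->eager_limit;
}

/* Order described above: register first, assign eager_limit afterwards.
 * The threshold keeps the stale value 0. */
struct params init_buggy(void)
{
    struct params p = {0, 0};   /* assumption: zero-initialized */
    register_params(&p);
    p.eager_limit = 12288;
    return p;
}

/* Possible reordering: assign eager_limit first, then register. */
struct params init_fixed(void)
{
    struct params p = {0, 0};
    p.eager_limit = 12288;
    register_params(&p);
    return p;
}
```

With the first ordering the threshold stays at 0, so every allocation
would be aligned; with the second it picks up 12288 as documented.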



Best regards,

Xavier








On Mon, May 25, 2015 at 2:27 AM, Ralph Castain <r...@open-mpi.org> wrote:

>  Looking at the code, we do in fact set the memalign_threshold =
> eager_limit by default, but only if you configured with
> --enable-btl-openib-malloc-alignment AND/OR we found the malloc hook
> functions were available.
>
>  You might check config.log to see if the openib malloc hooks were
> enabled. My guess is that they weren’t, for some reason.
>
>
>  On May 24, 2015, at 9:07 AM, Xavier Besseron <xavier.besse...@uni.lu>
> wrote:
>
>     Dear OpenMPI developers / users,
>
> This is much more a comment than a question since I believe I have already
> solved my issue. But I would like to report it.
>
> I have noticed my code performed very badly with OpenMPI when InfiniBand is
> enabled, sometimes with +50% or even +100% overhead.
> I also see this slowdown when running with one thread and one process. In
> that case, there is no MPI call other than MPI_Init() and MPI_Finalize().
> This overhead disappears if I disable the openib btl at runtime, i.e. with
> '--mca btl ^openib'.
> After further investigation, I figured out it comes from the memory
> allocator which is aligning every memory allocation when Infiniband is
> used.
> This makes sense because my code is a large irregular C++ code creating
> and deleting many objects.
>
> Just below is the documentation of the relevant MCA parameters, as reported
> by ompi_info:
>
> MCA btl: parameter "*btl_openib_memalign*" (current value: "32", data
> source: default, level: 9 dev/all, type: int)
>          [64 | 32 | 0] - Enable (64bit or 32bit)/Disable(0)
> memoryalignment for all malloc calls if btl openib is used.
>
> MCA btl: parameter "*btl_openib_memalign_threshold*" (current value: "*0*",
> data source: default, level: 9 dev/all, type: size_t)
>          Allocating memory more than btl_openib_memalign_threshholdbytes
> will automatically be algined to the value of btl_openib_memalign 
> bytes.*memalign_threshhold
> defaults to the same value as mca_btl_openib_eager_limit*.
>
> MCA btl: parameter "*btl_openib_eager_limit*" (current value: "*12288*",
> data source: default, level: 4 tuner/basic, type: size_t)
>          Maximum size (in bytes, including header) of "short" messages
> (must be >= 1).
>
>
> In the end, the problem is that the default value for
> btl_openib_memalign_threshold is 0, which means that *all* memory
> allocations are aligned to 32 bits.
> The documentation says that the default value of
> btl_openib_memalign_threshold should be the same as
> btl_openib_eager_limit, i.e. 12288 instead of 0.
>
>  On my side, changing btl_openib_memalign_threshold to 12288 fixes my
> performance issue.
> However, I believe that the default value of btl_openib_memalign_threshold
> should be fixed in the OpenMPI code (or at least the documentation should
> be fixed).
>
>  I tried OpenMPI 1.8.5, 1.7.3 and 1.6.4 and it's all the same.
>
>
>  Bonus question:
> As this issue might impact other users, I'm considering applying a global
> fix on our clusters by setting this default value in
> etc/openmpi-mca-params.conf.
> Do you see any good reason not to do it?
>
>  Thank you for your comments.
>
>  Best regards,
>
>  Xavier
>
>
>     --
>  Dr Xavier BESSERON
> Research associate
> FSTC, University of Luxembourg
> Campus Kirchberg, Office E-007
> Phone: +352 46 66 44 5418
> http://luxdem.uni.lu/
>
>   _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/05/26913.php
>
>
>


-- 
Dr Xavier BESSERON
Research associate
FSTC, University of Luxembourg
Campus Kirchberg, Office E-007
Phone: +352 46 66 44 5418
http://luxdem.uni.lu/
