I don’t see a problem with it. FWIW: I’m getting ready to release 1.8.6 within 
the next week.
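
For anyone reading along, the ordering issue Xavier describes in the quoted messages can be sketched roughly as follows. This is only an illustration in Python, not the Open MPI source: `OpenibSettings`, `register_params`, and the two helper functions are hypothetical names, and only the value 12288 comes from the ompi_info output quoted below. The actual change going into 1.8.6 may look different.

```python
from dataclasses import dataclass

# Default eager limit as reported by ompi_info in this thread.
EAGER_LIMIT_DEFAULT = 12288


@dataclass
class OpenibSettings:
    """Hypothetical stand-in for the openib settings (not Open MPI's real structs)."""
    eager_limit: int = 0
    memalign_threshold: int = 0


def register_params(settings: OpenibSettings) -> None:
    """Copy the threshold default from eager_limit, as the report describes."""
    settings.memalign_threshold = settings.eager_limit


def threshold_when_registering_first() -> int:
    """Reported ordering: parameters are registered before eager_limit is set."""
    settings = OpenibSettings()
    register_params(settings)
    settings.eager_limit = EAGER_LIMIT_DEFAULT
    return settings.memalign_threshold


def threshold_when_assigning_first() -> int:
    """Ordering in which eager_limit already holds its default when it is copied."""
    settings = OpenibSettings()
    settings.eager_limit = EAGER_LIMIT_DEFAULT
    register_params(settings)
    return settings.memalign_threshold


if __name__ == "__main__":
    print("register first:", threshold_when_registering_first())
    print("assign first:  ", threshold_when_assigning_first())
```

With the first ordering the threshold stays at 0, which matches the "current value: 0" shown by ompi_info; with the second it picks up the eager limit.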


> On May 25, 2015, at 8:46 AM, Xavier Besseron <xavier.besse...@uni.lu> wrote:
> 
> Good that it will be fixed in the next release!
> 
> In the meantime, and because it might impact other users,
> I would like to ask my sysadmins to set btl_openib_memalign_threshold=12288 
> in etc/openmpi-mca-params.conf on our clusters.
> 
> Do you see any good reason not to do it?
> 
> Thanks!
> 
> 
> Xavier
> 
> 
> 
> On Mon, May 25, 2015 at 4:12 PM, Ralph Castain <r...@open-mpi.org> wrote:
> I found the problem - someone had a typo in btl_openib_mca.c. The threshold 
> needs to be set to the module eager limit, as that is the only thing defined at 
> that point.
> 
> Thanks for bringing it to our attention! I’ll set it up to go into 1.8.6
> 
> 
>> On May 25, 2015, at 3:04 AM, Xavier Besseron <xavier.besse...@uni.lu> wrote:
>> 
>> Hi,
>> 
>> Thanks for your reply Ralph.
>> 
>> The only option I'm using when configuring OpenMPI is '--prefix'.
>> When checking the config.log file, I see 
>> 
>> configure:208504: checking whether the openib BTL will use malloc hooks
>> configure:208510: result: yes
>> 
>> so I guess it is properly enabled (the full config.log is attached to this 
>> email).
>> 
>> 
>> 
>> However, I think I have found the cause of the bug (line numbers refer to the 
>> source code of OpenMPI 1.8.5):
>> 
>> The default value of memalign_threshold is taken from eager_limit in 
>> function btl_openib_register_mca_params() in btl_openib_mca.c line 717.
>> But the default value of eager_limit is set in btl_openib_component.c at 
>> line 193, right after the call to btl_openib_register_mca_params().
>> 
>> To summarize, memalign_threshold takes its value from eager_limit before 
>> eager_limit itself has been assigned.
>> 
>> 
>> 
>> Best regards,
>> 
>> Xavier
>> 
>> 
>> 
>>  
>> 
>> 
>> 
>> 
>> On Mon, May 25, 2015 at 2:27 AM, Ralph Castain <r...@open-mpi.org> wrote:
>> Looking at the code, we do in fact set the memalign_threshold = eager_limit 
>> by default, but only if you configured with 
>> --enable-btl-openib-malloc-alignment AND/OR we found the malloc hook 
>> functions were available.
>> 
>> You might check config.log to see if the openib malloc hooks were enabled. 
>> My guess is that they weren’t, for some reason.
>> 
>> 
>>> On May 24, 2015, at 9:07 AM, Xavier Besseron <xavier.besse...@uni.lu> wrote:
>>> 
>>> Dear OpenMPI developers / users,
>>> 
>>> This is more of a comment than a question, since I believe I have already 
>>> solved my issue, but I would like to report it.
>>> 
>>> I have noticed that my code performs very badly with OpenMPI when InfiniBand is 
>>> enabled, sometimes with +50% or even +100% overhead.
>>> I also see this slowdown when running with one thread and one process. In 
>>> that case, there are no MPI calls other than MPI_Init() and MPI_Finalize().
>>> This overhead disappears if I disable the openib btl at runtime, i.e. with 
>>> '--mca btl ^openib'.
>>> After further investigation, I figured out that it comes from the memory 
>>> allocator, which aligns every memory allocation when InfiniBand is used.
>>> This makes sense because my code is a large irregular C++ code creating and 
>>> deleting many objects.
>>> 
>>> Just below is the documentation of the relevant MCA parameters, as reported by 
>>> ompi_info:
>>> 
>>> MCA btl: parameter "btl_openib_memalign" (current value: "32", data source: 
>>> default, level: 9 dev/all, type: int)
>>>          [64 | 32 | 0] - Enable (64bit or 32bit)/Disable(0) memoryalignment 
>>> for all malloc calls if btl openib is used.
>>> 
>>> MCA btl: parameter "btl_openib_memalign_threshold" (current value: "0", 
>>> data source: default, level: 9 dev/all, type: size_t)
>>>          Allocating memory more than btl_openib_memalign_threshholdbytes 
>>> will automatically be algined to the value of btl_openib_memalign 
>>> bytes.memalign_threshhold defaults to the same value as 
>>> mca_btl_openib_eager_limit.
>>> 
>>> MCA btl: parameter "btl_openib_eager_limit" (current value: "12288", data 
>>> source: default, level: 4 tuner/basic, type: size_t)
>>>          Maximum size (in bytes, including header) of "short" messages 
>>> (must be >= 1).
>>> 
>>> 
>>> In the end, the problem is that the default value for 
>>> btl_openib_memalign_threshold is 0, which means that all memory allocations 
>>> are aligned to 32 bits.
>>> The documentation says that the default value of 
>>> btl_openib_memalign_threshold should be the same as 
>>> btl_openib_eager_limit, i.e. 12288 instead of 0.
>>> 
>>> On my side, changing btl_openib_memalign_threshold to 12288 fixes my 
>>> performance issue.
>>> However, I believe that the default value of btl_openib_memalign_threshold 
>>> should be fixed in the OpenMPI code (or at least the documentation should 
>>> be fixed).
>>> 
>>> I tried OpenMPI 1.8.5, 1.7.3 and 1.6.4 and it's all the same.
>>> 
>>> 
>>> Bonus question:
>>> As this issue might impact other users, I'm considering applying a global 
>>> fix on our clusters by setting this default value in 
>>> etc/openmpi-mca-params.conf.
>>> Do you see any good reason not to do it?
>>> 
>>> Thank you for your comments.
>>> 
>>> Best regards,
>>> 
>>> Xavier
>>> 
>>> 
>>> -- 
>>> Dr Xavier BESSERON
>>> Research associate
>>> FSTC, University of Luxembourg
>>> Campus Kirchberg, Office E-007
>>> Phone: +352 46 66 44 5418
>>> http://luxdem.uni.lu/
>>> 
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> Link to this post: 
>>> http://www.open-mpi.org/community/lists/users/2015/05/26913.php
>> 
>> 
>> 
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/users/2015/05/26915.php
> 
> 
> 
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2015/05/26918.php
