Nathan / Steve -- you guys are nominally the owners of the openib BTL: can you 
please investigate?


> On Jun 10, 2015, at 4:15 PM, Ralph Castain <r...@open-mpi.org> wrote:
> 
> Odd - without that setting, the value is essentially undefined, so it’s hard 
> to understand how that is any better. Maybe the whole alignment thing is 
> busted, and leaving it undefined (which usually defaults to zero, but not 
> always) causes it to be turned “off”?
> 
> I don’t really care, mind you - but it is clearly an error the way it was 
> before.
> 
> 
>> On Jun 10, 2015, at 12:39 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> 
>> wrote:
>> 
>> Ralph --
>> 
>> This change was not correct 
>> (https://github.com/open-mpi/ompi/commit/ce915b5757d428d3e914dcef50bd4b2636561bca).
>>   It is causing memory corruption in the openib BTL.
>> 
>> 
>> 
>>> On May 25, 2015, at 11:56 AM, Ralph Castain <r...@open-mpi.org> wrote:
>>> 
>>> I don’t see a problem with it. FWIW: I’m getting ready to release 1.8.6 in 
>>> the next week
>>> 
>>> 
>>>> On May 25, 2015, at 8:46 AM, Xavier Besseron <xavier.besse...@uni.lu> 
>>>> wrote:
>>>> 
>>>> Good that it will be fixed in the next release!
>>>> 
>>>> In the meantime, and because it might impact other users,
>>>> I would like to ask my sysadmins to set 
>>>> btl_openib_memalign_threshold=12288 in etc/openmpi-mca-params.conf on our 
>>>> clusters.
>>>> 
>>>> Do you see any good reason not to do it?
>>>> 
>>>> Thanks!
>>>> 
>>>> 
>>>> Xavier
>>>> 
>>>> 
>>>> 
>>>> On Mon, May 25, 2015 at 4:12 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>> I found the problem - someone had a typo in btl_openib_mca.c. The 
>>>> threshold needs to be set to the module eager limit as that is the only 
>>>> thing defined at that point.
>>>> 
>>>> Thanks for bringing it to our attention! I’ll set it up to go into 1.8.6
>>>> 
>>>> 
>>>>> On May 25, 2015, at 3:04 AM, Xavier Besseron <xavier.besse...@uni.lu> 
>>>>> wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> Thanks for your reply Ralph.
>>>>> 
>>>>> The only option I'm using when configuring OpenMPI is '--prefix'.
>>>>> When checking the config.log file, I see 
>>>>> 
>>>>> configure:208504: checking whether the openib BTL will use malloc hooks
>>>>> configure:208510: result: yes
>>>>> 
>>>>> so I guess it is properly enabled (the full config.log is attached to this 
>>>>> email).
>>>>> 
>>>>> 
>>>>> 
>>>>> However, I think I have found the cause of the bug (line numbers refer to 
>>>>> the source code of OpenMPI 1.8.5):
>>>>> 
>>>>> The default value of memalign_threshold is taken from eager_limit in 
>>>>> function btl_openib_register_mca_params() in btl_openib_mca.c line 717.
>>>>> But the default value of eager_limit is set in btl_openib_component.c at 
>>>>> line 193, right after the call to btl_openib_register_mca_params().
>>>>> 
>>>>> To summarize, memalign_threshold gets its value from eager_limit before 
>>>>> eager_limit has been assigned its own default.
>>>>> 
>>>>> 
>>>>> 
>>>>> Best regards,
>>>>> 
>>>>> Xavier
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On Mon, May 25, 2015 at 2:27 AM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>> Looking at the code, we do in fact set the memalign_threshold = 
>>>>> eager_limit by default, but only if you configured with 
>>>>> --enable-btl-openib-malloc-alignment AND/OR we found the malloc hook 
>>>>> functions were available.
>>>>> 
>>>>> You might check config.log to see if the openib malloc hooks were 
>>>>> enabled. My guess is that they weren’t, for some reason.
>>>>> 
>>>>> 
>>>>>> On May 24, 2015, at 9:07 AM, Xavier Besseron <xavier.besse...@uni.lu> 
>>>>>> wrote:
>>>>>> 
>>>>>> Dear OpenMPI developers / users,
>>>>>> 
>>>>>> This is much more a comment than a question since I believe I have 
>>>>>> already solved my issue. But I would like to report it.
>>>>>> 
>>>>>> I have noticed my code performed very badly with OpenMPI when Infiniband 
>>>>>> is enabled, sometimes with +50% or even +100% overhead.
>>>>>> I also see this slowdown when running with one thread and one process. 
>>>>>> In that case, there are no MPI calls other than MPI_Init() and 
>>>>>> MPI_Finalize().
>>>>>> This overhead disappears if I disable the openib btl at runtime, i.e. with 
>>>>>> '--mca btl ^openib'.
>>>>>> After further investigation, I figured out it comes from the memory 
>>>>>> allocator which is aligning every memory allocation when Infiniband is 
>>>>>> used.
>>>>>> This makes sense because my code is a large irregular C++ code creating 
>>>>>> and deleting many objects.
>>>>>> 
>>>>>> Just below is the documentation of the relevant MCA parameters, as 
>>>>>> reported by ompi_info:
>>>>>> 
>>>>>> MCA btl: parameter "btl_openib_memalign" (current value: "32", data 
>>>>>> source: default, level: 9 dev/all, type: int)
>>>>>>        [64 | 32 | 0] - Enable (64bit or 32bit)/Disable(0) 
>>>>>> memoryalignment for all malloc calls if btl openib is used.
>>>>>> 
>>>>>> MCA btl: parameter "btl_openib_memalign_threshold" (current value: "0", 
>>>>>> data source: default, level: 9 dev/all, type: size_t)
>>>>>>        Allocating memory more than btl_openib_memalign_threshholdbytes 
>>>>>> will automatically be algined to the value of btl_openib_memalign 
>>>>>> bytes.memalign_threshhold defaults to the same value as 
>>>>>> mca_btl_openib_eager_limit.
>>>>>> 
>>>>>> MCA btl: parameter "btl_openib_eager_limit" (current value: "12288", 
>>>>>> data source: default, level: 4 tuner/basic, type: size_t)
>>>>>>        Maximum size (in bytes, including header) of "short" messages 
>>>>>> (must be >= 1).
>>>>>> 
>>>>>> 
>>>>>> In the end, the problem is that the default value for 
>>>>>> btl_openib_memalign_threshold is 0, which means that all memory 
>>>>>> allocations are aligned to 32 bits.
>>>>>> The documentation says that the default value of 
>>>>>> btl_openib_memalign_threshold should be the same as 
>>>>>> btl_openib_eager_limit, i.e. 12288 instead of 0.
>>>>>> 
>>>>>> On my side, changing btl_openib_memalign_threshold to 12288 fixes my 
>>>>>> performance issue.
>>>>>> However, I believe that the default value of 
>>>>>> btl_openib_memalign_threshold should be fixed in the OpenMPI code (or at 
>>>>>> least the documentation should be fixed).
>>>>>> 
>>>>>> I tried OpenMPI 1.8.5, 1.7.3 and 1.6.4 and it's all the same.
>>>>>> 
>>>>>> 
>>>>>> Bonus question:
>>>>>> As this issue might impact other users, I'm considering applying a 
>>>>>> global fix on our clusters by setting this default value in 
>>>>>> etc/openmpi-mca-params.conf.
>>>>>> Do you see any good reason not to do it?
>>>>>> 
>>>>>> Thank you for your comments.
>>>>>> 
>>>>>> Best regards,
>>>>>> 
>>>>>> Xavier
>>>>>> 
>>>>>> 
>>>>>> -- 
>>>>>> Dr Xavier BESSERON
>>>>>> Research associate
>>>>>> FSTC, University of Luxembourg
>>>>>> Campus Kirchberg, Office E-007
>>>>>> Phone: +352 46 66 44 5418
>>>>>> http://luxdem.uni.lu/
>>>>>> 
>>>>>> _______________________________________________
>>>>>> users mailing list
>>>>>> us...@open-mpi.org
>>>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>> Link to this post: 
>>>>>> http://www.open-mpi.org/community/lists/users/2015/05/26913.php
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> us...@open-mpi.org
>>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>> Link to this post: 
>>>>> http://www.open-mpi.org/community/lists/users/2015/05/26915.php
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> _______________________________________________
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>> Link to this post: 
>>>> http://www.open-mpi.org/community/lists/users/2015/05/26918.php
>>> 
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> Link to this post: 
>>> http://www.open-mpi.org/community/lists/users/2015/05/26920.php
>> 
>> 
>> -- 
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to: 
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>> 
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post: 
>> http://www.open-mpi.org/community/lists/users/2015/06/27086.php
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2015/06/27087.php


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
