I don’t see a problem with it. FWIW: I’m getting ready to release 1.8.6 in the next week.
> On May 25, 2015, at 8:46 AM, Xavier Besseron <xavier.besse...@uni.lu> wrote:
>
> Good that it will be fixed in the next release!
>
> In the meantime, and because it might impact other users,
> I would like to ask my sysadmins to set btl_openib_memalign_threshold=12288
> in etc/openmpi-mca-params.conf on our clusters.
>
> Do you see any good reason not to do it?
>
> Thanks!
>
> Xavier
>
> On Mon, May 25, 2015 at 4:12 PM, Ralph Castain <r...@open-mpi.org> wrote:
> I found the problem - someone had a typo in btl_openib_mca.c. The threshold
> needs to be set to the module eager limit, as that is the only thing defined at
> that point.
>
> Thanks for bringing it to our attention! I’ll set it up to go into 1.8.6
>
>> On May 25, 2015, at 3:04 AM, Xavier Besseron <xavier.besse...@uni.lu> wrote:
>>
>> Hi,
>>
>> Thanks for your reply Ralph.
>>
>> The only option I'm using when configuring OpenMPI is '--prefix'.
>> When checking the config.log file, I see
>>
>> configure:208504: checking whether the openib BTL will use malloc hooks
>> configure:208510: result: yes
>>
>> so I guess it is properly enabled (full config.log in attachment of this
>> email).
>>
>> However, I think I have found the reason for the bug (lines refer to the
>> source code of OpenMPI 1.8.5):
>>
>> The default value of memalign_threshold is taken from eager_limit in
>> function btl_openib_register_mca_params() in btl_openib_mca.c line 717.
>> But the default value of eager_limit is set in btl_openib_component.c at
>> line 193, right after the call to btl_openib_register_mca_params().
>>
>> To summarize, memalign_threshold gets its value from eager_limit before
>> eager_limit has been assigned its own value.
>>
>> Best regards,
>>
>> Xavier
>>
>> On Mon, May 25, 2015 at 2:27 AM, Ralph Castain <r...@open-mpi.org> wrote:
>> Looking at the code, we do in fact set the memalign_threshold = eager_limit
>> by default, but only if you configured with
>> --enable-btl-openib-malloc-alignment AND/OR we found the malloc hook
>> functions were available.
>>
>> You might check config.log to see if the openib malloc hooks were enabled.
>> My guess is that they weren’t, for some reason.
>>
>>> On May 24, 2015, at 9:07 AM, Xavier Besseron <xavier.besse...@uni.lu> wrote:
>>>
>>> Dear OpenMPI developers / users,
>>>
>>> This is much more a comment than a question since I believe I have already
>>> solved my issue. But I would like to report it.
>>>
>>> I have noticed that my code performs very badly with OpenMPI when
>>> InfiniBand is enabled, sometimes with +50% or even +100% overhead.
>>> I also see this slowdown when running with one thread and one process. In
>>> that case, there is no other MPI call than MPI_Init() and MPI_Finalize().
>>> This overhead disappears if I disable the openib btl at runtime, i.e. with
>>> '--mca btl ^openib'.
>>> After further investigation, I figured out that it comes from the memory
>>> allocator, which aligns every memory allocation when InfiniBand is used.
>>> This makes sense because my code is a large irregular C++ code creating and
>>> deleting many objects.
>>>
>>> Just below is the documentation of the relevant MCA parameters from
>>> ompi_info:
>>>
>>> MCA btl: parameter "btl_openib_memalign" (current value: "32", data source:
>>> default, level: 9 dev/all, type: int)
>>>     [64 | 32 | 0] - Enable (64bit or 32bit)/Disable(0) memory alignment
>>>     for all malloc calls if btl openib is used.
>>>
>>> MCA btl: parameter "btl_openib_memalign_threshold" (current value: "0",
>>> data source: default, level: 9 dev/all, type: size_t)
>>>     Allocating memory more than btl_openib_memalign_threshhold bytes
>>>     will automatically be algined to the value of btl_openib_memalign
>>>     bytes. memalign_threshhold defaults to the same value as
>>>     mca_btl_openib_eager_limit.
>>>
>>> MCA btl: parameter "btl_openib_eager_limit" (current value: "12288", data
>>> source: default, level: 4 tuner/basic, type: size_t)
>>>     Maximum size (in bytes, including header) of "short" messages
>>>     (must be >= 1).
>>>
>>> In the end, the problem is that the default value for
>>> btl_openib_memalign_threshold is 0, which means that all memory allocations
>>> are aligned to 32 bits.
>>> The documentation says that the default value of
>>> btl_openib_memalign_threshold should be the same as
>>> btl_openib_eager_limit, i.e. 12288 instead of 0.
>>>
>>> On my side, changing btl_openib_memalign_threshold to 12288 fixes my
>>> performance issue.
>>> However, I believe that the default value of btl_openib_memalign_threshold
>>> should be fixed in the OpenMPI code (or at least the documentation should
>>> be fixed).
>>>
>>> I tried OpenMPI 1.8.5, 1.7.3 and 1.6.4 and it's all the same.
>>>
>>> Bonus question:
>>> As this issue might impact other users, I'm considering applying a global
>>> fix on our clusters by setting this default value in
>>> etc/openmpi-mca-params.conf.
>>> Do you see any good reason not to do it?
>>>
>>> Thank you for your comments.
>>>
>>> Best regards,
>>>
>>> Xavier
>>>
>>> --
>>> Dr Xavier BESSERON
>>> Research associate
>>> FSTC, University of Luxembourg
>>> Campus Kirchberg, Office E-007
>>> Phone: +352 46 66 44 5418
>>> http://luxdem.uni.lu/
>>>
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> Link to this post:
>>> http://www.open-mpi.org/community/lists/users/2015/05/26913.php
>>
>> --
>> Dr Xavier BESSERON
>> Research associate
>> FSTC, University of Luxembourg
>> Campus Kirchberg, Office E-007
>> Phone: +352 46 66 44 5418
>> http://luxdem.uni.lu/
>>
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2015/05/26915.php
>
> --
> Dr Xavier BESSERON
> Research associate
> FSTC, University of Luxembourg
> Campus Kirchberg, Office E-007
> Phone: +352 46 66 44 5418
> http://luxdem.uni.lu/
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2015/05/26918.php
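
For readers following the thread, the ordering problem described above (memalign_threshold copies eager_limit before eager_limit has received its default) can be illustrated with a small toy model. This is only a sketch in Python, not Open MPI code: OpenibModule, register_params, buggy_init, fixed_init and is_aligned are hypothetical names made up for the illustration, and fixed_init shows just one possible way to avoid the problem, not necessarily the change that went into 1.8.6. The value 12288 and the "more than the threshold" rule are taken from the ompi_info output quoted in the thread.

```python
from dataclasses import dataclass

# Default eager limit as reported by ompi_info in the thread.
DEFAULT_EAGER_LIMIT = 12288


@dataclass
class OpenibModule:
    """Toy stand-in for the openib BTL settings (hypothetical, not Open MPI code)."""
    eager_limit: int = 0         # stays 0 until a default is assigned
    memalign_threshold: int = 0


def register_params(module: OpenibModule) -> None:
    # Models the reported behavior: the threshold default is copied from
    # eager_limit at parameter registration time.
    module.memalign_threshold = module.eager_limit


def buggy_init() -> OpenibModule:
    module = OpenibModule()
    register_params(module)                    # copies eager_limit while it is still 0
    module.eager_limit = DEFAULT_EAGER_LIMIT   # default assigned too late
    return module


def fixed_init() -> OpenibModule:
    module = OpenibModule()
    module.eager_limit = DEFAULT_EAGER_LIMIT   # assign the source value first
    register_params(module)                    # threshold now picks up 12288
    return module


def is_aligned(size: int, threshold: int) -> bool:
    # Per the parameter help text: allocations of more than
    # `threshold` bytes are aligned.
    return size > threshold


if __name__ == "__main__":
    for name, init in (("buggy", buggy_init), ("fixed", fixed_init)):
        module = init()
        print(name, module.memalign_threshold,
              is_aligned(64, module.memalign_threshold))
```

In the buggy ordering the threshold stays at 0, so even a 64-byte allocation counts as "more than the threshold" and gets aligned, which matches the slowdown reported for code that creates and deletes many small objects. With the threshold at 12288, small allocations are left alone.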