[OMPI users] Fwd: Default value of btl_openib_memalign_threshold
Dear OpenMPI developers / users,

This is more a comment than a question, since I believe I have already solved my issue, but I would like to report it.

I have noticed that my code performed very badly with OpenMPI when InfiniBand is enabled, sometimes with +50% or even +100% overhead. I also see this slowdown when running with one thread and one process; in that case, the only MPI calls are MPI_Init() and MPI_Finalize(). The overhead disappears if I disable the openib BTL at runtime, i.e. with '--mca btl ^openib'. After further investigation, I figured out that it comes from the memory allocator, which aligns every memory allocation when InfiniBand is used. This makes sense because my code is a large, irregular C++ code that creates and deletes many objects.

Just below is the documentation of the relevant MCA parameters, coming from ompi_info:

MCA btl: parameter "btl_openib_memalign" (current value: "32", data source: default, level: 9 dev/all, type: int)
         [64 | 32 | 0] - Enable (64bit or 32bit)/Disable(0) memory alignment for all malloc calls if btl openib is used.

MCA btl: parameter "btl_openib_memalign_threshold" (current value: "0", data source: default, level: 9 dev/all, type: size_t)
         Allocating memory more than btl_openib_memalign_threshold bytes will automatically be aligned to the value of btl_openib_memalign bytes. memalign_threshold defaults to the same value as mca_btl_openib_eager_limit.

MCA btl: parameter "btl_openib_eager_limit" (current value: "12288", data source: default, level: 4 tuner/basic, type: size_t)
         Maximum size (in bytes, including header) of "short" messages (must be >= 1).

In the end, the problem is that the default value of btl_openib_memalign_threshold is 0, which means that *all* memory allocations are aligned to 32 bits. The documentation says that the default value of btl_openib_memalign_threshold should be the same as btl_openib_eager_limit, i.e. 12288 instead of 0.
On my side, changing btl_openib_memalign_threshold to 12288 fixes my performance issue. However, I believe that the default value of btl_openib_memalign_threshold should be fixed in the OpenMPI code (or, at least, the documentation should be fixed). I tried OpenMPI 1.8.5, 1.7.3 and 1.6.4, and the behavior is the same in all of them.

Bonus question: as this issue might impact other users, I'm considering applying a global fix on our clusters by setting this default value in etc/openmpi-mca-params.conf. Do you see any good reason not to do it?

Thank you for your comments.

Best regards,

Xavier

--
Dr Xavier BESSERON
Research associate
FSTC, University of Luxembourg
Campus Kirchberg, Office E-007
Phone: +352 46 66 44 5418
http://luxdem.uni.lu/
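For illustration, the cluster-wide fix discussed above would be a one-line entry in the system-wide MCA parameter file (the exact path depends on the installation prefix, written here as a placeholder):

```ini
# <prefix>/etc/openmpi-mca-params.conf -- system-wide MCA parameter defaults
# Align only allocations of at least the eager limit (12288 bytes),
# instead of aligning every single malloc call:
btl_openib_memalign_threshold = 12288
```

The same value can also be set for a single run with `mpirun --mca btl_openib_memalign_threshold 12288 ...`, or through the environment variable `OMPI_MCA_btl_openib_memalign_threshold`; all three are standard Open MPI mechanisms for setting MCA parameters.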
Re: [OMPI users] Default value of btl_openib_memalign_threshold
Hi,

Thanks for your reply, Ralph.

The only option I'm using when configuring OpenMPI is '--prefix'. When checking the config.log file, I see

configure:208504: checking whether the openib BTL will use malloc hooks
configure:208510: result: yes

so I guess it is properly enabled (full config.log in attachment of this email).

However, I think I have found the cause of the bug (line numbers refer to the source code of OpenMPI 1.8.5):

The default value of memalign_threshold is taken from eager_limit in the function btl_openib_register_mca_params() in btl_openib_mca.c, line 717. But the default value of eager_limit is set in btl_openib_component.c at line 193, right after the call to btl_openib_register_mca_params(). To summarize, memalign_threshold takes its value from eager_limit before eager_limit gets its value assigned.

Best regards,

Xavier

On Mon, May 25, 2015 at 2:27 AM, Ralph Castain wrote:
> Looking at the code, we do in fact set the memalign_threshold =
> eager_limit by default, but only if you configured with
> --enable-btl-openib-malloc-alignment AND/OR we found the malloc hook
> functions were available.
>
> You might check config.log to see if the openib malloc hooks were
> enabled. My guess is that they weren't, for some reason.

___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: http://www.open-mpi.org/community/lists/users/2015/05/26913.php
Re: [OMPI users] Default value of btl_openib_memalign_threshold
Good that it will be fixed in the next release!

In the meantime, and because it might impact other users, I would like to ask my sysadmins to set btl_openib_memalign_threshold=12288 in etc/openmpi-mca-params.conf on our clusters. Do you see any good reason not to do it?

Thanks!

Xavier

On Mon, May 25, 2015 at 4:12 PM, Ralph Castain wrote:
> I found the problem - someone had a typo in btl_openib_mca.c. The
> threshold needs to be set to the module eager limit, as that is the only
> thing defined at that point.
>
> Thanks for bringing it to our attention! I'll set it up to go into 1.8.6
Re: [OMPI users] Fault tolerant feature in Open MPI
On Thu, Mar 17, 2016 at 3:17 PM, Ralph Castain wrote:
> Just to clarify: I am not aware of any MPI that will allow you to relocate a
> process while it is running. You have to checkpoint the job, terminate it,
> and then restart the entire thing with the desired process on the new node.

Dear all,

For your information, MVAPICH2 supports live migration of MPI processes, without the need to terminate and restart the whole job.

All the details are in the MVAPICH2 user guide:
- How to configure MVAPICH2 for migration
  http://mvapich.cse.ohio-state.edu/static/media/mvapich/mvapich2-2.2b-userguide.html#x1-120004.4
- How to trigger process migration
  http://mvapich.cse.ohio-state.edu/static/media/mvapich/mvapich2-2.2b-userguide.html#x1-760006.14.3

You can also check the paper "High Performance Pipelined Process Migration with RDMA":
http://mvapich.cse.ohio-state.edu/static/media/publications/abstract/ouyangx-2011-ccgrid.pdf

Best regards,

Xavier

> On Mar 16, 2016, at 3:15 AM, Husen R wrote:
>
> In the case of an MPI application (not Gromacs), how do I relocate the MPI
> application from one node to another node while it is running?
> I'm sorry, but as far as I know the ompi-restart command is used to restart an
> application, based on a checkpoint file, once the application has already
> terminated (is no longer running).
>
> Thanks
>
> Regards,
>
> Husen
>
> On Wed, Mar 16, 2016 at 4:29 PM, Jeff Hammond wrote:
>>
>> Just checkpoint-restart the app to relocate. The overhead will be lower
>> than trying to do it with MPI.
>>
>> Jeff
>>
>> On Wednesday, March 16, 2016, Husen R wrote:
>>>
>>> Hi Jeff,
>>>
>>> Thanks for the reply.
>>>
>>> After consulting the Gromacs docs, as you suggested, I found that Gromacs
>>> already supports checkpoint/restart. Thanks for the suggestion.
>>>
>>> Previously, I asked about checkpoint/restart in Open MPI because I want
>>> to checkpoint an MPI application and restart/migrate it while it is running.
>>> For example, I run an MPI application on nodes A, B and C in a cluster, and
>>> I want to migrate the process running on node A to another node, let's say
>>> node C.
>>> Is there a way to do this with Open MPI? Thanks.
>>>
>>> Regards,
>>>
>>> Husen
>>>
>>> On Wed, Mar 16, 2016 at 12:37 PM, Jeff Hammond wrote:
>>>>
>>>> Why do you need OpenMPI to do this? Molecular dynamics trajectories are
>>>> trivial to checkpoint and restart at the application level. I'm sure
>>>> Gromacs already supports this. Please consult the Gromacs docs or user
>>>> support for details.
>>>>
>>>> Jeff
>>>>
>>>> On Tuesday, March 15, 2016, Husen R wrote:
>>>>>
>>>>> Dear Open MPI Users,
>>>>>
>>>>> Does the current stable release of Open MPI (v1.10 series) support
>>>>> fault tolerance features?
>>>>> I got the information from the Open MPI FAQ that checkpoint/restart
>>>>> support was last released as part of the v1.6 series.
>>>>> I just want to make sure about this.
>>>>>
>>>>> By the way, is Open MPI able to checkpoint or restart an MPI
>>>>> application/GROMACS automatically?
>>>>> Please, I really need help.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Husen
Re: [OMPI users] Fault tolerant feature in Open MPI
Dear Husen,

Did you check the information in the file ./docs/chapters/01_FTB_on_Linux.txt inside the FTB tarball? You might want to look at sub-section 4.1.

You can also try to get support on this via the MVAPICH2 mailing list.

Best regards,

Xavier

On Fri, Mar 18, 2016 at 11:24 AM, Husen R wrote:
> Dear all,
>
> Thanks for the reply and the valuable information.
>
> I have configured MVAPICH2 using the instructions available in the resource
> provided by Xavier.
> I have also installed FTB (Fault-Tolerant Backplane) in order for MVAPICH2
> to have the process migration feature.
>
> However, I got the following error message when I tried to run
> ftb_database_server:
>
> pro@head-node:/usr/local/sbin$ ftb_database_server &
> [2] 10678
> pro@head-node:/usr/local/sbin$
> [FTB_ERROR][/home/pro/ftb-0.6.2/src/manager_lib/network/network_sock/include/ftb_network_sock.h:
> line 205][hostname:head-node]Cannot find boot-strap server ip address
>
> The error message is "cannot find boot-strap server ip address", but I did
> configure the bootstrap IP address when I installed FTB.
>
> Does anyone have experience solving this problem when using FTB with Open MPI?
> I need help.
>
> Regards,
>
> Husen
Re: [OMPI users] Fault tolerant feature in Open MPI
Hi Husen,

Sorry for this late reply.

I gave a quick try at FTB and I managed to get it to work on my local machine. I just had to apply this patch to prevent the agent from crashing. Maybe this was your issue:
https://github.com/besserox/ftb/commit/01aa44f5ed34e35429ddf99084395e4e8ba67b7c

Here is a (very) quick tutorial:

# Compile FTB (after applying the patch)
./configure --enable-debug --prefix="${FTB_INSTALL_PATH}"
make
make install

# Start the server
export FTB_BSTRAP_SERVER=127.0.0.1
"${FTB_INSTALL_PATH}/sbin/ftb_database_server"

# Start the agent
export FTB_BSTRAP_SERVER=127.0.0.1
"${FTB_INSTALL_PATH}/sbin/ftb_agent"

# First check that the server and the agent are running
ps aux | grep 'ftb_'
# You should see the 2 processes running

# Compile the examples
cd components
./autogen.sh
./configure --with-ftb="${FTB_INSTALL_PATH}"
make

# Start the subscriber example
export FTB_BSTRAP_SERVER=127.0.0.1
export LD_LIBRARY_PATH="${FTB_INSTALL_PATH}/lib:${LD_LIBRARY_PATH}"
./examples/ftb_simple_subscriber

# Start the publisher example
export FTB_BSTRAP_SERVER=127.0.0.1
export LD_LIBRARY_PATH="${FTB_INSTALL_PATH}/lib:${LD_LIBRARY_PATH}"
./examples/ftb_simple_publisher

The subscriber should output something like:

Caught event: event_space: FTB.FTB_EXAMPLES.SIMPLE, severity: INFO, event_name: SIMPLE_EVENT from host: 10.91.2.156 and pid: 9654

I hope this will help you. Unfortunately, FTB (and the CIFTS project) has been discontinued for quite some time now, so it will be difficult to get real help on this.

Best regards,

Xavier

On Mon, Mar 21, 2016 at 3:52 AM, Husen R wrote:
> Dear Xavier,
>
> Yes, I did. I followed the instructions available in that file, especially
> sub-section 4.1.
>
> I configured the boot-strap IP using the ./configure options.
> On the front-end node, the boot-strap IP is its own IP address, because I want
> to make it the ftb_database_server.
> On every compute node, the boot-strap IP is the front-end's IP address.
> Finally, I use the default values for the boot-strap port and the agent port.
>
> I asked the MVAPICH maintainers about this issue, along with the process
> migration issue, and they said it looks like the feature is broken and that
> they will take a look at it at a low priority, due to other ongoing activities
> in the project.
> Thank you.
>
> Regards,
>
> Husen
[OMPI users] Invalid results with OpenMPI on Ubuntu Artful because of --enable-heterogeneous
Dear all,

I want to share with you the following issue with the OpenMPI shipped with the latest Ubuntu Artful. It is OpenMPI 2.1.1 compiled with the option --enable-heterogeneous.

Looking at issue https://github.com/open-mpi/ompi/issues/171, it appears that this option is broken and should not be used. This option has been used in Debian/Ubuntu since 2010 (http://changelogs.ubuntu.com/changelogs/pool/universe/o/openmpi/openmpi_2.1.1-6/changelog) and is still used today. Apparently, nobody has complained so far.

However, now I complain :-)
I've found a simple example for which this option causes invalid results in OpenMPI:

int A = 666, B = 42;
MPI_Irecv(&A, 1, MPI_INT, MPI_ANY_SOURCE, tag, comm, &req);
MPI_Send(&B, 1, MPI_INT, my_rank, tag, comm);
MPI_Wait(&req, &status);
/* After that, when compiled with --enable-heterogeneous, we have A != B */

This happens with just a single process. The full example is in attachment (to be run with "mpirun -n 1 ./bug_openmpi_artful"). I extracted and simplified the code from the Zoltan library, with which I initially noticed the issue.

I find it annoying that Ubuntu distributes a broken OpenMPI. I've also tested OpenMPI 2.1.1, 2.1.2 and 3.0.0, and using --enable-heterogeneous causes the bug systematically.

Finally, my points/questions are:

- To share with you this small example, in case you want to debug it.

- What is the status of issue https://github.com/open-mpi/ompi/issues/171? Is this option still considered broken? If yes, I encourage you to remove it or mark it as deprecated to avoid this kind of mistake in the future.

- To get the feedback of the OpenMPI developers on the use of this option, which might convince the Debian/Ubuntu maintainers to remove this flag. I have opened a bug on Ubuntu for it: https://bugs.launchpad.net/ubuntu/+source/openmpi/+bug/1731938

Thanks!

Xavier

--
Dr Xavier BESSERON
Research associate
FSTC, University of Luxembourg
Campus Belval, Office MNO E04 0415-040
Phone: +352 46 66 44 5418
http://luxdem.uni.lu/

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char* argv[])
{
    int rc;

    rc = MPI_Init(&argc, &argv);
    if (rc != MPI_SUCCESS) abort();

    int my_rank;
    rc = MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    if (rc != MPI_SUCCESS) abort();

    int A = 666;
    int B = 42;
    printf("[BEFORE] A = %d - B = %d\n", A, B);

    int tag = 2999;
    MPI_Comm comm = MPI_COMM_WORLD;
    MPI_Status status;
    MPI_Request req;

    rc = MPI_Irecv(&A, 1, MPI_INT, MPI_ANY_SOURCE, tag, comm, &req);
    if (rc != MPI_SUCCESS) abort();
    rc = MPI_Send(&B, 1, MPI_INT, my_rank, tag, comm);
    if (rc != MPI_SUCCESS) abort();
    rc = MPI_Wait(&req, &status);
    if (rc != MPI_SUCCESS) abort();

    printf("[AFTER] A = %d - B = %d\n", A, B);
    if (A != B) {
        printf("Error!!!\n");
    }

    rc = MPI_Finalize();
    if (rc != MPI_SUCCESS) abort();

    return 0;
}

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
Re: [OMPI users] Invalid results with OpenMPI on Ubuntu Artful because of --enable-heterogeneous
Thanks for looking at it!

Apparently, someone requested support for heterogeneous machines a long time ago:
https://bugs.launchpad.net/ubuntu/+source/openmpi/+bug/419074

Xavier

On Mon, Nov 13, 2017 at 7:56 PM, Gilles Gouaillardet wrote:
> Xavier,
>
> I confirm there is a bug when using MPI_ANY_SOURCE with Open MPI
> configure'd with --enable-heterogeneous.
>
> I made https://github.com/open-mpi/ompi/pull/4501 in order to fix
> that, and will merge and backport it once reviewed.
>
> Cheers,
>
> Gilles
>
> On Mon, Nov 13, 2017 at 8:46 AM, Gilles Gouaillardet wrote:
> > Xavier,
> >
> > Thanks for the report, I will have a look at it.
> >
> > Is the bug triggered by MPI_ANY_SOURCE?
> > /* e.g. does it work if you MPI_Irecv(..., myrank, ...)? */
> >
> > Unless Ubuntu wants out-of-the-box support between heterogeneous nodes
> > (for example x86_64 and ppc64), there is little to no point in
> > configuring Open MPI with the --enable-heterogeneous option.
> >
> > Cheers,
> >
> > Gilles