Dear Husen,

Did you check the information in the file
./docs/chapters/01_FTB_on_Linux.txt inside the FTB tarball?
You might want to look at sub-section 4.1.

You can also try to get support on this via the MVAPICH2 mailing list.
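
Before digging into the FTB sources, it may help to check whether the
bootstrap address is visible to the shell that starts ftb_database_server.
The sketch below is only an illustration: it assumes that FTB reads the
address from an environment variable named FTB_BSTRAP_SERVER (please confirm
the exact name in the FTB documentation mentioned above), and the function
name check_bootstrap is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: reports whether a bootstrap server address is set
# and whether the local resolver knows it.
# ASSUMPTION: FTB takes the address from FTB_BSTRAP_SERVER; verify this
# name against the FTB documentation before relying on the result.

check_bootstrap() {
    server="${FTB_BSTRAP_SERVER:-}"

    # Case 1: the variable is missing or empty in this shell.
    if [ -z "$server" ]; then
        echo "unset"
        return 1
    fi

    # Case 2: the value is set; ask the system resolver about it.
    # getent is available on glibc-based Linux systems.
    if getent hosts "$server" >/dev/null 2>&1; then
        echo "ok"
        return 0
    fi

    # Case 3: the value is set but cannot be resolved from this host.
    echo "unresolvable"
    return 2
}
```

Running check_bootstrap in the same shell, just before starting
ftb_database_server, prints "unset", "unresolvable" or "ok". It does not
prove that FTB itself picks up the value; it only rules out the two simplest
causes.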


Best regards,

Xavier


On Fri, Mar 18, 2016 at 11:24 AM, Husen R <hus...@gmail.com> wrote:
> Dear all,
>
> Thanks for the reply and the valuable information.
>
> I have configured MVAPICH2 using the instructions available in a resource
> provided by Xavier.
> I have also installed FTB (Fault Tolerance Backplane) so that MVAPICH2
> has the process migration feature.
>
> However, I got the following error message when I tried to run
> ftb_database_server.
> ------------------------------------------------------------------------------------------------------------------------------------------------
> pro@head-node:/usr/local/sbin$ ftb_database_server &
> [2] 10678
> pro@head-node:/usr/local/sbin$
> [FTB_ERROR][/home/pro/ftb-0.6.2/src/manager_lib/network/network_sock/include/ftb_network_sock.h:
> line 205][hostname:head-node]Cannot find boot-strap server ip address
> ----------------------------------------------------------------------------------------------------------
> The error message is "Cannot find boot-strap server ip address".
> I configured the bootstrap IP address when I installed FTB.
>
> Does anyone have experience solving this problem when using FTB in Open MPI?
> I need help.
>
> Regards,
>
>
> Husen
>
>
> On Fri, Mar 18, 2016 at 12:06 AM, Xavier Besseron <xavier.besse...@uni.lu>
> wrote:
>>
>> On Thu, Mar 17, 2016 at 3:17 PM, Ralph Castain <r...@open-mpi.org> wrote:
>> > Just to clarify: I am not aware of any MPI that will allow you to
>> > relocate a process while it is running. You have to checkpoint the
>> > job, terminate it, and then restart the entire thing with the desired
>> > process on the new node.
>> >
>>
>>
>> Dear all,
>>
>> For your information, MVAPICH2 supports live migration of MPI
>> processes, without the need to terminate and restart the whole job.
>>
>> All the details are in the MVAPICH2 user guide:
>>   - How to configure MVAPICH2 for migration
>>
>> http://mvapich.cse.ohio-state.edu/static/media/mvapich/mvapich2-2.2b-userguide.html#x1-120004.4
>>   - How to trigger process migration
>>
>> http://mvapich.cse.ohio-state.edu/static/media/mvapich/mvapich2-2.2b-userguide.html#x1-760006.14.3
>>
>> You can also check the paper "High Performance Pipelined Process
>> Migration with RDMA"
>>
>> http://mvapich.cse.ohio-state.edu/static/media/publications/abstract/ouyangx-2011-ccgrid.pdf
>>
>>
>> Best regards,
>>
>> Xavier
>>
>>
>>
>> >
>> > On Mar 16, 2016, at 3:15 AM, Husen R <hus...@gmail.com> wrote:
>> >
>> > In the case of an MPI application (not GROMACS), how do I relocate the
>> > application from one node to another while it is running?
>> > I'm sorry, but as far as I know the ompi-restart command is used to
>> > restart an application from a checkpoint file once the application has
>> > already terminated (is no longer running).
>> >
>> > Thanks
>> >
>> > regards,
>> >
>> > Husen
>> >
>> > On Wed, Mar 16, 2016 at 4:29 PM, Jeff Hammond <jeff.scie...@gmail.com>
>> > wrote:
>> >>
>> >> Just checkpoint-restart the app to relocate. The overhead will be lower
>> >> than trying to do it with MPI.
>> >>
>> >> Jeff
>> >>
>> >>
>> >> On Wednesday, March 16, 2016, Husen R <hus...@gmail.com> wrote:
>> >>>
>> >>> Hi Jeff,
>> >>>
>> >>> Thanks for the reply.
>> >>>
>> >>> After consulting the Gromacs docs, as you suggested, I found that
>> >>> Gromacs already supports checkpoint/restart. Thanks for the suggestion.
>> >>>
>> >>> Previously, I asked about checkpoint/restart in Open MPI because I
>> >>> want to checkpoint an MPI application and restart/migrate it while it
>> >>> is running.
>> >>> For example, I run an MPI application on nodes A, B and C in a cluster
>> >>> and I want to migrate the process running on node A to another node,
>> >>> let's say node C.
>> >>> Is there a way to do this with Open MPI? Thanks.
>> >>>
>> >>> Regards,
>> >>>
>> >>> Husen
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> On Wed, Mar 16, 2016 at 12:37 PM, Jeff Hammond
>> >>> <jeff.scie...@gmail.com>
>> >>> wrote:
>> >>>>
>> >>>> Why do you need OpenMPI to do this? Molecular dynamics trajectories
>> >>>> are trivial to checkpoint and restart at the application level. I'm
>> >>>> sure Gromacs already supports this. Please consult the Gromacs docs
>> >>>> or user support for details.
>> >>>>
>> >>>> Jeff
>> >>>>
>> >>>>
>> >>>> On Tuesday, March 15, 2016, Husen R <hus...@gmail.com> wrote:
>> >>>>>
>> >>>>> Dear Open MPI Users,
>> >>>>>
>> >>>>>
>> >>>>> Does the current stable release of Open MPI (v1.10 series) support
>> >>>>> fault tolerance features?
>> >>>>> I learned from the Open MPI FAQ that checkpoint/restart
>> >>>>> support was last released as part of the v1.6 series.
>> >>>>> I just want to make sure about this.
>> >>>>>
>> >>>>> Also, is Open MPI able to checkpoint or restart an MPI
>> >>>>> application/GROMACS automatically?
>> >>>>> Please, I really need help.
>> >>>>>
>> >>>>> Regards,
>> >>>>>
>> >>>>>
>> >>>>> Husen
>> >>>>
>> >>>>
>> >>>>
>> >>>> --
>> >>>> Jeff Hammond
>> >>>> jeff.scie...@gmail.com
>> >>>> http://jeffhammond.github.io/
>> >>>>
>> >>>> _______________________________________________
>> >>>> users mailing list
>> >>>> us...@open-mpi.org
>> >>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> >>>> Link to this post:
>> >>>> http://www.open-mpi.org/community/lists/users/2016/03/28705.php
>> >>>
>> >>>
>> >>
>> >>
>> >
>> >
>
>
>
