Dear Xavier,

Yes, I did. I followed the instructions in that file, especially sub-section 4.1.

I configured the boot-strap IP using the ./configure options. On the front-end node, the boot-strap IP is the node's own IP address, because I want it to act as the ftb_database_server. On every compute node, the boot-strap IP is the front-end's IP address. Finally, I use the default values for the boot-strap port and the agent port.

I asked the MVAPICH team about this issue, along with the process migration issue. They said the feature appears to be broken and that they will look into it at low priority because of other ongoing activities in the project.

Thank you.

Regards,

Husen

On Sun, Mar 20, 2016 at 3:04 AM, Xavier Besseron <xavier.besse...@uni.lu> wrote:
> Dear Husen,
>
> Did you check the information in file
> ./docs/chapters/01_FTB_on_Linux.txt inside the ftb tarball?
> You might want to look at sub-section 4.1.
>
> You can also try to get support on this via the MVAPICH2 mailing list.
>
>
> Best regards,
>
> Xavier
>
>
> On Fri, Mar 18, 2016 at 11:24 AM, Husen R <hus...@gmail.com> wrote:
> > Dear all,
> >
> > Thanks for the reply and valuable informations.
> >
> > I have configured MVAPICH2 using the instructions available in a resource
> > provided by Xavier.
> > I also have installed FTB (Fault-Tolerant Backplane) in order for MVAPICH2
> > to have process migration feature.
> >
> > however, I got the following error message when I tried to run
> > ftb_database_server.
> >
> > ----------------------------------------------------------------------
> > pro@head-node:/usr/local/sbin$ ftb_database_server &
> > [2] 10678
> > pro@head-node:/usr/local/sbin$
> > [FTB_ERROR][/home/pro/ftb-0.6.2/src/manager_lib/network/network_sock/include/ftb_network_sock.h: line 205][hostname:head-node]Cannot find boot-strap server ip address
> > ----------------------------------------------------------------------
> > Error message : "cannot find boot-strap server ip address".
> > I have configured bootstrap ip address when I install FTB.
> >
> > does anyone have experience solving this problem when using FTB in Open MPI?
> > I need help.
> >
> > Regards,
> >
> >
> > Husen
> >
> >
> > On Fri, Mar 18, 2016 at 12:06 AM, Xavier Besseron <xavier.besse...@uni.lu>
> > wrote:
> >>
> >> On Thu, Mar 17, 2016 at 3:17 PM, Ralph Castain <r...@open-mpi.org> wrote:
> >> > Just to clarify: I am not aware of any MPI that will allow you to
> >> > relocate a process while it is running. You have to checkpoint the job,
> >> > terminate it, and then restart the entire thing with the desired
> >> > process on the new node.
> >> >
> >>
> >>
> >> Dear all,
> >>
> >> For your information, MVAPICH2 supports live migration of MPI
> >> processes, without the need to terminate and restart the whole job.
> >>
> >> All the details are in the MVAPICH2 user guide:
> >> - How to configure MVAPICH2 for migration
> >>   http://mvapich.cse.ohio-state.edu/static/media/mvapich/mvapich2-2.2b-userguide.html#x1-120004.4
> >> - How to trigger process migration
> >>   http://mvapich.cse.ohio-state.edu/static/media/mvapich/mvapich2-2.2b-userguide.html#x1-760006.14.3
> >>
> >> You can also check the paper "High Performance Pipelined Process
> >> Migration with RDMA"
> >> http://mvapich.cse.ohio-state.edu/static/media/publications/abstract/ouyangx-2011-ccgrid.pdf
> >>
> >>
> >> Best regards,
> >>
> >> Xavier
> >>
> >>
> >>
> >> >
> >> > On Mar 16, 2016, at 3:15 AM, Husen R <hus...@gmail.com> wrote:
> >> >
> >> > In the case of MPI application (not gromacs), How do I relocate MPI
> >> > application from one node to another node while it is running ?
> >> > I'm sorry, as far as I know the ompi-restart command is used to restart
> >> > application, based on checkpoint file, once the application already
> >> > terminated (no longer running).
> >> >
> >> > Thanks
> >> >
> >> > regards,
> >> >
> >> > Husen
> >> >
> >> > On Wed, Mar 16, 2016 at 4:29 PM, Jeff Hammond <jeff.scie...@gmail.com>
> >> > wrote:
> >> >>
> >> >> Just checkpoint-restart the app to relocate. The overhead will be lower
> >> >> than trying to do with MPI.
> >> >>
> >> >> Jeff
> >> >>
> >> >>
> >> >> On Wednesday, March 16, 2016, Husen R <hus...@gmail.com> wrote:
> >> >>>
> >> >>> Hi Jeff,
> >> >>>
> >> >>> Thanks for the reply.
> >> >>>
> >> >>> After consulting the Gromacs docs, as you suggested, Gromacs already
> >> >>> supports checkpoint/restart. thanks for the suggestion.
> >> >>>
> >> >>> Previously, I asked about checkpoint/restart in Open MPI because I want
> >> >>> to checkpoint MPI Application and restart/migrate it while it is running.
> >> >>> For the example, I run MPI application in node A,B and C in a cluster and
> >> >>> I want to migrate process running in node A to other node, let's say to
> >> >>> node C.
> >> >>> is there a way to do this with open MPI ? thanks.
> >> >>>
> >> >>> Regards,
> >> >>>
> >> >>> Husen
> >> >>>
> >> >>>
> >> >>>
> >> >>>
> >> >>> On Wed, Mar 16, 2016 at 12:37 PM, Jeff Hammond <jeff.scie...@gmail.com>
> >> >>> wrote:
> >> >>>>
> >> >>>> Why do you need OpenMPI to do this? Molecular dynamics trajectories are
> >> >>>> trivial to checkpoint and restart at the application level. I'm sure
> >> >>>> Gromacs already supports this. Please consult the Gromacs docs or user
> >> >>>> support for details.
> >> >>>>
> >> >>>> Jeff
> >> >>>>
> >> >>>>
> >> >>>> On Tuesday, March 15, 2016, Husen R <hus...@gmail.com> wrote:
> >> >>>>>
> >> >>>>> Dear Open MPI Users,
> >> >>>>>
> >> >>>>>
> >> >>>>> Does the current stable release of Open MPI (v1.10 series) support
> >> >>>>> fault tolerant feature ?
> >> >>>>> I got the information from Open MPI FAQ that The checkpoint/restart
> >> >>>>> support was last released as part of the v1.6 series.
> >> >>>>> I just want to make sure about this.
> >> >>>>>
> >> >>>>> and by the way, does Open MPI able to checkpoint or restart mpi
> >> >>>>> application/GROMACS automatically ?
> >> >>>>> Please, I really need help.
> >> >>>>>
> >> >>>>> Regards,
> >> >>>>>
> >> >>>>>
> >> >>>>> Husen
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> --
> >> >>>> Jeff Hammond
> >> >>>> jeff.scie...@gmail.com
> >> >>>> http://jeffhammond.github.io/
> >> >>>>
> >> >>>> _______________________________________________
> >> >>>> users mailing list
> >> >>>> us...@open-mpi.org
> >> >>>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> >> >>>> Link to this post:
> >> >>>> http://www.open-mpi.org/community/lists/users/2016/03/28705.php
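
P.S. The boot-strap layout described at the top of this message (the front-end node uses its own IP address as the boot-strap IP, and every compute node uses the front-end's IP address) can be sketched as below. This is only an illustration of the address assignment; it does not call FTB or its ./configure script. The function name, the node names, and the address 192.168.1.10 are hypothetical placeholders, not values from the actual cluster.

```python
import ipaddress


def bootstrap_plan(front_end_ip, compute_nodes):
    """Map each node name to the boot-strap IP it should be configured with.

    Hypothetical helper for illustration only; it is not part of FTB.
    """
    # Validate the address; ip_address() raises ValueError on bad input.
    server = str(ipaddress.ip_address(front_end_ip))

    # The front-end runs the database server, so it points at itself.
    plan = {"front-end": server}

    # Every compute node points at the front-end's address.
    for name in compute_nodes:
        plan[name] = server
    return plan


if __name__ == "__main__":
    # Placeholder address and node names, assumed for the example.
    plan = bootstrap_plan("192.168.1.10", ["compute-1", "compute-2"])
    for node, ip in plan.items():
        print(f"{node}: boot-strap IP = {ip}")
```

In this layout every node ends up with the same boot-strap IP, so a mismatch between nodes would be one thing to rule out when the server reports that it cannot find the boot-strap address.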