It really is just that simple :-)

On Aug 22, 2012, at 8:56 AM, Brian Budge <brian.bu...@gmail.com> wrote:
> Okay. Is there a tutorial or FAQ for setting everything up? Or is it
> really just that simple? I don't need to run a copy of the orte
> server somewhere?
>
> If my current IP is 192.168.0.1:
>
> 0 > echo 192.168.0.11 > /tmp/hostfile
> 1 > echo 192.168.0.12 >> /tmp/hostfile
> 2 > export OMPI_MCA_orte_default_hostfile=/tmp/hostfile
> 3 > ./mySpawningExe
>
> At this point, mySpawningExe will be the master, running on 192.168.0.1,
> and I can have spawned, for example, childExe on 192.168.0.11 and
> 192.168.0.12? Or childExe1 on 192.168.0.11 and childExe2 on 192.168.0.12?
>
> Thanks for the help.
>
>   Brian
>
> On Wed, Aug 22, 2012 at 7:15 AM, Ralph Castain <r...@open-mpi.org> wrote:
>> Sure, that's still true on all 1.3 or above releases. All you need to do
>> is set the hostfile envar so we pick it up:
>>
>> OMPI_MCA_orte_default_hostfile=<foo>
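For reference, here is a minimal sketch (not code from the thread) of what a
singleton master along the lines of Brian's mySpawningExe might look like,
assuming the hostfile and environment variable above are already in place. The
child executable names and addresses come from Brian's example; pinning each
child to a node with the "host" Info key follows the mechanism Ralph describes
further down the thread; error handling is omitted.

    /* Run as a plain ./mySpawningExe, without mpirun; the runtime learns the
     * available nodes from OMPI_MCA_orte_default_hostfile. */
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        char     *cmds[2]     = { "./childExe1", "./childExe2" };
        int       maxprocs[2] = { 1, 1 };
        MPI_Info  infos[2];
        MPI_Comm  children;

        MPI_Init(&argc, &argv);

        /* One Info object per command, naming the node each child should land
         * on; both addresses must appear in the default hostfile. */
        MPI_Info_create(&infos[0]);
        MPI_Info_set(infos[0], "host", "192.168.0.11");
        MPI_Info_create(&infos[1]);
        MPI_Info_set(infos[1], "host", "192.168.0.12");

        MPI_Comm_spawn_multiple(2, cmds, MPI_ARGVS_NULL, maxprocs, infos, 0,
                                MPI_COMM_SELF, &children, MPI_ERRCODES_IGNORE);

        /* ... talk to the children over the intercommunicator ... */

        MPI_Info_free(&infos[0]);
        MPI_Info_free(&infos[1]);
        MPI_Comm_disconnect(&children);
        MPI_Finalize();
        return 0;
    }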
>>
>> On Aug 21, 2012, at 7:23 PM, Brian Budge <brian.bu...@gmail.com> wrote:
>>
>>> Hi. I know this is an old thread, but I'm curious if there are any
>>> tutorials describing how to set this up? Is this still available on
>>> newer Open MPI versions?
>>>
>>> Thanks,
>>>   Brian
>>>
>>> On Fri, Jan 4, 2008 at 7:57 AM, Ralph Castain <r...@lanl.gov> wrote:
>>>> Hi Elena
>>>>
>>>> I'm copying this to the user list just to correct a mis-statement on my part
>>>> in an earlier message that went there. I had stated that a singleton could
>>>> comm_spawn onto other nodes listed in a hostfile by setting an environmental
>>>> variable that pointed us to the hostfile.
>>>>
>>>> This is incorrect in the 1.2 code series. That series does not allow
>>>> singletons to read a hostfile at all. Hence, any comm_spawn done by a
>>>> singleton can only launch child processes on the singleton's local host.
>>>>
>>>> This situation has been corrected for the upcoming 1.3 code series. For the
>>>> 1.2 series, though, you will have to do it via an mpirun command line.
>>>>
>>>> Sorry for the confusion - I sometimes have too many code families to keep
>>>> straight in this old mind!
>>>>
>>>> Ralph
>>>>
>>>> On 1/4/08 5:10 AM, "Elena Zhebel" <ezhe...@fugro-jason.com> wrote:
>>>>
>>>>> Hello Ralph,
>>>>>
>>>>> Thank you very much for the explanations. But I still do not get it running...
>>>>>
>>>>> For the case
>>>>> mpirun -n 1 -hostfile my_hostfile -host my_master_host my_master.exe
>>>>> everything works.
>>>>>
>>>>> For the case
>>>>> ./my_master.exe
>>>>> it does not.
>>>>>
>>>>> I did:
>>>>> - create my_hostfile and put it in $HOME/.openmpi/components/
>>>>> my_hostfile:
>>>>> bollenstreek slots=2 max_slots=3
>>>>> octocore01 slots=8 max_slots=8
>>>>> octocore02 slots=8 max_slots=8
>>>>> clstr000 slots=2 max_slots=3
>>>>> clstr001 slots=2 max_slots=3
>>>>> clstr002 slots=2 max_slots=3
>>>>> clstr003 slots=2 max_slots=3
>>>>> clstr004 slots=2 max_slots=3
>>>>> clstr005 slots=2 max_slots=3
>>>>> clstr006 slots=2 max_slots=3
>>>>> clstr007 slots=2 max_slots=3
>>>>> - setenv OMPI_MCA_rds_hostfile_path my_hostfile (I put it in .tcshrc and
>>>>> then source .tcshrc)
>>>>> - in my_master.cpp I did:
>>>>> MPI_Info info1;
>>>>> MPI_Info_create(&info1);
>>>>> char* hostname =
>>>>> "clstr002,clstr003,clstr005,clstr006,clstr007,octocore01,octocore02";
>>>>> MPI_Info_set(info1, "host", hostname);
>>>>>
>>>>> _intercomm = intracomm.Spawn("./childexe", argv1, _nProc, info1, 0,
>>>>> MPI_ERRCODES_IGNORE);
>>>>>
>>>>> - After I run the executable, I get this error message:
>>>>>
>>>>> bollenstreek: > ./my_master
>>>>> number of processes to run: 1
>>>>> --------------------------------------------------------------------------
>>>>> Some of the requested hosts are not included in the current allocation for
>>>>> the application:
>>>>> ./childexe
>>>>> The requested hosts were:
>>>>> clstr002,clstr003,clstr005,clstr006,clstr007,octocore01,octocore02
>>>>>
>>>>> Verify that you have mapped the allocated resources properly using the
>>>>> --host specification.
>>>>> --------------------------------------------------------------------------
>>>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>>>> base/rmaps_base_support_fns.c at line 225
>>>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>>>> rmaps_rr.c at line 478
>>>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>>>> base/rmaps_base_map_job.c at line 210
>>>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>>>> rmgr_urm.c at line 372
>>>>> [bollenstreek:21443] [0,0,0] ORTE_ERROR_LOG: Out of resource in file
>>>>> communicator/comm_dyn.c at line 608
>>>>>
>>>>> Did I miss something?
>>>>> Thanks for the help!
>>>>>
>>>>> Elena
>>>>>
>>>>> -----Original Message-----
>>>>> From: Ralph H Castain [mailto:r...@lanl.gov]
>>>>> Sent: Tuesday, December 18, 2007 3:50 PM
>>>>> To: Elena Zhebel; Open MPI Users <us...@open-mpi.org>
>>>>> Cc: Ralph H Castain
>>>>> Subject: Re: [OMPI users] MPI::Intracomm::Spawn and cluster configuration
>>>>>
>>>>> On 12/18/07 7:35 AM, "Elena Zhebel" <ezhe...@fugro-jason.com> wrote:
>>>>>
>>>>>> Thanks a lot! Now it works!
>>>>>> The solution is to use mpirun -n 1 -hostfile my.hosts *.exe and pass the
>>>>>> MPI_Info key to the Spawn function!
>>>>>>
>>>>>> One more question: is it necessary to start my "master" program with
>>>>>> mpirun -n 1 -hostfile my_hostfile -host my_master_host my_master.exe ?
>>>>>
>>>>> No, it isn't necessary - assuming that my_master_host is the first host
>>>>> listed in your hostfile! If you are only executing one my_master.exe (i.e.,
>>>>> you gave -n 1 to mpirun), then we will automatically map that process onto
>>>>> the first host in your hostfile.
>>>>>
>>>>> If you want my_master.exe to go on someone other than the first host in the
>>>>> file, then you have to give us the -host option.
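In command-line terms, with the hostfile shown earlier (bollenstreek is its
first entry), the two cases Ralph describes would look roughly like this; the
paths are illustrative:

    # master is mapped onto bollenstreek, the first host listed in my_hostfile
    mpirun -n 1 -hostfile my_hostfile ./my_master.exe

    # master is placed on octocore01 instead of the first host
    mpirun -n 1 -hostfile my_hostfile -host octocore01 ./my_master.exe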
>>>>>
>>>>>>
>>>>>> Are there other possibilities for an easy start?
>>>>>> I would say just run ./my_master.exe, but then the master process doesn't
>>>>>> know about the hosts available on the network.
>>>>>
>>>>> You can set the hostfile parameter in your environment instead of on the
>>>>> command line. Just set OMPI_MCA_rds_hostfile_path = my.hosts.
>>>>>
>>>>> You can then just run ./my_master.exe on the host where you want the master
>>>>> to reside - everything should work the same.
>>>>>
>>>>> Just as an FYI: the name of that environmental variable is going to change
>>>>> in the 1.3 release, but everything will still work the same.
>>>>>
>>>>> Hope that helps
>>>>> Ralph
>>>>>
>>>>>>
>>>>>> Thanks and regards,
>>>>>> Elena
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Ralph H Castain [mailto:r...@lanl.gov]
>>>>>> Sent: Monday, December 17, 2007 5:49 PM
>>>>>> To: Open MPI Users <us...@open-mpi.org>; Elena Zhebel
>>>>>> Cc: Ralph H Castain
>>>>>> Subject: Re: [OMPI users] MPI::Intracomm::Spawn and cluster configuration
>>>>>>
>>>>>> On 12/17/07 8:19 AM, "Elena Zhebel" <ezhe...@fugro-jason.com> wrote:
>>>>>>
>>>>>>> Hello Ralph,
>>>>>>>
>>>>>>> Thank you for your answer.
>>>>>>>
>>>>>>> I'm using OpenMPI 1.2.3, compiler glibc232, Linux SuSE 10.0.
>>>>>>> My "master" executable runs only on the one local host; then it spawns
>>>>>>> "slaves" (with MPI::Intracomm::Spawn).
>>>>>>> My question was: how to determine the hosts where these "slaves" will be
>>>>>>> spawned?
>>>>>>> You said: "You have to specify all of the hosts that can be used by your
>>>>>>> job in the original hostfile". How can I specify the hostfile? I cannot
>>>>>>> find it in the documentation.
>>>>>>
>>>>>> Hmmm...sorry about the lack of documentation. I always assumed that the MPI
>>>>>> folks in the project would document such things since it has little to do
>>>>>> with the underlying run-time, but I guess that fell through the cracks.
>>>>>>
>>>>>> There are two parts to your question:
>>>>>>
>>>>>> 1. How to specify the hosts to be used for the entire job. I believe that is
>>>>>> somewhat covered here:
>>>>>> http://www.open-mpi.org/faq/?category=running#simple-spmd-run
>>>>>>
>>>>>> That FAQ tells you what a hostfile should look like, though you may already
>>>>>> know that. Basically, we require that you list -all- of the nodes that both
>>>>>> your master and slave programs will use.
>>>>>>
>>>>>> 2. How to specify which nodes are available for the master, and which for
>>>>>> the slaves.
>>>>>>
>>>>>> You would specify the host for your master on the mpirun command line with
>>>>>> something like:
>>>>>>
>>>>>> mpirun -n 1 -hostfile my_hostfile -host my_master_host my_master.exe
>>>>>>
>>>>>> This directs Open MPI to map that specified executable onto the specified
>>>>>> host - note that my_master_host must have been in my_hostfile.
>>>>>>
>>>>>> Inside your master, you would create an MPI_Info key "host" that has a value
>>>>>> consisting of a string "host1,host2,host3" identifying the hosts you want
>>>>>> your slaves to execute upon. Those hosts must have been included in
>>>>>> my_hostfile. Include that key in the MPI_Info array passed to your Spawn.
>>>>>>
>>>>>> We don't currently support providing a hostfile for the slaves (as opposed
>>>>>> to the host-at-a-time string above). This may become available in a future
>>>>>> release - TBD.
>>>>>>
>>>>>> Hope that helps
>>>>>> Ralph
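In C, the recipe Ralph gives might look roughly like the sketch below. Elena's
code uses the C++ bindings (MPI::Intracomm::Spawn), but the "host" Info key is
the same either way; "./childexe" and "host1,host2,host3" are the placeholders
from this exchange, the listed hosts must also appear in my_hostfile, and error
handling is omitted.

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Info info;
        MPI_Comm slaves;

        MPI_Init(&argc, &argv);

        /* Hosts for the slaves; each name must also be listed in the hostfile
         * given to mpirun. */
        MPI_Info_create(&info);
        MPI_Info_set(info, "host", "host1,host2,host3");

        /* Spawn three slaves, mapped onto the listed hosts. */
        MPI_Comm_spawn("./childexe", MPI_ARGV_NULL, 3, info, 0,
                       MPI_COMM_WORLD, &slaves, MPI_ERRCODES_IGNORE);

        MPI_Info_free(&info);
        /* ... communicate with the slaves over the intercommunicator ... */
        MPI_Comm_disconnect(&slaves);
        MPI_Finalize();
        return 0;
    }

The master itself would then be started with the mpirun line quoted just above.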
>>>>>>>
>>>>>>> Thanks and regards,
>>>>>>> Elena
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
>>>>>>> Behalf Of Ralph H Castain
>>>>>>> Sent: Monday, December 17, 2007 3:31 PM
>>>>>>> To: Open MPI Users <us...@open-mpi.org>
>>>>>>> Cc: Ralph H Castain
>>>>>>> Subject: Re: [OMPI users] MPI::Intracomm::Spawn and cluster configuration
>>>>>>>
>>>>>>> On 12/12/07 5:46 AM, "Elena Zhebel" <ezhe...@fugro-jason.com> wrote:
>>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> I'm working on an MPI application where I'm using OpenMPI instead of
>>>>>>>> MPICH.
>>>>>>>>
>>>>>>>> In my "master" program I call the function MPI::Intracomm::Spawn, which
>>>>>>>> spawns "slave" processes. It is not clear to me how to spawn the "slave"
>>>>>>>> processes over the network. Currently the "master" creates the "slaves"
>>>>>>>> on the same host.
>>>>>>>>
>>>>>>>> If I use 'mpirun --hostfile openmpi.hosts' then processes are spawned
>>>>>>>> over the network as expected. But now I need to spawn processes over the
>>>>>>>> network from my own executable using MPI::Intracomm::Spawn; how can I
>>>>>>>> achieve it?
>>>>>>>
>>>>>>> I'm not sure from your description exactly what you are trying to do, nor
>>>>>>> what environment this is all operating in or what version of Open MPI you
>>>>>>> are using. Setting aside the environment and version issue, I'm guessing
>>>>>>> that you are running your executable over some specified set of hosts, but
>>>>>>> want to provide a different hostfile that specifies the hosts to be used
>>>>>>> for the "slave" processes. Correct?
>>>>>>>
>>>>>>> If that is correct, then I'm afraid you can't do that in any version of
>>>>>>> Open MPI today. You have to specify all of the hosts that can be used by
>>>>>>> your job in the original hostfile. You can then specify a subset of those
>>>>>>> hosts to be used by your original "master" program, and then specify a
>>>>>>> different subset to be used by the "slaves" when calling Spawn.
>>>>>>>
>>>>>>> But the system requires that you tell it -all- of the hosts that are going
>>>>>>> to be used at the beginning of the job.
>>>>>>>
>>>>>>> At the moment, there is no plan to remove that requirement, though there
>>>>>>> has been occasional discussion about doing so at some point in the future.
>>>>>>> No promises that it will happen, though - managed environments, in
>>>>>>> particular, currently object to the idea of changing the allocation
>>>>>>> on-the-fly. We may, though, make a provision for purely hostfile-based
>>>>>>> environments (i.e., unmanaged) at some time in the future.
>>>>>>>
>>>>>>> Ralph
>>>>>>>
>>>>>>>> Thanks in advance for any help.
>>>>>>>>
>>>>>>>> Elena
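To make the layout concrete: what Ralph describes amounts to one hostfile
naming every node the job may ever touch, a master started on one subset of it,
and a "host" Info value selecting another subset for the slaves. The node names
below are placeholders; the file name and the slots syntax follow the examples
earlier in the thread.

    # openmpi.hosts - every node the master *and* the spawned slaves may use
    masternode slots=1
    node01 slots=4
    node02 slots=4

    # start only the master, pinned to masternode
    mpirun -n 1 -hostfile openmpi.hosts -host masternode ./my_master.exe

Inside the master, the Spawn call then carries an MPI_Info whose "host" value
is, for example, "node01,node02".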
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users